Posts Tagged ‘Cyclomatic Complexity’

In my previous post, I came up with an enhanced pygenie, which could count total complexity of a Python project as a whole. As explained earlier, my motivation is purely for fun. I took a test run of the enhanced pygenie on some of the popular open-source Python projects. And the results are out ūüôā

Django 1.2.1

time ./pygenie/pygenie.py Django-1.2.1/ -r
Total Cumulative Statistics
----------------------------------
Type         Count      Complexity
----------------------------------
X            1431             1629
C            2393             2408
M            6299            13000
F            1144             3600
T           11267            20637
----------------------------------

real    0m11.840s
user    0m11.489s
sys     0m0.056s

Zope 2.11.4

time ./pygenie/pygenie.py Zope-2.11.4-final/ -r
Total Cumulative Statistics
----------------------------------
Type         Count      Complexity
----------------------------------
X            3005             4549
C            6453             6500
M           23137            45392
F            4985            12960
T           37580            69401
----------------------------------

Totally 5 files failed
FAILED to process file: /home/aufather/Downloads/Zope-2.11.4-final/lib/python/ZEO/scripts/zeoserverlog.py
FAILED to process file: /home/aufather/Downloads/Zope-2.11.4-final/lib/python/zope/app/testing/ztapi.py
FAILED to process file: /home/aufather/Downloads/Zope-2.11.4-final/lib/python/zope/app/component/back35.py
FAILED to process file: /home/aufather/Downloads/Zope-2.11.4-final/lib/python/zope/app/component/site.py
FAILED to process file: /home/aufather/Downloads/Zope-2.11.4-final/lib/python/zope/rdb/gadfly/sqlbind.py

real    0m37.042s
user    0m36.302s
sys     0m0.140s

Twisted 10.1.0

time ./pygenie/pygenie.py Twisted-10.1.0/ -r
Total Cumulative Statistics
----------------------------------
Type         Count      Complexity
----------------------------------
X             935             1282
C            3736             3922
M           16782            26142
F            1323             3268
T           22776            34614
----------------------------------

Totally 6 files failed
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/doc/core/howto/listings/udp/MulticastClient.py
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/doc/core/howto/listings/udp/MulticastServer.py
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/doc/historic/2003/pycon/deferex/deferex-listing0.py
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/doc/historic/2003/pycon/deferex/deferex-listing2.py
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/doc/historic/2003/pycon/deferex/deferex-bad-adding.py
FAILED to process file: /home/aufather/Downloads/Twisted-10.1.0/twisted/python/test/test_win32.py

real    0m20.794s
user    0m20.525s
sys     0m0.036s

Even though I started this activity for fun, I was surprised by the speed and utility of pygenie. Some of the uses I can think of

  • Identify syntax errors within a project. Since pygenie uses the same python compiler, if pygenie could not process a file, the file cannot be processed by Python byte code compiler. Zope has 5 files which will not run while Twisted has 6. Django is clean in this front. When we take a closer look, it is test cases, demo code, tutorials, historical code etc that are broken.
  • Object orientation used within a project. The ratio between methods to functions is a fair indicator. The scores for the three projects are Django(5.5), Zope (4.6) and Twisted (12.7). This is a good indicator of existing style of code.
  • Overall size of a project. Pygenie report indicates that Zope is 3.4 times and Twisted is 1.7 times as big as Django.

The original pygenie is available here. Enhanced pygenie is available from my dropbox here.

Recently I measured some code metrics of a C/C++ project at work. I used C and C++ Code Counter(CCCC) and it is pretty good. There was a similar project at work which was almost completely in Python. I wanted to compare the two and searched for a suitable tool to measure McCabe’s Cyclomatic Complexity(CC) for Python code and found pygenie.

Original pygenie, by default, prints Files(X), Classes(C), Methods(M), Functions(F) only if their CC exceeds 7. If –verbose or -v is given, it prints CC irrespective of whether it crosses the threshold of 7 or not. But I wanted overall CC. My main motivation is purely for fun or for interesting numbers. Kind of like the calories burnt counter on a treadmill or a cycle computer for a bicycle. But overall CC can be useful in estimating testing effort.

Here is a sample run of original pygenie on its own source code.

aufather@simstudio:~/Downloads/pygenie_orig$ ./pygenie.py complexity *.py
File: /home/aufather/Downloads/pygenie_orig/cc.py
This code looks all good!

File: /home/aufather/Downloads/pygenie_orig/pygenie.py
Type Name Complexity
--------------------
F    main 8          

File: /home/aufather/Downloads/pygenie_orig/test_cc.py
This code looks all good!

And here is a sample run of modified pygenie on its own source code.

aufather@simstudio:~/Downloads/pygenie$ ./pygenie.py * -c
File: /home/aufather/Downloads/pygenie/cc.py
This code looks all good!

File: /home/aufather/Downloads/pygenie/pygenie.py
This code looks all good!

File: /home/aufather/Downloads/pygenie/test_cc.py
This code looks all good!

Total Cumulative Statistics
----------------------------------
Type         Count      Complexity
----------------------------------
X               3                4
C               8                8
M              31               70
F              13               26
T              55              108
----------------------------------

I added the total cumulative statistics and a summary for each file. Total is represented by ‘T’. This gives a fair idea of how complex the overall project is. ‘F’ stands for functions, ‘M’ stands for methods, ‘C’ stands for classes and ‘X’ stands for files. Also while I was at it, added a whole bunch of command line options to pygenie.

Here is the help for original pygenie. Valid commands are “complexity” or “all”.

aufather@simstudio:~/Downloads/pygenie_orig$ ./pygenie.py all -h
Usage: ./cc.py command [options] *.py

Options:
 -h, --help     show this help message and exit
 -v, --verbose  print detailed statistics to stdout

And the new options added.

aufather@simstudio:~/Downloads/pygenie$ ./pygenie.py -h
Usage: pygenie.py [options] *.py

Options:
 --version             show program's version number and exit
 -h, --help            show this help message and exit
 -c, --complexity      print complexity details for each file/module
 -t THRESHOLD, --threshold=THRESHOLD
 threshold of complexity to be ignored (default=7)
 -a, --all             print all metrics
 -s, --summary         print cumulative summary for each file/module
 -r, --recurs          process files recursively in a folder
 -d, --debug           print debugging info like file being processed
 -o OUTFILE, --outfile=OUTFILE
 output to OUTFILE (default=stdout)

You can get the full zipped source code from my dropbox here

http://dl.dropbox.com/u/10402332/pygenie.zip.

PS: I based my modification on revision 44 of pygenie. I got the original from http://svn.traceback.org/python/cyclic_complexity/.

PPS: I could not find a better way of sharing the code in wordpress. I have updated the test_cc.py too so that unit test do not fail. If any one knows a better way of sharing code, please let me know. Dropbox works well for me.