Friday, 27 November 2015

Cleaning Python scripts with Pylint and GNU/Expand

Python is a wonderful programming language, but it can be quite tolerant of sloppy formatting. For example, indentation is significant, yet you can indent with either tabs or spaces. Python 2 even lets you mix them (Python 3 rejects inconsistent mixing).
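To see the Python 3 behaviour for yourself, here is a small sketch. The snippet in MIXED_SOURCE and the helper name check_indentation are made up for this illustration; only the built-in compile() is used.

```python
# Hypothetical snippet: the first body line is indented with a tab,
# the second one with eight spaces.
MIXED_SOURCE = "if True:\n\tx = 1\n        y = 2\n"


def check_indentation(source):
    """Return 'ok' if the source compiles, else the name of the error raised."""
    try:
        compile(source, "<demo>", "exec")
    except SyntaxError as err:  # TabError is a subclass of SyntaxError
        return type(err).__name__
    return "ok"


print(check_indentation(MIXED_SOURCE))
```

With a Python 3 interpreter this should report a TabError, while a version indented with spaces only compiles fine.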

These days, I am trying to stick to the rules (I am getting old). For that, there is the PEP8 style guide: https://www.python.org/dev/peps/pep-0008/

For example, the recommended indentation is 4 spaces per level.

I found Pylint, which checks your code for such errors: http://www.pylint.org/

It gives a list of all errors, line by line, and a global score.

Let's give it a try. Here is a script that prints Fibonacci numbers, which you can download here: https://drive.google.com/open?id=0BxcXpZeUylGbSTNCZDVCQmRQZTA

Save it as fibo.py and launch it:

python fibo.py 9
0 1
1 1
1 2
2 3
3 5
5 8
8 13
13 21
(9, 21)

It works! Great!

But let's have a look with Pylint:


Global evaluation
-----------------
Your code has been rated at 0.83/10

Ouch, this hurts!

Many problems apparently:

************* Module fibo
C:  9, 0: Exactly one space required after comma
    i,j = 1,0
     ^ (bad-whitespace)
C:  9, 0: Exactly one space required after comma
    i,j = 1,0
           ^ (bad-whitespace)
C: 11, 0: Exactly one space required after comma
    for k in range(1,n + 1):
                    ^ (bad-whitespace)
W: 12, 0: Found indentation with tabs instead of spaces (mixed-indentation)
C: 12, 0: Exactly one space required after comma
        i,j = j, i + j
      ^ (bad-whitespace)
C: 13, 0: Trailing whitespace (trailing-whitespace)
W: 13, 0: Found indentation with tabs instead of spaces (mixed-indentation)
C: 13, 0: Exactly one space required after comma
        print i,j
            ^ (bad-whitespace)
C: 15, 0: Trailing whitespace (trailing-whitespace)
W: 11, 8: Unused variable 'k' (unused-variable)
W:  3, 0: Unused import os (unused-import)

First problem: the file mixes tabs and spaces. That is easy to fix by hand in a small file, but it can be tough in a very big one. GNU/Linux and other Unix systems provide some tools for that:

1) The cat command

https://www.gnu.org/software/coreutils/manual/html_node/cat-invocation.html

On Linux: cat -A fibo.py
On Mac OS X: cat -e -t fibo.py
On Windows: guru meditation error, sorry.

=> Spaces are still displayed as spaces, but tabs are now displayed as "^I" and each end of line as "$". That makes tabs and trailing whitespace much easier to spot.
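If cat is not available (hello, Windows), the same idea can be sketched in a few lines of Python. The function name show_invisibles and the sample lines are assumptions for this example. It only marks tabs and line ends, not the other control characters that cat -A also handles.

```python
def show_invisibles(line):
    """Mark tabs as ^I and the end of the line as $, like cat -A does."""
    return line.rstrip("\n").replace("\t", "^I") + "$"


# Hypothetical sample: one line indented with spaces, one with a tab
# and a trailing space.
SAMPLE = ["    i, j = 1, 0\n", "\tprint i,j \n"]

for sample_line in SAMPLE:
    print(show_invisibles(sample_line))
```

To check a real file, loop over open("fibo.py") instead of SAMPLE.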

2) The expand command

http://www.gnu.org/software/coreutils/manual/html_node/expand-invocation.html

expand -t 4 fibo.py > tmp.txt  # Convert every tab to spaces, with tab stops every 4 columns.
mv tmp.txt fibo.py  # Replace the original file with the converted one.
chmod +x fibo.py  # tmp.txt was created without the execute bit, so set it again.
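For the record, Python can do the same conversion with the built-in str.expandtabs(). This is only a sketch of an alternative to expand, and the helper names expand_tabs and expand_file_in_place are made up here.

```python
from pathlib import Path


def expand_tabs(text, tabsize=4):
    """Replace every tab with spaces, using tab stops every `tabsize` columns."""
    # str.expandtabs() restarts the column count after each newline.
    return text.expandtabs(tabsize)


def expand_file_in_place(path, tabsize=4):
    """Rewrite the file at `path` with its tabs expanded."""
    path = Path(path)
    path.write_text(expand_tabs(path.read_text(), tabsize))


print(repr(expand_tabs("\ti, j = j, i + j\n")))
```

Since the file is rewritten in place rather than replaced by a new one, its permissions should stay as they were.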


And now try again:
cat -e -t fibo.py
pylint fibo.py

Global evaluation
-----------------
Your code has been rated at 2.50/10 (previous run: 0.83/10, +1.67)


Good, there is some progress.


C:  9, 0: Exactly one space required after comma
    i,j = 1,0
     ^ (bad-whitespace)
C:  9, 0: Exactly one space required after comma
    i,j = 1,0
           ^ (bad-whitespace)
C: 11, 0: Exactly one space required after comma
    for k in range(1,n + 1):
                    ^ (bad-whitespace)
C: 12, 0: Exactly one space required after comma
        i,j = j, i + j
         ^ (bad-whitespace)
C: 13, 0: Trailing whitespace (trailing-whitespace)
C: 13, 0: Exactly one space required after comma
        print i,j
               ^ (bad-whitespace)
C: 15, 0: Trailing whitespace (trailing-whitespace)
W: 11, 8: Unused variable 'k' (unused-variable)
W:  3, 0: Unused import os (unused-import)

The remaining messages are mostly style errors:
- Add a space after each comma.
- Remove the unused import os.
- Remove trailing whitespace (visible with cat -A / cat -e -t).
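The trailing whitespace part is easy to automate. Here is a minimal sketch with a hypothetical helper strip_trailing_whitespace. Note that it also ends the last line with a newline if one was missing.

```python
def strip_trailing_whitespace(text):
    """Remove spaces and tabs at the end of every line of `text`."""
    return "".join(line.rstrip() + "\n" for line in text.splitlines())


# Hypothetical sample with trailing spaces and a trailing tab.
DIRTY = "i, j = 1, 0   \nprint(i, j)\t\n"
print(repr(strip_trailing_whitespace(DIRTY)))
```

The commas and the unused import are better fixed by hand, since Pylint already gives the line numbers.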

And now:
Your code has been rated at 9.09/10.
Much better!
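If you wonder where these scores come from, here is a sketch of Pylint's default evaluation formula as I understand it: each message costs one point (errors cost five), divided by the number of statements. The message counts below are taken from the listings above. The statement counts (12, then 11 once import os is gone) are my own guess, chosen because they reproduce the three scores. Pylint does not print them in the extracts shown here.

```python
def pylint_score(statements, convention=0, refactor=0, warning=0, error=0):
    """Default Pylint score: 10 minus the weighted messages per statement."""
    penalty = 5 * error + warning + refactor + convention
    return 10.0 - (float(penalty) / statements) * 10


# First run: 7 convention and 4 warning messages, 12 statements assumed.
print("%.2f" % pylint_score(12, convention=7, warning=4))  # 0.83
# After expand: the 2 mixed-indentation warnings are gone.
print("%.2f" % pylint_score(12, convention=7, warning=2))  # 2.50
# After the style fixes: only the unused variable is left, 11 statements assumed.
print("%.2f" % pylint_score(11, warning=1))                # 9.09
```

So a short script is punished hard: with only a dozen statements, every single message costs almost one point.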

The last problem is the unused variable k. We can leave it as it is, or rewrite the loop in a more elegant way (e.g. with a while loop).
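Here is what the while-loop variant could look like. This is only a sketch: the function name fibo, the counter name and the return value are my assumptions for the example, and the body is reconstructed from the lines quoted in the Pylint output, not copied from fibo.py.

```python
def fibo(n):
    """Print n successive pairs of Fibonacci numbers and return the last one."""
    i, j = 1, 0
    count = 0
    while count < n:  # the counter is actually used, so Pylint has nothing to flag
        i, j = j, i + j
        print("%d %d" % (i, j))
        count += 1
    return j


fibo(8)
```

The print call is written so that it runs under both Python 2 and Python 3.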


So, my recommendations:

- Write your code properly from the beginning.
- Use Pylint to identify errors.
- Correct the errors.
- Check that the structure of your code has not changed (e.g. shifted indentation).
- Run your code again to check that it still works.
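The "has not changed" check can be partly automated by comparing the syntax trees of the file before and after the cleanup, since whitespace and comments do not appear in the tree. A minimal sketch with the standard ast module follows. The helper name same_structure and the three sample strings are made up for this example, and ast.parse() only understands the syntax of the interpreter that runs it.

```python
import ast


def same_structure(before, after):
    """Return True if both sources parse to the same syntax tree."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


# Hypothetical samples: tab-indented code, the same code with spaces,
# and a version where the last line slipped out of the loop.
TABBED = "for k in range(3):\n\ttotal = k\n\tprint(total)\n"
CLEANED = "for k in range(3):\n    total = k\n    print(total)\n"
SHIFTED = "for k in range(3):\n    total = k\nprint(total)\n"

print(same_structure(TABBED, CLEANED))
print(same_structure(TABBED, SHIFTED))
```

The first comparison should say True and the second False. It does not replace running the script, but it catches a shifted block before you do.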


PS: It looks like Emacs 25.0 handles the tab key properly, i.e. it inserts 4 spaces in the file instead of a tab.


