Clean Code

Writing clean code can be hard to do by yourself. It involves paying attention to a lot of small details, and remembering to apply a large set of rules consistently. Fortunately, these are things that programs are good at. This is how we end up with a set of tools that can either:

  • Check our code and alert us to mistakes we've made, or
  • Check our code and automatically fix mistakes we've made.

In this article we'll explore four such tools that we use. The article on CI/CD will go into more detail on exactly how they are applied to code. For now, we focus on our choice of tool and its available configuration.

For more details on our configuration, see our PyProject defaults.

Formatting

Code formatting refers to the re-organization of lines of code to increase readability. It never changes the behavior of the code that it alters.

For example, a code formatter might:

  • Split a long line into smaller lines.
  • Alter the indentation of a line.
  • Swap single quotes for double quotes.
  • Alter the number of blank lines between lines of code.
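
As a rough illustration of the claim that formatting leaves behavior untouched, the sketch below parses two versions of the same snippet with Python's ast module and compares the resulting syntax trees. The "after" text is hand-formatted for this example and is not actual Black output; the names greet and same_syntax_tree are made up for the illustration.

```python
import ast

# "Before": cramped spacing, single quotes, no blank lines between definitions.
BEFORE = """
def greet(name,greeting='hello',punctuation='!'):
    return greeting+', '+name+punctuation
message = greet('world',greeting='hi',punctuation='?')
"""

# "After": hand-formatted in roughly the style a formatter produces
# (double quotes, spacing around operators, blank lines, a split call).
AFTER = '''
def greet(name, greeting="hello", punctuation="!"):
    return greeting + ", " + name + punctuation


message = greet(
    "world",
    greeting="hi",
    punctuation="?",
)
'''


def same_syntax_tree(before: str, after: str) -> bool:
    """Return True when both sources parse to an identical syntax tree."""
    return ast.dump(ast.parse(before)) == ast.dump(ast.parse(after))


print(same_syntax_tree(BEFORE, AFTER))
```

Quote style, whitespace, and line breaks do not appear in the syntax tree, which is why the two versions compare equal even though every line of text differs.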

Our choice of code formatter is Black. It has become the standard for Python projects, so this was a very simple choice to make.

Black is very opinionated and exposes few options. The only setting we have changed ourselves is the maximum line length: Black's default is 88 characters (PEP 8 recommends 79), and we have raised it to 100.

Linting

Linting is becoming something of a broad term, but in general it refers to the evaluation of code against style guides and language syntax. A good linter can inform the developer when an undefined variable is referenced just as well as it can inform them of a formatted string without any format placeholders.
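
To make the idea of a lint rule concrete, here is a minimal sketch of a check for formatted strings that contain no placeholders. It is only an illustration built on the standard ast module, not how flake8 or any of its plugins implement the rule, and the function name fstrings_without_placeholders is hypothetical.

```python
import ast


def fstrings_without_placeholders(source: str) -> list[int]:
    """Return the line numbers of f-strings that contain no {placeholders}."""
    tree = ast.parse(source)
    # A format spec (the ">10" in f"{x:>10}") is also stored as a JoinedStr
    # node, so collect those first and skip them below.
    spec_ids = {
        id(node.format_spec)
        for node in ast.walk(tree)
        if isinstance(node, ast.FormattedValue) and node.format_spec is not None
    }
    return sorted(
        node.lineno
        for node in ast.walk(tree)
        if isinstance(node, ast.JoinedStr)
        and id(node) not in spec_ids
        and not any(isinstance(part, ast.FormattedValue) for part in node.values)
    )


SAMPLE = (
    'name = "x"\n'
    'a = f"hello"\n'         # no placeholder: reported
    'b = f"hello {name}"\n'  # has a placeholder: fine
    'c = f"{name:>10}"\n'    # placeholder with a format spec: fine
)
print(fstrings_without_placeholders(SAMPLE))
```

The check never runs the code it inspects; it only reads the parsed source, which is what lets a linter report problems before anything is executed.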

Our linter of choice is flake8. We chose flake8 primarily because of its many community extensions. Extensions are where linting takes on a new, less well-defined meaning.

Flake8 extensions are, generally, plugins that statically check source code against a clearly defined rule set and surface the resulting errors and warnings to the user. They cover concepts such as consistent naming, code complexity, testing, annotations, and bugs.

When selecting extensions, we try to strike a balance between evaluating code comprehensively and being overly restrictive. The line we don't wish to cross is the one beyond which an engineer is inundated, in every file, with messages that improve neither their productivity nor the efficiency of their code.

These are the extensions we have currently enabled:

  • flake8-print
    • Print statements can be used for debugging, but they are rarely needed in final production code.
    • The goal of this extension is to prevent debugging code from making it into the codebase.
    • A print statement that is actually necessary can carry a comment that tells flake8 to ignore the rule on that line.
  • flake8-todos
    • TODO statements are acceptable in code, but they need structure to avoid being forgotten.
    • When writing a TODO, this extension makes sure that the developer creates an associated issue, and signs the TODO with their name.
  • flake8-eradicate
    • Commenting out code is another debugging procedure that is sometimes easy to forget about; the developer may leave a secondary implementation commented out and forget to remove it entirely before creating their pull request.
  • flake8-unused-arguments
    • A function that accepts an argument must use that argument somewhere in its implementation. Otherwise the argument should be removed from the signature.
    • There are exceptions here, and this rule may be the most common use of a flake8 ignore comment:
      • cls is treated as an argument, and there are many places it is not used. This comes up most often in Pydantic validators.
      • Tests may use fixtures by including them as arguments without actually using them in the implementation of the test.
  • flake8-tidy-imports
    • Relative imports make it harder to see where a name comes from and tend to break when files are moved, so we use this extension to ban them.
    • Import aliases that don't actually change the name of the import are unnecessary.
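
The ignore comments mentioned in the list above might look like the following sketch. The codes T201 (flake8-print) and U100 (flake8-unused-arguments) are the ones we understand recent releases of those plugins to emit; treat them as assumptions and check the versions you have installed. The class and function names are made up for the example.

```python
class Settings:
    @classmethod
    def normalise_name(cls, value: str) -> str:  # noqa: U100
        # `cls` is required by the classmethod protocol but is never used
        # here, so the unused-argument warning is silenced for this line only.
        return value.strip().lower()


def report(message: str) -> None:
    # A deliberate, user-facing print rather than leftover debugging output.
    print(message)  # noqa: T201


report(Settings.normalise_name("  Example  "))
```

Naming the specific code after `noqa:` keeps the exemption narrow: any other rule violated on the same line is still reported.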

Our configuration includes some rule ignores:

  • E203: whitespace before a colon.
    • This rule conflicts with Black, so we ignore it.
  • W503: line break occurred before a binary operator.
    • This rule conflicts with Black, so we ignore it.
  • E402: module level imports not at top of file.
    • While rare, there are cases where a line of implementation must come before an import.
    • For example, a sys.path.append may be required in order for a subsequent import to work.
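
A minimal sketch of the E402 situation described above: an import that only works after a line of implementation has run. The temporary directory and the module name vendored_helper are stand-ins invented for the example.

```python
import sys
import tempfile
from pathlib import Path

# Stand-in for a directory that is not on the import path by default.
extra_dir = Path(tempfile.mkdtemp())
(extra_dir / "vendored_helper.py").write_text("ANSWER = 42\n")

# This line of implementation has to run before the import below can succeed.
sys.path.append(str(extra_dir))

import vendored_helper  # E402 would be reported here if the rule were enabled

print(vendored_helper.ANSWER)
```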

We also exclude some files from linting. These are mainly extra files that are either not our own creation or not checked into version control.

Import Sorting

Import sorting is not simply alphabetical: it organizes imports according to their relationship to the rest of the code. Python standard library packages are different from third-party libraries, which are different from first-party libraries, and this should be reflected in the organization of imports. This is the goal of an import sorter.

For readability, within each logical block it also groups import ... lines separately from from ... import ... lines, and sorts each group roughly alphabetically.

The tool of choice for this, again as something of a default in the Python community, is isort.

As a developer, don't try to sort your imports manually. When you need a new import, throw it at the top of the file, save the file, and let isort organize it for you.

There is one caveat here: isort cannot reach beyond the first bit of implementation code to retrieve imports and put them at the top of the file. Because of this, new imports should always be placed at the top of the file first, unless for some reason it is necessary to place the import after the definition of some function, class, or constant. This should be incredibly rare.

Type Checking

We have gone over the benefits of type checking in another article. We have also justified our selection of type checker, Pyright, in the same article.

Configuration

Configuration values for all of the above tools are held in pyproject.toml. This file is gitignored because it is auto-generated by our CLI when an environment is "activated".