Style Guide¶
PEP 8 is a fantastic basis for clean Python code. Most of its standards are, for the most part, enforced with tooling, via a linter and formatter. There are, however, a handful of decisions that fall outside the scope of the PEP, and for the sake of consistency and readability we define our own rules for them here.
This style guide can serve as a reference not only during development but also during peer review. Reviewers should work to internalize this article so that they can enforce its rules consistently during review. The purpose of the style guide is to cover rules which are not enforceable through tooling; they are the responsibility of reviewers.
Use of Data Models¶
We use three primary tools for constructing data models: Pydantic, dataclasses, and typed dicts. Which one should you use in a given situation? We present the two competing arguments here, as well as the conclusion we adhere to.
The first argument is in favour of consistency. We can eliminate a decision point for engineers by making the opinionated declaration that Pydantic should be used unless a typed dict is required. Pydantic's feature set is essentially a superset of that of dataclasses, meaning we strictly gain functionality by opting to always prefer it. In some cases, particularly when dealing with third-party data, representing data as a dict may be a requirement; this is the scenario in which to use a typed dict. This approach will speed up development.
The second argument is in favour of efficiency. Pydantic pairs static type declarations with runtime validation, and runtime validation is a comparatively slow process. Pydantic is roughly 20 times slower at object creation than dataclasses, owing almost entirely to the runtime validation functionality. If the data being passed to the data model constructor is trusted, as is often the case with platform code, runtime validation wastes CPU cycles. Reserving runtime validation for untrusted data, and skipping it for trusted data, will speed up execution.
In this case, we prefer speed of development to speed of execution. While Pydantic is roughly 20 times slower, this is still on the order of microseconds. Pure Python is known for being slow to execute as it is. If as a team we were interested in optimizing on the order of microseconds, we would not be using Python to begin with. Given the number of times that data model creation is run on trusted data, the environments in which it is run, and their associated SLAs, the gains in efficiency are not meaningful. They certainly do not warrant the added burden placed on the engineer, who has to stop and think about future use cases of every data model they create in order to decide which library to use. This also creates inconsistency in the interfaces of data models across the codebase; it becomes difficult to track when a data model is a Pydantic object or a data class, and therefore which methods are available on it.
Use Pydantic, unless the data you are typing is necessarily in dictionary form, in which case you must use a typed dict.
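As a minimal sketch of the typed-dict case (the payload name and fields here are hypothetical): a third-party webhook body that must remain a plain dict, for example so it can be re-serialized unchanged, is a natural fit for a typed dict.

```python
from typing import TypedDict


class WebhookPayload(TypedDict):
    # Hypothetical third-party payload: it must stay a plain dict because
    # downstream code serializes it back to JSON unchanged.
    event: str
    retries: int


payload: WebhookPayload = {"event": "model.trained", "retries": 0}
```

At runtime this is an ordinary dict; the class exists purely for static analysis, which is exactly why it suits data that is necessarily in dictionary form.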
Use of Pydantic Fields¶
While docstrings are optional for functions, descriptions for fields of Pydantic models are not.
Every field in a Pydantic model must use `Field` as its default value in order to provide a description for the field. The `title` argument is optional. The `examples` argument is encouraged in places where there are additional runtime validators, such as a regex pattern.
Good data descriptions avoid simply re-stating the name of the field; they add context that cannot be communicated through a concise name.
Make liberal use of additional runtime validations, including custom functions if necessary. Lower bounds, upper bounds, regex patterns, minimum and maximum lengths, mutually exclusive fields, and enums instead of strings are all examples of tight data models that prevent mistakes.
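The enum-and-pattern ideas above can also be sketched outside Pydantic; here is a minimal stdlib illustration (the optimizer names and run-id format are invented for the example):

```python
import re
from enum import Enum


class Optimizer(str, Enum):
    # An enum instead of a free-form string prevents typos such as "aadam"
    # from silently reaching training code.
    ADAM = "adam"
    SGD = "sgd"


RUN_ID_PATTERN = re.compile(r"^run-\d{4}$")  # hypothetical identifier format


def validate_run_id(run_id: str) -> str:
    # The hand-written equivalent of a regex constraint on a field.
    if not RUN_ID_PATTERN.fullmatch(run_id):
        raise ValueError(f"run_id {run_id!r} does not match 'run-NNNN'")
    return run_id
```

In a Pydantic model, the same constraints would live on the field itself, but the principle is identical: make illegal values unrepresentable as early as possible.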
Use Pydantic's strict data types in place of Python built-in data types whenever possible. `StrictStr`, `StrictBytes`, `StrictInt`, `StrictFloat`, and `StrictBool` prevent Pydantic from coercing compatible types. For example, this would prevent a field with type `str` that is passed `5.0` from converting the value to `"5.0"`.
If your fields represent well known concepts, such as common machine learning hyperparameters, it is not necessary to explain the concept in the description. The description should focus on the properties of the field that are unique to its usage in our codebase.
For long descriptions, prefer concatenated strings to single multi-line strings. This makes for a more usable final product. Concatenate strings by wrapping them in parentheses and putting one string on each line, taking care to end lines with spaces where necessary.
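A minimal sketch of the concatenation style, using an invented description:

```python
# Implicit concatenation inside parentheses: one string per line.
# Note the trailing space at the end of the first line; without it the
# two sentences would run together as "...model.This".
LONG_DESCRIPTION = (
    "The maximum number of characters accepted as string input to the model. "
    "This dictates the structure of the embedding layer."
)
```

Unlike a triple-quoted string, the result contains no embedded newlines or indentation, so it renders cleanly wherever the description is displayed.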
Here is a complete example of a solidly documented Pydantic model:
```python
from typing import Any

from pydantic import (
    BaseModel,
    Field,
    StrictFloat,
    StrictInt,
    root_validator,
    validator,
)
from pydantic.types import AnyHttpUrl


class ModelHyperparameters(BaseModel):
    content_min_length: StrictInt = Field(
        default=0,
        description="The minimum number of characters accepted as string input to the model.",
        gt=-1,
        lt=1000,
    )
    content_max_length: StrictInt = Field(
        ...,
        description=(
            "The maximum number of characters accepted as string input to the model. "
            "This dictates the structure of the embedding layer."
        ),
        gt=0,
        lt=1000,
    )
    auxiliary_content_source: AnyHttpUrl = Field(
        ...,
        description=(
            "The data collector will submit a GET request to this URL in order to retrieve "
            "additional text content with which to fine tune the model. The response from "
            "this URL should be plain text only, not HTML, as it will not be parsed."
        ),
    )
    learning_rate: StrictFloat = Field(
        ...,
        description=(
            "This is the hyperparameter that model training is most sensitive to. "
            "The recommended range is between 0.001 and 0.005."
        ),
        gt=0,
        lt=1,
    )

    @validator("auxiliary_content_source")
    def is_https(cls, v: AnyHttpUrl) -> AnyHttpUrl:  # noqa: U100
        if not v.startswith("https"):
            raise ValueError("auxiliary_content_source must be HTTPS for security reasons.")
        return v

    @root_validator
    def min_length_is_not_larger_than_max(
        cls, values: dict[str, Any]
    ) -> dict[str, Any]:  # noqa: U100
        if values["content_min_length"] > values["content_max_length"]:
            raise ValueError("Min length cannot be larger than max length.")
        return values
```
To summarize the takeaways from the example:
- All fields have descriptions.
- Fields for which it makes sense have minimum and maximum bounds on their value.
- Strict types are used.
- The field whose value is a string, but a particular kind of string, uses a narrower type (`AnyHttpUrl`).
- Validation which cannot be done with a type is done with a custom validator.
- Validation that involves the values for multiple fields uses a custom root validator.
- The validators are statically typed.
Use of Ignore Comments¶
Black, flake8, isort and pyright all come with the ability to be disabled temporarily using comments. There are very few acceptable uses for this.
There are virtually no valid reasons to ignore Black formatting. The same is true of isort.
Flake8¶
There is one type of flake8 error that may commonly be ignored: U100, the error code for unused arguments in a function. This is most often ignored in two settings:
- Pydantic validators which do not make use of the `cls` argument.
- Tests which do not make use of a fixture.

The second case can be mitigated through the use of `pytest.mark.usefixtures`, which exposes the fixture to the function without passing it as an argument. The first case is a quirk of flake8 that is difficult to get around, so this usage is perfectly acceptable.
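A sketch of the fixture case, assuming a hypothetical `seeded_database` fixture whose side effect, not return value, is what the test depends on:

```python
import pytest


@pytest.fixture
def seeded_database():
    # Hypothetical fixture: imagine it seeds a test database here and
    # tears it down after the yield.
    yield


@pytest.mark.usefixtures("seeded_database")
def test_listing_is_initially_empty():
    # The fixture still runs, but no unused argument appears in the
    # signature, so flake8 has nothing to flag.
    assert True
```

The mark takes the fixture name as a string, so it only works for fixtures whose return value the test does not need.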
Pyright¶
When faced with a type error from Pyright, there are many steps to go through before reaching the unsavoury conclusion of needing to ignore the error.
First, are you certain that the error is not a valid criticism of the code? Typing can be a complex task, and sometimes it just takes more research and massaging to find the correct typing representation for a solution.
If you are certain that you are in the right, search the issues in Pyright's GitHub repository. You may find something helpful there. If no issues exist and you are still confident that something is wrong, open an issue of your own. You will likely get a swift response.
Depending on the response, you may be able to rectify the issue. If even Pyright's maintainers say that this is not something you can resolve, and you have no other refactoring ideas, a detailed comment explaining the type ignore will be required. Link to your GitHub issue within the comments surrounding the ignore.
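As a sketch of the shape such a comment might take (the function, scenario, and rule name are illustrative; in real code the comment should link to the actual GitHub issue):

```python
from typing import Any


def extract_settings(raw: dict[str, Any]) -> dict[str, int]:
    # Pyright cannot verify the value type here (illustrative scenario).
    # Maintainers confirmed this is not resolvable on our side; in real
    # code, link the relevant Pyright GitHub issue in this comment.
    return raw["settings"]  # pyright: ignore[reportGeneralTypeIssues]
```

Scoping the ignore to a specific rule, rather than a bare `# type: ignore`, keeps the rest of the line checked.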
Commenting¶
Comments should never state the obvious. A comment which re-words the functionality of a small piece of code is not a valuable comment. Comments which describe logical sections in a function are generally an indication that the function is too long and has too many responsibilities.
Comments should be used to document decisions made by the engineer which took considerable thought. If it is not obvious how an implementation was decided on, a comment should explain why the code works the way that it does. If applicable, a reference should be included to the source used to inform the decision (e.g. a stack overflow question).
Some examples, starting with a good example:
```python
# if we don't update these annotations, they will remain as ForwardRefs
# even on import by other code
MyType.update_forward_refs()
```
And a bad example:
```python
# this only logs requests that have a status code of 200
if response.status_code == 200:
    log_response(response)
```
TODOs are valid comments, and their format is controlled by a flake8 extension.
Functions¶
- All parameters of all functions and methods should be keyword-only, using the `*` syntax described in PEP 3102.
  - The exception is for functions which, perhaps due to their use as a callback in the context of a third-party library function, must accept positional arguments. In this case, these arguments should be marked positional-only, using the `/` syntax described in PEP 570. So, in general, all parameters should be either keyword-only or positional-only, with a strong preference for keyword-only.
- Keyword-only syntax:

```python
def f(*, a: str, b: str, c: str) -> str:
    ...
```

- Positional-only syntax:

```python
def f(a: str, b: str, c: str, /) -> str:
    ...
```
Naming¶
PEP 8 covers the majority of questions regarding naming of variables, functions, and classes. There are just a few extra cases to cover here.
- Type variables must end with a capital T, and be descriptive. `T` is not an acceptable name for a type variable.
- In general, verbosity is preferred. Do not sacrifice clarity in a function name for brevity.
- When possible, variable names and function names should be aligned. If a function returns a URL, a clean line of code that executes the function would be `url = get_url()`. In general, the pattern is `[x] = get_[x]()`.
Common Package Structure¶
Types¶
Python packages will often require their own custom types. These types can exist in one of two formats:
- If there are only a handful of types with no logical categorization, a `types.py` file at the root of the package is sufficient.
- If the types are numerous and have a logical categorization, a `types/` submodule with modules for each categorization is preferred.
Utilities¶
The module name `utils` is an escape hatch for when functionality is poorly organized. It should be avoided whenever possible. Instead, consider the role that the utility function plays in the purpose of the package, and name it accordingly.
This is a difficult rule to follow, as evidenced by our very own codebase. Utils is something of a drug, and it is very hard to quit. But do your best.
Constants¶
Constants follow the same rules as types. Either create a `constants.py` file in your package, or a `constants/` module with submodules for categories.
Formatting¶
Logical Sections¶
Line breaks should separate blocks of code which make up a single logical "section" of a function or method. While at times having multiple logical sections is a sign that the function should be factored into smaller functions, there are cases where it makes sense. In these cases, separate the sections with a single line break.
To take an example from Python source code:
```python
with self._shutdown_lock, _global_shutdown_lock:
    if self._broken:
        raise BrokenThreadPool(self._broken)

    if self._shutdown:
        raise RuntimeError('cannot schedule new futures after shutdown')
    if _shutdown:
        raise RuntimeError("cannot schedule new futures after interpreter shutdown")
```
Since the 4 lines regarding shutdown are a separate logical section from the 2 lines regarding broken threads, there is a blank line separating them.