-
-
Notifications
You must be signed in to change notification settings - Fork 2.9k
Mypy Daemon
If you run mypy using the mypy
tool, it performs incremental checking by default which reuses type checking results from the previous run via files in the .mypy_cache
directory.
This coarse-grained (module-level) incremental checking is often enough, but it can result in a slow iteration speed in large code bases with hundreds of thousands of lines.
The mypy daemon makes incremental runs very fast (often a few hundred ms) by having a long-lived process that stores program state in memory between runs.
The daemon also maintains fine-grained dependencies. These dependencies are between definitions, not modules.
Assume that we have a file a.py
that contains dozens of functions, only one of which uses module b
:
# a.py
import b
... # lots of stuff that doesn't use b
def f() -> None:
b.g(1)
The mypy daemon will process the entire program when starting up, and record that a.f
depends on b.g
, among other things.
Now we edit b.py
and change the signature of b.g
:
# b.py
def g(x: str) -> None: ...
Mypy daemon will first notice that the file b.py
is changed (by checking the modification time). It will process b.py
again and notice that the only externally visible change in b
is the modified signature of b.g
.
Using fine-grained dependencies, mypy daemon will figure out that in module a
, we'll need to reprocess only the function a.f
. This is much faster than processing the entire module a
if the module is big.
After reprocessing a.f
, mypy daemon will report an incompatible argument error.
The mypy daemon has several things going on, summarized below.
Mypy daemon keeps track of fine-grained dependencies between functions, attributes, classes, modules, etc.
The smallest unit of (re)processing is either a module top level or a top-level function/method. We only track dependencies at this level of granularity.
Each file that is modified since the previous mypy daemon run will be processed in full, since mypy doesn't have an incremental parser.
After processing a file, the daemon takes a diff of the old and new symbol tables to find changed definitions. It will then "trigger" these definitions (i.e. reprocess everything that depends on them).
Now comes a tricky bit: we merge the new AST to the AST corresponding to the previous revision of the file by copying the contents of new AST nodes over to the corresponding old nodes (when they exist). This way references to the AST nodes in other modules will continue to point to the correct things.
Mypy daemon uses fine-grained dependencies to find other parts of the codebase that need to be reprocessed in response to the triggered definitions.
To reprocess a triggered definition, the daemon first "strips" (or resets) the relevant AST nodes to match a fresh AST we get from the parser. We use this hack, since we don't have an incremental parser and want to avoid processing the entire file containing a triggered definition.
Reprocessing a definition implies performing semantic analysis and type checking on a stripped subset of some module AST.
After reprocessing, we again check if something externally visible has changed, and we may need to also trigger the dependencies of definitions we just reprocessed.
Eventually we'll reach a fixed point where there are no additional triggered dependencies and we are done.
-
mypy/dmypy_server.py
: Implementation of the daemon process -
mypy/server/deps.py
: Generate fine-grained dependencies for AST nodes -
mypy/server/update.py
: Fine-grained incremental processing logic -
mypy/server/astdiff.py
: Compare two versions of a module symbol table and find changed definitions -
mypy/server/astmerge.py
: Merge new version of an AST to the previous version to preserve object identities of nodes -
mypy/server/aststrip.py
: Strip an AST to make it "fresh", i.e. similar to what is produced by parser, with all changes performed by semantic analysis or type checking reverted
Many daemon test cases perform multiple incremental steps to validate that propagating changes works as expected.
The primary test files are here: test-data/unit/fine-grained*.test
There are also more unit test style test cases for some operations:
- Generating deps (
test-data/unit/deps.test
) - Performing AST diffs (
test-data/unit/diff.test
) - Merging ASTs (
test-data/unit/merge.test
)