To repeat an earlier comment of mine from the launch of uv on HN (tl;dr: these new type checkers never support Django):
The way these type checkers get fast is usually by not supporting the crazy, rich reality of real-world Python code.
The reason we're stuck on mypy at work is that it's the only type checker with a Django plugin that properly manages to type check Django's crazy runtime-generated methods.
I wish more Python tooling took the TS approach of "what's in the wild IS the language", as opposed to "we only type check the constructs we think you SHOULD be using".
1. Maybe it's time to drop the crazy runtime generation and have something statically discoverable, or at least a way to annotate the typing statically.
2. Astral has already indicated that they plan to add direct support for Django and other popular libraries.
3. As people replied to similar comments in previous threads (maybe to you?): that's not why ty is fast and mypy is slow. It's also an easy claim to disprove: run both without plugins and you'll see ty is still 100x+ faster.
> 1. Maybe it's time to drop the crazy runtime generation and have something statically discoverable, or at least a way to annotate the typing statically.
That, and duck typing, are two of the biggest things that make Python what it is. If I have to drop all that for type checking and rewrite my code, why would I rewrite it in Python?
Having used Python for many years, it’s the least interesting aspect of the language. Almost all such tricks can be done with compile-time metaprogramming, often even without API changes.
I don’t know about a blog post, but I mean the obvious stuff like codegen, generics/templates or macros to achieve the same things Python does by being dynamic.
Duck types (i.e. structural type hierarchies) can not only be statically verified, they can be statically inferred as well, as demonstrated by OCaml since 1996, for instance.
"Here is a nickel, kid, get yourself a better programming language" :-p
The only type checker that fully works (meaning it successfully performs all the necessary type inference for inherited objects) on our large and highly modular Python codebase is PyCharm's (I'm guessing it's their own custom tool built from the ground up? Not really sure, actually).
These are two different issues. Supporting Django involves adding a special-case module that essentially replicates its code generation and then adds that to the type-level view of the code. Pyrefly or ty could do that and would still be just as fast. My guess is that once they have the basic Python type checker as close to 100% as they can, they will start looking at custom modules for various popular metaprogramming libraries, or add enough of a plugin framework that the community can contribute them.
Source: spent several years working on a Python type checker.
TS has the luxury of being its own distinct language, defining its own semantics and having its own compiler. You could have something like that targeting Python.
> The way these type checkers get fast is usually by not supporting the crazy, rich reality of real-world Python code.
Or in this case, writing it in Rust...
mypy is written in Python. People have forgotten that Python is really, really slow for CPU-intensive operations. Python's performance may not matter when you're writing web service code and the bottlenecks are database I/O and network calls, but for a tool that's loading up files, parsing into an AST, etc, it's no surprise that Rust/C/even Go would be an order of magnitude or two faster than Python.
uv and ruff have been fantastic for me. ty is definitely not production ready (I see several bizarre issues on a test codebase, such as claiming `datetime.UTC` doesn't exist) but I trust that Astral will match the "crazy reality" of real Python (which I agree, is very crazy).
We will probably change the default to "most recent supported Python version", but as mentioned elsewhere, this is very early and we're still working out these kinds of kinks!
You should be doing this dynamically, based on the version of Python you are running against, so that you don't have to hardcode or make such "conservative" choices by hand.
Note that we're not ever spinning up a Python interpreter to run your code, or monitoring an existing running Python process. So we do need some kind of metadata.
But yes, if you have a Python version specified in pyproject.toml, we respect that, and if you have a virtualenv, we can see the Python version that was used to create that. And that's what we use to type-check your code.
The default being discussed here is what we fall back on if that project metadata isn't available.
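For example, the standard `requires-python` field in a minimal pyproject.toml (presumably what gets read here):
```toml
[project]
name = "myproject"
requires-python = ">=3.12"
```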
Criticism isn’t necessarily condescending. “You should be doing X because Y” is just a plain assertion; it doesn’t imply any moral judgement or opinion of the author.
I don't necessarily read it as condescending, but I do read it as presumptuous. What someone "should" do depends on many things. Maybe, because this is software in alpha stage, they should _not_ focus on this part of the code if it is minor compared to other obligations. Or maybe there are other reasons they've chosen not to do this (as was explained in an above comment).
IMO, a less presumptuous criticism would be phrased like "if you did X then benefits Y would happen", or "if you haven't, consider X", or even (the least presumptuous - make it a conversation!) "have you considered X?", rather than "you should do X".
I see what you mean. Perhaps it was just a "poor" choice of words for whatever reason. I am sure we can assume he intended it in the spirit of "have you considered X?".
Currently we default to our oldest supported Python version, in which `datetime.UTC` really doesn't exist! Use `--python-version 3.12` on the CLI, or add a `ty.toml` with e.g.
```toml
[environment]
python-version = "3.12"
```
And we'll find `datetime.UTC`.
We've discussed that this is probably the wrong default, and plan to change it.
I realize this might be hard from a technical / architecture standpoint, but it would be great if "does not exist" and "does not exist in this version of Python" were two different errors.
If I saw something like "datetime.UTC doesn't exist", I'd immediately think "wait, was it datetime.utc?", not "ooh, it was added in 3.11, I need to change my Python version".
I agree that would be nice; probably not near the top of our list right now (and not trivial to implement), but it makes sense. Thanks for the suggestion.
This information is already maintained via `if sys.version_info >= (...):` conditionals in typeshed stubs. I don't think this is important enough to justify maintaining the same information in a duplicate way.
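For reference, the gating looks roughly like this in typeshed (a paraphrased sketch of `datetime.pyi`, not a verbatim excerpt):
```python
# Paraphrased sketch of typeshed's datetime.pyi version gating.
import sys
from datetime import timezone  # in the real stub, timezone is declared in the same file

if sys.version_info >= (3, 11):
    UTC: timezone  # only visible when checking against Python 3.11+
```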
Aha, makes sense! Yeah, it'd be nice if you could divine the intended Python version from the uv configuration / `.python-version`. Thanks for all your hard work, looking forward to the full release!
Defaulting is wrong: what's checked is the aggregate of the actual user code, the standard library for a given Python version, and the installed packages. It has to be the same environment the program runs in, leaving conservative approximations (checking types against the oldest supported library versions and hoping newer ones are OK) to the user.
Yes, if you have a Python version specified in pyproject.toml, for instance, we respect that, and that's what we use to type-check your code. The default being discussed here is what we fall back on if that project metadata isn't available.
There are some extremely CPU-intensive low-level operations that you can easily write in C and expose as a Python API, like what Numpy and Pandas do. You can then write really efficient algorithms in pure Python. As long as those low-level operations are fast, those Python-only algorithms will also be fast.
I don't think this is necessarily "cheating" or "just calling disguised C functions." As an example, you can write an efficient linear regression algorithm with Numpy even though nothing in Numpy supports linear regression specifically; it's just one of the ways a Python programmer can arrange Numpy's low-level primitives. If you invent some new numerical algorithm to solve some esoteric problem in chemistry, you may be able to implement it efficiently in Python too, even if you're literally the first person ever writing it in any language.
The actual problem is that it's hard for people to get an intuition of which Python operations can be made fast and which can't, AST and file manipulation are sadly in the latter group.
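A minimal sketch of that linear regression idea, with no Python-level loops (illustrative only, synthetic data):
```python
import numpy as np

# Synthetic data: y = X @ w_true + noise.
rng = np.random.default_rng(0)
X = rng.normal(size=(10_000, 3))
w_true = np.array([2.0, -1.0, 0.5])
y = X @ w_true + rng.normal(scale=0.1, size=10_000)

# Ordinary least squares via the normal equations, w = (X^T X)^-1 X^T y,
# expressed entirely as compiled NumPy primitives.
w = np.linalg.solve(X.T @ X, X.T @ y)
print(w)  # close to [2.0, -1.0, 0.5]
```
Nothing here is a "linear regression function"; the speed comes from every loop living inside NumPy's compiled operations.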
That is a confusing way to look at it. Python is slow, C is fast. If your Python code is calling functions that were not written in Python (even if indirectly, through a library you are using), that is not "pure Python".
That works in numerical libraries because you can encapsulate the loops into basic operations that you then lower to C.
In a domain like type checking it's not nearly as easy/doable.
> As long as those low-level operations are fast, those Python-only algorithms will also be fast.
Only if the time is spent inside the C implementations rather than in Python. If you have pure Python loops, you'll be slow. You need quite high-level components and minimal Python glue for it to be fast.
CPU-intensive is not quite the right metric. What Python is slow at is all the extra administration that comes with basic stuff like attribute access and function calls.
This gives somewhat counterintuitive results, where declaring and summing a whole list of integers in memory can be faster than a simple for loop over an iterator.
But yeah, writing stuff in a different (compiled) language is often better if it means the Python interpreter doesn't need to go through as many steps.
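A rough illustration you can run yourself (exact numbers vary by machine and interpreter):
```python
import timeit

xs = list(range(1_000_000))

def loop_sum():
    total = 0
    for x in xs:        # every iteration pays interpreter overhead
        total += x
    return total

print(timeit.timeit(loop_sum, number=10))          # pure-Python loop
print(timeit.timeit(lambda: sum(xs), number=10))   # one C-level call, several times faster
```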
The semantics of Python make it problematic to run at speed; it is not just about interpreted vs compiled code. Given the high level of dynamic behavior that is allowed, a JIT (like PyPy) has a better chance of getting decent performance if the code has an underlying behavior that can be extracted.
In defense of mypy et al.: TypeScript had some of the greatest minds of our generation working for a decade+ on properly typing every insane form found in every random JavaScript file. Microsoft funded a team of great developers to hammer away at every obscure edge case imaginable. No other Python checker can compare to the resources TS had.
TS may transpile to JS and can always be split into a JS file plus a type annotation file, but it is its own language, developed in tandem with the type checker using a holistic approach: figure out how to type check a pattern, then put it into the syntax and the type checker.
That's not true for Python at all.
Python type annotations were added to the language many years ago, but not with a holistic approach; only some fundamental support was added at first, then extended bit by bit over the years (and not always consistently).
Furthermore, these annotations are not limited to type checking, which can confuse a type checker (`Annotated` helps a lot, but is also verbose; I wonder how long until there's a short syntax for it, e.g. by implementing `@` on types or similar).
Basically, what the type annotation feature was initially intended to be and what it is now differ quite a bit (dumb example: `list` vs. `List`, `Annotated`, etc.).
This is made worse by the amount of "magic" deeply rooted in Python, e.g. subclassing `Enum`. Sure, you have that in JS too, and it also doesn't work that well in TS (if you don't add annotations to the dynamically produced type).
Lastly, TS is structurally typed, which lets it handle a bunch of dynamic-typing edge cases, while Python is, well, in between. Duck typing is (simplified) structural typing, but `isinstance` is a common thing in Python and that's nominal typing...
So yeah, it's a mess in Python, and to make it worse there are a bunch of annoyances related to ambiguity or too many ways to do a thing (e.g. re-exports + private modules: you can use that common coding pattern, but it sucks badly).
Why do you say that duck typing is simplified structural typing?
Its relationship with structural typing is on a different axis.
Duck typing is its dynamic-typing counterpart.
```python
from dataclasses import dataclass
from typing import Protocol

class Globular(Protocol):
    diameter: float

class Spherical(Protocol):
    diameter: float

# In Python, we need to define concrete classes that implement the protocols.
@dataclass
class Ball:
    diameter: float

@dataclass
class Sphere:
    diameter: float

ball: Globular = Ball(diameter=10)
sphere: Spherical = Sphere(diameter=20)

# These assignments work because both types structurally conform to the protocols.
sphere = ball
ball = sphere

class Tubular(Protocol):
    diameter: float
    length: float

@dataclass
class Tube:
    diameter: float
    length: float

tube: Tubular = Tube(diameter=12, length=3)
tube = ball  # Fail type check.
ball = tube  # Passes.
```
This is what Pyright says about it:
```
Found 1 error.
/scratch/structural.py
  /scratch/structural.py:37:8 - error: Type "Ball" is not assignable to declared type "Tubular"
    "Ball" is incompatible with protocol "Tubular"
      "length" is not present (reportAssignmentType)
1 error, 0 warnings, 0 informations
```
Edit: And this is ty:
```
error: lint:invalid-assignment: Object of type `Ball` is not assignable to `Tubular`
  --> structural.py:37:1
   |
35 | tube: Tubular = Tube(diameter=12, length=3)
36 |
37 | tube = ball  # Fail type check.
   | ^^^^
38 | ball = tube  # Passes.
   |
info: `lint:invalid-assignment` is enabled by default
Found 1 diagnostic
```
I'd go a step further and say that duck typing is more than just structural typing's dynamic counterpart, because, again, that's conflating two different axes. Dynamic vs static describes when type checking happens and whether types are associated with names or with values. But it doesn't necessarily determine the definition of "type".
The real difference between structural typing and duck typing is that structural typing requires all of a type's declared members to be present for an object to be considered compatible. Duck typing only requires the members that are actually being accessed to be present.
This is definitely more common in dynamic languages, but I'm not aware of any particular reason why that kind of checking couldn't also be done statically.
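A toy example of the distinction (hypothetical classes, just to illustrate):
```python
class Duck:
    def quack(self) -> str:
        return "quack"

class Robot:
    def quack(self) -> str:
        return "beep"
    def self_destruct(self) -> None: ...

def make_noise(x) -> str:
    # Duck typing: only .quack, the member actually accessed, must exist.
    return x.quack()

make_noise(Duck())   # fine
make_noise(Robot())  # also fine; .self_destruct is irrelevant here
```
No protocol declaration is consulted; at runtime, only the attributes the call site actually touches matter.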
If I understand correctly, defining the protocol like this forces the implementation classes to have the members as proper fields and disallows properties. If you define `diameter` as a property in the protocol, it supports both:
```python
from dataclasses import dataclass
from typing import Protocol

class Field(Protocol):
    diameter: float

class Property(Protocol):
    @property
    def diameter(self) -> float: ...

class Ball:
    @property
    def diameter(self) -> float:
        return 1

@dataclass
class Sphere:
    diameter: float

ball_field: Field = Ball()
sphere_field: Field = Sphere(diameter=20)
ball_prop: Property = Ball()
sphere_prop: Property = Sphere(diameter=20)
```
Pyright output:
```
/Users/italo/dev/paper-hypergraph/t.py
  /Users/italo/dev/paper-hypergraph/t.py:27:21 - error: Type "Ball" is not assignable to declared type "Field"
    "Ball" is incompatible with protocol "Field"
      "diameter" is invariant because it is mutable
        "diameter" is an incompatible type
          "property" is not assignable to "float" (reportAssignmentType)
1 error, 0 warnings, 0 informations
```
That is to say, I find Python's support for structural typing to be limited in practice.
What annoys me is programmers who wish their favourite language or feature were as popular as Python, so they choose to implement it in Python to make Python "better". Python was created as a dynamically typed language. If you want a language with type checking, there are plenty of others available.
Rust devs in particular are bent on replacing all other languages by stealth, which is both plainly visible and annoying, because they ignore what they don't know about the ecosystem they choose to target. As cool as some of the Rust-written tools for Python are (ruff, uv), they are not a replacement for Python. They don't even solve some annoying problems that we have workarounds for; sometimes they create new ones. Case in point is uv, which offers custom Docker images. Hello? A package manager is not supposed to determine the base Docker image or Python version for the project. It's a tool, not even an essential one since we have others, so know your place.
As much as I appreciate some of the performance gains, I do not appreciate the false narratives spread by some Rust devs about the end of Python/JavaScript/Golang, based on the fact that Rust let them introduce faster build tools into other languages' build chains. The Rust community is quickly evolving into the friends you are embarrassed to have, a bit like any JVM-based language suddenly having a bunch of Enterprise Java guys show up to a Kotlin party and tell everyone "we can be like Python too...".
This argument doesn't make a whole lot of sense, because nothing about type annotations constrains Python code at all. In fact, because they're designed to be introspectable, they make Python even more dynamic, and you can do even crazier stuff than you could before. Type checkers are working very hard to handle the weird code.
Pydantic being so fast because it's written in Rust is a good thing: you can do crazy dynamic (de)serialization everywhere with very little performance penalty.
> nothing about type annotations constrains Python code at all
Sorry, but this is just not true. Don't get me wrong, I write typed Python 99% of the time (pyright in strict mode, to be precise), but you can't type check every possible construct in the language. By choosing to write typed Python, you're limiting how much of the language you can use. I don't think that's a bad thing, but it can be a problem for untyped codebases trying to adopt typing.
How many people understand the intricacies of any complex language or type system? Though I think one of the great things about TS is that you need to understand none of it in order to do `npm install @types/lodash` and get all the benefits.
IMHO they created type annotations, not a type system,
and how you use the annotations to express a type system is inconsistent and incomplete (e.g. `NoneType` vs. `None` for the inconsistency, and the mess around metaclasses (e.g. `Enum`) and supporting type annotations for them for the incompleteness).
The fact that even today something as fundamental as enums has type checking issues _which are not just type checker incompetence_ is, I think, a good way to highlight what a mess it is.
Or that `Annotated[]` was only added in 3.9 and has a ton of visual overhead, even though it's essential for a lot of clean definitions in modern Python code (where, for backwards compatibility, there is often some other way which can be de-facto wrongly typed but shouldn't be flagged; have fun, type checkers).
It's not a mypy issue. Compared to TS, Python type hints (spec-wise) are a joke. They started as a bolted-on, ad-hoc solution and evolved quite chaotically. For example [1]: TS doesn't require a special decorator (sic!) for your custom classes to be picked up by type checkers.
Or how do you make a wrapper function that passes `*args` and `**kwargs` through?
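For the args/kwargs pass-through specifically, there is an answer nowadays: PEP 612's `ParamSpec` (Python 3.10+). A minimal sketch, with a hypothetical `logged` decorator:
```python
import functools
from typing import Callable, ParamSpec, TypeVar

P = ParamSpec("P")
R = TypeVar("R")

def logged(func: Callable[P, R]) -> Callable[P, R]:
    @functools.wraps(func)
    def wrapper(*args: P.args, **kwargs: P.kwargs) -> R:
        print(f"calling {func.__name__}")
        return func(*args, **kwargs)
    return wrapper

@logged
def add(x: int, y: int) -> int:
    return x + y

add(1, y=2)       # OK: the wrapper keeps add's exact signature
# add("a", "b")   # a type checker flags this, even through the wrapper
```
Whether that counts as elegant or as yet more bolted-on syntax is, of course, the point under debate.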
The dataclass decorator isn't there to make the type checkers understand the class. Its main purpose is to automatically implement trivial methods like the constructor, equality, repr, etc. The type hints make this more convenient, but something similar already existed with attrs.
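That main purpose in action, in a tiny standard-library example:
```python
from dataclasses import dataclass

@dataclass
class Point:
    x: int
    y: int

p = Point(1, 2)          # __init__ generated from the annotations
print(p)                 # Point(x=1, y=2)  -- generated __repr__
print(p == Point(1, 2))  # True             -- generated __eq__
```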
Though algorithmic improvements can also go a long way, and if you're one of the first type checkers that had to figure out the mess that is the Python type annotation system, you'll have wasted a lot of time figuring that out instead of refactoring your architecture to allow for algorithmic improvements.
Which brings us to another Python issue: Python is quite bad at such huge refactorings, even with type checkers.
But yeah, Python is by far the slowest widely used language, and for some use cases you can sidestep that by placing the hot code in C++/Rust extension modules (or not care because you are much, much more network-latency bound), but a type checker probably doesn't belong in that category.
The CPython API is such a dumpster fire; even when writing very simple modules, the reference counting is very difficult to get right. The majority of Python modules written in C are probably leaking memory somewhere, but nobody knows.
My problem is that debugging a segfault in a Python system is impossible because of all the noise generated by Python modules that never bothered to clean up their Valgrind output.
>The way these type checkers get fast is usually by not supporting the crazy, rich reality of real-world Python code.
Nah, that's just part of the parade of excuses that comes out any time existing software solutions get smoked by a newcomer in performance, or when existing software gets more slow and bloated.
The thing is, most (all) of the type checkers, including e.g. mypy, _do not_ support the craziest Python...
not because they don't want to, or because it would be too slow,
but because it's not really viable without fully executing module loading in a sandbox. That might seem workable until you realize that you still need to type check `__main__` modules etc., that it's a common trend in Python to do configs by loading a Python module and grabbing its locals as the config keys, and that importing some things might actually, I don't know, initialize a GPU driver :sob: So fully correct type checking for every project is pretty much 100% guaranteed to be impossible :smh:
But also, Python is one of the slowest popular languages (and with a large margin to any language not also "one of the slowest"). It's only fast when the hot code is moved into C++/Rust, which is often good enough, but a type checker is exactly the kind of software where that approach stops working.
Better static checking in Python could significantly improve both tracing and compilation efficiency. The language features that currently limit type checkers are likely the same ones making efficient compilation difficult. Perhaps we'll eventually see a Python 3.40 with complete JIT compilation, functioning similarly to Julia but retaining Python's extensive ecosystem, which makes it essential in certain domains.
I also mentioned this up-thread, but https://pypi.org/project/django-types/ is compatible with pyright without plugins, so it should theoretically work with ty. It's not quite as good as the mypy-django plugin but it still catches a lot.
SQLAlchemy 2.x has direct support for mypy; it works out of the box, no longer needing mypy plugins. Many things in SQLAlchemy are still dynamic and can't be type checked, but the native support works great where it can.
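The 2.x declarative style looks roughly like this (a small sketch; the table and field names are made up):
```python
from sqlalchemy.orm import DeclarativeBase, Mapped, mapped_column

class Base(DeclarativeBase):
    pass

class User(Base):
    __tablename__ = "user_account"

    id: Mapped[int] = mapped_column(primary_key=True)
    name: Mapped[str]             # non-nullable column, typed str
    nickname: Mapped[str | None]  # nullable column, typed str | None

user = User(name="spongebob")
# Attribute access is typed for any PEP 484 checker:
# user.name is str, user.nickname is str | None.
```
The trick is that the types are declared first (`Mapped[...]`) and the runtime column machinery is derived from them, instead of the other way around.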
All type checkers other than mypy (e.g. pyright, IntelliJ) have ignored the level of plugin support necessary to make Django work well, and so they are DOA for any large existing Django codebase. Unless ruff decides to support a dynamic interface like mypy's, it'll fare no better.
There was an effort to create a type-checking plugin interface for dataclass-style transforms for Python type checkers, but what was merged was so lacking that one couldn't even build something close to django-stubs with it.
Pyright doesn't work with Django, as Django is so dynamic that it requires a plugin to infer all types correctly. Sadly, even mypy with plugins is a mess to set up in VS Code, especially if you want it to use the same config as the CI checks you run from the command line.
We use mypy + [django-stubs](https://github.com/typeddjango/django-stubs) (in a huge Django + drf project at my day job), which includes a plugin for mypy allowing it to recognize all reverse relations and manager methods. Mypy is still really rough around the edges. The CLI args are poorly documented, and how they correspond to declarations in a mypy.ini / pyproject.toml is mysterious. Match statements still have bugs even a year after release. Exclusion of untyped / partially typed files and packages we've had to solve by grep-filtering mypy's output against our whitelisted set of files, as mypy has been unable to properly distinguish between errors you care about (in your own codebase) and errors in other people's code (dependencies, untypable dynamic Python packages, etc.).
The largest issue IMO is that mypy tried to graft a Java/OOP-style type system onto Python, instead of recognizing the language's real power: duck typing and passing structural types around. TypeScript chose the right approach here, modelling JavaScript the way it is actually written and favoring structural over nominal typing, instead of the archaic, now left-behind Java-style OOP that has influenced mypy.
There was a recently accepted PEP which allows for limited dataclass transforms, enough to cover the @attr.s use case for both mypy and pyright, but nowhere near expressive enough to cover Django's models and ORM, sadly. It's probably impossible / undesirable to allow for such rich plugins, so I see the future of proper plugin-less typing as more akin to how pydantic / normal dataclasses solve typing: by starting with a specification of the types and deriving the runtime implementation, instead of plugins having to reverse-engineer the type representation of a custom DSL.
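That PEP is 681 (`@dataclass_transform`). A minimal sketch of the "specify types first, derive the runtime" direction it enables (a toy metaclass, nothing like Django's actual internals):
```python
from typing import dataclass_transform  # Python 3.11+; typing_extensions before that

@dataclass_transform()
class ModelMeta(type):
    """Synthesizes a keyword-only __init__ from the class annotations."""
    def __new__(mcls, name, bases, namespace):
        cls = super().__new__(mcls, name, bases, namespace)
        fields = namespace.get("__annotations__", {})
        if fields:
            def __init__(self, **kwargs):
                for field in fields:
                    setattr(self, field, kwargs[field])
            cls.__init__ = __init__
        return cls

class Model(metaclass=ModelMeta):
    pass

class User(Model):
    id: int
    name: str

u = User(id=1, name="alice")  # checkers synthesize this signature, no plugin needed
```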
Did you manage to get MyPy + Django to work together usefully? I tried the plugin you mentioned but it still seemed stymied by the dynamic nature of Django (reverse relations, etc), so I gave up on it.
If it’s actually possible to live in typed nirvana with Django I’ll bang my head against the wall some more.
We’re using Django 3.2 btw + FactoryBoy.
Actually, there are a number of annoyingly dynamic Python libraries out there where so many methods are created dynamically that it feels a bit like playing whack-a-mole. Is the situation like TypeScript, where you need a MyLibrary.types.ts for each library?
It's quite useful in CI, and as a `make mypy` you can run before pushing up your code; but for interactive errors we all use pyright, which is a bit of a letdown because you don't get autocomplete for fields and models accessed through reverse-relation managers etc.; pyright doesn't know about them. Many packages ship types nowadays, so the type coverage isn't too bad for us right now.
Here's our plugin setup in mypy.ini in case this helps (Django 4.2 + drf 3.14)
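It follows the standard django-stubs + djangorestframework-stubs shape; `mycompany.settings` below is a placeholder for the real settings module:
```ini
[mypy]
plugins =
    mypy_django_plugin.main,
    mypy_drf_plugin.main

[mypy.plugins.django-stubs]
django_settings_module = "mycompany.settings"
```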
with this horror of a regex in make (because otherwise you'll get drowned in wrong type errors in all of the untyped files, and errors get shown from imports even if you don't care about the imported file); add more file targets as necessary:
```make
FILES_TO_MYPY = $(shell ls mycompany/*/validators.py mycompany/*/services.py mycompany/*/selectors.py mycompany/*/managers.py | sort | uniq)

# 1) We have to grep out django-manager-missing like this until the following bug
#    is fixed: https://github.com/python/mypy/issues/12987
# 2) We grep out the line of `Found 95 errors in 16 files (checked 83 source
#    files)` that now appears as we use follow-imports: silent, because there's a
#    bug where errors from imported modules are counted against the total even
#    though they aren't emitted. If any real errors appear we get them as a
#    separate line anyways.
.PHONY: mypy
mypy:
	@{ MYPY_FORCE_COLOR=${NOT_CI} $(VENV)/bin/mypy --config-file mypy.ini $(FILES_TO_MYPY) 2>&3 | grep -v 'django-manager-missing\|errors in'; } 3>&1 | tee $(HYRE_TESTS_OUTPUT_PATH)/mypy.stdout.txt
```
This allows you to get proper errors for things like
```python
model = MyModel.objects.get()
othermodel = model.othermodel_set.first()
reveal_type(othermodel)  # note: Revealed type is "Union[mycompany.importpath.models.OtherModel, None]"
```
and even errors on typos like
```python
model = MyModel.objects.get()
othermodel = model.ooooothermodel_set.first()  # error: MyModel has no attribute "ooooothermodel_set", perhaps you meant "othermodel_set"
```