Instead of my usual Twitter and Fediverse threads, for this release of cattrs I figured I'd try something different. A blog post lets me describe the additions in more detail, provide context and usage examples, and produce a permanent record that can be linked to from the relevant places, like a GitHub release page and the cattrs changelog.
cattrs is a library for transforming Python data structures, the most obvious use case being de/serialization (to JSON, msgpack, YAML, and other formats).
Tagged Unions
cattrs has supported unions of attrs classes for a long time through our default automatic disambiguation strategy. This is a very simple way of supporting unions using no extra configuration. The way it works is: we examine every class in the union, find unique, mandatory attribute names for each class, and generate a function using that information to do the actual structuring. (Other unions are supported via manually-written hooks.)
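To make the idea concrete, here's a rough, stdlib-only sketch of unique-field disambiguation using dataclasses. It is not how cattrs implements it internally; `make_disambiguator`, `Cat` and `Dog` are hypothetical names made up for this illustration, and the fallback rule for a single class without unique fields is an assumption of the sketch.

```python
from dataclasses import MISSING, dataclass, fields
from typing import Any, Callable, Mapping


def make_disambiguator(*classes: type) -> Callable[[Mapping[str, Any]], type]:
    """Build a function that picks a union member from a payload's keys."""
    # Mandatory fields are the ones without any default.
    required = {
        cls: {
            f.name
            for f in fields(cls)
            if f.default is MISSING and f.default_factory is MISSING
        }
        for cls in classes
    }
    all_names = {cls: {f.name for f in fields(cls)} for cls in classes}

    checks = []  # (unique field name, class) pairs, tried in order
    fallback = None  # at most one class may lack a unique mandatory field
    for cls in classes:
        others = set().union(*(all_names[o] for o in classes if o is not cls))
        unique = required[cls] - others
        if unique:
            checks.append((sorted(unique)[0], cls))
        elif fallback is None:
            fallback = cls
        else:
            raise TypeError(f"{cls.__name__} has no unique mandatory field")

    def disambiguate(payload: Mapping[str, Any]) -> type:
        for name, cls in checks:
            if name in payload:
                return cls
        if fallback is not None:
            return fallback
        raise ValueError("payload matches no member of the union")

    return disambiguate


@dataclass
class Cat:
    name: str
    lives: int


@dataclass
class Dog:
    name: str
    breed: str


pick = make_disambiguator(Cat, Dog)
print(pick({"name": "Rex", "breed": "lab"}).__name__)  # Dog
```

Note that `name` is useless for telling the two classes apart since both have it; only `lives` and `breed` end up in the generated checks.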
But what if one of your classes has no unique attributes, or you just want to be able to tell the union member from a glance at the payload? Now you can use the tagged unions strategy.
This strategy adds a field to the unstructured payload (named _type by default, but configurable) holding a piece of data (by default the name of the class, again configurable) to help with structuring.
This strategy isn't the default, so you'll have to import it and configure it on a union-by-union basis.
from attrs import define
from cattrs import Converter
from cattrs.strategies import configure_tagged_union

@define
class A:
    a: int

@define
class B:
    a: str

c = Converter()
configure_tagged_union(A | B, c)

c.unstructure(A(1), unstructure_as=A | B)
# {"a": 1, "_type": "A"}

c.structure({"a": 1, "_type": "A"}, A | B)
# A(1)
A useful feature of configure_tagged_union is that you can give it a default member class. This is a good way of evolving an API from a single class to a union in a backwards-compatible way.
from attrs import define
from cattrs import Converter
from cattrs.strategies import configure_tagged_union

@define
class Request:
    @define
    class A:
        field: int

    payload: A

c = Converter()
c.structure({"payload": {"field": 1}}, Request)  # Request(A(1))

# Next iteration:

@define
class Request:
    @define
    class A:
        field: int

    @define
    class B:
        field: int

    payload: A | B

c = Converter()
# A and B are nested inside Request, so they are referenced through it here.
configure_tagged_union(Request.A | Request.B, c, default=Request.A)  # No type info means `A`
c.structure({"payload": {"field": 1}}, Request)  # Still Request(A(1))
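If you're curious what the tagging boils down to, here's a minimal stdlib-only sketch of the idea, including the default-member fallback. It's an illustration under my own assumptions and not the cattrs implementation: `make_tagged_union` is a hypothetical helper, and it uses dataclasses instead of attrs to stay dependency-free.

```python
from dataclasses import asdict, dataclass
from typing import Any, Optional


def make_tagged_union(
    *classes: type, tag_name: str = "_type", default: Optional[type] = None
):
    """Return (unstructure, structure) functions for a tagged union."""
    by_tag = {cls.__name__: cls for cls in classes}

    def unstructure(obj: Any) -> dict:
        if type(obj) not in classes:
            raise TypeError(f"{type(obj).__name__} is not a member of the union")
        payload = asdict(obj)
        payload[tag_name] = type(obj).__name__  # insert the tag
        return payload

    def structure(payload: dict) -> Any:
        data = dict(payload)  # copy so the caller's dict isn't mutated
        tag = data.pop(tag_name, None)
        if tag is None:
            if default is None:
                raise ValueError(f"payload has no {tag_name!r} and no default is set")
            cls = default  # untagged payloads fall back to the default member
        else:
            cls = by_tag[tag]
        return cls(**data)

    return unstructure, structure


@dataclass
class A:
    field: int


@dataclass
class B:
    field: int


unstructure, structure = make_tagged_union(A, B, default=A)
print(unstructure(B(1)))        # {'field': 1, '_type': 'B'}
print(structure({"field": 1}))  # A(field=1)
```

The fallback is what makes the migration painless: old clients keep sending untagged payloads and still get an `A` back.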
Improved Validation Errors
cattrs has had a detailed validation mode for a few versions now, and it's enabled by default. In this mode, structuring errors are gathered and propagated out as an ExceptionGroup subclass, essentially creating a tree of errors mirroring the desired data structure. This ExceptionGroup can then be printed out using normal Python tooling for printing exceptions.
Still, sometimes you need a more succinct representation of your errors; for example, if you need to display them to a user or return them to a web frontend. So now we have a simple transformer function available:
from attrs import define
from cattrs import structure, transform_error

@define
class Class:
    a_list: list[int]
    a_dict: dict[str, int]

try:
    structure({"a_list": ["a"], "a_dict": {"str": "a"}}, Class)
except Exception as exc:
    print(transform_error(exc))

[
    'invalid value for type, expected int @ $.a_list[0]',
    "invalid value for type, expected int @ $.a_dict['str']"
]
As you can see, we generate a list of readable(-ish) error messages, including a path to every field. This can be customized, or you can copy/paste the transform_error function and just alter it directly if you require absolute control. Learn more here.
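To give a feel for how an error tree turns into a flat list of messages with paths, here's a small stdlib-only sketch. It's an assumption-laden toy, not the actual transform_error code: `ErrorNode` and `flatten_errors` are hypothetical names, and the real thing walks ExceptionGroups rather than a hand-built tree.

```python
from dataclasses import dataclass, field


@dataclass
class ErrorNode:
    """One node of a hypothetical error tree mirroring the data structure."""

    segment: str  # path fragment, e.g. ".a_list", "[0]" or "['str']"
    message: str = ""  # only meaningful on leaves
    children: list["ErrorNode"] = field(default_factory=list)


def flatten_errors(node: ErrorNode, path: str = "$") -> list[str]:
    """Walk the tree depth-first, building a path for every leaf error."""
    here = path + node.segment
    if not node.children:
        return [f"{node.message} @ {here}"]
    messages = []
    for child in node.children:
        messages.extend(flatten_errors(child, here))
    return messages


msg = "invalid value for type, expected int"
tree = ErrorNode(
    "",
    children=[
        ErrorNode(".a_list", children=[ErrorNode("[0]", msg)]),
        ErrorNode(".a_dict", children=[ErrorNode("['str']", msg)]),
    ],
)

for line in flatten_errors(tree):
    print(line)
```

Each level of the tree contributes one path segment, which is why the final messages can point at the exact offending field.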
Typed Dicts
cattrs now supports TypedDicts on all supported Python versions. Due to spotty TypedDict functionality in earlier Pythons, I recommend you use TypedDict from typing_extensions when running on 3.9 or earlier. This is the reason cattrs now depends on typing_extensions on those versions.
from typing import TypedDict
from datetime import datetime
from cattrs.preconf.json import make_converter

converter = make_converter()

class MyDict(TypedDict):
    my_datetime: datetime

converter.structure({"my_datetime": "2023-05-01T00:00:00Z"}, MyDict)
# {'my_datetime': datetime.datetime(2023, 5, 1, 0, 0, tzinfo=datetime.timezone.utc)}
Generic TypedDicts are supported on 3.11+ (a language limitation), and totalities, Required and NotRequired are supported regardless.
The TypedDict implementation leverages the existing attrs/dataclasses base so it inherits most of the features. For example, structuring and unstructuring hooks can be customized to rename or omit keys. Here's an example with the forbid_extra_keys functionality:
from typing import TypedDict
from cattrs import Converter
from cattrs.gen.typeddicts import make_dict_structure_fn

class MyTypedDict(TypedDict):
    a: int

c = Converter()
c.register_structure_hook(
    MyTypedDict, make_dict_structure_fn(MyTypedDict, c, _cattrs_forbid_extra_keys=True)
)

c.structure({"a": 1, "b": 2}, MyTypedDict)  # Raises an exception
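For the curious, totality and extra-key checking both come down to key metadata that TypedDict classes expose at runtime (on Python 3.9+). Here's a stdlib-only sketch of that idea; `check_keys`, `UserBase` and `User` are hypothetical names for illustration, and this is not how cattrs generates its hooks.

```python
from typing import Any, Mapping, TypedDict


class UserBase(TypedDict):
    name: str  # required, since total=True is the default


class User(UserBase, total=False):
    nickname: str  # optional, since this class is declared with total=False


def check_keys(
    payload: Mapping[str, Any], td: type, forbid_extra: bool = False
) -> list[str]:
    """Report missing required keys and, optionally, unknown keys."""
    problems = []
    present = set(payload)
    for key in sorted(td.__required_keys__ - present):
        problems.append(f"missing required key {key!r}")
    if forbid_extra:
        known = td.__required_keys__ | td.__optional_keys__
        for key in sorted(present - known):
            problems.append(f"extra key {key!r}")
    return problems


print(check_keys({"nickname": "tin"}, User))
# ["missing required key 'name'"]
print(check_keys({"name": "Tin", "age": 3}, User, forbid_extra=True))
# ["extra key 'age'"]
```

Without `forbid_extra`, unknown keys are simply ignored, which matches the lenient default described above.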
New Markdown Docs
The docs have been rewritten using Markdown and MyST! We can finally link like civilized people and not animals, so I'll be going through the docs and making them more interconnected. The theme has also been tweaked to be more airy and (in my opinion) better looking. The new docs are live at https://catt.rs now.
Misc
There are many smaller changes in this release; I suggest inspecting the actual changelog. A quick shout out to the include_subclasses strategy by Matthieu Melot!
What's Next
So I don't actually know exactly what'll end up in the next version of cattrs since I don't work via a strict roadmap and I can't predict what folks will contribute.
What I think will probably happen is the creation of some sort of OpenAPI/jsonschema and cattrs wrapper library. It's something folks have expressed interest in, and I already have the bones of it in the uapi project.
I'll also continue work on fleshing out the cattrs.v validation subsystem. This will probably go hand-in-hand with efforts in attrs and Mypy making operations on class attributes type-safe.
I'll also almost certainly expand both our union strategies to additionally handle enums and literals automatically, enabling true sum type support by default.