Intro to cattrs 23.1.0

Instead of my usual Twitter and Fediverse threads, for this release of cattrs I figured I'd try something different. A blog post lets me describe the additions in more detail, provide context and usage examples, and produces a permanent record that can be linked to from the relevant places, like a GitHub release page and the cattrs changelog.

cattrs is a library for transforming Python data structures, the most obvious use case being de/serialization (to JSON, msgpack, YAML, and other formats).

Tagged Unions

cattrs has supported unions of attrs classes for a long time through our default automatic disambiguation strategy. This is a very simple way of supporting unions using no extra configuration. The way it works is: we examine every class in the union, find unique, mandatory attribute names for each class, and generate a function using that information to do the actual structuring. (Other unions are supported via manually-written hooks.)
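To make the idea concrete, here is a rough standard-library sketch of unique-field disambiguation. It is not the code cattrs generates: it uses dataclasses instead of attrs, the names `make_disambiguator`, `Cat` and `Dog` are made up for illustration, and it simply rejects any class without a unique mandatory field.

```python
from dataclasses import MISSING, dataclass, fields


@dataclass
class Cat:
    name: str
    meows: bool


@dataclass
class Dog:
    name: str
    barks: bool


def make_disambiguator(*classes):
    """Return a function mapping a payload dict to one of `classes`."""
    all_names = {cls: {f.name for f in fields(cls)} for cls in classes}
    unique = {}
    for cls in classes:
        # Mandatory fields are the ones without any default.
        mandatory = {
            f.name
            for f in fields(cls)
            if f.default is MISSING and f.default_factory is MISSING
        }
        # Names that appear in any other member cannot identify this one.
        taken = set().union(*(all_names[o] for o in classes if o is not cls))
        unique[cls] = mandatory - taken
        if not unique[cls]:
            raise TypeError(f"{cls.__name__} has no unique mandatory field")

    def disambiguate(payload):
        # The first class with one of its unique names in the payload wins.
        for cls, names in unique.items():
            if names.intersection(payload):
                return cls
        raise ValueError("payload matches no member of the union")

    return disambiguate


pick = make_disambiguator(Cat, Dog)
payload = {"name": "Rex", "barks": True}
print(pick(payload)(**payload))  # Dog(name='Rex', barks=True)
```

The shared `name` field is ignored; only `meows` and `barks` carry any information about which class a payload belongs to.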

But what if one of your classes has no unique attributes, or you just want to be able to tell the union member from a glance at the payload? Now you can use the tagged unions strategy.

This strategy adds a field to the unstructured payload (named _type by default, but configurable) that holds a tag identifying the class (by default the class name, again configurable), which is then used when structuring.

This strategy isn't the default so you'll have to import it and configure it on a union-by-union basis.

from attrs import define
from cattrs import Converter
from cattrs.strategies import configure_tagged_union

@define
class A:
    a: int
   
@define
class B:
    a: str
    
c = Converter()
configure_tagged_union(A | B, c)

c.unstructure(A(1), unstructure_as=A|B)
# {"a": 1, "_type": "A"}
c.structure({"a": 1, "_type": "A"}, A|B)
# A(1)

A useful feature of configure_tagged_union is that you can give it a default member class. This is a good way of evolving an API from a single class to a union in a backwards-compatible way.

from attrs import define
from cattrs import Converter
from cattrs.strategies import configure_tagged_union

@define
class Request:
    @define
    class A:
        field: int
        
    payload: A

c = Converter()
c.structure({"payload": {"field": 1}}, Request)  # Request(A(1))

# Next iteration:
@define
class Request:
    @define
    class A:
        field: int

    @define
    class B:
        field: int
        
    payload: A | B


c = Converter()
configure_tagged_union(Request.A | Request.B, c, default=Request.A)  # No type info means `A`

c.structure({"payload": {"field": 1}}, Request)  # Still Request(A(1))
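If you're curious what tagging with a default boils down to, here is a small standard-library sketch. It only illustrates the idea under my own assumptions and is not how cattrs implements the strategy; `unstructure_tagged`, `structure_tagged`, `TAG_NAME` and `MEMBERS` are hypothetical names.

```python
from dataclasses import asdict, dataclass


@dataclass
class A:
    field: int


@dataclass
class B:
    field: int


TAG_NAME = "_type"  # mirrors the default tag name described above
MEMBERS = {cls.__name__: cls for cls in (A, B)}  # tag -> union member


def unstructure_tagged(obj):
    """Dump the instance to a dict and add the class name as the tag."""
    return {**asdict(obj), TAG_NAME: type(obj).__name__}


def structure_tagged(payload, default=None):
    """Pick the class from the tag; untagged payloads use `default`."""
    data = dict(payload)  # copy so the caller's dict is not mutated
    tag = data.pop(TAG_NAME, None)
    if tag is None:
        if default is None:
            raise ValueError("untagged payload and no default member")
        cls = default
    else:
        cls = MEMBERS[tag]
    return cls(**data)


print(unstructure_tagged(B(1)))  # {'field': 1, '_type': 'B'}
print(structure_tagged({"field": 1, "_type": "B"}))  # B(field=1)
print(structure_tagged({"field": 1}, default=A))  # A(field=1)
```

The last line is the backwards-compatibility case: an old, untagged payload still lands on the original class.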

Improved Validation Errors

cattrs has had a detailed validation mode for a few versions now, and it's enabled by default. In this mode, structuring errors are gathered and propagated out as an ExceptionGroup subclass, essentially creating a tree of errors mirroring the desired data structure. This ExceptionGroup can then be printed out using normal Python tooling for printing exceptions.

Still, sometimes you need a more succinct representation of your errors; for example if you need to display it to a user or return it to a web frontend. So now we have a simple transformer function available:

from attrs import define
from cattrs import structure, transform_error

@define
class Class:
    a_list: list[int]
    a_dict: dict[str, int]

try:
    structure({"a_list": ["a"], "a_dict": {"str": "a"}}, Class)
except Exception as exc:
    print(transform_error(exc))

[
    'invalid value for type, expected int @ $.a_list[0]',
    "invalid value for type, expected int @ $.a_dict['str']"
]

As you can see, we generate a list of readable(-ish) error messages, each including the path to the offending field. This can be customized, or you can copy/paste the transform_error function and just alter it directly if you require absolute control. The validation section of the cattrs docs goes into more detail.
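To show where those paths come from, here is a toy standard-library sketch that walks a payload and builds the same kind of `$`-rooted path for each bad value. It is a simplification of my own and not what transform_error does internally (cattrs works from the raised exceptions, while this just checks types up front); `collect_errors` and `check_fields` are hypothetical helpers.

```python
def collect_errors(value, expected, path):
    """Return one message per leaf value that is not an `expected` instance."""
    if isinstance(value, list):
        errors = []
        for index, item in enumerate(value):
            # List elements extend the path with [index].
            errors.extend(collect_errors(item, expected, f"{path}[{index}]"))
        return errors
    if isinstance(value, dict):
        errors = []
        for key, item in value.items():
            # Dict values extend the path with [repr(key)].
            errors.extend(collect_errors(item, expected, f"{path}[{key!r}]"))
        return errors
    if not isinstance(value, expected):
        return [f"invalid value for type, expected {expected.__name__} @ {path}"]
    return []


def check_fields(payload, expected=int):
    """Check every top-level field, addressing fields as $.name."""
    errors = []
    for name, value in payload.items():
        errors.extend(collect_errors(value, expected, f"$.{name}"))
    return errors


errors = check_fields({"a_list": ["a", 2], "a_dict": {"str": "a"}})
for message in errors:
    print(message)
# invalid value for type, expected int @ $.a_list[0]
# invalid value for type, expected int @ $.a_dict['str']
```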

Typed Dicts

cattrs now supports TypedDicts on all supported Python versions. Due to spotty TypedDict functionality in earlier Pythons, I recommend you use TypedDict from typing_extensions when running on 3.9 or earlier. This is the reason cattrs now depends on typing_extensions on those versions.

from typing import TypedDict
from datetime import datetime

from cattrs.preconf.json import make_converter

converter = make_converter()

class MyDict(TypedDict):
    my_datetime: datetime

converter.structure({"my_datetime": "2023-05-01T00:00:00Z"}, MyDict)
# {'my_datetime': datetime.datetime(2023, 5, 1, 0, 0, tzinfo=datetime.timezone.utc)}

Generic TypedDicts are supported on 3.11+ (a language limitation), and totalities, Required and NotRequired are supported regardless.
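As a quick refresher on what totality means, here is a standard-library-only sketch (Python 3.9+) showing how a TypedDict records which keys are required. It uses `total=False` rather than Required/NotRequired only so that it runs without typing_extensions on older Pythons; `MovieBase`, `Movie` and `missing_keys` are hypothetical names, and the helper is not part of cattrs.

```python
from typing import TypedDict


class MovieBase(TypedDict):
    title: str  # required: classes are total by default


class Movie(MovieBase, total=False):
    year: int  # optional: declared in a non-total class


def missing_keys(payload, typed_dict):
    """Return the required keys of `typed_dict` that `payload` lacks, sorted."""
    return sorted(typed_dict.__required_keys__.difference(payload))


print(sorted(Movie.__required_keys__))  # ['title']
print(sorted(Movie.__optional_keys__))  # ['year']
print(missing_keys({"year": 1999}, Movie))  # ['title']
```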

The TypedDict implementation leverages the existing attrs/dataclasses base so it inherits most of the features. For example, structuring and unstructuring hooks can be customized to rename or omit keys. Here's an example with the forbid_extra_keys functionality:

from typing import TypedDict

from cattrs import Converter
from cattrs.gen.typeddicts import make_dict_structure_fn


class MyTypedDict(TypedDict):
    a: int


c = Converter()
c.register_structure_hook(
    MyTypedDict, make_dict_structure_fn(MyTypedDict, c, _cattrs_forbid_extra_keys=True)
)

c.structure({"a": 1, "b": 2}, MyTypedDict)  # Raises an exception

New Markdown Docs

The docs have been rewritten using Markdown and MyST! We can finally link like civilized people and not animals, so I'll be going through the docs and making them more interconnected. The theme has also been tweaked to be more airy and (in my opinion) better looking. The new docs are live at https://catt.rs now.

Misc

There are many more smaller changes in this release; I suggest inspecting the actual changelog. A quick shout out to the include_subclasses strategy by Matthieu Melot!

What's Next

So I don't actually know exactly what'll end up in the next version of cattrs since I don't work via a strict roadmap and I can't predict what folks will contribute.

What I think will probably happen is the creation of some sort of wrapper library combining OpenAPI/JSON Schema and cattrs. It's something folks have expressed interest in, and I already have the bones of it in the uapi project.

I'll also continue work on fleshing out the cattrs.v validation subsystem. This will probably go hand-in-hand with efforts in attrs and Mypy making operations on class attributes type-safe.

I'll also almost certainly expand both our union strategies to additionally handle enums and literals automatically, enabling true sum type support by default.

Tin
Zagreb, Croatia