Using pydantic in Python, how can I parse data where I want the key of a mapping placed into one attribute and the value placed into another attribute?

For example, imagine my data is represented as

data = {
    "pets": [
        {"felix": "cat"},
        {"rover": "dog"},
        {"snuffles": "dog"},
    ]
}

And my pydantic models are

from typing import Literal
from pydantic import BaseModel


class Pet(BaseModel):
    name: str
    species: Literal["dog", "cat"]


class Household(BaseModel):
    pets: list[Pet]

Obviously Household(**data) doesn’t work to parse the data into the class. How can I adjust the class so that this does work (efficiently)?

Ideally the data would be in this format

data_transformed = {
    "pets": [
        {"name": "felix", "species": "cat"},
        {"name": "rover", "species": "dog"},
        {"name": "snuffles", "species": "dog"},
    ]
}

And this would then work.

Household(**data_transformed)

But if the data is in the original format (data), how can I write a pydantic class to parse it?

2 Answers


  1. Chosen as BEST ANSWER

    I have solved this by adding a pydantic.root_validator to the Pet class like so:

    from typing import Literal
    from pydantic import BaseModel, root_validator
    
    
    class Pet(BaseModel):
        name: str
        species: Literal["dog", "cat"]
    
        @root_validator(pre=True)
        def reorient_kv(cls, values):
            match values:
                # Already in the expected shape: pass it through unchanged.
                case {"name": _, "species": _}:
                    return values
                # A single {name: species} pair: split it into the two fields.
                case dict() if len(values) == 1:
                    name, species = next(iter(values.items()))
                    return {"name": name, "species": species}
                case _:
                    raise ValueError(f"Cannot interpret input as a pet: {values}")
    
    
    class Household(BaseModel):
        pets: list[Pet]
    
    
    data = {
        "pets": [
            {"felix": "cat"},
            {"rover": "dog"},
            {"snuffles": "dog"},
        ]
    }
    
    Household(**data)
    
    # Household(
    #     pets=[
    #         Pet(
    #             name='felix',
    #             species='cat',
    #         ),
    #         Pet(
    #             name='rover',
    #             species='dog',
    #         ),
    #         Pet(
    #             name='snuffles',
    #             species='dog',
    #         ),
    #     ],
    # )
    

  2. from typing import Any, Literal
    from pydantic import BaseModel, validator
    
    
    class Pet(BaseModel):
        name: str
        species: Literal["dog", "cat"]
    
    
    class Household(BaseModel):
        pets: list[Pet]
    
        @validator("pets", pre=True, each_item=True)
        def dict_to_pet(cls, v: Any) -> Any:
            if not isinstance(v, dict) or len(v) != 1:
                return v  # let the default validator handle it
            # Read the single pair without mutating the caller's input dict.
            name, species = next(iter(v.items()))
            return Pet(name=name, species=species)
    
    
    obj = Household.parse_obj({
        "pets": [
            {"felix": "cat"},
            {"rover": "dog"},
            {"snuffles": "dog"},
        ]
    })
    print(obj.json(indent=4))
    

    Output:

    {
        "pets": [
            {
                "name": "felix",
                "species": "cat"
            },
            {
                "name": "rover",
                "species": "dog"
            },
            {
                "name": "snuffles",
                "species": "dog"
            }
        ]
    }
    

    This may be a matter of personal preference, but I think this is best solved by a regular validator on the Household model and not by a root validator on the Pet model.

    This seems to be a rather obscure edge-case, where the data you want to parse as a Household has weirdly-formatted pets. I would still assume that a single instance of Pet should not expect this format to ever occur, which is why I find it semantically more appropriate to perform this transformation on the Household model.

    More importantly, however, I think a simple each_item=True validator makes the intent much clearer and is more readable.
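
The reshaping that both answers perform inside a validator can also be sketched in plain Python, which may help when checking what the validators are expected to produce. This is only a minimal sketch: the function name reorient_pets is hypothetical (it is not part of pydantic), and it assumes every oddly shaped pet is a dict with exactly one name-to-species pair.

```python
from typing import Any


def reorient_pets(raw: dict[str, Any]) -> dict[str, Any]:
    """Return a copy of `raw` with single-key pet dicts expanded to name/species.

    Hypothetical helper for illustration only; not a pydantic API.
    """
    pets = []
    for entry in raw["pets"]:
        if isinstance(entry, dict) and len(entry) == 1:
            # Read the only key/value pair without mutating the caller's dict.
            name, species = next(iter(entry.items()))
            pets.append({"name": name, "species": species})
        else:
            # Leave anything else untouched so model validation can report it.
            pets.append(entry)
    return {**raw, "pets": pets}


data = {
    "pets": [
        {"felix": "cat"},
        {"rover": "dog"},
        {"snuffles": "dog"},
    ]
}

data_transformed = reorient_pets(data)
```

The result has the same shape as data_transformed in the question, so it could be passed to Household(**data_transformed) with the unmodified models. The original data is left unchanged.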
