I have a base schema and an extended schema, shown below.

./resources/json-schemas/simple-person.schema

{
  "$id": "http://example.com/json-schemas/simple-person.schema",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Simple Person",
  "type": "object",
  "people": {
    "items": {
      "properties": {
        "name": {
          "type": "string",
          "description": "The person's name."
        },
        "age": {
          "type": "integer",
          "description": "The person's age."
        }
      },
      "required": [
        "name",
        "age"
      ]
    }
  }
}

./extended-person.schema

{
  "$id": "http://example.com/extended-person.schema",
  "$schema": "https://json-schema.org/draft/2020-12/schema",
  "title": "Extended Person",
  "type": "object",
  "allOf": [
    {
      "$ref": "http://example.com/json-schemas/simple-person.schema"
    },
    {
      "people": {
        "items": {
          "properties": {
            "height": {
              "type": "number",
              "description": "The person's height in centimeters."
            },
            "required": [
              "height"
            ]
          }
        }
      }
    }
  ]
}

I then have a dataset instance that I want to validate against the extended schema.

./person-dataset.json

{
    "people": [
        {
            "name": "Bob",
            "age": 25,
            "new value": "value"
        }
    ]
}

I would expect the validation to fail, but it passes with the code below:

from pathlib import Path

import json

from referencing import Registry, Resource
from referencing.exceptions import NoSuchResource
from jsonschema import Draft202012Validator


def retrieve_from_filesystem(uri: str):
    SCHEMAS = Path("./resources/json-schemas/")

    if uri.startswith("http://example.com/json-schemas/"):
        path = SCHEMAS / Path(uri.removeprefix("http://example.com/json-schemas/"))
    else:
        raise NoSuchResource(ref=uri)

    contents = json.loads(path.read_text())

    return Resource.from_contents(contents)

registry = Registry(retrieve=retrieve_from_filesystem)

# read_text() closes the file handles, unlike a bare Path.open()
schema = json.loads(Path("./extended-person.schema").read_text())
instance = json.loads(Path("./person-dataset.json").read_text())
validator = Draft202012Validator(schema, registry=registry)

validator.validate(instance)

I expected this to fail for two reasons:

  1. No "height" property is included in the dataset.
  2. A new property, "new value", is included in the dataset but isn't specified in either schema.

How do I fix the schemas so that datasets like this fail validation?

2 Answers


  1. Chosen as BEST ANSWER

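    It turns out that keyword placement matters. A validator ignores keywords it does not recognize, so a "people" key at the schema root constrains nothing; each level needs an explicit "properties" (for objects) or "items" (for arrays) keyword. To get the desired behavior, I had to update my files like so: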

    ./resources/json-schemas/simple-person.schema

    {
      "$id": "http://example.com/json-schemas/simple-person.schema",
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "title": "Simple Person",
      "type": "object",
      "properties": {
        "people": {
          "type": "array",
          "items": {
            "type": "object",
            "properties": {
              "name": {
                "type": "string",
                "description": "The person's name."
              },
              "age": {
                "type": "integer",
                "description": "The person's age."
              }
            },
            "required": [
              "name",
              "age"
            ]
          }
        }
      },
      "required": [
        "people"
      ]
    }
    

    ./extended-person.schema

    {
      "$id": "http://example.com/extended-person.schema",
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "title": "Extended Person",
      "type": "object",
      "allOf": [
        {
          "$ref": "http://example.com/json-schemas/simple-person.schema"
        },
        {
          "type": "object",
          "properties": {
            "people": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "height": {
                    "type": "number",
                    "description": "The person's height in centimeters."
                  }
                },
                "required": [
                  "height"
                ]
              }
            }
          }
        }
      ]
    }
    

    This makes the following dataset fail, because it lacks the "height" property that the extension requires:

    {
        "people": [
            {
                "name": "Bob",
                "age": 25
            }
        ]
    }
    

    However, "additionalProperties" only sees the "properties" declared in the same schema object, not the ones pulled in through the "$ref" in the other allOf branch. To forbid extra properties, the extension therefore has to redeclare "name" and "age" next to "height". A bare `true` schema is enough for them, since the base schema still checks their types. So my extension can change to

    {
      "$id": "http://example.com/extended-person.schema",
      "$schema": "https://json-schema.org/draft/2020-12/schema",
      "title": "Extended Person",
      "type": "object",
      "allOf": [
        {
          "$ref": "http://example.com/json-schemas/simple-person.schema"
        },
        {
          "type": "object",
          "properties": {
            "people": {
              "type": "array",
              "items": {
                "type": "object",
                "properties": {
                  "name": true,
                  "age": true,
                  "height": {
                    "type": "number",
                    "description": "The person's height in centimeters."
                  }
                },
                "required": [
                  "name",
                  "age",
                  "height"
                ],
                "additionalProperties": false
              }
            }
          }
        }
      ]
    }
    

    Then the following dataset will fail, because "new value" is not a declared property:

    {
        "people": [
            {
                "name": "Bob",
                "age": 25,
                "height": 145,
                "new value": "new value"
            }
        ]
    }
    

    I don't really like that I have to redefine the required properties in the extension. If there is a workaround to this, I would be interested in knowing.


  2. This one is a bit tricky because you are adding constraints to an existing schema inside a nested array:

    • Define the extension with properties > people > items.
    • Add the new property definition for height.
    • Inside items, $ref the nested subschema of simple-person.schema that you are extending.
    • Add unevaluatedProperties: false. Unlike additionalProperties, it can see through the $ref, so it knows that name and age are defined by the base schema, allows height, and rejects any other property.
    {
        "$id": "http://example.com/json-schemas/extended-person.schema",
        "$schema": "https://json-schema.org/draft/2020-12/schema",
        "type": "object",
        "properties": {
            "people": {
                "type": "array",
                "items": {
                    "unevaluatedProperties": false,
                    "$ref": "simple-person.schema#/properties/people/items",
                    "properties": {
                        "height": {
                            "type": "number"
                        }
                    },
                    "required": [
                        "height"
                    ]
                }
            }
        },
        "required": ["people"]
    }
    

    This article is a great reference for modeling inheritance. JSON Schema doesn't fully support inheritance, but you can model it quite closely with some concessions.
    https://json-schema.org/blog/posts/modelling-inheritance


    P.S. Since draft 2019-09, $ref is allowed alongside sibling keywords such as properties or other applicators (except another $ref), so you can simplify the schema instead of using allOf to compose multiple schemas in this fashion.
