Python: JSON schema validation

In this tutorial we will learn how to perform JSON schema validations using Python and the jsonschema module.

Introduction

To install the module with pip, run the following command:

pip install jsonschema

In this introductory tutorial we will cover a very simple use case where we will validate two JSON objects against a schema: one of them will be valid and the other invalid.

Note that in the code below the JSON objects are represented as Python dictionaries, so for simplicity the two terms are used interchangeably.
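To make the relation between JSON text and Python dictionaries concrete, here is a minimal sketch using the standard library json module. The raw_json string is made-up sample data, not something taken from a real service.

```python
import json

# A JSON document as it might arrive from a file or an HTTP request (made-up data).
raw_json = '{"name": "Eggs", "age": 10}'

# json.loads parses the JSON text into a Python dictionary.
parsed = json.loads(raw_json)

print(type(parsed).__name__)
print(parsed["name"], parsed["age"])
```

A dictionary obtained this way can be passed to the validation code in the same way as one written by hand.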

The code shown below was tested on Python version 3.7.2.

The code

We will start by importing the validate function from the jsonschema module. We will also import two exception classes that we will analyze later: ValidationError and SchemaError.

from jsonschema import validate, ValidationError, SchemaError

Then we will define a JSON schema for a very simple use case. We will assume that this schema validates an object with two properties: name (a string) and age (a number). Both will be required.

We will define the schema as a Python dictionary.

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
    "required": ["age", "name"],
}

After this we will define two additional dictionaries to represent JSON objects: one that is valid and another that is invalid according to the previous schema.

In the invalid JSON, we will set the age property to a string, which violates the schema restriction that it must be a number.

validJson = {"name": "Eggs", "age": 10}
invalidJson = {"name": "Eggs", "age": "string"}

Then we will call the validate function to validate the JSON against the schema. We will start with the JSON that respects the schema restrictions.

As the first argument of the validate function we will pass the dictionary representing the JSON we want to evaluate. As the second argument we will pass the dictionary representing the schema.

When calling this function, there are two exceptions that can be raised:

  • SchemaError: The schema used is invalid.
  • ValidationError: The JSON is invalid for the provided schema.

In this case, since the JSON is valid according to the schema, we don’t need to worry about these exceptions. Naturally, this is a simplification under controlled testing conditions. In a real application, where we don’t know in advance whether the JSON is valid, we should handle both exceptions.

validate(validJson, schema)
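As a brief aside, the SchemaError case listed above can be reproduced with a deliberately malformed schema. The sketch below assumes a made-up broken_schema whose "type" value is not one of the type names JSON Schema defines; the variable names are only illustrative.

```python
from jsonschema import validate, SchemaError

# Deliberately broken schema: "text" is not a valid JSON Schema type name.
broken_schema = {"type": "text"}

schema_error_raised = False
try:
    # validate checks the schema itself before it looks at the instance.
    validate({"name": "Eggs"}, broken_schema)
except SchemaError as e:
    schema_error_raised = True
    print("There is an error with the schema:", e.message)
```

Here the exception comes from the schema alone, regardless of the JSON passed as the first argument.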

Now, for the second case, we are going to call the validate function for the invalid JSON. This time we will enclose the call in a try/except block.

The first exception we are going to handle is the SchemaError. We will simply print a message telling the user that there is an error with the schema. Note that for our use case the schema is correct, so this exception should not be raised.

To handle the ValidationError we will print the exception itself, which outputs a human-readable message that explains the error [1].

As can be seen in the documentation, this exception class has many other attributes that we can use to process the validation error.

From these, we are going to print absolute_path and absolute_schema_path as examples. The absolute_path attribute contains the path to the offending element, as a deque [1]. The absolute_schema_path attribute contains the path to the failed validator within the schema, also as a deque [1].

try:
    validate(invalidJson, schema)
except SchemaError:
    print("There is an error with the schema")
except ValidationError as e:
    print(e)

    print("---------")
    print(e.absolute_path)

    print("---------")
    print(e.absolute_schema_path)
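The same pattern also catches a missing required property. The following is a small sketch under the assumption of a hypothetical missing_age dictionary that omits age; it additionally reads the validator and message attributes of ValidationError, which hold the failed schema keyword and the short error text.

```python
from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
    "required": ["age", "name"],
}

# Hypothetical input that omits the required "age" property.
missing_age = {"name": "Eggs"}

failed_keyword = None
instance_path = None
schema_path = None

try:
    validate(missing_age, schema)
except ValidationError as e:
    failed_keyword = e.validator                # schema keyword that failed
    instance_path = list(e.absolute_path)       # empty: the object itself is at fault
    schema_path = list(e.absolute_schema_path)  # location of the failed keyword in the schema
    print(e.message)
```

Since the problem is an absent property rather than a bad value, the path to the offending element is empty and the schema path points at the required keyword.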

The final code can be seen below.

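from jsonschema import validate, ValidationError, SchemaError

# Schema: an object with a string "name" and a numeric "age", both required.
schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
    "required": ["age", "name"],
}

validJson = {"name": "Eggs", "age": 10}
invalidJson = {"name": "Eggs", "age": "string"}

# Valid JSON: no exception is raised.
validate(validJson, schema)

# Invalid JSON: "age" is a string, so a ValidationError is expected.
try:
    validate(invalidJson, schema)
except SchemaError:
    print("There is an error with the schema")
except ValidationError as e:
    # Human-readable description of the violation.
    print(e)

    # Path to the offending element in the JSON.
    print("---------")
    print(e.absolute_path)

    # Path to the failed validator in the schema.
    print("---------")
    print(e.absolute_schema_path)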

Testing the code

To test the code, simply run it in a tool of your choice. In my case, I’ve performed the tests in IDLE, a Python IDE.

You should get an output similar to figure 1. As can be seen, the first thing printed is a user-friendly message that explains the violation of the schema.

Then, we can see the path to the offending element. As we have seen in the code, the schema indicates that age should be a number, but in the invalid JSON it was set to a string.

In the third and last print, we can see the path to the failed validator within the schema.
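For readers who want to check the two paths without the figure, the deques can be converted to plain lists. This is a minimal, self-contained sketch; the names invalid_json, offending_path and failed_rule_path are illustrative and not part of the original code.

```python
from jsonschema import validate, ValidationError

schema = {
    "type": "object",
    "properties": {
        "name": {"type": "string"},
        "age": {"type": "number"},
    },
    "required": ["age", "name"],
}

invalid_json = {"name": "Eggs", "age": "string"}

offending_path = None
failed_rule_path = None

try:
    validate(invalid_json, schema)
except ValidationError as e:
    # Convert the deques to lists so they are easy to compare and print.
    offending_path = list(e.absolute_path)
    failed_rule_path = list(e.absolute_schema_path)
    print(offending_path)
    print(failed_rule_path)
```

The first list should be ['age'], the offending property, and the second ['properties', 'age', 'type'], the location of the type restriction inside the schema.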

Figure 1 – Output of the program.

References

[1] https://python-jsonschema.readthedocs.io/en/latest/errors/#jsonschema.exceptions.ValidationError
