Scope and limitations
Scope:
– Exposes an API familiar to users of the standard library marshal and pickle modules.
– Allows customization of serialization and deserialization.
Drawbacks:
– Customizing serialization and deserialization can be quite cumbersome, especially deserialization.
Main functions of the API
Serialization
Serialize obj as a JSON formatted stream to fp (a .write()-supporting file-like object) using the encoder conversion table given under Encoders and Decoders:
json.dump(obj, fp, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
Serialize obj to a JSON formatted str using the same conversion table:
json.dumps(obj, *, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, cls=None, indent=None, separators=None, default=None, sort_keys=False, **kw)
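As a minimal sketch of the difference between the two functions, the snippet below serializes the same dictionary once to a str and once to an in-memory text stream. The sample data and the use of io.StringIO as the file-like object are assumptions made for illustration; any object with a .write() method would do.

```python
import io
import json

data = {'name': 'john', 'age': 30}

# json.dumps returns the JSON document as a str.
as_text = json.dumps(data)

# json.dump writes the same document to a file-like object.
buffer = io.StringIO()
json.dump(data, buffer)

# Both calls produce the same JSON text.
same_output = buffer.getvalue() == as_text
print(as_text)
print(same_output)
```

With a real file, the buffer would typically be replaced by a file opened in text mode for writing.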
Serialization optional parameters
Both functions accept the same parameters, with the same meaning.
Some common and helpful parameters:
– indent: if it is a non-negative integer or a string, JSON array elements and object members are pretty-printed with that indent level. An indent level of 0, a negative value, or "" only inserts newlines. None (the default) selects the most compact representation.
– cls: a custom JSONEncoder subclass (e.g. one that overrides the default() method to serialize additional types) to use instead of the default JSONEncoder.
– default: a function that gets called for objects that can’t otherwise be serialized. It should return a JSON encodable version of the object or raise a TypeError. If not specified, TypeError is raised.
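The effect of indent is easiest to see on a tiny input. The following sketch, using a made-up two-element list, compares the default with an indent of 0, an integer indent, and a string indent:

```python
import json

values = [1, 2]

compact = json.dumps(values)                   # indent=None, the default
newlines_only = json.dumps(values, indent=0)   # newlines, but no indentation
two_spaces = json.dumps(values, indent=2)      # two spaces per level
tabbed = json.dumps(values, indent='\t')       # the string is used for each level

for label, text in [('compact', compact), ('newlines_only', newlines_only),
                    ('two_spaces', two_spaces), ('tabbed', tabbed)]:
    print(f'{label}={text!r}')
```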
Deserialization
Deserialize fp (a .read()-supporting text file or binary file containing a JSON document) to a Python object using the decoder conversion table given under Encoders and Decoders:
json.load(fp, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
Deserialize s (a str, bytes or bytearray instance containing a JSON document) to a Python object using the same conversion table:
json.loads(s, *, cls=None, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, object_pairs_hook=None, **kw)
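The sketch below reads the same JSON document from a str, from bytes, and from a file-like object. The document itself and the use of io.StringIO in place of a real file are assumptions made for illustration.

```python
import io
import json

document = '{"name": "john", "pets": ["witty", "drago"]}'

# json.loads accepts a str as well as bytes or bytearray.
from_str = json.loads(document)
from_bytes = json.loads(document.encode('utf-8'))

# json.load reads from any object that supports .read().
from_file = json.load(io.StringIO(document))

print(from_str)
print(from_str == from_bytes == from_file)
```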
Deserialization optional parameters
Both functions accept the same parameters, with the same meaning.
Some common and helpful parameters:
– object_hook: allows custom decoding (e.g. JSON-RPC class hinting). It is a function that is called with the result of every decoded object literal (a dict). Its return value is used instead of the dict.
– cls: a custom JSONDecoder subclass to use for deserialization instead of the default.
– parse_float: a function called with the string of every JSON float to be decoded. By default, this is equivalent to float(num_str). It can be used to select another datatype or parser for JSON floats (e.g. decimal.Decimal).
– parse_int: a function called with the string of every JSON int to be decoded. By default, this is equivalent to int(num_str). It can be used to select another datatype or parser for JSON integers (e.g. float).
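To illustrate parse_float and parse_int, here is a small sketch that decodes one document three ways. The document and the variable names are made up for the example; decimal.Decimal and float are the alternatives mentioned above.

```python
import json
from decimal import Decimal

document = '{"price": 1.10, "quantity": 3}'

# Default behaviour: floats become float, integers become int.
default_result = json.loads(document)

# parse_float receives the raw string '1.10', so Decimal keeps it exact.
exact = json.loads(document, parse_float=Decimal)

# parse_int receives the raw string '3'; here every integer becomes a float.
all_floats = json.loads(document, parse_int=float)

print(default_result)
print(exact)
print(all_floats)
```

Passing Decimal is a common choice for monetary values, where binary floating point rounding is undesirable.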
Encoders and Decoders
Simple JSON decoder:
class json.JSONDecoder(*, object_hook=None, parse_float=None, parse_int=None, parse_constant=None, strict=True, object_pairs_hook=None)
Performs the following translations in decoding by default:
JSON | Python |
---|---|
object | dict |
array | list |
string | str |
number (int) | int |
number (real) | float |
true | True |
false | False |
null | None |
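The decoding table can be checked with a short sketch that decodes one value of each JSON kind and records the resulting Python type names. The single-letter keys are arbitrary and chosen only for the example.

```python
import json

document = '{"o": {}, "a": [], "s": "x", "i": 1, "r": 1.5, "t": true, "f": false, "n": null}'

decoded = json.loads(document)

# Map every key to the name of the Python type that the decoder produced.
types = {key: type(value).__name__ for key, value in decoded.items()}
print(types)
```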
Extensible JSON encoder
class json.JSONEncoder(*, skipkeys=False, ensure_ascii=True, check_circular=True, allow_nan=True, sort_keys=False, indent=None, separators=None, default=None)
Performs the following translations in encoding by default:
Python | JSON |
---|---|
dict | object |
list, tuple | array |
str | string |
int, float, int- & float-derived Enums | number |
True | true |
False | false |
None | null |
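One consequence of the encoding table is that a round trip does not always give back the original types: a tuple is written as an array and therefore comes back as a list. The sketch below shows this, together with an int-derived Enum being encoded as a number. The Color enum is hypothetical and exists only for the example.

```python
import json
from enum import IntEnum


class Color(IntEnum):  # hypothetical enum, for illustration only
    RED = 1


original = {'point': (1, 2), 'color': Color.RED, 'ok': True, 'none': None}

encoded = json.dumps(original)
decoded = json.loads(encoded)

print(encoded)
print(type(decoded['point']).__name__)  # the tuple came back as a list
```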
Serialization examples
Basic example with only built-in types
The idea is to serialize a dictionary:

    import json

    foo_dictionary = {
        'name': 'john',
        'pets': [
            {'name': 'witty', 'age': 5},
            {'name': 'drago', 'age': 8},
        ],
        'nicknames': ['johnny', 'the player'],
    }

    # Serialize a dictionary into a JSON string
    foo_dictionary_json: str = json.dumps(foo_dictionary)
    print(f'foo_dictionary_json={foo_dictionary_json}')

    # Serialize a dictionary into a pretty-printed JSON string
    pretty_foo_dictionary_json: str = json.dumps(foo_dictionary, indent=2)
    print(f'pretty_foo_dictionary_json={pretty_foo_dictionary_json}')

Output:

    foo_dictionary_json={"name": "john", "pets": [{"name": "witty", "age": 5}, {"name": "drago", "age": 8}], "nicknames": ["johnny", "the player"]}
    pretty_foo_dictionary_json={
      "name": "john",
      "pets": [
        {
          "name": "witty",
          "age": 5
        },
        {
          "name": "drago",
          "age": 8
        }
      ],
      "nicknames": [
        "johnny",
        "the player"
      ]
    }

Example with custom object
We will show two ways:
– with the default parameter
– with the cls parameter
With the default parameter:

    import json
    from typing import List


    class Pet:
        def __init__(self, name: str, age: int) -> None:
            self.age = age
            self.name = name


    class Person:
        def __init__(self, name: str, pets: List[Pet], nicknames: List[str]) -> None:
            self.name = name
            self.pets = pets
            self.nicknames = nicknames


    person = Person('john', [Pet('witty', 5), Pet('drago', 8)], ['johnny', 'the player'])

    # We specify a default function for types that are not built in.
    # Without it, json.dumps(person) raises:
    # TypeError: Object of type Person is not JSON serializable

    # Serialize a custom object into a JSON string
    person_json: str = json.dumps(person, default=vars)
    print(f'person_json={person_json}')

    # Serialize a custom object into a pretty-printed JSON string
    pretty_person_json: str = json.dumps(person, indent=2, default=vars)
    print(f'pretty_person_json={pretty_person_json}')

Output:

    person_json={"name": "john", "pets": [{"age": 5, "name": "witty"}, {"age": 8, "name": "drago"}], "nicknames": ["johnny", "the player"]}
    pretty_person_json={
      "name": "john",
      "pets": [
        {
          "age": 5,
          "name": "witty"
        },
        {
          "age": 8,
          "name": "drago"
        }
      ],
      "nicknames": [
        "johnny",
        "the player"
      ]
    }

With the cls parameter:

    import json
    from typing import Any, List


    class Pet:
        def __init__(self, name: str, age: int) -> None:
            self.age = age
            self.name = name


    class Person:
        def __init__(self, name: str, pets: List[Pet], nicknames: List[str]) -> None:
            self.name = name
            self.pets = pets
            self.nicknames = nicknames


    person = Person('john', [Pet('witty', 5), Pet('drago', 8)], ['johnny', 'the player'])


    # Equivalent to the implementation seen with the default parameter
    class PersonEncoding(json.JSONEncoder):
        def default(self, obj: Any) -> Any:
            if isinstance(obj, Person):
                return vars(obj)
            elif isinstance(obj, Pet):
                return vars(obj)
            # Let the base class raise TypeError for any other type
            return super().default(obj)


    # Without cls=PersonEncoding, json.dumps(person) raises:
    # TypeError: Object of type Person is not JSON serializable

    # Serialize a custom object into a JSON string
    person_json: str = json.dumps(person, cls=PersonEncoding)
    print(f'person_json={person_json}')

    # Serialize a custom object into a pretty-printed JSON string
    pretty_person_json: str = json.dumps(person, indent=2, cls=PersonEncoding)
    print(f'pretty_person_json={pretty_person_json}')

Output:

    person_json={"name": "john", "pets": [{"age": 5, "name": "witty"}, {"age": 8, "name": "drago"}], "nicknames": ["johnny", "the player"]}
    pretty_person_json={
      "name": "john",
      "pets": [
        {
          "age": 5,
          "name": "witty"
        },
        {
          "age": 8,
          "name": "drago"
        }
      ],
      "nicknames": [
        "johnny",
        "the player"
      ]
    }

Deserialization examples
Basic example with only built-in types

    import json
    from typing import Dict

    foo_json: str = '''
    {
      "name": "john",
      "pets": [
        {"name": "witty", "age": 5},
        {"name": "drago", "age": 8}
      ],
      "nicknames": ["johnny", "the player"]
    }
    '''

    # Deserialize a JSON string into a dictionary
    foo_dic: Dict = json.loads(foo_json)
    print(f'foo_dic={foo_dic}')

Output:

    foo_dic={'name': 'john', 'pets': [{'name': 'witty', 'age': 5}, {'name': 'drago', 'age': 8}], 'nicknames': ['johnny', 'the player']}

Example with custom object
As with serialization, there are several ways to deserialize custom objects.
BEWARE:
For custom objects, the built-in json API offers no very robust way to achieve deserialization.
Here are the custom classes and the JSON input used in the next examples:

    import json
    from typing import Dict, List


    class Pet:
        def __init__(self, name: str, age: int) -> None:
            self.age = age
            self.name = name

        def __repr__(self) -> str:
            return f'Pet age={self.age}, name={self.name}'


    class Person:
        def __init__(self, name: str, pets: List[Pet], nicknames: List[str]) -> None:
            self.name = name
            self.pets = pets
            self.nicknames = nicknames

        def __repr__(self) -> str:
            return f'Person: name={self.name}, pets={self.pets}, nicknames={self.nicknames}'


    person_json: str = '''
    {
      "name": "john",
      "pets": [
        {"name": "witty", "age": 5},
        {"name": "drago", "age": 8}
      ],
      "nicknames": ["johnny", "the player"]
    }
    '''

1) Passing the dictionary returned by json.loads() to the constructor as **kwargs
NOT A ROBUST WAY: this works only for the top-level custom instance. Nested objects are created as dictionaries or lists (depending on the case) instead of custom instances.

    dic: Dict = json.loads(person_json)
    person: Person = Person(**dic)

    print(f'person={person}, type={type(person)}')
    # Judging by the textual representation of the person object, it seems to work:
    # person=Person: name=john, pets=[{'name': 'witty', 'age': 5}, {'name': 'drago', 'age': 8}],
    # nicknames=['johnny', 'the player'], type=<class '__main__.Person'>

    # But the pets are still plain dictionaries, as we can see here:
    print(f'person.pets[0]={person.pets[0]}, type={type(person.pets[0])}')
    # person.pets[0]={'name': 'witty', 'age': 5}, type=<class 'dict'>

2) Workaround for nested custom instances: passing a dictionary as **kwargs to the constructor of EACH CUSTOM INSTANCE.
This works, but it is also very cumbersome.

    dic: Dict = json.loads(person_json)
    pets: List[Pet] = [Pet(**dic_pet) for dic_pet in dic['pets']]
    person: Person = Person(**{k: v for k, v in dic.items() if k != 'pets'}, pets=pets)

    print(f'person={person}, type={type(person)}')
    print(f'person.pets[0]={person.pets[0]}, type={type(person.pets[0])}')

Output:

    person=Person: name=john, pets=[Pet age=5, name=witty, Pet age=8, name=drago], nicknames=['johnny', 'the player'], type=<class '__main__.Person'>
    person.pets[0]=Pet age=5, name=witty, type=<class '__main__.Pet'>

3) More robust solution: adding a special type element to each custom object.
This has the advantage of centralizing and simplifying the deserialization of every custom object.
But it is also quite intrusive, so in some circumstances the built-in json API is not the best choice.

    def object_hook_fn(dic: Dict):
        # Called for every decoded JSON object, innermost objects first
        if '__type__' in dic:
            dic_without_type = {k: v for k, v in dic.items() if k != '__type__'}
            if dic['__type__'] == 'Pet':
                return Pet(**dic_without_type)
            if dic['__type__'] == 'Person':
                return Person(**dic_without_type)
        return dic


    person_json: str = '''
    {
      "__type__": "Person",
      "name": "john",
      "pets": [
        {"__type__": "Pet", "name": "witty", "age": 5},
        {"__type__": "Pet", "name": "drago", "age": 8}
      ],
      "nicknames": ["johnny", "the player"]
    }
    '''

    person: Person = json.loads(person_json, object_hook=object_hook_fn)
    print(f'person={person}, type={type(person)}')
    print(f'person.pets[0]={person.pets[0]}, type={type(person.pets[0])}')

The output is the same as the previous one; it works again:

    person=Person: name=john, pets=[Pet age=5, name=witty, Pet age=8, name=drago], nicknames=['johnny', 'the player'], type=<class '__main__.Person'>
    person.pets[0]=Pet age=5, name=witty, type=<class '__main__.Pet'>

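The type element does not have to be written into the JSON by hand. One possible approach, sketched below, is to let a default function add it during serialization so that the object hook can find it again. The names TYPE_REGISTRY, default_with_type and object_hook_with_registry are hypothetical and not part of the json module. The sketch uses a single simplified Pet class and is an illustration of the idea rather than a complete solution.

```python
import json
from typing import Any, Dict


class Pet:
    def __init__(self, name: str, age: int) -> None:
        self.name = name
        self.age = age


# Hypothetical registry mapping the '__type__' marker to a class.
TYPE_REGISTRY: Dict[str, type] = {'Pet': Pet}


def default_with_type(obj: Any) -> Dict[str, Any]:
    """Hypothetical default function: tag registered classes with '__type__'."""
    type_name = type(obj).__name__
    if type_name in TYPE_REGISTRY:
        return {'__type__': type_name, **vars(obj)}
    raise TypeError(f'Object of type {type_name} is not JSON serializable')


def object_hook_with_registry(dic: Dict[str, Any]) -> Any:
    """Rebuild an instance when '__type__' names a registered class."""
    type_name = dic.get('__type__')
    if not isinstance(type_name, str) or type_name not in TYPE_REGISTRY:
        return dic
    kwargs = {k: v for k, v in dic.items() if k != '__type__'}
    return TYPE_REGISTRY[type_name](**kwargs)


pets_json = json.dumps([Pet('witty', 5)], default=default_with_type)
pets = json.loads(pets_json, object_hook=object_hook_with_registry)

print(pets_json)
print(type(pets[0]).__name__, pets[0].name, pets[0].age)
```

A registry keeps the hook free of a growing chain of if statements, at the cost of requiring every serializable class to be registered and to accept its attributes as constructor keyword arguments.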