JSON Type Definition, aka RFC8927, is an easy-to-learn, standardized way to define a schema for JSON data. You can use JSON Typedef to portably validate data across programming languages, create dummy data, generate code, and more.
jtd-infer
is a tool that generates ("infers") a JSON Typedef schema from
example data.
bash
echo '{ "name": "Joe", "age": 42 }' | jtd-infer | jq
json
{
"properties": {
"age": {
"type": "uint8"
},
"name": {
"type": "string"
}
}
}
On macOS, you can install jtd-infer
via Homebrew:
bash
brew install jsontypedef/jsontypedef/jtd-infer
For all other platforms, you can download and extract the binary yourself from
the latest release. You can also install using cargo
by running:
bash
cargo install jtd_infer
For high-level guidance on how to use jtd-infer
, see "Inferring a JSON
Typedef Schema from Real Data" in the JSON Typedef website docs.
To invoke jtd-infer
, you can either:
jtd-infer
.jtd-infer
reads a sequence of JSON messages. So for example, if you have a
file like this in data.json
:
json
{ "name": "john doe", "age": 42 }
{ "name": "jane doe", "age": 45 }
You can give it to jtd-infer
in two ways:
```bash
cat data.json | jtd-infer jtd-infer data.json ```
In both cases, you'd get this output:
json
{"properties":{"name":{"type":"string"},"age":{"type":"uint8"}}}
⚠️ This section is often important if you are retrofitting JSON Typedef to a JavaScript-based application.
By default, JSON Typedef will infer the most specific possible type for inputs.
So, for example, it will guess uint8
if it sees a 12
in your input:
bash
echo "12" | jtd-infer
json
{"type":"uint8"}
However, if you're giving JSON Typedef a small sample set, or if you in practice
have data that is far smaller than the actual numerical datatypes your
application supports, then this behavior may be undesirable. For example, it's
common for JavaScript-based applications to actually support float64
for all
numeric inputs, because JavaScript numbers are IEEE double-precision floats.
To tell JSON Typedef to prefer a different type than the one it would normally
guess, you can use --default-number-type
to change its behavior. For example:
```bash
#
echo "12" | jtd-infer --default-number-type=float64 ```
json
{"type":"float64"}
Another use-case is if you're writing an application that uses signed 32-bit
ints everywhere, and your example data simply never in practice has examples of
negative numbers or numbers too big for 8- or 16-bit numbers. You can achieve
that by using int32
as your default number type:
bash
echo "12" | jtd-infer --default-number-type=int32
json
{"type":"int32"}
Note that jtd-infer
will ignore your default if it doesn't match with the
data. For example, int32
only works with whole numbers, so if a decimal number
or a number too big for 32-bit signed integers comes in, it will fall back to
float64
:
```bash
echo "3.14" | jtd-infer --default-number-type=int32 echo "9999999999" | jtd-infer --default-number-type=int32 ```
By default, jtd-infer
will never output enum
, values
, or discriminator
schemas. This is by design: by always being consistent with what it outputs,
jtd-infer
is more predictable and reliable.
If you want jtd-infer
to output an enum
, values
, or discriminator
, you
can use the --enum-hint
, --values-hint
, and --discriminator-hint
flags.
You can pass each of these flags multiple times.
All of the hint flags accept JSON
Pointers as values. If you're used to the
JavaScript-y syntax of referring to things as $.foo.bar
, the equivalent JSON
Pointer is /foo/bar
. jtd-infer
treats -
as a "wildcard". /foo/-/bar
is
equivalent to the JavaScript-y $.foo.*.bar
.
As a corner-case, if you want to point to the root / top-level of your input,
then use the empty string as the path. See "Using
--values-hint
" for an example of this.
--enum-hint
By default, strings are always inferred to be { "type": "string" }
:
bash
echo '["foo", "bar", "baz"]' | jtd-infer
json
{"elements":{"type":"string"}}
But you can instead have jtd-infer
output an enum by providing a path to the
string you consider to be an enum. In this case, it's any element of the root of
the array -- the JSON Pointer for that is /-
:
bash
echo '["foo", "bar", "baz"]' | jtd-infer --enum-hint=/-
json
{"elements":{"enum":["bar","baz","foo"]}}
--values-hint
By default, objects are always assumed to be "structs", and jtd-infer
will
generate properties
/ optionalProperties
. For example:
bash
echo '{"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]}' | jtd-infer
json
{"properties":{"y":{"elements":{"type":"uint8"}},"z":{"elements":{"type":"uint8"}},"x":{"elements":{"type":"uint8"}}}}
If your data is more like a map / dictionary, pass a values-hint
that points
to the object that you want a values
schema from. In this case, that's the
root-level object, which in JSON Pointer is just an empty string:
bash
echo '{"x": [1, 2, 3], "y": [4, 5, 6], "z": [7, 8, 9]}' | jtd-infer --values-hint=
json
{"values":{"elements":{"type":"uint8"}}}
--discriminator-hint
By default, objects are always assumed to be "structs", and jtd-infer
will
generate properties
/ optionalProperties
. For example:
bash
echo '[{"type": "s", "value": "foo"},{"type": "n", "value": 3.14}]' | jtd-infer
json
{"elements":{"properties":{"value":{},"type":{"type":"string"}}}}
If your data has a special "type" property that tells you what's in the rest of
the object, then use --discriminator-hint
to point to that property.
jtd-infer
will output an appropriate discriminator
schema instead:
bash
echo '[{"type": "s", "value": "foo"},{"type": "n", "value": 3.14}]' | jtd-infer --discriminator-hint=/-/type | jq
json
{
"elements": {
"discriminator": "type",
"mapping": {
"s": {
"properties": {
"value": {
"type": "string"
}
}
},
"n": {
"properties": {
"value": {
"type": "float64"
}
}
}
}
}
}