# JSON Zuar Runner's JSON plugin pipes data from any JSON file into a relational database. The input JSON files can be either JSON or JSONL. .. image:: assets/json-1.png ## Example use case Let's take a simple JSON file with nested data and convert the data into tables in a relational database. ### Example JSON file Here's what the source JSON file (`people_pets.json`) looks like: ```json [ { "id": 1, "name": "Justin", "pets": [ { "id": 1, "name": "Bear", "pet": { "type": "dog", "breed": "Goldendoodle" } }, { "id": 2, "name": "Birdie", "pet": { "type": "dog", "breed": "Goldendoodle" } } ] }, { "id": 2, "name": "Matt", "pets": [] }, { "id": 3, "name": "Ben", "pets": [ { "id": 3, "name": "Zuca", "pet": { "type": "dog", "breed": "Cavapoo" } } ] } ] ``` ### Expected Relational Database Results Because the JSON file has nested data in the `pets` array (`[ ]`), we will need to create two tables in the database. - **Table 1** will include top level key/value pairs: `id`, `name`. The resulting table should have 3 rows. - **Table 2** will include data in the `pets` array (`id`, `name`, and `pet` object's data). Table 2 will also include data from the parent object (`id`) so we can join the two database tables. The resulting table should have 3 rows. **Table 1** expected result: | id | name| |----|-----| | 1 | Justin | | 2 | Matt | | 3 | Ben | **Table 2** expected result: | id | name | pet__type | pet__breed | person_id | |----|------|-----------|------------|-----------| | 1 | Bear | dog | Goldendoodle | 1 | | 2 | Birdie| dog | Goldendoodle | 1 | | 3 | Zuca | dog | Cavapoo | 3 | ## Create a Zuar Runner JSON Job The JSON file(s) must exist in Zuar Runner's file system. Use the [file manager](/user-interface/files/) to manually add a JSON file to Zuar Runner. The JSON input requires creating a job manually, so create a job using the [Generic](/jobs/generic/) plugin. Set the job's `type` to `io`. ## Example Zuar Runner Job Configurations **Table 1** Zuar Runner job configuration: ```json { "input": { "use": "flatfile.iov2#JsonInput", "source": "/var/mitto/data/people_pets.json" }, "output": { "tablename": "people", "use": "call:mitto.iov2.db#todb", "schema": "json", "dbo": "postgresql://db/analytics" }, "steps": [ { "use": "mitto.iov2.steps#Input", "transforms": [ { "use": "mitto.iov2.transform#ExtraColumnsTransform" }, { "use": "mitto.iov2.transform#ColumnsTransform" } ] }, { "use": "mitto.iov2.steps#CreateTable" }, { "use": "mitto.iov2.steps#Output", "transforms": [ { "use": "mitto.iov2.transform#FlattenTransform" } ] }, { "use": "mitto.iov2.steps#CollectMeta" } ] } ``` This example job take the source JSON file (`people_pets.json`) and creates a database table (`json.people`) in Zuar Runner's internal PostgreSQL database (`localhost`). It creates columns from all of the "top level" keys excluding the arrays (`[ ]`). **Table 2** Zuar Runner job configuration: ```json { "input": { "source": "/var/mitto/data/people_pets.json", "use": "flatfile.iov2#JsonInput" }, "output": { "dbo": "postgresql://db/analytics", "schema": "json", "tablename": "people__pets", "use": "call:mitto.iov2.db#todb" }, "steps": [ { "use": "mitto.iov2.steps#Input", "transforms": [ { "use": "mitto.iov2.transform#PluckV2Transform", "jpath": "$.pets[*]", "members": [ { "name": "person_id", "value": "$.id" } ] }, { "use": "mitto.iov2.transform#ExtraColumnsTransform" }, { "use": "mitto.iov2.transform#ColumnsTransform" } ] }, { "use": "mitto.iov2.steps#CreateTable" }, { "use": "mitto.iov2.steps#Output", "transforms": [ { "use": "mitto.iov2.transform#FlattenTransform" } ] }, { "use": "mitto.iov2.steps#CollectMeta" } ] } ``` This example job take the source JSON file (`people_pets.json`) and creates a database table (`json.people__pets`) in Zuar Runner's internal PostgreSQL database (`localhost`). Note that this job configuration includes an additional `transform`, the `PluckV2Transform`. This transform takes content from the JSON array and extracts that content into a separate database table. Based on the `jpath` and `members` parameters in the `steps`, it creates columns from all the keys inside the `pets` array and the "top level" `id` key. ### Jpath The `jpath` parameter let's you define exactly what section of the source JSON file to pull from. This is necessary for **Table 2** because of the nested data in the `pets` array. Learn more about **JSONPath** expressions below. ### Members The `members` parameter let's you add additional columns from the source JSON file to the resulting database table. In our case, we need to add the `id` of each person to the `people__pets` table in order for us to join the two tables together. `members` also uses jpath in it's `value` key, and you can name the resulting column by adjusting the `name` key. ## JSONPath Expressions To understand how to use JSONPath to pick specific sections of data out of a JSON object you can visit: [https://goessner.net/articles/JsonPath/](https://goessner.net/articles/JsonPath/) To interactively learn how to use JSONPath syntax you can visit: [https://jsonpath.com/](https://jsonpath.com/) ### RegEx Like the other flat file types, Runner jobs can use RegEx for the source file selection. An example of a JSON job `input` section using RegEx is below: ```json { "input": { "base": "flatfile.iov2#JsonInput", "use": "flatfile.iov2#RegexInput", "regex": "/var/mitto/data/people_pets_.*.json$" } ``` This example JSON job will read data from an arbitrary collection of JSON files named in a way that matches the regular expression pattern defined. See [RegEx](/jobs/regex/) for more information. ### JSONL Runner also supports JSONL files; these files are commonly used in applications that deliver `log` type data, or applications that produce a large sets of records. In JSONL files, each line is a JSON record itself, followed by a specific non-printing character used to delimit the lines. The `people_pets.json` sample data can be represented as JSONL with some changes; it would have this content: ```json {"id":1,"name":"Justin","pets":[{"id":1,"name":"Bear","pet":{"type":"dog","breed":"Goldendoodle"}},{"id":2,"name":"Birdie","pet":{"type":"dog","breed":"Goldendoodle"}}]} {"id":2,"name":"Matt","pets":[]} {"id":3,"name":"Ben","pets":[{"id":3,"name":"Zuca","pet":{"type":"dog","breed":"Cavapoo"}}]} ``` If we need to process JSONL files, all of the previous discussion is applicable, with one modification to `input` section of the job, changing the `use` directive to `JsonlInput`: ```json { "input": { "source": "/var/mitto/data/people_pets.json", "use": "flatfile.iov2#JsonlInput" }, ```