How to Document a Zuar Runner “Entity”

This is a high-level description of the steps needed to document a Zuar Runner user-facing entity, such as an inputter, a transform, or a job step. The XsvInput2 inputter is used as an example. When viewing that (or any) page, you can click Show Source to view the markup used to create the page.

  1. Find the Python file in which the class is defined. XsvInput2 is defined here.

  2. If the class is not already a subclass of pydantic.BaseModel, update the class definition to make it one.

  3. The class’s docstring represents the top level of the documentation for the entity. Provide a brief description of the behavior and one or two examples of how the entity can be used in a Zuar Runner job configuration. If many different use cases are possible or if extensive explanation is necessary, create a separate documentation page to contain the extended documentation. Keep the docstring for the class as concise as possible.

    Although most Zuar Runner documentation is written in Markdown (e.g., this page), docstrings must be formatted using reStructuredText.

    Zuar Runner class docstrings are formatted in the Google style. Class docstrings are processed using sphinx.ext.napoleon.
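
    As a rough illustration of a concise, Google-style class docstring, consider the sketch below. ExampleInput, its fields, and the keys in the sample job configuration are hypothetical and exist only to show the layout; they are not taken from Zuar Runner.

```python
from pydantic import BaseModel


class ExampleInput(BaseModel):
    """Read rows from a delimited text file.

    ``ExampleInput`` is a hypothetical inputter used only to illustrate
    docstring layout.

    Example:
        A job configuration might reference the inputter like this
        (the keys shown are illustrative)::

            {
                "use": "example.ExampleInput",
                "source": "/data/orders.csv"
            }

    Note:
        Keep this docstring short. Longer explanations belong on a
        separate documentation page.
    """

    # Hypothetical fields; real entities define their own parameters.
    source: str
    delimiter: str = ","
```

    The ``Example:`` and ``Note:`` headings are Google-style sections that sphinx.ext.napoleon recognizes, while the body text inside them remains reST (note the ``::`` literal block).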

  4. Each parameter for the entity becomes a Pydantic field. It is important to be as accurate (and restrictive) as possible with a field’s type hints, as the type hints are the source for much of the auto-generated documentation. Where possible, using specific types provided by Pydantic is recommended, as this often eliminates the need for writing a custom field validator.
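
    The following sketch shows how restrictive type hints can stand in for custom validation. The class and field names are hypothetical; typing.Literal and pydantic.PositiveInt are the only library features assumed.

```python
from typing import Literal

from pydantic import BaseModel, PositiveInt, ValidationError


class ExampleInput(BaseModel):
    # Literal restricts the value to a fixed set of strings.
    encoding: Literal["utf-8", "latin-1"] = "utf-8"
    # PositiveInt rejects zero and negative numbers without a custom validator.
    chunk_size: PositiveInt = 1000


def try_build(**kwargs):
    """Return the validated model, or the validation error text on failure."""
    try:
        return ExampleInput(**kwargs)
    except ValidationError as exc:
        return str(exc)


ok = try_build(chunk_size=500)        # valid: an ExampleInput instance
bad_size = try_build(chunk_size=0)    # invalid: returns the error text
bad_enc = try_build(encoding="ascii") # invalid: not one of the allowed literals
```

    Because the constraints live in the type hints, they can also appear in the generated schema documentation without any extra prose.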

  5. Best practices when defining a field:

    • Provide a documentation string (again, in reST).

    • Include examples via examples=[...].

    • Use constrained types. For example, if an int field can only hold values between 1 and 12, add the constraints ge=1 and le=12 to the field definition.

    • When constrained types are insufficient, add validators. Validators give the user much better error messages than would otherwise be possible.
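
    The practices above might be combined as in the sketch below. The class, field names, and error text are hypothetical. The validator decorator is used on the assumption that the codebase follows the same Pydantic API generation as the Config class shown in this guide; newer Pydantic releases offer field_validator for the same purpose.

```python
from pydantic import BaseModel, Field, ValidationError, validator


class ExampleInput(BaseModel):
    month: int = Field(
        ...,  # required field
        description="Month of the year to load, where ``1`` is January.",
        examples=[1, 12],
        ge=1,   # lower bound, inclusive
        le=12,  # upper bound, inclusive
    )
    delimiter: str = Field(
        ",",
        description="Single character that separates values in each row.",
        examples=[",", "|"],
    )

    @validator("delimiter")
    def delimiter_is_one_character(cls, value):
        # A rule that constraints alone express poorly, with a clear message.
        if len(value) != 1:
            raise ValueError("delimiter must be exactly one character")
        return value


def is_valid(**kwargs):
    """Return True when the arguments pass validation, False otherwise."""
    try:
        ExampleInput(**kwargs)
    except ValidationError:
        return False
    return True
```

    With this definition, is_valid(month=3) succeeds, while is_valid(month=13) and is_valid(month=3, delimiter="||") both fail validation.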

  6. If possible, prohibit the assignment of “extra attributes”. This causes an error to be raised when a user mistypes a field name. This is done by adding a Config class to the entity’s Pydantic class definition:

    class Config:
        extra = "forbid"
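
    To see the effect, the sketch below builds a hypothetical model with this Config class and passes a misspelled field name. The class and field are illustrative only.

```python
from pydantic import BaseModel, ValidationError


class ExampleInput(BaseModel):
    delimiter: str = ","

    class Config:
        extra = "forbid"


try:
    # "delimeter" is a deliberate misspelling of "delimiter".
    ExampleInput(delimeter=";")
except ValidationError as exc:
    # The error names the unexpected field, so the typo is easy to spot.
    error_message = str(exc)
else:
    error_message = None
```

    Without extra = "forbid", the misspelled key would be silently ignored and the default delimiter used instead.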
    
  7. Add the module and class name to module_schema_configs.py. This causes sphinx-jsonschema to create JSON Schema documentation from the class when make schemas is run. This auto-generated documentation is stored in docs/src/json_schemas; never update these documents manually.

  8. Create a documentation page for the entity that includes the JSON Schema documentation. For example, view the page created by docs/src/inputters/builtin_xsv_xsv_iov2_input__xsvinput2.md, and click Show Source to see the markup used to create it.

    Although this is a Markdown file, it (and any Markdown file in the Zuar Runner documentation tree) can contain reST directives; .. jsonschema is one of them.

    There are similar directories for transforms, steps, and jobs.

  9. Update the related index.md to contain a link to the new documentation page. In the case of XsvInput2, the following line was added to index.md:

    * [builtin.xsv.xsv.iov2.input.XsvInput2](builtin_xsv_xsv_iov2_input__xsvinput2.md)