Definition

The is JSON test allows you to check if a specified column contains a valid JSON.

Taxonomy

  • Category: Integrity.
  • Task types: LLM.
  • Availability: and .

Why it matters

  • LLMs are often prompted to generate a structured output, JSON being the most common format.
  • If the LLM doesn’t generate a valid JSON, it can break the downstream applications that rely on it.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the character length test:

[
  {
    "name": "Outputs have valid JSON",
    "description": "Asserts that the output column for all rows contains valid JSON",
    "type": "integrity",
    "subtype": "isJson",
    "thresholds": [
      {
        "insightName": "isJson",
        "insightParameters": [
          { "name": "column_name", "value": "output" } // Selects the column `output`
        ],
        "measurement": "isJsonRowPercentage",
        "operator": ">=",
        "value": 1.0
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true, // Apply test to the validation set
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
  }
]
  • JSON tests using Great expectations, such as expect_column_values_to_be_json_parseable, expect_column_values_to_match_json_schema, and expect_column_values_to_be_valid_json.