Definition

The column average test allows you to assert that the mean of a column is within a certain range.

Taxonomy

  • Category: Integrity.
  • Task types: LLM, tabular classification, tabular regression, text classification.
  • Availability: and .

Why it matters

  • If you are tracking quantities such as latency or cost (e.g. per LLM request), you can use the column average test to assert these quantities are within the expected range.
  • Some features may have a known average value, and you can use the column average test to assert that it is within the expected range.
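
As a rough illustration of the check itself (not the platform's implementation), the sketch below computes the mean of a numeric column with the Python standard library and asserts that it falls inside an expected range. The sample rows, the `latency_ms` column name, the helper names, and the choice to skip missing values are all assumptions made for this example.

```python
from statistics import fmean


def column_average(rows, column):
    """Return the mean of `column` across rows.

    Assumption: missing (None) values are skipped rather than treated as zero.
    """
    values = [row[column] for row in rows if row.get(column) is not None]
    if not values:
        raise ValueError(f"column {column!r} has no numeric values")
    return fmean(values)


def column_average_in_range(rows, column, low, high):
    """Return True when low <= mean(column) <= high."""
    return low <= column_average(rows, column) <= high


# Hypothetical per-request latency measurements, in milliseconds.
requests = [
    {"latency_ms": 120.0},
    {"latency_ms": 180.0},
    {"latency_ms": None},  # missing measurement, skipped by column_average
    {"latency_ms": 150.0},
]

# The test passes when the average latency stays within the expected band.
assert column_average_in_range(requests, "latency_ms", low=100.0, high=200.0)
```

A one-sided check, such as "average cost below some budget", is the same idea with only one of the two bounds applied.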

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the column average test:

[
  {
    "name": "Average of column 'Age' is greater than 20",
    "description": "Asserts that the average value of the numeric column 'Age' is greater than 20",
    "type": "integrity",
    "subtype": "columnAverage",
    "thresholds": [
      {
        "insightName": "columnAverage",
        "insightParameters": [{ "name": "column_name", "value": "Age" }], // Check average on column `Age`
        "measurement": "columnAverage",
        "operator": ">",
        "value": 20.0
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true, // Apply test to the validation set
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
  }
]
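
To make the threshold semantics concrete, here is a minimal sketch of how one entry in `thresholds` could be evaluated against a dataset. It is not the platform's evaluation code: the `evaluate_threshold` helper, the set of operators in the lookup table, and the sample rows are assumptions made for illustration.

```python
import operator
from statistics import fmean

# Assumed mapping from the "operator" string in a threshold to a comparison.
OPERATORS = {
    ">": operator.gt,
    ">=": operator.ge,
    "<": operator.lt,
    "<=": operator.le,
    "==": operator.eq,
}


def evaluate_threshold(rows, threshold):
    """Return True when the column average satisfies the threshold."""
    # Collect the insight parameters into a name -> value dictionary.
    params = {p["name"]: p["value"] for p in threshold["insightParameters"]}
    column = params["column_name"]
    average = fmean([row[column] for row in rows])
    compare = OPERATORS[threshold["operator"]]
    return compare(average, threshold["value"])


# Same threshold as in the configuration above, written as a Python dict.
threshold = {
    "insightName": "columnAverage",
    "insightParameters": [{"name": "column_name", "value": "Age"}],
    "measurement": "columnAverage",
    "operator": ">",
    "value": 20.0,
}

# Hypothetical validation rows; their average age is 25.
rows = [{"Age": 18}, {"Age": 25}, {"Age": 32}]
passed = evaluate_threshold(rows, threshold)
```

With these rows the average exceeds 20, so `passed` is `True`. Lowering the ages until the average drops to 20 or below would make the test fail.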

Related

  • Great Expectations tests with expectations such as expect_column_mean_to_be_between and expect_column_median_to_be_between.