Definition

The feature values test lets you define the expected range of values for a numerical feature. For categorical features, you define the set of expected categories instead.

Taxonomy

  • Category: Integrity.
  • Task types: Tabular classification, tabular regression.
  • Availability: and .

Why it matters

  • Checking that the values of a feature always fall within a defined range helps validate your assumptions about the data. For example, for a feature such as Age, negative values are invalid and signal an issue with the data collection or ingestion process.
  • Values outside the expected range can also be a sign of data drift.
  • For some categorical features, it is important to ensure that the values are always within the expected categories.
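
As a rough illustration of what these checks amount to, the Python sketch below applies a range check to a numerical feature and a membership check to a categorical one. The helper names `values_within_range` and `unexpected_categories` and the sample data are hypothetical. This is not the platform's implementation, only a minimal picture of the idea.

```python
from typing import Iterable, Optional, Set


def values_within_range(
    values: Iterable[float],
    lower: Optional[float] = None,
    upper: Optional[float] = None,
) -> bool:
    """Return True when every value lies within [lower, upper] (hypothetical helper)."""
    values = list(values)
    if not values:
        return True  # nothing to violate the range
    if lower is not None and min(values) < lower:
        return False
    if upper is not None and max(values) > upper:
        return False
    return True


def unexpected_categories(values: Iterable[str], expected: Set[str]) -> Set[str]:
    """Return the observed categories that are not in the expected set (hypothetical helper)."""
    return set(values) - set(expected)


# Made-up sample data: a negative age and a misspelled category.
ages = [34, 27, -1, 58]
colors = ["red", "blue", "grren"]

ages_ok = values_within_range(ages, lower=0)
bad_colors = unexpected_categories(colors, {"red", "green", "blue"})

print(ages_ok)      # False, because -1 is below the lower bound
print(bad_colors)   # {'grren'}
```

Comparing only the minimum and maximum against the bounds is enough for a range check, which matches the `min` and `max` measurements used in the configuration example.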

Test configuration examples

If you are writing a tests.json, here is a valid configuration for the feature values test:

[
  {
    "name": "Feature 'Year' less than or equal to 2026",
    "description": "Asserts that the values of the feature 'Year' are within the specified range",
    "type": "integrity",
    "subtype": "featureValueValidation",
    "thresholds": [
      {
        "insightName": "featureProfile",
        "insightParameters": [
          { "name": "name", "value": "Year" } // Selects feature `Year`
        ],
        "measurement": "max", // Must be one of `max` or `min`
        "operator": "<=",
        "value": 2026
      }
    ],
    "subpopulationFilters": null,
    "mode": "development",
    "usesValidationDataset": true, // Apply test to the validation set
    "usesTrainingDataset": false,
    "usesMlModel": false,
    "syncId": "b4dee7dc-4f15-48ca-a282-63e2c04e0689" // Some unique id
  }
]
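
The `//` comments in the example above are annotations for the reader. Standard JSON does not allow comments, so they may need to be removed before the file is parsed. If you generate tests.json programmatically, a sketch like the following produces the same structure without comments. The function `build_feature_value_test` is a hypothetical helper, the field names are copied from the example above, and the `syncId` is assumed to accept any unique identifier.

```python
import json
import uuid


def build_feature_value_test(feature: str, measurement: str, operator: str, value: float) -> dict:
    """Build one feature value test entry (hypothetical helper mirroring the example above)."""
    if measurement not in ("max", "min"):
        raise ValueError("measurement must be one of 'max' or 'min'")
    return {
        "name": f"Feature '{feature}' {measurement} {operator} {value}",
        "description": (
            f"Asserts that the values of the feature '{feature}' "
            "are within the specified range"
        ),
        "type": "integrity",
        "subtype": "featureValueValidation",
        "thresholds": [
            {
                "insightName": "featureProfile",
                # Selects the feature the threshold applies to
                "insightParameters": [{"name": "name", "value": feature}],
                "measurement": measurement,
                "operator": operator,
                "value": value,
            }
        ],
        "subpopulationFilters": None,
        "mode": "development",
        "usesValidationDataset": True,  # apply the test to the validation set
        "usesTrainingDataset": False,
        "usesMlModel": False,
        "syncId": str(uuid.uuid4()),  # assumption: any unique id is acceptable
    }


tests = [build_feature_value_test("Year", "max", "<=", 2026)]
tests_json = json.dumps(tests, indent=2)
print(tests_json)
```

The resulting string can be written to tests.json as is. Because it is produced by `json.dumps`, it contains no comments or trailing commas.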