Definition
The column statistics test lets you set thresholds on statistical measures of individual columns in your dataset. You select a column, choose a statistic (such as mean, median, or variance), and define an acceptable value or range for it. The test computes that statistic for the chosen column and compares the result against your threshold.
Taxonomy
- Task types: LLM, tabular classification, tabular regression.
- Availability: and .
Why it matters
- Column statistics tests help ensure that your data maintains expected statistical properties over time.
- They can detect data quality issues, distribution shifts, or unusual patterns in individual features.
- These tests are essential for monitoring data consistency and ensuring that model inputs remain within expected ranges.
- Statistical validation helps identify potential data pipeline issues or changes in data collection processes.
Available statistics
The following statistical measures are supported:

| Statistic | Description | Typical Use Cases |
|---|---|---|
| mean | Average value of the column | Monitor if average values stay within expected ranges |
| median | Middle value when data is sorted | Detect shifts in central tendency, robust to outliers |
| min | Minimum value in the column | Ensure no values fall below acceptable minimums |
| max | Maximum value in the column | Detect outliers or values exceeding acceptable maximums |
| std | Standard deviation of the column | Monitor data variability and spread |
| sum | Sum of all values in the column | Useful for totals, counts, or aggregate validations |
| count | Number of non-null values | Monitor data completeness |
| variance | Variance of the column values | Alternative measure of data spread |
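
As a rough sketch of the kind of check this test performs, the Python below computes one of the statistics listed above over a column's non-null values and compares it against a threshold. It is not Openlayer's implementation: the helper names (`column_statistic`, `check_threshold`), the operator strings, and the use of sample (n - 1) standard deviation and variance are all assumptions made for illustration.

```python
import math
import operator
import statistics

# Hypothetical mapping from statistic names to standard-library functions.
# Whether the platform uses sample or population std/variance is an assumption.
STATISTICS = {
    "mean": statistics.mean,
    "median": statistics.median,
    "min": min,
    "max": max,
    "std": statistics.stdev,          # sample standard deviation (assumed)
    "sum": sum,
    "count": len,                     # applied after nulls are dropped
    "variance": statistics.variance,  # sample variance (assumed)
}

# Hypothetical comparison operators for the threshold check.
OPERATORS = {
    "<": operator.lt,
    "<=": operator.le,
    ">": operator.gt,
    ">=": operator.ge,
    "==": operator.eq,
}


def _is_null(value):
    """Treat None and float NaN as missing values."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def column_statistic(values, statistic):
    """Compute one statistic over the non-null values of a column."""
    non_null = [v for v in values if not _is_null(v)]
    return STATISTICS[statistic](non_null)


def check_threshold(values, statistic, op, threshold):
    """Return (passed, observed) for a single column statistic check."""
    observed = column_statistic(values, statistic)
    return OPERATORS[op](observed, threshold), observed


# Example column with one missing value.
ages = [34, 29, None, 41, 52, 38]
passed, observed = check_threshold(ages, "mean", "<=", 40)
print(f"mean={observed}, passed={passed}")
```

With this sample column the mean of the five non-null values is 38.8, so the `<= 40` check passes, while a `max < 50` check would fail because the largest value is 52. Note that `statistics.stdev` and `statistics.variance` raise an error when fewer than two non-null values are present, so a real check would need to decide how to treat very small or empty columns.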
Test configuration examples
If you are writing a tests.json, here are a few valid configurations for the column statistics test:

