Definition

The correlated features test checks whether any pair of features in the dataset is strongly correlated with one another.
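As a rough illustration of the idea (not the test's actual implementation), the sketch below uses pandas to flag feature pairs whose absolute Pearson correlation meets a threshold. The function name `find_correlated_pairs`, the 0.9 threshold, and the sample columns are all assumptions made for this example.

```python
import numpy as np
import pandas as pd


def find_correlated_pairs(df: pd.DataFrame, threshold: float = 0.9):
    """Return (feature_a, feature_b, r) for numeric pairs with |r| >= threshold.

    Hypothetical helper for illustration; the threshold is an assumed value.
    """
    # Pearson correlation matrix over the numeric columns only.
    corr = df.select_dtypes(include="number").corr()
    cols = list(corr.columns)
    pairs = []
    # Walk the upper triangle so each pair is reported once.
    for i in range(len(cols)):
        for j in range(i + 1, len(cols)):
            r = corr.iloc[i, j]
            if abs(r) >= threshold:  # NaN correlations compare False and are skipped
                pairs.append((cols[i], cols[j], float(r)))
    return pairs


# Synthetic data: height_in is a unit conversion of height_cm, age is independent.
rng = np.random.default_rng(0)
height_cm = rng.normal(170, 10, size=200)
df = pd.DataFrame({
    "height_cm": height_cm,
    "height_in": height_cm / 2.54,
    "age": rng.integers(18, 80, size=200),
})

flagged = find_correlated_pairs(df, threshold=0.9)
print(flagged)
```

Here only the `height_cm` / `height_in` pair is flagged, since one column is a linear rescaling of the other. Pearson correlation captures linear relationships only; a rank-based measure such as Spearman (`corr(method="spearman")`) may be a better fit for monotonic but nonlinear dependencies.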

Taxonomy

  • Category: Integrity.
  • Task types: Tabular classification, tabular regression.
  • Availability: and .

Why it matters

  • Removing highly correlated features can make a model easier to interpret and may improve generalization performance.
  • For some models, such as linear and logistic regression, multicollinearity makes the learned coefficients unstable and therefore unreliable.
  • Correlated features can also indicate data quality issues, such as duplicate or near-duplicate columns.

Test configuration examples

If you are writing a tests.json, here are a few valid configurations for the correlated features test: