Create test

Python

import os
from openlayer import Openlayer

client = Openlayer(
    # This is the default and can be omitted
    api_key=os.environ.get("OPENLAYER_API_KEY"),
)
test = client.projects.tests.create(
    project_id="182bd5e5-6e1a-4fe4-a799-aa6d9a6ab26e",
    name="No duplicate rows",
    description="This test checks for duplicate rows in the dataset.",
    type="integrity",
    subtype="duplicateRowCount",
    thresholds=[
        {
            "insightName": "duplicateRowCount",
            "measurement": "duplicateRowCount",  # Using the absolute row count
            "operator": "<=",
            "value": 0,  # Integer
        }
    ],
    uses_production_data=True,  # For monitoring mode
    evaluation_window=3600,  # 1 hour
    delay_window=0,
    uses_training_dataset=False,
    uses_validation_dataset=False,
)

{
  "id": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "number": 1,
  "name": "No duplicate rows",
  "dateCreated": "2024-03-22T11:31:01.185Z",
  "dateUpdated": "2024-03-22T11:31:01.185Z",
  "description": "This test checks for duplicate rows in the dataset.",
  "type": "integrity",
  "subtype": "duplicateRowCount",
  "creatorId": "589ece63-49a2-41b4-98e1-10547761d4b0",
  "originProjectVersionId": "3fa85f64-5717-4562-b3fc-2c963f66afa6",
  "thresholds": [
    {
      "measurement": "duplicateRowCount",
      "insightName": "duplicateRowCount",
      "insightParameters": [
        {
          "name": "column_name",
          "value": "<unknown>"
        }
      ],
      "thresholdMode": "manual",
      "operator": "<=",
      "value": 0
    }
  ],
  "dateArchived": "2024-03-22T11:31:01.185Z",
  "suggested": false,
  "commentCount": 0,
  "evaluationWindow": 3600,
  "delayWindow": 0,
  "archived": false,
  "usesMlModel": false,
  "usesValidationDataset": true,
  "usesTrainingDataset": false,
  "usesReferenceDataset": false,
  "usesProductionData": false,
  "includeHistoricalData": false,
  "defaultToAllPipelines": true,
  "includePipelines": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ],
  "excludePipelines": [
    "3c90c3cc-0d44-4b50-8888-8dd25736052a"
  ]
}

Authorizations

Authorization

string

header

required

Bearer authentication header of the form Bearer <token>, where <token> is your workspace API key. See Find your API key for more information.
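For callers that issue the request without the SDK, the header described above can be assembled by hand. The following is a minimal sketch: `build_auth_headers` is a hypothetical helper, not part of any Openlayer library, and the `Content-Type` value is taken from the body media type listed on this page.

```python
def build_auth_headers(api_key: str) -> dict[str, str]:
    """Return request headers carrying the workspace API key as a Bearer token.

    Hypothetical helper for illustration only.
    """
    if not api_key or not api_key.strip():
        raise ValueError("API key must be a non-empty string")
    return {
        "Authorization": f"Bearer {api_key.strip()}",
        "Content-Type": "application/json",
    }
```

Stripping whitespace guards against keys pasted with a trailing newline, which would otherwise produce a malformed header.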

Path Parameters

projectId

string<uuid>

required

The project id.
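Because `projectId` is typed as `string<uuid>`, it can be worth checking the value locally before sending a request. A minimal sketch using the standard library; `normalize_project_id` is a hypothetical name, and whether the API accepts non-canonical UUID spellings is not stated on this page, so the helper returns the canonical lowercase form.

```python
import uuid


def normalize_project_id(project_id: str) -> str:
    """Validate a project id and return it in canonical lowercase UUID form.

    Raises ValueError if the string is not a well-formed UUID.
    """
    return str(uuid.UUID(project_id))
```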

Body

application/json

name

string

required

The test name.

Maximum string length: 100

Example:

"No duplicate rows"

description

object

required

The test description.

Example:

"This test checks for duplicate rows in the dataset."

type

enum<string>

required

The test type.

Available options:

integrity,

consistency,

performance

Example:

"integrity"

subtype

enum<string>

required

The test subtype.

Available options:

anomalousColumnCount,

characterLength,

classImbalanceRatio,

expectColumnAToBeInColumnB,

columnAverage,

columnDrift,

columnStatistic,

columnValuesMatch,

conflictingLabelRowCount,

containsPii,

containsValidUrl,

correlatedFeatureCount,

customMetricThreshold,

duplicateRowCount,

emptyFeature,

emptyFeatureCount,

driftedFeatureCount,

featureMissingValues,

featureValueValidation,

greatExpectations,

groupByColumnStatsCheck,

illFormedRowCount,

isCode,

isJson,

llmRubricThresholdV2,

labelDrift,

metricThreshold,

newCategoryCount,

newLabelCount,

nullRowCount,

rowCount,

ppScoreValueValidation,

quasiConstantFeature,

quasiConstantFeatureCount,

sqlQuery,

dtypeValidation,

sentenceLength,

sizeRatio,

specialCharactersRatio,

stringValidation,

trainValLeakageRowCount

Example:

"duplicateRowCount"

thresholds

object[]

required

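The required body fields and their constraints listed in this section can be checked client-side before the request is sent. The sketch below is an assumption-laden illustration: `validate_test_body` is a hypothetical helper, it checks only the rules stated on this page (required keys, the 100-character name limit, the three test types), and it does not validate `subtype` or the contents of each threshold object.

```python
REQUIRED_FIELDS = ("name", "description", "type", "subtype", "thresholds")
TEST_TYPES = {"integrity", "consistency", "performance"}
MAX_NAME_LENGTH = 100


def validate_test_body(body: dict) -> list[str]:
    """Return a list of problems found in a create-test request body.

    An empty list means the body satisfies the checks performed here.
    """
    problems = [
        f"missing required field: {field}"
        for field in REQUIRED_FIELDS
        if field not in body
    ]
    name = body.get("name")
    if isinstance(name, str) and len(name) > MAX_NAME_LENGTH:
        problems.append(f"name exceeds {MAX_NAME_LENGTH} characters")
    if "type" in body and body["type"] not in TEST_TYPES:
        problems.append(f"unknown test type: {body['type']!r}")
    thresholds = body.get("thresholds")
    if thresholds is not None and not isinstance(thresholds, list):
        problems.append("thresholds must be a list of objects")
    return problems
```

Returning a list of problems rather than raising on the first one lets a caller report every issue in a single pass.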

evaluationWindow

number | null

The evaluation window in seconds. Only applies to tests that use production data.

Required range: x <= 2592000

Example:

3600

delayWindow

number | null

The delay window in seconds. Only applies to tests that use production data.

Required range: 0 <= x <= 2592000

Example:

0
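The upper bound of 2592000 seconds on both windows equals 30 days (30 × 24 × 3600). A minimal sketch of a range check under the limits stated above; `check_windows` is a hypothetical helper, and since the page lists no lower bound for `evaluationWindow`, none is enforced here.

```python
MAX_WINDOW_SECONDS = 30 * 24 * 3600  # 2592000 seconds, i.e. 30 days


def check_windows(evaluation_window, delay_window) -> list[str]:
    """Return a list of range violations; None values are allowed (nullable)."""
    errors = []
    if evaluation_window is not None and evaluation_window > MAX_WINDOW_SECONDS:
        errors.append("evaluationWindow must be <= 2592000")
    if delay_window is not None and not (0 <= delay_window <= MAX_WINDOW_SECONDS):
        errors.append("delayWindow must satisfy 0 <= x <= 2592000")
    return errors
```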

archived

boolean

Whether the test is archived.

Example:

false

usesMlModel

boolean

Whether the test uses an ML model.

Example:

false

usesValidationDataset

boolean

Whether the test uses a validation dataset.

Example:

true

usesTrainingDataset

boolean

Whether the test uses a training dataset.

Example:

false

usesReferenceDataset

boolean

Whether the test uses a reference dataset (monitoring mode only).

Example:

false

usesProductionData

boolean

Whether the test uses production data (monitoring mode only).

Example:

false

includeHistoricalData

boolean | null

default:false

Whether to include historical data in the test result. Only applies to tests that use production data.

defaultToAllPipelines

boolean | null

default:true

Whether to apply the test to all pipelines (data sources) or to a specific set of pipelines. Only applies to tests that use production data.

includePipelines

string<uuid>[] | null

Array of pipelines (data sources) to which the test should be applied. Only applies to tests that use production data.

excludePipelines

string<uuid>[] | null

Array of pipelines (data sources) to which the test should not be applied. Only applies to tests that use production data.

Response

Status OK.

id

string<uuid>

required

The test id.

Example:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

number

integer

required

The test number.

Example:

1

name

string

required

The test name.

Maximum string length: 100

Example:

"No duplicate rows"

dateCreated

string<date-time>

required

The creation date.

Example:

"2024-03-22T11:31:01.185Z"

dateUpdated

string<date-time>

required

The last updated date.

Example:

"2024-03-22T11:31:01.185Z"
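The `dateCreated` and `dateUpdated` values are `date-time` strings with a trailing `Z`, as in the examples above. When working with the raw JSON response, they can be converted to timezone-aware `datetime` objects with the standard library. This is a sketch with a hypothetical helper name; the `Z` suffix is rewritten to `+00:00` because `datetime.fromisoformat` only accepts `Z` directly from Python 3.11 onward.

```python
from datetime import datetime, timezone


def parse_timestamp(value: str) -> datetime:
    """Parse a UTC date-time string such as '2024-03-22T11:31:01.185Z'."""
    if value.endswith("Z"):
        value = value[:-1] + "+00:00"
    return datetime.fromisoformat(value)
```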

description

object

required

The test description.

Example:

"This test checks for duplicate rows in the dataset."

type

enum<string>

required

The test type.

Available options:

integrity,

consistency,

performance

Example:

"integrity"

subtype

enum<string>

required

The test subtype.

Available options:

anomalousColumnCount,

characterLength,

classImbalanceRatio,

expectColumnAToBeInColumnB,

columnAverage,

columnDrift,

columnStatistic,

columnValuesMatch,

conflictingLabelRowCount,

containsPii,

containsValidUrl,

correlatedFeatureCount,

customMetricThreshold,

duplicateRowCount,

emptyFeature,

emptyFeatureCount,

driftedFeatureCount,

featureMissingValues,

featureValueValidation,

greatExpectations,

groupByColumnStatsCheck,

illFormedRowCount,

isCode,

isJson,

llmRubricThresholdV2,

labelDrift,

metricThreshold,

newCategoryCount,

newLabelCount,

nullRowCount,

rowCount,

ppScoreValueValidation,

quasiConstantFeature,

quasiConstantFeatureCount,

sqlQuery,

dtypeValidation,

sentenceLength,

sizeRatio,

specialCharactersRatio,

stringValidation,

trainValLeakageRowCount

Example:

"duplicateRowCount"

creatorId

string<uuid> | null

required

The test creator id.

Example:

"589ece63-49a2-41b4-98e1-10547761d4b0"

originProjectVersionId

string<uuid> | null

required

The project version (commit) id where the test was created.

Example:

"3fa85f64-5717-4562-b3fc-2c963f66afa6"

thresholds

object[]

required


dateArchived

string<date-time> | null

required

The date the test was archived.

Example:

"2024-03-22T11:31:01.185Z"

suggested

boolean

required

Whether the test is suggested or user-created.

Example:

false

commentCount

integer

required

The number of comments on the test.

Required range: x >= 0

Example:

0

evaluationWindow

number | null

The evaluation window in seconds. Only applies to tests that use production data.

Required range: x <= 2592000

Example:

3600

delayWindow

number | null

The delay window in seconds. Only applies to tests that use production data.

Required range: 0 <= x <= 2592000

Example:

0

archived

boolean

Whether the test is archived.

Example:

false

usesMlModel

boolean

Whether the test uses an ML model.

Example:

false

usesValidationDataset

boolean

Whether the test uses a validation dataset.

Example:

true

usesTrainingDataset

boolean

Whether the test uses a training dataset.

Example:

false

usesReferenceDataset

boolean

Whether the test uses a reference dataset (monitoring mode only).

Example:

false

usesProductionData

boolean

Whether the test uses production data (monitoring mode only).

Example:

false

includeHistoricalData

boolean | null

default:false

Whether to include historical data in the test result. Only applies to tests that use production data.

defaultToAllPipelines

boolean | null

default:true

Whether to apply the test to all pipelines (data sources) or to a specific set of pipelines. Only applies to tests that use production data.

includePipelines

string<uuid>[] | null

Array of pipelines (data sources) to which the test should be applied. Only applies to tests that use production data.

excludePipelines

string<uuid>[] | null

Array of pipelines (data sources) to which the test should not be applied. Only applies to tests that use production data.
