Definition

The string validation test checks whether user-provided patterns appear in the text. Supported checks include whether the string contains (or does not contain) a specific substring, or any or all of several substrings; whether it is equal (or not equal) to a specific value; and whether it matches (or does not match) a regular expression. Refer to the Guide section for the full list.

Taxonomy

  • Category: Integrity.
  • Task types: LLM.
  • Availability: and .

Why it matters

  • The string validation test acts as a quality control mechanism. It checks whether the generated text adheres to predefined standards or formats, which is particularly important in applications like content generation, code generation, or automated reporting.
  • In user-facing applications, meeting user expectations is crucial. String validation can ensure that the model’s output aligns with what users expect, whether it’s avoiding certain phrases or including specific types of information.
  • For some applications, ensuring that certain terms or phrases are included or excluded is critical. For example, if an LLM is answering questions about product documentation, it should not leak actual users’ API keys.
  • If the LLM is supposed to include specific keywords or phrases, this test can verify their presence, which helps confirm that the generated output stays relevant to the task.

Guide

To create a string validation test, you must select a column with string values and specify a pattern to check for.

The following patterns are supported:

  • Contains / Does not contain: The string contains or does not contain a specific substring.
  • iContains / Does not icontain: The string contains or does not contain a specific substring, ignoring case.
  • Contains all / Does not contain all: The string contains or does not contain all of the specified substrings (separated by commas).
  • Contains any / Does not contain any: The string contains any or does not contain any of the specified substrings (separated by commas).
  • Matches regex / Does not match regex: The string matches or does not match a specific regular expression.
  • Equal / Not equal: The string is equal or not equal to a specific value.
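
The semantics of the patterns above can be sketched in a few lines of Python. This is only an illustrative sketch, not the platform's implementation: the `validate` function, the pattern names such as `"contains_all"`, and the `negate` flag are hypothetical names chosen for this example. It also assumes that comma-separated substrings are trimmed of surrounding whitespace and that a regex "matches" when it is found anywhere in the string; the actual behavior may differ.

```python
import re


def split_substrings(value: str) -> list[str]:
    """Split a comma-separated value into trimmed, non-empty substrings."""
    return [part.strip() for part in value.split(",") if part.strip()]


def validate(text: str, pattern: str, value: str, negate: bool = False) -> bool:
    """Return True if `text` satisfies the (hypothetical) pattern.

    Setting negate=True gives the "Does not ..." / "Not equal" variant.
    """
    if pattern == "contains":
        result = value in text
    elif pattern == "icontains":
        # Case-insensitive substring check.
        result = value.lower() in text.lower()
    elif pattern == "contains_all":
        result = all(s in text for s in split_substrings(value))
    elif pattern == "contains_any":
        result = any(s in text for s in split_substrings(value))
    elif pattern == "matches_regex":
        # Assumption: a match anywhere in the string counts.
        result = re.search(value, text) is not None
    elif pattern == "equal":
        result = text == value
    else:
        raise ValueError(f"Unknown pattern: {pattern}")
    return not result if negate else result


answer = "Set the API key as an environment variable before running."

print(validate(answer, "icontains", "api KEY"))                 # True
print(validate(answer, "contains_all", "API key, environment")) # True
# Hypothetical key format, used only to illustrate a "does not match regex" check.
print(validate(answer, "matches_regex", r"sk-[A-Za-z0-9]{20,}", negate=True))  # True
```

Treating each "Does not ..." pattern as the plain negation of its positive counterpart keeps the sketch small; under that reading, "Does not contain all" passes as soon as at least one of the listed substrings is missing.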

Test configuration examples

If you are writing a tests.json file, here are a few valid configurations for the string validation test: