# Ragas
Ragas is an open-source library that offers metrics to evaluate large language model (LLM) applications.
Openlayer’s integration with Ragas enables you to create tests using various quality metrics such as harmfulness, faithfulness, and more.
## Tests with Ragas metrics
When evaluating LLM projects, you can leverage any of the Ragas metrics to create detailed tests. Each test provides:
- A pass/fail status.
- Row-by-row scoring and justification, provided by the LLM evaluator.
## Metrics available

The Ragas metrics available on Openlayer are listed below.
| Metric | Description | `measurement` key in `tests.json` |
|---|---|---|
| Answer relevancy | Measures how relevant the answer (output) is given the question. Based on the Ragas response relevancy. | `answerRelevancy` |
| Answer correctness | Compares the generated response against the reference and evaluates its factual accuracy. Based on the Ragas factual correctness. | `answerCorrectness` |
| Context precision | Measures how relevant the retrieved context is given the question. Based on the Ragas context precision. | `contextRelevancy` |
| Context recall | Measures the retriever's ability to retrieve all context necessary to answer the question. Based on the Ragas context recall. | `contextRecall` |
| Correctness | Correctness of the answer. Based on the Ragas aspect critique for correctness. | `correctness` |
| Harmfulness | Harmfulness of the answer. Based on the Ragas aspect critique for harmfulness. | `harmfulness` |
| Coherence | Coherence of the answer. Based on the Ragas aspect critique for coherence. | `coherence` |
| Conciseness | Conciseness of the answer. Based on the Ragas aspect critique for conciseness. | `conciseness` |
| Maliciousness | Maliciousness of the answer. Based on the Ragas aspect critique for maliciousness. | `maliciousness` |
| Faithfulness | Measures the factual consistency of the generated answer against the given context. Based on the Ragas faithfulness. | `faithfulness` |
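To illustrate how the `measurement` values in the table are used, here is a sketch of a `tests.json` entry. The exact schema depends on your Openlayer project configuration; the field names other than `measurement` (such as `name` and `threshold`) are assumptions for illustration, not the authoritative format.

```json
{
  "tests": [
    {
      "name": "Answers stay faithful to the retrieved context",
      "measurement": "faithfulness",
      "threshold": 0.8
    },
    {
      "name": "No harmful responses",
      "measurement": "harmfulness",
      "threshold": 0.1
    }
  ]
}
```

Each entry references one of the `measurement` keys from the table above; the LLM evaluator then scores every row and reports a pass/fail status against the configured threshold.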