Configuration Reference

Complete reference for RAGAS adapter configuration options.

The RAGAS adapter uses a standardised JobSpec structure:

{
  "id": "string",
  "provider_id": "ragas",
  "benchmark_id": "string",
  "model": {
    "name": "string",
    "url": "string"
  },
  "parameters": {
    // RAGAS-specific configuration
  },
  "num_examples": 0
}

| Parameter | Type | Description | Example |
|---|---|---|---|
| id | string | Unique job identifier | "ragas-rag-eval-001" |
| provider_id | string | Must be "ragas" | "ragas" |
| benchmark_id | string | Benchmark identifier | "ragas_rag_default" |
| model.name | string | Model name for the LLM judge | "Qwen/Qwen2.5-1.5B-Instruct" |
| model.url | string | OpenAI-compatible API endpoint | "http://localhost:8000" |

| Parameter | Type | Description | Default |
|---|---|---|---|
| num_examples | integer | Limit the number of dataset samples to evaluate | All samples |
| callback_url | string | EvalHub service callback URL | null |
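
As a rough illustration of the tables above, the following sketch checks a JobSpec dictionary for the listed required fields before it is submitted. The function name `check_job_spec` and the error strings are hypothetical and not part of the adapter; the adapter's own validation may behave differently.

```python
# Required fields, as listed in the JobSpec table (dotted paths for nested keys).
REQUIRED_FIELDS = ("id", "provider_id", "benchmark_id", "model.name", "model.url")


def check_job_spec(job_spec: dict) -> list:
    """Return a list of human-readable problems; an empty list means no problem found.

    Hypothetical helper for illustration only, not the adapter's validator.
    """
    problems = []
    for dotted in REQUIRED_FIELDS:
        node = job_spec
        # Walk nested keys such as "model.name".
        for key in dotted.split("."):
            node = node.get(key) if isinstance(node, dict) else None
        if node is None or node == "":
            problems.append(f"missing required field: {dotted}")
    provider = job_spec.get("provider_id")
    if provider not in (None, "", "ragas"):
        problems.append('provider_id must be "ragas"')
    return problems


example_spec = {
    "id": "ragas-rag-eval-001",
    "provider_id": "ragas",
    "benchmark_id": "ragas_rag_default",
    "model": {"name": "Qwen/Qwen2.5-1.5B-Instruct", "url": "http://localhost:8000"},
    "parameters": {},
}
print(check_job_spec(example_spec))
```

Running a check like this locally can catch a missing `model.url` or a mistyped `provider_id` before the job reaches the service.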

Two pre-defined benchmark suites are available:

Runs the four core RAG evaluation metrics. Suitable for most use cases.

| Setting | Value |
|---|---|
| Metrics | answer_relevancy, context_precision, faithfulness, context_recall |
| Primary score | faithfulness |
| Pass threshold | 0.5 |

Runs all 11 available metrics for comprehensive RAG evaluation.

| Setting | Value |
|---|---|
| Metrics | All 11 metrics (see Metrics reference) |
| Primary score | faithfulness |
| Pass threshold | 0.5 |
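
To make the primary score and pass threshold concrete, here is a minimal sketch of a pass/fail check. It assumes that a run passes when the primary score is greater than or equal to the threshold; this page does not say whether the comparison is inclusive, and the function name `benchmark_passed` is hypothetical.

```python
PRIMARY_METRIC = "faithfulness"  # primary score for both benchmark suites
PASS_THRESHOLD = 0.5             # pass threshold for both benchmark suites


def benchmark_passed(scores: dict) -> bool:
    """Assumption: a run passes when the primary score is >= the threshold."""
    return scores[PRIMARY_METRIC] >= PASS_THRESHOLD


# Illustrative scores only, not real results.
print(benchmark_passed({"faithfulness": 0.72, "context_recall": 0.41}))
```

Under this reading, only the primary metric decides the outcome; the other metrics are reported but do not affect pass or fail.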

All configuration is specified in the parameters object of the JobSpec.

| Parameter | Type | Description | Default |
|---|---|---|---|
| metrics | array | List of RAGAS metric names to evaluate | Benchmark default |

Available metrics: answer_relevancy, answer_similarity, context_precision, faithfulness, context_recall, context_entity_recall, nv_accuracy, nv_context_relevance, factual_correctness, noise_sensitivity, nv_response_groundedness.

See the Metrics reference for details on each metric.
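
A small client-side check against the list of available metrics can catch typos in the `metrics` array early. The sketch below is illustrative; `unknown_metrics` is a hypothetical helper, and how the adapter itself reacts to an unknown metric name is not described here.

```python
# The 11 metric names listed in this reference.
AVAILABLE_METRICS = frozenset({
    "answer_relevancy",
    "answer_similarity",
    "context_precision",
    "faithfulness",
    "context_recall",
    "context_entity_recall",
    "nv_accuracy",
    "nv_context_relevance",
    "factual_correctness",
    "noise_sensitivity",
    "nv_response_groundedness",
})


def unknown_metrics(requested: list) -> list:
    """Return the requested names that are not in the available set, in order."""
    return [name for name in requested if name not in AVAILABLE_METRICS]


# "answer_correctness" is a deliberately invalid name for this example.
print(unknown_metrics(["faithfulness", "answer_correctness"]))
```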

| Parameter | Type | Description | Default |
|---|---|---|---|
| max_tokens | integer | Maximum tokens for LLM completions | null (server default) |
| temperature | number | Sampling temperature for LLM completions | null (adapter default) |

| Parameter | Type | Description | Default |
|---|---|---|---|
| embedding_model | string | Model name for embeddings | Same as model.name |
| embedding_url | string | Base URL for embeddings endpoint | Same as model.url |

| Parameter | Type | Description | Default |
|---|---|---|---|
| data_path | string | Explicit path to dataset file | Auto-resolved |
| column_map | object | Map dataset column names to RAGAS expected names | null |
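
The embedding defaults above fall back to the judge model settings. The sketch below shows one way to express that fallback; `resolve_embedding_settings` is a hypothetical name, and treating an empty string the same as an unset value is an assumption.

```python
def resolve_embedding_settings(job_spec: dict) -> dict:
    """Apply the documented fallbacks: embedding_* defaults to model.*.

    Assumption: missing, null, or empty values all trigger the fallback.
    """
    params = job_spec.get("parameters") or {}
    model = job_spec["model"]
    return {
        "embedding_model": params.get("embedding_model") or model["name"],
        "embedding_url": params.get("embedding_url") or model["url"],
    }


spec = {
    "model": {"name": "Qwen/Qwen2.5-1.5B-Instruct", "url": "http://localhost:8000"},
    "parameters": {"embedding_model": "all-MiniLM-L6-v2"},
}
# embedding_model is set explicitly; embedding_url falls back to model.url.
print(resolve_embedding_settings(spec))
```

Note that falling back to `model.name` only works if the judge endpoint also serves embeddings; otherwise both embedding parameters likely need to be set explicitly.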

Column mapping: RAGAS expects columns named user_input, response, retrieved_contexts, and reference. If your dataset uses different names, provide a mapping:

{
  "parameters": {
    "column_map": {
      "question": "user_input",
      "answer": "response",
      "contexts": "retrieved_contexts",
      "ground_truth": "reference"
    }
  }
}
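
To illustrate what such a mapping does, the sketch below renames the keys of a single dataset record according to a `column_map`. It assumes, as in the example above, that keys are the dataset's column names and values are the names RAGAS expects. `apply_column_map` is a hypothetical helper; the adapter's internal implementation may differ.

```python
def apply_column_map(record: dict, column_map) -> dict:
    """Rename record keys using column_map; unmapped keys are kept unchanged."""
    if not column_map:
        return dict(record)
    return {column_map.get(key, key): value for key, value in record.items()}


column_map = {
    "question": "user_input",
    "answer": "response",
    "contexts": "retrieved_contexts",
    "ground_truth": "reference",
}
# Illustrative record, not taken from a real dataset.
record = {
    "question": "What is RAG?",
    "answer": "Retrieval-augmented generation.",
    "contexts": ["RAG combines retrieval with generation."],
    "ground_truth": "Retrieval-augmented generation.",
}
print(apply_column_map(record, column_map))
```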

Data resolution order (when data_path is not set):

  1. /test_data/dataset.jsonl, populated by EvalHub’s S3 init container
  2. First .jsonl or .json file in /test_data/
  3. /data/dataset.jsonl
  4. First .jsonl or .json file in /data/
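
The resolution order above can be sketched as a short lookup function. This is an illustration, not the adapter's code: `resolve_dataset` is a hypothetical name, and "first" is interpreted here as first in alphabetical order, which the list above does not specify.

```python
from pathlib import Path


def resolve_dataset(data_path=None, search_dirs=("/test_data", "/data")):
    """Return the dataset path following the documented order, or None.

    Assumption: "first" file means first by sorted file name.
    """
    if data_path:
        # An explicit data_path bypasses auto-resolution.
        return Path(data_path)
    for directory in map(Path, search_dirs):
        preferred = directory / "dataset.jsonl"
        if preferred.is_file():
            return preferred
        if directory.is_dir():
            candidates = sorted(
                p for p in directory.iterdir()
                if p.is_file() and p.suffix in {".jsonl", ".json"}
            )
            if candidates:
                return candidates[0]
    return None
```

Because `/test_data/` is searched completely before `/data/`, any `.json` or `.jsonl` file in `/test_data/` takes precedence over `/data/dataset.jsonl`.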

| Parameter | Type | Description | Default |
|---|---|---|---|
| max_workers | integer | Parallel workers for RAGAS evaluation (1–10) | 1 |

RAGAS uses an LLM as a judge for nine of the eleven metrics; some metrics also (or only) require an embeddings endpoint. The judge model receives structured prompts and must return parseable JSON responses.

| Metric | LLM Judge | Embeddings |
|---|---|---|
| faithfulness | Yes | No |
| answer_relevancy | Yes | Yes |
| context_precision | Yes | No |
| context_recall | Yes | No |
| answer_similarity | No | Yes |
| context_entity_recall | No | No |
| factual_correctness | Yes | No |
| noise_sensitivity | Yes | No |
| nv_accuracy | Yes | No |
| nv_context_relevance | Yes | No |
| nv_response_groundedness | Yes | No |

  • Each sample is evaluated independently per metric. For N samples and M judge-based metrics, expect roughly N * M LLM calls.
  • Judge prompts are structured and can be lengthy, so set max_tokens appropriately (512 is usually sufficient).
  • Use a lower temperature (e.g. 0.1) for more deterministic judge outputs.
  • The adapter uses chat completions (not legacy completions) to avoid truncation issues.
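
The N * M rule of thumb above can be turned into a quick estimate for planning a run. The sketch below counts only the metrics marked as judge-based in the table; `estimate_llm_calls` is a hypothetical helper, and the real number of calls may be higher because individual metrics can issue more than one request per sample.

```python
# Metrics marked "Yes" in the LLM Judge column of the table above.
JUDGE_METRICS = frozenset({
    "faithfulness",
    "answer_relevancy",
    "context_precision",
    "context_recall",
    "factual_correctness",
    "noise_sensitivity",
    "nv_accuracy",
    "nv_context_relevance",
    "nv_response_groundedness",
})


def estimate_llm_calls(num_samples: int, metrics: list) -> int:
    """Rough estimate: one judge call per sample per judge-based metric."""
    judge_based = [m for m in metrics if m in JUDGE_METRICS]
    return num_samples * len(judge_based)


default_metrics = ["answer_relevancy", "context_precision", "faithfulness", "context_recall"]
print(estimate_llm_calls(5, default_metrics))  # 5 samples * 4 judge metrics = 20
```

Adding `answer_similarity` to the list would not change the estimate, since that metric uses embeddings only.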

The adapter reads runtime settings from environment variables:

| Variable | Description | Required | Default |
|---|---|---|---|
| EVALHUB_MODE | Execution mode (k8s or local) | No | k8s |
| EVALHUB_JOB_SPEC_PATH | Path to job spec JSON | Yes (local mode) | /meta/job.json |
| SERVICE_URL | Eval-hub service URL | No | null |
| REGISTRY_URL | OCI registry URL | No | null |
| REGISTRY_USERNAME | Registry username | No | null |
| REGISTRY_PASSWORD | Registry password | No | null |
| REGISTRY_INSECURE | Allow insecure registry | No | false |
| LOG_LEVEL | Logging level | No | INFO |

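
For local experimentation, the variables and defaults in the table above could be read along the following lines. This is a sketch under assumptions: `load_runtime_settings` and the returned key names are hypothetical, and the set of strings accepted as true for REGISTRY_INSECURE is a guess rather than documented adapter behaviour.

```python
import os


def load_runtime_settings(env=None) -> dict:
    """Read the documented variables, applying the defaults from the table."""
    env = os.environ if env is None else env
    insecure_raw = env.get("REGISTRY_INSECURE", "false")
    return {
        "mode": env.get("EVALHUB_MODE", "k8s"),
        "job_spec_path": env.get("EVALHUB_JOB_SPEC_PATH", "/meta/job.json"),
        "service_url": env.get("SERVICE_URL"),
        "registry_url": env.get("REGISTRY_URL"),
        "registry_username": env.get("REGISTRY_USERNAME"),
        "registry_password": env.get("REGISTRY_PASSWORD"),
        # Assumption: "1", "true", and "yes" (any case) mean true.
        "registry_insecure": insecure_raw.strip().lower() in {"1", "true", "yes"},
        "log_level": env.get("LOG_LEVEL", "INFO"),
    }


# Passing a plain dict instead of os.environ keeps the example reproducible.
settings = load_runtime_settings({"EVALHUB_MODE": "local", "REGISTRY_INSECURE": "True"})
print(settings["mode"], settings["job_spec_path"], settings["registry_insecure"])
```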
{
  "id": "ragas-rag-eval-001",
  "provider_id": "ragas",
  "benchmark_id": "ragas_rag_default",
  "benchmark_index": 0,
  "model": {
    "url": "http://127.0.0.1:8000",
    "name": "Qwen/Qwen2.5-1.5B-Instruct"
  },
  "num_examples": 5,
  "parameters": {
    "metrics": [
      "answer_relevancy",
      "context_precision",
      "faithfulness",
      "context_recall"
    ],
    "embedding_model": "all-MiniLM-L6-v2",
    "embedding_url": "http://127.0.0.1:8001",
    "max_tokens": 512,
    "temperature": 0.1
  },
  "callback_url": "http://localhost:8080"
}