# IBM CLEAR Adapter
The IBM CLEAR adapter integrates IBM CLEAR (Comprehensive LLM Error Analysis and Reporting) with the eval-hub evaluation service using the evalhub-sdk framework adapter pattern.
## Overview

CLEAR runs an agentic, step-by-step pipeline over JSON traces (for example, MLflow-style agent traces). It uses an LLM-as-judge to identify recurring failure patterns and writes a structured report.
## Key Features

- Agentic evaluation pipeline: Multi-step LLM-as-judge analysis of agent interaction traces
- Failure pattern detection: Identifies and clusters recurring error patterns across runs
- Trace-native input: Processes MLflow-style JSON agent traces directly
- Structured reporting: Outputs `clear_results.json` with categorised issue statistics and scores
- Flexible inference backends: LiteLLM (default) or direct OpenAI-compatible endpoints
## Supported Trace Formats

- MLflow agent traces (JSON)
- LangGraph agent traces
- Any JSON trace format conforming to the CLEAR input schema
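To make the input shape concrete, here is a rough sketch of preparing a trace directory. The field names inside the trace (`trace_id`, `spans`, and so on) are illustrative assumptions, not the actual CLEAR input schema or MLflow trace format; only the "directory of `*.json` files" convention comes from this document.

```python
import json
import tempfile
from pathlib import Path

# Hypothetical minimal trace: field names are illustrative only and do NOT
# reflect the real CLEAR input schema or the MLflow trace export format.
trace = {
    "trace_id": "run-001",
    "spans": [
        {"name": "agent.plan", "inputs": {"query": "Find flights to Paris"},
         "outputs": {"plan": "search, then book"}},
        {"name": "agent.tool_call", "inputs": {"tool": "flight_search"},
         "outputs": {"results": 3}},
    ],
}

# CLEAR consumes a directory of *.json trace files, so each trace is
# written as its own file.
trace_dir = Path(tempfile.mkdtemp())
(trace_dir / "run-001.json").write_text(json.dumps(trace, indent=2))
print(sorted(p.name for p in trace_dir.glob("*.json")))  # → ['run-001.json']
```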
## Architecture

The adapter resolves where traces live, runs the CLEAR agentic pipeline, reads `clear_results.json`, maps CLEAR statistics into `JobResults` / `EvaluationResult` metrics, reports progress to the eval-hub sidecar, and optionally pushes artifacts to MLflow or an OCI bundle.
Workflow:

- Input traces — Prefers `/test_data` or `/data` when eval-hub has staged data from S3 (`test_data_ref`), or set `parameters.data_dir` to a directory of `*.json` traces.
- Configuration — Job parameters drive CLEAR (`eval_model_name`, `provider`, `inference_backend`, frameworks, etc.); `model.url` is used as the OpenAI-compatible endpoint.
- Execution — CLEAR prepares trace data and runs the step-by-step agentic pipeline.
- Output — Metrics (interactions, issues, agent scores) are returned to eval-hub; `clear_results.json` is persisted under the run output.
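The output step above can be sketched as a small mapping function. The shape of `clear_results` below is a hypothetical example (the real file's field names may differ), and `to_metrics` is an illustrative helper, not the adapter's actual code.

```python
# Hypothetical clear_results.json content; the real file's field names
# may differ from this sketch.
clear_results = {
    "num_interactions": 120,
    "issues": {"tool_misuse": 14, "hallucinated_step": 6},
    "agent_score": 0.87,
}

def to_metrics(results: dict) -> dict:
    """Flatten CLEAR statistics into a flat metric dict, roughly as the
    adapter might when building an EvaluationResult for eval-hub."""
    metrics = {
        "interactions": results["num_interactions"],
        "agent_score": results["agent_score"],
    }
    # One metric per detected issue category, namespaced under "issues.".
    for issue, count in results["issues"].items():
        metrics[f"issues.{issue}"] = count
    return metrics

print(to_metrics(clear_results))
```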
## Quick Start

### Running Locally

```shell
export EVALHUB_MODE=local
export EVALHUB_JOB_SPEC_PATH=meta/job.json

# Point at a directory of agent trace JSON files
export EVALHUB_DATA_DIR=./my-traces
```
```shell
python main.py
```

### Running on Kubernetes

Submit a job through the eval-hub API using provider `ibm-clear` and benchmark `agentic-evaluation`.
Traces from S3:

- Upload trace files to `s3://my-bucket/traces/`
- Configure the job's `test_data_ref.s3` field
- The adapter auto-discovers `*.json` files under `/test_data` inside the pod
## Configuration Parameters

| Parameter | Type | Description |
|---|---|---|
| `data_dir` | string | Directory containing `*.json` trace files |
| `eval_model_name` | string | LLM judge model name (e.g. `openai/gpt-4o`) |
| `provider` | string | Inference provider (`openai`, `anthropic`, etc.) |
| `agent_framework` | string | Agent framework used to generate traces (e.g. `langgraph`) |
| `observability_framework` | string | Observability framework (e.g. `mlflow`) |
| `inference_backend` | string | `litellm` (default) or `endpoint` |
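As a sketch of how these parameters might be consumed, here is an illustrative helper (not the adapter's actual code) that applies the documented `litellm` default and rejects unknown backends:

```python
# Illustrative helper; not part of the adapter's real implementation.
# The default value and the two allowed backends come from the table above.
DEFAULTS = {"inference_backend": "litellm"}
ALLOWED_BACKENDS = {"litellm", "endpoint"}

def resolve_params(parameters: dict) -> dict:
    """Merge job parameters over defaults and validate the backend choice."""
    params = {**DEFAULTS, **parameters}
    if params["inference_backend"] not in ALLOWED_BACKENDS:
        raise ValueError(f"unknown inference_backend: {params['inference_backend']}")
    return params

params = resolve_params({"eval_model_name": "openai/gpt-4o", "provider": "openai"})
print(params["inference_backend"])  # → litellm
```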
## Provider Details

| Field | Value |
|---|---|
| Provider ID | ibm-clear |
| Benchmark ID | agentic-evaluation |
## Source

- Adapter: `eval-hub-contrib/adapters/clear`
- Upstream: IBM/CLEAR