API Reference
Experiment Types
Dataclasses describing experiment specs and the results each run returns.
| ExperimentSpec | Declarative specification for one experiment run in a sweep. |
| ExperimentRunResult | Combined result returned by run_experiment. |
| AnnotationRunResult | Result returned by run_annotation. |
| MetricsRunResult | Result returned by run_metrics. |
| HumanReliabilityResult | Result returned by calculate_human_reliability. |
| HumanGroundTruthResult | Result returned by build_human_ground_truth. |
Experiment Functions
High-level entry points for running and sweeping experiments.
| run_experiment | Run one end-to-end experiment and evaluate it against ground truth. |
| run_experiment_grid | Run a whole parameter sweep from a grid or a prebuilt spec list. |
| expand_param_grid | Expand a parameter grid into concrete ExperimentSpec runs. |
| resolve_task_dir | Resolve a task directory from either a user path or bundled examples. |
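Expanding a parameter grid usually means taking the Cartesian product of the listed values. The snippet below is a minimal sketch of that idea only, not the package's implementation: `expand_grid_sketch` is a hypothetical helper, the parameter names `model` and `temperature` are illustrative assumptions, and the real `expand_param_grid` returns `ExperimentSpec` objects whose fields are not shown in this reference.

```python
from itertools import product
from typing import Any, Dict, Iterable, List, Mapping


def expand_grid_sketch(grid: Mapping[str, Iterable[Any]]) -> List[Dict[str, Any]]:
    """Return one dict per combination of grid values (Cartesian product)."""
    keys = list(grid)
    value_lists = [list(grid[key]) for key in keys]
    return [dict(zip(keys, combo)) for combo in product(*value_lists)]


# Hypothetical grid: two models crossed with two temperatures gives four runs.
runs = expand_grid_sketch({"model": ["a", "b"], "temperature": [0.0, 0.7]})
for run in runs:
    print(run)
```

Each resulting dict would correspond to one concrete run in the sweep; how those values map onto `ExperimentSpec` is left to the actual function.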
Lower-Level Workflow
Run annotation, scoring, and human-reliability steps independently.
| run_annotation | Run one annotation job and persist its outputs to disk. |
| run_metrics | Evaluate one model-output CSV against ground truth and persist the results. |
| calculate_human_reliability | Validate human coder CSVs and calculate inter-coder reliability metrics. |
| build_human_ground_truth | Build consensus human ground truth from coder CSVs and optional adjudications. |
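This reference does not say which inter-coder reliability metrics `calculate_human_reliability` reports. As an illustration of what such a metric looks like, the sketch below computes Cohen's kappa for two coders over the same items. Choosing Cohen's kappa is an assumption, and `cohen_kappa_sketch` is a hypothetical helper, not part of the package.

```python
from collections import Counter
from typing import Sequence


def cohen_kappa_sketch(coder_a: Sequence[str], coder_b: Sequence[str]) -> float:
    """Cohen's kappa: agreement between two coders, corrected for chance."""
    if len(coder_a) != len(coder_b) or not coder_a:
        raise ValueError("both coders must label the same non-empty set of items")
    n = len(coder_a)
    # Observed agreement: share of items where both coders chose the same label.
    observed = sum(a == b for a, b in zip(coder_a, coder_b)) / n
    # Expected agreement: chance overlap given each coder's label frequencies.
    counts_a, counts_b = Counter(coder_a), Counter(coder_b)
    labels = counts_a.keys() | counts_b.keys()
    expected = sum(counts_a[label] * counts_b[label] for label in labels) / (n * n)
    if expected == 1.0:
        return 1.0  # both coders used a single identical label throughout
    return (observed - expected) / (1.0 - expected)


kappa = cohen_kappa_sketch(["x", "x", "y", "y"], ["x", "x", "y", "x"])
print(round(kappa, 3))
```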
Validation and Retry Helpers
Response parsing and retry utilities used inside the annotation loop.
| annotate.extract_json_response | Extract and validate a JSON response based on the annotation type. |
| annotate.normalize_retry_strategy | Return a supported retry strategy, falling back to "identical". |
| annotate.classify_text | Annotate one text row across all sections in a codebook. |
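The two helper behaviors described above (falling back to `"identical"` and pulling JSON out of a model reply) can be sketched roughly as follows. This is an illustration under stated assumptions, not the package's code: the function names, the `SUPPORTED_STRATEGIES` set (including the made-up `"rephrase"` entry), and the case-insensitive matching are all hypothetical, and the annotation-type validation that `annotate.extract_json_response` performs is omitted.

```python
import json
from typing import Any, Dict, FrozenSet

# Hypothetical set of strategies; only "identical" is named in this reference.
SUPPORTED_STRATEGIES: FrozenSet[str] = frozenset({"identical", "rephrase"})


def normalize_retry_strategy_sketch(
    value: Any, supported: FrozenSet[str] = SUPPORTED_STRATEGIES
) -> str:
    """Return value if it names a supported strategy, else "identical"."""
    if isinstance(value, str):
        candidate = value.strip().lower()
        if candidate in supported:
            return candidate
    return "identical"


def extract_json_object_sketch(text: str) -> Dict[str, Any]:
    """Parse the first JSON object that starts at the first "{" in text."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in response")
    # raw_decode parses one JSON value and ignores any trailing text.
    obj, _end = json.JSONDecoder().raw_decode(text[start:])
    return obj


reply = 'Sure! {"label": "yes", "confidence": 0.9} Hope this helps.'
print(extract_json_object_sketch(reply))
print(normalize_retry_strategy_sketch("unknown-strategy"))
```

A malformed reply raises `json.JSONDecodeError` (a `ValueError` subclass), which a retry loop could catch to decide whether to re-prompt.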
Example Tasks
Discover and copy the bundled starter tasks.
| list_example_tasks | List bundled example task names shipped with the package. |
| get_example_task_dir | Return the filesystem path to a bundled example task. |
| get_example_task_files | Return the standard file paths for a bundled example task. |
| copy_example_task | Copy a bundled example task to a user-controlled directory. |
Prompts
Inspect built-in prompt wrappers and register custom ones.
| list_prompt_wrappers | Return the sorted names of all registered prompt wrappers. |
| get_prompt_wrapper | Return a registered prompt wrapper by name. |
| register_prompt_wrapper | Register a prompt wrapper for use in Python and CLI experiment configs. |
| PromptContext | Structured prompt-building context passed to prompt wrapper functions. |
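The list/get/register trio above follows a common name-to-callable registry pattern. The sketch below shows that pattern in isolation. It is not the package's implementation: the `_sketch` function names are hypothetical, and the wrappers here take a plain string, whereas the real ones receive a `PromptContext` whose fields are not documented in this reference.

```python
from typing import Callable, Dict, List

# Assumption: a wrapper maps input text to a prompt string.
PromptWrapperSketch = Callable[[str], str]

_REGISTRY: Dict[str, PromptWrapperSketch] = {}


def register_prompt_wrapper_sketch(name: str, fn: PromptWrapperSketch) -> None:
    """Store fn under name, replacing any earlier entry with that name."""
    _REGISTRY[name] = fn


def list_prompt_wrappers_sketch() -> List[str]:
    """Return the registered names in sorted order."""
    return sorted(_REGISTRY)


def get_prompt_wrapper_sketch(name: str) -> PromptWrapperSketch:
    """Look up a wrapper by name, listing the alternatives on failure."""
    try:
        return _REGISTRY[name]
    except KeyError:
        available = ", ".join(list_prompt_wrappers_sketch())
        raise KeyError(f"unknown prompt wrapper {name!r}; available: {available}") from None


register_prompt_wrapper_sketch("terse", lambda text: f"Label this text: {text}")
register_prompt_wrapper_sketch("careful", lambda text: f"Read carefully, then label: {text}")

print(list_prompt_wrappers_sketch())
print(get_prompt_wrapper_sketch("terse")("hello"))
```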
Ollama Helpers
Check that the local Ollama server is reachable, start it if needed, and pull models before a run.
| ensure_ollama_available | Check that the Ollama server is reachable, optionally starting it locally. |
| ensure_ollama_model | Pull an Ollama model so it is available locally before a run. |
| get_ollama_base_url | Return the Ollama base URL used for connectivity checks. |
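A connectivity check of the kind described above can be approximated with the standard library alone. The sketch below is an assumption about how such a check might work, not the package's code: both function names are hypothetical, and whether the package reads the `OLLAMA_HOST` environment variable is not stated in this reference. It relies on Ollama's default address `127.0.0.1:11434` and on the server answering a plain GET on its root path.

```python
import os
import urllib.error
import urllib.request
from typing import Mapping, Optional

DEFAULT_OLLAMA_URL = "http://127.0.0.1:11434"  # Ollama's default listen address


def resolve_base_url_sketch(env: Optional[Mapping[str, str]] = None) -> str:
    """Build a base URL from OLLAMA_HOST if set, else use the default."""
    env = os.environ if env is None else env
    host = env.get("OLLAMA_HOST", "").strip()
    if not host:
        return DEFAULT_OLLAMA_URL
    if "://" not in host:
        host = f"http://{host}"  # OLLAMA_HOST may be given as bare host:port
    return host.rstrip("/")


def is_server_reachable_sketch(base_url: str, timeout: float = 2.0) -> bool:
    """Return True if a GET on the base URL succeeds with HTTP 200."""
    try:
        with urllib.request.urlopen(base_url, timeout=timeout) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError, ValueError):
        # Connection refused, timeout, DNS failure, HTTP error, or malformed URL.
        return False


url = resolve_base_url_sketch()
print(url, "reachable" if is_server_reachable_sketch(url) else "not reachable")
```

This sketch only reports reachability; starting the server or pulling a model, as `ensure_ollama_available` and `ensure_ollama_model` do, is outside its scope.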