AgentEval.Cli
0.2.1-alpha
This is a prerelease version of AgentEval.Cli.
dotnet tool install --global AgentEval.Cli --version 0.2.1-alpha
This package contains a .NET tool you can call from the shell/command line.
dotnet new tool-manifest
dotnet tool install --local AgentEval.Cli --version 0.2.1-alpha
#tool dotnet:?package=AgentEval.Cli&version=0.2.1-alpha&prerelease
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
nuke :add-package AgentEval.Cli --version 0.2.1-alpha

AgentEval CLI
Command-line interface for AgentEval: evaluate any OpenAI-compatible AI agent from the terminal.
Installation
dotnet tool install -g AgentEval.Cli --prerelease
Compatibility
| AgentEval CLI | AgentEval | MAF | .NET |
|---|---|---|---|
| 0.2.0-alpha | 0.6.0-beta | 1.0.0-rc3 | 9.0, 10.0 |
| 0.1.0-alpha | 0.5.3-beta | 1.0.0-rc2 | 9.0, 10.0 |
Quick Start
Initialize a test dataset
agenteval init
agenteval init -o my-tests.yaml
agenteval init --format json
Run evaluations
# Against Azure OpenAI
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml
# Against OpenAI directly
agenteval eval --endpoint https://api.openai.com/v1 --model gpt-4o --dataset agenteval.yaml
# Against a local Ollama model
agenteval eval --endpoint http://localhost:11434/v1 --model llama3 --dataset agenteval.yaml
Stochastic evaluation (multi-run)
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --runs 5 --success-threshold 0.9
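To make the two flags concrete, here is a minimal shell sketch of how a pass rate could be compared against a threshold. It assumes that --success-threshold is the minimum fraction of runs that must pass; this README does not spell out the exact rule, and `meets_threshold` is a hypothetical helper, not part of AgentEval.

```shell
#!/bin/sh
# Hypothetical helper, not part of AgentEval.
# Assumption: a test counts as successful when passed / runs >= threshold.
meets_threshold() {
  passed=$1 runs=$2 threshold=$3
  # awk does the floating-point comparison; exit status 0 means "meets threshold"
  awk -v p="$passed" -v r="$runs" -v t="$threshold" \
    'BEGIN { exit !(r > 0 && p / r >= t) }'
}

if meets_threshold 4 5 0.9; then
  echo "4/5 passes: meets 0.9"
else
  echo "4/5 passes: below 0.9"
fi
```

Under that assumption, --runs 5 with --success-threshold 0.9 would tolerate no failed runs, since 4 of 5 is only 0.8.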
Export results
# Single file export
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --format json -o results.json
# Structured directory export (ADR-002 format)
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --format directory --output-dir results/
Red team security scanning
# Run all 9 attack types
agenteval redteam --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --intensity moderate
# Run specific attacks
agenteval redteam --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --attacks PromptInjection,Jailbreak --format sarif
List available metrics and attacks
agenteval list
agenteval list --type metrics
agenteval list --type attacks
Authentication
AgentEval supports two endpoint modes: Azure OpenAI (--azure together with --endpoint) and OpenAI-compatible (--endpoint without --azure).
Azure OpenAI (--azure)
The --azure flag uses AzureOpenAIClient. Both --endpoint and --deployment-name are required:
| Setting | Flag | Env var fallback |
|---|---|---|
| Endpoint | --endpoint (required) | none |
| Deployment | --deployment-name (required) | none |
| API Key | --api-key | AZURE_OPENAI_API_KEY |
# Explicit key
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --api-key sk-...
# Key from env var
export AZURE_OPENAI_API_KEY=sk-...
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml
Note: --deployment-name is the name you gave your model deployment in Azure AI Foundry, not the underlying model name.
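The key lookup order described in the table above (explicit flag first, then the environment variable) can be sketched in shell. `resolve_azure_api_key` is a hypothetical helper for illustration, not AgentEval's own code. What the tool does when neither source is set is not stated here, so the sketch simply reports an error in that case.

```shell
#!/bin/sh
# Hypothetical helper illustrating the flag-then-env-var precedence.
# This is not AgentEval's implementation.
resolve_azure_api_key() {
  flag_value=${1:-}
  if [ -n "$flag_value" ]; then
    # An explicit --api-key value takes priority
    printf '%s\n' "$flag_value"
  elif [ -n "${AZURE_OPENAI_API_KEY:-}" ]; then
    # Otherwise fall back to the environment variable
    printf '%s\n' "$AZURE_OPENAI_API_KEY"
  else
    # Assumption: treat a missing key as an error
    echo "no API key: pass --api-key or set AZURE_OPENAI_API_KEY" >&2
    return 1
  fi
}
```

Calling `resolve_azure_api_key ""` with AZURE_OPENAI_API_KEY exported prints the environment value; passing a non-empty argument prints that argument instead.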
OpenAI-compatible (--endpoint)
For OpenAI, Ollama, Groq, vLLM, LM Studio, Together.ai, or any OpenAI-compatible API:
# OpenAI (set OPENAI_API_KEY or use --api-key)
agenteval eval --endpoint https://api.openai.com/v1 --model gpt-4o --dataset agenteval.yaml --api-key sk-...
# Local Ollama (no key needed)
agenteval eval --endpoint http://localhost:11434/v1 --model llama3 --dataset agenteval.yaml
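The difference between the two modes comes down to which flags accompany --endpoint. As a rough illustration, the following sketch prints the command line for either mode, using only flags that appear in this README. `build_eval_command` and its mode names (`azure`, `openai`) are hypothetical and exist only for this example.

```shell
#!/bin/sh
# Hypothetical wrapper: prints (does not run) the agenteval command for a mode.
# Only flags shown in this README are used.
build_eval_command() {
  mode=$1 endpoint=$2 model=$3 dataset=$4
  case "$mode" in
    azure)
      # Azure mode: the third argument is the deployment name, not the model name
      echo "agenteval eval --azure --endpoint $endpoint --deployment-name $model --dataset $dataset"
      ;;
    openai)
      # OpenAI-compatible mode: the third argument is the model name
      echo "agenteval eval --endpoint $endpoint --model $model --dataset $dataset"
      ;;
    *)
      echo "unknown mode: $mode" >&2
      return 1
      ;;
  esac
}

build_eval_command openai http://localhost:11434/v1 llama3 agenteval.yaml
# prints: agenteval eval --endpoint http://localhost:11434/v1 --model llama3 --dataset agenteval.yaml
```

Printing the command rather than executing it keeps the sketch safe to run on a machine without the tool installed.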
Commands
| Command | Description |
|---|---|
| eval | Run evaluations against an AI agent endpoint |
| init | Scaffold a sample test dataset file |
| list | List available metrics and attack types |
| redteam | Run red team security scans |
Requirements
- .NET 9.0 or 10.0
- An AI agent endpoint (Azure OpenAI, OpenAI, Ollama, or any OpenAI-compatible API)
- Built on Microsoft.Extensions.AI (MAF 1.0.0-rc3)
Documentation
Contributing
Contributions are welcome! Please open an issue or pull request.
For discussions and questions, visit the AgentEval Discussions on the main repository.
License
MIT License. See LICENSE for details.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.
| Version | Downloads | Last Updated |
|---|---|---|
| 0.2.1-alpha | 61 | 3/5/2026 |
| 0.2.0-alpha | 45 | 3/5/2026 |
| 0.1.1-alpha | 48 | 3/3/2026 |
| 0.1.0-alpha | 43 | 3/1/2026 |