AgentEval.Cli 0.2.1-alpha

This is a prerelease version of AgentEval.Cli. The package contains a .NET tool you can call from the shell/command line.

Install as a global tool:

dotnet tool install --global AgentEval.Cli --version 0.2.1-alpha

Install as a local tool (create the tool manifest first if you are setting up this repo):

dotnet new tool-manifest
dotnet tool install --local AgentEval.Cli --version 0.2.1-alpha

Cake:

#tool dotnet:?package=AgentEval.Cli&version=0.2.1-alpha&prerelease

NUKE:

nuke :add-package AgentEval.Cli --version 0.2.1-alpha


AgentEval CLI

Command-line interface for AgentEval — evaluate any OpenAI-compatible AI agent from the terminal.

Installation

dotnet tool install -g AgentEval.Cli --prerelease

Compatibility

AgentEval CLI   AgentEval    MAF         .NET
0.2.0-alpha     0.6.0-beta   1.0.0-rc3   9.0, 10.0
0.1.0-alpha     0.5.3-beta   1.0.0-rc2   9.0, 10.0

Quick Start

Initialize a test dataset

agenteval init
agenteval init -o my-tests.yaml
agenteval init --format json

Run evaluations

# Against Azure OpenAI
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml

# Against OpenAI directly
agenteval eval --endpoint https://api.openai.com/v1 --model gpt-4o --dataset agenteval.yaml

# Against a local Ollama model
agenteval eval --endpoint http://localhost:11434/v1 --model llama3 --dataset agenteval.yaml

Stochastic evaluation (multi-run)

agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --runs 5 --success-threshold 0.9
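
The README does not spell out how --runs and --success-threshold interact. One plausible reading, stated here as an assumption, is that a test passes when the fraction of successful runs reaches the threshold. The sketch below only illustrates that arithmetic; meets_threshold is a hypothetical helper and not part of the CLI.

```shell
#!/usr/bin/env bash
# Hypothetical helper: not part of AgentEval. Assumes "success rate" means
# passed_runs / total_runs and that the threshold comparison is inclusive.
meets_threshold() {
  local passed=$1 total=$2 threshold=$3
  # awk handles the floating-point comparison that plain bash cannot.
  awk -v p="$passed" -v t="$total" -v th="$threshold" \
    'BEGIN { exit !(t > 0 && p / t >= th) }'
}

# Under this reading, with --runs 5 --success-threshold 0.9,
# 4 of 5 passing runs (0.8) falls short and 5 of 5 (1.0) clears the bar.
if meets_threshold 4 5 0.9; then echo "pass"; else echo "fail"; fi
if meets_threshold 5 5 0.9; then echo "pass"; else echo "fail"; fi
```

If the CLI defines the threshold differently (for example per metric rather than per test), the comparison would change but the idea of tolerating occasional failed runs stays the same.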

Export results

# Single file export
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --format json -o results.json

# Structured directory export (ADR-002 format)
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --format directory --output-dir results/
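
When exporting on every CI run, it can help to give each run its own directory. The README does not say whether an existing --output-dir is overwritten, so the sketch below simply sidesteps the question with a timestamped path; run_output_dir is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Hypothetical helper: not part of AgentEval. Builds a per-run directory name
# such as results/20260305T101500Z so successive exports land in separate folders.
run_output_dir() {
  local base=${1:-results}
  # ${base%/} drops a trailing slash so the path has exactly one separator.
  printf '%s/%s\n' "${base%/}" "$(date -u +%Y%m%dT%H%M%SZ)"
}

out_dir=$(run_output_dir results/)
echo "$out_dir"
# Pass the result to the directory export shown above:
#   --format directory --output-dir "$out_dir"
```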

Red team security scanning

# Run all 9 attack types
agenteval redteam --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --intensity moderate

# Run specific attacks
agenteval redteam --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --attacks PromptInjection,Jailbreak --format sarif

List available metrics and attacks

agenteval list
agenteval list --type metrics
agenteval list --type attacks

Authentication

AgentEval supports two endpoint modes: Azure OpenAI (--azure together with --endpoint and --deployment-name) and OpenAI-compatible (--endpoint with --model, without --azure).

Azure OpenAI (--azure)

The --azure flag uses AzureOpenAIClient. Both --endpoint and --deployment-name are required:

Setting      Flag                Env var fallback
Endpoint     --endpoint          (required)
Deployment   --deployment-name   (required)
API Key      --api-key           AZURE_OPENAI_API_KEY

# Explicit key
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml --api-key sk-...

# Key from env var
export AZURE_OPENAI_API_KEY=sk-...
agenteval eval --azure --endpoint https://myresource.openai.azure.com/ --deployment-name gpt-4o --dataset agenteval.yaml

Note: --deployment-name is the name you gave your model deployment in Azure AI Foundry, not the underlying model name.
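
Relying on the env var fallback keeps the key out of shell history, but a missing variable is easy to overlook in scripts. A minimal pre-flight check, assuming you want to fail fast before calling the CLI, might look like this; require_azure_key is a hypothetical helper and not something AgentEval provides.

```shell
#!/usr/bin/env bash
# Hypothetical pre-flight helper: not part of AgentEval.
# Returns 0 when AZURE_OPENAI_API_KEY is set and non-empty, 1 otherwise.
require_azure_key() {
  if [ -z "${AZURE_OPENAI_API_KEY:-}" ]; then
    echo "AZURE_OPENAI_API_KEY is not set; export it or pass --api-key" >&2
    return 1
  fi
}

# Usage: guard the eval call so a missing key fails with a clear message.
# require_azure_key && agenteval eval --azure \
#   --endpoint https://myresource.openai.azure.com/ \
#   --deployment-name gpt-4o --dataset agenteval.yaml
```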

OpenAI-compatible (--endpoint)

For OpenAI, Ollama, Groq, vLLM, LM Studio, Together.ai, or any OpenAI-compatible API:

# OpenAI (set OPENAI_API_KEY or use --api-key)
agenteval eval --endpoint https://api.openai.com/v1 --model gpt-4o --dataset agenteval.yaml --api-key sk-...

# Local Ollama (no key needed)
agenteval eval --endpoint http://localhost:11434/v1 --model llama3 --dataset agenteval.yaml
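
The two modes differ only in a few flags, so a script that targets several providers can assemble the command from a mode name. The sketch below uses only flags that appear in this README and prints the command rather than running it; build_eval_cmd and the mode names "azure" and "openai" are hypothetical.

```shell
#!/usr/bin/env bash
# Hypothetical wrapper: not part of AgentEval. It assembles flags documented
# above and prints the resulting command instead of executing it.
build_eval_cmd() {
  local mode=$1 endpoint=$2 name=$3 dataset=$4
  local -a cmd=(agenteval eval)
  case "$mode" in
    # Azure mode: $name is the deployment name, not the underlying model.
    azure)  cmd+=(--azure --endpoint "$endpoint" --deployment-name "$name") ;;
    # OpenAI-compatible mode: $name is the model name.
    openai) cmd+=(--endpoint "$endpoint" --model "$name") ;;
    *)      echo "unknown mode: $mode" >&2; return 1 ;;
  esac
  cmd+=(--dataset "$dataset")
  printf '%s\n' "${cmd[*]}"
}

build_eval_cmd openai http://localhost:11434/v1 llama3 agenteval.yaml
```

Printing the command first makes it easy to review before piping it into a CI step.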

Commands

Command   Description
eval      Run evaluations against an AI agent endpoint
init      Scaffold a sample test dataset file
list      List available metrics and attack types
redteam   Run red team security scans

Requirements

  • .NET 9.0 or 10.0
  • An AI agent endpoint (Azure OpenAI, OpenAI, Ollama, or any OpenAI-compatible API)
  • Built on Microsoft.Extensions.AI (MAF 1.0.0-rc3)

Contributing

Contributions are welcome! Please open an issue or pull request.

For discussions and questions, visit the AgentEval Discussions on the main repository.

License

MIT License. See LICENSE for details.

Target frameworks: net9.0 and net10.0 are compatible (included in the package). The platform-specific variants of each (android, browser, ios, maccatalyst, macos, tvos, windows) were computed as compatible.

This package has no dependencies.

Version       Downloads   Last Updated
0.2.1-alpha   61          3/5/2026
0.2.0-alpha   45          3/5/2026
0.1.1-alpha   48          3/3/2026
0.1.0-alpha   43          3/1/2026