AFClaude 0.2.1
See the version list below for details.
dotnet tool install --global AFClaude --version 0.2.1
dotnet new tool-manifest
dotnet tool install --local AFClaude --version 0.2.1
#tool dotnet:?package=AFClaude&version=0.2.1
nuke :add-package AFClaude --version 0.2.1
AFClaude
A local .NET 10 process that lets Claude (Claude Code / Claude Desktop) call a model
hosted on Azure AI Foundry, authenticated via az login (Entra ID / AzureCliCredential),
with no API keys on disk.
It wraps the Foundry model with Microsoft Agent Framework (ChatClientAgent over
IChatClient) and exposes it three ways:
- MCP stdio server (default) — the way Claude Code/Desktop actually consumes local
tools. Claude launches the process, talks JSON-RPC over stdio, and calls a tool
(e.g.
ask_foundry) that forwards the prompt to the Foundry deployment. launchmode — starts an Anthropic Messages API-compatible endpoint (POST /v1/messages) and execsclaudeitself pointed at it viaANTHROPIC_BASE_URL, so Claude Code's own traffic runs against the Foundry model. Tool use (Read/Edit/Bash/etc.) is bridged to Azure OpenAI function-calling; see Running claude against Foundry.- OpenAI-compatible HTTP proxy (
--http) —POST /v1/chat/completionsandGET /v1/models, for any other OpenAI-compatible client that wants to point at the same Foundry deployment overhttp://127.0.0.1:<port>/v1.
Why not just one HTTP mode? Claude Desktop/Code's MCP integration expects a stdio MCP server, not an HTTP endpoint at all. And when Claude Code is pointed at a custom endpoint via
ANTHROPIC_BASE_URL, it only speaks the Anthropic Messages API wire format (/v1/messages) — never OpenAI's/v1/chat/completions. So the three surfaces serve three distinct consumers: Claude via MCP (default), Claude Code's own model traffic via/v1/messages(launch), and everything else that already speaks OpenAI's HTTP API (--http).
See PLAN.md for the build plan, open decisions, and current status.
Architecture
Claude Code / Claude Desktop
│ JSON-RPC over stdio (MCP)
▼
AFClaude (this repo, .NET 10)
ChatClientAgent (Microsoft.Agents.AI)
│ IChatClient
▼
AzureOpenAIClient (Azure.AI.OpenAI)
│ Entra ID token (AzureCliCredential, scope https://ai.azure.com/.default)
▼
Azure AI Foundry model deployment
The HTTP proxy mode swaps the top of the stack for a Kestrel endpoint instead of an
MCP stdio transport, reusing the same ChatClientAgent/IChatClient underneath.
Prerequisites
- .NET 10 SDK
- Azure CLI, logged in with access to the target Foundry resource:
az login - An Azure AI Foundry (or Azure OpenAI) resource with a model deployment
- A data-plane RBAC role on that resource — grant the account the Cognitive Services OpenAI User role. Control-plane roles (even subscription Owner) do not grant inference access under Entra auth; without the data-plane role every request is rejected as unauthorized. Allow a minute or two for a fresh role assignment to propagate.
Configuration
Set via environment variables (or appsettings.json / dotnet user-secrets locally):
| Variable | Example | Notes |
|---|---|---|
Foundry__Endpoint |
https://<resource>.cognitiveservices.azure.com/ |
Use whatever az cognitiveservices account list reports as properties.endpoint. The .cognitiveservices.azure.com shape is verified live against a real AIServices/Foundry resource; .openai.azure.com resource endpoints should work identically. |
Foundry__Deployment |
gpt-4o-mini |
Deployment name, not the base model name. |
Foundry__CliTimeoutSeconds |
60 (default) |
How long to wait for the az CLI to produce a token. The Azure SDK default (13s) is too short for a cold az start on slow or loaded machines (14–24s observed) — AFClaude defaults to 60; raise it if you still see token-timeout errors. |
No API keys are configured — auth is entirely via AzureCliCredential (falls back to
other DefaultAzureCredential sources if you later want that instead).
Running locally
Default mode is the MCP stdio server (what Claude actually launches):
az login
$env:Foundry__Endpoint = "https://<resource>.openai.azure.com/"
$env:Foundry__Deployment = "<deployment-name>"
dotnet run
For the HTTP proxy instead, add --http (see below).
One current caveat:
AFClaude.csprojpinsAzure.AI.OpenAIto a2.9.0-beta.1prerelease. The latest stable release (2.1.0) is binary-incompatible with theOpenAIpackage version pulled in transitively byMicrosoft.Agents.AI.OpenAI(throwsMissingMethodExceptionon the first real chat call) — see PLAN.md decision 7. Revisit when a compatible stableAzure.AI.OpenAIships.
Integrating with Claude
Claude Code / Claude Desktop (MCP, primary)
dnx resolves AFClaude from NuGet like npx resolves an npm package — once a
version is published (see Publishing), no local build step
is required:
dnx AFClaude -y
Before a version is published (or while iterating locally), pack it and point dnx
at a local feed instead:
dotnet pack src/AFClaude/AFClaude.csproj -c Release -o local-feed
dnx AFClaude -y --add-source ./local-feed
Register it as an MCP server (e.g. in Claude Code's .mcp.json):
{
"mcpServers": {
"afclaude": {
"command": "dnx",
"args": ["AFClaude", "--yes"],
"env": {
"Foundry__Endpoint": "https://<resource>.openai.azure.com/",
"Foundry__Deployment": "<deployment-name>"
}
}
}
}
Claude then sees a single tool, ask_foundry (one required prompt string), that it
can call mid-conversation to delegate a prompt to the Foundry model. az login must
have been run in advance, in the same user/environment context dnx will inherit.
Running claude against Foundry (launch)
dnx AFClaude -y -- launch
This starts the Anthropic-compatible HTTP host on http://127.0.0.1:31337 (override
via AFClaude__Launch__Port), then execs claude with ANTHROPIC_BASE_URL pointed
at it and ANTHROPIC_MODEL set to the configured Foundry deployment. Any arguments
after launch are forwarded straight to claude — e.g.
dnx AFClaude -y -- launch --dangerously-skip-permissions. When claude exits,
AFClaude stops the proxy and exits with claude's exit code.
The /v1/messages translation bridges Anthropic tool calling to Azure OpenAI
function-calling in both directions: the tools array becomes function-tool
definitions, tool_use/tool_result history becomes assistant tool calls and
tool-role messages, and the model's function calls come back as tool_use content
blocks with stop_reason: "tool_use" — so Claude Code's coding-agent behavior
(Read, Edit, Bash, ...) works in this mode. max_tokens, temperature, top_p,
and stop_sequences pass through as well.
Current limitations. Anthropic built-in server tools (e.g. web search) have no function-calling counterpart and are skipped, and non-text content blocks (images, thinking) are dropped. Streaming responses are a single coalesced SSE burst rather than true incremental token streaming — the reply arrives all at once. The deployed model must also support function calling on the Azure side. See PLAN.md Phase 8 for what's left.
Other OpenAI-compatible clients (HTTP proxy, secondary)
Run AFClaude in HTTP mode and point any OpenAI-compatible client at:
http://127.0.0.1:5277/v1
This does not register with Claude directly — it's for tooling that already speaks
the OpenAI HTTP API and needs a local, key-free route to a Foundry deployment. HTTP
mode is opt-in: pass --http or set AFClaude__Mode=http (default mode is the MCP
stdio server above).
Publishing (maintainers)
.github/workflows/publish.yml builds, packs, and pushes AFClaude to nuget.org on
any v*.*.* tag push (or via manual workflow_dispatch), using NuGet Trusted
Publishing — no long-lived NuGet API key is stored in the repo. This requires a
one-time policy on nuget.org, configured with:
| Field | Value |
|---|---|
| Repository Owner | jamesburton |
| Repository | AFClaude |
| Workflow File | publish.yml |
| Environment | nuget |
See PLAN.md (Phase 4) for the full one-time setup checklist, including the
matching GitHub Environment and the NUGET_USER repo secret the workflow needs.
Status
Phases 1–8 done and verified: scaffold, HTTP proxy, MCP stdio server, dnx packaging,
a live NuGet Trusted Publishing pipeline, clean auth-error surfaces, the
/v1/messages + launch path, and full tool-use bridging for /v1/messages —
verified against a real Azure AI Foundry deployment (gpt-4.1, Entra auth via
az login): real chat completions, Anthropic-shaped text + streaming turns, a full
tool_use/tool_result round trip, MCP ask_foundry, and claude -p through launch
all pass. Auth failures are classified into actionable messages (az missing, session
expired, az CLI token timeout, missing data-plane RBAC role). See
PLAN.md Phase 9 for what's left (true incremental streaming, error-surface
parity).
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
This package has no dependencies.