Mythosia.AI
6.4.0
⚠️ Upgrading from v5.x? See the v6.0 Migration Guide.
Package Summary
The Mythosia.AI library provides a unified interface for various AI models with multimodal support, function calling, reasoning streaming, round-level token usage, and advanced streaming capabilities.
Supported Providers
- OpenAI — GPT-5.4 / 5.4 Mini / 5.4 Nano / 5.4 Pro / 5.3 Codex / 5.2 / 5.2 Codex / 5.1 / 5 (with reasoning), GPT-4.1, GPT-4o, o3
- Anthropic — Claude Opus 4.6 / 4.5 / 4.1 / 4, Sonnet 4.6 / 4.5 / 4, Haiku 4.5
- Google — Gemini 3 Flash/Pro Preview, Gemini 2.5 Pro/Flash/Flash-Lite
- DeepSeek — Chat and Reasoner models
- xAI — Grok 4, Grok 4.1 Fast, Grok 3, Grok 3 Mini
- Perplexity — Sonar with web search and citations
📚 Documentation
- Basic Usage Guide — Getting started with text queries, streaming, image analysis, and more
- Advanced Features — Function calling, policies, and enhanced streaming
- Release Notes — Full version history and migration guides
- Relationship to Microsoft.Extensions.AI — How IAIService and IChatClient differ
Installation
dotnet add package Mythosia.AI
For advanced LINQ operations with streams:
dotnet add package System.Linq.Async
For RAG (Retrieval-Augmented Generation) support:
dotnet add package Mythosia.AI.Rag
This adds .WithRag() to any AIService, enabling document-based context augmentation. See the Mythosia.AI.Rag README for full usage details.
using Mythosia.AI.Rag;
var service = new AnthropicService(apiKey, httpClient)
.WithRag(rag => rag
.AddDocument("manual.txt")
.AddDocument("policy.txt")
);
var response = await service.GetCompletionAsync("What is the refund policy?");
Quick Start
// OpenAI GPT
var gptService = new OpenAIService(apiKey, httpClient);
var gptResponse = await gptService.GetCompletionAsync("Hello!");
// Anthropic Claude
var claudeService = new AnthropicService(apiKey, httpClient);
var claudeResponse = await claudeService.GetCompletionAsync("Hello!");
// Google Gemini
var geminiService = new GoogleAIService(apiKey, httpClient);
geminiService.ChangeModel(AIModels.Google.Gemini3FlashPreview);
var geminiResponse = await geminiService.GetCompletionAsync("Hello!");
AIModels Catalog
Models are selected through provider-grouped string constants exposed on AIModels.
service.ChangeModel(AIModels.OpenAI.Gpt5_4);
service.ChangeModel(AIModels.Anthropic.ClaudeSonnet4_6);
service.ChangeModel(AIModels.Google.Gemini3FlashPreview);
Static Quick Helpers
For simple stateless usage, use AIService static helpers.
var answer = await AIService.QuickAskAsync(apiKey, "Summarize this text.");
var vision = await AIService.QuickAskWithImageAsync(apiKey, "Describe this image.", imagePath);
GPT-5 Family Configuration
GPT-5 family models (GPT-5 / 5.1 / 5.2 / 5.3 / 5.4) support type-safe reasoning configuration with per-model enums.
Reasoning Effort (Per-Model Enums)
Each GPT-5 variant has its own enum to ensure only valid options are available at compile time.
var gptService = (OpenAIService)service;
// GPT-5: Gpt5Reasoning (Auto/Minimal/Low/Medium/High)
gptService.WithGpt5Parameters(
reasoningEffort: Gpt5Reasoning.High,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.1: Gpt5_1Reasoning (Auto/None/Low/Medium/High) + Verbosity
gptService.WithGpt5_1Parameters(
reasoningEffort: Gpt5_1Reasoning.Medium,
verbosity: Verbosity.Low,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.2: Gpt5_2Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_2Parameters(
reasoningEffort: Gpt5_2Reasoning.XHigh,
verbosity: Verbosity.High);
// GPT-5.3 Codex: Gpt5_3Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_3Parameters(
reasoningEffort: Gpt5_3Reasoning.Medium,
verbosity: Verbosity.Medium,
reasoningSummary: ReasoningSummary.Concise);
// GPT-5.4 / 5.4 Pro: Gpt5_4Reasoning (Auto/None/Low/Medium/High/XHigh) + Verbosity
gptService.WithGpt5_4Parameters(
reasoningEffort: Gpt5_4Reasoning.Auto,
verbosity: Verbosity.High,
reasoningSummary: ReasoningSummary.Auto);
Auto uses the model-appropriate default (e.g., Medium for GPT-5, None for GPT-5.1/5.2, Medium for GPT-5.2 Pro/Codex, Medium for GPT-5.3 Codex, None for GPT-5.4, Medium for GPT-5.4 Pro).
Reasoning Summary
All GPT-5 family models support the ReasoningSummary enum (Auto / Concise / Detailed). Set it to null to disable reasoning summaries.
Gemini Configuration
Gemini 3 — ThinkingLevel
var geminiService = new GoogleAIService(apiKey, httpClient);
geminiService.ChangeModel(AIModels.Google.Gemini3FlashPreview);
// GeminiThinkingLevel enum: Auto / Minimal / Low / Medium / High
geminiService.ThinkingLevel = GeminiThinkingLevel.Low; // Auto = model default (High)
Gemini 2.5 — ThinkingBudget
geminiService.ChangeModel(AIModels.Google.Gemini2_5Pro);
geminiService.ThinkingBudget = 8192; // -1 = dynamic (default), 0 = disable
Gemini Streaming Reasoning (includeThoughts)
When streaming with StreamOptions.WithReasoning(), Mythosia.AI now requests Gemini thought chunks (includeThoughts: true) and emits them as StreamingContentType.Reasoning.
await foreach (var content in geminiService.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.Write($"[Gemini Thinking] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
Grok Configuration
Reasoning Effort
var grokService = new XAIService(apiKey, httpClient);
grokService.ChangeModel(AIModels.xAI.Grok3Mini);
// GrokReasoning enum: Off / Low / High
grokService.WithGrokParameters(reasoningEffort: GrokReasoning.High);
Note: Only grok-3-mini supports the reasoning_effort API parameter. Other Grok models ignore it.
Reasoning Content Streaming
Grok reasoning models (grok-3-mini, grok-4, grok-4-1-fast) stream reasoning_content when reasoning is enabled:
await foreach (var content in grokService.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.Write($"[Think] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
AIRequestProfile
Apply one-shot runtime overrides per request without mutating long-lived service configuration.
var response = await service.GetCompletionAsync(
"Rewrite this query for retrieval.",
RequestProfiles.QueryRewrite);
AIRequestContext
Use request-scoped prompt injection when you need to pass derived prompt data only for the current call without polluting the real conversation history or the service's base system message.
Available fields:
| Field | Purpose |
|---|---|
| SystemMessagePrefix | Text prepended to the system message for this request only |
| SystemMessageSuffix | Text appended to the system message for this request only |
| AdditionalMessages | Extra messages injected into the conversation for this request only (reference docs, few-shot examples) |
| RequestMessageOverride | Completely replaces the user message sent to the model while the original prompt stays in chat history |
Example — a query rewriter flow where the original user question should remain in chat history, but a retrieval-friendly rewrite is what actually gets sent to the model:
var rewrittenQuery = await service.GetCompletionAsync(
"Rewrite this question for retrieval.",
RequestProfiles.QueryRewrite);
var response = await service.GetCompletionAsync(
originalUserQuestion,
new AIRequestContext
{
RequestMessageOverride = new Message(ActorRole.User, rewrittenQuery)
});
Example — injecting retrieved RAG context as a suffix on the system message, without leaking it into conversation history:
var answer = await service.GetCompletionAsync(userQuestion,
new AIRequestContext
{
SystemMessageSuffix = $"\n\nUse the following context to answer:\n{retrievedDocs}"
});
For the full flow and before/after comparisons, see docs/request-contexts.md.
SystemMessageProvider — Automatic Baseline Injection
When the same dynamic data (today's date, active folder, session info) must be injected on every LLM call, passing an AIRequestContext at every entry point gets tedious and error-prone. AIService.SystemMessageProvider lets you register a callback once, and every outbound call (GetCompletionAsync, StreamAsync, RunAgentAsync, RunAgentStreamAsync) automatically invokes it to build a baseline context.
// Register once — typically at service construction / DI setup
service.WithSystemMessageProvider(() => new AIRequestContext
{
SystemMessageSuffix =
$"Today is {DateTime.UtcNow:yyyy-MM-dd}.\n" +
$"Current folder: {_uiContext.CurrentFolder}"
});
// Every call below automatically receives the baseline context
var answer = await service.GetCompletionAsync(userQuery);
await foreach (var chunk in service.StreamAsync(msg, options)) { /* ... */ }
var agentResult = await service.RunAgentAsync(goal);
When the baseline comes from a database, cache, or HTTP call, use the async overload so the provider does not have to block on .Result. Overload resolution picks the right one by lambda arity — no arg for sync, one CancellationToken for async:
service.WithSystemMessageProvider(async ct =>
{
var prefs = await _db.UserPreferences.FirstOrDefaultAsync(ct);
return new AIRequestContext
{
SystemMessageSuffix = $"User language: {prefs?.Language ?? "en"}"
};
});
Streaming paths (StreamAsync, RunAgentStreamAsync) forward the caller's CancellationToken through to the async provider. Non-streaming paths (GetCompletionAsync, RunAgentAsync) do not support cancellation — use the streaming counterparts if your provider needs to be cancellable.
When a call also passes an explicit AIRequestContext, the two merge field-by-field: explicit values win on scalar fields (SystemMessagePrefix, SystemMessageSuffix, RequestMessageOverride); AdditionalMessages concatenates (provider first, then explicit).
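The merge rule can be pictured with a small self-contained sketch. This is an illustrative model only, not the library's implementation: a context is reduced to a tuple of (Prefix, Suffix, Additional), the Merge helper is hypothetical, and null is assumed to mean "not set". RequestMessageOverride would follow the same rule as the other scalar fields.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for the merge rule described above; not library code.
static (string? Prefix, string? Suffix, List<string> Additional) Merge(
    (string? Prefix, string? Suffix, List<string> Additional) provider,
    (string? Prefix, string? Suffix, List<string> Additional) explicitCtx)
{
    return (
        explicitCtx.Prefix ?? provider.Prefix,   // explicit scalar wins when set
        explicitCtx.Suffix ?? provider.Suffix,   // otherwise the provider value is kept
        provider.Additional.Concat(explicitCtx.Additional).ToList()); // provider first, then explicit
}

(string? Prefix, string? Suffix, List<string> Additional) baseline =
    (null, "Today is 2025-01-01.", new List<string> { "session note" });
(string? Prefix, string? Suffix, List<string> Additional) perCall =
    ("Answer briefly.", null, new List<string> { "few-shot example" });

var merged = Merge(baseline, perCall);
Console.WriteLine(merged.Prefix);                          // Answer briefly.
Console.WriteLine(merged.Suffix);                          // Today is 2025-01-01.
Console.WriteLine(string.Join(" | ", merged.Additional));  // session note | few-shot example
```

In this model, a per-call context that leaves a scalar field unset inherits the provider's value, which is why the baseline suffix survives here.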
Available in Mythosia.AI v6.3.0+. Full details in docs/request-contexts.md.
Function Calling
Quick Start with Functions
// Define a simple function
var service = new OpenAIService(apiKey, httpClient)
.WithFunction(
"get_weather",
"Gets the current weather for a location",
("location", "The city and country", required: true),
(string location) => $"The weather in {location} is sunny, 22°C"
);
// AI will automatically call the function when needed
var response = await service.GetCompletionAsync("What's the weather in Seoul?");
// Output: "The weather in Seoul is currently sunny with a temperature of 22°C."
Attribute-Based Function Registration
public class WeatherService
{
[AiFunction("get_current_weather", "Gets the current weather for a location")]
public string GetWeather(
[AiParameter("The city name", required: true)] string city,
[AiParameter("Temperature unit", required: false)] string unit = "celsius")
{
// Your implementation
return $"Weather in {city}: 22°{char.ToUpperInvariant(unit[0])}";
}
}
// Register all functions from a class
var weatherService = new WeatherService();
var service = new OpenAIService(apiKey, httpClient)
.WithFunctions(weatherService);
Advanced Function Builder
var service = new OpenAIService(apiKey, httpClient)
.WithFunction(FunctionBuilder.Create("calculate")
.WithDescription("Performs mathematical calculations")
.AddParameter("expression", "string", "The math expression", required: true)
.AddParameter("precision", "integer", "Decimal places", required: false, defaultValue: 2)
.WithHandler(async (args) =>
{
var expr = args["expression"].ToString();
var precision = Convert.ToInt32(args.GetValueOrDefault("precision", 2));
// Calculate and return result
return await CalculateAsync(expr, precision);
})
.Build());
Multiple Functions with Different Types
var service = new OpenAIService(apiKey, httpClient)
// Parameterless function
.WithFunction(
"get_time",
"Gets the current time",
() => DateTime.Now.ToString("HH:mm:ss")
)
// Two-parameter function
.WithFunction(
"add_numbers",
"Adds two numbers",
("a", "First number", true),
("b", "Second number", true),
(double a, double b) => $"The sum is {a + b}"
)
// Async function
.WithFunctionAsync(
"fetch_data",
"Fetches data from API",
("endpoint", "API endpoint", true),
async (string endpoint) => await httpClient.GetStringAsync(endpoint)
);
// The AI will automatically use the appropriate functions
var response = await service.GetCompletionAsync(
"What time is it? Also, what's 15 plus 27?"
);
Function Calling Policies
// Pre-defined policies
service.DefaultPolicy = FunctionCallingPolicy.Fast; // 30s timeout, 10 rounds
service.DefaultPolicy = FunctionCallingPolicy.Complex; // 300s timeout, 50 rounds
service.DefaultPolicy = FunctionCallingPolicy.Vision; // 200s timeout, for image analysis
// Custom policy
service.DefaultPolicy = new FunctionCallingPolicy
{
MaxRounds = 25,
TimeoutSeconds = 120,
MaxConcurrency = 5,
EnableLogging = true // Enable debug output
};
// Per-request policy override
var response = await service
.WithPolicy(FunctionCallingPolicy.Fast)
.GetCompletionAsync("Complex task requiring functions");
// Inline policy configuration
var inlineResponse = await service
.BeginMessage()
.AddText("Analyze this data")
.WithMaxRounds(5)
.WithTimeout(60)
.SendAsync();
Function Calling with Streaming
// Stream with function calling support
await foreach (var content in service.StreamAsync(
"What's the weather in Seoul and calculate 15% tip on $85",
StreamOptions.WithFunctions))
{
if (content.Type == StreamingContentType.FunctionCall)
{
Console.WriteLine($"Calling function: {content.Metadata["function_name"]}");
}
else if (content.Type == StreamingContentType.FunctionResult)
{
Console.WriteLine($"Function completed: {content.Metadata["status"]}");
}
else if (content.Type == StreamingContentType.Text)
{
Console.Write(content.Content);
}
}
ReAct Agent Helpers
// Non-streaming agent helper
var answer = await service.RunAgentAsync(
"Find the weather in Seoul and explain what to wear today."
);
// Streaming agent helper
await foreach (var content in service.RunAgentStreamAsync(
"Find the weather in Seoul and explain what to wear today.",
maxSteps: 10))
{
if (content.Type == StreamingContentType.FunctionCall)
{
Console.WriteLine($"Calling: {content.Metadata["function_name"]}");
}
else if (content.Type == StreamingContentType.FunctionResult)
{
Console.WriteLine($"Tool result: {content.Content}");
}
else if (content.Type == StreamingContentType.Text)
{
Console.Write(content.Content);
}
}
RunAgentStreamAsync(...) is the streaming counterpart to RunAgentAsync(...). It keeps function calling enabled for the request and disables TextOnly so agent runs can emit function call, function result, and completion events.
Disabling Functions Temporarily
// Disable functions for a single request
var response = await service
.WithoutFunctions()
.GetCompletionAsync("Don't use any functions for this");
// Or use the async helper
var plainResponse = await service.AskWithoutFunctionsAsync(
"Process this without calling functions"
);
Structured Output
Deserialize LLM responses directly into C# POCOs with automatic JSON recovery.
Basic Usage
// Define your POCO
public class WeatherResponse
{
public string City { get; set; }
public double Temperature { get; set; }
public string Condition { get; set; }
}
// Get typed result — schema is auto-generated and sent to the LLM
var result = await service.GetCompletionAsync<WeatherResponse>(
"What's the weather in Seoul?");
Console.WriteLine($"{result.City}: {result.Temperature}°C, {result.Condition}");
Auto-Recovery Retry
When the LLM returns invalid JSON, a correction prompt is automatically sent asking the model to fix its output. This is not a network retry — it's an output quality/format correction loop.
// Configure service-level retry count (default: 2)
service.StructuredOutputMaxRetries = 3;
// On final failure, StructuredOutputException is thrown with rich diagnostics:
// - FirstRawResponse, LastRawResponse
// - ParseError, AttemptCount, SchemaJson, TargetTypeName
Per-Call Structured Output Policy
Override retry behavior for a single request without changing service defaults:
// Custom policy — applies only to this call, then auto-cleared
var result = await service
.WithStructuredOutputPolicy(new StructuredOutputPolicy { MaxRepairAttempts = 5 })
.GetCompletionAsync<MyDto>(prompt);
// Preset: no retry (1 attempt only)
var singleAttemptResult = await service
.WithNoRetryStructuredOutput()
.GetCompletionAsync<MyDto>(prompt);
// Preset: strict mode (up to 3 retries = 4 total attempts)
var strictResult = await service
.WithStrictStructuredOutput()
.GetCompletionAsync<MyDto>(prompt);
| Preset | MaxRepairAttempts | Description |
|---|---|---|
| Default | null (service default) | Uses StructuredOutputMaxRetries |
| NoRetry | 0 | Single attempt, no retry |
| Strict | 3 | Up to 3 correction retries |
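The relationship between repair attempts and total attempts (N repairs means at most N + 1 model calls) can be sketched as a plain loop. This is a rough illustration under stated assumptions, not the library's code: ParseWithRepair and the ask delegate are hypothetical, the correction prompt wording is invented, and the sketch lets JsonException propagate on final failure where the library is documented to throw StructuredOutputException.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.Text.Json;

// Hypothetical helper: ask() stands in for one LLM call and is not a Mythosia.AI API.
static (T? Value, int Attempts) ParseWithRepair<T>(
    Func<string, string> ask, string prompt, int maxRepairAttempts)
{
    string currentPrompt = prompt;
    for (int attempt = 1; ; attempt++)
    {
        string raw = ask(currentPrompt);
        try
        {
            return (JsonSerializer.Deserialize<T>(raw), attempt);
        }
        catch (JsonException ex) when (attempt <= maxRepairAttempts)
        {
            // Format-correction round: ask the model to fix its own output.
            currentPrompt = $"Your previous output was not valid JSON ({ex.Message}). " +
                            $"Return only corrected JSON for: {prompt}";
        }
        // Once repairs are exhausted the filter is false and the JsonException propagates.
    }
}

// Simulated model: the first reply is malformed, the second is valid JSON.
var replies = new Queue<string>(new[] { "not json", "{\"Temperature\": 22}" });
var (value, attempts) = ParseWithRepair<Dictionary<string, double>>(
    _ => replies.Dequeue(), "Weather in Seoul?", maxRepairAttempts: 2);

Console.WriteLine($"{value!["Temperature"]} after {attempts} attempts"); // 22 after 2 attempts
```

With maxRepairAttempts set to 0 the same loop makes exactly one call, which mirrors the NoRetry preset.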
Streaming Structured Output
Stream text chunks in real-time to the UI while getting a final deserialized object with auto-repair:
var run = service.BeginStream(prompt)
.WithStructuredOutput(new StructuredOutputPolicy { MaxRepairAttempts = 2 })
.As<MyDto>();
// Optional: observe chunks in real-time
await foreach (var chunk in run.Stream(cancellationToken))
{
Console.Write(chunk); // UI display
}
// Final deserialized result (waits for stream + parse/repair)
MyDto dto = await run.Result;
- Result works without Stream(): just await run.Result, which internally consumes the stream and parses
- Stream() is single-use: a second call throws InvalidOperationException
- Result waits for stream completion: even if awaited mid-stream, it won't resolve early
- Repair retries are non-streaming: correction prompts use GetCompletionAsync() for efficiency
Collection Support (List<T>, T[])
Both GetCompletionAsync<T>() and streaming support collection types — no wrapper DTO needed:
// Non-streaming: get a list directly
var items = await service.GetCompletionAsync<List<ItemDto>>(
"Extract all entities from this document...");
// Streaming: observe chunks + get list result
var run = service.BeginStream(prompt).As<List<ItemDto>>();
await foreach (var chunk in run.Stream()) Console.Write(chunk);
List<ItemDto> items = await run.Result;
List<T>, T[], IReadOnlyList<T> are all supported. JSON array schema is auto-generated from the element type.
Conversation Summary Policy
Automatically summarize old conversation messages when the conversation exceeds a configured threshold. The summary is stored and injected into the system message on each subsequent LLM request.
Configuration
// Token-based: summarize when total tokens exceed 3000, keep recent ~1000 tokens
service.ConversationPolicy = SummaryConversationPolicy.ByToken(
triggerTokens: 3000,
keepRecentTokens: 1000
);
// Message-count-based: summarize when messages exceed 20, keep last 5
service.ConversationPolicy = SummaryConversationPolicy.ByMessage(
triggerCount: 20,
keepRecentCount: 5
);
// Combined (OR condition): triggers when either threshold is exceeded
service.ConversationPolicy = SummaryConversationPolicy.ByBoth(
triggerTokens: 3000,
triggerCount: 20
);
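The OR condition of the combined policy can be sketched as a standalone check. This is a minimal illustration, not the library's logic: ShouldSummarize is a hypothetical helper, "exceed" is assumed to mean strictly greater than, and a threshold left as null is assumed to be ignored.

```csharp
using System;

// Hypothetical trigger check; not part of Mythosia.AI.
static bool ShouldSummarize(int totalTokens, int messageCount, int? triggerTokens, int? triggerCount)
{
    bool tokensExceeded = triggerTokens.HasValue && totalTokens > triggerTokens.Value;
    bool countExceeded = triggerCount.HasValue && messageCount > triggerCount.Value;
    return tokensExceeded || countExceeded; // either threshold is enough
}

bool byTokens = ShouldSummarize(3500, 8, 3000, 20);    // token threshold exceeded
bool byCount = ShouldSummarize(1200, 21, 3000, 20);    // message threshold exceeded
bool neither = ShouldSummarize(1200, 8, 3000, 20);     // both below their thresholds
bool tokensOff = ShouldSummarize(3500, 8, null, 20);   // token threshold not configured

Console.WriteLine(byTokens);   // True
Console.WriteLine(byCount);    // True
Console.WriteLine(neither);    // False
Console.WriteLine(tokensOff);  // False
```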
Usage
// Just use as normal — summarization happens automatically
service.ConversationPolicy = SummaryConversationPolicy.ByMessage(triggerCount: 20, keepRecentCount: 5);
var response = await service.GetCompletionAsync("Continue our conversation...");
// When message count exceeds 20, old messages are summarized automatically
Session Persistence
// Save summary for later
string saved = service.ConversationPolicy.CurrentSummary;
// Restore in a new session (policy is the SummaryConversationPolicy assigned to that session's service)
policy.LoadSummary(saved);
Key Design Decisions
- StatelessMode protection: Summary LLM calls use StatelessMode = true to prevent polluting the main conversation history
- Backward compatible: ConversationPolicy defaults to null; existing behavior is unchanged
- Provider-agnostic: Works with all providers (OpenAI, Claude, Gemini, Grok, DeepSeek, Perplexity)
- Incremental summarization: When re-summarizing, the existing summary is included as context for the new summary
Enhanced Streaming
Stream Options
// Text only - fastest, no overhead
await foreach (var chunk in service.StreamAsync("Hello", StreamOptions.TextOnlyOptions))
{
Console.Write(chunk.Content);
}
// With metadata - includes model info, timestamps, etc.
await foreach (var content in service.StreamAsync("Hello", StreamOptions.FullOptions))
{
if (content.Metadata != null)
{
Console.WriteLine($"Model: {content.Metadata["model"]}");
}
Console.Write(content.Content);
}
// Custom options
var options = new StreamOptions()
.WithMetadata(true)
.WithFunctionCalls(true)
.AsTextOnly(false);
await foreach (var content in service.StreamAsync("Query", options))
{
// Process based on content.Type
switch (content.Type)
{
case StreamingContentType.Text:
Console.Write(content.Content);
break;
case StreamingContentType.FunctionCall:
Console.WriteLine($"Calling: {content.Metadata["function_name"]}");
break;
case StreamingContentType.Completion:
Console.WriteLine($"Total length: {content.Metadata["total_length"]}");
break;
}
}
Streaming Diagnostics
When an SSE stream dies mid-flight against a self-hosted backend (vLLM, Ollama, internal proxy), you usually need to know exactly where it died. Register diagnostic hooks once on the service, and every subsequent StreamAsync call picks them up automatically. The hooks use the same fluent builder pattern as WithRag.
using Mythosia.AI.Extensions;
service.WithStreamDiagnostics(d => d
.OnRawLine(line => logger.LogDebug("SSE: {Line}", line))
.OnComplete(diag => logger.LogInformation("Stream finished: {Diag}", diag)));
await foreach (var chunk in service.StreamAsync(message))
Console.Write(chunk.Content);
Each On* method is independent — register only what you need:
// Raw line trace only
service.WithStreamDiagnostics(d => d.OnRawLine(line => logger.LogDebug("SSE: {Line}", line)));
// Clear all hooks
service.WithStreamDiagnostics(_ => { });
When SSE reading throws, the library wraps the exception in StreamReadException with a StreamDiagnostics snapshot taken at the moment of failure. This works regardless of whether WithStreamDiagnostics was registered:
try
{
await foreach (var chunk in service.StreamAsync(message))
Console.Write(chunk.Content);
}
catch (StreamReadException ex)
{
logger.LogError(ex,
"Stream died after {Lines} lines, {Chars} chars. Last raw line: {Line}",
ex.Diagnostics.LinesRead,
ex.Diagnostics.AccumulatedTextLength,
ex.Diagnostics.LastRawLine);
// ex.InnerException carries the original exception (IOException, etc.)
}
StreamDiagnostics exposes LinesRead, DataLinesProcessed, ParseFailures, AccumulatedTextLength, LastRawLine, and Elapsed. Hooks are propagated through CopyFrom, so cross-provider switches in a multi-provider chat UI keep the registered diagnostics without re-registration.
Available in Mythosia.AI v6.4.0+. Full guide: docs/streaming.md.
Token Usage
Streaming exposes token usage in two different places, with different meanings:
- StreamingContentType.RoundUsage: usage for one LLM round only.
- StreamingContentType.Completion: cumulative usage for the whole streaming run.
For a single LLM call, the final RoundUsage.Usage and Completion.Usage should describe
the same one-round request. For an agent or function-calling run, each LLM round emits its own
RoundUsage, while the final Completion.Usage remains the sum of all rounds.
This distinction is important for UI context meters. If you want to show "how many tokens the
current conversation state used when it entered the latest LLM call", use the latest
RoundUsage.Usage.TotalTokens. If you want cost or diagnostics for the full agent run, use
Completion.Usage.TotalTokens.
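A small arithmetic sketch makes the difference between the two figures concrete. The per-round numbers are made up for illustration and the tuple list is only a stand-in for a sequence of RoundUsage events; nothing here calls the library.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Made-up per-round token counts for a three-round agent run (not real measurements).
var rounds = new List<(int Input, int Output)>
{
    (1200, 80),   // round 1: model requests a tool call
    (1350, 60),   // round 2: model requests another tool call
    (1480, 210),  // round 3: final answer
};

// Context meter: the total of the latest round only.
var last = rounds[^1];
int contextMeter = last.Input + last.Output;

// Run cost: the sum over every round, which is what the cumulative figure represents.
int runTotal = rounds.Sum(r => r.Input + r.Output);

Console.WriteLine($"Context meter: {contextMeter}"); // Context meter: 1690
Console.WriteLine($"Run total: {runTotal}");         // Run total: 4380
```

Because each round resends the growing conversation, the cumulative total counts earlier input tokens more than once, so it overstates how full the context window is.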
RoundUsage events also include:
- RoundIndex: 1-based LLM round number.
- IsFinalRound: true when this is the last LLM round in the stream.
await foreach (var content in service.StreamAsync(message, StreamOptions.FullOptions))
{
if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
if (content.Type == StreamingContentType.RoundUsage && content.Usage != null)
{
Console.WriteLine($"Round: {content.RoundIndex}");
Console.WriteLine($"Round total: {content.Usage.TotalTokens}");
Console.WriteLine($"Final round: {content.IsFinalRound}");
}
if (content.Type == StreamingContentType.Completion && content.Usage != null)
{
Console.WriteLine($"Input tokens: {content.Usage.InputTokens}");
Console.WriteLine($"Output tokens: {content.Usage.OutputTokens}");
Console.WriteLine($"Cached tokens: {content.Usage.CachedInputTokens}");
Console.WriteLine($"Reasoning tokens: {content.Usage.ReasoningTokens}");
Console.WriteLine($"Cache hit ratio: {content.Usage.CacheHitRatio:P1}");
}
}
Agent Token Meter Example
int? contextTokenMeter = null;
TokenUsage? cumulativeRunUsage = null;
await foreach (var content in service.RunAgentStreamAsync(
"Find the weather in Seoul and answer briefly.",
maxSteps: 10))
{
if (content.Type == StreamingContentType.RoundUsage && content.Usage != null)
{
// Best value for a UI context/token meter.
contextTokenMeter = content.Usage.TotalTokens;
Console.WriteLine(
$"Round {content.RoundIndex}: {content.Usage.TotalTokens} tokens");
if (content.IsFinalRound)
{
Console.WriteLine($"Final context meter value: {contextTokenMeter}");
}
continue;
}
if (content.Type == StreamingContentType.Completion)
{
// Cumulative usage across the whole agent run.
cumulativeRunUsage = content.Usage;
continue;
}
if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
Token Usage Contract
- RoundUsage.Usage is never an accumulated run total. It represents that one LLM round.
- RoundUsage.Usage.TotalTokens is normalized to InputTokens + OutputTokens.
- Completion.Usage keeps the existing cumulative meaning for the full stream or agent run.
- In function-calling streams, non-final rounds have IsFinalRound = false; the last round has IsFinalRound = true.
- Token usage collection does not depend on IncludeMetadata. Usage can still be emitted when metadata is disabled.
- Providers may attach official usage to different stream chunks internally. Consumers should read the normalized RoundUsage and Completion events rather than provider-specific chunk metadata.
- Gemini streams are drained after function calls so late usageMetadata chunks can still become RoundUsage.
The Token test category contains provider-level tests for this contract. If those tests pass
for a provider/model, Mythosia.AI considers round-level usage and final cumulative usage supported
for that provider/model. If a provider/model does not return official usage, these tests should fail
or be treated as unsupported for token usage.
TokenUsage fields:
| Field | Description | Providers |
|---|---|---|
| InputTokens | Input/prompt tokens | All |
| OutputTokens | Output/completion tokens | All |
| TotalTokens | Total tokens used | All |
| CachedInputTokens | Tokens served from cache | OpenAI, Claude, DeepSeek, Gemini |
| CacheCreationTokens | Tokens written to cache | Claude |
| ReasoningTokens | Internal reasoning tokens | OpenAI, Gemini |
Computed properties: NonCachedInputTokens, CacheHitRatio, HasCacheActivity, VisibleOutputTokens.
Reasoning Streaming
GPT-5, Gemini 3, and Grok reasoning models support streaming reasoning (thinking) content.
await foreach (var content in service.StreamAsync(message, new StreamOptions().WithReasoning()))
{
if (content.Type == StreamingContentType.Reasoning)
Console.WriteLine($"[Thinking] {content.Content}");
else if (content.Type == StreamingContentType.Text)
Console.Write(content.Content);
}
Service Support
| Service | Function Calling | Streaming | Reasoning | Notes |
|---|---|---|---|---|
| OpenAI GPT-5.4 / 5.4 Pro | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.3 Codex | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.2 / 5.2 Pro / 5.2 Codex | ✅ | ✅ | ✅ | Per-model reasoning enums + verbosity |
| OpenAI GPT-5.1 | ✅ | ✅ | ✅ | Reasoning + verbosity control |
| OpenAI GPT-5 / Mini / Nano | ✅ | ✅ | ✅ | Reasoning streaming + summary |
| OpenAI GPT-4.1 / GPT-4o | ✅ | ✅ | — | Full function support |
| OpenAI o3 / o3-pro | ✅ | ✅ | ✅ | Advanced reasoning |
| Claude Opus 4.6 / 4.5 / 4.1 / 4 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Claude Sonnet 4.6 / 4.5 / 4 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Claude Haiku 4.5 | ✅ | ✅ | ✅ | Extended thinking + tool use |
| Gemini 3 Flash/Pro | ✅ | ✅ | ✅ | ThinkingLevel + thought signatures |
| Gemini 2.5 Pro/Flash | ✅ | ✅ | ✅ | ThinkingBudget control |
| xAI Grok 4 / 4.1 Fast / 3 / 3 Mini | ✅ | ✅ | ✅ | GrokReasoning effort + reasoning streaming |
| DeepSeek | ❌ | ✅ | ✅ | Reasoner model streaming |
| Perplexity | ❌ | ✅ | — | Web search + citations |
Complete Examples
Building a Weather Assistant
public class WeatherAssistant
{
private readonly OpenAIService _service;
private readonly HttpClient _httpClient;
public WeatherAssistant(string apiKey)
{
_httpClient = new HttpClient();
_service = new OpenAIService(apiKey, _httpClient)
.WithSystemMessage("You are a helpful weather assistant.")
.WithFunction(
"get_weather",
"Gets current weather for a city",
("city", "City name", true),
GetWeatherData
)
.WithFunction(
"get_forecast",
"Gets weather forecast",
("city", "City name", true),
("days", "Number of days", false),
GetForecast
);
// Configure function calling behavior
_service.DefaultPolicy = new FunctionCallingPolicy
{
MaxRounds = 10,
TimeoutSeconds = 30,
EnableLogging = true
};
}
private string GetWeatherData(string city)
{
// In real implementation, call weather API
return $"{{\"city\":\"{city}\",\"temp\":22,\"condition\":\"sunny\"}}";
}
private string GetForecast(string city, int days = 3)
{
// In real implementation, call forecast API
return $"{{\"city\":\"{city}\",\"forecast\":\"{days} days of sun\"}}";
}
public async Task<string> AskAsync(string question)
{
return await _service.GetCompletionAsync(question);
}
public async IAsyncEnumerable<string> StreamAsync(string question)
{
await foreach (var content in _service.StreamAsync(question))
{
if (content.Type == StreamingContentType.Text && content.Content != null)
{
yield return content.Content;
}
}
}
}
// Usage
var assistant = new WeatherAssistant(apiKey);
// Functions are called automatically
var response = await assistant.AskAsync("What's the weather in Tokyo?");
// AI calls get_weather("Tokyo") and responds naturally
// Streaming also supports functions
await foreach (var chunk in assistant.StreamAsync(
"Compare weather in Seoul and Tokyo for the next 5 days"))
{
Console.Write(chunk);
}
Math Tutor with Step-by-Step Solutions
var mathTutor = new OpenAIService(apiKey, httpClient)
.WithSystemMessage("You are a math tutor. Always explain your reasoning.")
.WithFunction(
"calculate",
"Performs calculations",
("expression", "Math expression", true),
(string expr) => {
// Using a math expression evaluator
var result = EvaluateExpression(expr);
return $"Result: {result}";
}
)
.WithFunction(
"solve_equation",
"Solves equations step by step",
("equation", "Equation to solve", true),
(string equation) => {
var steps = SolveWithSteps(equation);
return JsonSerializer.Serialize(steps);
}
);
// The AI will use functions and explain the process
var response = await mathTutor.GetCompletionAsync(
"Solve the equation 2x + 5 = 13 and verify the answer"
);
// Output includes step-by-step solution with verification
Best Practices
Function Design: Keep functions focused and simple. Complex logic should be broken into multiple functions.
Error Handling: Functions should return meaningful error messages that the AI can understand.
Performance: Use appropriate policies for your use case (Fast for simple tasks, Complex for detailed analysis).
Streaming: Use TextOnlyOptions for best performance when metadata isn't needed.
Testing: Test function calling with various prompts to ensure robust behavior.
Troubleshooting
Q: Functions aren't being called when expected?
- Ensure functions are registered with clear, descriptive names and descriptions
- Check that EnableFunctions is true on the service
- Verify the model supports function calling (see Service Support table above)
Q: Function calling is too slow?
- Adjust the policy timeout: service.DefaultPolicy.TimeoutSeconds = 30
- Use FunctionCallingPolicy.Fast for simple operations
- Consider using streaming for better perceived performance
Q: How to debug function execution?
- Enable logging: service.DefaultPolicy.EnableLogging = true
- Check the console output for round-by-round execution details
- Use StreamOptions.FullOptions to see function call metadata
Q: Can I use functions with streaming?
- Yes! Functions work seamlessly with streaming
- Use StreamOptions.WithFunctions to see function execution in real-time
📋 TODO — Unsupported Models (Planned)
The following OpenAI models are not yet supported due to significant API differences:
| Model | API Name | Status | Notes |
|---|---|---|---|
| GPT-5.2 Instant | gpt-5.2-chat-latest | ⏳ Planned | ChatGPT-optimized model; uses a different routing/parameter set than standard Responses API models |
| GPT-5.3 Instant | gpt-5.3-chat-latest | ⏳ Planned | ChatGPT-optimized model; same API constraints as GPT-5.2 Instant |
| GPT-5.3 Codex Spark | gpt-5.3-codex-spark | ⏳ Planned | Research preview; completely different infrastructure (Cerebras-powered, WebSocket-based, text-only) |
Why are these models different?
chat-latest models (Instant)
- These are ChatGPT-internal models exposed to the API. OpenAI recommends using the standard models (e.g., gpt-5.2, gpt-5.3-codex) for API usage instead.
- They do not support the full set of Responses API parameters such as reasoning.effort, text.verbosity, and other model-specific configurations.
- Response format and content structure may differ from standard models.
gpt-5.3-codex-spark
- Research preview available only to ChatGPT Pro subscribers.
- Powered by Cerebras inference hardware for near-instant responses.
- Uses persistent WebSocket connections and an optimized Responses API — a fundamentally different transport layer than the standard HTTP-based request/response pattern.
- Text-only (no multimodal support).
- Designed specifically for real-time coding iteration within Codex, not general-purpose API usage.
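Until these models are supported, an application might reject their API names up front rather than fail at request time. The following is a minimal sketch under that assumption; ModelSupport and IsPlannedOnly are hypothetical names, and only the three API names come from the table above.

```csharp
using System;
using System.Collections.Generic;

public static class ModelSupport
{
    // API names listed as "Planned" in the table above.
    private static readonly HashSet<string> PlannedOnly =
        new HashSet<string>(StringComparer.OrdinalIgnoreCase)
        {
            "gpt-5.2-chat-latest",
            "gpt-5.3-chat-latest",
            "gpt-5.3-codex-spark",
        };

    // Returns true when the given API name is one of the not-yet-supported models.
    public static bool IsPlannedOnly(string apiName) =>
        apiName != null && PlannedOnly.Contains(apiName);
}
```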
| Product | Versions (compatible and additional computed target framework versions) |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.1 is compatible. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.1
  - Azure.AI.OpenAI (>= 2.1.0)
  - Mythosia.AI.Abstractions (>= 2.2.0)
  - Newtonsoft.Json (>= 13.0.4)
  - NJsonSchema (>= 11.6.1)
  - System.Threading.Channels (>= 10.0.7)
  - TiktokenSharp (>= 1.2.1)
NuGet packages (2)
Showing the top 2 NuGet packages that depend on Mythosia.AI:
| Package | Downloads |
|---|---|
| Mythosia.AI.Providers.Alibaba: Alibaba Cloud Qwen provider package for Mythosia.AI. Includes QwenService with expanded Qwen 3 / 3.5 model constants, platform-specific thinking request handling across DashScope, vLLM, and Ollama, token usage streaming support, and Mythosia.AI v6.4.0 compatibility. Documentation - GitHub: https://github.com/AJ-comp/Mythosia.AI - Release Notes: core/Mythosia.AI.Providers.Alibaba/RELEASE_NOTES.md | |
| Mythosia.AI.Mcp: MCP (Model Context Protocol) client integration for Mythosia.AI. Connect to any MCP server (stdio or SSE) and automatically register its tools as FunctionDefinitions usable by all AI providers. | |
GitHub repositories
This package is not used by any popular GitHub repositories.
| Version | Downloads | Last Updated |
|---|---|---|
| 6.4.0 | 389 | 4/28/2026 |
| 6.4.0-preview1 | 172 | 4/25/2026 |
| 6.3.0 | 204 | 4/20/2026 |
| 6.2.0 | 165 | 4/16/2026 |
| 6.1.0 | 186 | 4/10/2026 |
| 6.0.0 | 203 | 4/3/2026 |
| 5.3.0 | 140 | 4/2/2026 |
| 5.2.0 | 158 | 3/29/2026 |
| 5.1.0 | 188 | 3/28/2026 |
| 5.0.1 | 184 | 3/24/2026 |
| 5.0.0 | 320 | 3/15/2026 |
| 4.7.1 | 134 | 3/11/2026 |
| 4.7.0 | 127 | 3/7/2026 |
| 4.6.2 | 265 | 2/27/2026 |
| 4.6.1 | 116 | 2/27/2026 |
| 4.6.0 | 115 | 2/26/2026 |
| 4.5.0 | 111 | 2/26/2026 |
| 4.4.0 | 113 | 2/25/2026 |
| 4.3.0 | 152 | 2/24/2026 |
| 4.2.0 | 112 | 2/22/2026 |
v6.4.0: streaming diagnostics
- New service-level WithStreamDiagnostics(d => d.OnRawLine(...).OnComplete(...)) extension, plus StreamReadException, which wraps read-time failures with a StreamDiagnostics snapshot (LinesRead, LastRawLine, Elapsed, etc.) for observability against self-hosted vLLM/ollama/unstable proxies.
- Fixes NotSupportedException at await foreach ... DisposeAsync() across all 5 providers by replacing synchronous using (var stream = ...) with async stream disposal. The finally block now guards disposal so a Dispose-time failure cannot mask the real read exception, and OnComplete is guaranteed to fire.
- Fixes CopyFrom silently dropping SystemMessageProvider (v6.3.0 omission) and the new streaming-diagnostics callbacks. Cross-provider switches in multi-provider chat UIs now preserve registered hooks.
- Internal: 5 providers / 10 SSE loops consolidated into one ReadSseLinesAsync helper. 16 new diagnostics unit tests.
- Additive; existing callers unaffected. Requires Mythosia.AI.Abstractions v2.2.0.
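The guarded-disposal behavior described in these notes can be illustrated with a generic line reader. This is a sketch of the general pattern only, not the library's ReadSseLinesAsync implementation; GuardedStreamReader, ReadLinesAsync, and the onComplete callback are hypothetical names, and swallowing the disposal error is an assumption about how the guard might behave.

```csharp
#nullable enable
using System;
using System.Collections.Generic;
using System.IO;
using System.Threading.Tasks;

public static class GuardedStreamReader
{
    // Hypothetical helper: yields lines from a stream, disposes the stream
    // asynchronously, and always invokes onComplete when reading ends.
    public static async IAsyncEnumerable<string> ReadLinesAsync(
        Stream stream, Action onComplete)
    {
        var reader = new StreamReader(stream);
        try
        {
            string? line;
            while ((line = await reader.ReadLineAsync()) != null)
            {
                yield return line;
            }
        }
        finally
        {
            try
            {
                // Async disposal instead of a synchronous using block.
                await stream.DisposeAsync();
            }
            catch (Exception)
            {
                // Assumption: a disposal failure is swallowed here so it cannot
                // replace an exception already thrown by the read loop.
            }
            finally
            {
                // Runs whether reading succeeded, failed, or was abandoned early.
                onComplete();
            }
        }
    }
}
```

Because the callback sits in the innermost finally, it fires even when the consumer stops enumerating early or disposal itself throws.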