# HarmonyAI.SDK

An AI RAG agent for chatting over the contents of one or more local documentation files (README.md, CHANGELOG.md, and others). Uses Google Gemini to generate answers strictly constrained to the provided context.
## Features

- Chat: `IDocChatAgent.AskAsync` returns `ChatResponse` (answer, used chunks, history, cache flag).
- No-answer contract: the model returns `[NO_ANSWER]` when the context lacks the data; the SDK sets `ChatResponse.NoAnswer`, and you can render a docs link from `DocumentationUrl`.
- Conversation history with bounded turns (`EnableConversationHistory`, `MaxHistoryTurns`).
- Multi-file documentation (`ReadmePaths`).
- Vectors / RAG: chunking, embeddings, Top-K selection.
- Embeddings cache (hash + model based).
- Answer cache (LRU) keyed by context hash + question.
- Semantic answer reuse for similar questions (embeddings + Jaccard chunk overlap).
- Rate limiting (requests and prompt chars per minute).
- Token-based context budgeting (approximate tokens → characters).
- Telemetry hooks (cache hits, model calls, retries, rate limits).
- Retries with backoff and per-attempt timeouts.
- Adaptive chunking.
- Pseudo streaming, with an API ready for future native streaming.
- DI integration (`AddHarmonyDocChat`).
- Custom exceptions: `GeminiApiException`, `RateLimitExceededException`, `DocumentationNotFoundException`.
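For orientation, the response type used in the chat examples below has roughly this shape. This is an illustrative sketch, not the SDK's exact declarations: `Answer`, `NoAnswer`, `DocumentationUrl`, `UsedChunks`, and the chunk members match the examples in this README, while the type layout and the chunk type's name are assumptions.

```csharp
using System;
using System.Collections.Generic;

// Illustrative sketch only; not the SDK's exact API surface.
public sealed class ChatResponse
{
    public string Answer { get; init; } = "";
    public bool NoAnswer { get; init; }             // true when the model returned [NO_ANSWER]
    public string? DocumentationUrl { get; init; }  // docs link to render when NoAnswer is true
    public IReadOnlyList<DocChunk> UsedChunks { get; init; } = Array.Empty<DocChunk>();
}

// Hypothetical name; each used chunk exposes at least Index, Score, and Text.
public sealed class DocChunk
{
    public int Index { get; init; }
    public double Score { get; init; }
    public string Text { get; init; } = "";
}
```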
## Installation

```
dotnet add package HarmonyAI.SDK
```

(Or build locally and install the package from artifacts.)
## API key configuration

Set the environment variable:

```
SETX GEMINI_API_KEY "your_key"   # Windows (requires a new session)
```

Or pass the key via `ReadmeDocAgentOptions.GeminiApiKey`.
## Usage example (simple answer)

```csharp
using HarmonyAI.SDK.AI;

var agent = new ReadmeDocAgent(new ReadmeDocAgentOptions
{
    GeminiApiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY"),
    ReadmePath = "README.md",   // documentation path
    MaxReadmeChars = 12000,
    Model = "gemini-1.5-flash"
});

string question = "How to run the sample agent?";
var answer = await agent.AnswerAsync(question);
Console.WriteLine(answer);
```
## Chat example (`IDocChatAgent`)

```csharp
using HarmonyAI.SDK.Chat;
using HarmonyAI.SDK.AI;

var agent = new ReadmeDocAgent(new ReadmeDocAgentOptions
{
    GeminiApiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY")
});

var r1 = await agent.AskAsync("What does this library do?");
Console.WriteLine(r1.Answer);

var r2 = await agent.AskAsync("Does it support multiple files?");
Console.WriteLine(r2.Answer);

// Inspect the chunks used to build the answer:
foreach (var c in r2.UsedChunks)
    Console.WriteLine($"Chunk #{c.Index} score={c.Score:F3} len={c.Text.Length}");
```
## DI registration (Microsoft.Extensions.DependencyInjection)

```csharp
using Microsoft.Extensions.DependencyInjection;
using HarmonyAI.SDK.Chat;
using HarmonyAI.SDK.Extensions;

var services = new ServiceCollection();
services.AddHarmonyDocChat(o =>
{
    o.GeminiApiKey = Environment.GetEnvironmentVariable("GEMINI_API_KEY");
    o.ReadmePath = "README.md";
});

var provider = services.BuildServiceProvider();
var chat = provider.GetRequiredService<IDocChatAgent>();
var response = await chat.AskAsync("What is HarmonyAI.SDK?");
```
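The same registration composes into a hosted app. A sketch assuming an ASP.NET Core minimal API host:

```csharp
using HarmonyAI.SDK.Chat;
using HarmonyAI.SDK.Extensions;

var builder = WebApplication.CreateBuilder(args);
builder.Services.AddHarmonyDocChat(o =>
{
    o.GeminiApiKey = builder.Configuration["GEMINI_API_KEY"];
    o.ReadmePath = "README.md";
});

var app = builder.Build();

// IDocChatAgent is resolved from DI for each request.
app.MapGet("/ask", async (string q, IDocChatAgent chat) =>
    (await chat.AskAsync(q)).Answer);

app.Run();
```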
## Vectors (RAG / Vector Search)
Default `UseVectorSearch = true`. On first call:
1. README is split into chunks of `ChunkSize` with overlap `ChunkOverlap`.
2. Each chunk gets an embedding from `EmbeddingModel` (e.g. `text-embedding-004`).
3. Embeddings are cached in `.harmonyai.readme.embeddings.json` with file SHA256 and model name.
4. Subsequent queries load cache instead of recomputing embeddings.
5. The user question is also embedded; cosine similarity selects top `MaxChunksInPrompt` chunks.
6. Only those chunks go into the prompt (reduces hallucination and cost).
Changing the README (content change → new hash) triggers an automatic cache rebuild.
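For intuition, step 5's Top-K selection can be pictured like this (a simplified sketch with hypothetical inputs, not the SDK's internal code):

```csharp
using System;
using System.Linq;

// Hypothetical inputs: in the SDK these come from the embedding model.
float[] questionEmbedding = { 0.10f, 0.30f, 0.50f };
var chunks = new[]
{
    (Text: "chunk A", Embedding: new float[] { 0.1f, 0.2f, 0.6f }),
    (Text: "chunk B", Embedding: new float[] { 0.9f, 0.1f, 0.0f }),
};
int maxChunksInPrompt = 1; // MaxChunksInPrompt

// Cosine similarity between two embedding vectors.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na += a[i] * a[i];
        nb += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb) + 1e-12);
}

// Rank chunks by similarity to the question and keep the Top-K.
var selected = chunks
    .Select(c => (c.Text, Score: Cosine(questionEmbedding, c.Embedding)))
    .OrderByDescending(x => x.Score)
    .Take(maxChunksInPrompt)
    .ToList();
```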
Disable it with:

```csharp
var agent = new ReadmeDocAgent(new ReadmeDocAgentOptions { UseVectorSearch = false });
```
## Multiple documentation files

You can provide several files instead of a single `ReadmePath` via `ReadmePaths`:

```csharp
var agent = new ReadmeDocAgent(new ReadmeDocAgentOptions
{
    ReadmePaths = new[] { "README.md", "CHANGELOG.md", "docs/intro.md" }
});
```

Files are merged with `--- FILE:xyz ---` headers, which helps the model make contextual references to specific files.
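Conceptually the merge can be pictured like this (a sketch with hypothetical inputs; the SDK's exact header format may differ slightly):

```csharp
using System.IO;
using System.Linq;

// Hypothetical input paths; concatenate files under FILE headers so
// answers can reference their source documents.
var paths = new[] { "README.md", "CHANGELOG.md" };
var merged = string.Join("\n\n", paths.Select(p =>
    $"--- FILE:{Path.GetFileName(p)} ---\n{File.ReadAllText(p)}"));
```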
## Answer cache

Enabled by default (`EnableAnswerCache = true`). The key is the context hash plus the question, and the LRU is bounded by `AnswerCacheCapacity` (default 128). Repeated questions avoid API calls.
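Illustratively, given an `agent` configured as in the earlier examples, a repeated question is served without a second model call:

```csharp
// Both calls share the same context-hash + question key; the second answer
// comes from the LRU cache (see the cache flag on ChatResponse).
var first = await agent.AskAsync("What caches does the SDK use?");
var second = await agent.AskAsync("What caches does the SDK use?");
```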
Disable it with:

```csharp
new ReadmeDocAgentOptions { EnableAnswerCache = false };
```
## Streaming (experimental)

`AnswerStreamingAsync` splits the final answer into fragments (a pseudo-stream). A future release may replace this with native Gemini streaming.

```csharp
await foreach (var chunk in agent.AnswerStreamingAsync("Question?"))
    Console.Write(chunk);
```
## Rate limiting

Parameters:

- `RequestsPerMinuteLimit`
- `PromptCharsPerMinuteLimit`
Throws `RateLimitExceededException` (see Exceptions below) when a limit is exceeded. Example:
```csharp
var agent = new ReadmeDocAgent(new ReadmeDocAgentOptions
{
    RequestsPerMinuteLimit = 30,
    PromptCharsPerMinuteLimit = 200_000
});
```
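Callers can catch the breach and back off. A minimal sketch:

```csharp
try
{
    var r = await agent.AskAsync("What does this library do?");
    Console.WriteLine(r.Answer);
}
catch (RateLimitExceededException ex)
{
    Console.WriteLine($"Rate limited: {ex.Message}. Retry after the window resets.");
}
```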
## Adaptive chunking

Enable with `AdaptiveChunking = true`; `AdaptiveTargetChunks` sets the approximate target number of chunks. These options adjust the effective `ChunkSize` dynamically based on the total document length.

```csharp
var opts = new ReadmeDocAgentOptions
{
    UseVectorSearch = true,
    AdaptiveChunking = true,
    AdaptiveTargetChunks = 50
};
```

Safety bounds: minimum effective size 200 chars, maximum 4000 chars, overlap ≤ 1/4 of the size.
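The sizing rule can be sketched as follows (assumed arithmetic matching the documented bounds, not the SDK's exact code):

```csharp
using System;

// Hypothetical inputs.
int totalChars = 120_000;        // total merged document length
int targetChunks = 50;           // AdaptiveTargetChunks
int configuredOverlap = 1_000;   // ChunkOverlap from options

// Documented safety bounds: 200..4000 chars, overlap at most 1/4 of the size.
int effectiveChunkSize = Math.Clamp(totalChars / targetChunks, 200, 4000);
int effectiveOverlap = Math.Min(configuredOverlap, effectiveChunkSize / 4);
```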
## Exceptions

- `GeminiApiException` – Gemini API response error (status and raw body when available).
- `RateLimitExceededException` – rate or prompt-char limit exceeded.
- `DocumentationNotFoundException` – no documentation available.
## Possible extensions / ideas
- Additional document types / chunking strategies
- Local embedding index options
- Persistent answer cache (disk / distributed)
- Cost governance (token accounting)
## License
MIT
## Tests

Run the unit tests (xUnit) with:

```
dotnet test
```
## Building and publishing the NuGet package

- Pack:

  ```
  dotnet pack -c Release
  ```

  Artifacts will be in `bin/Release/`.

- (Optional) Install locally into a sample app:

  ```
  dotnet nuget add source ./bin/Release --name LocalHarmony
  dotnet add package HarmonyAI.SDK --prerelease
  ```

- Publish to NuGet.org (requires an API key):

  ```
  dotnet nuget push bin/Release/*.nupkg --api-key $Env:NUGET_API_KEY --source https://api.nuget.org/v3/index.json
  ```
## Changelog (summary)

- 0.4.0: Same-language answers; `[NO_ANSWER]` contract + `NoAnswer` flag; semantic reuse; token budget; telemetry; retries/timeouts; thread safety.
- 0.3.0: First public release: chat interface, conversation, used chunks, DI, custom exceptions, docs.
- 0.2.0: Multi-file support, LRU answer cache, streaming draft, markdown normalization, rate limiting, adaptive chunking, tests, SourceLink.
- 0.1.0: Basic agent + vector search + tests.
## Compatibility

| Product | Compatible target frameworks |
|---|---|
| .NET | net8.0, net9.0, and net10.0 are compatible; the platform-specific TFMs for each (android, browser, ios, maccatalyst, macos, tvos, windows) are computed. |
## Release notes (0.4.0)

- Same-language answers and explicit `[NO_ANSWER]` contract with `NoAnswer` flag
- Semantic answer reuse for similar questions (embeddings + chunk overlap)
- Token-based context budgeting and adaptive chunk accumulation
- Thread-safe caches and history; `AnswerAsync` delegates to the `AskAsync` core
- Telemetry hooks (cache / model / retry / rate-limit)
- Retry with backoff + per-attempt timeouts
- `ChatResponse` extended with `NoAnswer` and `DocumentationUrl`