DevOnBike.Overfit.Server
Version 10.0.25

Install
- .NET CLI: dotnet add package DevOnBike.Overfit.Server --version 10.0.25
- Package Manager: NuGet\Install-Package DevOnBike.Overfit.Server -Version 10.0.25
- PackageReference: <PackageReference Include="DevOnBike.Overfit.Server" Version="10.0.25" />
- Central Package Management: <PackageVersion Include="DevOnBike.Overfit.Server" Version="10.0.25" /> in Directory.Packages.props, plus <PackageReference Include="DevOnBike.Overfit.Server" /> in the project file
- Paket CLI: paket add DevOnBike.Overfit.Server --version 10.0.25
- Script & Interactive: #r "nuget: DevOnBike.Overfit.Server, 10.0.25"
- File-based apps: #:package DevOnBike.Overfit.Server@10.0.25
- Cake: #addin nuget:?package=DevOnBike.Overfit.Server&version=10.0.25 (as an addin) or #tool nuget:?package=DevOnBike.Overfit.Server&version=10.0.25 (as a tool)
An OpenAI-compatible HTTP server for the Overfit in-process LLM runtime. It is dependency-free (System.Net.HttpListener plus System.Text.Json source generation, no ASP.NET Core), so it drops cleanly into a Native AOT single binary. Point any OpenAI client at the base URL and change only the model name.
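As a rough illustration of "point any OpenAI client at the base URL", here is a minimal sketch of a non-streaming call made with plain HttpClient. The base address http://127.0.0.1:11434/ and the model name qwen2.5-3b are taken from the Usage section of this README. The class and method names (OverfitClientSketch, AskAsync, ExtractContent) are hypothetical, and the response shape is assumed to follow the standard OpenAI chat completion layout (choices[0].message.content).

```csharp
using System;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of the package.
public static class OverfitClientSketch
{
    // Pulls choices[0].message.content out of an OpenAI-style chat completion response.
    public static string ExtractContent(string responseJson)
    {
        using var doc = JsonDocument.Parse(responseJson);
        return doc.RootElement
            .GetProperty("choices")[0]
            .GetProperty("message")
            .GetProperty("content")
            .GetString() ?? string.Empty;
    }

    // Sends one user message to POST /v1/chat/completions and returns the reply text.
    public static async Task<string> AskAsync(HttpClient http, string model, string prompt)
    {
        // Anonymous-type serialization relies on reflection; fine for a quick client,
        // but a Native AOT client would want a source-generated context instead.
        var body = JsonSerializer.Serialize(new
        {
            model,
            messages = new[] { new { role = "user", content = prompt } }
        });

        using var content = new StringContent(body, Encoding.UTF8, "application/json");
        using var response = await http.PostAsync("v1/chat/completions", content);
        response.EnsureSuccessStatusCode();
        return ExtractContent(await response.Content.ReadAsStringAsync());
    }
}
```

A caller would create the client once, for example new HttpClient { BaseAddress = new Uri("http://127.0.0.1:11434/") }, and then await OverfitClientSketch.AskAsync(http, "qwen2.5-3b", "Say hello."). Off-the-shelf OpenAI SDKs usually expect the base URL to include the /v1 segment; check the SDK you use.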
What you get
- OverfitOpenAiServer.Serve(client, modelName, host, port, systemMessage, …): a blocking, single-flight server exposing:
  - POST /v1/chat/completions: streaming (SSE) and non-streaming, with response_format → JSON / JSON-Schema constrained decoding; sampling via temperature / top_p / min_p (llama.cpp-server extension; takes precedence over top_p)
  - GET /v1/models, GET /health
- DevOnBike.Overfit.Server.OpenAi: the OpenAI request/response DTOs, a source-generated JsonSerializerContext (OpenAiJsonContext), and the host-agnostic OpenAiChatMapping (sampling / response_format → constraint / history replay) that you can reuse from ASP.NET or your own host.
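To make the response_format and sampling fields concrete, the sketch below builds a request body that asks for JSON-Schema constrained output and sets min_p. It assumes the server accepts the standard OpenAI response_format layout (type json_schema with a nested json_schema object) and a top-level min_p number, as the llama.cpp server does. The schema, its name city_info, and the helper ChatRequestSketch are made up for illustration.

```csharp
using System;
using System.Text.Json.Nodes;

// Hypothetical helper, not part of the package.
public static class ChatRequestSketch
{
    // Builds an OpenAI-style chat completion body that requests schema-constrained JSON.
    public static string BuildConstrainedRequest(string model, string userMessage)
    {
        // Illustrative schema: an object with a string and an integer field.
        var schema = new JsonObject
        {
            ["type"] = "object",
            ["properties"] = new JsonObject
            {
                ["city"] = new JsonObject { ["type"] = "string" },
                ["population"] = new JsonObject { ["type"] = "integer" }
            },
            ["required"] = new JsonArray(JsonValue.Create("city"), JsonValue.Create("population"))
        };

        var body = new JsonObject
        {
            ["model"] = model,
            ["messages"] = new JsonArray(
                new JsonObject { ["role"] = "user", ["content"] = userMessage }),
            ["temperature"] = 0.2,
            ["min_p"] = 0.05,   // per the list above, takes precedence over top_p
            ["stream"] = false,
            ["response_format"] = new JsonObject
            {
                ["type"] = "json_schema",
                ["json_schema"] = new JsonObject
                {
                    ["name"] = "city_info",
                    ["schema"] = schema
                }
            }
        };

        return body.ToJsonString();
    }
}
```

The resulting string can be posted as application/json to /v1/chat/completions. For plain JSON output without a schema, the OpenAI convention is a response_format of type json_object; whether this server treats it identically is not stated here.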
Usage
using DevOnBike.Overfit.LanguageModels;
using DevOnBike.Overfit.Server;

using var client = OverfitClient.LoadGguf(@"C:\models\qwen2.5-3b-instruct-q4_k_m.gguf");
client.AddSystem("You are a concise assistant running locally in pure .NET.");

// Serves until the token is cancelled. One request at a time (single model session).
OverfitOpenAiServer.Serve(client, "qwen2.5-3b", host: "127.0.0.1", port: 11434,
    systemMessage: "You are a concise assistant running locally in pure .NET.");
This is exactly what the overfit serve CLI command uses.
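With the server above listening on 127.0.0.1:11434, a streaming consumer can be sketched as follows. This assumes the SSE stream follows the usual OpenAI framing: one data: line per chunk carrying choices[0].delta.content, terminated by data: [DONE]. The names SseSketch, ReadDelta, and StreamAsync are hypothetical, and the request JSON passed in is expected to contain "stream": true.

```csharp
using System;
using System.IO;
using System.Net.Http;
using System.Text;
using System.Text.Json;
using System.Threading.Tasks;

// Hypothetical helper, not part of the package.
public static class SseSketch
{
    // Returns the text delta carried by one SSE line, or null when the line has none
    // (blank separator, role-only chunk, or the terminating "data: [DONE]").
    public static string? ReadDelta(string line)
    {
        const string prefix = "data:";
        if (!line.StartsWith(prefix, StringComparison.Ordinal)) return null;

        var payload = line.Substring(prefix.Length).Trim();
        if (payload.Length == 0 || payload == "[DONE]") return null;

        using var doc = JsonDocument.Parse(payload);
        if (!doc.RootElement.TryGetProperty("choices", out var choices) ||
            choices.ValueKind != JsonValueKind.Array ||
            choices.GetArrayLength() == 0)
        {
            return null;
        }

        var choice = choices[0];
        if (choice.TryGetProperty("delta", out var delta) &&
            delta.ValueKind == JsonValueKind.Object &&
            delta.TryGetProperty("content", out var content) &&
            content.ValueKind == JsonValueKind.String)
        {
            return content.GetString();
        }
        return null;
    }

    // Posts a streaming request and hands each text delta to onText as it arrives.
    public static async Task StreamAsync(HttpClient http, string requestJson, Action<string> onText)
    {
        using var request = new HttpRequestMessage(HttpMethod.Post, "v1/chat/completions")
        {
            Content = new StringContent(requestJson, Encoding.UTF8, "application/json")
        };

        // ResponseHeadersRead lets us read the body incrementally instead of buffering it.
        using var response = await http.SendAsync(request, HttpCompletionOption.ResponseHeadersRead);
        response.EnsureSuccessStatusCode();

        using var reader = new StreamReader(await response.Content.ReadAsStreamAsync());
        string? line;
        while ((line = await reader.ReadLineAsync()) != null)
        {
            if (line.Trim() == "data: [DONE]") break;
            var text = ReadDelta(line);
            if (text != null) onText(text);
        }
    }
}
```

Passing Console.Write as onText prints tokens as they are generated.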
Notes
- Requests are served strictly one at a time (single-tenant model session / one KV cache), like a local llama.cpp server. For multi-tenant use, use a session-per-request pool.
- Binding to 127.0.0.1 / localhost needs no elevation; 0.0.0.0 / * may need a URL ACL or admin rights on Windows.
- The decode worker pool parks when idle (spin-then-park), so a server waiting for requests sits at ~0% CPU.
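Because the server handles one request at a time, an application with several concurrent callers may prefer to queue on its own side rather than depend on how the server treats overlapping requests, which these notes do not specify. The sketch below is a generic client-side gate built on SemaphoreSlim. SingleFlightGate is a hypothetical name, and this is not the session-per-request pool mentioned above, which would live on the hosting side.

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical client-side helper, not part of the package.
public sealed class SingleFlightGate
{
    // One permit: at most one wrapped call is in flight at any moment.
    private readonly SemaphoreSlim _gate = new SemaphoreSlim(1, 1);

    public async Task<T> RunAsync<T>(Func<Task<T>> call, CancellationToken ct = default)
    {
        await _gate.WaitAsync(ct);
        try
        {
            return await call();
        }
        finally
        {
            // Always release, even when the call throws, so later callers are not blocked.
            _gate.Release();
        }
    }
}
```

Each HTTP call to the server would be wrapped as gate.RunAsync(() => SendRequestAsync(...)), where SendRequestAsync stands for whatever client method the application already has.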
License
Dual-licensed: AGPL-3.0-or-later for open source; a commercial license is available for
closed-source / SaaS / regulated use — see COMMERCIAL.md.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies
- net10.0
  - DevOnBike.Overfit (>= 10.0.25)
  - System.Numerics.Tensors (>= 10.0.9)