Fluid.OpenVINO.GenAI 2025.3.0.1

Installation options:

- .NET CLI: dotnet add package Fluid.OpenVINO.GenAI --version 2025.3.0.1
- Package Manager: NuGet\Install-Package Fluid.OpenVINO.GenAI -Version 2025.3.0.1
- PackageReference: <PackageReference Include="Fluid.OpenVINO.GenAI" Version="2025.3.0.1" />
- Central package management: <PackageVersion Include="Fluid.OpenVINO.GenAI" Version="2025.3.0.1" /> together with <PackageReference Include="Fluid.OpenVINO.GenAI" />
- Paket: paket add Fluid.OpenVINO.GenAI --version 2025.3.0.1
- F# Interactive: #r "nuget: Fluid.OpenVINO.GenAI, 2025.3.0.1"
- File-based apps: #:package Fluid.OpenVINO.GenAI@2025.3.0.1
- Cake addin: #addin nuget:?package=Fluid.OpenVINO.GenAI&version=2025.3.0.1
- Cake tool: #tool nuget:?package=Fluid.OpenVINO.GenAI&version=2025.3.0.1
Fluid.OpenVINO.GenAI.NET
A C# wrapper for OpenVINO and OpenVINO GenAI, providing idiomatic .NET APIs for AI inference and generative AI tasks.
For LLM inference, an alternative to consider is Microsoft's C# Foundry Local package if you only need to run inference on GPU.
Features
Supports LLMPipeline and WhisperPipeline through the C API from openvino.genai. A pre-release runtime is used because WhisperPipeline support was only recently added to the C API (by us :]).
Requirements
- .NET 8.0 or later
- Windows x64 or Linux x64
- OpenVINO GenAI 2025.3.0.0.dev20250801 runtime
Quick Start
Option 1: Quick Demo (Recommended)
The easiest way to get started is with the QuickDemo application that automatically downloads a model:
By default the script downloads the runtime for Ubuntu 24; if you are on another version, change it in the script:
scripts/download-openvino-runtime.sh
OPENVINO_RUNTIME_PATH=<repo-root>/build/native/runtimes/linux-x64/native dotnet run --project samples/QuickDemo/ --configuration Release -- --device CPU
For Windows:
.\scripts\download-openvino-runtime.ps1
$env:OPENVINO_RUNTIME_PATH = "<repo-root>\build\native\runtimes\win-x64\native"
dotnet run --project samples/QuickDemo/ --configuration Release -- --device CPU
Sample Output:
OpenVINO.NET Quick Demo
=======================
Model: Qwen3-0.6B-fp16-ov
Temperature: 0.7, Max Tokens: 100
✓ Model found at: ./Models/Qwen3-0.6B-fp16-ov
Device: CPU
Prompt 1: "Explain quantum computing in simple terms:"
Response: "Quantum computing is a revolutionary technology that uses quantum mechanics principles..."
Performance: 12.4 tokens/sec, First token: 450ms
Option 2: Code Integration
For integrating into your own applications:
using OpenVINO.NET.GenAI;
using var pipeline = new LLMPipeline("path/to/model", "CPU");
var config = GenerationConfig.Default.WithMaxTokens(100).WithTemperature(0.7f);
string result = await pipeline.GenerateAsync("Hello, world!", config);
Console.WriteLine(result);
Streaming Generation
using OpenVINO.NET.GenAI;
using var pipeline = new LLMPipeline("path/to/model", "CPU");
var config = GenerationConfig.Default.WithMaxTokens(100);
await foreach (var token in pipeline.GenerateStreamAsync("Tell me a story", config))
{
Console.Write(token);
}
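The architecture notes below also list a ChatSession type for conversation management. The following is a hypothetical sketch of multi-turn usage; the constructor and SendAsync method name are assumptions, not the confirmed API surface — check the ChatSession class for the actual members.

```csharp
using OpenVINO.NET.GenAI;

// Hypothetical multi-turn chat sketch; ChatSession's constructor and
// SendAsync are assumed names, substitute the library's actual members.
using var pipeline = new LLMPipeline("path/to/model", "CPU");
using var chat = new ChatSession(pipeline); // assumed constructor

// Each turn is answered with the prior turns as context.
string first = await chat.SendAsync("What is OpenVINO?");
string second = await chat.SendAsync("Does it run on NPUs?");
Console.WriteLine(second);
```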
Projects
- OpenVINO.NET.Core - Core OpenVINO wrapper
- OpenVINO.NET.GenAI - GenAI functionality
- OpenVINO.NET.Native - Native library management
- QuickDemo - Quick start demo with automatic model download
- TextGeneration.Sample - Basic text generation example
- StreamingChat.Sample - Streaming chat application
Architecture
Three-Layer Design
┌─────────────────────────────────────────────────────────────┐
│ Your Application │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ OpenVINO.NET.GenAI │
│ • LLMPipeline (High-level API) │
│ • GenerationConfig (Fluent configuration) │
│ • ChatSession (Conversation management) │
│ • IAsyncEnumerable streaming │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ OpenVINO.NET.Core │
│ • Core OpenVINO functionality │
│ • Model loading and inference │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ OpenVINO.NET.Native │
│ • P/Invoke declarations │
│ • SafeHandle resource management │
│ • MSBuild targets for DLL deployment │
└─────────────────────────────────────────────────────────────┘
↓
┌─────────────────────────────────────────────────────────────┐
│ OpenVINO GenAI C API │
│ • Native OpenVINO GenAI runtime │
│ • Version: 2025.3.0.0.dev20250801 │
└─────────────────────────────────────────────────────────────┘
Key Features
- Memory Safe: SafeHandle pattern for automatic resource cleanup
- Async/Await: Full async support with cancellation tokens
- Streaming: Real-time token generation with IAsyncEnumerable<string>
- Fluent API: Chainable configuration methods
- Error Handling: Comprehensive exception handling and device fallbacks
- Performance: Optimized for both throughput and latency
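Since the feature list advertises full async support with cancellation tokens, a time-bounded generation loop might look like the sketch below. It assumes GenerateStreamAsync returns an IAsyncEnumerable that honors WithCancellation; verify against the actual method signature.

```csharp
using OpenVINO.NET.GenAI;

// Sketch: bounding a streaming generation to 30 seconds via CancellationToken.
// Assumes the stream cooperates with WithCancellation as the async/cancellation
// feature implies; confirm against the GenerateStreamAsync signature.
using var pipeline = new LLMPipeline("path/to/model", "CPU");
var config = GenerationConfig.Default.WithMaxTokens(200);
using var cts = new CancellationTokenSource(TimeSpan.FromSeconds(30));

try
{
    await foreach (var token in pipeline
        .GenerateStreamAsync("Summarize OpenVINO:", config)
        .WithCancellation(cts.Token))
    {
        Console.Write(token);
    }
}
catch (OperationCanceledException)
{
    Console.WriteLine("\n[generation cancelled after timeout]");
}
```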
Installation
Prerequisites
Install .NET 8.0 SDK or later
- Download from: https://dotnet.microsoft.com/download
Install OpenVINO GenAI Runtime 2025.3.0.0.dev20250801
- Download from: https://storage.openvinotoolkit.org/repositories/openvino_genai/packages/
- Extract to a directory in your PATH, or place DLLs in your application's output directory
Benchmark Command
# Compare all available devices
dotnet run --project samples/QuickDemo -- --benchmark
Troubleshooting
For detailed NuGet package troubleshooting, see NuGet Troubleshooting Guide.
Common Issues
1. "OpenVINO runtime not found"
Error: The specified module could not be found. (Exception from HRESULT: 0x8007007E)
Solution: Ensure OpenVINO GenAI runtime DLLs are in your PATH or application directory.
2. "Device not supported"
Error: Failed to create LLM pipeline on GPU: Device GPU is not supported
Solutions:
- Check device availability:
dotnet run --project samples/QuickDemo -- --benchmark
- Use CPU fallback:
dotnet run --project samples/QuickDemo -- --device CPU
- Install appropriate drivers (Intel GPU driver for GPU support, Intel NPU driver for NPU)
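The feature list mentions device fallbacks; if you want to implement the fallback chain yourself, a sketch follows. Catching the base Exception type is a placeholder assumption — substitute the library's actual pipeline-creation exception type.

```csharp
using OpenVINO.NET.GenAI;

// Sketch of a manual GPU -> NPU -> CPU fallback chain. The exception type
// caught here is an assumption; narrow it to the library's actual
// pipeline-creation exception once known.
static LLMPipeline CreatePipeline(string modelPath)
{
    foreach (var device in new[] { "GPU", "NPU", "CPU" })
    {
        try
        {
            return new LLMPipeline(modelPath, device);
        }
        catch (Exception ex) // e.g. device not supported, driver missing
        {
            Console.Error.WriteLine($"{device} unavailable: {ex.Message}");
        }
    }
    throw new InvalidOperationException("No usable inference device found.");
}
```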
3. "Model download fails"
Error: Failed to download model files from HuggingFace
Solutions:
- Check internet connectivity
- Verify HuggingFace is accessible
- Manually download model files to ./Models/Qwen3-0.6B-fp16-ov/
4. "Out of memory during inference"
Error: Insufficient memory to load model
Solutions:
- Use a smaller model
- Reduce max_tokens parameter
- Close other memory-intensive applications
- Consider using INT4 quantized models
Debug Mode
Enable detailed logging by setting an environment variable:
# Windows
set OPENVINO_LOG_LEVEL=DEBUG
# Linux/macOS
export OPENVINO_LOG_LEVEL=DEBUG
Contributing
Development Setup
Install Prerequisites
- Visual Studio 2022 or VS Code with C# extension
- .NET 9.0 SDK
- OpenVINO GenAI runtime
Build and Test
dotnet build OpenVINO.NET.sln
dotnet test tests/OpenVINO.NET.GenAI.Tests/
License
This project is licensed under the MIT License - see the LICENSE file for details.
Resources
- OpenVINO GenAI Documentation
- OpenVINO GenAI C API Reference
- .NET P/Invoke Documentation
- HuggingFace Model Hub
Building
dotnet build OpenVINO.NET.sln
Product | Compatible and computed target frameworks |
---|---|
.NET | net6.0, net7.0, net8.0, and net9.0 are compatible. The corresponding platform-specific TFMs (android, ios, maccatalyst, macos, tvos, windows, browser) and all net10.0 TFMs were computed. |
Dependencies

net6.0
- System.Memory (>= 4.5.5)
- System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- System.Threading.Channels (>= 7.0.0)

net7.0
- System.Memory (>= 4.5.5)
- System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- System.Threading.Channels (>= 7.0.0)

net8.0
- System.Memory (>= 4.5.5)
- System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- System.Threading.Channels (>= 7.0.0)

net9.0
- System.Memory (>= 4.5.5)
- System.Runtime.CompilerServices.Unsafe (>= 6.0.0)
- System.Threading.Channels (>= 7.0.0)
Version | Downloads | Last Updated |
---|---|---|
2025.3.0.1 | 210 | 8/6/2025 |
2025.3.0-dev20250801 | 208 | 8/6/2025 |
2025.2.0.1 | 196 | 7/20/2025 |
Fluid.OpenVINO.GenAI 2025.3.0.1
Community-maintained C# wrapper for OpenVINO GenAI.
IMPORTANT FIX:
- Fixed NuGet package native library bundling and MSBuild targets
- Native libraries now properly deploy on both Windows and Linux
- Added multi-targeting for .NET 6.0, 7.0, 8.0, and 9.0
Requirements:
- .NET 6.0, 7.0, 8.0, or 9.0
- Windows x64 or Linux x64
- OpenVINO GenAI 2025.3.0.0.dev20250801 runtime (included)
Features:
- LLM Pipeline with streaming support (IAsyncEnumerable)
- Whisper Pipeline for speech-to-text
- Fluent configuration API
- Automatic native library management via MSBuild targets
- SafeHandle resource management
Changes in 2025.3.0.1:
- Fixed MSBuild targets to correctly resolve native libraries from NuGet package
- Added support for .NET 6.0, 7.0, and 9.0 (previously only .NET 8.0)
- Improved error messages for missing native libraries
- Added comprehensive troubleshooting documentation
- All 27 OpenVINO native libraries now properly included for Windows
- Linux native libraries included for Ubuntu 24.04