Microsoft.ML.OnnxRuntimeGenAI.Cuda
0.5.0
dotnet add package Microsoft.ML.OnnxRuntimeGenAI.Cuda --version 0.5.0
NuGet\Install-Package Microsoft.ML.OnnxRuntimeGenAI.Cuda -Version 0.5.0
<PackageReference Include="Microsoft.ML.OnnxRuntimeGenAI.Cuda" Version="0.5.0" />
paket add Microsoft.ML.OnnxRuntimeGenAI.Cuda --version 0.5.0
#r "nuget: Microsoft.ML.OnnxRuntimeGenAI.Cuda, 0.5.0"
// Install Microsoft.ML.OnnxRuntimeGenAI.Cuda as a Cake Addin
#addin nuget:?package=Microsoft.ML.OnnxRuntimeGenAI.Cuda&version=0.5.0
// Install Microsoft.ML.OnnxRuntimeGenAI.Cuda as a Cake Tool
#tool nuget:?package=Microsoft.ML.OnnxRuntimeGenAI.Cuda&version=0.5.0
About
Run Llama, Phi (Language + Vision!), Gemma, Mistral with ONNX Runtime.
This API gives you an easy, flexible and performant way of running LLMs on device using .NET/C#.
It implements the generative AI loop for ONNX models, including pre- and post-processing, inference with ONNX Runtime, logits processing, search and sampling, and KV cache management.
You can call a high-level generate() method to generate all of the output at once, or stream the output one token at a time.
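The one-shot path can be sketched as follows. This is a minimal sketch, assuming the 0.5.0 C# bindings expose the high-level call as Model.Generate and that its result can be indexed and decoded with Tokenizer.Decode. The prompt text and the max_length value are illustrative, and the model path is left as a placeholder for you to fill in.

```csharp
using System;
using Microsoft.ML.OnnxRuntimeGenAI;

// Placeholder: set this to the folder that holds your downloaded ONNX model.
string modelPath = "...";

OgaHandle ogaHandle = new OgaHandle();
using Model model = new Model(modelPath);
using Tokenizer tokenizer = new Tokenizer(model);

// Phi-3 style chat markers; other model families may expect a different template.
var sequences = tokenizer.Encode("<|user|>What is ONNX Runtime?<|end|><|assistant|>");

using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 256); // illustrative value
generatorParams.SetInputSequences(sequences);

// Assumption: Model.Generate runs the whole generation loop in one call
// and returns the finished sequences.
var outputSequences = model.Generate(generatorParams);

// Decode the first (and here only) sequence back into text.
Console.WriteLine(tokenizer.Decode(outputSequences[0]));
```

Compared with the token-by-token loop, this form is shorter but prints nothing until generation has finished, so streaming is usually the better fit for interactive output.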
Key Features
- Language and vision pre- and post-processing
- Inference using ONNX Runtime
- Generation tuning with greedy, beam search and random sampling
- KV cache management to optimize performance
- Multi-target execution (CPU, GPU, with NPU coming!)
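To make the difference between greedy search and random sampling concrete, here is a small standalone sketch in plain C#. It is not the library's implementation: SamplingSketch, Greedy and SampleTopK are hypothetical names invented for illustration, and beam search is left out for brevity. It only shows how a next token could be chosen from a vector of logits.

```csharp
using System;
using System.Linq;

// Illustrative only: hypothetical helper, not part of Microsoft.ML.OnnxRuntimeGenAI.
public static class SamplingSketch
{
    // Greedy search: always pick the index of the highest logit.
    public static int Greedy(float[] logits)
    {
        int best = 0;
        for (int i = 1; i < logits.Length; i++)
        {
            if (logits[i] > logits[best]) best = i;
        }
        return best;
    }

    // Top-k random sampling: keep the k highest logits, turn them into
    // softmax weights scaled by temperature, then draw one index at random.
    // Assumes k >= 1 and temperature > 0.
    public static int SampleTopK(float[] logits, int k, double temperature, Random rng)
    {
        int[] candidates = Enumerable.Range(0, logits.Length)
            .OrderByDescending(i => logits[i])
            .Take(k)
            .ToArray();

        // Subtract the largest logit before exponentiating for numerical stability.
        double max = logits[candidates[0]];
        double[] weights = candidates
            .Select(i => Math.Exp((logits[i] - max) / temperature))
            .ToArray();

        double draw = rng.NextDouble() * weights.Sum();
        double cumulative = 0.0;
        for (int j = 0; j < candidates.Length; j++)
        {
            cumulative += weights[j];
            if (draw < cumulative) return candidates[j];
        }
        return candidates[^1]; // guard against floating-point round-off
    }
}
```

Greedy selection is deterministic, while top-k sampling trades some predictability for variety. A lower temperature concentrates the draw on the highest-scoring candidates.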
Sample
// See https://aka.ms/new-console-template for more information
using Microsoft.ML.OnnxRuntimeGenAI;
OgaHandle ogaHandle = new OgaHandle();
// Specify the location of your downloaded model.
// Many models are published on HuggingFace e.g.
// https://huggingface.co/microsoft/Phi-3-mini-4k-instruct-onnx
string modelPath = "...";
Console.WriteLine("Model path: " + modelPath);
using Model model = new Model(modelPath);
using Tokenizer tokenizer = new Tokenizer(model);
// Set your prompt here
string prompt = "public static bool IsPrime(int number)";
var sequences = tokenizer.Encode($"<|user|>{prompt}<|end|><|assistant|>");
using GeneratorParams generatorParams = new GeneratorParams(model);
generatorParams.SetSearchOption("max_length", 512);
generatorParams.SetInputSequences(sequences);
using var tokenizerStream = tokenizer.CreateStream();
using var generator = new Generator(model, generatorParams);
while (!generator.IsDone())
{
    generator.ComputeLogits();
    generator.GenerateNextToken();
    Console.Write(tokenizerStream.Decode(generator.GetSequence(0)[^1]));
}
Generates the following output:
Here's a complete implementation of the `IsPrime` function in C# that checks if a given number is prime. The function includes basic input validation and comments for clarity.
using System;
namespace PrimeChecker
{
    public class PrimeChecker
    {
        /// <summary>
        /// Checks if the given number is prime.
        /// </summary>
        /// <param name="number">The number to check.</param>
        /// <returns>true if the number is prime; otherwise, false.</returns>
        public static bool IsPrime(int number)
        {
            // Input validation
            if (number < 2)
            {
                return false;
            }
            // 2 is the only even prime number
            if (number == 2)
            {
                return true;
            }
            // Exclude even numbers greater than 2
            if (number % 2 == 0)
            {
                return false;
            }
            // Check for factors up to the square root of the number
            int limit = (int)Math.Floor(Math.Sqrt(number));
            for (int i = 3; i <= limit; i += 2)
            {
                if (number % i == 0)
                {
                    return false;
                }
            }
            return true;
        }
        static void Main(string[] args)
        {
            int number = 29;
            bool isPrime = PrimeChecker.IsPrime(number);
            Console.WriteLine($"Is {number} prime? {isPrime}");
        }
    }
}
This implementation checks if a number is prime by iterating only up to the square root of the number, which is an optimization over checking all numbers up to the number itself. It also excludes even numbers greater than 2, as they cannot be prime.
Source code repository
ONNX Runtime is an open source project. See:
- [ONNX Runtime](https://github.com/microsoft/onnxruntime)
- [ONNX Runtime GenAI](https://github.com/microsoft/onnxruntime-genai)
Documentation
See [ONNX Runtime GenAI Documentation](https://onnxruntime.ai/docs/genai)
Product | Versions (compatible and additional computed target framework versions) |
---|---|
.NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-android31.0 is compatible. net8.0-browser was computed. net8.0-ios was computed. net8.0-ios15.4 is compatible. net8.0-maccatalyst was computed. net8.0-maccatalyst14.0 is compatible. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. |
.NET Core | netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
.NET Standard | netstandard2.0 is compatible. netstandard2.1 was computed. |
.NET Framework | net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
MonoAndroid | monoandroid was computed. |
MonoMac | monomac was computed. |
MonoTouch | monotouch was computed. |
native | native is compatible. |
Tizen | tizen40 was computed. tizen60 was computed. |
Xamarin.iOS | xamarinios was computed. |
Xamarin.Mac | xamarinmac was computed. |
Xamarin.TVOS | xamarintvos was computed. |
Xamarin.WatchOS | xamarinwatchos was computed. |
Dependencies
- .NETCoreApp 0.0
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
- .NETFramework 0.0
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
- .NETStandard 0.0
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
- net8.0-android31.0
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
- net8.0-ios15.4
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
- net8.0-maccatalyst14.0
  - Microsoft.ML.OnnxRuntime.Gpu (>= 1.20.0)
  - Microsoft.ML.OnnxRuntimeGenAI.Managed (>= 0.5.0)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Microsoft.ML.OnnxRuntimeGenAI.Cuda:
Package | Downloads |
---|---|
feiyun0112.SemanticKernel.Connectors.OnnxRuntimeGenAI.CUDA: Semantic Kernel connector for Microsoft.ML.OnnxRuntimeGenAI. | |
GitHub repositories (1)
Showing the top 1 popular GitHub repositories that depend on Microsoft.ML.OnnxRuntimeGenAI.Cuda:
Repository | Stars |
---|---|
microsoft/semantic-kernel: Integrate cutting-edge LLM technology quickly and easily into your apps | |
Version | Downloads | Last updated |
---|---|---|
0.5.0 | 288 | 11/7/2024 |
0.4.0 | 2,918 | 8/21/2024 |
0.4.0-rc1 | 254 | 8/14/2024 |
0.3.0 | 5,140 | 6/21/2024 |
0.3.0-rc2 | 6,530 | 5/29/2024 |
0.3.0-rc1 | 755 | 5/22/2024 |
0.2.0 | 1,666 | 5/20/2024 |
0.2.0-rc7 | 1,268 | 5/14/2024 |
0.2.0-rc6 | 1,797 | 5/4/2024 |
0.2.0-rc4 | 2,216 | 4/25/2024 |
0.2.0-rc3 | 232 | 4/24/2024 |
0.1.0 | 360 | 4/8/2024 |
0.1.0-rc4 | 276 | 3/27/2024 |
Release Def:
Branch: refs/heads/rel-0.5.0
Commit: 826f6aa04490ad081347278eb6ca87f230c0bf44