Amplifier.NET
2.0.0
Write C#. Run on GPU.
Amplifier.NET is a GPU computing library that lets .NET developers harness the power of parallel processing on Intel, NVIDIA, and AMD hardware—without writing a single line of C or OpenCL kernel code.
Why Amplifier.NET?
Modern applications demand massive computational power for machine learning, scientific simulations, image processing, and financial modeling. GPUs can process thousands of operations in parallel, but traditionally require specialized knowledge of OpenCL, CUDA, or shader languages.
Amplifier.NET bridges this gap. Write your compute kernels in familiar C# syntax, and let Amplifier handle the translation to OpenCL, device management, and memory transfers. Your code runs on any OpenCL-compatible device—from integrated Intel graphics to high-end discrete GPUs.
Features
- Pure C# Kernels — Write GPU compute functions using standard C# syntax
- Automatic Translation — C# code is decompiled and translated to OpenCL C99 at runtime
- OpenCL 3.0 Support — Full support for the latest OpenCL specification including optional features
- Cross-Platform — Works on Windows, Linux, and macOS with any OpenCL driver
- Multi-Device — Enumerate and target specific compute devices (CPU, GPU, FPGA)
- Struct Support — Pass custom structs between host and device
- XArray System — Advanced array types with shape manipulation and automatic memory management
Quick Start
Installation
dotnet add package Amplifier.NET
Your First Kernel
Define a kernel class that extends OpenCLFunctions:
    using Amplifier.OpenCL;

    public class MyKernels : OpenCLFunctions
    {
        // Adds two vectors element-wise; each work-item handles one index.
        [OpenCLKernel]
        void VectorAdd([Global] float[] a, [Global] float[] b, [Global] float[] result)
        {
            int i = get_global_id(0);
            result[i] = a[i] + b[i];
        }

        // Multiplies every element of data by factor, in place.
        [OpenCLKernel]
        void Scale([Global] float[] data, float factor)
        {
            int i = get_global_id(0);
            data[i] *= factor;
        }
    }
Execute on GPU
    using System;
    using Amplifier;

    // Initialize the compiler and list the available devices
    var compiler = new OpenCLCompiler();
    Console.WriteLine("Available Devices:");
    foreach (var device in compiler.Devices)
    {
        Console.WriteLine($" {device}");
    }

    // Select the first device and compile the kernels for it
    compiler.UseDevice(0);
    compiler.CompileKernel(typeof(MyKernels));

    // Prepare data
    float[] a = { 1, 2, 3, 4, 5 };
    float[] b = { 10, 20, 30, 40, 50 };
    float[] result = new float[5];

    // Execute the kernel
    var exec = compiler.GetExec();
    exec.VectorAdd(a, b, result);

    Console.WriteLine(string.Join(", ", result));
    // Output: 11, 22, 33, 44, 55
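To make the execution model concrete, the sketch below emulates the two kernels on the CPU. It assumes one work-item per array element, with the loop index standing in for `get_global_id(0)`; the source does not state how Amplifier derives the global size. `CpuReference` and its methods are hypothetical helpers, not part of Amplifier.NET. A loop like this can also serve as a reference when checking GPU output.

```csharp
using System;
using System.Threading.Tasks;

// Hypothetical CPU-side reference; not part of Amplifier.NET.
public static class CpuReference
{
    // Emulates the VectorAdd kernel: one loop iteration per work-item,
    // with the loop index playing the role of get_global_id(0).
    public static float[] VectorAdd(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Input arrays must have the same length.");

        var result = new float[a.Length];
        Parallel.For(0, a.Length, i => result[i] = a[i] + b[i]);
        return result;
    }

    // Emulates the Scale kernel, which modifies its buffer in place.
    public static void Scale(float[] data, float factor)
    {
        Parallel.For(0, data.Length, i => data[i] *= factor);
    }

    public static void Main()
    {
        float[] a = { 1, 2, 3, 4, 5 };
        float[] b = { 10, 20, 30, 40, 50 };

        float[] sum = VectorAdd(a, b);
        Console.WriteLine(string.Join(", ", sum)); // 11, 22, 33, 44, 55

        Scale(sum, 2.0f);
        Console.WriteLine(string.Join(", ", sum)); // 22, 44, 66, 88, 110
    }
}
```

Because every iteration writes to a distinct index, the iterations are independent, which is the property that lets the same body run as parallel work-items on a device.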
Working with Structs
Amplifier supports custom structs for complex data types:
    using System.Runtime.InteropServices;
    using Amplifier.OpenCL;

    // Sequential layout keeps the fields in declaration order
    [StructLayout(LayoutKind.Sequential)]
    public struct Particle
    {
        public float X, Y, Z;
        public float VelocityX, VelocityY, VelocityZ;
        public float Mass;
        public int Active;
    }

    public class PhysicsKernels : OpenCLFunctions
    {
        // Advances the position of each active particle by one time step
        [OpenCLKernel]
        void Integrate([Global][Struct] Particle[] particles, float deltaTime)
        {
            int i = get_global_id(0);
            if (particles[i].Active == 1)
            {
                particles[i].X += particles[i].VelocityX * deltaTime;
                particles[i].Y += particles[i].VelocityY * deltaTime;
                particles[i].Z += particles[i].VelocityZ * deltaTime;
            }
        }
    }

    // Compile with struct types
    compiler.CompileKernel(typeof(PhysicsKernels), typeof(Particle));
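When a struct crosses the host-device boundary, its byte layout matters. The source does not document how Amplifier maps C# structs to OpenCL, so the following is only a host-side sanity check under the assumption that field order and sizes must match on both sides. It uses the standard `Marshal.SizeOf<T>()` and `Marshal.OffsetOf<T>()` calls; `LayoutCheck` is a hypothetical helper name.

```csharp
using System;
using System.Runtime.InteropServices;

[StructLayout(LayoutKind.Sequential)]
public struct Particle
{
    public float X, Y, Z;
    public float VelocityX, VelocityY, VelocityZ;
    public float Mass;
    public int Active;
}

// Hypothetical helper; not part of Amplifier.NET.
public static class LayoutCheck
{
    // Total marshaled size of one Particle in bytes.
    public static int SizeOfParticle() => Marshal.SizeOf<Particle>();

    // Byte offset of a named field within Particle.
    public static int OffsetOf(string field) => (int)Marshal.OffsetOf<Particle>(field);

    public static void Main()
    {
        // Eight 4-byte fields with 4-byte alignment: no padding is needed.
        Console.WriteLine(SizeOfParticle());   // 32
        Console.WriteLine(OffsetOf("Mass"));   // 24
        Console.WriteLine(OffsetOf("Active")); // 28
    }
}
```

Keeping every field at 4 bytes, as `Particle` does, avoids padding surprises; mixing in `double` or `byte` fields would make a check like this more valuable.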
Advanced: XArray for Scientific Computing
The XArray system provides NumPy-like array operations with automatic GPU memory management:
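    // Matrix dimensions: a is M x K, b is K x N, c is M x N
    int M = 1024, N = 1024, K = 512;
    var a = new InArray(new long[] { M, K }, DType.Float32);
    var b = new InArray(new long[] { K, N }, DType.Float32);
    var c = new OutArray(new long[] { M, N }, DType.Float32);

    // Fill and MatMul are invoked like kernels here; their definitions are not shown
    exec.Fill(a, 1.0f);
    exec.Fill(b, 2.0f);
    exec.MatMul(M, N, K, a, b, c);

    // Copy the result back to a managed array
    float[] result = c.ToArray();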
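The `Fill` and `MatMul` kernels are not shown above, so their exact behavior is unknown here. As a way to check the result, the sketch below is a plain CPU reference for the same product, assuming flat row-major storage (an assumption; the source does not state the XArray memory layout). `MatMulReference` is a hypothetical helper. With `a` filled with 1.0 and `b` filled with 2.0, every element of the product equals `K * 2`, which is 1024 for `K = 512`.

```csharp
using System;

// Hypothetical CPU-side reference; not part of Amplifier.NET.
public static class MatMulReference
{
    // Computes C (m x n) = A (m x k) * B (k x n) on flat row-major arrays.
    public static float[] MatMul(int m, int n, int k, float[] a, float[] b)
    {
        if (a.Length != m * k || b.Length != k * n)
            throw new ArgumentException("Array lengths do not match the given dimensions.");

        var c = new float[m * n];
        for (int row = 0; row < m; row++)
        {
            for (int col = 0; col < n; col++)
            {
                float sum = 0f;
                for (int i = 0; i < k; i++)
                    sum += a[row * k + i] * b[i * n + col];
                c[row * n + col] = sum;
            }
        }
        return c;
    }

    // Returns a new array of the given length with every element set to value.
    public static float[] Filled(int length, float value)
    {
        var data = new float[length];
        Array.Fill(data, value);
        return data;
    }

    public static void Main()
    {
        // Smaller M and N than the example above to keep the check fast.
        int m = 4, n = 3, k = 512;
        float[] c = MatMul(m, n, k, Filled(m * k, 1.0f), Filled(k * n, 2.0f));
        Console.WriteLine(c[0]); // 1024
    }
}
```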
OpenCL Built-in Functions
Kernels can call the standard OpenCL built-in functions, including:
| Category | Functions |
|---|---|
| Work-item | get_global_id, get_local_id, get_group_id, get_global_size |
| Math | sin, cos, tan, exp, log, pow, sqrt, fabs, fmin, fmax |
| Geometric | dot, cross, length, normalize, distance |
| Integer | abs, clamp, min, max |
| Synchronization | barrier, mem_fence |
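The work-item functions in the table are related to one another. In OpenCL, with a zero global offset and uniform work-groups, the global ID in a dimension equals the group ID times the work-group size plus the local ID. The sketch below reproduces that arithmetic on the host in plain C#; `WorkItemIndex` is a hypothetical helper, and `localSize` corresponds to what the OpenCL function `get_local_size` would return.

```csharp
using System;

// Hypothetical host-side illustration; not part of Amplifier.NET.
public static class WorkItemIndex
{
    // global_id = group_id * local_size + local_id (zero global offset assumed).
    public static int GlobalId(int groupId, int localSize, int localId)
        => groupId * localSize + localId;

    // Inverse mapping: recover the group and local IDs from a global ID.
    public static (int GroupId, int LocalId) Split(int globalId, int localSize)
        => (globalId / localSize, globalId % localSize);

    public static void Main()
    {
        // Work-item 5 of group 3, with 64 work-items per group.
        Console.WriteLine(GlobalId(3, 64, 5)); // 197

        var (group, local) = Split(197, 64);
        Console.WriteLine($"{group}, {local}"); // 3, 5
    }
}
```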
Performance Tips
- Minimize Host-Device Transfers — Keep data on the GPU between kernel calls
- Use Appropriate Work Sizes — Match your problem dimensions to the kernel's global size
- Prefer Float over Double — Many GPUs have limited double-precision performance
- Coalesce Memory Access — Access contiguous memory addresses for best throughput
- Avoid Branching — Divergent control flow reduces GPU efficiency
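As an illustration of the branching tip, the `if (particles[i].Active == 1)` test in the `Integrate` kernel can be replaced by multiplying the update by the flag. The sketch below compares both forms on the CPU in plain C#. It assumes `Active` is always exactly 0 or 1 and that velocities are finite; `BranchDemo` is a hypothetical name. Whether the branchless form is faster depends on the device and the OpenCL compiler, so treat it as something to measure rather than a guaranteed win.

```csharp
using System;

// Hypothetical CPU-side comparison; not part of Amplifier.NET.
public static class BranchDemo
{
    // Mirrors the kernel's original form: update only when the flag is set.
    public static float StepBranching(float x, float velocity, float dt, int active)
    {
        if (active == 1)
            x += velocity * dt;
        return x;
    }

    // Branchless form: an inactive particle adds a displacement of zero.
    public static float StepBranchless(float x, float velocity, float dt, int active)
    {
        return x + velocity * dt * active;
    }

    public static void Main()
    {
        Console.WriteLine(StepBranching(1f, 2f, 0.5f, 1));  // 2
        Console.WriteLine(StepBranchless(1f, 2f, 0.5f, 1)); // 2
        Console.WriteLine(StepBranching(1f, 2f, 0.5f, 0));  // 1
        Console.WriteLine(StepBranchless(1f, 2f, 0.5f, 0)); // 1
    }
}
```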
Supported Platforms
| Platform | Status |
|---|---|
| Windows (x64) | Fully Supported |
| Linux (x64) | Fully Supported |
| macOS | Supported (Intel/AMD GPUs) |
Tested Hardware:
- Intel Iris Xe, UHD Graphics
- NVIDIA GTX/RTX series
- AMD Radeon RX series
Documentation
- Getting Started: Articles
- API Reference: Documentation
- Examples: See the examples directory
Contributing
Contributions are welcome! Please:
- Fork the repository
- Create a feature branch
- Submit a pull request
For bugs or feature requests, please open an issue.
License
Amplifier.NET is released under the MIT License.
Amplifier.NET — Unlock the power of GPU computing in pure C#.
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
| .NET Core | netcoreapp3.0 was computed. netcoreapp3.1 was computed. |
| .NET Standard | netstandard2.1 is compatible. |
| MonoAndroid | monoandroid was computed. |
| MonoMac | monomac was computed. |
| MonoTouch | monotouch was computed. |
| Tizen | tizen60 was computed. |
| Xamarin.iOS | xamarinios was computed. |
| Xamarin.Mac | xamarinmac was computed. |
| Xamarin.TVOS | xamarintvos was computed. |
| Xamarin.WatchOS | xamarinwatchos was computed. |
- .NETStandard 2.1
  - Humanizer.Core (>= 3.0.1)
  - ICSharpCode.Decompiler (>= 9.1.0.7988)
  - Newtonsoft.Json (>= 13.0.4)
  - Newtonsoft.Json.Bson (>= 1.0.3)
  - System.Collections.Immutable (>= 10.0.1)
  - System.Reflection.Metadata (>= 10.0.1)
  - System.ValueTuple (>= 4.6.1)
NuGet packages (1)
Showing the top 1 NuGet packages that depend on Amplifier.NET:
| Package | Downloads |
|---|---|
| SuperchargedArray: An extended version of the Array to accelerate operation, easy to use, multi dimensional. With SuperchargedArray.Accelerated namespace you will unlock SIMD potential to run the Array operation on any hardware like Intel CPU/GPU, NVIDIA, AMD etc. | |
GitHub repositories
This package is not used by any popular GitHub repositories.
Amplifier.NET 2.0 Release Notes
December 2024
OpenCL 3.0 Support — Added complete bindings for OpenCL 2.0, 2.1, 2.2, and 3.0, including Shared Virtual Memory (SVM), Pipes, SPIR-V program loading, and subgroup operations
Improved C# 10 Compatibility — Fixed code translator to properly handle file-scoped namespaces (namespace Foo.Bar;), preventing malformed OpenCL kernel output
Enhanced Code Translation — Fixed struct generation issues including globalstruct spacing, StructLayout attribute removal, and proper handling of float.MinValue/float.MaxValue constants
Decompiler Stability — Fixed NullReferenceException in DynamicCallSiteTransform that occurred when processing certain dynamic call sites
Comprehensive Test Suite — Added 35+ example kernels covering vector math, matrix operations, particle physics, image processing, neural network operations, and parallel reductions
New Struct Types — Introduced 8 new struct examples (Float3, Matrix4x4F, Particle, ColorRGBA, ComplexF, BoundingBox, HistogramBin, NeuronWeights) demonstrating complex GPU data patterns
Backward Compatible — No breaking changes; existing 1.x code works without modification