DotCompute.Backends.CPU.V2 1.0.0-preview3

This is a prerelease version of DotCompute.Backends.CPU.V2.

DotCompute.Backends.CPU

Production-ready CPU compute backend with SIMD vectorization for .NET 9+.

Status: ✅ Production Ready

The CPU backend provides high-performance compute acceleration through:

  • SIMD Vectorization: AVX512/AVX2/NEON instruction sets
  • Multi-threading: Work-stealing thread pool
  • Memory Optimization: NUMA-aware allocation
  • Native AOT: Full compatibility with Native AOT compilation

Key Features

SIMD Acceleration

  • AVX512: Best performance on Intel Ice Lake+ and AMD Zen 4+
  • AVX2: Wide compatibility on modern Intel/AMD processors
  • NEON: ARM64 support for Apple Silicon and ARM servers
  • Automatic Detection: Runtime detection of optimal instruction set
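
The runtime detection described above can be approximated with the hardware intrinsics classes that ship with .NET. This is a minimal sketch, not the backend's actual detection code; `DetectSimdLevel` is a hypothetical helper name, and the preference order (AVX-512, then AVX2, then NEON) is an assumption taken from the list above.

```csharp
using System;
using System.Runtime.Intrinsics.Arm;
using System.Runtime.Intrinsics.X86;

// Hypothetical helper: reports the widest instruction set the runtime exposes.
// Each IsSupported property is evaluated for the machine the process runs on.
static string DetectSimdLevel()
{
    if (Avx512F.IsSupported) return "AVX512"; // x64 with AVX-512 Foundation
    if (Avx2.IsSupported) return "AVX2";      // x64 with AVX2
    if (AdvSimd.IsSupported) return "NEON";   // ARM64 Advanced SIMD
    return "Scalar";                          // no supported vector extension
}

Console.WriteLine($"Detected SIMD level: {DetectSimdLevel()}");
```

Running this on the target machine is a quick way to see which code path a SIMD-aware library is likely to take there.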

Performance

  • 8-23x Speedup: Achieved on vectorizable operations
  • Memory Bandwidth: 95%+ of theoretical peak utilization
  • Thread Scaling: Near-linear scaling to CPU core count
  • Low Overhead: Sub-microsecond kernel launch latency

Installation

dotnet add package DotCompute.Backends.CPU.V2 --version 1.0.0-preview3

Usage

Basic Setup

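using DotCompute.Backends.CPU;
using Microsoft.Extensions.Logging;

// Create a console logger for the accelerator; the factory is disposed at the end of scope.
using var loggerFactory = LoggerFactory.Create(builder => builder.AddConsole());
var logger = loggerFactory.CreateLogger<CpuAccelerator>();

// Construct the CPU accelerator and initialize it before first use.
var accelerator = new CpuAccelerator(logger);
await accelerator.InitializeAsync();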

Service Registration

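// Option 1: register the accelerator type directly.
services.AddSingleton<IAccelerator, CpuAccelerator>();

// Option 2: use the backend's registration extension instead of Option 1.
services.AddCpuBackend();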

Kernel Execution

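// Describe the kernel: a display name, its source text, and the entry-point function.
var kernelDef = new KernelDefinition
{
    Name = "VectorAdd",
    Source = "/* OpenCL C kernel source */",
    EntryPoint = "vector_add"
};

// Compile once, then execute; `parameters` holds the kernel arguments prepared by the caller.
var compiledKernel = await accelerator.CompileKernelAsync(kernelDef);
await compiledKernel.ExecuteAsync(parameters);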

Architecture

SIMD Dispatcher

Automatically selects the best available SIMD instruction set:

  1. Detection: Runtime CPU capability detection
  2. Dispatch: Function pointer selection to optimized kernels
  3. Fallback: Scalar implementation for unsupported hardware
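
The three steps above can be sketched with the portable `Vector<T>` type from `System.Numerics`. This is an illustration under assumptions, not the backend's dispatcher: `AddScalar` and `AddVectorized` are hypothetical names, and a delegate stands in for the function pointer selection the text mentions.

```csharp
using System;
using System.Numerics;

// Fallback: plain scalar loop, correct on any hardware.
static void AddScalar(float[] a, float[] b, float[] result)
{
    for (int i = 0; i < result.Length; i++)
        result[i] = a[i] + b[i];
}

// Vectorized path: handles Vector<float>.Count elements per iteration.
static void AddVectorized(float[] a, float[] b, float[] result)
{
    int width = Vector<float>.Count;
    int i = 0;
    for (; i <= result.Length - width; i += width)
    {
        var sum = new Vector<float>(a, i) + new Vector<float>(b, i);
        sum.CopyTo(result, i);
    }
    for (; i < result.Length; i++) // remainder that does not fill a full vector
        result[i] = a[i] + b[i];
}

// Detection and dispatch: choose the implementation once, then call through the delegate.
Action<float[], float[], float[]> add = AddScalar;
if (Vector.IsHardwareAccelerated)
    add = AddVectorized;

var x = new float[] { 1, 2, 3, 4, 5, 6, 7, 8, 9 };
var y = new float[] { 9, 8, 7, 6, 5, 4, 3, 2, 1 };
var z = new float[x.Length];
add(x, y, z);
Console.WriteLine(string.Join(", ", z)); // every element is 10
```

Selecting the delegate once keeps the capability check out of the hot loop; both paths must produce identical results so the fallback can be used as a correctness reference.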

Thread Pool

  • Work-Stealing: Efficient load balancing across cores
  • Thread-Local Storage: Minimizes synchronization overhead
  • Adaptive Sizing: Scales with workload and system load
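
As a rough illustration of how thread-local state reduces synchronization, the sketch below sums an array with the standard .NET thread pool through `Parallel.ForEach`. It is a stand-in for the backend's own pool, whose API is not shown in this document; `ParallelSum` is a hypothetical helper.

```csharp
using System;
using System.Collections.Concurrent;
using System.Threading.Tasks;

static double ParallelSum(float[] data)
{
    double total = 0;
    object gate = new object();

    Parallel.ForEach(
        Partitioner.Create(0, data.Length),  // contiguous index ranges handed to workers
        () => 0.0,                           // each worker starts with its own partial sum
        (range, _, local) =>
        {
            // No locking here: `local` is private to the current worker.
            for (int i = range.Item1; i < range.Item2; i++)
                local += data[i];
            return local;
        },
        local => { lock (gate) total += local; } // one synchronized merge per worker
    );

    return total;
}

var values = new float[100_000];
Array.Fill(values, 1.0f);
Console.WriteLine(ParallelSum(values)); // 100000
```

The lock is taken once per worker rather than once per element, which is the effect the thread-local storage bullet describes.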

Memory Management

  • NUMA Awareness: Memory allocation respects CPU topology
  • Cache Optimization: Data layout for optimal cache usage
  • Memory Pooling: Reuse allocations to reduce overhead
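
The memory pooling idea can be illustrated with `ArrayPool<T>` from the .NET base library. This is a sketch of the general technique only; the backend's own pool and its NUMA-aware allocation are not shown here, and `FillAndSum` is a hypothetical helper.

```csharp
using System;
using System.Buffers;

static float FillAndSum(int count, float value)
{
    // Rent a buffer from the shared pool instead of allocating a new array.
    float[] buffer = ArrayPool<float>.Shared.Rent(count);
    try
    {
        // Rent may return a larger array than requested, so slice to the requested size.
        Span<float> work = buffer.AsSpan(0, count);
        work.Fill(value);

        float sum = 0;
        foreach (float v in work)
            sum += v;
        return sum;
    }
    finally
    {
        // Always return the buffer so later calls can reuse it.
        ArrayPool<float>.Shared.Return(buffer);
    }
}

Console.WriteLine(FillAndSum(1024, 1.0f)); // 1024
```

Repeated calls reuse the same backing arrays, which avoids allocation and garbage-collection pressure when buffers of similar size are needed for every kernel launch.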

Performance Benchmarks

Tested on Intel Core Ultra 7 165H with 16 threads:

Operation    Elements   CPU Time   SIMD Time  Speedup
Vector Add   1M floats  4.33 ms    187 μs     23x
Matrix Mult  512x512    2,340 ms   89 ms      26x
Dot Product  1M floats  2.1 ms     156 μs     13.4x
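
The Speedup column is the CPU time divided by the SIMD time once both are expressed in the same unit. The short sketch below redoes that arithmetic from the figures in the table; `Speedup` is a hypothetical helper and all times are converted to microseconds.

```csharp
using System;

// Speedup = baseline time / SIMD time, both in microseconds.
static double Speedup(double baselineMicroseconds, double simdMicroseconds)
    => baselineMicroseconds / simdMicroseconds;

var rows = new (string Name, double BaselineUs, double SimdUs)[]
{
    ("Vector Add", 4_330, 187),          // 4.33 ms vs 187 μs
    ("Matrix Mult", 2_340_000, 89_000),  // 2,340 ms vs 89 ms
    ("Dot Product", 2_100, 156),         // 2.1 ms vs 156 μs
};

foreach (var row in rows)
    Console.WriteLine($"{row.Name}: {Speedup(row.BaselineUs, row.SimdUs):F2}x");
```

The ratios come out at roughly 23.2, 26.3, and 13.5, which matches the table's 23x, 26x, and 13.4x once the reported values are read as rounded down.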

System Requirements

Minimum

  • .NET 9.0 or later
  • x64 or ARM64 processor
  • 2GB RAM

Recommended

  • Modern CPU with AVX2+ (Intel Haswell+ / AMD Excavator+)
  • 8+ CPU cores for optimal threading performance
  • 16GB+ RAM for large datasets

Supported Platforms

  • Windows: x64, ARM64
  • Linux: x64, ARM64
  • macOS: x64 (Intel), ARM64 (Apple Silicon)

Build Configuration

The CPU backend automatically configures itself based on the target platform:

<PropertyGroup Condition="'$(TargetArchitecture)' == 'x64'">
  <DefineConstants>$(DefineConstants);ENABLE_AVX2;ENABLE_AVX512</DefineConstants>
</PropertyGroup>

<PropertyGroup Condition="'$(TargetArchitecture)' == 'arm64'">
  <DefineConstants>$(DefineConstants);ENABLE_NEON</DefineConstants>
</PropertyGroup>

Troubleshooting

Performance Issues

  1. Check SIMD Support: Verify CPU supports AVX2/AVX512
  2. Memory Alignment: Ensure data is properly aligned for SIMD
  3. Thread Count: Match thread count to physical cores
  4. Memory Bandwidth: Monitor memory utilization during execution
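
Steps 2 and 3 can be checked from a small console program. The sketch below is one possible approach under stated assumptions: it uses a pinned array so the address is stable, takes 32 bytes (one AVX2 register) as the alignment target, and uses the hypothetical helpers `FirstAlignedIndex` and `IsAligned`. Note that `Environment.ProcessorCount` reports logical processors, which can exceed the physical core count on CPUs with simultaneous multithreading.

```csharp
using System;
using System.Runtime.InteropServices;

// Index of the first element whose address is a multiple of alignmentBytes.
// The array must be pinned so that its address cannot change.
static int FirstAlignedIndex(float[] pinned, int alignmentBytes)
{
    long address = Marshal.UnsafeAddrOfPinnedArrayElement(pinned, 0).ToInt64();
    long misalignment = address % alignmentBytes;
    return misalignment == 0 ? 0 : (int)((alignmentBytes - misalignment) / sizeof(float));
}

// True when the element at `index` starts on an alignmentBytes boundary.
static bool IsAligned(float[] pinned, int index, int alignmentBytes)
    => Marshal.UnsafeAddrOfPinnedArrayElement(pinned, index).ToInt64() % alignmentBytes == 0;

// Allocate on the pinned object heap (available since .NET 5).
float[] data = GC.AllocateArray<float>(1024, pinned: true);
int start = FirstAlignedIndex(data, 32);

Console.WriteLine($"First 32-byte aligned index: {start}");
Console.WriteLine($"Aligned at that index: {IsAligned(data, start, 32)}");
Console.WriteLine($"Logical processors: {Environment.ProcessorCount}");
```

Starting vector loads at the reported index, and handling the leading elements with scalar code, is a common way to keep the main loop on aligned addresses.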

Compatibility Issues

  1. Native AOT: Ensure all types are AOT-compatible
  2. Platform Support: Verify target platform support
  3. Dependencies: Check for missing runtime dependencies

Documentation & Resources

Comprehensive documentation is available for DotCompute:

  • Architecture Documentation
  • Developer Guides
  • Examples
  • API Documentation

Support

Contributing

The CPU backend welcomes contributions in:

  • New SIMD instruction set support (e.g., AVX-512 variants)
  • Platform-specific optimizations
  • Kernel compilation improvements
  • Performance benchmarks and analysis

See CONTRIBUTING.md for guidelines.

Target Frameworks

  • Compatible: net10.0
  • Computed: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows

NuGet packages (1)

Showing the top 1 NuGet packages that depend on DotCompute.Backends.CPU.V2:

Package Downloads
DotCompute.Linq.V2

GPU-accelerated LINQ extensions for DotCompute. Transparent GPU execution for LINQ queries with automatic kernel generation, fusion optimization, and Reactive Extensions support.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version         Downloads  Last Updated
1.0.0-preview3  69         4/21/2026
1.0.0-preview2  116        4/21/2026