DotCompute.Generators 0.6.2

.NET Standard 2.0

DotCompute.Generators

Source generators for the DotCompute framework that enable compile-time code generation for high-performance compute kernels.

Overview

The DotCompute.Generators project provides Roslyn-based source generators that automatically generate optimized backend-specific implementations for compute kernels marked with the [Kernel] or [RingKernel] attributes.

Features

1. KernelSourceGenerator

Incremental source generator using IIncrementalGenerator for optimal performance
Detects methods marked with [Kernel] and [RingKernel] attributes
Generates backend-specific implementations (CPU, CUDA, Metal, OpenCL)
Creates a kernel registry for runtime dispatch
Supports SIMD vectorization, parallel execution, and persistent kernels
Generates message queue infrastructure for Ring Kernels

2. KernelCompilationAnalyzer

Compile-time diagnostics for kernel methods
Validates parameter types and vector sizes
Detects performance issues (nested loops, allocations in loops)
Warns when a kernel method is not declared in an unsafe context (DC0004)
Provides actionable error messages

3. Backend Code Generators

CpuCodeGenerator: Generates optimized CPU implementations with:
- Scalar fallback implementation
- Platform-agnostic SIMD using Vector<T>
- AVX2 optimizations for x86/x64
- AVX-512 optimizations for processors that support it
- Parallel execution with task partitioning
- Automatic hardware capability detection
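
The portable SIMD path with a scalar fallback can be pictured roughly as follows. This is a minimal hand-written sketch, not the generator's actual output: the class name SimdAddSketch and its Add method are hypothetical, and it only assumes the standard System.Numerics.Vector<T> API.

```csharp
using System;
using System.Numerics;

// Hypothetical stand-in for a generated CPU path; not the generator's real output.
public static class SimdAddSketch
{
    public static void Add(ReadOnlySpan<float> a, ReadOnlySpan<float> b, Span<float> result)
    {
        if (a.Length != b.Length || a.Length != result.Length)
            throw new ArgumentException("All spans must have the same length.");

        int i = 0;

        // Platform-agnostic SIMD: process Vector<float>.Count elements per iteration.
        if (Vector.IsHardwareAccelerated)
        {
            int width = Vector<float>.Count;
            for (; i <= a.Length - width; i += width)
            {
                var va = new Vector<float>(a.Slice(i, width));
                var vb = new Vector<float>(b.Slice(i, width));
                (va + vb).CopyTo(result.Slice(i, width));
            }
        }

        // Scalar fallback: handles the tail, or the whole input without SIMD support.
        for (; i < a.Length; i++)
            result[i] = a[i] + b[i];
    }
}
```

The scalar loop doubles as the remainder handler, so inputs whose length is not a multiple of the vector width still produce a complete result.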

Installation

dotnet add package DotCompute.Generators --version 0.6.2

Usage

1. Add the Generator to Your Project

<ItemGroup>
  <ProjectReference Include="..\..\src\DotCompute.Generators\DotCompute.Generators.csproj" 
                    OutputItemType="Analyzer" 
                    ReferenceOutputAssembly="false" />
</ItemGroup>

2. Mark Methods with Kernel or RingKernel Attribute

using DotCompute.Generators.Kernel;

// Standard kernel for one-shot execution
public static unsafe class VectorMath
{
    [Kernel(
        Backends = KernelBackends.CPU | KernelBackends.CUDA,
        VectorSize = 8,
        IsParallel = true,
        Optimizations = OptimizationHints.AggressiveInlining | OptimizationHints.Vectorize)]
    public static void AddVectors(float* a, float* b, float* result, int length)
    {
        for (int i = 0; i < length; i++)
        {
            result[i] = a[i] + b[i];
        }
    }
}

// Ring kernel for persistent GPU-resident computation
public static class GraphAlgorithms
{
    [RingKernel(
        KernelId = "pagerank-vertex",
        Domain = RingKernelDomain.GraphAnalytics,
        Mode = RingKernelMode.Persistent,
        Capacity = 16384, // must be a power of 2 (see Configuration Options)
        Backends = KernelBackends.CUDA | KernelBackends.OpenCL)]
    public static void PageRankVertex(
        IMessageQueue<VertexMessage> incoming,
        IMessageQueue<VertexMessage> outgoing,
        Span<float> pageRank)
    {
        int vertexId = Kernel.ThreadId.X;

        while (incoming.TryDequeue(out var msg))
        {
            if (msg.TargetVertex == vertexId)
                pageRank[vertexId] += msg.Rank;
        }

        // Send to neighbors...
    }
}
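
A pointer-based kernel such as AddVectors needs pinned memory at the call site. The sketch below calls the method directly rather than through any generated invoker, whose API this README does not show. VectorMathCall and AddArrays are hypothetical names, the [Kernel] attribute is omitted so the snippet compiles without the package, and the project is assumed to set <AllowUnsafeBlocks>true</AllowUnsafeBlocks>.

```csharp
using System;

public static unsafe class VectorMathCall
{
    // Same body as the AddVectors kernel above, minus the [Kernel] attribute.
    public static void AddVectors(float* a, float* b, float* result, int length)
    {
        for (int i = 0; i < length; i++)
        {
            result[i] = a[i] + b[i];
        }
    }

    // Hypothetical helper: pins managed arrays and passes raw pointers to the kernel.
    public static float[] AddArrays(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Input arrays must have the same length.");

        var result = new float[a.Length];

        // 'fixed' keeps the GC from moving the arrays while the pointers are in use.
        fixed (float* pa = a, pb = b, pr = result)
        {
            AddVectors(pa, pb, pr, a.Length);
        }

        return result;
    }
}
```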

3. Generated Code

The source generator will create:

Kernel Registry (KernelRegistry.g.cs):
- Catalog of all kernels with metadata
- Runtime lookup capabilities
- Backend support information

CPU Implementation (AddVectors_CPU.g.cs):
- Multiple implementations (scalar, SIMD, AVX2, AVX-512)
- Automatic hardware detection and dispatch
- Parallel execution support

Kernel Invoker (VectorMathInvoker.g.cs):
- Dynamic dispatch based on backend
- Parameter validation
- Type-safe invocation
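
Hardware detection and dispatch between the scalar, SIMD, AVX2, and AVX-512 variants could look something like the sketch below. It is an illustration only, not the contents of AddVectors_CPU.g.cs: CpuPath and CpuPathSelector are hypothetical names, and the code assumes .NET 8 or later, where Avx512F is available in System.Runtime.Intrinsics.X86.

```csharp
using System.Numerics;
using System.Runtime.Intrinsics.X86;

// Hypothetical names; the generated dispatcher may be organized differently.
public enum CpuPath { Scalar, PortableSimd, Avx2, Avx512 }

public static class CpuPathSelector
{
    // Picks the widest instruction set the current CPU reports, falling back step by step.
    public static CpuPath Select()
    {
        if (Avx512F.IsSupported) return CpuPath.Avx512;
        if (Avx2.IsSupported) return CpuPath.Avx2;
        if (Vector.IsHardwareAccelerated) return CpuPath.PortableSimd;
        return CpuPath.Scalar;
    }
}
```

The IsSupported properties are treated as constants by the JIT, so a check of this kind typically costs nothing at run time once the method is compiled.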

Kernel Attribute Options

Standard Kernel Attributes

Backends

CPU: CPU backend with SIMD support
CUDA: NVIDIA GPU backend
Metal: Apple GPU backend
OpenCL: Cross-platform GPU backend
All: All available backends

Optimization Hints

AggressiveInlining: Force method inlining
LoopUnrolling: Unroll loops for better performance
Vectorize: Enable SIMD vectorization
Prefetch: Add memory prefetch hints
FastMath: Use fast math operations (may reduce accuracy)

Memory Access Patterns

Sequential: Linear memory access
Strided: Fixed-stride memory access
Random: Random memory access
Coalesced: GPU-optimized coalesced access
Tiled: Tiled/blocked memory access

RingKernel Attribute Options

Ring Kernels enable persistent GPU computation with message passing capabilities:

Execution Modes

Persistent: Kernel stays active continuously, ideal for streaming workloads
EventDriven: Kernel launches on-demand when messages arrive, conserves resources

Message Passing Strategies

SharedMemory: Lock-free queues in GPU shared memory (fastest for single-GPU)
AtomicQueue: Lock-free queues in global memory with atomics (scalable)
P2P: Direct GPU-to-GPU memory transfers (CUDA only, requires NVLink)
NCCL: Multi-GPU collectives (CUDA only, optimal for distributed workloads)
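
To give a feel for the lock-free queues mentioned above, here is a CPU-side analogue: a single-producer, single-consumer ring buffer. It is a teaching sketch under stated assumptions, not DotCompute's GPU implementation. SpscRingQueue is a hypothetical type, and it is safe only with exactly one producer thread and one consumer thread.

```csharp
using System;
using System.Threading;

// Hypothetical CPU analogue of a lock-free ring queue (one producer, one consumer).
public sealed class SpscRingQueue<T>
{
    private readonly T[] _slots;
    private readonly int _mask;
    private int _head; // next slot to read; written only by the consumer
    private int _tail; // next slot to write; written only by the producer

    public SpscRingQueue(int capacity)
    {
        if (capacity <= 0 || (capacity & (capacity - 1)) != 0)
            throw new ArgumentException("Capacity must be a power of 2.", nameof(capacity));
        _slots = new T[capacity];
        _mask = capacity - 1; // a power-of-2 size lets '&' replace '%' when wrapping
    }

    public bool TryEnqueue(T item)
    {
        int tail = _tail;
        if (tail - Volatile.Read(ref _head) == _slots.Length)
            return false; // queue is full
        _slots[tail & _mask] = item;
        Volatile.Write(ref _tail, tail + 1); // publish only after the slot is written
        return true;
    }

    public bool TryDequeue(out T item)
    {
        int head = _head;
        if (head == Volatile.Read(ref _tail))
        {
            item = default!;
            return false; // queue is empty
        }
        item = _slots[head & _mask];
        Volatile.Write(ref _head, head + 1); // release the slot back to the producer
        return true;
    }
}
```

Index masking is one common reason ring queues require power-of-2 sizes; whether DotCompute's queues use the same scheme is an assumption here.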

Application Domains

General: No domain-specific optimizations
GraphAnalytics: Optimized for irregular memory access patterns (graph algorithms)
SpatialSimulation: Optimized for regular access with halo exchange (physics, fluids)
ActorModel: Optimized for message-heavy workloads with dynamic distribution

Configuration Options

KernelId: Unique identifier for the kernel (required)
Capacity: Maximum concurrent work items (default: 1024, must be power of 2)
InputQueueSize: Size of incoming message queue (default: 256, must be power of 2)
OutputQueueSize: Size of outgoing message queue (default: 256, must be power of 2)
GridDimensions: Number of thread blocks per dimension (auto-calculated if null)
BlockDimensions: Threads per block per dimension (auto-selected if null)
UseSharedMemory: Enable shared memory for thread-block coordination
SharedMemorySize: Shared memory size in bytes per block
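
Capacity, InputQueueSize, and OutputQueueSize must each be a power of 2. A small helper like the one below can validate a value, or round a desired size up, before it goes into the attribute. QueueSizing and its methods are hypothetical and not part of the package.

```csharp
using System;

// Hypothetical helper for choosing valid power-of-2 sizes; not part of DotCompute.
public static class QueueSizing
{
    // A positive integer is a power of 2 when exactly one bit is set.
    public static bool IsPowerOfTwo(int value) => value > 0 && (value & (value - 1)) == 0;

    // Returns the smallest power of 2 that is greater than or equal to value.
    public static int RoundUpToPowerOfTwo(int value)
    {
        if (value < 1 || value > (1 << 30))
            throw new ArgumentOutOfRangeException(nameof(value));

        int result = 1;
        while (result < value)
            result <<= 1;
        return result;
    }
}
```

Attribute arguments must be compile-time constants, so the rounded value has to be written into the attribute as a literal; the helper is only useful for working out that literal or for checks in tests.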

Analyzer Diagnostics

ID	Severity	Description
DC0001	Error	Unsupported type in kernel
DC0002	Error	Kernel method missing buffer parameter
DC0003	Error	Invalid vector size (must be 4, 8, or 16)
DC0004	Warning	Unsafe code context required
DC0005	Warning	Potential performance issue

Architecture

DotCompute.Generators/
├── Kernel/
│   ├── KernelSourceGenerator.cs    # Main generator
│   ├── KernelAttribute.cs          # Attribute definitions
│   ├── KernelCompilationAnalyzer.cs # Compile-time analysis
│   └── AcceleratorType.cs          # Backend enum
├── Backend/
│   └── CpuCodeGenerator.cs         # CPU code generation
├── Models/
│   ├── KernelParameter.cs          # Parameter model
│   └── VectorizationInfo.cs        # Vectorization analysis model
├── Configuration/
│   └── GeneratorConfiguration.cs   # Generator configuration
└── Utils/
    ├── SourceGeneratorHelpers.cs   # Legacy facade (deprecated)
    ├── CodeFormatter.cs             # Code formatting utilities
    ├── ParameterValidator.cs       # Parameter validation
    ├── LoopOptimizer.cs            # Loop optimization
    ├── VectorizationAnalyzer.cs    # Vectorization analysis
    ├── MethodBodyExtractor.cs      # Method body extraction
    └── SimdTypeMapper.cs           # SIMD type mapping

Future Enhancements

CUDA Code Generation
- PTX generation for NVIDIA GPUs
- Shared memory optimization
- Warp-level primitives

Metal Shader Generation
- Metal Shading Language generation
- Compute pipeline setup
- Resource binding

OpenCL Kernel Generation
- OpenCL C kernel generation
- Work-group optimization
- Memory coalescing

Advanced Optimizations
- Auto-vectorization analysis
- Loop fusion and tiling
- Memory layout optimization
- Cache-aware algorithms

Debugging Support
- Source maps for generated code
- Performance counters injection
- Validation code generation

Development Notes

The generator targets .NET Standard 2.0 for compatibility
Uses incremental generation for optimal IDE performance
Follows Roslyn best practices for analyzers
Includes comprehensive unit tests (see tests project)

Documentation & Resources

Comprehensive documentation is available for DotCompute:

Architecture Documentation

Source Generators - Compile-time code generation (12 diagnostic rules, 5 automated fixes)
System Overview - Generator integration in architecture

Developer Guides

Getting Started - Using [Kernel] attributes
Kernel Development - Writing kernels with attributes
Native AOT Guide - Native AOT compatibility

Reference

Diagnostic Rules (DC001-DC012) - Complete analyzer reference with automated fixes

Examples

Basic Vector Operations - [Kernel] attribute usage examples

API Documentation

API Reference - Complete API documentation

Support

Documentation: Comprehensive Guides
Issues: GitHub Issues
Discussions: GitHub Discussions

Product	Compatible and additional computed target framework versions.
.NET	net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 was computed. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.
.NET Core	netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed.
.NET Standard	netstandard2.0 is compatible. netstandard2.1 was computed.
.NET Framework	net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed.
MonoAndroid	monoandroid was computed.
MonoMac	monomac was computed.
MonoTouch	monotouch was computed.
Tizen	tizen40 was computed. tizen60 was computed.
Xamarin.iOS	xamarinios was computed.
Xamarin.Mac	xamarinmac was computed.
Xamarin.TVOS	xamarintvos was computed.
Xamarin.WatchOS	xamarinwatchos was computed.


.NETStandard 2.0
- No dependencies.

NuGet packages (2)

Showing the top 2 NuGet packages that depend on DotCompute.Generators:

Package	Downloads
DotCompute.Backends.CUDA Production-ready NVIDIA CUDA GPU backend for DotCompute. Provides GPU acceleration (21-92x speedup) through CUDA with NVRTC compilation, P2P transfers, Ring Kernels with NCCL support, and unified memory. Requires CUDA 12.0+ and Compute Capability 5.0+ NVIDIA GPU. Benchmarked on RTX 2000 Ada (CC 8.9).	4.2K
Orleans.GpuBridge.Backends.DotCompute DotCompute backend provider for Orleans.GpuBridge.Core - Enables GPU acceleration via CUDA, OpenCL, Metal, and CPU with attribute-based kernel definition.	1.2K

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
0.6.2	196	2/9/2026
0.5.3	417	2/2/2026
0.5.2	702	12/8/2025
0.5.1	656	11/28/2025
0.5.0	246	11/27/2025
0.4.2-rc2	319	11/11/2025
0.4.1-rc2	222	11/6/2025

Total 3.6K

Current version 196

Per day average 16

GPU Compute CUDA Metal Vulkan OpenCL GPGPU HPC AOT