DotCompute.Core 0.4.2-rc2

This is a prerelease version of DotCompute.Core.
There is a newer version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package DotCompute.Core --version 0.4.2-rc2

Package Manager Console (Visual Studio; uses the NuGet module's Install-Package):
NuGet\Install-Package DotCompute.Core -Version 0.4.2-rc2

PackageReference (copy this XML node into the project file):
<PackageReference Include="DotCompute.Core" Version="0.4.2-rc2" />

Central Package Management (copy into the solution Directory.Packages.props, then reference without a version in the project file):
<PackageVersion Include="DotCompute.Core" Version="0.4.2-rc2" />
<PackageReference Include="DotCompute.Core" />

Paket:
paket add DotCompute.Core --version 0.4.2-rc2

F# Interactive / Polyglot Notebooks:
#r "nuget: DotCompute.Core, 0.4.2-rc2"

C# file-based apps (.NET 10 preview 4 and later; place before any lines of code):
#:package DotCompute.Core@0.4.2-rc2

Cake Addin:
#addin nuget:?package=DotCompute.Core&version=0.4.2-rc2&prerelease

Cake Tool:
#tool nuget:?package=DotCompute.Core&version=0.4.2-rc2&prerelease

DotCompute.Core

Core runtime and orchestration engine for the DotCompute compute acceleration framework.

Status: ✅ Production Ready (v0.3.0-rc1)

The Core runtime provides comprehensive infrastructure for compute acceleration:

  • Kernel Execution Management: Complete kernel compilation and execution pipeline
  • Accelerator Discovery: Automatic detection and lifecycle management
  • Service Orchestration: Dependency injection integration
  • Performance Monitoring: OpenTelemetry-based telemetry infrastructure
  • Debugging Services: Cross-backend validation and profiling
  • Optimization Engine: ML-powered backend selection and workload optimization
  • Pipeline System: Execution pipeline management with optimization
  • Recovery System: Fault tolerance and error recovery
  • Native AOT: Full Native AOT compatibility

Key Components

Compute Orchestration

IComputeOrchestrator

Universal kernel execution interface providing:

  • Backend-agnostic kernel execution
  • Automatic accelerator selection
  • Type-safe parameter binding
  • Asynchronous execution model
  • Error handling and recovery
Kernel Execution Service

Runtime kernel orchestration:

  • Generated kernel discovery
  • Automatic kernel registration
  • Execution context management
  • Performance profiling
  • Result materialization

Kernel Management

Kernel Definition System
  • KernelDefinition: Metadata and source code representation
  • KernelSource: Source code abstraction with language detection
  • Kernel Compilation: Multi-backend compilation pipeline
  • Kernel Validation: Static analysis and validation
  • Kernel Caching: Compiled kernel caching with TTL
Compiled Kernel Execution
  • ICompiledKernel: Compiled kernel interface
  • Parameter Binding: Type-safe argument binding
  • Launch Configuration: Grid/block dimension specification
  • Memory Management Integration: Automatic buffer handling
  • Synchronization: Explicit and implicit synchronization
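
The flow from definition to launch can be sketched as follows. This is a hypothetical sketch: KernelDefinition, KernelSource, and ICompiledKernel appear in this README, but the constructor shapes, KernelArguments, LaunchConfiguration, and the CompileKernelAsync/ExecuteAsync calls are assumptions and may differ from the actual API.

```csharp
using DotCompute.Core;

// Define a kernel from source text (language detection per KernelSource)
var definition = new KernelDefinition(
    name: "VectorAdd",
    source: new KernelSource(kernelSourceText));

// Compile for a given accelerator; compiled kernels are cached with TTL
ICompiledKernel kernel = await accelerator.CompileKernelAsync(definition);

// Launch with explicit grid/block dimensions, then synchronize
await kernel.ExecuteAsync(
    new KernelArguments { ["a"] = bufA, ["b"] = bufB, ["out"] = bufOut },
    new LaunchConfiguration { GridSize = 4096, BlockSize = 256 });

await accelerator.SynchronizeAsync();
```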

Accelerator Management

Accelerator Discovery

Automatic backend detection:

  • Platform capability detection
  • Hardware enumeration
  • Driver version checking
  • Compute capability validation
  • Multi-GPU/device support
Accelerator Lifecycle
  • Initialization: Device context creation
  • Resource Management: Memory and compute resources
  • Synchronization: Cross-device synchronization
  • Cleanup: Proper resource disposal
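
A rough sketch of this lifecycle, using the IAcceleratorDiscoveryService from the Troubleshooting section; the members on the accelerator itself (async disposal, SynchronizeAsync) are assumptions:

```csharp
using Microsoft.Extensions.DependencyInjection;

var discovery = host.Services.GetRequiredService<IAcceleratorDiscoveryService>();
var accelerators = await discovery.DiscoverAsync();

// Initialization: create a device context and guarantee disposal (cleanup phase)
await using var accelerator = accelerators.First();

// ... enqueue kernel work against the accelerator ...

// Cross-device synchronization point before teardown (member name assumed)
await accelerator.SynchronizeAsync();
```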

Debugging and Validation

Cross-Backend Debugging

Production-ready debugging system with 8 methods:

  • CompareBackends: CPU vs GPU result validation
  • ValidateKernelExecution: Output correctness verification
  • AnalyzePerformance: Performance profiling
  • InspectMemory: Memory pattern analysis
  • TestDeterminism: Reproducibility testing
  • FindOptimalConfig: Parameter tuning
  • SimulateFailures: Fault injection testing
  • GenerateDiagnostics: Comprehensive diagnostics
Debug Profiles
  • Development: Extensive validation and logging
  • Testing: Balance of validation and performance
  • Production: Minimal overhead with targeted checks
  • Custom: User-defined debugging strategies
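
The eight methods above live on IKernelDebugService (shown in the Usage section). A hedged sketch of calling two of them; the Async suffixes, parameter shapes, and result properties are assumptions:

```csharp
using DotCompute.Core.Debugging;

var debug = host.Services.GetRequiredService<IKernelDebugService>();

// CompareBackends: run the kernel on CPU and GPU and diff the outputs
var comparison = await debug.CompareBackendsAsync("VectorAdd", parameters);
Console.WriteLine($"Backends agree: {comparison.ResultsMatch}");

// TestDeterminism: repeat execution and check for reproducible output
var determinism = await debug.TestDeterminismAsync("VectorAdd", parameters, runs: 10);
Console.WriteLine($"Deterministic: {determinism.IsDeterministic}");
```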

Optimization Engine

Adaptive Backend Selection

ML-powered backend selection with 4 optimization levels:

  • Conservative: Stable, proven configurations
  • Balanced: Performance vs reliability tradeoff
  • Aggressive: Maximum performance optimization
  • ML-Optimized: Machine learning-based selection
Workload Analysis
  • Workload Characteristics: Pattern recognition
  • Performance Prediction: Execution time estimation
  • Resource Utilization: Memory and compute requirements
  • Backend Affinity: Optimal backend recommendation
Optimization Strategies
  • Hardware Profiling: Capability-based selection
  • Cost-Based: Execution cost minimization
  • ML-Based: Learning from execution patterns
  • Hybrid: Combination of multiple strategies

Pipeline System

Execution Pipelines

Comprehensive pipeline management:

  • Pipeline Definition: DAG-based execution graphs
  • Stage Composition: Kernel chains and parallel stages
  • Pipeline Optimization: Automatic graph optimization
  • Memory Management: Inter-stage buffer management
  • Error Handling: Pipeline-wide error recovery
Pipeline Profiling
  • Metrics Collection: Per-stage performance metrics
  • Bottleneck Analysis: Critical path identification
  • Resource Tracking: Memory and compute utilization
  • Optimization Recommendations: Auto-tuning suggestions

Telemetry and Observability

OpenTelemetry Integration

Production-grade observability:

  • Distributed Tracing: Execution flow tracking
  • Metrics Collection: Performance counters
  • Logging Integration: Structured logging
  • Prometheus Export: Metrics endpoint
  • OTLP Support: OpenTelemetry Protocol
Performance Metrics
  • Kernel Execution Time: Per-kernel timing
  • Memory Transfer Bandwidth: Host-device transfer rates
  • Throughput: Operations per second
  • Resource Utilization: CPU/GPU usage
  • Queue Depth: Command queue metrics

Recovery and Fault Tolerance

Fault Detection
  • Compilation Failures: Kernel compilation errors
  • Execution Failures: Runtime kernel failures
  • Memory Errors: Out-of-memory, allocation failures
  • Device Errors: GPU hangs, driver issues
Recovery Strategies
  • Retry with Backoff: Automatic retry logic
  • Fallback Backends: CPU fallback for GPU failures
  • Parameter Adjustment: Reduce resource requirements
  • Graceful Degradation: Continue with reduced capability

Security Features

Kernel Validation
  • Source Code Scanning: Detect unsafe patterns
  • Resource Limits: Prevent resource exhaustion
  • Sandboxing: Isolated execution environments
  • Access Control: Permission-based execution
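
These protections would typically be switched on where the runtime is configured. The option names below are illustrative assumptions, not confirmed API:

```csharp
builder.Services.AddDotComputeRuntime(options =>
{
    // Hypothetical security settings; property names are assumptions
    options.EnableKernelValidation = true;            // source code scanning
    options.EnableSandboxing = true;                  // isolated execution
    options.MaxMemoryPerKernel = 512L * 1024 * 1024;  // resource limit: 512 MB
});
```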

Installation

dotnet add package DotCompute.Core --version 0.4.2-rc2

Usage

Basic Service Configuration

using DotCompute.Core;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Hosting;

// Configure services
var builder = Host.CreateApplicationBuilder(args);

// Add DotCompute runtime
builder.Services.AddDotComputeRuntime(options =>
{
    options.EnableTelemetry = true;
    options.DefaultAccelerator = AcceleratorType.Auto;
    options.EnableDebugValidation = false; // Production
});

var host = builder.Build();

// Get orchestrator service
var orchestrator = host.Services.GetRequiredService<IComputeOrchestrator>();

Kernel Execution

using DotCompute.Core;

// Execute kernel using orchestrator
var result = await orchestrator.ExecuteKernelAsync<float[], float[]>(
    "VectorAdd",
    new { a = dataA, b = dataB, length = 1_000_000 }
);

// Result is automatically materialized
Console.WriteLine($"First result: {result[0]}");

Debug-Enabled Orchestration

using DotCompute.Core.Debugging;

// Enable debugging for development
builder.Services.AddProductionDebugging(options =>
{
    options.Profile = DebugProfile.Development;
    options.EnableCrossBackendValidation = true;
    options.ValidateAllExecutions = true;
    options.CollectPerformanceMetrics = true;
});

// Orchestrator automatically validates all executions
var result = await orchestrator.ExecuteKernelAsync<float[], float[]>(
    "MyKernel",
    parameters
);

// Debug service provides detailed diagnostics
var debugService = host.Services.GetRequiredService<IKernelDebugService>();
var diagnostics = await debugService.GenerateDiagnosticsAsync("MyKernel", parameters);
Console.WriteLine(diagnostics.Summary);

Performance Optimization

using DotCompute.Core.Optimization;

// Enable ML-based optimization
builder.Services.AddProductionOptimization(options =>
{
    options.OptimizationStrategy = OptimizationStrategy.Aggressive;
    options.EnableMachineLearning = true;
    options.EnableAdaptiveSelection = true;
});

// Orchestrator automatically selects optimal backend
var result = await orchestrator.ExecuteKernelAsync<float[], float[]>(
    "ComplexKernel",
    largeDataset
);

// Get optimization insights
var optimizer = host.Services.GetRequiredService<IAdaptiveBackendSelector>();
var recommendation = await optimizer.SelectBackendAsync(
    new WorkloadCharacteristics
    {
        DataSize = largeDataset.Length,
        ComputeIntensity = ComputeIntensity.High,
        MemoryIntensive = true
    }
);

Console.WriteLine($"Recommended: {recommendation.Backend}");
Console.WriteLine($"Confidence: {recommendation.Confidence:P2}");

Pipeline Execution

using DotCompute.Core.Pipelines;

// Build execution pipeline
var pipeline = await pipelineBuilder
    .AddStage("Load", loadKernel, new { path = inputPath })
    .AddStage("Transform", transformKernel)
    .AddStage("Reduce", reduceKernel)
    .AddStage("Save", saveKernel, new { path = outputPath })
    .WithOptimization()
    .WithProfiling()
    .BuildAsync();

// Execute pipeline
var result = await pipeline.ExecuteAsync();

// Get profiling results
var profiler = host.Services.GetRequiredService<IPipelineProfiler>();
var metrics = await profiler.GetMetricsAsync(pipeline.Id);

foreach (var stage in metrics.Stages)
{
    Console.WriteLine($"{stage.Name}: {stage.Duration.TotalMilliseconds}ms");
}

Telemetry Configuration

using DotCompute.Core.Telemetry;
using OpenTelemetry.Metrics;
using OpenTelemetry.Trace;

builder.Services.AddOpenTelemetry()
    .WithMetrics(metrics =>
    {
        metrics.AddDotComputeInstrumentation();
        metrics.AddPrometheusExporter();
    })
    .WithTracing(tracing =>
    {
        tracing.AddDotComputeInstrumentation();
        tracing.AddOtlpExporter();
    });

// Metrics automatically collected:
// - dotcompute.kernel.executions (counter)
// - dotcompute.kernel.duration (histogram)
// - dotcompute.memory.allocated (counter)
// - dotcompute.memory.transferred (counter)

Recovery Configuration

using DotCompute.Core.Recovery;

builder.Services.Configure<RecoveryOptions>(options =>
{
    options.EnableAutoRecovery = true;
    options.MaxRetries = 3;
    options.RetryDelay = TimeSpan.FromMilliseconds(100);
    options.FallbackToCPU = true;
    options.CollectFailureStatistics = true;
});

// Automatic recovery on failures
try
{
    var result = await orchestrator.ExecuteKernelAsync(kernelName, parameters);
}
catch (ComputeException ex)
{
    // After exhausting retries and fallbacks
    Console.WriteLine($"Execution failed: {ex.Message}");
    Console.WriteLine($"Retries attempted: {ex.RetryCount}");
}

Architecture

Service Layer

Application
    ↓
IComputeOrchestrator (High-level API)
    ↓
KernelExecutionService (Orchestration)
    ↓
├── Accelerator Management (Device selection)
├── Kernel Discovery (Generated kernels)
├── Kernel Compilation (Backend-specific)
├── Memory Management (Buffer allocation)
├── Execution Pipeline (Kernel launch)
├── Telemetry Collection (Metrics)
└── Error Recovery (Fault handling)

Component Integration

DotCompute.Core
├── Abstractions Layer (Interfaces)
├── Compute Engine (Execution)
├── Memory Management (Buffers)
├── Debugging Services (Validation)
├── Optimization Engine (Backend selection)
├── Pipeline System (Workflow)
├── Telemetry System (Observability)
├── Recovery System (Fault tolerance)
└── Security System (Validation)

Configuration Options

Runtime Options

public class DotComputeRuntimeOptions
{
    public AcceleratorType DefaultAccelerator { get; set; } = AcceleratorType.Auto;
    public bool EnableTelemetry { get; set; } = true;
    public bool EnableDebugValidation { get; set; } = false;
    public bool EnableAutoOptimization { get; set; } = true;
    public bool EnableRecovery { get; set; } = true;
    public LogLevel MinimumLogLevel { get; set; } = LogLevel.Information;
}

Debug Options

public class DebugOptions
{
    public DebugProfile Profile { get; set; } = DebugProfile.Production;
    public bool EnableCrossBackendValidation { get; set; } = false;
    public bool ValidateAllExecutions { get; set; } = false;
    public bool CollectPerformanceMetrics { get; set; } = true;
    public double ToleranceThreshold { get; set; } = 1e-5;
}

Optimization Options

public class OptimizationOptions
{
    public OptimizationStrategy Strategy { get; set; } = OptimizationStrategy.Balanced;
    public bool EnableMachineLearning { get; set; } = false;
    public bool EnableAdaptiveSelection { get; set; } = true;
    public bool CacheDecisions { get; set; } = true;
    public TimeSpan CacheDuration { get; set; } = TimeSpan.FromMinutes(30);
}

System Requirements

  • .NET 9.0 or later
  • Native AOT compatible runtime (optional)
  • 2GB+ RAM (4GB+ recommended)
  • OpenTelemetry compatible monitoring (optional)

Performance Characteristics

Overhead

  • Orchestration overhead: < 50μs per kernel
  • Debugging overhead (Development): 2-5x execution time
  • Debugging overhead (Production): < 5% execution time
  • Telemetry overhead: < 1% execution time

Scalability

  • Concurrent kernel executions: Unlimited (backend-limited)
  • Pipeline depth: No practical limit
  • Telemetry throughput: 10K+ events/second

Troubleshooting

Accelerator Not Found

Check accelerator availability:

var service = host.Services.GetRequiredService<IAcceleratorDiscoveryService>();
var accelerators = await service.DiscoverAsync();

if (!accelerators.Any())
{
    Console.WriteLine("No accelerators found. Ensure device drivers are installed.");
}

Kernel Compilation Failures

Enable detailed logging:

builder.Logging.SetMinimumLevel(LogLevel.Trace);

Performance Issues

Use profiling:

builder.Services.AddProductionDebugging(options =>
{
    options.CollectPerformanceMetrics = true;
});

Advanced Topics

Custom Accelerator Implementation

Implement IAccelerator and register:

builder.Services.AddSingleton<IAccelerator, CustomAccelerator>();
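
A skeleton of what such an accelerator might look like; IAccelerator's members are not documented in this README, so every signature below is an assumption:

```csharp
using DotCompute.Core;

public sealed class CustomAccelerator : IAccelerator
{
    public string Name => "Custom";

    // Translate a kernel definition into the custom device's executable form
    public ValueTask<ICompiledKernel> CompileKernelAsync(KernelDefinition definition)
        => throw new NotImplementedException();

    // Block until all queued work on the device has completed
    public ValueTask SynchronizeAsync() => ValueTask.CompletedTask;

    public ValueTask DisposeAsync() => ValueTask.CompletedTask;
}
```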

Custom Optimization Strategy

Implement IOptimizationStrategy:

public class CustomStrategy : IOptimizationStrategy
{
    public Task<OptimizationDecision> DecideAsync(WorkloadCharacteristics workload)
    {
        // Custom selection logic; wrap the synchronous decision in a Task
        var decision = new OptimizationDecision();
        return Task.FromResult(decision);
    }
}

Pipeline Optimization

Custom pipeline optimizer:

public class CustomPipelineOptimizer : IPipelineOptimizer
{
    public Task<PipelineExecutionPlan> OptimizeAsync(IPipeline pipeline)
    {
        // Custom optimization logic; produce an execution plan for the pipeline
        var plan = new PipelineExecutionPlan();
        return Task.FromResult(plan);
    }
}

Dependencies

  • DotCompute.Abstractions: Core abstractions
  • Microsoft.Extensions.DependencyInjection: DI infrastructure
  • Microsoft.Extensions.Logging: Logging infrastructure
  • OpenTelemetry: Observability infrastructure
  • System.Diagnostics.DiagnosticSource: Metrics and tracing
  • prometheus-net: Prometheus metrics export

Documentation & Resources

Comprehensive documentation is available for DotCompute:

  • Architecture Documentation
  • Developer Guides
  • Examples
  • API Documentation
  • Support

Contributing

Contributions are welcome in:

  • Additional optimization strategies
  • Performance improvements
  • Platform-specific enhancements
  • Documentation and examples

See CONTRIBUTING.md for guidelines.

License

MIT License - Copyright (c) 2025 Michael Ivertowski

Compatible and additional computed target framework versions:

.NET: net9.0 is compatible. net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows, net10.0, net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, and net10.0-windows were computed.

NuGet packages (10)

Showing the top 5 NuGet packages that depend on DotCompute.Core:

Package Downloads
DotCompute.Plugins

Plugin system for DotCompute compute acceleration framework. Provides hot-reload capability, plugin discovery, circuit breaker patterns, and fault tolerance infrastructure.

DotCompute.Memory

Unified memory management for DotCompute. Provides zero-copy buffers, memory pooling, and cross-device memory transfers.

DotCompute.Backends.CUDA

Production-ready NVIDIA CUDA GPU backend for DotCompute. Provides GPU acceleration (21-92x speedup) through CUDA with NVRTC compilation, P2P transfers, Ring Kernels with NCCL support, and unified memory. Requires CUDA 12.0+ and Compute Capability 5.0+ NVIDIA GPU. Benchmarked on RTX 2000 Ada (CC 8.9).

DotCompute.Backends.CPU

Production-ready CPU compute backend for DotCompute. Provides SIMD vectorization (3.7x faster) using AVX2/AVX512/NEON instructions, multi-threaded kernel execution, and Ring Kernel simulation. Benchmarked: Vector Add (100K elements) 2.14ms → 0.58ms. Native AOT compatible with sub-10ms startup.

DotCompute.Backends.OpenCL

Production-ready OpenCL backend for DotCompute. Cross-platform GPU acceleration for NVIDIA, AMD, Intel, ARM Mali, and Qualcomm Adreno GPUs. Supports OpenCL 1.2+, Ring Kernels with atomic message queues, runtime kernel compilation, and multi-device workload distribution. Works with nvidia-opencl-icd, ROCm, intel-opencl-icd, and vendor drivers.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
0.6.2 166 2/9/2026
0.5.3 351 2/2/2026
0.5.2 703 12/8/2025
0.5.1 664 11/28/2025
0.5.0 264 11/27/2025
0.4.2-rc2 435 11/11/2025
0.4.1-rc2 397 11/6/2025