NeuralCodecs 0.3.1

Installation

.NET CLI:
dotnet add package NeuralCodecs --version 0.3.1

Package Manager Console (Visual Studio; uses the NuGet module's version of Install-Package):
NuGet\Install-Package NeuralCodecs -Version 0.3.1

PackageReference (for projects that support it, copy this XML node into the project file):
<PackageReference Include="NeuralCodecs" Version="0.3.1" />

Central Package Management (CPM): copy the first XML node into the solution's Directory.Packages.props file to version the package, then reference it from the project file.
Directory.Packages.props:
<PackageVersion Include="NeuralCodecs" Version="0.3.1" />
Project file:
<PackageReference Include="NeuralCodecs" />

Paket CLI:
paket add NeuralCodecs --version 0.3.1

F# Interactive and Polyglot Notebooks (copy the #r directive into the interactive tool or the script source):
#r "nuget: NeuralCodecs, 0.3.1"

Cake Addin:
#addin nuget:?package=NeuralCodecs&version=0.3.1

Cake Tool:
#tool nuget:?package=NeuralCodecs&version=0.3.1

NeuralCodecs is a .NET library that implements neural audio codecs for efficient audio compression and reconstruction.

Features

  • SNAC: Multi-Scale Neural Audio Codec
    • Support for multiple sampling rates: 24kHz, 32kHz, and 44.1kHz
    • Attention mechanisms with adjustable window sizes for improved quality
    • Automatic resampling for input flexibility
  • DAC: Descript Audio Codec
    • Supports multiple sampling rates: 16kHz, 24kHz, and 44.1kHz
    • Configurable encoder/decoder architecture with variable rates
    • Flexible bitrate configurations from 8kbps to 16kbps
  • Encodec: Meta's Encodec neural audio compression
    • Supports stereo audio at 24kHz and 48kHz sample rates
    • Variable bitrate compression (1.5-24 kbps)
    • Neural language model for enhanced compression quality
    • Direct file compression to .ecdc format
  • AudioTools: Advanced audio processing utilities
    • Based on Descript's audiotools Python package
    • Extended with .NET-specific optimizations and additional features
    • Audio filtering, transformation, and effects processing
    • Works with Descript's AudioSignal or Tensors
  • Audio Visualization: Example project includes spectrogram generation and comparison tools
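
To put the bitrates listed above in perspective, the sketch below compares them with uncompressed 16-bit PCM. It is plain arithmetic and does not call NeuralCodecs; the helper names PcmKbps and CompressionRatio are made up for this illustration, and 16-bit PCM is an assumed baseline.

```csharp
using System;

// Bitrate of uncompressed PCM audio in kbps (1 kbps = 1000 bits per second).
static double PcmKbps(int sampleRate, int channels, int bitsPerSample) =>
    sampleRate * (double)channels * bitsPerSample / 1000.0;

// How many times smaller the codec stream is than the PCM stream.
static double CompressionRatio(double pcmKbps, double codecKbps) => pcmKbps / codecKbps;

double mono24k = PcmKbps(24000, 1, 16);    // 384 kbps
double stereo48k = PcmKbps(48000, 2, 16);  // 1536 kbps

Console.WriteLine($"24 kHz mono PCM: {mono24k} kbps, {CompressionRatio(mono24k, 6.0)}x smaller at 6 kbps");
Console.WriteLine($"48 kHz stereo PCM: {stereo48k} kbps, {CompressionRatio(stereo48k, 24.0)}x smaller at 24 kbps");
```

Both cases work out to a 64x reduction, which is why the choice of target bitrate matters more than the raw sample rate when estimating output size.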

Requirements

  • .NET 8.0 or later
  • TorchSharp or libTorch compatible with your platform
  • NAudio (for audio processing)
  • SkiaSharp (for visualization features)

Usage

Creating/loading the model

There are several ways to load a model:

  1. Using the static factory method:
// Load SNAC model with static method provided for built-in models
var model = await NeuralCodecs.CreateSNACAsync("model.pt");
  2. Using a premade config:
    SNACConfig provides premade configurations for 24kHz, 32kHz, and 44.1kHz sampling rates.
var model = await NeuralCodecs.CreateSNACAsync(modelPath, SNACConfig.SNAC24Khz);
  3. Using an IModelLoader instance with the default config:
    Allows the use of custom loader implementations.
// Load model with default config from IModelLoader instance
var torchLoader = NeuralCodecs.CreateTorchLoader();
var model = await torchLoader.LoadModelAsync<SNAC, SNACConfig>("model.pt");
  4. Using an IModelLoader instance with a custom config:
// For Encodec with custom bandwidth and settings
var encodecConfig = new EncodecConfig {
    SampleRate = 48000,
    Bandwidth = 12.0f,
    Channels = 2,  // Stereo audio
    Normalize = true
};
var encodecModel = await torchLoader.LoadModelAsync<Encodec, EncodecConfig>("encodec_model.pt", encodecConfig);
  5. Using the factory method for custom models:
    Allows the use of custom model implementations with built-in or custom loaders.
// Load custom model with factory method
var model = await torchLoader.LoadModelAsync<CustomModel, CustomConfig>(
    "model.pt",
    config => new CustomModel(config, ...),
    config);

Models can be loaded from PyTorch or safetensors files.

AudioTools Features

The AudioTools namespace provides extensive audio processing capabilities:

const int sampleRate = 48000; // Sample rate of the audio tensor, in Hz
var audio = new Tensor(...); // Load or create audio tensor

// Apply effects
var processedAudio = AudioEffects.ApplyCompressor(
    audio,
    sampleRate: sampleRate,
    threshold: -20f,
    ratio: 4.0f);

// Compute spectrograms and transforms
var spectrogram = DSP.MelSpectrogram(audio, sampleRate);
var stft = DSP.STFT(audio, windowSize: 1024, hopSize: 512, windowType: "hann");
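
For readers unfamiliar with the threshold and ratio parameters passed to the compressor above, here is a minimal sketch of the textbook static compressor curve. It is not taken from the AudioTools source, which may add attack, release, or knee handling; GainReductionDb and DbToLinear are hypothetical helper names.

```csharp
using System;

// Static compressor curve: the part of the level above the threshold is divided by the ratio.
// Returns the gain change in dB (zero or negative) for a given input level in dB.
static double GainReductionDb(double inputDb, double thresholdDb, double ratio)
{
    if (inputDb <= thresholdDb) return 0.0;  // below the threshold: left unchanged
    double outputDb = thresholdDb + (inputDb - thresholdDb) / ratio;
    return outputDb - inputDb;
}

// Convert a gain in dB to a linear multiplier for sample amplitudes.
static double DbToLinear(double db) => Math.Pow(10.0, db / 20.0);

// A peak at -8 dB is 12 dB over a -20 dB threshold; at 4:1 only 3 dB of that remains.
double reduction = GainReductionDb(inputDb: -8.0, thresholdDb: -20.0, ratio: 4.0); // -9 dB
Console.WriteLine($"Gain change: {reduction} dB, linear factor: {DbToLinear(reduction):F3}");
```

With threshold -20 and ratio 4, material quieter than -20 dB passes through untouched, and louder material is pulled toward the threshold.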

Encoding and Decoding Audio

There are two main ways to process audio:

  1. Using the simplified ProcessAudio method:
// Encode and decode audio in one step
var processedAudio = model.ProcessAudio(audioData, sampleRate);
  2. Using separate encode and decode steps:
// Encode audio to compressed format
var codes = model.Encode(buffer);

// Decode back to audio
var processedAudio = model.Decode(codes);

Saving the processed audio

Use your preferred method to save WAV files, for example:

// Using NAudio (NAudio.Wave namespace)
await using var writer = new WaveFileWriter(
    outputPath,
    new WaveFormat(model.Config.SamplingRate, channels: model.Channels)
);
writer.WriteSamples(processedAudio, 0, processedAudio.Length);
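
If you would rather not depend on NAudio just to write output, a 16-bit PCM WAV file can be produced with the base class library alone. The sketch below is one possible way to do it, not part of NeuralCodecs; WriteWav16 is a hypothetical helper, and the generated tone stands in for decoded audio.

```csharp
using System;
using System.IO;
using System.Text;

// Writes float samples in [-1, 1] as a 16-bit PCM WAV file (interleaved if multi-channel).
static void WriteWav16(string path, float[] samples, int sampleRate, int channels)
{
    const int bitsPerSample = 16;
    int blockAlign = channels * bitsPerSample / 8;
    int dataBytes = samples.Length * 2;

    using var stream = File.Create(path);
    using var writer = new BinaryWriter(stream, Encoding.ASCII); // BinaryWriter is little-endian

    writer.Write(Encoding.ASCII.GetBytes("RIFF"));
    writer.Write(36 + dataBytes);            // size of everything after this field
    writer.Write(Encoding.ASCII.GetBytes("WAVE"));
    writer.Write(Encoding.ASCII.GetBytes("fmt "));
    writer.Write(16);                        // fmt chunk size for PCM
    writer.Write((short)1);                  // format tag 1 = PCM
    writer.Write((short)channels);
    writer.Write(sampleRate);
    writer.Write(sampleRate * blockAlign);   // byte rate
    writer.Write((short)blockAlign);
    writer.Write((short)bitsPerSample);
    writer.Write(Encoding.ASCII.GetBytes("data"));
    writer.Write(dataBytes);

    foreach (float s in samples)
    {
        float clamped = Math.Clamp(s, -1f, 1f);  // keep loud samples inside the 16-bit range
        writer.Write((short)(clamped * short.MaxValue));
    }
}

// One second of a 440 Hz tone at 24 kHz as stand-in data.
var tone = new float[24000];
for (int i = 0; i < tone.Length; i++)
    tone[i] = 0.5f * MathF.Sin(2f * MathF.PI * 440f * i / 24000f);

string demoPath = Path.Combine(Path.GetTempPath(), "neuralcodecs_demo.wav");
WriteWav16(demoPath, tone, sampleRate: 24000, channels: 1);
Console.WriteLine($"Wrote {new FileInfo(demoPath).Length} bytes to {demoPath}");
```

The clamp matters for codec output in particular, since reconstructed audio can slightly overshoot the [-1, 1] range.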

Encodec-Specific Features

Encodec provides additional capabilities:

// Set target bandwidth for compression (supported values depend on model)
encodecModel.SetTargetBandwidth(12.0f); // 12 kbps

// Get available bandwidth options
var availableBandwidths = encodecModel.TargetBandwidths; // e.g. [1.5, 3, 6, 12, 24]

// Use language model for enhanced compression quality
var lm = await encodecModel.GetLanguageModel();
// Apply LM during encoding/decoding for better quality

// Direct file compression
await EncodecCompressor.CompressToFileAsync(encodecModel, audioTensor, "audio.ecdc", useLm: true);

// Decompress from file
var (decompressedAudio, sampleRate) = await EncodecCompressor.DecompressFromFileAsync("audio.ecdc");
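
The bandwidth options follow from how many residual codebooks the quantizer uses. The sketch below works through that arithmetic with values from the published EnCodec 24 kHz model (encoder stride 320, codebooks of 1024 entries). Whether this port uses exactly the same values is an assumption, and CodebooksFor is a hypothetical helper, not a NeuralCodecs API.

```csharp
using System;

// Number of residual codebooks that fit into a target bandwidth.
// rate: sample rate in Hz; hop: total encoder stride in samples; bins: entries per codebook.
static int CodebooksFor(double bandwidthKbps, int rate, int hop, int bins)
{
    double framesPerSecond = (double)rate / hop;                   // 24000 / 320 = 75
    double bitsPerCode = Math.Log2(bins);                          // 1024 entries = 10 bits
    double bitsPerSecondPerCodebook = framesPerSecond * bitsPerCode; // 750 bits/s
    int count = (int)Math.Floor(bandwidthKbps * 1000.0 / bitsPerSecondPerCodebook);
    return Math.Max(1, count);                                     // always keep one codebook
}

// Assumed parameters of the 24 kHz model; adjust if the loaded model differs.
foreach (double bw in new[] { 1.5, 3.0, 6.0, 12.0, 24.0 })
    Console.WriteLine($"{bw} kbps -> {CodebooksFor(bw, rate: 24000, hop: 320, bins: 1024)} codebooks");
```

Under these assumptions each codebook costs 0.75 kbps, so doubling the target bandwidth doubles the number of codebooks used.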

Acknowledgments

Contributing

Suggestions and contributions are welcome! Feel free to submit a pull request.

License

This project is licensed under the MIT License.

Target frameworks

Compatible: net8.0.
Computed as compatible: net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows, net9.0, net9.0-android, net9.0-browser, net9.0-ios, net9.0-maccatalyst, net9.0-macos, net9.0-tvos, net9.0-windows.

Version Downloads Last updated
0.3.1 204 a month ago
0.2.0 110 3 months ago
0.1.5 101 3 months ago
0.1.4 102 3 months ago
0.1.3 105 5 months ago
0.1.1 104 5 months ago
0.1.0 90 5 months ago