ManySpeech.SpeechFeatures 1.1.7

.NET CLI:
dotnet add package ManySpeech.SpeechFeatures --version 1.1.7

Package Manager. This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package:
NuGet\Install-Package ManySpeech.SpeechFeatures -Version 1.1.7

PackageReference. For projects that support PackageReference, copy this XML node into the project file to reference the package:
<PackageReference Include="ManySpeech.SpeechFeatures" Version="1.1.7" />

Central Package Management (CPM). For projects that support CPM, copy the PackageVersion node into the solution Directory.Packages.props file to version the package, and reference the package from the project file.
Directory.Packages.props:
<PackageVersion Include="ManySpeech.SpeechFeatures" Version="1.1.7" />
Project file:
<PackageReference Include="ManySpeech.SpeechFeatures" />

Paket CLI:
paket add ManySpeech.SpeechFeatures --version 1.1.7

Script and interactive. The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package:
#r "nuget: ManySpeech.SpeechFeatures, 1.1.7"

File-based apps. The #:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package:
#:package ManySpeech.SpeechFeatures@1.1.7

Install as a Cake Addin:
#addin nuget:?package=ManySpeech.SpeechFeatures&version=1.1.7

Install as a Cake Tool:
#tool nuget:?package=ManySpeech.SpeechFeatures&version=1.1.7

SpeechFeatures

SpeechFeatures is a library implemented in C# that can quickly compute audio features, typically used in scenarios such as speech signal processing.

Introduction

SpeechFeatures is an audio feature computation library implemented in C#. It quickly extracts various audio features and is widely used in scenarios such as speech signal processing. The library targets multiple environments, including .NET Framework 4.5+, .NET 6.0+, .NET Core 3.1, and .NET Standard 2.0+, and supports cross-platform compilation, AOT compilation, and WebAssembly compilation. Its core capabilities include computing mainstream speech features such as Kaldi fbank and Whisper features, providing efficient support for speech processing tasks.

Usage

Parameter reference - Constructor of the SpeechFeatures.OnlineFbank class:

/// <summary>
/// Initializes an instance of the OnlineFbank class for extracting filter bank (Fbank) features (commonly used in scenarios such as speech signal processing)
/// </summary>
/// <param name="dither">Dither value, used to add slight noise to the signal before feature extraction to reduce the impact of quantization errors; 0.0 means no dithering</param>
/// <param name="snip_edges">Whether to snip edge frames. If true, incomplete edge frames will be discarded when the signal length is insufficient to fill a complete frame; if false, edge frames will be retained (padded with zeros)</param>
/// <param name="sample_rate">Sampling rate of the input signal (in Hz), which must be consistent with the actual signal sampling rate</param>
/// <param name="num_bins">Number of filter banks (i.e., the dimension of output features), determining the dimension size of Fbank features</param>
/// <param name="frame_shift">Frame shift (in milliseconds), representing the time interval between adjacent frames, determining the temporal resolution of features (default 10ms)</param>
/// <param name="frame_length">Frame length (in milliseconds), representing the time length of each frame of signal, used to calculate the original signal window size for single-frame feature computation (default 25ms)</param>
/// <param name="energy_floor">Energy floor value, used to limit the minimum energy in feature computation to avoid numerical underflow or abnormal logarithmic calculation (default 0f)</param>
/// <param name="debug_mel">Whether to enable mel-scale debugging mode. If true, additional debugging information or intermediate results will be output to verify the correctness of the mel filter bank</param>
/// <param name="window_type">Type of window function applied to each frame of the signal (default "hamming", i.e., Hamming window; options include 'hamming'|'hanning'|'povey'|'rectangular'|'blackman')</param>
/// <param name="feature_type">Type of feature to be extracted (default "fbank", i.e., filter bank feature; options are 'fbank'|'whisper')</param>
public OnlineFbank(float dither, bool snip_edges, float sample_rate, int num_bins, float frame_shift = 10f, float frame_length = 25f, float energy_floor = 0f, bool debug_mel = false, string window_type = "hamming", string feature_type = "fbank")
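To make the effect of frame_length, frame_shift, snip_edges, and window_type more concrete, the following is a minimal sketch of the arithmetic behind them. It assumes the library follows the usual Kaldi conventions for frame counting and window shapes, which this page does not confirm. The class FbankFraming and its methods are hypothetical helpers written for illustration; they are not part of the SpeechFeatures API.

```csharp
using System;

// Hypothetical helper, not part of SpeechFeatures. Assumes Kaldi-style conventions.
public static class FbankFraming
{
    // Converts a duration in milliseconds to a number of samples (truncated).
    public static int MsToSamples(float sampleRate, float ms)
    {
        return (int)((double)sampleRate * ms / 1000.0);
    }

    // Number of frames for a signal of numSamples samples.
    // snipEdges = true: only frames that fit entirely inside the signal are counted.
    // snipEdges = false: the count depends only on the frame shift.
    public static int NumFrames(long numSamples, int frameLength, int frameShift, bool snipEdges)
    {
        if (frameShift <= 0) throw new ArgumentOutOfRangeException(nameof(frameShift));
        if (snipEdges)
        {
            if (numSamples < frameLength) return 0;
            return (int)(1 + (numSamples - frameLength) / frameShift);
        }
        return (int)((numSamples + frameShift / 2) / frameShift);
    }

    // Value of the window function at index i for a window of the given length.
    public static double WindowValue(string windowType, int i, int length)
    {
        if (length < 2) throw new ArgumentOutOfRangeException(nameof(length));
        double a = 2.0 * Math.PI / (length - 1);
        switch (windowType)
        {
            case "hamming": return 0.54 - 0.46 * Math.Cos(a * i);
            case "hanning": return 0.5 - 0.5 * Math.Cos(a * i);
            case "povey": return Math.Pow(0.5 - 0.5 * Math.Cos(a * i), 0.85);
            case "rectangular": return 1.0;
            case "blackman": return 0.42 - 0.5 * Math.Cos(a * i) + 0.08 * Math.Cos(2.0 * a * i);
            default: throw new ArgumentException("Unknown window type: " + windowType);
        }
    }
}
```

Under these assumptions, one second of 16 kHz audio (16000 samples) with the default 25 ms frame length and 10 ms frame shift gives 400-sample frames and a 160-sample shift, which yields 98 frames with snip_edges = true and 100 frames with snip_edges = false.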

The following is sample code. Please configure parameters such as dither, snip_edges, sample_rate, num_bins, window_type, and feature_type according to project requirements:

// Import the SpeechFeatures namespace (the project must reference the ManySpeech.SpeechFeatures package)
using SpeechFeatures;

// FbankExtractor is a hypothetical wrapper class, used here only so the sample compiles as a unit
public class FbankExtractor
{
    // Initialize OnlineFbank
    private readonly OnlineFbank _onlineFbank = new OnlineFbank(
        dither: 0,
        snip_edges: false,
        sample_rate: 16000,
        num_bins: 80,
        window_type: "hamming", // window_type (string): Type of window ('hamming'|'hanning'|'povey'|'rectangular'|'blackman')
        feature_type: "fbank"   // feature_type (string): Type of feature ('fbank'|'whisper')
    );

    // Pass in audio samples to get features
    public float[] GetFbank(float[] samples)
    {
        return _onlineFbank.GetFbank(samples);
    }
}
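
Since GetFbank returns a flat float[], downstream code usually needs the features as one row per frame. The sketch below shows one way to do that, assuming the array holds the frames back to back with num_bins values each (row-major). This layout is an assumption and should be checked against the library source. FbankLayout and ToFrames are hypothetical names, not part of the SpeechFeatures API.

```csharp
using System;

// Hypothetical helper, not part of SpeechFeatures.
public static class FbankLayout
{
    // Splits a flat feature array into frames of numBins values each.
    // Assumes a row-major layout: frame 0 first, then frame 1, and so on.
    public static float[][] ToFrames(float[] flat, int numBins)
    {
        if (flat == null) throw new ArgumentNullException(nameof(flat));
        if (numBins <= 0) throw new ArgumentOutOfRangeException(nameof(numBins));
        if (flat.Length % numBins != 0)
            throw new ArgumentException("Length is not a multiple of numBins.", nameof(flat));

        int numFrames = flat.Length / numBins;
        var frames = new float[numFrames][];
        for (int t = 0; t < numFrames; t++)
        {
            frames[t] = new float[numBins];
            Array.Copy(flat, t * numBins, frames[t], 0, numBins);
        }
        return frames;
    }
}
```

With the configuration from the sample code, a call might look like FbankLayout.ToFrames(_onlineFbank.GetFbank(samples), 80), where 80 matches the num_bins value passed to the constructor.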

Refer to:

[1] https://github.com/manyeyes/KaldiNativeFbankSharp

Compatible and additional computed target framework versions, by product:
.NET: net5.0 was computed. net5.0-windows was computed. net6.0 is compatible. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 is compatible. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-android34.0 is compatible. net8.0-browser was computed. net8.0-ios was computed. net8.0-ios18.0 is compatible. net8.0-maccatalyst was computed. net8.0-maccatalyst18.0 is compatible. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net8.0-windows10.0.19041 is compatible. net9.0 is compatible. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.
.NET Core: netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.1.30 is compatible. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 is compatible.
.NET Standard: netstandard2.0 is compatible. netstandard2.1 is compatible.
.NET Framework: net45 is compatible. net451 was computed. net452 was computed. net46 was computed. net461 is compatible. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 is compatible. net48 is compatible. net481 was computed.
MonoAndroid: monoandroid was computed.
MonoMac: monomac was computed.
MonoTouch: monotouch was computed.
Tizen: tizen40 was computed. tizen60 was computed.
Xamarin.iOS: xamarinios was computed.
Xamarin.Mac: xamarinmac was computed.
Xamarin.TVOS: xamarintvos was computed.
Xamarin.WatchOS: xamarinwatchos was computed.
Dependencies
  • .NETCoreApp 2.1.30: No dependencies.
  • .NETCoreApp 3.1: No dependencies.
  • .NETFramework 4.5: No dependencies.
  • .NETFramework 4.6.1: No dependencies.
  • .NETFramework 4.7.2: No dependencies.
  • .NETFramework 4.8: No dependencies.
  • .NETStandard 2.0: No dependencies.
  • .NETStandard 2.1: No dependencies.
  • net6.0: No dependencies.
  • net7.0: No dependencies.
  • net8.0: No dependencies.
  • net8.0-android34.0: No dependencies.
  • net8.0-ios18.0: No dependencies.
  • net8.0-maccatalyst18.0: No dependencies.
  • net8.0-windows10.0.19041: No dependencies.
  • net9.0: No dependencies.

NuGet packages (6)

Showing the top 5 NuGet packages that depend on ManySpeech.SpeechFeatures:

Package Downloads
ManySpeech.AliParaformerAsr

C# library for decoding Paraformer and SenseVoice models, used in speech recognition (ASR). Paraformer is an efficient non-autoregressive end-to-end speech recognition framework proposed by the speech team at DAMO Academy. This project uses a Paraformer Chinese universal speech recognition model, trained on tens of thousands of hours of industrial-grade annotated audio to ensure general-purpose recognition quality. The model can be applied to scenarios such as voice input methods, voice navigation, and intelligent meeting minutes. Accuracy: High.

ManySpeech.AliFsmnVad

16k universal VAD model: can be used to detect the start and end time points of valid speech in long speech segments. FSMN-Monophone VAD is an efficient speech endpoint detection model proposed by the speech team at DAMO Academy. It detects the start and end times of valid speech in the input audio and passes the detected valid audio segments to the recognition engine, reducing recognition errors caused by invalid speech.

ManySpeech.K2TransducerAsr

C# library for decoding K2 transducer models, used in automatic speech recognition (ASR).

ManySpeech.FireRedAsr

C# library for decoding FireRedASR's AED-L model, used in speech recognition (ASR). FireRedASR is a family of open-source industrial-grade automatic speech recognition (ASR) models supporting Mandarin, Chinese dialects, and English, achieving a new state of the art (SOTA) on public Mandarin ASR benchmarks while also offering outstanding singing lyrics recognition capability.

ManySpeech.WenetAsr

C# library for decoding the Wenet ASR ONNX model, used in speech recognition (ASR). Wenet ASR is a family of open-source industrial-grade automatic speech recognition (ASR) models supporting Mandarin and English, with excellent recognition ability.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
1.1.7 439 8/12/2025
1.1.6 650 8/6/2025
1.1.5 216 8/6/2025
1.1.4 247 7/26/2025
1.1.3 194 6/19/2025
1.1.2 152 6/15/2025
1.1.1 151 6/15/2025
1.1.0 357 6/10/2025
1.0.1 293 5/13/2025