EasyOcrSharp 2.2.4

dotnet add package EasyOcrSharp --version 2.2.4
                    
NuGet\Install-Package EasyOcrSharp -Version 2.2.4
                    
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="EasyOcrSharp" Version="2.2.4" />
                    
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="EasyOcrSharp" Version="2.2.4" />
                    
Directory.Packages.props
<PackageReference Include="EasyOcrSharp" />
                    
Project file
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add EasyOcrSharp --version 2.2.4
                    
#r "nuget: EasyOcrSharp, 2.2.4"
                    
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package EasyOcrSharp@2.2.4
                    
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=EasyOcrSharp&version=2.2.4
                    
Install as a Cake Addin
#tool nuget:?package=EasyOcrSharp&version=2.2.4
                    
Install as a Cake Tool

<div align="center">

<img src="src/EasyOcrSharp/Assets/icon.gif" alt="EasyOcrSharp" width="120" height="120">

EasyOcrSharp

High-accuracy, fully-offline OCR for .NET — EasyOCR's neural models, running natively on ONNX Runtime. No Python.

NuGet Downloads License: MIT .NET 10 AOT ready

Quick start · Documents & tables · PDF · Languages · Production

</div>


EasyOcrSharp runs EasyOCR's exact CRAFT text detector and CRNN recognizers — exported to ONNX and executed through Microsoft.ML.OnnxRuntime. You get EasyOCR-grade accuracy in a tiny managed package: no Python interpreter, no PyTorch, no native OCR binaries, and nothing ever leaves the machine.

await using var ocr = new EasyOcrService();
var result = await ocr.ExtractTextFromImage("receipt.png", new[] { "en" });
Console.WriteLine(result.FullText);

✨ Highlights

🌍 86 languages 13 script families: Latin, Cyrillic, Arabic, Devanagari, Bengali, Chinese (Simplified & Traditional), Korean, Japanese, Thai, Tamil, Telugu, Kannada
📦 ~3 MB package Models download on demand and cache locally — nothing is bundled
🔒 Verified & private Every model download is SHA256-checked; OCR runs fully offline
Fast Concurrent multi-region recognition; automatic CUDA GPU with CPU fallback; tunable threads
🧩 Flexible input File / Stream / byte[] / Image / PDF, region-of-interest, recognize-from-boxes, word/line/paragraph grouping, auto language detection
🧱 Document structure AnalyzeDocumentAsync: layout regions, tables as HTML, formulas, seals & reading order (PP-StructureV3 via PaddleOcrNet), with Markdown / JSON export
📄 Document-ready Searchable-PDF output, plus hOCR / ALTO / TSV / JSON exporters
🎯 Accurate fields Allow/block-lists, beam / word-beam decoders, per-box rotation, custom recognizers, exposed detection/grouping/contrast thresholds
🩺 Scan-ready Deskew, orientation correction, adaptive binarize, denoise & sharpen, plus model-based document orientation & page unwarp
📊 Production-grade OpenTelemetry metrics & tracing, health checks, resilient resumable downloads, batch API
🛠️ Modern .NET AOT- & single-file-friendly, DI-ready, .NET 10

🆚 Why EasyOcrSharp?

EasyOcrSharp Python EasyOCR Cloud OCR APIs Tesseract (.NET wrappers)
Runtime Pure .NET + ONNX Python + PyTorch Remote HTTP service Native binary via P/Invoke
Install dotnet add package pip + CUDA toolchain API key + billing Native libs + tessdata files
Privacy 🟢 100% offline 🟢 offline 🔴 data leaves the machine 🟢 offline
Accuracy 🟢 EasyOCR neural models 🟢 EasyOCR neural models 🟢 high 🟡 classical (weaker on hard text)
Tables / layout 🟢 built-in (PP-Structure) 🟡 add-ons 🟢 yes 🔴 no
PDF in & searchable out 🟢 built-in 🔴 DIY 🟢 yes 🔴 DIY
GPU 🟢 CUDA (opt-in package) 🟢 CUDA n/a 🔴 no
Native AOT / trimming 🟢 yes n/a n/a 🟡 limited
Cost 🟢 free (MIT) 🟢 free 🔴 per-call 🟢 free

Same models as Python EasyOCR, none of the Python. Fully local, so no per-page cost and no data egress.


📥 Installation

dotnet add package EasyOcrSharp

PDF input and searchable-PDF output are built in — no extra package. For NVIDIA GPU acceleration (Windows/Linux x64, CUDA 12+):

dotnet add package EasyOcrSharp.Gpu

Upgrading from 1.x? v2 replaced the ~1.5 GB embedded Python + PyTorch runtime with native ONNX. The public API (EasyOcrService, OcrResult, OcrLine, OcrBoundingBox) is unchanged.

🚀 Quick start

using EasyOcrSharp.Services;

await using var ocr = new EasyOcrService();

var result = await ocr.ExtractTextFromImage("sample.png", new[] { "en" });

Console.WriteLine(result.FullText);

foreach (var line in result.Lines)
    Console.WriteLine($"{line.Text}  (confidence {line.Confidence:P0})");

The first call for a language downloads its model (cached afterwards). Detected text is returned as reading-order lines (top-to-bottom), matching EasyOCR's readtext().

📦 The result model

public sealed record OcrResult
{
    public string FullText { get; }                 // all lines joined by newlines (reading order)
    public IReadOnlyList<OcrLine> Lines { get; }
    public IReadOnlyList<string> Languages { get; }
    public TimeSpan Duration { get; }
    public bool UsedGpu { get; }
    public int SourceWidth { get; }                 // dimensions OCR ran on (0 if unknown) — handy for exporters
    public int SourceHeight { get; }
}

public sealed record OcrLine
{
    public string Text { get; }
    public double Confidence { get; }                       // 0..1
    public IReadOnlyList<OcrPoint> BoundingPolygon { get; } // 4 corners
    public OcrBoundingBox BoundingBox { get; }              // MinX/MinY/MaxX/MaxY + Width/Height/Center
}

<br>

🧭 Core OCR

Input sources

OCR from a file path, a Stream, raw encoded bytes, or an already-decoded ImageSharp image:

await ocr.ExtractTextFromImage("photo.jpg",            new[] { "en" });
await ocr.ExtractTextFromImage(stream,                  new[] { "en" });
await ocr.ExtractTextFromImage(File.ReadAllBytes("p"),  new[] { "en" });   // byte[]
await ocr.ExtractTextFromImage(memory,                  new[] { "en" });   // ReadOnlyMemory<byte>
await ocr.ExtractTextFromImage(image,                   new[] { "en" });   // Image<Rgb24> (caller-owned)

All overloads accept an optional RecognitionOptions and a CancellationToken.

Recognition options

var options = new RecognitionOptions
{
    Grouping = TextGrouping.Line,    // Word | Line (default) | Paragraph
    MinConfidence = 0.3,             // drop results below this confidence
    MaxDegreeOfParallelism = 8,      // regions recognized concurrently (default: CPU count)
    AdjustContrast = true,           // low-confidence contrast-retry pass (EasyOCR's 2nd pass)
    Region = null,                   // optional region of interest (see below)
};

var result = await ocr.ExtractTextFromImage("doc.png", new[] { "en" }, options);
Grouping Behaviour
Word One result per detected box (≈ per word)
Line Adjacent boxes merged into lines (default; matches EasyOCR)
Paragraph Nearby lines merged into paragraph blocks

Region of interest

Restrict OCR to a rectangle — ideal for a fixed field (price, license plate, banner) and faster than scanning the whole image. Boxes are always reported in the original image's coordinates.

// Absolute pixels:
var roi = new RecognitionOptions { Region = OcrRegion.Pixels(x: 40, y: 320, width: 500, height: 80) };

// Resolution-independent fractions — e.g. the bottom third:
var bottom = new RecognitionOptions { Region = OcrRegion.Fraction(0, 0.66, 1, 0.34) };

var result = await ocr.ExtractTextFromImage("receipt.png", new[] { "en" }, bottom);

Multiple languages

Pass several codes for mixed-script images — each region is read by every requested script pack and the highest-confidence result wins:

var result = await ocr.ExtractTextFromImage("street_sign.png", new[] { "en", "ch_sim", "ru" });

Each additional script family loads its own model, so request only the scripts you expect.

Automatic language detection

Don't know the language? Let the engine detect it — pass no codes and set AutoDetectLanguage:

var result = await ocr.ExtractTextFromImage("unknown.png", Array.Empty<string>(),
    new RecognitionOptions { AutoDetectLanguage = true });

// Or just detect, without recognizing:
IReadOnlyList<string> langs = await ocr.DetectLanguagesAsync("unknown.png");

Detection samples the largest text regions and scores candidate script packs by confidence. Candidates default to a common set (Latin, Cyrillic, Chinese, Japanese, Korean); widen them when you expect heavier scripts:

var opts = new RecognitionOptions
{
    AutoDetectLanguage = true,
    AutoDetectCandidates = new[] { "en", "ar", "hi", "ch_sim" },
};

<br>

📄 Documents, scans & PDF

Scanned-document preprocessing

For photos and scans, enable clean-up via RecognitionOptions.Preprocessing:

var opts = new RecognitionOptions
{
    Preprocessing = new PreprocessingOptions
    {
        Deskew = true,            // straighten small tilt (±15°)
        DetectOrientation = true, // fix 90°/180°/270° rotation (≈4× cost)
        Binarize = true,          // adaptive black/white for uneven lighting
        Denoise = true,           // suppress scanner speckle
        Sharpen = true,           // unsharp-mask for soft / low-DPI scans (SharpenAmount tunes strength)
    },
};
var result = await ocr.ExtractTextFromImage("scan.jpg", new[] { "en" }, opts);

Two model-based document steps are also available — small dedicated neural models that download on first use (SHA256-verified like every other model):

var docOpts = new RecognitionOptions
{
    Preprocessing = new PreprocessingOptions
    {
        DocumentOrientation = true, // PP-LCNet doc classifier: fixes 90°/180°/270° in ONE tiny model
                                    // pass — much cheaper than DetectOrientation's 4× OCR
        DocumentUnwarp = true,      // UVDoc: dewarps curved/folded pages (photographed book pages,
                                    // creased receipts) before OCR
    },
};

Coordinate spaces. DetectOrientation reads the page at whichever 90°/180°/270° rotation scores best, then maps the boxes back onto your original image — results stay anchored to the input you passed and agree with the reported SourceWidth/SourceHeight. The model-based DocumentOrientation / DocumentUnwarp steps instead hand the whole pipeline a corrected page, so their boxes (and SourceWidth/SourceHeight) are in that corrected image's coordinate space.

🧱 Document structure & tables

Beyond plain text OCR, AnalyzeDocumentAsync recovers a page's structure — layout regions, tables (as HTML), formulas (as LaTeX), seals/stamps, and reading order — powered by PaddleOCR's PP-StructureV3 models through the PaddleOcrNet engine (bundled as a dependency; nothing extra to install):

using var ocr = new EasyOcrService();

var doc = await ocr.AnalyzeDocumentAsync("report_page.png");

foreach (var block in doc.Blocks)                    // in reading order
{
    Console.WriteLine($"{block.Order}: {block.Type} @ {block.Bounds}");
    if (block.TableHtml is not null)                 // tables come back as structured HTML
        Console.WriteLine(block.TableHtml);
}

string markdown = doc.ToMarkdown();                  // whole page as Markdown (tables included)
string json = doc.ToJson();

Tune what runs with DocumentAnalysisOptions (all models download on demand, SHA256-verified):

var doc = await ocr.AnalyzeDocumentAsync("scan.jpg", new DocumentAnalysisOptions
{
    DocumentOrientation = true,             // upright a rotated page first
    DocumentUnwarp = true,                  // dewarp a curved/folded page first
    RecognizeTables = true,                 // table structure as HTML (default on)
    RecognizeFormulas = false,              // skip LaTeX formula recognition
    RecognizeSeals = false,                 // skip seal/stamp recognition
    TableModel = DocumentTableModel.SlaNeXt,// higher-accuracy table model (default: SlanetPlus)
    Languages = new[] { "en" },             // text language(s); default pack covers ch/en/ja
});

The analyzer shares the service's execution provider, thread limits, cache path (when set) and download-resilience settings, loads lazily on first use, and is disposed with the service. Regular OCR calls never touch it.

📑 PDF input & searchable PDF

OCR scanned PDFs and emit searchable PDFs — built into the main package (no extra install needed). Pages are rasterized with PDFium and processed one at a time, so memory stays low even on large documents.

using EasyOcrSharp.Pdf;

await using var ocr = new EasyOcrService();

// 1) Extract text from every page:
PdfOcrResult doc = await ocr.ExtractTextFromPdfAsync("scan.pdf", new[] { "en" });
Console.WriteLine(doc.FullText);
foreach (var page in doc.Pages)
    Console.WriteLine($"Page {page.PageNumber}: {page.Ocr.Lines.Count} lines");

// 2) Produce a searchable PDF (original pages + invisible, selectable text layer):
await ocr.CreateSearchablePdfAsync("scan.pdf", "scan.searchable.pdf", new[] { "en" },
    pdfOptions: new PdfOcrOptions { Dpi = 250, JpegQuality = 80 });

PdfOcrOptions controls render Dpi, searchable-PDF JpegQuality, and a per-page Progress callback. The searchable text layer is best for Latin scripts (base-14 font, WinAnsi encoding).

<br>

🔌 Output & integration

Output formats (hOCR / ALTO / TSV / JSON)

Any OcrResult converts to the interchange formats DMS and archival pipelines expect:

using EasyOcrSharp.Export;

using var img = Image.Load<Rgb24>("page.png");
var result = await ocr.ExtractTextFromImage(img, new[] { "en" });

string hocr = result.ToHocr(pageWidth: img.Width, pageHeight: img.Height); // hOCR (HTML)
string alto = result.ToAlto(pageWidth: img.Width, pageHeight: img.Height); // ALTO XML v4
string tsv  = result.ToTsv();                                              // Tesseract-style TSV
string json = result.ToJson(indented: true);                              // AOT-safe JSON

ToJson uses a source-generated EasyOcrJsonContext, so it works in trimmed / Native-AOT apps with no reflection warnings.

Recognize from known boxes

If you already have regions — from DetectRegionsAsync, a previous run, or your own layout analysis — recognize them directly and skip detection (EasyOCR's recognize()):

using var image = Image.Load<Rgb24>("form.png");

// e.g. reuse a detection pass, or pass your own polygons (pixel coordinates):
IReadOnlyList<DetectedRegion> regions = await ocr.DetectRegionsAsync(image);

OcrResult result = await ocr.RecognizeRegionsAsync(image, regions, new[] { "en" });

There's also an overload taking raw polygons (IEnumerable<IReadOnlyList<OcrPoint>>).

Detection-only & visualization

Locate text regions without recognizing them — fast and language-independent, ideal for layout analysis, redaction, or cropping fields before a targeted recognition pass:

IReadOnlyList<DetectedRegion> regions = await ocr.DetectRegionsAsync("form.png");

Draw the boxes onto a copy of the image for debugging (no extra dependency; original is untouched):

using EasyOcrSharp.Export;

using var img = Image.Load<Rgb24>("page.png");
var result = await ocr.ExtractTextFromImage(img, new[] { "en" });
using var annotated = img.DrawAnnotations(result, new Rgb24(255, 0, 0), thickness: 2);
await annotated.SaveAsPngAsync("page.annotated.png");

Batch processing

Process a folder or queue with bounded concurrency. Results stream as they complete; a failed image is captured (not thrown), so one bad file never aborts the batch:

var files = Directory.EnumerateFiles("inbox", "*.png");

await foreach (var item in ocr.ExtractTextFromImagesAsync(files, new[] { "en" }, maxConcurrency: 4))
{
    if (item.Succeeded) Console.WriteLine($"{item.Source}: {item.Result!.Lines.Count} lines");
    else                Console.Error.WriteLine($"{item.Source} failed: {item.Error!.Message}");
}

<br>

🎯 Accuracy & tuning

Constrained fields (allow/block-lists & detection thresholds)

For fixed-format fields, restrict the character set — this sharply cuts errors:

// Digits only (invoice totals, IDs, meter readings):
var digits = new RecognitionOptions { Allowlist = "0123456789.," };

// License plate (upper-case + digits):
var plate = new RecognitionOptions { Allowlist = "ABCDEFGHIJKLMNOPQRSTUVWXYZ0123456789-" };

var total = await ocr.ExtractTextFromImage("receipt.png", new[] { "en" }, digits);

Use Blocklist to forbid specific characters instead. For hard inputs, the CRAFT detector thresholds are exposed via RecognitionOptions.Detection (defaults match EasyOCR):

var opts = new RecognitionOptions
{
    Detection = new DetectionOptions
    {
        TextThreshold = 0.6,  // lower → catch fainter text
        LowText = 0.3,        // lower → keep more of each glyph
        MagRatio = 1.5,       // enlarge before detection (small text)
    },
};

Decoders, rotation & batching

Switch the CTC decoder, recognize rotated text, or batch boxes through the model — all via RecognitionOptions (every option defaults to the previous behaviour):

var opts = new RecognitionOptions
{
    Decoder = DecoderType.BeamSearch,  // Greedy (default) | BeamSearch | WordBeamSearch
    BeamWidth = 10,                    // explored hypotheses (beam decoders)
    RotationInfo = new[] { 90, 270 },  // also try each box rotated; keep the best reading
    BatchSize = 16,                    // batch boxes through one ONNX run (see note below)
};

var result = await ocr.ExtractTextFromImage("rotated_labels.png", new[] { "en" }, opts);

BatchSize note. Batching needs a batch-capable recognizer export; the currently hosted models are exported with batch fixed at 1, so BatchSize > 1 transparently falls back to per-box inference today (results are unaffected). Per-box recognition already runs concurrently — tune throughput with MaxDegreeOfParallelism.

WordBeamSearch constrains output to a lexicon you supply, which is powerful for closed vocabularies (part numbers, place names, a product catalogue):

var opts = new RecognitionOptions
{
    Decoder = DecoderType.WordBeamSearch,
    Dictionary = new[] { "INVOICE", "TOTAL", "SUBTOTAL", "TAX" },
};

Custom recognizers

Register your own exported CRNN ONNX model (e.g. a fine-tuned EasyOCR recog_network) for chosen language codes. A custom recognizer takes precedence over the built-in pack and is loaded straight from disk — never downloaded:

var options = new EasyOcrServiceOptions();
options.CustomRecognizers.Add(new CustomRecognizer
{
    Name = "my_meter_reader",
    ModelPath = @"D:\models\meter_g2.onnx",
    VocabPath = @"D:\models\meter_g2.vocab.json", // or set Characters = "0123456789." inline
    Languages = new[] { "en" },                    // claim the codes it should handle
});

await using var ocr = new EasyOcrService(options);

Fine-tuning grouping & contrast

The thresholds that merge boxes into lines/paragraphs and trigger the contrast-retry pass are exposed for difficult layouts (defaults reproduce EasyOCR's behaviour):

var opts = new RecognitionOptions
{
    GroupingOptions = new GroupingOptions
    {
        SlopeThreshold = 0.1,        // tolerate gently tilted lines (slope_ths)
        YCenterThreshold = 0.5,      // vertical tolerance for same-line boxes (ycenter_ths)
        WidthThreshold = 1.0,        // max horizontal gap to merge on a line (width_ths)
        ParagraphYThreshold = 1.0,   // vertical reach when forming paragraphs (y_ths)
    },
    ContrastThreshold = 0.1,         // re-recognize below this confidence (contrast_ths)
    AdjustContrastTarget = 0.5,      // grey-stretch target for the retry pass (adjust_contrast)
};

Quantized models

Set Quantize to fetch the int8-quantized recognizers instead of the float ones — EasyOCR's quantize=True, for smaller downloads:

await using var ocr = new EasyOcrService(new EasyOcrServiceOptions { Quantize = true });

The int8 variants are hosted alongside the float models and SHA256-verified on download, and text output is effectively unchanged. The win is vocabulary-dependent: ONNX Runtime (CPU) int8-quantizes the matmul/linear layers but not the BiLSTM/convolutions, so large-vocabulary packs shrink most (e.g. zh_sim ~22 → ~16 MB) while small-vocabulary packs change little. The detector stays float (as in EasyOCR). Opt-in; the float models are the default.

<br>

📊 Production & operations

GPU & execution providers

GPU is automatic. ExecutionProvider defaults to Auto: on the first run EasyOcrSharp asks ONNX Runtime what accelerators are actually present and uses the best one, falling back to CPU when there's none. You don't pick a provider — you just install the package for the hardware you have:

Install this package What Auto enables
EasyOcrSharp (base) CPU only
EasyOcrSharp.Gpu NVIDIA CUDA (needs CUDA 12+ on PATH)

Why a separate package? The ONNX Runtime variants ship the same onnxruntime.dll compiled with different providers, so only one can be referenced at a time — and the CUDA build is several hundred MB. Shipping it as an opt-in package keeps the base library small and cross-platform; Auto then lights up whatever you installed.

// Nothing to configure — add the EasyOcrSharp.Gpu package and a GPU is used if present.
await using var ocr = new EasyOcrService();

You can still pin a provider explicitly (e.g. to force CPU, or to require CUDA) and set ONNX Runtime thread limits:

await using var ocr = new EasyOcrService(new EasyOcrServiceOptions
{
    ExecutionProvider = OcrExecutionProvider.Cuda, // Auto (default) | Cpu | Cuda | CoreMl
    IntraOpNumThreads = 4,   // cap CPU use in multi-tenant servers (null = runtime default)
});
Provider Package Notes
Auto (any) Default. Probes the runtime; uses the best installed accelerator, else CPU
Cpu (built-in) Always available
Cuda EasyOcrSharp.Gpu NVIDIA, CUDA 12+ on PATH
CoreMl CoreML-enabled ORT build macOS / Apple Silicon

Any non-CPU provider falls back to CPU automatically (with a logged warning) if its runtime is missing or the device fails to initialize — your app keeps working. The legacy useGpu: true flag still works and forces CUDA. Check ocr.UseGpu to see whether an accelerator was selected.

GPU upgrade hint. When Auto runs on CPU but a real NVIDIA GPU is physically present, EasyOcrSharp detects it and can tell you the exact package to add — EasyOcrSharp.Gpu. It's silent by default: the hint is exposed as a property you can surface yourself, and nothing is logged unless you opt in with LogGpuHint = true.

// Silent by default — read it only if you want to nudge the user yourself:
await using var ocr = new EasyOcrService();
if (ocr.GpuAccelerationHint is { } hint) Console.WriteLine(hint);
// e.g. "EasyOcrSharp: an NVIDIA GPU was detected but OCR is running on CPU. Install the
//       'EasyOcrSharp.Gpu' NuGet package for CUDA acceleration. ..."

// Opt in to a one-time startup warning in the logs instead:
await using var verbose = new EasyOcrService(new EasyOcrServiceOptions { LogGpuHint = true });

Observability & health checks

EasyOcrSharp emits OpenTelemetry-ready metrics and traces, always-on with near-zero cost when nobody is listening:

builder.Services.AddOpenTelemetry()
    .WithMetrics(m => m.AddMeter(EasyOcrDiagnostics.MeterName))   // operations, duration, lines, model loads/bytes
    .WithTracing(t => t.AddSource(EasyOcrDiagnostics.ActivitySourceName));

Add a readiness probe that reports whether the models for your languages are cached (so the first real request won't block on a download):

builder.Services.AddHealthChecks()
    .AddEasyOcrHealthCheck(languages: new[] { "en" });

Dependency injection

services.AddEasyOcrSharp(o =>
{
    o.ModelCachePath = "/var/cache/easyocr";
    // GPU is automatic (ExecutionProvider = Auto). To force CPU in a multi-tenant host:
    // o.ExecutionProvider = OcrExecutionProvider.Cpu;
});

public class ReceiptParser(IEasyOcrService ocr) { /* inject anywhere */ }

Registered as a singleton — ONNX sessions are expensive to build and thread-safe to reuse.

Hardening & resource limits

When OCR-ing untrusted images or PDFs, EasyOcrSharp guards against decompression-bomb / pixel-flood denial of service. The defaults are generous; raise them if you legitimately process larger inputs, or set them to 0 to disable a guard.

await using var ocr = new EasyOcrService(new EasyOcrServiceOptions
{
    MaxImagePixels = 100_000_000,   // reject images over 100 MP from the header, before decode (default)
});

var pdfOptions = new PdfOcrOptions
{
    MaxPages = 5000,                // reject documents with more pages (default)
    MaxPageMegapixels = 200,        // reject a page that would rasterize larger at the chosen DPI (default)
};

Failures surface as typed exceptions (all derive EasyOcrSharpException, so a catch-all still works):

Exception When
ImageTooLargeException image exceeds MaxImagePixels
PdfProcessingException corrupt / encrypted PDF, or a page/size guard tripped
ModelDownloadException download failed, or a non-HTTPS / malformed model source
ModelChecksumException downloaded model failed (or lacks) SHA256 verification
OfflineModelMissingException model not cached and Offline = true

Warm-up (remove cold-start latency). Preload the detector and recognizer packs so the first real request doesn't pay model-download + session-init latency — ideal for serverless / scale-out:

await ocr.WarmUp(new[] { "en" });   // downloads + initializes once, up front

Resilient & offline model downloads

Model downloads are production-hardened: atomic, SHA256-verified, resumable (HTTP range), and retried with exponential backoff. By default the model source must be HTTPS and every model must have a known checksum. Tune everything via ModelDownloadOptions:

await using var ocr = new EasyOcrService(new EasyOcrServiceOptions
{
    Download = new ModelDownloadOptions
    {
        MaxRetries = 5,
        Offline = false,                              // true = never download; fail fast if not cached
        BaseUrlOverride = "https://mirror.corp/ocr",  // private mirror (must be https unless opted out)
        HttpClientFactory = () => httpClientFactory.CreateClient("ocr"), // proxy / corporate certs
        // AllowInsecureModelSource = true,           // permit a plain-http mirror you control
        // AllowUnverifiedModels   = true,            // permit unlisted models that have no registry checksum
        Progress = new Progress<ModelDownloadProgress>(p =>
            Console.WriteLine($"{p.FileName}: {p.Fraction:P0}")),
    },
});

For air-gapped deployments, pre-seed the cache and set Offline = true — a missing model then throws a clear error instead of attempting a download.

<br>

📚 Reference

🌍 Supported languages

Languages are grouped by script; one recognizer covers an entire group, so ["en","es","fr"] loads a single model. Pack sizes vary widely with the network each script was trained on — some are a few MB, some ~210 MB — which affects first-run download size only, not runtime behaviour.

Pack Size Languages
latin_g2 ~15 MB af, az, bs, cs, cy, da, de, en, es, et, fr, ga, hr, hu, id, is, it, ku, la, lt, lv, mi, ms, mt, nl, no, oc, pi, pl, pt, ro, rs_latin, sk, sl, sq, sv, sw, tl, tr, uz, vi
cyrillic_g2 ~15 MB ru, rs_cyrillic, be, bg, uk, mn, abq, ady, kbd, ava, dar, inh, che, lbe, lez, tab, tjk
zh_sim_g2 ~22 MB ch_sim
korean_g2 ~16 MB ko
japanese_g2 ~17 MB ja
telugu_g2 ~15 MB te
kannada_g2 ~15 MB kn
arabic_g2 ~210 MB ar, fa, ug, ur
devanagari_g2 ~210 MB hi, mr, ne, bh, mai, ang, bho, mah, sck, new, gom, sa, bgc
bengali_g2 ~210 MB bn, as, mni
thai_g1 ~210 MB th
tamil_g1 ~210 MB ta
zh_tra_g1 ~215 MB ch_tra

That's all 86 languages EasyOCR supports, mapped exactly to the model each was trained on.

Not supported: Greek (el) and Hebrew (he) — upstream EasyOCR ships no model for either script, so they cannot be exported.

How model downloads work

EasyOcrSharp ships no models in the NuGet package. On the first call for a language it downloads, into a local cache:

  1. CRAFT detector (craft_mlt_25k.onnx, ~80 MB) — shared by all languages, downloaded once.
  2. CRNN recognizer for the language's script pack (e.g. latin_g2.onnx).
  3. A small vocabulary sidecar (<pack>.vocab.json).

Every file is SHA256-verified against a checksum baked into the library, so corrupted or tampered downloads are rejected. Models are hosted on Hugging Face.

Default cache: %LOCALAPPDATA%\EasyOcrSharp\models (Windows) or the platform equivalent. Override it:

await using var ocr = new EasyOcrService(modelCachePath: @"D:\MyApp\Models");
EASYOCRSHARP_CACHE=/var/cache/easyocr                          # cache directory
EASYOCRSHARP_MODEL_BASE_URL=https://files.mycorp.example/ocr   # private/offline mirror

Offline / air-gapped: pre-seed your cache directory with the .onnx + .vocab.json files from the model repo — no network is needed at runtime.

Accuracy notes

EasyOcrSharp reproduces EasyOCR's pipeline faithfully — aspect-preserving resize, normalization, a low-confidence contrast-retry pass, CRAFT box dilation, perspective de-warping of rotated text, and CTC decoding (greedy by default, with optional beam / word-beam search) — so output matches upstream EasyOCR. On top of that:

  • Reading order is column-aware and bands rows by a line-height-relative tolerance, so headings, high-DPI scans, and multi-column pages come out in natural reading order.
  • Overlapping detections are de-duplicated with IoU NMS (DetectionOptions.NmsIouThreshold, default 0.6; set 0 to disable).
  • On multi-language requests, scoring is biased toward the page's dominant script so an over-confident wrong-script pack can't hijack individual boxes.

As with any OCR:

  • Visually identical glyphs (capital I vs lowercase l, $ vs 8) can be confused.
  • Handwriting and low-resolution / low-contrast text are harder than clean printed text.
  • Right-to-left scripts (Arabic) are returned in the model's character order.

Building & testing

git clone https://github.com/easyocrsharp/EasyOcrSharp.git
cd EasyOcrSharp
dotnet build -c Release

# Everything — unit + real end-to-end integration tests. Downloads the models on first run
# and reports a pass/fail summary:
dotnet test

# Fast unit tests only (no models, no network):
dotnet test --filter "Category!=Integration"

# Only the model-backed integration tests:
dotnet test --filter "Category=Integration"

# Interactive console demo:
dotnet run --project test/EasyOcrSharp.Demo

The integration tests exercise every feature against the real engine (allow/block-lists, detection-only, exporters, batch, metrics/tracing, health check, execution-provider fallback, and the full PDF pipeline) — no mocks. The PDF tests need two small fixtures you generate yourself; they're skipped (never failed) until present. See test/assets/pdf/README.md for the exact one-page and three-page PDFs to drop in.

Path Purpose
src/EasyOcrSharp the core library (includes PDF input + searchable-PDF output)
src/EasyOcrSharp.Gpu CUDA execution-provider package
test/EasyOcrSharp.Tests xUnit unit + integration tests
test/EasyOcrSharp.Demo interactive console demo
test/assets sample images
tools/ maintainer-only ONNX export + quantization scripts

CI (GitHub Actions) builds and runs the unit tests on Linux and Windows for every push and PR. See CHANGELOG.md for release history.

<br>

🤝 Contributing

Contributions are welcome! New features, accuracy improvements, performance tuning, bug fixes, additional language/model coverage, documentation, and tests are all appreciated.

  • 🐛 Found a bug? Open an issue with a minimal repro (image/PDF + the code and options you used).
  • 💡 Have an idea or feature request? Open an issue to discuss it first, then send a PR.
  • 🔧 Sending a PR? Branch from main, keep changes focused, and make sure dotnet build -c Release and the unit tests (dotnet test --filter "Category!=Integration") pass.

If you're working on something larger, or want to collaborate on a feature, feel free to reach out before starting so we can align on the approach.

📬 Contact

For work inquiries, collaboration, feature requests, or any questions, reach out to:

Farhan Lodifarhanlodi31@gmail.com

📄 License

MIT — see LICENSE. EasyOCR (the upstream model authors) is also MIT-licensed.

🙏 Acknowledgments

<div align="center"> <br>

⬆ Back to top

<sub>Built with ❤️ for the .NET community · EasyOCR accuracy, zero Python</sub>

</div>

Product Compatible and additional computed target framework versions.
.NET net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
Compatible target framework(s)
Included target framework(s) (in package)
Learn more about Target Frameworks and .NET Standard.

NuGet packages (1)

Showing the top 1 NuGet packages that depend on EasyOcrSharp:

Package Downloads
EasyOcrSharp.Gpu

CUDA GPU execution provider for EasyOcrSharp. Adds Microsoft.ML.OnnxRuntime.Gpu so detection and recognition run on NVIDIA GPUs.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
2.2.4 61 7/3/2026
2.2.3 138 6/19/2026
2.2.2 124 6/19/2026
2.2.1 154 6/6/2026
2.2.0 105 6/6/2026
2.1.1 139 5/30/2026
2.1.0 131 5/30/2026
2.0.1 123 5/30/2026
2.0.0 120 5/30/2026

2.2.4: document-structure analysis and document preprocessing. New AnalyzeDocumentAsync (all input overloads) recovers layout regions, tables as HTML, formulas as LaTeX, seals and reading order via PP-StructureV3 (PaddleOcrNet engine), with ToMarkdown()/ToJson() export and DocumentAnalysisOptions (table model choice, per-feature toggles, languages, page orientation/unwarp). New PreprocessingOptions.Sharpen/SharpenAmount (unsharp mask), DocumentOrientation (PP-LCNet single-pass 90/180/270° page fix) and DocumentUnwarp (UVDoc page dewarp) — all default-off. All additive and opt-in; no public method or default changed. 2.2.3: dependency updates (ONNX Runtime 1.27.0, Microsoft.Extensions 10.0.9) plus the hardening + performance + accuracy pass. Security: image decompression-bomb guard (MaxImagePixels), PDF page/size guards (MaxPages, MaxPageMegapixels), HTTPS-only model source + fail-closed checksum verification, model file-name traversal check. Thread-safety: recognizer cache no longer poisoned by a cancelled/failed load; DisposeAsync drains in-flight OCR before releasing sessions. Performance: contiguous Buffer.Span tensor reads, PerspectiveWarp sub-rect copy, CPU intra-op=1 under box-level parallelism, pooled scratch buffers, new WarmUp() to remove cold-start latency. Accuracy: column/font-aware reading order, IoU NMS box de-dup, dominant-script bias on multi-language requests. Added typed exceptions (ModelDownload/Checksum/OfflineModelMissing/PdfProcessing/ImageTooLarge) and OcrResult.SourceWidth/Height. No public method renamed; all additive or safer defaults. See CHANGELOG.md. 2.2.1: clearer typed errors for bad PDFs. 2.2.0: PDF I/O, exporters, telemetry, resilient downloads, auto-GPU, EasyOCR parity.