Angri450.Nong.MultiModal
3.0.2
There is a newer version of this package available.
See the version list below for details.
dotnet add package Angri450.Nong.MultiModal --version 3.0.2
NuGet\Install-Package Angri450.Nong.MultiModal -Version 3.0.2
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.
<PackageReference Include="Angri450.Nong.MultiModal" Version="3.0.2" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.
<PackageVersion Include="Angri450.Nong.MultiModal" Version="3.0.2" />
<PackageReference Include="Angri450.Nong.MultiModal" />
For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.
paket add Angri450.Nong.MultiModal --version 3.0.2
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
#r "nuget: Angri450.Nong.MultiModal, 3.0.2"
#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.
#:package Angri450.Nong.MultiModal@3.0.2
#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.
#addin nuget:?package=Angri450.Nong.MultiModal&version=3.0.2
#tool nuget:?package=Angri450.Nong.MultiModal&version=3.0.2
The NuGet Team does not provide support for this client. Please contact its maintainers for support.
Angri450.Nong.MultiModal
Multi-modal document processing library. Cloud OCR (PaddleOCR-VL-1.6) and local CPU OCR (PaddleOCR). Speech-to-text and text-to-speech planned.
Supported Platforms
.NET 8.0 and above (net8.0, net9.0, net10.0, net11.0). Windows, macOS, Linux.
Install
dotnet add package Angri450.Nong.MultiModal
Optional: local OCR
pip install paddlepaddle paddleocr
Skip if you only use the cloud API — no Python required.
Quick Start
Cloud API (PaddleOCR-VL-1.6)
using MultiModalCore;
var client = new PaddleOcrVlClient(); // Token from PADDLEOCR_TOKEN env var
// File → Markdown
var mdFiles = await client.ProcessAsync("scan.pdf", "output/");
// File → Word (layout-preserving)
var docxPath = await client.ProcessToWordAsync("scan.pdf", "output/result.docx");
// URL → Markdown
await client.ProcessAsync("https://example.com/document.png", "output/");
// Raw bytes → Markdown
await client.ProcessAsync(fileBytes, "scan.png", "output/");
Local CPU OCR
var local = new LocalOcrClient(pythonExe: "python", lang: "ch");
var (ok, msg) = await local.CheckEnvironmentAsync();
if (!ok)
{
    Console.WriteLine($"{msg}. Install: pip install paddlepaddle paddleocr");
    return; // stop before attempting OCR
}
var blocks = await local.RecognizeAsync("crop.png");
foreach (var b in blocks)
    Console.WriteLine($"[{b.Confidence:P0}] {b.Text}");
Step-by-Step Control
var client = new PaddleOcrVlClient();
var jobId = await client.SubmitFileAsync("scan.pdf");
var resultUrl = await client.WaitForJobAsync(jobId, TimeSpan.FromSeconds(5));
var mdFiles = await client.DownloadResultsAsync(resultUrl, "output/");
// Structured data for custom processing
var ocrResult = await client.DownloadResultsStructuredAsync(resultUrl, "output/");
foreach (var page in ocrResult.Pages)
    foreach (var block in page.Blocks)
        Console.WriteLine($"[{block.Label}] {block.Content}");
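WaitForJobAsync is described only as polling until the job is done. For readers who want a sense of what such a loop generally involves, here is a minimal sketch. It is not the library's implementation: PollingSketch, PollUntilDoneAsync, the probe delegate, and the timeout parameter are all hypothetical names introduced for illustration, and the documented signature shows only a job ID and an interval.

```csharp
#nullable enable
using System;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical helper, not part of Angri450.Nong.MultiModal.
public static class PollingSketch
{
    // Calls `probe` repeatedly, waiting `interval` between attempts, until it
    // returns a non-null result URL. Throws TimeoutException when the next
    // wait would run past `timeout`.
    public static async Task<string> PollUntilDoneAsync(
        Func<CancellationToken, Task<string?>> probe,
        TimeSpan interval,
        TimeSpan timeout,
        CancellationToken ct = default)
    {
        var deadline = DateTime.UtcNow + timeout;
        while (true)
        {
            ct.ThrowIfCancellationRequested();

            var resultUrl = await probe(ct);
            if (resultUrl is not null)
                return resultUrl;

            if (DateTime.UtcNow + interval > deadline)
                throw new TimeoutException("Job did not finish before the timeout.");

            await Task.Delay(interval, ct);
        }
    }
}
```

In this sketch the probe would wrap whatever status check the service exposes. Passing a CancellationToken lets a caller abandon a long-running job without waiting for the timeout.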
Options
var options = new OcrOptions
{
    UseDocOrientationClassify = true, // Auto-detect orientation
    UseDocUnwarping = true,           // Document unwarping
    UseChartRecognition = true,       // Chart parsing
};
await client.ProcessAsync("scan.pdf", "output/", options);
Word Output Pipeline
ProcessToWordAsync produces layout-preserving .docx:
- Cloud API returns parsing_res_list; each block has block_label, block_content, block_bbox
- LayoutToWordConverter maps blocks to Docx primitives:
  - doc_title → Title, paragraph_title → Heading, text → Body
  - image → embedded image (actual download), table → OpenXML table
  - vision_footnote → Footnote
- Multi-column pages auto-detected from block_bbox coordinates
- ElementOrder.RectifyTree() fixes OpenXML ordering before save
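The label mapping and the column detection listed above can be sketched roughly as follows. This is an illustration under stated assumptions, not the code inside LayoutToWordConverter: LayoutBlock and LayoutSketch are hypothetical names, the bounding box is assumed to be [x1, y1, x2, y2] in page pixels, the fallback to Body for unknown labels is an assumption, and the midline test is a deliberately naive heuristic.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical stand-in for one parsing_res_list entry; not the library's type.
// Bbox is assumed to be [x1, y1, x2, y2].
public sealed record LayoutBlock(string Label, string Content, double[] Bbox);

public static class LayoutSketch
{
    // Label-to-element mapping taken from the list above.
    // Unknown labels fall back to "Body" (an assumption).
    public static string ToWordElement(string label) => label switch
    {
        "doc_title" => "Title",
        "paragraph_title" => "Heading",
        "text" => "Body",
        "image" => "Image",
        "table" => "Table",
        "vision_footnote" => "Footnote",
        _ => "Body",
    };

    // Naive two-column test: true when at least two blocks lie entirely left
    // of the page midline and at least two lie entirely right of it.
    public static bool LooksTwoColumn(IReadOnlyList<LayoutBlock> blocks, double pageWidth)
    {
        double mid = pageWidth / 2;
        int left = blocks.Count(b => b.Bbox[2] <= mid);
        int right = blocks.Count(b => b.Bbox[0] >= mid);
        return left >= 2 && right >= 2;
    }
}
```

A full-width block such as a title counts toward neither side here, so a page with a spanning title above two columns of text is still classed as two-column.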
Dependencies
Angri450.Nong.Docx: Word generation for ProcessToWordAsync output
API Reference
PaddleOcrVlClient (Cloud)
| Method | Description |
|---|---|
| ProcessAsync(input, outputDir) | Submit → wait → download Markdown |
| ProcessToWordAsync(input, docxPath) | Submit → wait → download → Word |
| SubmitFileAsync(path) | Submit local file, returns jobId |
| SubmitBytesAsync(bytes, name) | Submit in-memory data |
| SubmitUrlAsync(url) | Submit remote URL |
| WaitForJobAsync(jobId, interval) | Poll until done, returns result URL |
| DownloadResultsAsync(resultUrl, dir) | Download Markdown + images |
| DownloadResultsStructuredAsync(resultUrl, dir) | Download and return OcrResult |
LocalOcrClient (CPU)
| Method | Description |
|---|---|
| RecognizeAsync(path) | OCR a single image |
| RecognizeAsync(bytes) | OCR from memory |
| RecognizeBatchAsync(paths) | OCR multiple images |
| CheckEnvironmentAsync() | Verify Python + PaddleOCR |
Source
https://github.com/angri450/Nong.NET — Issues and PRs welcome.
License
MIT
| Product | Versions Compatible and additional computed target framework versions. |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
- net8.0
  - Angri450.Nong.Docx (>= 3.0.2)