OfficeIMO.Pdf 0.1.35

.NET CLI:
dotnet add package OfficeIMO.Pdf --version 0.1.35

Package Manager (intended for the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package):
NuGet\Install-Package OfficeIMO.Pdf -Version 0.1.35

PackageReference (for projects that support PackageReference, copy this XML node into the project file to reference the package):
<PackageReference Include="OfficeIMO.Pdf" Version="0.1.35" />

Central Package Management (for projects that support CPM, copy the PackageVersion node into the solution Directory.Packages.props file to version the package, and the PackageReference node into the project file):
Directory.Packages.props:
<PackageVersion Include="OfficeIMO.Pdf" Version="0.1.35" />
Project file:
<PackageReference Include="OfficeIMO.Pdf" />

Paket CLI:
paket add OfficeIMO.Pdf --version 0.1.35

Script and interactive (the #r directive can be used in F# Interactive and Polyglot Notebooks; copy it into the interactive tool or the source code of the script to reference the package):
#r "nuget: OfficeIMO.Pdf, 0.1.35"

File-based apps (the #:package directive can be used in C# file-based apps starting in .NET 10 preview 4; copy it into a .cs file before any lines of code to reference the package):
#:package OfficeIMO.Pdf@0.1.35

Cake Addin:
#addin nuget:?package=OfficeIMO.Pdf&version=0.1.35

Cake Tool:
#tool nuget:?package=OfficeIMO.Pdf&version=0.1.35

OfficeIMO.Pdf - Dependency-Free PDF Engine

OfficeIMO.Pdf is the first-party PDF package for the OfficeIMO family. It is MIT licensed and is intended to stay free of runtime package dependencies.

The long-term direction is larger than a small PDF writer: this package should become the engine that PSWriteOffice can expose for PSWritePDF-style workflows, while PSWritePDF can eventually be archived with migration guidance.

Roadmap: Docs/officeimo.pdf.roadmap.md

Goals

  • Keep the core package MIT licensed and dependency-free.
  • Build a real document model, layout model, PDF syntax model, and content operator pipeline.
  • Generate reports that look good enough to compete with QuestPDF-style output.
  • Read and manipulate PDFs enough to replace common iText-backed PSWritePDF workflows through PSWriteOffice.
  • Keep the public creation API Word-like and primitive-based; polished invoice/report/statement outputs belong in samples, visual fixtures, or wrappers, not as special engine concepts.
  • Keep rasterizers and visual comparison tools in tests/dev tooling, not in the runtime package.

Relationship to OfficeIMO.Word.Pdf, OfficeIMO.Excel.Pdf, and OfficeIMO.Markdown.Pdf

OfficeIMO.Word.Pdf now defaults to the first-party OfficeIMO.Pdf engine for Word-to-PDF export.

The default first-party path maps the following Word content into OfficeIMO.Pdf:

  • Sections and page setup: basic Word sections; page setup, with explicit PDF page geometry preserved unless PdfSaveOptions.Orientation is set; Word document background color; Word section columns with explicit and inline paragraph column breaks plus separator lines; page breaks; and metadata.
  • Headings and text: headings, including linked headings; paragraphs; common Word/PDF font-family requests mapped to the standard Helvetica, Times, and Courier PDF families; and runs with isolated run color, font size, superscript/subscript baseline, text-wrapping breaks, and highlight/background state.
  • Paragraph formatting: paragraph spacing/indents; simple tab stops with leaders/alignment; keep-with-next/keep-lines/widow-control flags; simple shaded and uniform/non-uniform bordered paragraphs; and Word horizontal lines and paragraph top/bottom border rules.
  • Lists: simple level-0 bullet/decimal lists, including rich list-item runs and list-item bookmarks.
  • Links, navigation, and notes: links/bookmarks with tooltip metadata; generated table-of-contents entries with internal links to heading destinations; heading-based PDF outlines; and simple footnote/endnote markers with end-of-section note text.
  • Tables: supported Word table style presets; rich text runs inside table cells; default and per-cell table margins; table cell spacing; table-level borders; uniform and non-uniform row heights; row-level break policies; preferred DXA table widths; explicit autofit-to-contents tables; simple cell fills; uniform and non-uniform cell borders; left/center/right table placement; uniform column and non-uniform per-cell horizontal/vertical alignment; simple merged cells; separated first-row visual table styling and repeated leading table header rows; and linked cells, including linked merged cells.
  • Images, shapes, and text boxes: paragraph-aligned images; simple VML shapes and the DrawingML preset flow shapes exposed by WordShape; and simple text boxes.
  • Content controls: simple inline body/table/header/footer text content controls; simple body, table-cell, header, and footer picture content controls as first-party PDF images; and simple body/table-cell/header/footer repeating-section text items.
  • Form controls: simple body and table-cell Word check boxes as first-party AcroForm check boxes; simple body-level and table-cell Word dropdown, combo box, and date picker controls as first-party AcroForm choice/text fields; and simple header/footer Word check boxes, dropdowns, combo boxes, and date pickers as static first-party zone text.
  • Headers and footers: simple default/first/even header and footer text/images/shapes with left/center/right paragraph alignment; Word PAGE/NUMPAGES header/footer fields and their simple numeric format switches; simple header/footer table-cell text, image, and shape zones mapped to first-party zones; and page-number footer settings, including Word section page-number starts/styles.

Simple Word OMML equations with extractable math text are exported as static first-party PDF text in body paragraphs, table cells, headers, and footers; equations without extractable text still produce PdfSaveOptions.Warnings.

OfficeIMO.Excel.Pdf uses the same first-party engine for workbook-to-PDF export. It keeps workbook reading in OfficeIMO.Excel, maps worksheets into PDF headings, tables, images, charts, links, headers, footers, margins, and orientation, and leaves PDF layout and writing in OfficeIMO.Pdf.

OfficeIMO.Markdown.Pdf uses the same first-party engine for Markdown-to-PDF export. It keeps Markdown parsing in OfficeIMO.Markdown, maps Markdown semantics into headings, links, lists, tables, panels, front matter, images, and theme-aware page decorations, and leaves PDF layout and writing in OfficeIMO.Pdf.

The strategic target is for OfficeIMO.Pdf to become good enough that Word, Excel, Markdown, and PowerPoint exporters can render through the first-party engine without bringing QuestPDF, SkiaSharp, iText, or other runtime PDF dependencies into the core PDF package.

Quick Start

using OfficeIMO.Pdf;

// Create a document with Helvetica 11 pt as the default text style.
PdfDocument.Create(new PdfOptions {
        DefaultFont = PdfStandardFont.Helvetica,
        DefaultFontSize = 11
    })
    // Document metadata.
    .Meta(title: "Hello PDF", author: "OfficeIMO")
    // Heading followed by a paragraph built from rich text runs.
    .H1("OfficeIMO.Pdf")
    .Paragraph(p => p
        .Text("A dependency-free PDF builder with ")
        .Bold("rich text")
        .Text(", links, tables, images, and a growing reader."))
    // US Letter page (8.5 x 11 in) with the Normal margin preset.
    .Size(PageSize.FromInches(8.5, 11))
    .Margin(PageMargins.Normal)
    // Simple table: the first array is the header row.
    .Table(new[] {
        new[] { "Area", "Status" },
        new[] { "Runtime dependencies", "None in OfficeIMO.Pdf" },
        new[] { "PowerShell future", "Expose through PSWriteOffice" }
    })
    .Save("HelloWorld.pdf");
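
The Quick Start passes the page size in inches, while PDF page geometry is expressed in points (72 points per inch in default user space). The sketch below shows only that unit arithmetic. InchesToPoints and CentimetersToPoints are hypothetical helper names used for illustration; they are not OfficeIMO.Pdf APIs, and how PageSize.FromInches works internally is not shown here.

```csharp
using System;

// Hypothetical helpers for illustration only; not OfficeIMO.Pdf APIs.
// PDF default user space uses 72 units (points) per inch.
static double InchesToPoints(double inches) => inches * 72.0;

// One inch is exactly 2.54 cm, so convert to inches first.
static double CentimetersToPoints(double centimeters) => centimeters / 2.54 * 72.0;

double letterWidth = InchesToPoints(8.5);    // 612 points
double letterHeight = InchesToPoints(11);    // 792 points
double a4Width = CentimetersToPoints(21.0);  // roughly 595.28 points

Console.WriteLine($"US Letter: {letterWidth} x {letterHeight} pt");
Console.WriteLine($"A4 width: {a4Width} pt");
```

Under this arithmetic, the 8.5 x 11 inch page in the Quick Start corresponds to 612 x 792 points.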

Unified PDF Workflows

For read, merge, split, stamp, metadata, and page-edit workflows, load an existing PDF once and compose operations fluently:

using OfficeIMO.Pdf;

// Open once, then chain page extraction, merge, metadata update, and stamping.
PdfDocument.Open("input.pdf")
    .Pages.Extract("1-2,4")
    .MergeWith("appendix.pdf")
    .UpdateMetadata(title: "Merged report")
    .Stamp.Text("Reviewed")
    .Save("output.pdf");

// Split the result into per-page documents, or read its text.
IReadOnlyList<PdfDocument> pages = PdfDocument.Open("output.pdf").Pages.Split();
string text = PdfDocument.Open("output.pdf").Read.Text();

PdfDocument snapshots the opened input bytes, and each transformation returns a new PdfDocument, so the existing byte/stream/path helpers remain reusable implementation pieces while callers get one fluent surface.
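
The Pages.Extract("1-2,4") call above takes a page-range string. The grammar OfficeIMO.Pdf accepts is not spelled out in this section, so the following is only a sketch of how such a string could be expanded, assuming 1-based page numbers, inclusive ranges, and comma-separated parts. ParsePageRanges is a hypothetical helper, not a library API.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper for illustration only; not an OfficeIMO.Pdf API.
// Assumes 1-based page numbers, inclusive ranges, and comma-separated parts.
static List<int> ParsePageRanges(string spec, int pageCount)
{
    var pages = new List<int>();
    foreach (string part in spec.Split(',', StringSplitOptions.RemoveEmptyEntries))
    {
        // A part is either a single page ("4") or an inclusive range ("1-2").
        string[] bounds = part.Trim().Split('-');
        if (bounds.Length > 2)
            throw new FormatException($"Invalid page range: '{part}'");

        int first = int.Parse(bounds[0]);
        int last = bounds.Length == 2 ? int.Parse(bounds[1]) : first;

        if (first < 1 || last < first || last > pageCount)
            throw new ArgumentOutOfRangeException(nameof(spec), $"Range '{part}' is outside 1..{pageCount}");

        for (int page = first; page <= last; page++)
            pages.Add(page);
    }
    return pages;
}

List<int> selected = ParsePageRanges("1-2,4", pageCount: 5);
Console.WriteLine(string.Join(", ", selected)); // prints "1, 2, 4"
```

Validating the range against the page count up front keeps a bad specification from failing halfway through a chained workflow.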

Current Feature Set

Generation:

  • Automatic page flow with configurable page size, inch/centimeter-based page-size helpers, portrait/landscape orientation helpers, margins, inch/centimeter-based margin helpers, reusable Word-compatible PageMargins presets, optional full-page background color through PdfOptions.BackgroundColor, PdfDocument.Background(...), or PdfPageCompose.Background(...), optional fitted JPEG/PNG page background images through PdfOptions.PageBackgroundImage, PdfDocument.BackgroundImage(...), or PdfPageCompose.BackgroundImage(...), optional vector page background shapes through PdfOptions.PageBackgroundShapes, PdfDocument.BackgroundShape(...), PdfDocument.BackgroundRectangle(...), PdfDocument.BackgroundRoundedRectangle(...), PdfDocument.BackgroundEllipse(...), gradient/opacity-aware anchored band helpers such as PdfDocument.BackgroundTopBand(...), BackgroundBottomBand(...), BackgroundLeftBand(...), BackgroundRightBand(...), or matching PdfPageCompose methods, optional behind-content text watermarks through PdfOptions.TextWatermark, PdfDocument.Watermark(...), or PdfPageCompose.Watermark(...), optional behind-content JPEG/PNG image watermarks through PdfOptions.ImageWatermark, PdfDocument.ImageWatermark(...), or PdfPageCompose.ImageWatermark(...), optional page frames through PdfOptions.PageBorder, PdfDocument.PageBorder(...), or PdfPageCompose.PageBorder(...), and default text style through PdfOptions, document defaults, or page-scoped composition, including long rich paragraph and oversized list-item continuation across pages. Decorative page background and image watermark image draws are emitted as PDF artifacts as accessibility groundwork.
  • Headings with Word-like reusable PdfHeadingStyle / PdfHeadingStyles typography and rhythm defaults, spacing before/after with Word-like spacing-before suppression at fresh page/column starts, optional per-heading style/alignment/color overrides across direct, compose item/element, and row-column flows, and orphan prevention before following paragraphs; paragraphs; page breaks; hard line breaks in shared simple text wrapping for headings, simple list items, table cells, captions, and similar non-rich text; invisible Spacer(...) flow gaps for generic document rhythm without fake blank text; horizontal rules with reusable PdfHorizontalRuleStyle thickness, color, outer rhythm, and keep-with-next defaults across top-level, compose item/element, and row-column flows; panels with reusable PanelStyle box, uniform or side-specific borders, padding, alignment, color, keep-together, keep-with-next, and outer rhythm defaults; bounded composed panels through Panel(...) in top-level, compose item/element, and row-column flows for common paragraph, heading, list, simple-table, checklist-table, rule, spacer, bookmark, and nested-panel content; bullet and numbered lists with reusable PdfListStyle typography, indentation, marker gap, color, rhythm, keep-together, keep-with-next page flow, and rich PdfListItem runs plus per-item bookmark anchors through RichBullets(...) / RichNumbered(...) in top-level, compose item/element, and row-column flows; rows/columns with a built-in Word-like gutter, reusable PdfRowStyle gutters, keep-together and keep-with-next page flow, and outer rhythm plus column-local item groups, headings, lists, panels, spacers, and compact tables; and simple top-level tables. Flow-object spacing-before is treated as separation between visible blocks and is suppressed at fresh page/column starts across lists, panels, rules, images, shapes, drawings, rows, paragraphs, headings, and tables.
  • Rich paragraph runs with bold, italic, underline, strike, color, scoped standard PDF font family, scoped font size, scoped background/highlight fills, superscript/subscript baseline shifts, links with annotation contents metadata, explicit line breaks, left, center, right, and decimal paragraph tabs with optional dotted, hyphen, or underscore leaders, alignment, justification, proportional standard-font wrapping, configurable line height, left/right/first-line/hanging indents, spacing before/after with Word-like spacing-before suppression at fresh page/column starts, and Word-like keep-together, keep-with-next, and widow/orphan options for page-flow control.
  • Table styles and presets: table styling; default table styles through PdfOptions.DefaultTableStyle or PdfDocument.DefaultTableStyle(...); a report-friendly TableStyles.Light() default with built-in before/after flow rhythm; generic polished table presets through TableStyles.TechnicalDocument(), TableStyles.Compact(), and TableStyles.Report(); and initial Word-like table presets (TableNormal, TableGrid, TableGridLight, PlainTable1, GridTable1Light, ListTable1Light, plus Accent1-6 variants with Word default theme border, separator, and soft band colors for the light grid/list styles) with name-based resolution through TableStyles.FromWordTableStyle(...), canonical name normalization through TableStyles.GetCanonicalWordStyleName(...) / TryGetCanonicalWordStyleName(...), clean display names through TableStyles.CanonicalWordStyleNames, and accepted input aliases through TableStyles.SupportedWordStyleNames.
  • Table cell content and typography: rich PdfTableCell text runs with scoped color, bold/italic, underline/strike, font size, background/highlight, baseline, tabs, and links; captions; column and per-cell horizontal/vertical cell alignment; generic header/body/footer typography; and cell line-height controls.
  • Table borders, fills, and padding: row/header/footer separators; side-specific per-cell border overrides with independent side colors, widths, solid/dashed/dotted/dash-dot strokes, two-line borders, and diagonal-up/diagonal-down cell lines; body column fills; per-cell fills; per-cell data bars; per-cell vector icons; per-cell padding overrides; header-safe body row striping; symmetric and side-specific cell padding; and configurable cell spacing.
  • Table rows, sizing, and placement: configurable visual header/footer row counts with render-time bounds validation; optional repeated-header row count through PdfTableStyle.RepeatHeaderRowCount; table-wide and per-row minimum heights; table-wide and per-row row-break policies; table left indentation and max-width caps with left/center/right placement; table spacing before/after with Word-like spacing-before suppression at fresh page/column starts; keep-together, keep-with-next first-row preflight that honors configured column widths, and row-break page-flow controls; fixed/min/max column widths; relative column width weights; column-scoped style bounds validation for sizing/fills/horizontal and vertical alignment; and OfficeIMO.Drawing-backed auto-fit column sizing with token minimums.
  • Table spans and merged cells: initial PdfTableCell column spans, row spans, and rectangular merged cells with combined-box alignment; overlong row-span validation; header/footer boundary validation for row-spanned cells; row-spanned explicit cell fills/borders; explicit cell fill/data-bar/icon/border/padding/alignment coordinate bounds validation; explicit cell fill/data-bar/icon/border coordinates that skip row-span and column-span continuation slots; and row/header/footer separators, body-column background fills that skip merged-cell continuation columns, row/background fills, and default table border grids that skip row-spanned and rectangular merged-cell interiors.
  • Table links: cell-owned URI or named-destination links, and cell-owned named-destination anchors through PdfTableCell.NamedDestinationName / WithNamedDestination(...), including linked column/row-spanned cell annotations over the merged text frame in top-level and compose/row-column table flows.
  • Table layout and pagination: row height calculation; proportional standard-font wrapped cell text and captions; row-by-row pagination; oversized-row splitting with measured rich cell line heights; and repeated header rows.
  • Flow vector lines, rectangles, rounded rectangles, ellipses, polygons, paths, and reusable drawing scenes through shared OfficeIMO.Drawing descriptors, with solid fill, two-stop linear gradient fill, simple offset shadow, stroke, stroke width, dash style, line cap, line join, fill/stroke opacity, affine transforms, clipping paths, optional URI link annotations on generic shape/drawing blocks and vector convenience helpers, and reusable PdfDrawingStyle alignment, outer rhythm, and keep-with-next defaults.
  • Foreground page canvas through PdfDocument.Canvas(...) and PdfPageCompose.Canvas(...) for ordered fixed top-left-coordinate text, reusable styled text boxes through PdfCanvasTextBoxStyle with fill, border, padding, font, horizontal alignment, and vertical text anchoring, fixed-frame tables through PdfPageCanvas.Table(...), shared drawing shapes, and supported JPEG/PNG images that validate page bounds, support text-box/shape/image rotation around the declared frame center, keep rotated link annotations aligned to the visual frame, and do not consume flow space; canvas tables reuse PdfTableCell and PdfTableStyle for spans, fixed columns/rows, fills, padding, alignment, rich runs, links, and borders, giving slide-like exporters a shared absolute-positioning table primitive.
  • JPEG and simple PNG image placement, including Adam7 interlace normalization, packed 1/2/4-bit grayscale icons, 16-bit grayscale/grayscale-alpha/RGB/RGBA payload downsampling, 1/2/4/8-bit indexed-color palettes, palette transparency, grayscale/RGB tRNS transparency, grayscale-alpha/RGBA soft masks, reusable PdfImageStyle alignment, fit, clipping, outer rhythm, keep-with-next defaults, opt-in proportional scale-down for flow and table-cell image frames, first-party image metadata detection, shared OfficeImageFit stretch/contain/cover fitting, shared OfficeClipPath clipping through OfficeIMO.Drawing, and table-cell images through PdfTableCell.WithImages(...).
  • Header/footer literal text formats through PdfOptions, document-level PdfDocument.Header(...) / PdfDocument.Footer(...), or page-scoped PdfPageCompose.Header(...) / Footer(...), with visible page number tokens that continue across flows by default, configurable visible page-number starts through PageNumberStart(...), decimal/roman/alphabetic page-number styles through PageNumberStyle(...), left/center/right text zones through Zones(...), FirstPageZones(...), and EvenPagesZones(...), segment builders for composed header/footer text, simple header/footer images through Image(...), FirstPageImage(...), and EvenPagesImage(...) with optional alternate text, configurable fonts, sizes, text colors, and margin-relative offsets, plus section-local first-page and odd/even header/footer overrides for Word-like section, cover-page, and report flows.
  • Generated PDF file header version selection through PdfOptions.FileVersion, PdfOptions.SetFileVersion(...), and fluent PdfDocument.FileVersion(...), retaining PDF 1.4 as the default while allowing PDF 1.5, 1.6, or 1.7 headers for compatibility and compliance groundwork.
  • Metadata: title, author, subject, and keywords, with optional generated catalog XMP metadata through PdfOptions.IncludeXmpMetadata, plus PDF/A-2 and PDF/A-3 identification XMP through PdfAIdentification, PdfOptions.SetPdfAIdentification(...), and fluent PdfDocument.PdfAIdentification(...). PdfOptions.ConfigurePdfAGroundwork(...) and fluent PdfDocument.ConfigurePdfAGroundwork(...) set profile-specific PDF/A-2 or PDF/A-3 identification, PDF 1.7, built-in sRGB output intent, and the needed Unicode/tagged/language groundwork for u and a levels while keeping formal compliance profiles disabled. PdfUaIdentification, PdfOptions.SetPdfUaIdentification(...), and fluent PdfDocument.PdfUaIdentification(...) can emit PDF/UA-1 identification XMP; PdfOptions.ConfigurePdfUaGroundwork(...) and fluent PdfDocument.ConfigurePdfUaGroundwork(...) additionally set PDF 1.7, PDF/UA-1 identification, language, tagged catalog markers, standard-font ToUnicode maps, and DisplayDocTitle true viewer preferences together. PdfElectronicInvoiceMetadata, PdfOptions.SetElectronicInvoiceMetadata(...), and fluent PdfDocument.ElectronicInvoiceMetadata(...) can emit the Factur-X/ZUGFeRD XMP extension metadata primitive, including document type, XML file name, version, conformance level, and PDF/A extension schema declaration. E-invoice readiness checks recognize known conformance-level names such as MINIMUM, BASIC WL, BASIC, EN 16931, EXTENDED, XRECHNUNG, and EXTENDED-CTC-FR. These metadata primitives are required groundwork, not certification switches.
  • Generated output intents backed by caller-supplied ICC profile bytes through PdfOutputIntent, PdfOptions.OutputIntent, PdfOptions.SetOutputIntent(...), and fluent PdfDocument.OutputIntent(...); ICC inputs must pass basic header size, acsp signature, and RGB/GRAY/CMYK color-space checks. OfficeIMO also includes PdfIccProfiles.SrgbIec6196621, PdfOutputIntent.CreateSrgbIec6196621(), PdfOptions.SetSrgbOutputIntent(), and fluent PdfDocument.SrgbOutputIntent() for the built-in sRGB IEC61966-2.1 profile. PdfOutputIntentPolicy can declare known policy intent such as sRGB IEC61966-2.1 for readiness reporting without claiming PDF/A conformance.
  • Generated catalog document language through PdfOptions.Language, PdfOptions.SetLanguage(...), and fluent PdfDocument.Language(...).
  • Generated catalog page labels through PdfOptions.IncludePageLabels, PdfOptions.SetPageLabels(...), and fluent PdfDocument.PageLabels(...), using the configured PageNumberStart, PageNumberStyle, and optional prefix so viewer page navigation can match visible document numbering.
  • Generated catalog viewer preferences through PdfViewerPreferencesOptions, PdfOptions.ViewerPreferences, and fluent PdfDocument.ViewerPreferences(...) for common viewer polish such as document-title display, window fitting/centering, and optional toolbar/menu/window UI hiding. PDF/UA readiness reports whether DisplayDocTitle is explicitly true.
  • Generated tagged-PDF groundwork through PdfTaggedStructureMode.CatalogMarkers, PdfOptions.EnableTaggedPdfCatalogMarkers(), and fluent PdfDocument.TaggedPdfCatalogMarkers(), emitting /MarkInfo plus /StructTreeRoot. Generated top-level structure nodes are nested below a generated /Document structure element that carries /Lang when PdfOptions.Language is configured. The following also receive structure and ParentTree references: generated paragraphs; H1/H2/H3 headings; list labels and bodies nested below generated L/LI containers; table captions; table cell slices; generated Table/TR containers for table content; column-scope attributes on generated table header cells; span attributes on generated merged table cells; generated link annotations with /StructParent and /Link OBJR structure references; rich text links with visible /Link marked content plus annotation OBJR references; /ParentTreeNextKey; and alt-bearing generated image, shape, and drawing-scene figures. Decorative generated page chrome such as page backgrounds, background images, text/image watermarks, page borders, running header/footer text, horizontal flow rules, explicit PdfDrawingStyle.Decorative shape/drawing flow blocks, panel fills/borders, table fills/borders, row/column separators, and explicit cell borders is emitted as artifact marked content. This is limited accessibility groundwork and does not satisfy PDF/UA by itself.
  • Generated embedded/associated files through PdfEmbeddedFile, PdfOptions.AddEmbeddedFile(...), and fluent PdfDocument.AttachFile(...), emitting simple /EmbeddedFile streams with deterministic /Params /Size and /CheckSum metadata, /Filespec, /Names /EmbeddedFiles, and catalog /AF entries with explicit /AFRelationship values for general attachment workflows and PDF/A-3/Factur-X/ZUGFeRD groundwork without claiming formal e-invoice conformance. PdfOptions.AddFacturXInvoiceXml(...) and fluent PdfDocument.AttachFacturXInvoiceXml(...) add the canonical factur-x.xml CrossIndustryInvoice payload with application/xml, an invoice-appropriate associated-file relationship, and matching Factur-X/ZUGFeRD XMP extension metadata in one call. PdfOptions.ConfigureFacturXGroundwork(...) and fluent PdfDocument.ConfigureFacturXGroundwork(...) additionally set PDF 1.7, PDF/A-3b identification, built-in sRGB output intent, and standard-font ToUnicode groundwork while keeping formal compliance profiles disabled. E-invoice readiness checks now also look for a recognized CII GuidelineSpecifiedDocumentContextParameter profile identifier plus seller/buyer names, postal country identifiers, and electronic address scheme identifiers before the Mustang gate.
  • Compliance profile intent through PdfOptions.ComplianceProfile, PdfOptions.RequireCompliance(...), and fluent PdfDocument.Compliance(...) for planned PDF/A-2b/2u/2a, PDF/A-3b/3u/3a, PDF/UA-1, Factur-X, and ZUGFeRD output profiles; non-None profiles currently fail with clear diagnostics until the required profile-level file-version policy, built-in/approved output-intent policy, embedded-font, tagging, canonical e-invoice XML attachment, Factur-X/ZUGFeRD XMP extension metadata, and external validator lanes exist. PdfComplianceAnalyzer.Assess(...) can report planned-profile readiness as a PdfComplianceReadinessReport with stable requirement ids and Satisfied, Missing, or Unsupported states.
  • Compliance readiness from generated layout evidence: PdfDocument.AssessCompliance(...) uses generated layout evidence so embedded-font coverage can point to missing or invalid standard-font mappings for the actual document; PDF/A and e-invoice readiness can separate configured output-intent presence from declared profile-specific policy and external validator evidence; and PDF/A, PDF/UA, and e-invoice readiness can report the PDF 1.7 file-header prerequisite separately.
  • PDF/UA readiness through PdfDocument.AssessCompliance(...) can report the following separately from broader language/tagging/validator-backed conformance gaps: identification, document-title metadata, DisplayDocTitle true viewer preference, tagged page tab order, generated parent-tree next-key metadata, generated paragraph/heading structure references, generated list label/body structure references, generated list structure containers, generated table cell structure references, generated table structure containers, generated table header scope attributes, generated table span attributes, generated table caption structure references, generated link annotation structure references, generated rich link text structure references, generated-image alternate text, generated-image structure references, generated-drawing alternate text, generated-drawing structure references, aggregate generated visual alternate-text readiness, decorative-image artifact markers, decorative-drawing artifact markers, decorative running page text artifact markers, decorative flow rule artifact markers, and decorative layout artifact markers.
  • E-invoice readiness through PdfDocument.AssessCompliance(...) can distinguish a real factur-x.xml CrossIndustryInvoice attachment from generic XML files, where the real attachment has an invoice-appropriate Alternative, Data, or Source relationship, recognized CII profile context, document header essentials, CII document-type-code essentials, CII date-format essentials, trade transaction essentials, seller/buyer party identification essentials, country-code list essentials, electronic address essentials, party tax registration essentials, party tax registration scheme metadata, tax party identifiers required/forbidden by category, line item quantity/unit-code essentials, unit-code list essentials, line pricing essentials, line amount-consistency essentials, line tax VAT type-code essentials, settlement/tax summary essentials, currency code essentials, tax breakdown VAT type-code essentials, tax category-code/O line-allowance-charge breakdown-exclusivity essentials, tax category-rate essentials, tax category-amount essentials, tax exemption-reason essentials, tax category/rate consistency, tax total/category-rate adjusted-basis consistency, payment instruction essentials, payment means-code essentials, payment account-format essentials, payment terms essentials, allowance/charge total-sum and paid/rounding-aware amount consistency, deterministic embedded-file /Size and /CheckSum parameters, and matching XMP extension metadata.
  • Optional reusable theme bundle through PdfTheme for default text, heading, list, panel, paragraph, horizontal rule, image, drawing, row, and table styles, applicable through PdfOptions.ApplyTheme(...), PdfDocument.Theme(...), or PdfPageCompose.Theme(...); built-in WordLike, TechnicalDocument, Compact, and Report profiles provide generic opt-in document rhythm and reuse the matching core table presets without introducing invoice/report-specific engine APIs.
  • Built-in Helvetica body/header/footer defaults for readable proportional no-options documents, with optional reusable default text style overrides through PdfTextStyle or fluent PdfDocument.DefaultTextStyle(...) / PdfPageCompose.DefaultTextStyle(...) configuration.
  • Optional default heading styles for Word-like H1/H2/H3 typography, color, spacing before/after with fresh page/column spacing-before suppression, and keep-with-next behavior through PdfOptions.DefaultHeadingStyles, PdfOptions.SetDefaultHeadingStyle(...), PdfDocument.DefaultHeadingStyle(...), PdfPageCompose.DefaultHeadingStyle(...), or per-heading style: overrides; compose item/element and row-column heading helpers also expose explicit align and color overloads for local visual control without report-specific APIs.
  • Optional default list style for Word-like bullet and numbered list font size, line height, left indent, marker gap, color, spacing before/after with fresh page/column spacing-before suppression, inter-item rhythm, keep-together, and keep-with-next page flow through PdfOptions.DefaultListStyle, PdfDocument.DefaultListStyle(...), PdfPageCompose.DefaultListStyle(...), or per-list style: overrides.
  • Optional default panel style for Word-like boxed paragraph appearance, including uniform or side-specific PdfPanelBorder sides, rhythm with fresh page/column spacing-before suppression, keep-together, and keep-with-next page flow through PdfOptions.DefaultPanelStyle, PdfDocument.DefaultPanelStyle(...), PdfPageCompose.DefaultPanelStyle(...), Panel(...), PanelParagraph(...), or per-panel style: overrides.
  • Optional default horizontal rule style for Word-like separators, rhythm with fresh page/column spacing-before suppression, and keep-with-next page flow through PdfOptions.DefaultHorizontalRuleStyle, PdfDocument.DefaultHorizontalRuleStyle(...), PdfPageCompose.DefaultHorizontalRuleStyle(...), or per-rule style: overrides.
  • Optional default image style for Word-like image placement, fitting, clipping, rhythm with fresh page/column spacing-before suppression, and keep-with-next page flow through PdfOptions.DefaultImageStyle, PdfDocument.DefaultImageStyle(...), PdfPageCompose.DefaultImageStyle(...), or per-image style: overrides.
  • Optional default drawing style for Word-like shape and drawing-scene placement, rhythm with fresh page/column spacing-before suppression, keep-with-next page flow, AlternativeText for meaningful figure tagging, and Decorative for artifact-tagged decorative flow drawings through PdfOptions.DefaultDrawingStyle, PdfDocument.DefaultDrawingStyle(...), PdfPageCompose.DefaultDrawingStyle(...), or per-shape/per-drawing style: overrides.
  • Optional default row style for Word-like column gutters, optional vertical column separators through PdfRowStyle.ColumnSeparatorColor / ColumnSeparatorWidth or per-row ColumnSeparator(...), row-level spacing with fresh page/column spacing-before suppression, keep-together, and keep-with-next page flow through PdfOptions.DefaultRowStyle, PdfDocument.DefaultRowStyle(...), PdfPageCompose.DefaultRowStyle(...), PdfTheme.RowStyle, or per-row Style(...) overrides; multi-column rows use a built-in gutter unless callers explicitly set Gap(0) or PdfRowStyle { Gap = 0 }.
  • Optional default paragraph style for Word-like reusable typography and page-flow settings when individual paragraphs do not provide their own style, either through PdfOptions.DefaultParagraphStyle or the fluent PdfDocument.DefaultParagraphStyle(...) setter.
  • Page/section-scoped flow can be created through PdfDocument.Page(...), PdfDocument.Section(...), PdfDocument.Compose(...Page...), or PdfDocument.Compose(...Section...); page content can add direct Item(...) groups, nested element groups, Spacer(...) rhythm blocks, and PageBreak() page transitions before, between, or after columns/rows; and scoped defaults can set page background color, fitted background images, vector background shapes and anchored bands, behind-content text and image watermarks, page frames, plus heading, list, panel, horizontal rule, image, drawing, row, paragraph, and table styles through PdfPageCompose.Background(...), PdfPageCompose.BackgroundImage(...), PdfPageCompose.BackgroundShape(...), PdfPageCompose.BackgroundRectangle(...), PdfPageCompose.BackgroundRoundedRectangle(...), PdfPageCompose.BackgroundEllipse(...), PdfPageCompose.BackgroundTopBand(...), PdfPageCompose.BackgroundBottomBand(...), PdfPageCompose.BackgroundLeftBand(...), PdfPageCompose.BackgroundRightBand(...), PdfPageCompose.Watermark(...), PdfPageCompose.ImageWatermark(...), PdfPageCompose.PageBorder(...), PdfPageCompose.DefaultHeadingStyle(...), PdfPageCompose.DefaultListStyle(...), PdfPageCompose.DefaultPanelStyle(...), PdfPageCompose.DefaultHorizontalRuleStyle(...), PdfPageCompose.DefaultImageStyle(...), PdfPageCompose.DefaultDrawingStyle(...), PdfPageCompose.DefaultRowStyle(...), PdfPageCompose.DefaultParagraphStyle(...), and PdfPageCompose.DefaultTableStyle(...).
  • Optional PDF outline generation from H1/H2/H3 headings plus generic Bookmark(...) anchors, LinkToBookmark(...) text links, and bookmark-targeted heading links that emit simple PDF named destinations and GoTo annotations.
  • Initial generated AcroForm text fields, check boxes, scalar choice fields, multi-select choice fields, and vertical radio button groups through PdfDocument.TextField(...), PdfDocument.CheckBox(...), PdfDocument.ChoiceField(...), PdfDocument.MultiSelectChoiceField(...), and PdfDocument.RadioButtonGroup(...), plus table-cell check boxes through PdfTableCell.WithCheckBoxes(...) and table-cell text/scalar choice fields through PdfTableCell.WithFormFields(...), with top-level, compose item/element, row/column, and table-cell flow placement, simple visible normal appearances, optional PdfFormFieldStyle background/border/text/mark colors plus /TU alternate names and /TM mapping names, catalog /AcroForm registration with /NeedAppearances false, and immediate compatibility with PdfInspector, PdfFormFiller.FillFields(...), and FillAndFlattenFields(...).
  • First-party color interop with OfficeIMO.Drawing.OfficeColor.
  • PDF RGB colors reject non-finite or out-of-range components before they can be written as invalid PDF color operators.
  • Output through ToBytes, Save to a path or stream, and SaveAsync to a path or stream.
  • Optional generated page content stream Flate compression through PdfOptions.CompressContentStreams, while keeping uncompressed streams as the default for readable generated-output diagnostics.
  • Optional generated PDF file header version selection through PdfOptions.FileVersion, PdfOptions.SetFileVersion(...), or fluent PdfDocument.FileVersion(...); PDF 1.4 remains the default, while PDF 1.7 can now be emitted directly or through the PDF/A, PDF/UA, and Factur-X/ZUGFeRD groundwork helpers without claiming formal conformance.
  • Optional generated standard-font /ToUnicode CMaps through PdfOptions.IncludeStandardFontToUnicodeMaps, improving Unicode text extraction for WinAnsi text and laying groundwork for later PDF/A-2u and PDF/UA work without claiming formal archival or accessibility compliance yet.
  • Initial full-file TrueType embedding for generated standard-font slots through PdfOptions.EmbedStandardFont(...) or fluent PdfDocument.EmbedStandardFont(...); embedded fonts emit Type0/CIDFontType2 font dictionaries with /Encoding /Identity-H, /CIDToGIDMap /Identity, /FontDescriptor, compressed /FontFile2 streams with preserved /Length1, width tables, and glyph-aware /ToUnicode resources. PdfOptions.UseFontFamily(...) and fluent PdfDocument.UseFontFamily(...) let callers supply a named TrueType family for generated document text without manually mapping internal standard-font slots. Parseable embedded TrueType mappings now feed Unicode text encoding, text extraction, measurement, wrapping, alignment, table sizing, headers/footers, forms, and watermark placement so generated layout follows supplied font metrics instead of only writing the font file. PdfOptions.CompressEmbeddedFonts = false remains available for diagnostics while broader OpenType/CFF support, subsetting, shaping/ligatures, glyph fallback, and formal PDF/A validation remain roadmap work.
  • Optional generated catalog XMP metadata through PdfOptions.IncludeXmpMetadata, synchronized from title, author, subject, keywords, and producer Info metadata as a PDF/A/PDF/UA groundwork step without claiming a formal compliance profile yet.
  • Optional generated catalog /OutputIntents backed by a caller-supplied RGB, GRAY, or CMYK ICC profile through PdfOutputIntent, PdfOptions.SetOutputIntent(...), or fluent PdfDocument.OutputIntent(...); this checks the ICC header size, acsp signature, and color-space marker, and PdfOutputIntentPolicy.SrgbIec6196621 lets readiness checks validate a declared sRGB policy, but it remains PDF/A groundwork and does not by itself claim archival compliance.
  • Optional generated catalog /Lang through PdfOptions.Language, PdfOptions.SetLanguage(...), or fluent PdfDocument.Language(...), with inspector readback for accessibility and PDF/A-a/PDF/UA groundwork without claiming formal accessibility conformance yet. PdfDocument.AssessCompliance(...) can also report whether PdfDocument.Meta(title: ...) will provide non-empty XMP dc:title metadata for PDF/UA readiness.
  • Optional generated tagged-PDF groundwork through PdfTaggedStructureMode.CatalogMarkers, PdfOptions.SetTaggedStructureMode(...), PdfOptions.EnableTaggedPdfCatalogMarkers(), fluent PdfDocument.TaggedPdfCatalogMarkers(), or the broader ConfigurePdfUaGroundwork(...) helper; generated output includes /MarkInfo << /Marked true >> and /StructTreeRoot, generated top-level structure nodes are nested below a generated /Document structure element with /Lang metadata when a document language is configured, and generated paragraph/H1/H2/H3 heading slices, list labels and bodies nested below generated L/LI containers, table captions and table cell slices nested below generated Table/TR containers, generated table header cells with /Scope /Column attributes, generated merged table cells with /ColSpan and /RowSpan attributes, generated link annotations with /StructParent and /Link OBJR references, rich text links with visible /Link marked content plus annotation OBJR references, /ParentTreeNextKey, plus alt-bearing generated image, shape, and drawing-scene figures also receive structure and /ParentTree references. Decorative generated page backgrounds, watermarks, borders, running header/footer text, horizontal flow rules, explicit decorative drawing flow blocks, panel chrome, table fills/borders, row/column separators, and explicit cell borders are emitted as artifact marked content. PdfComplianceAnalyzer reports this separately from the still-unsupported full structure tree, role mapping, reading order, and complete marked-content reference coverage.
  • Optional generated embedded/associated files through PdfEmbeddedFile, PdfOptions.AddEmbeddedFile(...), and fluent PdfDocument.AttachFile(...); generated files include simple attachment streams with deterministic /Params /Size and /CheckSum metadata, file specifications, an /EmbeddedFiles name tree, and catalog /AF references with /AFRelationship metadata, which is useful as a general PDF capability and as PDF/A-3/e-invoice groundwork. PdfOptions.AddFacturXInvoiceXml(...) and fluent PdfDocument.AttachFacturXInvoiceXml(...) provide the safer canonical path for factur-x.xml payloads by setting the filename, MIME type, relationship, description, and matching XMP metadata together. PdfOptions.ConfigureFacturXGroundwork(...) and fluent PdfDocument.ConfigureFacturXGroundwork(...) build on that path by also setting the common PDF/A-3 e-invoice prerequisites: PDF 1.7 header, PDF/A-3b identification XMP, built-in sRGB output intent, and standard-font ToUnicode maps. Before validator-backed profile support exists, readiness checks can recognize: a canonical factur-x.xml associated file with XML MIME metadata; an invoice-appropriate Alternative, Data, or Source relationship; parseable UN/CEFACT CrossIndustryInvoice XML; a recognized GuidelineSpecifiedDocumentContextParameter profile identifier; CII document ID/type/date; document type-code list value; seller/buyer/settlement transaction markers; seller/buyer party identity, country-code list value, electronic address, and tax registration markers; tax registration scheme metadata; tax party identifiers required/forbidden by category; line item quantity/unit-code, unit-code list value, pricing, line amount-consistency, and line tax VAT type-code and O rate-absence markers; settlement/tax summary markers; currency code; tax breakdown with VAT type-code, category-code/O line-allowance-charge breakdown-exclusivity, category-rate/O header-line-allowance-charge rate-absence, category-amount zero-amount/standard-rate-math, tax-exemption-reason/VATEX-EU-O semantics, category line/allowance-charge consistency, and total/category-rate adjusted-basis-consistency markers; payment instruction, payment means-code, payment account-format, terms, and allowance/charge reason markers; allowance/charge total-sum and paid/rounding-aware amount consistency markers; and generated embedded-file stream parameters.
  • Optional Factur-X/ZUGFeRD XMP extension metadata through PdfElectronicInvoiceMetadata, PdfOptions.SetElectronicInvoiceMetadata(...), and fluent PdfDocument.ElectronicInvoiceMetadata(...); generated metadata includes the four e-invoice XMP fields and PDF/A extension schema declaration, and readiness checks reject unknown conformance-level names before the validator-backed profile switch.
  • Strict compliance profile guardrails through PdfOptions.ComplianceProfile: PDF/A, PDF/UA, Factur-X, and ZUGFeRD profile names are available as forward-compatible intent, but generated output refuses those profiles until OfficeIMO.Pdf can satisfy and externally validate the relevant ISO/EN requirements.
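
One of the bullets above notes that PDF RGB colors reject non-finite or out-of-range components before they reach a color operator. The following is only a minimal sketch of that kind of guard, not the package's implementation. It assumes components on the 0 to 1 scale used by the PDF rg operator, and RgbOperatorSketch and ToFillOperator are hypothetical names that do not exist in OfficeIMO.Pdf.

```csharp
using System;
using System.Globalization;

// Hypothetical helper for illustration; not part of OfficeIMO.Pdf.
public static class RgbOperatorSketch
{
    // Builds a PDF non-stroking RGB operator such as "1 0.5 0 rg".
    // Throws when any component is NaN, infinite, or outside [0, 1].
    public static string ToFillOperator(double r, double g, double b)
    {
        Validate(r, nameof(r));
        Validate(g, nameof(g));
        Validate(b, nameof(b));

        // Invariant culture keeps '.' as the decimal separator on every machine.
        return string.Format(
            CultureInfo.InvariantCulture,
            "{0:0.###} {1:0.###} {2:0.###} rg", r, g, b);
    }

    private static void Validate(double value, string name)
    {
        if (double.IsNaN(value) || double.IsInfinity(value))
            throw new ArgumentOutOfRangeException(name, "Color component must be finite.");
        if (value < 0.0 || value > 1.0)
            throw new ArgumentOutOfRangeException(name, "Color component must be between 0 and 1.");
    }
}
```

Validating before formatting means a bad value surfaces as an exception at the call site instead of as an unreadable content stream later.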

Reading:

  • Load from bytes, path, or stream into PdfReadDocument.
  • Enumerate pages, metadata, and document outlines/bookmarks.
  • Probe PDF header version, encryption markers, digital signature markers, form-field markers, annotation markers, outline/bookmark markers, catalog view-setting markers, page-label markers, catalog name-tree markers, named-destination markers, open-action markers, viewer-preference markers, tagged-structure markers, XMP metadata markers, catalog URI markers, output-intent markers, embedded-file markers, optional-content/layer markers, and active-content markers without full parsing through PdfInspector.Probe.
  • Validate or preflight PDFs through PdfValidator.Validate / PdfInspector.Preflight to get wrapper-friendly IsValid, CanRead, CanExtractText, CanExtractImages, CanReadLogicalObjects, CanRewrite, CanManipulatePages, CanFillSimpleFormFields, CanFlattenSimpleFormFields, CanFillAndFlattenSimpleFormFields, Can(PdfPreflightCapability), GetCapabilityDiagnostics(PdfPreflightCapability), parsed DocumentInfo, structured ReadBlockers and RewriteBlockers, HasReadBlocker(...) / HasRewriteBlocker(...) helpers, and diagnostics before invoking read or manipulation commands; unsupported page content stream filters are reported as read blockers so wrappers can explain why text extraction and logical object readback are not available for a real-world PDF, image extraction can still be allowed when document inspection succeeded, and simple AcroForm fill/flatten gates are exposed separately from generic page-rewrite blockers.
  • Inspect page count, selected source page ranges, page sizes, orientation, inherited page rotation, catalog page mode/layout/version/language values, simple page-label rules, simple document open-action targets, simple viewer preference entries, simple AcroForm /NeedAppearances, /SigFlags with named signatures-exist/append-only helpers, and /DA, field names/types/kinds/common flags/text max lengths/default appearance strings/text alignment/choice options/scalar or array values/selected options/field page numbers/field-local widget page lookups/widget field names, geometry, and named annotation flags, form field name/kind/page-number lookup helpers, document-level and page-level form widget list/name/page-number lookup helpers, simple page URI and named-destination link annotation summaries, distinct document-level link URI and internal destination targets, document-level page-aware link lists, named destination names/targets, and per-page link annotations with contents metadata through PdfInspector; InspectPageRanges(...) preserves caller range order and overlaps while narrowing page labels, page-resolved outlines, named destinations, open actions, AcroForm fields, and form widgets to selected source pages.
  • Extract document text, selected page-range text, and page-by-page text from bytes, paths, or streams; helpers can write one UTF-8 text result, or per-page text files, for wrapper pipelines.
  • Extract logical Markdown from bytes, paths, or streams through PdfTextExtractor.ExtractMarkdown(...), ExtractMarkdownByPage(...), ExtractMarkdownByPageRanges(...), and ExtractMarkdownByPageRangesAsDocument(...), including UTF-8 .md path/stream output helpers and deterministic per-page Markdown files for wrapper pipelines while reusing the PdfLogicalDocument model.
  • Extract text spans with positions.
  • Build an initial logical read model through PdfLogicalDocument.Load(...) / From(...), exposing logical pages, source-page lookup helpers through PagesBySourcePageNumber, HasSourcePage(...), and GetPages(...), document-level typed collections for text blocks, headings, paragraphs, list items, tables, and images, generic Elements, ElementsByKind, ElementsByPageNumber, HasElementKind(...), and GetElements(...) helpers on documents and pages, URI/named-destination link annotation objects with document-level URI/destination lookup, page-level AcroForm widget objects, metadata, catalog view settings, outlines/bookmarks, page-label rules, named destinations, open actions, viewer preferences, AcroForm /NeedAppearances, /SigFlags with named signatures-exist/append-only helpers, and /DA, and simple AcroForm fields with typed field-kind/common-flag helpers, text max length, inherited AcroForm/field-tree default appearance strings, text alignment, choice options, scalar or array values, selected options, distinct field page numbers, field-local widget page lookups, named/kind/page lookup, and document/page-level widget lookup helpers by field name or page number so wrappers can start from one stable object surface instead of stitching together low-level extraction helpers; LoadPageRanges(...) and FromPageRanges(...) return logical objects for selected source page ranges while preserving caller order and overlaps, and range-based logical loads now expose only page labels, page-resolved outlines, named destinations, open actions, AcroForm fields, and form widgets represented on selected source pages. PdfLogicalDocument.ToMarkdown(...), PdfLogicalPage.ToMarkdown(...), and the PdfTextExtractor.ExtractMarkdown... helpers render that same logical model as wrapper-friendly Markdown for headings, paragraphs, lists, detected tables, images, and optional link/form annotations without adding a second structure model.
  • Extract page image XObjects from bytes, paths, streams, or parsed documents with PdfImageExtractor; ExtractImagesByPageRanges(..., PdfPageRange...) selects reusable page-range lists for wrapper pipelines, JPEG images are returned as JPEG files and simple PNG-predictor Flate images as PNG files, compatible grayscale/RGB Flate images with grayscale /SMask alpha are returned as gray-alpha/RGBA PNGs, and helpers can write extracted images to deterministic page-numbered files.
  • Extract embedded file attachments from bytes, paths, streams, or parsed documents with PdfAttachmentExtractor and PdfReadDocument.ExtractAttachments(), returning name-tree keys, file names, Unicode file names, descriptions, MIME types, /AFRelationship, object numbers, stream filters, and decoded bytes; helper overloads can write attachments to sanitized deterministic output files.
  • Heuristic column-aware text extraction and simple structured extraction; PdfTextExtractor exposes layout-option overloads for bytes, paths, streams, page-range text/structured/heading/list-item/paragraph/table extraction with PdfPageRange lists, byte/path/stream whole-document text output to UTF-8 paths or caller-owned streams, and page-file output, plus structured-by-page, heading-by-page, list-item-by-page, paragraph-by-page, and table-by-page extraction that preserves detected lines, heuristic headings, heuristic paragraph groups, list item marker/level hints, dot/hyphen/underscore leader rows with decimal/currency value punctuation, simple table geometry, and selected source page numbers so wrappers can request readback without dropping to PdfReadDocument. Byte-, path-, and stream-based text/table extraction can also write deterministic source-page-0001.txt and source-page-0001-table-0001.csv files for all pages or selected page ranges, including option-aware selected text page output and the two-page line-item statement fixture's selected table output, with CSV escaping for table output.
  • Decode common simple streams used by many PDFs, including uncompressed, Flate, ASCIIHex, ASCII85, RunLength, and LZW paths.
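
The extraction bullets above mention deterministic source-page-0001.txt and source-page-0001-table-0001.csv output names and CSV escaping for table output. The sketch below shows one way such names and escaped rows could be produced. It assumes four-digit zero padding (taken from the file names quoted above) and RFC 4180 style quoting, which the source does not confirm as the package's exact rule. ExtractionOutputSketch and its members are hypothetical and not part of OfficeIMO.Pdf.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Linq;

// Hypothetical helper for illustration; not part of OfficeIMO.Pdf.
public static class ExtractionOutputSketch
{
    // "source-page-0001.txt" for source page 1.
    public static string TextFileName(int sourcePage) =>
        string.Format(CultureInfo.InvariantCulture, "source-page-{0:D4}.txt", sourcePage);

    // "source-page-0001-table-0001.csv" for the first table on source page 1.
    public static string TableFileName(int sourcePage, int tableIndex) =>
        string.Format(CultureInfo.InvariantCulture,
            "source-page-{0:D4}-table-{1:D4}.csv", sourcePage, tableIndex);

    // Assumed RFC 4180 style: quote fields containing a comma, quote,
    // or line break, and double any embedded quotes.
    public static string EscapeCsvField(string field)
    {
        bool needsQuotes = field.IndexOfAny(new[] { ',', '"', '\r', '\n' }) >= 0;
        return needsQuotes ? "\"" + field.Replace("\"", "\"\"") + "\"" : field;
    }

    // Joins already-extracted cell texts into one CSV line.
    public static string ToCsvRow(IEnumerable<string> cells) =>
        string.Join(",", cells.Select(EscapeCsvField));
}
```

Zero-padded names sort the same way lexically and numerically, which is what makes per-page output predictable for wrapper pipelines.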

Manipulation:

  • Merge parser-supported PDFs from bytes, streams, or paths into one new PDF with PdfMerger, including output stream helpers and enumerable file-list output to paths or streams for wrapper pipelines; realistic statement-style split/merge readback is guarded through PdfLogicalDocument so wrappers can verify useful content survives the page pipeline.
  • Extract selected pages, one inclusive page range, or multiple inclusive page ranges from bytes, paths, or streams into a new PDF with PdfPageExtractor, including repeated selected-page/range cloning plus byte-returning path helpers and output stream helpers for byte, stream, and path inputs; statement-style extract/reorder readback is guarded through the logical model.
  • Import selected, repeated, ranged, range-list, or all pages from one parser-supported PDF before, after, or inside another with PdfPageImporter, including byte-array, path, stream, and output helpers for wrapper pipelines. Insert helpers keep the target document as the primary catalog/metadata source even when source pages are inserted at page 1.
  • Split a PDF from bytes, paths, or streams into single-page PDFs with PdfPageExtractor.SplitPages, or into generic inclusive PdfPageRange chunks with PdfPageExtractor.SplitPageRanges, including deterministic split-to-directory file output and logical readback guards for the two-page line-item statement fixture; PdfPageRange.Parse(...), TryParse(...), ParseMany("1-3,5"), and TryParseMany(...) provide one shared wrapper-friendly range grammar.
  • Duplicate selected pages or inclusive page ranges/range lists, move selected pages or inclusive page ranges/range lists, delete selected pages or inclusive page ranges/range lists, reorder all pages from explicit page numbers or PdfPageRange lists, and rotate selected/all pages or inclusive page ranges/range lists from bytes, paths, or streams with PdfPageEditor, including byte-returning path helpers and output stream helpers for byte, stream, and path inputs.
  • Update or replace document metadata from bytes, paths, or streams with PdfMetadataEditor, including byte-returning path helpers and output stream helpers for byte, stream, and path inputs.
  • Add simple text/image stamps and text/image watermarks from bytes, paths, or streams with PdfStamper, including byte-returning path helpers plus output stream helpers for byte, stream, and path PDF inputs.
  • Fill simple AcroForm field values from bytes, paths, or streams with PdfFormFiller.FillFields(...), using fully qualified field names and byte-returning/path/output-stream helpers; current support updates text/string-style values, choice values supplied as export values or /Opt display text when available, multi-select choice arrays through PdfFormFieldValue.FromValues(...), and button name values, stores choice export values, switches only the matching radio child widget appearance state on, generates simple text/choice-widget normal appearance streams plus simple button-widget Off/selected appearance states for widgets with /Rect, marks /NeedAppearances true, and rejects signed or active-content PDFs.
  • Flatten simple text-widget, choice-widget, and button-widget AcroForms from bytes, paths, or streams with PdfFormFiller.FlattenFields(...), or update and flatten in one pass with FillAndFlattenFields(...), including byte-returning/path/output-stream helpers; current support paints text appearances, paints choice option display text from /Opt when available for scalar or array selected values, paints simple button-widget normal appearance states into page content, generates minimal button appearances when needed, removes those widget annotations, removes the AcroForm tree, and rejects signed or active-content PDFs.
  • Rewrite-style manipulation preserves simple direct catalog /PageMode, /PageLayout, /Version, /Lang, simple direct /PageLabels number trees, simple outline trees including simple GoTo action outline entries whose destinations point only at copied pages, direct /Dests dictionaries, simple /Names /Dests name trees, destination-array and simple GoTo dictionary /OpenAction entries, simple /ViewerPreferences dictionaries, simple catalog /Metadata XMP XML streams, simple catalog /URI base dictionaries, simple /OutputIntents metadata graphs, simple /Names /EmbeddedFiles attachment trees, simple catalog /AF associated-file arrays, and simple /OCProperties optional-content metadata, while pruning stale internal bookmark links whose named destinations no longer survive the selected pages. Copied-page label reindexing follows the trailer-root page tree, not stale catalog objects left behind by earlier revisions.
  • The current manipulation path copies reachable page object graphs and preserves simple image streams, selected-page URI link annotations, and internal named-destination link annotations with contents metadata across extraction, split, duplicate, move, delete, reorder, rotate, metadata rewrite, merge, and stamp flows when their targets remain reachable, but it is not yet a full arbitrary-PDF editing engine.
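
The manipulation bullets above refer to a shared range grammar such as ParseMany("1-3,5") and to selections that keep caller order and allow repeats. As a rough illustration of that grammar, and not the library's PdfPageRange implementation, the sketch below parses a comma-separated list of 1-based inclusive ranges and expands it without sorting or de-duplicating. PageRangeSketch is a hypothetical name, and rejecting descending ranges such as "5-3" is an assumption.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;

// Hypothetical helper for illustration; not the library's PdfPageRange.
public static class PageRangeSketch
{
    // Parses "1-3,5" into inclusive (First, Last) pairs in caller order.
    public static List<(int First, int Last)> ParseMany(string text)
    {
        if (string.IsNullOrWhiteSpace(text))
            throw new FormatException("Range list is empty.");

        var ranges = new List<(int First, int Last)>();
        foreach (string part in text.Split(','))
        {
            string[] bounds = part.Split('-');
            if (bounds.Length > 2)
                throw new FormatException($"Invalid range '{part}'.");

            int first = ParsePage(bounds[0]);
            int last = bounds.Length == 2 ? ParsePage(bounds[1]) : first;
            if (last < first) // assumption: descending ranges are rejected
                throw new FormatException($"Range '{part}' ends before it starts.");

            ranges.Add((first, last));
        }
        return ranges;
    }

    // Expands ranges to page numbers, keeping order and repeated pages.
    public static List<int> Expand(IEnumerable<(int First, int Last)> ranges)
    {
        var pages = new List<int>();
        foreach (var (first, last) in ranges)
            for (int page = first; page <= last; page++)
                pages.Add(page);
        return pages;
    }

    private static int ParsePage(string token)
    {
        // NumberStyles.None accepts digits only, so signs and separators fail.
        if (!int.TryParse(token.Trim(), NumberStyles.None, CultureInfo.InvariantCulture, out int page) || page < 1)
            throw new FormatException($"Invalid page number '{token}'.");
        return page;
    }
}
```

With this sketch, "3-4,1,3" expands to pages 3, 4, 1, 3. Page 3 appears twice, matching the caller-order cloning behavior described for repeated or overlapping ranges.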

Quality Gates

The package now has tests that protect the dependency-free promise and start guarding visual quality:

  • PackageDependencyGuardrailTests.DependencyLightProjects_HaveNoPackageReferences fails if dependency-light first-party PDF packages (OfficeIMO.Drawing, OfficeIMO.Pdf, OfficeIMO.Word.Pdf, OfficeIMO.Excel.Pdf, or OfficeIMO.Markdown.Pdf) gain a runtime PackageReference.
  • PdfDocumentVisualQualityTests checks natural proportional-font word spacing, proportional-font alignment for simple text blocks and headers/footers, mixed Word-like flow rhythm across headings, paragraphs, invisible spacers, panels, lists, tables, images, shapes, and row columns, no-cramped-baseline, same-baseline text-collision, and ambiguous-run-gap guards, row/column text-frame bounds with explicit gutter clearance and baseline rhythm, generic line-item table rhythm and the two-page line-item statement fixture without template APIs, heading wrapping with proportional wide/narrow glyph metrics in top-level and row/column flows, bullet/numbered-list wrapping with proportional wide/narrow glyph metrics in top-level and row/column flows, table-cell wrapping including proportional wide/narrow glyph metrics in top-level and row/column flows plus long unspaced token breaks, currency/percent/accounting-style numeric alignment plus explicit per-cell horizontal/vertical alignment in top-level and row/column table flows, header-relative body row striping, fixed/relative/min/max/content-aware table column widths plus table max-width, left-indent, column-span placement, row-span placement, header/footer row-count bounds, table keep-with-next preflight diagnostics for invalid table role/span models and column-scoped style bounds including horizontal alignment, rectangular merged-cell fill/border/link/alignment geometry, explicit cell fill/border/padding/alignment coordinate bounds, row-spanned separator gaps, row-spanned and rectangular merged-cell default border gaps, row-spanned and column-spanned background-fill gaps, ignored explicit fill/border row-span and column-span continuation-slot coordinates, and linked merged-cell annotation rectangles in top-level and row/column table flows, table-cell link annotation output in top-level and row/column table flows, table keep-together and row-break page-flow behavior, and long-table pagination using rendered PDF text positions.
  • Justified paragraph checks verify that wrapped lines expand inter-word spacing, final lines and explicit line-break lines keep natural spacing, and text remains extractable.
  • Standard font handling uses shared validation for document options, compose default text style, rich text runs, stamp options, writer style selection, metric helpers, PDF base-font name conversion, WinAnsi text encoding, and PdfStandardFontMapper office-family alias mapping so invalid enum values cannot silently fall back to another font, unsupported generated/stamped characters cannot silently render as ?, and raw control characters cannot be emitted as invisible PDF text bytes; common Word/Excel families such as Aptos, Calibri, Segoe UI, Arial, Verdana, Times New Roman, Georgia, Cambria, Consolas, and Courier New map consistently to dependency-free Helvetica, Times, or Courier standard PDF families. Helvetica and Times family generated layout, text span readback, and text stamp/watermark placement use built-in glyph-width tables, including common WinAnsi punctuation and accented Latin letters, instead of average character widths, while parseable embedded TrueType mappings and named UseFontFamily(...) registrations override generated layout metrics for their assigned generated font family slots.
  • Page font resources are emitted only for fonts actually used by visible page content, including header/footer fonts only when headers, footers, or page numbers are enabled.
  • Page setup rejects invalid intrinsic page sizes and margins at fluent assignment time, while page options report clear layout errors for default/header/footer font enum values, default/header/footer font sizes, header/footer alignment, header/footer placement, and impossible content frames.
  • PdfDocument.Create(options) snapshots caller-provided options so later caller mutations cannot change document rendering.
  • Reusable themes apply default text, heading, list, panel, horizontal rule, image, drawing, paragraph, row, and table styles at options, document, or page scope, snapshot caller-provided style objects before rendering, and include rendered mixed-flow gates for the built-in PdfTheme.WordLike() and report-oriented document defaults.
  • Reusable and fluent default text styles apply font, font size, and color to following page-flow content, snapshot caller-provided style objects, and reject invalid configuration delegates, font sizes, or standard font values before rendering.
  • Default paragraph styles are snapshotted on assignment, fluent document configuration, and options cloning, apply to top-level and row/column paragraphs that do not provide their own style, and are bypassed by explicit per-paragraph styles.
  • Default table styles are snapshotted on assignment, readback, fluent document configuration, and options cloning, apply to top-level and row/column tables that do not provide their own style, can be set from supported Word table style names, and are bypassed by explicit per-table styles.
  • Compose page default heading, list, paragraph, and table styles are snapshotted per page, do not leak to later pages, and still allow explicit styles to override the page default.
  • Compose page blocks expose read-only content block collections after composition so page-scoped model nodes are not caller-mutable lists.
  • Page composition and header/footer templates report clear errors for null delegates, null header/footer text, invalid first-page/even-page header/footer text, invalid footer segment construction, and invalid externally-mutated footer segment state.
  • Directly assigned footer segment templates render footer content without requiring the page-number flag, and footer placement validation applies to both page-number footers and segment-based footers.
  • Footer segment lists are snapshotted on assignment and readback so caller mutations cannot change footer rendering after options are configured; first-page and even-page footer segments use the same snapshot and validation path.
  • Save APIs report clear errors for null or non-writable streams and invalid path outputs; async path saves honor cancellation before creating directories, rendering, or writing files.
  • Stream read APIs report clear errors for null or non-readable streams and read from the current stream position.
  • Core path read APIs reject null, empty, or whitespace paths before attempting file reads.
  • Page import APIs reject invalid source-page selections and invalid target insertion points before file reads, can read source streams from their current positions, can write byte, stream, or path inputs to the current output stream position, keep target metadata for target-edit insert operations, and AppendPageRanges, PrependPageRanges, InsertPageRange, and InsertPageRanges import inclusive source ranges from firstPage / lastPage pairs, reusable PdfPageRange values, or parsed range lists without wrappers materializing each page number; repeated or overlapping import ranges create cloned source pages in caller order.
  • Encrypted PDFs fail with a clear unsupported diagnostic before parser-supported read/manipulation helpers attempt to process page content.
  • Signed PDFs, form PDFs, complex outline/bookmark PDFs, complex page-label number-tree PDFs, unsupported catalog name-tree PDFs, unsupported named-destination name-tree PDFs, complex open-action dictionary PDFs, complex viewer-preference PDFs, complex XMP metadata PDFs, complex catalog URI PDFs, tagged PDFs, complex output-intent PDFs, complex embedded-file/associated-file PDFs, complex optional-content/layer PDFs, and active-content PDFs fail with clear unsupported diagnostics before rewrite-style manipulation helpers copy, merge, edit, metadata-rewrite, stamp, or watermark page content; simple direct catalog /PageMode, /PageLayout, /Version, /Lang, simple direct /PageLabels number trees, simple outline trees including simple GoTo action outline entries whose destinations point only at copied pages, direct /Dests dictionaries, simple /Names /Dests name trees including leaf /Kids, destination-array and simple GoTo dictionary /OpenAction entries, simple /ViewerPreferences dictionaries, simple catalog /Metadata XMP XML streams, simple catalog /URI base dictionaries, simple /OutputIntents metadata graphs, simple /Names /EmbeddedFiles attachment trees, simple catalog /AF associated-file arrays, and simple /OCProperties optional-content metadata are preserved during rewrite-style manipulation, while copied-page page labels are reindexed, stale named-destination links are pruned, and complex outlines, complex page labels, unsupported catalog name trees, malformed or unsupported named-destination name trees, complex open-action dictionaries, complex viewer preferences, complex XMP metadata, complex catalog URI dictionaries, complex output intents, complex embedded/associated files, and complex optional content remain blocked.
  • Manipulation path input APIs reject null, empty, or whitespace input paths before attempting file reads.
  • Page-by-page and page-range text extraction can validate/create output directories before reading inputs and write deterministic source-page-numbered text files for wrapper-friendly PSWritePDF parity; ExtractTextByPageRanges(...) accepts parsed range lists, preserves caller order, and treats overlapping selections as one page set.
  • Image extraction can read from bytes, paths, or streams, validate/create output directories before reading path inputs, and write byte-, path-, or stream-based extracted image files with deterministic page-numbered names for wrapper-friendly PSWritePDF parity.
  • Rich text runs support scoped per-run font-size and background/highlight changes plus link annotations, and report clear errors for null run text, invalid run font sizes, empty link text, non-absolute link URIs, empty link annotation contents, link contents without a link URI, image/shape/drawing link contents without a URI, and invalid table link coordinates before rendering. When tagged catalog markers are enabled, rich text links emit visible /Link marked content combined with annotation OBJR references as PDF/UA groundwork.
  • Paragraph, heading, image, shape, drawing-scene, vector convenience, and table-cell URI link annotations are emitted through a shared annotation dictionary builder, and paragraph LinkToBookmark(...) runs plus bookmark-targeted headings and table-cell named-destination links emit internal GoTo named-destination link annotations. When tagged markers are enabled, generated link annotations receive /StructParent entries plus /Link structure elements with /OBJR references as PDF/UA groundwork. Table cells can also define reusable named-destination anchors. Generated-PDF output checks verify /Annots, /Subtype /Link, /URI and /GoTo actions, escaped /Contents metadata, positive in-page link rectangles, annotation structure references, aligned heading-link geometry, image placement geometry, fixed visual object geometry, missing bookmark-link diagnostics, and inspector readback, including wrapped heading lines, row/column headings, images, shapes, drawing scenes, vector helper calls, table cells, and bookmark links generated from compose and row/column flows.
  • Heading-based PDF outlines are emitted through a shared outline dictionary builder and protected by generated-PDF output checks for /Outlines, title entries, nested tree links, counts, and /Dest destinations, plus inspector readback. Generic Bookmark(...) anchors emit sorted simple /Names /Dests named destinations, reject duplicate names before output, validate internal link targets, and are covered by generated-PDF and inspector readback checks for top-level and row/column flows.
  • Lightweight probe/readback reports PDF header version, trailer-root catalog page mode/layout/version/language values, simple page-label rules, simple document outline targets including named destinations, simple document open-action targets, simple viewer preference entries, encryption markers, digital signature markers, form-field markers, annotation markers, simple page URI and named-destination link annotation counts, distinct document-level link URI and internal destination targets, document-level page-aware link lists, named destination names/targets, and per-page annotations with contents metadata, outline/bookmark markers, catalog view-setting markers, page-label markers, catalog name-tree markers, named-destination markers, open-action markers, viewer-preference markers, tagged-structure markers, XMP metadata markers, catalog URI markers, output-intent markers, embedded-file markers, optional-content/layer markers, active-content markers, structured preflight read and rewrite blockers, and read/rewrite diagnostics so wrappers can warn before invoking read or manipulation helpers; simple catalog view settings, simple outlines including simple GoTo action outline entries, simple direct page labels, supported catalog name trees, direct named destinations, simple destination name trees including leaf /Kids, destination-array and simple GoTo dictionary open actions, simple viewer preferences, simple catalog XMP metadata streams, simple catalog URI base dictionaries, simple output intents, simple embedded-file attachment trees, simple catalog associated-file arrays, and simple optional-content metadata are detected without blocking rewrite. Column-aware text readback now splits wide same-baseline runs before gutter detection so generated row/column documents can be extracted in left-column then right-column order, and structured readback keeps clear single-line table gaps so generated simple tables can round-trip into detected table rows.
  • Generated metadata is protected by literal-string escaping checks for title, author, subject, and keywords, plus inspector readback of the original values.
  • PDF object-boundary parsing ignores stream and endobj tokens inside literal strings so ordinary form/text values cannot truncate parsed objects during read/rewrite flows.
  • Paragraph and panel paragraph text blocks reject invalid alignment enum values before layout while preserving supported justification.
  • Paragraph and panel paragraph blocks snapshot rich text runs into read-only model collections.
  • Paragraph scalar style properties reject invalid line height, spacing, and individual indents on assignment while combined text-frame width remains guarded during layout; paragraph style snapshots preserve line height, indents, first-line/hanging indents, spacing, keep-together, keep-with-next, and widow/orphan page-flow settings after the caller mutates the original style.
  • Paragraph first-line and hanging indents affect both rich-text wrapping and rendered positions in top-level and row/column flows, with diagnostics when the first-line frame would leave the content area or collapse to a non-positive width.
  • Mutable header, footer, panel-box, and table-caption alignment properties reject unsupported values on assignment instead of carrying invalid style state into rendering.
  • Table column horizontal/vertical alignment lists and per-cell horizontal/vertical alignment dictionaries reject unsupported values on assignment, reject out-of-grid entries during table layout/preflight, snapshot the assigned collection so later caller mutations cannot change the style, and are honored in both top-level and row/column table flows.
  • Table captions render above the grid with configured alignment, color, font size, and spacing in both top-level and row/column table flows.
  • Body column fills, per-cell fills, per-cell borders, per-cell padding overrides, and per-cell alignment overrides render in both top-level and row/column table flows.
  • Header, body, and footer row separators render as line strokes in both top-level and row/column table flows.
  • Body row striping is calculated relative to the first body row and does not apply to configured header rows in both top-level and row/column table flows.
  • Table column sizing lists reject non-positive/non-finite widths or weights on assignment and snapshot the assigned collection; oversized fixed column widths are proportionally fit into the available table frame, including row/column flows, while impossible minimum-width conflicts remain render-time diagnostics.
  • Table body-column fills, cell fills, cell data bars, cell icons, cell borders, cell padding overrides, and cell alignment overrides snapshot assigned collections; cell fill/data-bar/icon/border/padding/alignment coordinates are validated on assignment, PdfCellDataBar.Ratio rejects values outside the 0..1 range, PdfCellIcon.Size rejects invalid intrinsic sizes, PdfCellBorder.Width plus side-specific PdfCellBorderSide.Width reject invalid intrinsic widths, and PdfCellPadding rejects invalid intrinsic padding values on assignment.
  • Heading blocks reject empty or whitespace titles before layout so outlines and visible document structure cannot contain invisible headings. Bookmark blocks reject empty or whitespace names immediately and duplicate names during output so generated named destinations stay deterministic.
  • Heading blocks reject unsupported alignment values before layout so Justify or invalid enum state cannot silently render as left-aligned headings.
  • Heading style tests cover snapshotting, theme propagation, page-scoped defaults, rendered font size/color, spacing-before/after rhythm, and fresh page/column spacing-before suppression so H1/H2/H3 can move toward Word-like style control instead of hardcoded renderer constants.
  • Bullet and numbered list blocks snapshot caller-provided items and styles into read-only model state, reject unsupported alignment values before layout so Justify or invalid enum state cannot silently render as left-aligned lists, cover default/page/per-list style rendering for font size, color, left indent, marker gap, vertical rhythm, and fresh page/column spacing-before suppression, and can keep a whole list together or keep it with the following visible block across top-level and row/column page flow.
  • Image, shape, and drawing-scene blocks reject unsupported alignment values before layout so Justify or invalid enum state cannot silently render as left-aligned fixed-size content.
  • Table captions reject empty or whitespace text before layout while null still means no caption.
  • Table blocks reject unsupported table alignment values at model construction across top-level, compose, and link-enabled table APIs.
  • Image blocks snapshot caller-provided bytes and reject invalid intrinsic model state at construction time; image, drawing, and horizontal rule styles reject invalid intrinsic spacing at construction time; fixed-size flow blocks such as images, horizontal rules, vector shapes, and drawing scenes still report clear layout errors when they are wider or taller than the available page content frame, while flow and table-cell images can opt into proportional frame fitting with PdfImageStyle.ScaleDownToFit.
  • Kept-together panels report a clear layout error when their measured height exceeds the available page content height.
  • Panel scalar style properties reject invalid border width, padding, max width, and outer spacing on assignment while panel-box alignment and layout-dependent padding conflicts remain guarded.
  • Panel paragraph blocks snapshot explicit panel styles at add time, and default panel style tests cover snapshotting, theme propagation, page-scoped defaults, rendered background color, max-width alignment, padding, spacing rhythm including fresh page/column spacing-before suppression, and keep-with-next page flow for top-level and row/column panels.
  • Horizontal rule style tests cover snapshotting, theme propagation, document and page-scoped defaults, rendered stroke color/thickness, spacing rhythm including fresh page/column spacing-before suppression, and keep-with-next page flow for top-level and row/column rules.
  • Image style tests cover snapshotting, theme propagation, document and page-scoped defaults, rendered alignment/fit coordinates, opt-in proportional scale-down for top-level, row/column, and table-cell images, and spacing rhythm including fresh page/column spacing-before suppression for top-level and row/column images.
  • Drawing style tests cover snapshotting, theme propagation, document and page-scoped defaults, rendered shape and drawing-scene coordinates, and spacing rhythm including fresh page/column spacing-before suppression for top-level and row/column vector objects.
  • Row style tests cover snapshotting, theme propagation, document and page-scoped defaults, rendered gutter coordinates, optional column separators, row-level spacing rhythm including page-top spacing-before suppression, keep-together and keep-with-next page flow, and over-tall row diagnostics for reusable row/column primitives.
  • Paragraph keep-together layout moves a whole paragraph to the next page in top-level and row/column flows when it would otherwise split, and reports a clear error when the kept paragraph is taller than the available page content height.
  • Paragraph keep-with-next layout moves a paragraph with the following visible paragraph/list/panel/table/rule/image/shape/drawing/row-section neighbor in top-level and row/column flows when the first paragraph would otherwise be stranded at the bottom of a page.
  • Paragraph widow/orphan layout can avoid leaving a single paragraph line at the bottom of a page in top-level and row/column flows.
  • Heading layout keeps a heading with the following visible paragraph/list/panel/table/rule/image/shape/drawing/row-section neighbor in top-level and row/column flows when the heading would otherwise be orphaned at the bottom of a page.
  • Row/column composition reports clear layout errors for empty rows, invalid gutters, and non-finite, non-positive, or over-allocated column widths before they can corrupt rendered geometry; render-time diagnostics reject gutters that exceed the available content width and kept rows that exceed the available page content height, and row/column model collections expose read-only views after composition.
  • Row/column visual-quality checks render ordinary Word-like column primitives, then verify extracted text lines remain inside their column frames, preserve explicit/default gutter clearance, maintain readable baseline rhythm and row-level breathing room, and suppress first flow-object spacing-before at column starts so composition regressions fail before they become cramped reports.
  • Generic business-shaped visual fixtures, such as line-item tables, stay as proof documents for reusable Word-like primitives: weighted/min-width table columns, wrapped text, right-aligned numeric values, footer/summary row separation, margins, and follow-on rhythm are verified without adding invoice/report concepts to the engine.
  • Word-like table presets and PdfTheme.WordLike() now include neutral footer separator defaults so summary/footer rows have document-style structure without requiring invoice/report-specific style APIs.
  • Table scalar style properties reject invalid border widths, row/header/footer separator widths, table-wide padding, per-cell padding values, max width, left indent, row counts, table-wide and per-row minimum heights, spacing, caption font size, header/body/footer font sizes, line height, and row baseline offsets on assignment while layout-dependent conflicts remain render-time diagnostics.
  • Table styles report clear layout errors for invalid captions, unsupported caption justification, alignment enum values, cell fills/data bars/icons/borders/padding/alignment overrides, explicit cell style coordinates outside the table grid, row-scoped style entries outside the table grid, column-scoped style entries outside the table grid, oversized header/footer row counts, and impossible column sizing.
  • Table header rows stay visually distinct from body row striping even when a style disables explicit header fill.
  • Tables can move as a unit when PdfTableStyle.KeepTogether is enabled, including row/column flows, and report a clear layout error when the kept table is taller than the available page content height.
  • Tables can keep with the first visible part of the following block when PdfTableStyle.KeepWithNext is enabled and the pair fits inside the page content frame.
  • Oversized table rows split across pages by wrapped text line when PdfTableStyle.AllowRowBreakAcrossPages or a per-row RowAllowBreakAcrossPages entry allows it, including row/column flows and measured rich cell line heights, and report a clear layout error when row splitting is disabled.
  • Table blocks snapshot input rows, styles, and link dictionaries into read-only model state and normalize null cells before layout so later caller mutations cannot change rendered output.
  • Shape and drawing blocks snapshot shared OfficeIMO.Drawing descriptors, including linear gradient fills, at add time so later caller mutations cannot change rendered output.
  • Structured and logical PDF readback expose wrapper-friendly documents and pages with typed text blocks, heuristic headings, paragraph groups, list item objects, detected table row/column/cell objects, image XObjects, URI/named-destination link annotation objects with document-level lookup, page-level AcroForm widget objects with current and normal appearance states plus named annotation flags, catalog navigation objects, AcroForm /NeedAppearances, /SigFlags named helpers, and /DA, and simple AcroForm fields/widgets with typed field-kind/common-flag helpers, inherited text max length, inherited AcroForm/field-tree default appearance strings/text alignment, scalar or array current/default values, inherited choice options, selected/default-selected options, plus document-level field/widget lookup by field name, field kind, or page number so PSWriteOffice can start from reusable objects instead of raw text. The two-page line-item statement proof document is covered by logical readback for source page ordering, table rows, totals, and selected page ranges.
  • Page extraction stream helpers read from and write to current stream positions; repeated selected pages are emitted as distinct cloned page objects; PdfPageRange parses single pages, inclusive first-last / first..last ranges, and comma/semicolon-separated range lists for wrappers; path output helpers create parent directories but reject empty paths and existing directory targets before reading inputs or writing output; split helpers validate/create output directories before reading inputs and write deterministic page-numbered or page-range files for wrapper-friendly PSWritePDF parity.
  • Page editing stream helpers read from and write to current stream positions, rejecting unreadable inputs or non-writable outputs before attempting parser work; path inputs can return edited PDF bytes, write to paths, or write to the current position of caller-owned output streams; DeletePageRange, DeletePageRanges, DuplicatePageRange, DuplicatePageRanges, MovePageRange, MovePageRanges, RotatePageRange, and RotatePageRanges accept firstPage / lastPage pairs, reusable PdfPageRange values, or parsed range lists without making wrappers materialize each page number; DeletePageRanges, MovePageRanges, and RotatePageRanges treat overlapping ranges as one selection set; DuplicatePages and DuplicatePageRanges insert cloned copies immediately after selected source pages and honor repeated selections/ranges as repeated clones; MovePages moves selected source pages as a group in original relative order before another source page or to the end; path output helpers create parent directories but reject empty paths and existing directory targets before reading inputs or writing output.
  • Metadata editing stream helpers read from and write to current stream positions while preserving the same update/replace semantics as byte and path inputs; path inputs can return bytes, write to paths, or write to the current position of caller-owned output streams; path output helpers create parent directories but reject empty paths and existing directory targets before reading inputs or writing output.
  • Merge stream helpers read each input from its current stream position and can write merged PDFs to the current output stream position; file-list helpers accept enumerable paths for pipeline-collected inputs and can write to an output path or output stream.
  • Merge file output helpers create parent directories but reject empty paths and existing directory targets before reading inputs or writing output.
  • Text/image stamp and watermark helpers read the source PDF, plus image payloads for image stamps/watermarks, from the current stream position; path inputs can return stamped PDF bytes for wrapper pipelines, write to paths, or write to the current position of caller-owned output streams.
  • Text/image stamp and watermark path output helpers create parent directories but reject empty paths and existing directory targets before reading inputs, image payloads, or writing output.
  • Text/image stamp option models snapshot assigned page-number arrays, provide UsePageRange(...) overloads for firstPage / lastPage pairs or reusable PdfPageRange values, plus UsePageRanges(...) for parsed range lists without wrappers materializing page arrays; overlapping range-list selections are treated as one page selection set, and invalid intrinsic coordinates, sizes, rotation, fonts, and duplicate/non-positive page selections are rejected before stamping.
  • Text/image stamp and watermark output is emitted through the shared internal content-stream helper and protected by content-stream checks for placement matrices, color/font operators, image dimensions/rotation, PNG alpha soft masks, and above/below-content layering order; custom image watermark sizing preserves watermark layering.
  • PdfDocumentVisualBaselineTests keeps representative and professional report geometry snapshots for headings, paragraphs, rich text, panels, bullets, tables, images, PNG alpha soft masks, clipping, axial shading, and vector drawing content-stream signals.
  • PdfDocumentRasterVisualBaselineTests can render the professional report, a two-page line-item statement fixture, a Word-like table style gallery with compact Accent1-6 swatches, a landscape showcase dashboard, a native Word-to-first-party-PDF report fixture, a native Word daily-layout fixture covering TOC, margins, columns, separator lines, fonts, colors, lists, links, images, headers/footers, and a table inside the column flow, a native Word table-cell picture-control fixture, a native Excel daily-workbook fixture covering worksheet headers/footers, margins/orientation, merged cells, number formats, explicit row/column sizing, hidden row/column filtering, internal/external hyperlinks, and worksheet/header images, Markdown technical-document output, a Markdown visual theme gallery for every built-in Markdown PDF theme, plus compact hello-world, core-layout, style-cheatsheet, styled-runs, drawing-gallery, row-columns, links-rules, lists-tables, default-styles, three-page flow-dsl, and two-page headers-footers scenarios through Poppler pdftoppm, then compare page PNGs against approved baselines. On mismatch it writes expected, actual, and diff PNG artifacts under %TEMP%\OfficeIMO.PdfRaster. Set OFFICEIMO_REQUIRE_PDF_RASTERIZER=1 to make missing Poppler fail the test lane, OFFICEIMO_UPDATE_PDF_RASTER_BASELINE=1 to refresh approved PNGs, OFFICEIMO_PDF_RASTER_PIXEL_TOLERANCE to allow small per-channel deltas, and OFFICEIMO_PDF_RASTER_ALLOWED_DIFF_PIXELS to allow a limited changed-pixel count.
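
The page-range bullets above describe two behaviors for parsed range lists: some operations treat overlapping ranges as one selection set, while import and duplicate operations honor repeated selections as repeated clones in caller order. The sketch below illustrates that distinction in Python. It is not the PdfPageRange implementation: the function names are hypothetical, and the handling of whitespace, empty entries, and descending ranges is an assumption.

```python
import re
from typing import List, Tuple

Range = Tuple[int, int]  # inclusive (first_page, last_page), 1-based

# One entry: a single page, "first-last", or "first..last".
_ENTRY = re.compile(r"^(\d+)(?:\s*(?:-|\.\.)\s*(\d+))?$")


def parse_page_ranges(text: str) -> List[Range]:
    """Parse a comma/semicolon-separated list into ranges, keeping caller order."""
    ranges: List[Range] = []
    for part in re.split(r"[;,]", text):
        part = part.strip()
        match = _ENTRY.match(part)
        if match is None:
            raise ValueError(f"invalid page range: {part!r}")
        first = int(match.group(1))
        last = int(match.group(2)) if match.group(2) else first
        # Assumption: pages are 1-based and descending ranges are rejected.
        if first < 1 or last < first:
            raise ValueError(f"invalid page range: {part!r}")
        ranges.append((first, last))
    return ranges


def as_selection_set(ranges: List[Range]) -> List[int]:
    """Overlapping ranges collapse into one sorted set of pages."""
    return sorted({p for first, last in ranges for p in range(first, last + 1)})


def as_clone_sequence(ranges: List[Range]) -> List[int]:
    """Repeated or overlapping ranges yield repeated pages in caller order."""
    return [p for first, last in ranges for p in range(first, last + 1)]
```

With the input "1-3, 5; 2..4", the selection-set view contains pages 1 through 5 once each, while the clone-sequence view lists pages 2 and 3 twice because both ranges cover them.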

Near-term work should keep adding small visual gates before broad feature growth. The roadmap tracks the intended sequence.
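
A visual gate of the kind described for the raster baseline tests combines two thresholds: a per-channel tolerance and an allowed count of changed pixels. The Python sketch below shows one way those two settings could interact. The function names are hypothetical, pages are modeled as flat lists of RGB tuples, and the exact semantics of OFFICEIMO_PDF_RASTER_PIXEL_TOLERANCE and OFFICEIMO_PDF_RASTER_ALLOWED_DIFF_PIXELS in the test suite are assumed rather than confirmed.

```python
from typing import Sequence, Tuple

Pixel = Tuple[int, int, int]  # (red, green, blue), each 0..255


def count_diff_pixels(expected: Sequence[Pixel],
                      actual: Sequence[Pixel],
                      channel_tolerance: int = 0) -> int:
    """Count pixels where any channel differs by more than the tolerance."""
    if len(expected) != len(actual):
        raise ValueError("page images must have the same pixel count")
    changed = 0
    for exp, act in zip(expected, actual):
        if any(abs(e - a) > channel_tolerance for e, a in zip(exp, act)):
            changed += 1
    return changed


def pages_match(expected: Sequence[Pixel],
                actual: Sequence[Pixel],
                channel_tolerance: int = 0,
                allowed_diff_pixels: int = 0) -> bool:
    """A page passes when the changed-pixel count stays within the allowance."""
    changed = count_diff_pixels(expected, actual, channel_tolerance)
    return changed <= allowed_diff_pixels
```

Keeping both thresholds at zero by default makes the gate strict, so small anti-aliasing differences between rasterizer versions have to be allowed explicitly.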

Support Matrix

The current create/read/manipulate/export coverage is tracked in Docs/officeimo.pdf.support-matrix.md.

Examples

Runnable samples live under OfficeIMO.Examples/Pdf. The professional report sample can be generated with:

dotnet run --project OfficeIMO.Examples -- --pdf-professional

The Word-like table style gallery can be generated with:

dotnet run --project OfficeIMO.Examples -- --pdf-table-styles

Known Gaps

  • This is not yet a full QuestPDF replacement.
  • This is not yet a full iText/PSWritePDF replacement.
  • Font metrics are still simplified outside built-in Helvetica/Times-family tables and parseable embedded TrueType mappings assigned to generated standard-font slots.
  • TrueType embedding is currently a full-file generated-font feature whose glyph metrics and Unicode cmap mappings now participate in layout and extraction, including named family registration through UseFontFamily(...); OpenType/CFF embedding, font subsetting, shaping/ligatures, glyph fallback, and text outside generated standard-font slots remain roadmap work.
  • PdfOptions.ComplianceProfile values other than None are intentional roadmap guardrails and currently throw clear diagnostics instead of generating files that merely look compliant. PdfOptions.FileVersion, PdfAIdentification, PdfUaIdentification, PdfElectronicInvoiceMetadata, PdfIccProfiles.SrgbIec6196621, PdfOutputIntent.CreateSrgbIec6196621(), and PdfOutputIntentPolicy can emit PDF 1.7 headers, PDF/A, PDF/UA, Factur-X/ZUGFeRD XMP, and output-intent policy fields ahead of that profile switch. PdfComplianceAnalyzer / PdfDocument.AssessCompliance(...) can explain which planned-profile requirements are satisfied, missing, or still unsupported, including PDF 1.7 file-header readiness, configured output-intent presence versus declared output-intent policy versus external validator evidence, generated standard-font embedding coverage with parseable TrueType checks, PDF/UA identification, document-title, viewer-display-title readiness, tagged page tab-order readiness, tagged parent-tree next-key readiness, generated link annotation and rich link text structure-reference readiness, decorative running page text artifact readiness, decorative flow rule artifact readiness, decorative layout artifact readiness, canonical factur-x.xml CrossIndustryInvoice attachment readiness, e-invoice XML profile-context readiness, e-invoice XML document-header readiness, e-invoice XML document-type-code readiness, e-invoice XML date-format readiness, e-invoice XML trade-transaction readiness, e-invoice XML party-identification readiness, e-invoice XML country-code readiness, e-invoice XML electronic-address readiness, e-invoice XML party-tax-registration readiness, e-invoice XML party-tax-registration-scheme readiness, e-invoice XML tax-party-identifiers required/forbidden readiness, e-invoice XML line-item readiness, e-invoice XML unit-code readiness, e-invoice XML line-pricing readiness, e-invoice XML line-amount-consistency readiness, e-invoice XML line-tax VAT type-code/O rate-absence readiness, e-invoice XML settlement-summary readiness, e-invoice XML currency-consistency readiness, e-invoice XML currency-code readiness, e-invoice XML tax-breakdown VAT type-code readiness, e-invoice XML tax-category-code/O line-allowance-charge breakdown-exclusivity readiness, e-invoice XML tax-category-rate/O header-line-allowance-charge rate-absence readiness, e-invoice XML tax-category-amount zero-amount/standard-rate-math readiness, e-invoice XML tax-exemption-reason/VATEX-EU-O semantics readiness, e-invoice XML tax-category line/allowance-charge consistency readiness, e-invoice XML tax-total/category-rate-adjusted-basis-consistency readiness, e-invoice XML payment-instructions readiness, e-invoice XML payment-means-code readiness, e-invoice XML payment-account-format readiness, e-invoice XML payment-terms readiness, e-invoice XML allowance/charge-reason readiness, e-invoice XML allowance/charge total-sum and amount-consistency readiness, e-invoice embedded-file stream-parameter readiness, and e-invoice XMP extension readiness. These checks are only groundwork for the validator-backed profile bundle.
  • Generated AcroForm widgets can emit /StructParent entries plus /Form OBJR structure references when tagged catalog markers are enabled; PdfFormFieldStyle.AlternateName emits /TU alternate field names, PdfFormFieldStyle.MappingName emits /TM mapping names, and PdfDocument.AssessCompliance(...) reports both generated-form-widget-structure-references and generated-form-field-accessible-names. Richer accessible form semantics, descriptions beyond alternate names, read-order semantics, and validator-backed PDF/UA form success remain roadmap work.
  • Unsupported catalog name-tree preservation, malformed or unsupported named-destination name trees, full PNG coverage beyond the current grayscale/RGB/indexed-color/alpha paths, advanced page import/editing, richer image transparency cases, rich/custom form appearance generation and flattening beyond simple field inspection, simple value fill, and simple text/button-widget flattening, signatures, encryption, redaction, and Office document rendering are roadmap items.
  • The reader is intentionally pragmatic and does not yet cover the whole PDF specification.
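
One concrete case of that pragmatic reader behavior is the object-boundary rule listed earlier: stream and endobj tokens inside literal strings must not end an object. The Python sketch below shows the idea for endobj only. It is a simplified illustration with a hypothetical function name, not the library's parser: it tracks nested parentheses and backslash escapes, and it deliberately ignores comments, hex strings, and stream bodies.

```python
def find_endobj(data: bytes, start: int = 0) -> int:
    """Return the index of the first b'endobj' outside a literal string, or -1."""
    depth = 0  # nesting depth of unescaped parentheses
    i = start
    while i < len(data):
        byte = data[i]
        if depth > 0:
            if byte == 0x5C:  # backslash: skip the escaped byte as well
                i += 2
                continue
            if byte == 0x28:  # "(" opens a nested level
                depth += 1
            elif byte == 0x29:  # ")" closes one level
                depth -= 1
        elif byte == 0x28:
            depth = 1
        elif data.startswith(b"endobj", i):
            return i
        i += 1
    return -1
```

A plain substring search would stop at the first endobj it sees, including one typed into a form value, which is how an ordinary text value could otherwise truncate a parsed object.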

Design Notes

  • OfficeIMO.Pdf runtime code must not depend on PDF libraries, rasterizers, SkiaSharp, QuestPDF, iText, or commercial engines.
  • OfficeIMO.Drawing is the preferred first-party reuse layer for shared color, font, image metadata, image fitting, text measurement, reusable drawing primitives, and eventually office-wide drawing scene concepts. Initial color interop is available through PdfColor, image placement can consume OfficeImageFit, and flow lines/rectangles/rounded rectangles/ellipses/polygons/paths plus grouped scenes can consume shared OfficeShape and OfficeDrawing descriptors, including stroke dash/cap/join, two-stop linear gradient fill intent, simple offset shadow intent, fill/stroke opacity, affine transform intent, and clipping path intent.
  • PDF syntax and layout should move toward reusable internal models instead of one-off string writing. Initial reused content-stream helpers now cover common fill, stroke, stroke width, stroke cap, stroke join, dash arrays, fill-stroke, rectangle, line, path painting for ordinary and transformed rounded rectangles/ellipses/polygons/freeform paths, clipping paths for shapes/gradients/images, shading draws for gradient fills, local transform matrices, ExtGState resource application for opacity/shadows, save/restore wrappers for images/clipping/gradients/transformed shapes, text decorations, simple text, rich paragraph text, table-cell text, header/footer text, generated image placement, text/image stamp and watermark streams, and graphics-state operators. Generated and rewrite-style PDF names, literal strings, and indirect references share one syntax escaper; generated indirect-object creation, explicit object reservation/replacement, rewrite-style object wrapping, and stream body wrapping share one object-byte helper; generated page objects reference a reserved /Pages object directly instead of string-patched parent placeholders; generated page dictionaries including /MediaBox, /Resources, /Contents, /Annots, /StructParents, and tagged /Tabs /S hints now use one page dictionary builder; generated URI and named-destination link annotations now use one annotation dictionary builder with literal-string escaping, rectangle validation, and optional /StructParent entries; generated outline root/item dictionaries now use one outline dictionary builder with title escaping, navigation links, child counts, and destination validation; generated PDFs, metadata editing, and merge outputs share one Info dictionary builder; generated PDFs, page extraction, and merge outputs share one /Pages dictionary builder; generated catalog dictionaries and rewrite-style catalog prefix/name/reference entries share one catalog dictionary builder; generated PDFs and rewrite-style manipulation outputs share one classic xref/trailer assembler; generated and stamp-injected standard Type1 WinAnsi font dictionaries share one font dictionary builder; generated/stamped JPEG and PNG image XObject dictionaries including soft masks share one image XObject dictionary builder; page resource reference dictionaries for Font, XObject, ExtGState, and Shading now share one formatter with PDF name escaping; and generated ExtGState alpha plus axial shading object bodies share one visual resource dictionary builder with opacity and finite-coordinate validation.
  • Tests may use helper packages such as PdfPig to inspect output, because test dependencies do not ship with OfficeIMO.Pdf.
  • External rasterizers such as Poppler belong only in development/test lanes; they must never become runtime dependencies of OfficeIMO.Pdf.
  • External compliance validators also belong only in development/test lanes. PdfComplianceGateTests can run veraPDF and Mustang against deterministic groundwork fixtures when OFFICEIMO_VERAPDF / OFFICEIMO_VERAPDF_PATH and OFFICEIMO_MUSTANG / OFFICEIMO_MUSTANG_PATH are configured, with OFFICEIMO_VERAPDF_ARGS and OFFICEIMO_MUSTANG_ARGS available for local CLI syntax overrides. Set OFFICEIMO_REQUIRE_PDF_COMPLIANCE_VALIDATORS=1 to fail when validators are missing. The current expected result is validator failure, because the fixtures exercise PDF 1.7 headers, PDF/A identification XMP, built-in sRGB output intents, ToUnicode groundwork, embedded/associated files, CII profile context, party identity, electronic address, tax registration, tax breakdown VAT type-code, tax category code/O breakdown exclusivity, tax category rate, tax category amount zero-amount/standard-rate-math, tax exemption reason, tax party identifiers required/forbidden by category, payment instructions, payment means code, payment account format, payment terms, and Factur-X/ZUGFeRD XMP metadata without claiming formal PDF/A, PDF/UA, Factur-X, or ZUGFeRD conformance yet.
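
The shared syntax escaper mentioned in the notes above handles PDF names and literal strings. As a rough illustration of what such an escaper has to do, the Python sketch below applies the standard PDF rules: backslashes and parentheses in literal strings are backslash-escaped, and delimiter, whitespace, and non-printable bytes in names are written as #XX hex codes. The function names are hypothetical, and the literal-string helper assumes ASCII-range text; text encoding for non-ASCII metadata is out of scope here.

```python
# Delimiter characters plus "#", which must be hex-escaped inside a name.
_NAME_SPECIALS = set("()<>[]{}/%#")


def escape_literal_string(text: str) -> str:
    """Wrap text in parentheses, escaping backslashes, parentheses, and line ends."""
    out = []
    for ch in text:
        if ch in "\\()":
            out.append("\\" + ch)
        elif ch == "\n":
            out.append("\\n")
        elif ch == "\r":
            out.append("\\r")
        else:
            out.append(ch)
    return "(" + "".join(out) + ")"


def escape_name(name: str) -> str:
    """Prefix with a slash and hex-escape bytes that cannot appear raw in a name."""
    out = []
    for byte in name.encode("utf-8"):
        ch = chr(byte)
        if byte < 0x21 or byte > 0x7E or ch in _NAME_SPECIALS:
            out.append(f"#{byte:02X}")
        else:
            out.append(ch)
    return "/" + "".join(out)
```

Routing every generated and rewritten value through one pair of functions like these is what keeps a title such as "Report (draft)" from unbalancing the Info dictionary.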
Product Compatible and additional computed target framework versions.
.NET net5.0 was computed.  net5.0-windows was computed.  net6.0 was computed.  net6.0-android was computed.  net6.0-ios was computed.  net6.0-maccatalyst was computed.  net6.0-macos was computed.  net6.0-tvos was computed.  net6.0-windows was computed.  net7.0 was computed.  net7.0-android was computed.  net7.0-ios was computed.  net7.0-maccatalyst was computed.  net7.0-macos was computed.  net7.0-tvos was computed.  net7.0-windows was computed.  net8.0 is compatible.  net8.0-android was computed.  net8.0-browser was computed.  net8.0-ios was computed.  net8.0-maccatalyst was computed.  net8.0-macos was computed.  net8.0-tvos was computed.  net8.0-windows was computed.  net9.0 was computed.  net9.0-android was computed.  net9.0-browser was computed.  net9.0-ios was computed.  net9.0-maccatalyst was computed.  net9.0-macos was computed.  net9.0-tvos was computed.  net9.0-windows was computed.  net10.0 is compatible.  net10.0-android was computed.  net10.0-browser was computed.  net10.0-ios was computed.  net10.0-maccatalyst was computed.  net10.0-macos was computed.  net10.0-tvos was computed.  net10.0-windows was computed. 
.NET Core netcoreapp2.0 was computed.  netcoreapp2.1 was computed.  netcoreapp2.2 was computed.  netcoreapp3.0 was computed.  netcoreapp3.1 was computed. 
.NET Standard netstandard2.0 is compatible.  netstandard2.1 was computed. 
.NET Framework net461 was computed.  net462 was computed.  net463 was computed.  net47 was computed.  net471 was computed.  net472 is compatible.  net48 was computed.  net481 was computed. 
MonoAndroid monoandroid was computed. 
MonoMac monomac was computed. 
MonoTouch monotouch was computed. 
Tizen tizen40 was computed.  tizen60 was computed. 
Xamarin.iOS xamarinios was computed. 
Xamarin.Mac xamarinmac was computed. 
Xamarin.TVOS xamarintvos was computed. 
Xamarin.WatchOS xamarinwatchos was computed. 

NuGet packages (5)

Showing the top 5 NuGet packages that depend on OfficeIMO.Pdf:

Package Downloads
OfficeIMO.Reader

Unified, read-only document extraction facade for OfficeIMO (Word/Excel/PowerPoint/Markdown/PDF) intended for AI ingestion.

OfficeIMO.Word.Pdf

PDF converter for OfficeIMO.Word - Export Word documents to PDF using the first-party OfficeIMO.Pdf engine.

OfficeIMO.Excel.Pdf

PDF converter for OfficeIMO.Excel - Export Excel workbooks to PDF using the first-party OfficeIMO.Pdf engine.

OfficeIMO.Markdown.Pdf

PDF converter for OfficeIMO.Markdown - Export Markdown documents to PDF using the first-party OfficeIMO.Pdf engine.

OfficeIMO.PowerPoint.Pdf

PDF converter for OfficeIMO.PowerPoint - Export PowerPoint presentations to PDF using the first-party OfficeIMO.Pdf engine.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last Updated
0.1.35 0 6/5/2026
0.1.34 451 5/27/2026
0.1.33 438 5/26/2026
0.1.32 444 5/26/2026
0.1.31 462 5/23/2026
0.1.30 441 5/22/2026
0.1.29 450 5/21/2026
0.1.28 434 5/21/2026
0.1.27 433 5/20/2026
0.1.26 417 5/19/2026
0.1.25 406 5/18/2026
0.1.24 482 5/16/2026
0.1.23 435 5/14/2026
0.1.22 430 5/14/2026
0.1.21 427 5/7/2026
0.1.20 470 5/1/2026
0.1.19 429 4/27/2026
0.1.18 611 4/10/2026
0.1.17 151 4/9/2026
0.1.16 181 4/3/2026