D4S.Indexer.Domain 1.0.18

dotnet add package D4S.Indexer.Domain --version 1.0.18
                    

D4S.Indexer

A document indexing library for Azure AI Search. It extracts text from documents, generates vector embeddings, and uploads them as searchable chunks.

Architecture

D4S.Indexer.Domain          Entities, abstractions (interfaces)
D4S.Indexer.Application     Orchestration (DocumentIndexerService, DocumentExtractor)
D4S.Indexer.Infrastructure  Azure implementations, builder, document processors, sources

Quick Start

var indexer = IndexerBuilder.Create("my-index")
    .WithAzureSearch(searchEndpoint, searchKey)
    .WithAzureOpenAI(aoaiEndpoint, aoaiKey, embeddingDeployment, embeddingDimensions)
    .WithLocalFiles("./documents")
    .WithFileMetadataFields()
    .Build();

var result = await indexer.IndexAsync();

Key Interfaces

Interface             Purpose
IDocumentSource       Enumerates documents from a data source
IDocumentProcessor    Extracts text and metadata from a document
IEmbeddingService     Generates vector embeddings
ISearchIndexService   Manages the search index (CRUD on chunks)
ITextChunker          Splits text into chunks
IOcrService           OCR for scanned/image documents
IKeywordExtractor     AI-based keyword extraction

Built-in Document Sources

  • LocalFileSystemDocumentSource — local filesystem with filtering and subdirectory scanning
  • MultiSiteSharePointDocumentSource — multiple SharePoint sites via PnP Core

Built-in Document Processors

PDF, DOCX, XLSX, PPTX, TXT/Markdown.

Indexing Modes

Full Mode (default)

Every document is fetched from every source. Documents that are in the index but are no longer returned by any source are deleted from the index automatically.
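
The cleanup step described above amounts to a set difference between the IDs already in the index and the IDs the sources returned. The following is a minimal sketch of that idea only, not the library's implementation; FullModeCleanup and FindOrphanedIds are hypothetical names, and plain string IDs are assumed.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper, not part of D4S.Indexer.
public static class FullModeCleanup
{
    // Returns the IDs that exist in the index but were not returned by any source.
    // The order of indexedIds is preserved and duplicates are dropped.
    public static IReadOnlyList<string> FindOrphanedIds(
        IEnumerable<string> indexedIds,
        IEnumerable<string> sourceIds)
    {
        // Hash the source IDs once so each lookup is O(1).
        var seen = new HashSet<string>(sourceIds, StringComparer.Ordinal);
        return indexedIds.Where(id => !seen.Contains(id)).Distinct().ToList();
    }
}
```

The returned IDs are the candidates for deletion; anything present in both sets is left for the change-detection step.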

Delta Mode

Enabled via .WithDeltaMode(). The source supplies only new, changed, or deleted documents. Deletion is driven by DocumentMetadata.DeletedDate: documents with a non-null DeletedDate are removed from the index. The implicit cleanup step used in full mode is skipped.

var indexer = IndexerBuilder.Create("my-index")
    .WithAzureSearch(searchEndpoint, searchKey)
    .WithAzureOpenAI(aoaiEndpoint, aoaiKey, deployment, dimensions)
    .WithDeltaMode()
    .WithCustomDocumentSource<MyDeltaSource>(serviceProvider, "delta")
    .Build();

The source provides SourceDocument instances. To delete a document, set DeletedDate and pass null for GetContentAsync:

new SourceDocument(
    new DocumentMetadata
    {
        Id = "doc-123",
        LastModifiedDate = DateTimeOffset.UtcNow,
        Extension = ".pdf",
        DeletedDate = DateTimeOffset.UtcNow   // signals deletion
    },
    GetContentAsync: null);

In both modes, the indexer compares LastModifiedDate against the index to decide whether to reindex or skip unchanged documents.
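
The per-document decision can be pictured as a small pure function. This is a sketch under stated assumptions, not the library's code: IndexAction and ChangeDetection are hypothetical names, and the rule "reindex only when the source timestamp is strictly newer" is an assumption, since the text above does not say how the comparison is made.

```csharp
using System;

// Hypothetical types, not part of D4S.Indexer.
public enum IndexAction { Delete, Reindex, Skip }

public static class ChangeDetection
{
    // sourceModified:  LastModifiedDate reported by the source.
    // sourceDeleted:   DeletedDate reported by the source (null if the document still exists).
    // indexedModified: LastModifiedDate stored in the index, or null if not indexed yet.
    public static IndexAction Decide(
        DateTimeOffset sourceModified,
        DateTimeOffset? sourceDeleted,
        DateTimeOffset? indexedModified)
    {
        // A non-null DeletedDate always wins (delta mode).
        if (sourceDeleted is not null) return IndexAction.Delete;

        // Never indexed before: index it.
        if (indexedModified is null) return IndexAction.Reindex;

        // Assumption: strictly newer means changed; otherwise unchanged.
        return sourceModified > indexedModified.Value
            ? IndexAction.Reindex
            : IndexAction.Skip;
    }
}
```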

Builder Options

IndexerBuilder.Create("index-name")
    // Required
    .WithAzureSearch(endpoint, apiKey)
    .WithAzureOpenAI(endpoint, apiKey, deployment, dimensions)

    // Sources (at least one required)
    .WithLocalFiles("./docs")
    .WithLocalFiles(opts => { opts.Path = "./docs"; opts.FileExtensions = [".pdf", ".docx"]; })
    .WithSharePointMultiSite(spOptions, contextFactory)
    .WithCustomDocumentSource<T>(serviceProvider, serviceKey)

    // Optional
    .WithDeltaMode()
    .WithFileMetadataFields()
    .WithChunkSize(maxSize: 1000, overlap: 200)
    .WithBatchSize(50)
    .WithKeywordExtraction(gptDeployment, maxKeywords: 10)
    .WithAzureDocumentIntelligence(endpoint, apiKey)  // OCR
    .WithCustomDocumentProcessor<T>(serviceProvider, serviceKey)
    .ContinueOnError(true)
    .Filter(meta => meta.Extension == ".pdf")
    .ConfigureMetadata(meta => meta with { CustomFields = ... })
    .AddCustomField("Status", CustomFieldType.String, filterable: true)
    .AddIndexFieldsFromAttributes<MyModel>()
    .OnProgress(p => Console.WriteLine(p.Phase))
    .WithLogging()
    .Build();
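
To make the maxSize and overlap parameters of WithChunkSize concrete, here is a minimal character-based sliding-window chunker. It is an illustration only: OverlapChunker is a hypothetical name, and the library's ITextChunker may split on tokens, sentences, or other boundaries rather than raw characters.

```csharp
using System;
using System.Collections.Generic;

// Hypothetical helper, not part of D4S.Indexer.
public static class OverlapChunker
{
    // Splits text into chunks of at most maxSize characters.
    // Each chunk starts (maxSize - overlap) characters after the previous one,
    // so consecutive chunks share `overlap` characters.
    public static List<string> Split(string text, int maxSize, int overlap)
    {
        if (text is null) throw new ArgumentNullException(nameof(text));
        if (maxSize <= 0) throw new ArgumentOutOfRangeException(nameof(maxSize));
        if (overlap < 0 || overlap >= maxSize) throw new ArgumentOutOfRangeException(nameof(overlap));

        var chunks = new List<string>();
        int step = maxSize - overlap;
        for (int start = 0; start < text.Length; start += step)
        {
            int length = Math.Min(maxSize, text.Length - start);
            chunks.Add(text.Substring(start, length));

            // Stop once a chunk reaches the end of the text.
            if (start + length >= text.Length) break;
        }
        return chunks;
    }
}
```

With maxSize 1000 and overlap 200, a 2,500-character text yields three chunks starting at offsets 0, 800, and 1600. The overlap keeps a sentence that straddles a chunk boundary fully present in at least one chunk, at the cost of embedding some text twice.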

Samples

See src/Rag/samples/ for working examples: local files, SharePoint, OCR, and agentic retrieval.

Target Frameworks

Included in package: net10.0
Computed as compatible: net10.0-android, net10.0-browser, net10.0-ios, net10.0-maccatalyst, net10.0-macos, net10.0-tvos, net10.0-windows

Dependencies (net10.0): none.

NuGet packages (2)

Showing the top 2 NuGet packages that depend on D4S.Indexer.Domain:

D4S.Indexer.Application   Application services and configuration for D4S Indexer.
D4S.Indexer               D4S document indexer for Azure AI Search and RAG workflows.

Version   Downloads   Last Updated
1.0.18    105         5/12/2026
1.0.17    112         5/8/2026
1.0.16    104         5/6/2026