Iciclecreek.Lucene.Net.Linq 4.8.5-beta00017

This is a prerelease version of Iciclecreek.Lucene.Net.Linq.

LINQ to Lucene

Iciclecreek.Lucene.Net.Linq is a .NET library that enables LINQ queries to run natively on a Lucene.Net index.

This is a fork of the excellent Lucene.Net.Linq library, fully modernized for Lucene.Net 4.x and .NET 8+.

Installation

To install the Iciclecreek.Lucene.Net.Linq package, run the following command in the Package Manager Console:

PM> Install-Package Iciclecreek.Lucene.Net.Linq

Port

This branch ports the Lucene.Net.Linq library from the original Lucene.Net.Linq 3.0.3 / net40 baseline onto Lucene.Net 4.8.0-beta00017 and SDK-style projects multi-targeting netstandard2.0;net8.0;net10.0. Highlights:

Lucene.Net 4.8.0-beta00017 for the index, query, analysis, and query-parser packages.
Remotion.Linq 2.2.0 for the LINQ provider plumbing.
Microsoft.Extensions.Logging.Abstractions replaces Common.Logging. Wire your logger via Lucene.Net.Linq.Util.Logging.LoggerFactory = myFactory; (defaults to NullLoggerFactory).
Tests moved from RhinoMocks/NUnit 2 to NSubstitute / NUnit 4.

New 4.x features

Vector similarity search via .Similar() with automatic embedding at index and query time
Multi-targeting: the library builds for netstandard2.0, net8.0, and net10.0.
Polymorphic select: querying by a base type returns each document as an instance of the type it was originally stored as.
LINQ join support: joins use the search index to query across document types.
DocValues opt-in ([Field(DocValues = true)] / [NumericField(DocValues = true)]). Writes a parallel column-store field at index time so sorting, grouping and faceting read from a packed forward index instead of uninverting the inverted index via FieldCache. Defaults to false on both attributes — opt in per field where sort/group performance matters. Silently ignored on collection (IEnumerable<T>) properties because Lucene.Net 4.8 beta lacks SortedNumericDocValuesField.

Features

Vector similarity search via .Similar() with automatic embedding at index and query time
Default search property ([Field(Default = true)]) — one property serves as the target for both free-text Query() and vector Similar() when no field is specified
Inline Lucene query syntax via .Query() — embed wildcards, boolean operators, phrase queries and more directly in LINQ expressions
One-liner free-text search via provider.AsQueryable<T>("query string")
Automatically converts POCOs to Documents and back
Add, delete and update documents in an atomic transaction
Unit of Work pattern automatically tracks and flushes updated documents
Update/replace documents with [Field(Key=true)] to prevent duplicates
Term queries
Prefix queries
Range queries and numeric range queries
Complex boolean queries
LINQ joins across document types with automatic semi-join pushdown via TermsFilter
collection.Contains(field) — the LINQ "IN" pattern translates to an efficient TermsFilter
Native pagination using Skip and Take
Support storing and querying NumericField
Polymorphic type hierarchies: store subtypes, query by base type, get back real types
Automatically convert complex types for storing, querying and sorting
Custom boost functions using IQueryable<T>.Boost() extension method
Sort by standard string, NumericField or any type that implements IComparable
Sort by item.Score() extension method to sort by relevance
Specify custom format for DateTime stored as strings
Register cache-warming queries to be executed when IndexSearcher is being reloaded
Retrieve per-document term vectors (terms and frequencies) via TermFreqVectorDocumentMapper<T> for fields indexed with TermVector = TermVectorMode.Yes

Examples

Using fluent syntax to configure mappings
Using attributes to configure mappings
Specifying document keys

Mapping objects ⇔ documents

Iciclecreek.Lucene.Net.Linq maps plain CLR objects (POCOs) onto Lucene Documents. You can describe the mapping in two ways: with attributes on the type, or with a fluent code-first builder. Both produce the same internal IFieldMapper<T> graph and are fully interchangeable.

Attribute mapping

The simplest case: decorate properties with [Field] or [NumericField]. Properties without an attribute are still mapped with sensible defaults (analyzed, stored, OR query operator).

[DocumentKey(FieldName = "Type", Value = "Article")]
public class Article
{
    [Field(Key = true)]
    public string Id { get; set; }

    [Field(IndexMode.Analyzed, DocValues = true)]
    public string Title { get; set; }

    [Field(IndexMode.NotAnalyzed)]
    public string Slug { get; set; }

    [NumericField]
    public int WordCount { get; set; }

    [NumericField(DocValues = true)]
    public DateTime PublishedAt { get; set; }

    [Field("body_text", Analyzer = typeof(EnglishAnalyzer))]
    public string Body { get; set; }

    public IList<string> Tags { get; set; }   // multi-valued, default mapping

    [QueryScore]
    public float Score { get; set; }          // populated at read time

    [IgnoreField]
    public string TransientUiState { get; set; }
}

`[Field]` options

Option	Default	Notes
`Field` (ctor)	property name	Backing Lucene field name.
`IndexMode` (ctor)	`Analyzed`	`Analyzed`, `NotAnalyzed`, `NotAnalyzedNoNorms`, `NoIndex`.
`Store`	`Yes`	Whether the original value is kept verbatim for read-back.
`Key = true`	`false`	Participates in the document primary key (used for replace/delete). Multiple properties can be marked.
`Default = true`	`false`	Marks this property as the default search property. Used by `.Query()`, `.Similar()`, and `AsQueryable<T>(string)` when no field is specified. Only one property per type should be marked. When not set, defaults to the first key or first indexed property.
`Boost`	`1.0f`	Index-time boost.
`Converter`	derived	Custom `TypeConverter` for non-string values.
`Format`	`yyyy-MM-ddTHH:mm:ss` for `DateTime`	Format string used by the default value-type converter. Ignored if `Converter` is set.
`CaseSensitive`	depends on converter	Disables `LowercaseExpandedTerms` in the query parser; also routes the field through `KeywordAnalyzer` instead of `CaseInsensitiveKeywordAnalyzer` when no explicit analyzer is set.
`DefaultParserOperator`	`OR`	`OR` or `AND` for parsed-string queries on this field.
`Analyzer`	`null`	Per-field analyzer type; must have a default ctor or one accepting `LuceneVersion`. Overridden by an analyzer passed to `LuceneDataProvider`.
`TermVector`	`No`	`No`, `Yes`, `WithPositions`, `WithOffsets`, `WithPositionsOffsets`.
`NativeSort`	`false`	When `true`, sort uses byte-wise string comparison instead of `IComparable.CompareTo`. Only meaningful when the converter's string output is alphanumerically sortable.
`DocValues`	`false`	Write a parallel `SortedDocValuesField` column. Opt in for fast sort/group/facet on hot fields.

`[NumericField]` options

Trie-encoded numeric field. Use this for int, long, float, double, enum (via underlying integral type), DateTime / DateTimeOffset (via the built-in ticks converter), bool (via a custom converter), or any other type your TypeConverter can map onto one of the four primitive numeric types.

Option	Default	Notes
`Field` (ctor)	property name	Backing Lucene field name.
`Store`	`Yes`
`Key`	`false`
`Boost`	`1.0f`	Silently dropped — Lucene 4.8 numeric fields don't index norms.
`Converter`	built-in for `DateTime`/`DateTimeOffset`	`TypeConverter` mapping the property type to one of `int`/`long`/`float`/`double`.
`PrecisionStep`	`NumericUtils.PRECISION_STEP_DEFAULT`	Trie granularity. Smaller = faster range queries, larger index.
`DocValues`	`false`	Writes a parallel `NumericDocValuesField` / `SingleDocValuesField` / `DoubleDocValuesField` column. Opt in for fast sort.
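
To get a feel for the PrecisionStep trade-off, the arithmetic can be sketched as below. This assumes the usual trie-encoding rule described in Lucene's NumericRangeQuery documentation, where each value is indexed as ceil(bitsPerValue / precisionStep) terms. TermsPerValue is a hypothetical helper for illustration, not part of the library.

```csharp
using System;

// Hypothetical helper: number of trie terms indexed per numeric value,
// assuming terms = ceil(bitsPerValue / precisionStep).
static int TermsPerValue(int bitsPerValue, int precisionStep) =>
    (bitsPerValue + precisionStep - 1) / precisionStep;

foreach (var step in new[] { 4, 8, 16 })
{
    Console.WriteLine(
        $"step {step}: int -> {TermsPerValue(32, step)} terms, long -> {TermsPerValue(64, step)} terms");
}
```

A smaller step means more terms per value (a larger index) but fewer terms to visit per range query, which matches the note in the table.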

Other attributes

[IgnoreField] — exclude a public property from mapping entirely.
[QueryScore] — populate a float property with the document's relevance score on read. No options.
[DocumentKey(FieldName, Value)] (class-level, repeatable) — pins a constant field/value on every document of the class. Useful for adding fixed metadata fields.

Fluent (code-first) mapping

When you can't or don't want to put attributes on the type — DTOs from another assembly, generated code, schemas built at runtime — use ClassMap<T>:

public class ArticleMap : ClassMap<Article>
{
    public ArticleMap() : base(LuceneVersion.LUCENE_48)
    {
        Key(a => a.Id);
        Property(a => a.Title).AnalyzedWith(new StandardAnalyzer(LuceneVersion.LUCENE_48));
        Property(a => a.Slug).NotAnalyzed();
        NumericField(a => a.WordCount);
        NumericField(a => a.PublishedAt).WithPrecisionStep(8);
        Property(a => a.Body).WithFieldName("body_text");
    }
}

var provider = new LuceneDataProvider(directory, LuceneVersion.LUCENE_48);
provider.RegisterCacheWarmingCallback<Article>(...);
using var session = provider.OpenSession(new ArticleMap());

The fluent and attribute paths are equivalent — you can mix them in a single project, but not on the same type. The fluent API exposes one method per option in the tables above.

Custom converters

Anything Lucene can store is ultimately a string (text fields) or a trie-coded primitive (numeric fields). For everything else, supply a TypeConverter:

public class VersionConverter : TypeConverter
{
    public override bool CanConvertFrom(ITypeDescriptorContext c, Type t) => t == typeof(string);
    public override bool CanConvertTo(ITypeDescriptorContext c, Type t)   => t == typeof(string);
    public override object ConvertFrom(ITypeDescriptorContext c, CultureInfo i, object v) => Version.Parse((string)v);
    public override object ConvertTo(ITypeDescriptorContext c, CultureInfo i, object v, Type t) => v?.ToString();
}

public class Package
{
    [Field(Converter = typeof(VersionConverter))]
    public Version Version { get; set; }
}

For [NumericField], the converter must be able to convert the property type to one of long, int, double, or float. The built-in DateTimeToTicksConverter and DateTimeOffsetToTicksConverter are wired up automatically when you mark a DateTime / DateTimeOffset property as [NumericField].
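
As a rough illustration of why a ticks-based mapping suits numeric fields, the sketch below round-trips a DateTime through a long and checks that ordering is preserved. It assumes the converter stores DateTime.Ticks, as its name suggests; ToTicks and FromTicks are hypothetical stand-ins, not the library's converter.

```csharp
using System;
using System.Linq;

// Assumption: a ticks-based converter maps DateTime <-> long via DateTime.Ticks.
static long ToTicks(DateTime value) => value.Ticks;
static DateTime FromTicks(long ticks) => new DateTime(ticks, DateTimeKind.Utc);

var dates = new[]
{
    new DateTime(2024, 3, 1, 0, 0, 0, DateTimeKind.Utc),
    new DateTime(2020, 1, 15, 12, 30, 0, DateTimeKind.Utc),
    new DateTime(2022, 7, 4, 8, 0, 0, DateTimeKind.Utc),
};

// Sorting the numeric form gives the same order as sorting the dates,
// which is what numeric range queries and numeric sort rely on.
var byTicks = dates.OrderBy(d => ToTicks(d)).ToArray();
var byDate  = dates.OrderBy(d => d).ToArray();

Console.WriteLine(byTicks.SequenceEqual(byDate));            // True
Console.WriteLine(FromTicks(ToTicks(dates[0])) == dates[0]); // True
```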

Multi-valued fields

Any property whose type implements IEnumerable<T> is treated as a multi-valued field automatically. The element type drives the underlying mapper, so IList<string> works with [Field] and IList<int> with [NumericField]. Note that DocValues=true is silently downgraded for collections in this Lucene.Net 4.8 beta — SortedNumericDocValuesField isn't available, and SortedSetDocValuesField doesn't fit single-value LINQ ordering.

Document keys

[Field(Key = true)] on one or more properties marks a primary key — a unique identifier within a document type. Adding a document whose key collides with an existing one replaces the old document in a single atomic transaction.
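
The replace-on-collision behaviour can be pictured with an in-memory dictionary keyed by the Id value. This is only an analogy under that assumption; Upsert is a hypothetical helper and says nothing about how the library implements the replacement.

```csharp
using System;
using System.Collections.Generic;

// In-memory stand-in for an index keyed by the Id property.
var index = new Dictionary<string, (string Id, string Title)>();

// Hypothetical helper mimicking "add replaces on key collision".
void Upsert((string Id, string Title) doc) => index[doc.Id] = doc;

Upsert(("1", "First draft"));
Upsert(("2", "Another article"));
Upsert(("1", "Final version"));   // same key: replaces the first document

Console.WriteLine(index.Count);        // 2
Console.WriteLine(index["1"].Title);   // Final version
```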

Polymorphic type hierarchies

The library automatically supports inheritance hierarchies. For example, given class Dog : Animal, class Cat : Animal, and class GuideDog : Dog:

// Store mixed subtypes through a base-type session
using (var session = provider.OpenSession<Animal>())
{
    session.Add(new Dog   { Id = "1", Name = "Rex",      Breed = "Shepherd" });
    session.Add(new Cat   { Id = "2", Name = "Whiskers", Indoor = true });
    session.Add(new GuideDog { Id = "3", Name = "Buddy", Breed = "Lab", Handler = "John" });
    session.Commit();
}

// Query by base type — returns all three, each as its real type
var animals = provider.AsQueryable<Animal>().ToList();
// animals[0] is Dog, animals[1] is Cat, animals[2] is GuideDog

// Query by middle type — returns Dog + GuideDog, not Cat
var dogs = provider.AsQueryable<Dog>().ToList();

// Query by leaf type — returns only GuideDog
var guides = provider.AsQueryable<GuideDog>().ToList();

Key behaviors:

Subtype-specific fields are fully indexed and hydrated. A Dog stored via OpenSession<Animal>() retains its Breed property. When read back, Breed is populated even when querying as Animal.
Dirty tracking works across the hierarchy. If you query a Dog through ISession<Animal> and change Dog.Breed, the session detects the modification and flushes it on commit.
Same key = same entity. If a Dog and a Cat share the same [Field(Key = true)] value, the last write wins — they are treated as the same document.
Subtypes must have a parameterless constructor. The library uses Activator.CreateInstance to instantiate the actual runtime type when reading polymorphic documents.

Query semantics

Once you have a provider, provider.AsQueryable<T>() returns an IQueryable<T> that translates standard LINQ operators into Lucene queries. Most of LINQ-to-objects works; the parts that are not supported throw at translation time with a clear message.

Supported operators

LINQ	Lucene equivalent
`Where(d => d.Field == value)`	`TermQuery`
`Where(d => d.Field != value)`	Boolean `MUST_NOT`
`Where(d => d.Field.StartsWith("foo"))`	`PrefixQuery`
`Where(d => d.Field.EndsWith("foo"))`	`WildcardQuery` (`*foo`)
`Where(d => d.Field.Contains("foo"))`	`WildcardQuery` (`*foo*`)
`Where(d => d.Numeric > 5)` etc.	`NumericRangeQuery`
`Where(d => d.Field == null)`	Negated existence query
`&&`, `||`, `!`	Boolean `MUST` / `SHOULD` / `MUST_NOT`
`OrderBy`, `OrderByDescending`, `ThenBy`, `ThenByDescending`	Multi-field `Sort`
`Skip(n).Take(m)`	`IndexSearcher.Search(query, n + m)` window
`First` / `FirstOrDefault` / `Single` / `SingleOrDefault`	`Take(1)`
`Any()` / `Any(predicate)`	`TotalHits > 0`
`Count()` / `LongCount()`	`TotalHits`
`Min` / `Max`	`Sort` ascending/descending + `Take(1)`
`Where(d => d.Field.Query("text*"))`	Parsed Lucene query on a specific field
`Where(d => d.Query("text*"))`	Parsed Lucene query on default search property
`Where(d => d.Field.Similar("text"))`	`VectorQuery` on field (KNN or cosine similarity)
`Where(d => d.Similar("text"))`	`VectorQuery` on default search property
`Select(d => new { ... })`	Document projection (read only the fields you reference)

Collection Contains ("IN" queries)

The LINQ collection.Contains(field) pattern translates to an efficient TermsFilter -- a single-pass filter that matches documents whose field value appears in the collection. This is the Lucene equivalent of SQL's IN operator.

var allowedCategories = new[] { "tech", "science", "health" };

var articles = provider.AsQueryable<Article>()
    .Where(a => allowedCategories.Contains(a.Category))
    .ToList();

This produces ConstantScoreQuery(TermsFilter([Category:tech, Category:science, Category:health])) -- much more efficient than chaining || equality checks, especially for large collections. Works with arrays, lists, and any IEnumerable<T>, including captured variables.

You can combine it with other predicates:

var results = provider.AsQueryable<Article>()
    .Where(a => allowedCategories.Contains(a.Category) && a.WordCount > 500)
    .ToList();

An empty collection matches nothing (returns zero results).
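
The semantics described above mirror what the same expression does in LINQ to Objects. The sketch below uses in-memory tuples (hypothetical sample data) rather than a Lucene index, purely to show the expected matching behaviour, including the empty-collection case.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var articles = new[]
{
    (Title: "Intro to Lucene",  Category: "tech"),
    (Title: "Sleep and memory", Category: "health"),
    (Title: "Local elections",  Category: "politics"),
};

var allowed = new HashSet<string> { "tech", "science", "health" };

// Set membership: one lookup per document...
var viaContains = articles.Where(a => allowed.Contains(a.Category)).ToList();

// ...which selects the same documents as chained equality checks.
var viaOr = articles
    .Where(a => a.Category == "tech" || a.Category == "science" || a.Category == "health")
    .ToList();

// An empty collection can never contain the field value, so nothing matches.
var empty = Array.Empty<string>();
var none = articles.Where(a => empty.Contains(a.Category)).ToList();

Console.WriteLine(viaContains.Count);                // 2
Console.WriteLine(viaContains.SequenceEqual(viaOr)); // True
Console.WriteLine(none.Count);                       // 0
```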

Joins

LINQ join syntax works across document types. The library materializes both sides via separate Lucene searches and joins them in memory. A semi-join optimization uses TermsFilter to push the outer key values into the inner query, so only matching inner documents are fetched.

var articles = provider.AsQueryable<Article>();
var authors  = provider.AsQueryable<Author>();

// Single join
var results = (
    from article in articles
    join author in authors on article.AuthorId equals author.Username
    select new { article.Title, author.DisplayName }
).ToList();

Multiple joins chain naturally:

var categories = provider.AsQueryable<Category>();

var results = (
    from article in articles
    join author in authors on article.AuthorId equals author.Username
    join category in categories on article.CategoryId equals category.Id
    select new { article.Title, author.DisplayName, category.Label }
).ToList();

Where clauses on the outer side are pushed into Lucene before the join:

var results = (
    from article in articles.Where(a => a.Title.Contains("lucene"))
    join author in authors on article.AuthorId equals author.Username
    select new { article.Title, author.DisplayName }
).ToList();

Method syntax also works:

var results = provider.AsQueryable<Article>()
    .Join(
        provider.AsQueryable<Author>(),
        article => article.AuthorId,
        author => author.Username,
        (article, author) => new { article.Title, author.DisplayName })
    .ToList();

How it works under the hood:

The outer query executes as a normal Lucene search.
Distinct join key values are extracted from the outer results.
A TermsFilter is built from those keys and pushed into the inner query -- only inner documents matching an outer key are fetched.
Both materialized sides are joined in memory via Enumerable.Join.

This means joins are efficient when the outer result set is selective (few distinct keys), but will materialize both sides for broad queries. Lucene has no relational join engine -- this is a convenience that avoids manual materialization and in-memory joining in user code.
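
The four steps above can be sketched with plain LINQ to Objects. The tuples are hypothetical sample data, and the HashSet filter only plays the role that the text assigns to TermsFilter; none of this is the library's internal code.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

var articles = new[]
{
    (Title: "Intro to Lucene", AuthorId: "alice"),
    (Title: "Vector search",   AuthorId: "bob"),
    (Title: "Faceting",        AuthorId: "alice"),
};
var authors = new[]
{
    (Username: "alice", DisplayName: "Alice A."),
    (Username: "bob",   DisplayName: "Bob B."),
    (Username: "carol", DisplayName: "Carol C."),   // never referenced by an article
};

// Steps 1-2: materialize the outer side and collect its distinct join keys.
var outer = articles.ToList();
var keys  = new HashSet<string>(outer.Select(a => a.AuthorId));

// Step 3: restrict the inner side to those keys before fetching it.
var inner = authors.Where(u => keys.Contains(u.Username)).ToList();

// Step 4: join the two materialized sides in memory.
var joined = outer
    .Join(inner, a => a.AuthorId, u => u.Username, (a, u) => (a.Title, u.DisplayName))
    .ToList();

Console.WriteLine(keys.Count);    // 2
Console.WriteLine(inner.Count);   // 2 (carol is filtered out before the join)
Console.WriteLine(joined.Count);  // 3
```

The fewer distinct keys the outer side produces, the smaller the inner fetch becomes, which is why selective outer queries benefit most.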

Parsed string queries

For cases where you need raw Lucene query syntax — wildcards, fuzzy search, boosting, range, fielded queries — there are three approaches, from simplest to most flexible:

One-liner — AsQueryable<T>(string) parses the query against the type's DefaultSearchProperty and returns a filtered IQueryable<T>:

var results = provider.AsQueryable<Article>("kitten* OR dog*").ToList();

Inline in LINQ — .Query() embeds Lucene syntax directly in a LINQ expression, composable with other predicates:

// Against the default search property
var results = provider.AsQueryable<Article>()
    .Where(a => a.Query("kitten* OR dog*") && a.WordCount > 500)
    .ToList();

// Against a specific property
var results = provider.AsQueryable<Article>()
    .Where(a => a.Title.Query("lucene~0.8") && a.Category == "tech")
    .ToList();

Pre-built Query object — for full control, use Where(Query):

var parser = provider.CreateQueryParser<Article>();
var query = parser.Parse("title:foo* AND year:[2020 TO 2024]");
var results = provider.AsQueryable<Article>().Where(query).ToList();

Score and boost

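// `documents` stands for any IQueryable<Article>, such as the one below.
var documents = provider.AsQueryable<Article>();

// Order by relevance, highest score first.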
var top = documents
    .WhereParseQuery("body:lucene")
    .OrderByDescending(d => d.Score())
    .Take(10);

// Per-query boost. Boost expressions can use document fields.
var boosted = documents
    .WhereParseQuery("body:lucene")
    .Boost(d => d.WordCount / 100.0f);

Score() is an extension on the document type, available inside queries even when the POCO has no [QueryScore] property; mark a property [QueryScore] to also have the score materialized onto each returned object.

Pagination

Skip(n).Take(m) translates to a Lucene window read. Lucene scores the entire result set up to n + m and returns the requested slice, so very large Skip values are slower than a key-set / cursor approach — prefer cursoring on a sortable key field for deep paging.
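
The difference between offset paging and key-set (cursor) paging can be shown with an in-memory sequence. This is a minimal sketch over integers standing in for a unique, sortable key; lastSeen is a hypothetical cursor value that the caller would carry from one page to the next.

```csharp
using System;
using System.Linq;

// 1..20 stands in for documents identified by a unique, sortable key.
var ids = Enumerable.Range(1, 20).ToArray();
const int pageSize = 5;

// Offset paging: the first n + m hits must still be collected and skipped.
var offsetPage = ids.OrderBy(id => id).Skip(10).Take(pageSize).ToArray();

// Key-set paging: remember the last key of the previous page and
// filter past it, so each request only needs pageSize hits.
int lastSeen = 10;
var cursorPage = ids
    .Where(id => id > lastSeen)
    .OrderBy(id => id)
    .Take(pageSize)
    .ToArray();

Console.WriteLine(string.Join(",", offsetPage)); // 11,12,13,14,15
Console.WriteLine(string.Join(",", cursorPage)); // 11,12,13,14,15
```

Both approaches return the same page here; the cursor form keeps the window size constant no matter how deep the page is.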

Sorting and DocValues

OrderBy / OrderByDescending translate into Lucene SortFields. The selection rules:

[NumericField] properties → typed numeric SortField (correct numeric ordering).
[Field] reference-type properties with a TypeConverter implementing IComparable / IComparable<T> → custom comparator that calls CompareTo, reading bytes from FieldCache.GetTerms.
Plain [Field] strings → SortFieldType.STRING.
Anything with DocValues = true → typed SortField reading from the column store, much faster on first touch.

For hot sort fields, opt into DocValues:

[Field(DocValues = true)]
public string Title { get; set; }

[NumericField(DocValues = true)]
public DateTime PublishedAt { get; set; }

Without DocValues, the first sort on a field per segment uninverts the inverted index via FieldCache — correct, but pays an O(n) cost on first touch and holds the result in memory. With DocValues, the sort reads directly from a packed column.

Sessions, transactions, and the data provider

var directory = FSDirectory.Open(new DirectoryInfo("./index"));
var provider  = new LuceneDataProvider(directory, LuceneVersion.LUCENE_48);

using (var session = provider.OpenSession<Article>())
{
    session.Add(new Article { Id = "1", Title = "Hello" });

    var hits = session.Query()
        .Where(a => a.Title == "hello")
        .ToList();

    foreach (var doc in hits) doc.Title = doc.Title.ToUpperInvariant();

    session.Commit();    // single atomic flush; rollback on exception
}

Key points:

A session implements the unit of work pattern. It tracks every document you add, every document you read (so dirty-checking can detect mutations), and every key you delete. Commit() flushes them in a single atomic transaction; Dispose() without Commit() rolls them back.
Add on a document whose [Field(Key = true)] matches an existing key replaces the existing document, in the same transaction.
Delete(key) and DeleteAll() are also queued and flushed at commit time.
provider.AsQueryable<T>() opens a read-only view that doesn't need a session — convenient for read-only OData / API endpoints.

Cache-warming callbacks

Lucene reopens its IndexSearcher periodically as the index changes. You can register queries to run on the new searcher before it becomes visible, so the first user query doesn't pay the warm-up cost:

provider.RegisterCacheWarmingCallback<Article>(queryable =>
{
    queryable.OrderByDescending(a => a.PublishedAt).Take(50).ToList();
});

Logging

Wire up a logger factory once, at startup:

Lucene.Net.Linq.Util.Logging.LoggerFactory = LoggerFactory.Create(b =>
    b.AddConsole().SetMinimumLevel(LogLevel.Debug));

The library logs query translation steps, cache reload events, and session commit summaries. Defaults to NullLoggerFactory if you don't set one.

Vector Similarity Search

The library supports vector similarity search. String properties can opt in via [VectorField] or the fluent .AsVectorField() API. Embeddings are computed automatically at index time, and documents are ranked by similarity at query time using .Similar().

Configuring an embedding generator

Vector search requires an IEmbeddingGenerator<string, Embedding<float>> (from Microsoft.Extensions.AI). Any implementation works -- OpenAI, Azure OpenAI, Ollama, or a local model. For local / offline scenarios, ElBruno.LocalEmbeddings is a good choice -- under 20 MB and works great offline:

using ElBruno.LocalEmbeddings;
using ElBruno.LocalEmbeddings.Options;

var generator = new LocalEmbeddingGenerator(new LocalEmbeddingsOptions
{
    ModelName = "SmartComponents/bge-micro-v2",
    PreferQuantized = true
});

var provider = new LuceneDataProvider(directory, LuceneVersion.LUCENE_48);
provider.Settings.EmbeddingGenerator = generator;

Attribute mapping

Add [VectorField] alongside [Field] on any string property:

public class Article
{
    [Field(Key = true)]
    public string Id { get; set; }

    [Field, VectorField]
    public string Title { get; set; }

    [Field]
    public string Category { get; set; }
}

A common pattern is a compound property that combines multiple text fields into one search surface, marked as both the default search property and a vector field:

public class Article
{
    [Field(Key = true)]
    public string Id { get; set; }

    [Field]
    public string Title { get; set; }

    [Field]
    public string Body { get; set; }

    [Field(Default = true), VectorField]
    public string Content => $"{Title} {Body}";
}

This makes Content the target for both article.Query("lucene*") (free-text) and article.Similar("machine learning") (vector similarity).

Fluent mapping

public class ArticleMap : ClassMap<Article>
{
    public ArticleMap() : base(LuceneVersion.LUCENE_48)
    {
        Key(a => a.Id);
        Property(a => a.Title).AsVectorField();
        DefaultProperty(a => a.Title);
    }
}

Querying with `.Similar()`

Property-level -- search against a specific field's embeddings:

var results = provider.AsQueryable<Article>()
    .Where(a => a.Title.Similar("a cute cat napping"))
    .Take(5)
    .ToList();

Object-level -- search against the default search property (set via [Field(Default = true)] or classMap.DefaultProperty()):

var results = provider.AsQueryable<Article>()
    .Where(a => a.Similar("machine learning breakthroughs"))
    .Take(5)
    .ToList();

Hybrid -- .Similar() composes naturally with other predicates. Filters are applied first, then matching documents are ranked by similarity:

// Only animals, ranked by similarity
var results = provider.AsQueryable<Article>()
    .Where(a => a.Title.Similar("furry animals") && a.Category == "animals")
    .Take(3)
    .ToList();
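
The "filter first, then rank" behaviour can be pictured with a toy in-memory version. The sketch below assumes cosine similarity as the ranking measure and uses made-up 3-dimensional vectors; real embeddings come from the configured generator, and the library's actual scoring may differ.

```csharp
using System;
using System.Linq;

// Cosine similarity between two equal-length vectors.
static double Cosine(float[] a, float[] b)
{
    double dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (Math.Sqrt(na) * Math.Sqrt(nb));
}

// Toy "embeddings" for illustration only.
var docs = new[]
{
    (Title: "Cats at home",   Category: "animals", Vector: new float[] { 0.9f, 0.1f, 0.0f }),
    (Title: "Dog training",   Category: "animals", Vector: new float[] { 0.6f, 0.4f, 0.0f }),
    (Title: "GPU benchmarks", Category: "tech",    Vector: new float[] { 1.0f, 0.0f, 0.0f }),
};
var queryVector = new float[] { 1.0f, 0.0f, 0.0f };

// Filter first, then rank the survivors by similarity to the query vector.
var ranked = docs
    .Where(d => d.Category == "animals")
    .OrderByDescending(d => Cosine(d.Vector, queryVector))
    .Select(d => d.Title)
    .ToList();

Console.WriteLine(string.Join(" | ", ranked)); // Cats at home | Dog training
```

Note that "GPU benchmarks" is the closest vector overall but never appears, because the category filter removes it before ranking.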

Integration with OData

Lucene.Net.Linq supports both WCF Data Services and WebApi OData. By default, these libraries enable a feature known as Null Propagation, which adds null-safety checks to LINQ expressions to prevent a NullReferenceException when operating on a property that may be null.

A simple expression like:

from doc in Documents where doc.Name.StartsWith("Sample") select doc;

is translated into:

from doc in Documents where (doc != null && doc.Name != null
    && doc.Name.StartsWith("Sample")) select doc;

Null Propagation is designed to work with LINQ To Objects but is not required for LINQ providers such as Lucene.Net.Linq. Lucene.Net.Linq does its best to remove these null-safety checks when translating a LINQ expression tree into a Lucene Query, but for best performance it is recommended to simply turn the feature off, as in this example:

public class PackagesODataController : ODataController
{
    [EnableQuery(HandleNullPropagation = HandleNullPropagationOption.False)]
    public IQueryable<Package> Get()
    {
        return provider.AsQueryable<Package>();
    }
}

Upcoming features / ideas / bugs / known issues

See Issues on the GitHub project page.

Unsupported Characters in Indexed Properties

Some characters, such as \, :, ? and *, will not be handled correctly by Lucene.Net.Linq even when using a KeywordAnalyzer or equivalent, because these characters have special meaning to Lucene's query parser.

This means if you want to index a DOS style path such as c:\dos and later retrieve documents using the same term, it will not work properly.

These characters are perfectly fine for fields that will be analyzed by a tokenizer that would remove them, but exact matching on the entire value is not possible.

If exact matching is required, these characters should be replaced with suitable substitutes that are not reserved by Lucene.
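
One possible way to do the substitution is sketched below. ReplaceReserved is a hypothetical helper, and the underscore is an arbitrary choice of substitute; the same function would have to be applied both when indexing a value and when building the term to search for.

```csharp
using System;
using System.Text;

// Hypothetical helper: swap the reserved characters named above
// for a neutral substitute character.
static string ReplaceReserved(string value)
{
    const string reserved = "\\:?*";
    var sb = new StringBuilder(value.Length);
    foreach (var ch in value)
    {
        sb.Append(reserved.IndexOf(ch) >= 0 ? '_' : ch);
    }
    return sb.ToString();
}

Console.WriteLine(ReplaceReserved(@"c:\dos"));   // c__dos
```

The mapping is lossy: two different originals can collapse to the same stored value, so keep the unmodified value in a separate stored field if it needs to be displayed.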

3.x ⇒ 4.x Changes

A handful of subsystems shifted because the underlying Lucene API was removed or substantially reworked between 3.0.3 and 4.8:

Document-level boost was removed in Lucene 4.8 (only field-level boost remains). The old [DocumentBoost] attribute and the document boost read/write hooks have been deleted. Equivalent functionality must be expressed as field-level boost or as a custom scoring query.
Numeric-field boost is dropped — Lucene 4.8 numeric fields don't index norms, so per-field boost on Int32Field/Int64Field/etc. has no effect. The Boost property on [NumericField] is accepted for source compatibility but silently ignored at index time.
Converter-based custom sort is back, on a new code path. Properties whose type implements IComparable / IComparable<T> and has a TypeConverter (e.g. System.Version) sort correctly via GenericConvertableFieldComparatorSource / NonGenericConvertableFieldComparatorSource, which read field bytes through FieldCache.GetTerms. Value-type properties (int, bool, DateTime, nullables) cannot use this path because Lucene 4.8's FieldComparer<T> constrains T : class — mark them [NumericField] for true numeric ordering, otherwise they fall back to string sort.
MergePolicyBuilder is now a Func<MergePolicy> returning the policy to install. Lucene 4.8 requires the merge policy to be set on IndexWriterConfig before the writer is constructed, so the old delegate signature that received the live IndexWriter no longer fits.
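
The practical effect of falling back to string sort for value types can be seen with plain .NET collections. This sketch compares ordinal string ordering with numeric ordering, and shows why fixed-width ISO-8601 date strings are a case where the two happen to agree; the sample values are made up.

```csharp
using System;
using System.Globalization;
using System.Linq;

var wordCounts = new[] { 9, 100, 25 };

// String sort compares character by character, so "100" sorts before "25".
var asStrings = wordCounts
    .Select(n => n.ToString(CultureInfo.InvariantCulture))
    .OrderBy(s => s, StringComparer.Ordinal)
    .ToArray();
var asNumbers = wordCounts.OrderBy(n => n).ToArray();

Console.WriteLine(string.Join(",", asStrings)); // 100,25,9
Console.WriteLine(string.Join(",", asNumbers)); // 9,25,100

// Fixed-width ISO-8601 strings sort in chronological order.
var dates = new[] { new DateTime(2024, 3, 1), new DateTime(2020, 1, 15), new DateTime(2022, 7, 4) };
var isoSorted = dates
    .Select(d => d.ToString("yyyy-MM-ddTHH:mm:ss", CultureInfo.InvariantCulture))
    .OrderBy(s => s, StringComparer.Ordinal)
    .ToArray();

Console.WriteLine(isoSorted[0]); // 2020-01-15T00:00:00
```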

Upgrading from Lucene.Net.Linq 3.x

Iciclecreek.Lucene.Net.Linq 4.x is source-compatible for the most common usage shape — annotated POCOs, LuceneDataProvider, OpenSession, LINQ queries — but the underlying Lucene 3 → 4.8 jump forces a few changes.

Index files are not compatible. Lucene 4.x cannot read 3.x segments at all. Plan to reindex from your source of truth, or run a one-time upgrade through Lucene's IndexUpgrader 3 → 4 path before swapping libraries. Run reindexing in a separate utility against an empty directory; do not point the new library at an old index.

Step-by-step:

Replace the package reference:
```
<PackageReference Include="Iciclecreek.Lucene.Net.Linq" Version="4.8.5-beta00017" />
```
The package id changed from Lucene.Net.Linq to Iciclecreek.Lucene.Net.Linq to disambiguate this fork from the dormant original.
Retarget. The library is netstandard2.0;net8.0;net10.0. .NET Framework 4.6.1+ consumers are supported via netstandard2.0; net40–net46 consumers are not.
Update your LuceneVersion constants. Replace Lucene.Net.Util.Version.LUCENE_30 with Lucene.Net.Util.LuceneVersion.LUCENE_48 everywhere. The type was renamed in Lucene.Net 4.8.
Fix any direct Lucene.Net.* usage. The bulk of the porting effort is in the underlying Lucene 3 → 4.8 namespace and API churn, not in this library:
- Lucene.Net.QueryParsers → Lucene.Net.QueryParsers.Classic
- StandardAnalyzer keeps the Lucene.Net.Analysis.Standard namespace but now ships in the Lucene.Net.Analysis.Common package; many tokenizers moved into Lucene.Net.Analysis.Core.
- Field constructors now take a FieldType instead of separate Store/Index/TermVector enums. The library still accepts StoreMode / IndexMode / TermVectorMode on [Field]; only direct new Field(...) callers need to change.
- IndexWriter now requires an IndexWriterConfig:
```
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer);
var writer = new IndexWriter(directory, config);
```
- IndexReader.Open → DirectoryReader.Open. Most session code doesn't touch this directly.
Remove [DocumentBoost] attributes and any code that sets Document.Boost. Document-level boost is gone in Lucene 4.8. If you depended on it, fold the boost into a field boost on a discriminator field, or apply it via a CustomScoreQuery.
Drop any boost set on [NumericField]. It's silently ignored because numeric fields don't index norms in 4.8.
If you wrote a MergePolicyBuilder, change its signature from the old delegate to Func<MergePolicy> and return the policy instance directly. The library installs it on the IndexWriterConfig before constructing the writer.
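A sketch of the new builder shape follows. Where the delegate is assigned on the library's settings is not shown here, so the snippet only builds the `Func<MergePolicy>` and applies it to an `IndexWriterConfig` by hand to illustrate the idea; the choice of `TieredMergePolicy` and its setting is an arbitrary example.

```csharp
using System;
using Lucene.Net.Analysis.Standard;
using Lucene.Net.Index;
using Lucene.Net.Util;

// New shape: no arguments, returns the policy instance directly.
Func<MergePolicy> mergePolicyBuilder = () => new TieredMergePolicy
{
    SegmentsPerTier = 8.0, // example value only
};

// Illustrative stand-in for what happens before the writer is constructed.
using var analyzer = new StandardAnalyzer(LuceneVersion.LUCENE_48);
var config = new IndexWriterConfig(LuceneVersion.LUCENE_48, analyzer)
{
    MergePolicy = mergePolicyBuilder(),
};

Console.WriteLine(config.MergePolicy.GetType().Name);
```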
If you sort by a value-type property without [NumericField] (e.g. a plain int or DateTime), be aware that 4.8 will sort it lexicographically by string form. Add [NumericField] for true numeric ordering, or accept the string sort if it happens to match (e.g. ISO-8601 DateTime strings).
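The difference between string order and numeric order is easy to see in plain C#, without Lucene involved. This sketch assumes values are compared by their invariant-culture string form; the exact string form the library indexes may differ.

```csharp
using System;
using System.Globalization;
using System.Linq;

// Integers compared by their string form do not come out in numeric order.
int[] numbers = { 9, 10, 100, 2 };
string[] asStrings = numbers
    .Select(n => n.ToString(CultureInfo.InvariantCulture))
    .OrderBy(s => s, StringComparer.Ordinal)
    .ToArray();
Console.WriteLine(string.Join(", ", asStrings)); // 10, 100, 2, 9

// Fixed-width ISO-8601 timestamps sort the same way as the underlying values.
DateTime[] dates =
{
    new DateTime(2024, 12, 1),
    new DateTime(2023, 1, 15),
    new DateTime(2024, 2, 3),
};
string[] byValue = dates
    .OrderBy(d => d)
    .Select(d => d.ToString("o", CultureInfo.InvariantCulture))
    .ToArray();
string[] byString = dates
    .Select(d => d.ToString("o", CultureInfo.InvariantCulture))
    .OrderBy(s => s, StringComparer.Ordinal)
    .ToArray();
Console.WriteLine(byValue.SequenceEqual(byString)); // True
```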
Replace Common.Logging wiring with Microsoft.Extensions.Logging:
```
Lucene.Net.Linq.Util.Logging.LoggerFactory = myLoggerFactory;
```
Defaults to NullLoggerFactory if you don't set one.
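One way to obtain a factory to assign, sketched with the standard Microsoft.Extensions.Logging builder. The console provider is an assumption for illustration (it comes from the Microsoft.Extensions.Logging.Console package); any `ILoggerFactory`, including one resolved from dependency injection, should work the same way.

```csharp
using Microsoft.Extensions.Logging;

// Build a factory that writes Debug-and-above messages to the console.
ILoggerFactory myLoggerFactory = LoggerFactory.Create(builder =>
    builder.AddConsole().SetMinimumLevel(LogLevel.Debug));

// Assignment taken from the step above; set it once at startup.
Lucene.Net.Linq.Util.Logging.LoggerFactory = myLoggerFactory;
```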
(Optional) Opt into DocValues on hot sort fields by adding DocValues = true to [Field] / [NumericField] attributes on properties you OrderBy heavily.
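On a mapped class, the opt-in might look like the sketch below. The `Article` type and its properties are hypothetical, and the `Lucene.Net.Linq.Mapping` namespace is assumed from the 3.x library rather than confirmed for this fork.

```csharp
using Lucene.Net.Linq.Mapping; // assumed namespace for the mapping attributes

// Hypothetical document type used only for illustration.
public class Article
{
    // Rarely sorted on, so left without DocValues.
    [Field]
    public string Title { get; set; }

    // Heavily used in OrderBy, so opt into DocValues.
    [NumericField(DocValues = true)]
    public int ViewCount { get; set; }

    [Field(DocValues = true)]
    public string Author { get; set; }
}
```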

Product	Compatible and additional computed target framework versions.
.NET	net5.0 was computed. net5.0-windows was computed. net6.0 was computed. net6.0-android was computed. net6.0-ios was computed. net6.0-maccatalyst was computed. net6.0-macos was computed. net6.0-tvos was computed. net6.0-windows was computed. net7.0 was computed. net7.0-android was computed. net7.0-ios was computed. net7.0-maccatalyst was computed. net7.0-macos was computed. net7.0-tvos was computed. net7.0-windows was computed. net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 is compatible. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed.
.NET Core	netcoreapp2.0 was computed. netcoreapp2.1 was computed. netcoreapp2.2 was computed. netcoreapp3.0 was computed. netcoreapp3.1 was computed.
.NET Standard	netstandard2.0 is compatible. netstandard2.1 was computed.
.NET Framework	net461 was computed. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed.
MonoAndroid	monoandroid was computed.
MonoMac	monomac was computed.
MonoTouch	monotouch was computed.
Tizen	tizen40 was computed. tizen60 was computed.
Xamarin.iOS	xamarinios was computed.
Xamarin.Mac	xamarinmac was computed.
Xamarin.TVOS	xamarintvos was computed.
Xamarin.WatchOS	xamarinwatchos was computed.

.NETStandard 2.0
- Iciclecreek.Lucene.Net.Vector (>= 2.0.2)
- Lucene.Net (>= 4.8.0-beta00017)
- Lucene.Net.Analysis.Common (>= 4.8.0-beta00017)
- Lucene.Net.Queries (>= 4.8.0-beta00017)
- Lucene.Net.QueryParser (>= 4.8.0-beta00017)
- Microsoft.Extensions.AI.Abstractions (>= 10.5.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
- Remotion.Linq (>= 2.2.0)
net10.0
- Iciclecreek.Lucene.Net.Vector (>= 2.0.2)
- Lucene.Net (>= 4.8.0-beta00017)
- Lucene.Net.Analysis.Common (>= 4.8.0-beta00017)
- Lucene.Net.Queries (>= 4.8.0-beta00017)
- Lucene.Net.QueryParser (>= 4.8.0-beta00017)
- Microsoft.Extensions.AI.Abstractions (>= 10.5.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
- Remotion.Linq (>= 2.2.0)
net8.0
- Iciclecreek.Lucene.Net.Vector (>= 2.0.2)
- Lucene.Net (>= 4.8.0-beta00017)
- Lucene.Net.Analysis.Common (>= 4.8.0-beta00017)
- Lucene.Net.Queries (>= 4.8.0-beta00017)
- Lucene.Net.QueryParser (>= 4.8.0-beta00017)
- Microsoft.Extensions.AI.Abstractions (>= 10.5.0)
- Microsoft.Extensions.Logging.Abstractions (>= 10.0.7)
- Remotion.Linq (>= 2.2.0)

NuGet packages (1)

Showing the top 1 NuGet packages that depend on Iciclecreek.Lucene.Net.Linq:

Package	Downloads
LottaDB LottaDB is a .NET library that stores POCO objects => Table Storage and Lucene catalogs with all of the goodness of LINQ.	762

GitHub repositories

This package is not used by any popular GitHub repositories.

Version	Downloads	Last Updated
4.8.5-beta00017	120	5/11/2026
4.8.4-beta00017	160	4/23/2026
4.8.3-beta00017	63	4/22/2026
4.8.2-beta00017	61	4/20/2026
4.8.1-beta00017	95	4/15/2026
4.8.0-beta00017	63	4/8/2026

See https://github.com/tomlm/Iciclecreek.Lucene.Net.Linq/releases
