DuckDB.OrmLite 1.5.1

.NET CLI:
    dotnet add package DuckDB.OrmLite --version 1.5.1

Package Manager Console (Visual Studio; uses the NuGet module's version of Install-Package):
    NuGet\Install-Package DuckDB.OrmLite -Version 1.5.1

PackageReference (copy this XML node into the project file):
    <PackageReference Include="DuckDB.OrmLite" Version="1.5.1" />

Central Package Management (CPM; version in the solution Directory.Packages.props, reference in the project file):
    <PackageVersion Include="DuckDB.OrmLite" Version="1.5.1" />
    <PackageReference Include="DuckDB.OrmLite" />

Paket CLI:
    paket add DuckDB.OrmLite --version 1.5.1

F# Interactive / Polyglot Notebooks (copy into the interactive tool or script source):
    #r "nuget: DuckDB.OrmLite, 1.5.1"

C# file-based apps (.NET 10 preview 4+; place before any lines of code):
    #:package DuckDB.OrmLite@1.5.1

Cake Addin:
    #addin nuget:?package=DuckDB.OrmLite&version=1.5.1

Cake Tool:
    #tool nuget:?package=DuckDB.OrmLite&version=1.5.1

DuckDB.OrmLite

DuckDB provider for ServiceStack.OrmLite - A fast, simple, and typed ORM for .NET.

License: MIT

Disclaimer

This is an independent, community-maintained provider for ServiceStack.OrmLite. It is not officially maintained or endorsed by ServiceStack.

About

This package enables ServiceStack.OrmLite to work with DuckDB, an in-process analytical database management system. DuckDB excels at:

  • Fast analytical queries - Columnar storage and vectorized execution
  • In-process - No separate server process required
  • SQL standard - Familiar SQL syntax with PostgreSQL compatibility
  • Data processing - Native support for Parquet, CSV, JSON
  • OLAP workloads - Aggregations, window functions, complex analytics
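
For example, DuckDB's file readers are available straight through OrmLite's raw SQL APIs. A minimal sketch (sales.parquet and its columns are hypothetical; read_parquet() is DuckDB's built-in Parquet reader):

using ServiceStack.OrmLite;
using DuckDB.OrmLite;

var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=:memory:");
using var db = dbFactory.Open();

// Aggregate a Parquet file without importing it first
var salesByRegion = db.SqlList<dynamic>(@"
    SELECT Region, SUM(Amount) AS Total
    FROM read_parquet('sales.parquet')
    GROUP BY Region");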

Features

Full OrmLite Support

  • Complete CRUD operations (sync + async)
  • LINQ query expressions
  • Transactions
  • AutoIncrement with sequences and INSERT...RETURNING
  • Complex queries (JOINs, aggregations, subqueries)
  • Parameterized queries
  • Batch operations
  • Async/await support (pseudo-async)
  • High-performance bulk insert - 10-100x faster using DuckDB's Appender API
  • Bulk insert with deduplication - Production-safe staging table pattern for massive tables
  • Type-safe LINQ expressions - IntelliSense support for unique column selection
  • Multi-database support - Query across multiple DuckDB files transparently

Complete Type Support

  • All .NET primitive types
  • DateTime, DateTimeOffset, TimeSpan
  • Decimal and all integer types
  • Guid (UUID)
  • byte[] (BLOB)
  • Nullable types
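
As a quick illustration, a model exercising these mappings might look like this (a sketch; the type and property names are made up for the example):

public class Measurement
{
    [AutoIncrement]
    public int Id { get; set; }

    public Guid Token { get; set; }             // stored as UUID
    public byte[] Payload { get; set; }         // stored as BLOB
    public DateTimeOffset RecordedAt { get; set; }
    public TimeSpan Elapsed { get; set; }       // see Limitations: ~24h in HH:MM:SS form
    public decimal Amount { get; set; }
    public int? OptionalRank { get; set; }      // nullable types map to NULLable columns
}

db.CreateTable<Measurement>();
db.Insert(new Measurement
{
    Token = Guid.NewGuid(),
    Payload = new byte[] { 0x01, 0x02 },
    RecordedAt = DateTimeOffset.UtcNow,
    Elapsed = TimeSpan.FromMinutes(5),
    Amount = 9.99m
});
var roundTripped = db.Single<Measurement>(m => m.Amount > 0);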

Production Ready

  • 100 comprehensive tests (100% passing)
  • Optimized for DuckDB 1.3.2
  • SQL injection prevention
  • Robust error handling
  • Connection timeout/retry for multi-process scenarios

Installation

dotnet add package DuckDB.OrmLite

Quick Start

using ServiceStack.OrmLite;
using DuckDB.OrmLite;

// Create connection factory
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db");

// Define your models
public class Customer
{
    [AutoIncrement]
    public int Id { get; set; }

    [Required]
    public string Name { get; set; }

    public string Email { get; set; }

    public decimal CreditLimit { get; set; }

    public DateTime RegisteredAt { get; set; }
}

// Use OrmLite
using var db = dbFactory.Open();

// Create table
db.CreateTable<Customer>();

// Insert
var customer = new Customer
{
    Name = "Acme Corp",
    Email = "contact@acme.com",
    CreditLimit = 50000,
    RegisteredAt = DateTime.UtcNow
};
db.Insert(customer);

// Query with LINQ
var highValueCustomers = db.Select<Customer>(c =>
    c.CreditLimit > 10000 && c.RegisteredAt > DateTime.UtcNow.AddMonths(-6)
);

// Update
customer.CreditLimit = 75000;
db.Update(customer);

// Aggregations
var totalCredit = db.Scalar<decimal>(
    db.From<Customer>().Select(c => Sql.Sum(c.CreditLimit))
);

// JOINs
var orders = db.Select(db.From<Order>()
    .Join<Customer>((o, c) => o.CustomerId == c.Id)
    .Where<Customer>(c => c.Name == "Acme Corp")
);

// Transactions
using (var trans = db.OpenTransaction())
{
    db.Insert(customer);
    db.Insert(order);
    trans.Commit();
}

Async/Await Support

DuckDB.OrmLite supports async/await for all operations through ServiceStack.OrmLite's built-in async APIs.

Important: Since DuckDB.NET.Data does not provide native async I/O operations, these are pseudo-async implementations (similar to SQLite). Operations will block the calling thread but provide API compatibility with other OrmLite providers.

Async Examples

using var db = dbFactory.Open();

// SELECT operations
var customers = await db.SelectAsync<Customer>(c => c.CreditLimit > 10000);
var customer = await db.SingleAsync<Customer>(c => c.Id == 1);
var count = await db.CountAsync<Customer>();

// INSERT operations
await db.InsertAsync(new Customer { Name = "New Corp", ... });
await db.InsertAllAsync(customers);

// UPDATE operations
customer.CreditLimit = 75000;
await db.UpdateAsync(customer);
await db.UpdateAllAsync(customers);

// DELETE operations
await db.DeleteAsync<Customer>(c => c.CreditLimit < 1000);
await db.DeleteByIdAsync<Customer>(customerId);

// SQL operations
var results = await db.SqlListAsync<Customer>("SELECT * FROM Customer WHERE CreditLimit > 50000");
var total = await db.SqlScalarAsync<decimal>("SELECT SUM(CreditLimit) FROM Customer");

// Transactions work with async too
using (var trans = db.OpenTransaction())
{
    await db.InsertAsync(customer);
    await db.InsertAsync(order);
    trans.Commit();
}

Async Limitations

⚠️ Not true async I/O: Operations still block the calling thread internally

  • Suitable for maintaining consistent async/await code style
  • API-compatible with other OrmLite providers
  • Not suitable for high-concurrency scenarios expecting true async I/O benefits
  • Consider using synchronous methods if async benefits are not needed

Multi-Database Support

Query across multiple DuckDB database files transparently - perfect for time-series data, archival scenarios, or partitioned datasets.

Basic Configuration

var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
    .WithAdditionalDatabases("archive_2024.db", "archive_2023.db")
    .WithMultiDatabaseTables("CmcPrice", "FiatPrice");

// Queries automatically span all databases
using var db = factory.Open();
var allPrices = db.Select<CmcPrice>(x => x.Symbol == "BTC");
// Returns data from main.db, archive_2024.db, and archive_2023.db

// Writes go to main database only
using var writeDb = factory.OpenForWrite();
writeDb.Insert(new CmcPrice { Date = DateTime.Today, Symbol = "ETH", Price = 3500 });

Type-Safe Configuration

// Generic factory - automatically configures multi-database support for CmcPrice
var factory = new DuckDbOrmLiteConnectionFactory<CmcPrice>("Data Source=main.db")
    .WithAdditionalDatabases("archive.db");

// Or explicit configuration
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
    .WithAdditionalDatabases("archive.db")
    .WithMultiDatabaseTable<CmcPrice>();

Use Case: Time-Series Data with Daily Updates

// Setup: Current year + archives
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=prices_2025.db")
    .WithAdditionalDatabases("prices_2024.db", "prices_2023.db", "prices_2022.db")
    .WithMultiDatabaseTables("CmcPrice");

// Read: Query spans all years transparently
using (var db = factory.Open())
{
    // Get Bitcoin prices for last 2 years
    var btcPrices = db.Select<CmcPrice>(x =>
        x.Symbol == "BTC" &&
        x.Date >= DateTime.Today.AddYears(-2));

    // Aggregations work across all databases
    var avgPrice = db.Scalar<decimal>(
        db.From<CmcPrice>()
            .Where(x => x.Symbol == "BTC")
            .Select(x => Sql.Avg(x.Price))
    );
}

// Write: New data goes to current year database
using (var writeDb = factory.OpenForWrite())
{
    writeDb.Insert(new CmcPrice
    {
        Date = DateTime.Today,
        Symbol = "BTC",
        Price = 95000
    });
}

How It Works

  1. Automatic ATTACH: Additional databases are attached on connection open
  2. Unified Views: Creates {TableName}_Unified views with UNION ALL across all databases
  3. Query Routing: Read queries automatically use unified views; writes go directly to main database
  4. Zero Code Changes: Application code using db.Select<T>() works unchanged
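
Conceptually, opening a read connection performs the equivalent of the following (a sketch; alias and view names are illustrative, and the provider generates the actual statements):

// On factory.Open(), roughly:
//   ATTACH 'archive_2024.db' AS archive_2024;
//   ATTACH 'archive_2023.db' AS archive_2023;
//   CREATE VIEW CmcPrice_Unified AS
//     SELECT * FROM main.CmcPrice
//     UNION ALL SELECT * FROM archive_2024.CmcPrice
//     UNION ALL SELECT * FROM archive_2023.CmcPrice;
using var db = factory.Open();

// Reads against CmcPrice are transparently routed to the unified view
var all = db.Select<CmcPrice>();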

Multi-Database Features

All OrmLite operations work across databases:

  • SELECT with WHERE, ORDER BY, LIMIT
  • Aggregations (COUNT, SUM, AVG, MAX, MIN)
  • JOINs (multi-db table with single-db table)
  • Complex LINQ expressions
  • Async operations

Smart table detection:

  • Only creates views for tables that exist in databases
  • Handles tables existing in subset of databases

Flexible configuration:

  • Mix multi-db and single-db tables in same factory
  • Toggle auto-configuration with .WithAutoConfigureViews(false)

Multi-Database Limitations

⚠️ Important considerations:

  • Schema consistency: Tables must have identical schemas across all databases
  • No cross-database transactions: Transactions only work with OpenForWrite() (single database)
  • Read-only archives: Additional databases should be read-only for data consistency
  • No automatic deduplication: UNION ALL doesn't deduplicate - ensure partitioning prevents duplicates

Connection Timeout and Retry

DuckDB uses an exclusive file lock when opening a database. If another process has the database open, subsequent attempts will fail immediately. The timeout feature provides automatic retry with exponential backoff:

// Default behavior - fail immediately if database is locked
using var db = factory.Open();

// Retry for up to 30 seconds with exponential backoff
using var db = factory.Open(TimeSpan.FromSeconds(30));

// Also available for write connections
using var writeDb = factory.OpenForWrite(TimeSpan.FromSeconds(30));

// Customize retry delays (useful for high-contention scenarios)
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db")
{
    RetryDelayMs = 100,      // Start with 100ms delays (default: 50ms)
    MaxRetryDelayMs = 5000   // Cap at 5 seconds (default: 1000ms)
};

How it works:

  • Detects lock errors: "Could not set lock", "database is locked", "IO Error"
  • Exponential backoff: starts at RetryDelayMs, doubles each retry, caps at MaxRetryDelayMs
  • Random jitter (up to 25% of delay) to avoid thundering herd
  • Throws TimeoutException if lock not acquired within timeout
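
In pseudo-code, the retry loop behaves roughly like this (a sketch of the behavior described above, not the provider's actual implementation):

using System;
using System.Data;
using System.Threading;

static IDbConnection OpenWithRetry(Func<IDbConnection> open, TimeSpan timeout,
    int retryDelayMs = 50, int maxRetryDelayMs = 1000)
{
    var deadline = DateTime.UtcNow + timeout;
    var delay = retryDelayMs;
    var rng = new Random();
    while (true)
    {
        try { return open(); }
        catch (Exception ex) when (ex.Message.Contains("lock") || ex.Message.Contains("IO Error"))
        {
            if (DateTime.UtcNow >= deadline)
                throw new TimeoutException("Database lock not acquired within timeout.", ex);
            Thread.Sleep(delay + rng.Next(delay / 4 + 1));  // jitter: up to 25% of delay
            delay = Math.Min(delay * 2, maxRetryDelayMs);   // exponential backoff, capped
        }
    }
}

// Usage: var db = OpenWithRetry(() => factory.Open(), TimeSpan.FromSeconds(30));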

Use cases:

  • Multi-process scenarios where processes may temporarily lock the database
  • Automated scripts that need to wait for other processes to finish
  • Services that need resilient database access

Concurrency notes:

  • Within same process: Multiple threads can read/write concurrently (MVCC)
  • Across processes: Only one process can open at a time (exclusive lock)
  • Use timeout feature for cross-process coordination
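
For same-process concurrency, one straightforward pattern is giving each thread its own connection from the shared factory (a sketch reusing the CmcPrice model from the examples above):

using System.Threading.Tasks;

// Several threads reading the same database file concurrently
Parallel.For(0, 4, _ =>
{
    using var db = factory.Open();
    var btcCount = db.Count<CmcPrice>(x => x.Symbol == "BTC");
});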

Best Practices

Recommended partitioning strategies:

  • Time-based: One database per year/month
  • Categorical: Separate databases by data type or category
  • Archival: Current database + historical archives

Workflow pattern:

// Daily process: Keep current database for writes
using (var writeDb = factory.OpenForWrite())
{
    writeDb.InsertAll(todaysData);
}

// Analytics: Query across all time periods
using (var readDb = factory.Open())
{
    var historicalTrends = readDb.Select<CmcPrice>(x =>
        x.Date >= new DateTime(2020, 1, 1));
}

// Year-end: Rotate current to archive
// 1. Copy prices_2025.db to archive location
// 2. Update factory configuration to include new archive
// 3. Create fresh prices_2026.db for new year
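
A hedged sketch of that rotation (paths are illustrative; close all connections to the source database before copying):

using System.IO;

// 1. Copy the current-year database to the archive location
File.Copy("prices_2025.db", Path.Combine("archives", "prices_2025.db"));

// 2-3. Reconfigure the factory: fresh 2026 database plus the new archive
var factory2026 = new DuckDbOrmLiteConnectionFactory("Data Source=prices_2026.db")
    .WithAdditionalDatabases(
        Path.Combine("archives", "prices_2025.db"),
        "prices_2024.db", "prices_2023.db")
    .WithMultiDatabaseTables("CmcPrice");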

Use Cases

Data Analysis & Reporting

// DuckDB excels at analytical queries
var salesByMonth = db.SqlList<dynamic>(@"
    SELECT
        DATE_TRUNC('month', OrderDate) as Month,
        COUNT(*) as OrderCount,
        SUM(TotalAmount) as Revenue
    FROM Orders
    GROUP BY Month
    ORDER BY Month DESC
");

ETL & Data Processing

// High-performance bulk insert (10-100x faster than InsertAll)
db.BulkInsert(largeDataset);  // Uses DuckDB's native Appender API

// For smaller datasets or when transaction control is needed
db.InsertAll(smallDataset);

// Process with SQL
db.ExecuteSql(@"
    INSERT INTO ProcessedOrders
    SELECT * FROM Orders
    WHERE Status = 'completed'
    AND OrderDate > CURRENT_DATE - INTERVAL '30 days'
");

In-Memory Analytics

// Use in-memory database for fast processing
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=:memory:");

High-Performance Bulk Insert

For maximum performance when inserting large datasets, use BulkInsert(), which leverages DuckDB's native Appender API:

using var db = dbFactory.Open();

var products = new List<Product>();
for (int i = 0; i < 100000; i++)
{
    products.Add(new Product {
        Id = i,
        Name = $"Product {i}",
        Price = i * 1.5m
    });
}

// BulkInsert is 10-100x faster than InsertAll for large datasets
db.BulkInsert(products);  // Uses DuckDB Appender API

Performance Comparison

Method              1,000 rows   10,000 rows   100,000 rows
InsertAll()         ~100ms       ~1s           ~10s
BulkInsert()        ~10ms        ~50ms         ~500ms
Speed improvement   10x          20x           20-100x

BulkInsert Features

✅ Blazing fast - Uses DuckDB's native Appender API for direct memory-to-database transfer
✅ All data types supported - DateTime, Guid, decimal, byte[], TimeSpan, etc.
✅ Async support - BulkInsertAsync() available (pseudo-async wrapper)
✅ Simple API - Drop-in replacement for InsertAll()

BulkInsert Limitations

⚠️ Important considerations:

  • No transaction participation - Appender auto-commits on completion
  • No return values - Generated IDs are not returned (unlike Insert())
  • All-or-nothing - If any row fails, entire batch fails
  • For transactions, use InsertAll() - Standard inserts support explicit transactions

BulkInsert vs InsertAll

Use BulkInsert() when:

  • Inserting 100+ rows
  • Performance is critical
  • You don't need generated ID values returned
  • Auto-commit behavior is acceptable

Use InsertAll() when:

  • Inserting < 100 rows
  • You need transaction control (explicit BEGIN/COMMIT)
  • You need to mix inserts with other operations in a transaction
  • You need generated ID values returned
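
For example, keep InsertAll() inside a transaction when inserts must stay atomic with other statements, and reach for BulkInsert() when throughput matters and auto-commit is acceptable (sketch; orderLines, orderHeader, and auditEvents are hypothetical):

// Transactional path - all-or-nothing together with the header update
using (var trans = db.OpenTransaction())
{
    db.InsertAll(orderLines);
    db.Update(orderHeader);
    trans.Commit();
}

// Throughput path - auto-commits, no generated IDs returned
db.BulkInsert(auditEvents);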

Async Bulk Insert

using var db = dbFactory.Open();

var products = GetLargeDataset();
await db.BulkInsertAsync(products);

// Note: This is pseudo-async (wraps sync operation)
// since DuckDB.NET doesn't provide native async Appender

Bulk Insert with Deduplication (Staging Table Pattern)

For large tables where indexes cannot fit in memory (a known DuckDB limitation), use BulkInsertWithDeduplication(). This is the RECOMMENDED approach for production use with tables containing hundreds of millions of rows.

Why Staging Tables?

DuckDB's ART indexes must fit in memory. With 845+ million rows, maintaining PRIMARY KEY or UNIQUE constraints becomes impractical. The staging table pattern provides:

✅ Zero risk to main table - Data validated in staging before touching main table
✅ Atomic duplicate detection - SQL JOIN ensures no duplicates
✅ Fast rollback - Just drop staging table on error
✅ Minimal lock time - Main table locked only during final INSERT SELECT
✅ High performance - Uses Appender API for staging table loading

Basic Usage

using var db = dbFactory.Open();

// Your 845M row table with composite unique key
public class CryptoPrice
{
    public DateTime Timestamp { get; set; }
    public string Symbol { get; set; }
    public long ExchangeId { get; set; }
    public decimal Price { get; set; }
}

// Load 70,000 new records (internally unique, but may overlap with existing)
var newPrices = LoadNewPrices(); // 70K records

// Option 1: LINQ expression (type-safe, recommended)
var insertedCount = db.BulkInsertWithDeduplication(
    newPrices,
    x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);

// Option 2: String column names (flexible)
var insertedCount = db.BulkInsertWithDeduplication(
    newPrices,
    "Timestamp", "Symbol", "ExchangeId"
);

Console.WriteLine($"Inserted {insertedCount} new records (duplicates filtered)");

How It Works

// Step 1: Create temporary staging table (same schema as main)
CREATE TABLE CryptoPrice_Staging_<guid> AS SELECT * FROM CryptoPrice LIMIT 0

// Step 2: BulkInsert into staging (fast, isolated from main table)
// Uses Appender API - 10-100x faster than InsertAll

// Step 3: Atomic INSERT SELECT with deduplication (handles both internal and external duplicates)
INSERT INTO CryptoPrice
WITH DeduplicatedStaging AS (
    SELECT *, ROW_NUMBER() OVER (PARTITION BY Timestamp, Symbol, ExchangeId ORDER BY (SELECT NULL)) as rn
    FROM CryptoPrice_Staging_<guid>
)
SELECT columns
FROM DeduplicatedStaging s
LEFT JOIN CryptoPrice m ON
    s.Timestamp = m.Timestamp AND
    s.Symbol = m.Symbol AND
    s.ExchangeId = m.ExchangeId
WHERE m.Timestamp IS NULL  -- Not in main table
  AND s.rn = 1             -- First occurrence from incoming data

// Step 4: DROP staging table (always executed, even on error)

Duplicate Handling:

  • Internal duplicates (within incoming data): Keeps first occurrence only
  • External duplicates (with main table): Filters out all matches
  • Deterministic behavior ensures consistent results
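
A small illustration of both duplicate kinds (values are hypothetical):

var t = DateTime.UtcNow;
var incoming = new List<CryptoPrice>
{
    new() { Timestamp = t, Symbol = "BTC", ExchangeId = 1, Price = 95000m },
    new() { Timestamp = t, Symbol = "BTC", ExchangeId = 1, Price = 95001m }, // internal duplicate: only the first is kept
    new() { Timestamp = t, Symbol = "ETH", ExchangeId = 1, Price = 3500m },  // assume this key already exists in the main table
};

var inserted = db.BulkInsertWithDeduplication(
    incoming,
    x => new { x.Timestamp, x.Symbol, x.ExchangeId });
// inserted == 1: the second BTC row and the pre-existing ETH key are filtered out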

Specifying Unique Columns

Option 1: LINQ Expression (Type-Safe, Recommended)

// Single column
var insertedCount = db.BulkInsertWithDeduplication(
    users,
    x => x.Email
);

// Multiple columns
var insertedCount = db.BulkInsertWithDeduplication(
    prices,
    x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);

// Async
var insertedCount = await db.BulkInsertWithDeduplicationAsync(
    prices,
    x => new { x.Timestamp, x.Symbol }
);

Option 2: String Column Names (Flexible)

var insertedCount = db.BulkInsertWithDeduplication(
    prices,
    "Timestamp", "Symbol", "ExchangeId"
);

Option 3: Auto-Detect from Attributes

// CompositeIndex attribute
[CompositeIndex(nameof(Timestamp), nameof(Symbol), nameof(ExchangeId), Unique = true)]
public class CryptoPrice { ... }

// Auto-detects unique columns from attribute
var insertedCount = db.BulkInsertWithDeduplication(newPrices);

// Single [Unique] attribute
public class User
{
    [Unique]
    public string Email { get; set; }
    ...
}

var insertedCount = db.BulkInsertWithDeduplication(newUsers);

Performance Characteristics

Operation                 Time (70K records)   Notes
Append to staging         ~5-10ms              Appender API - blazing fast
INSERT SELECT with JOIN   ~50-200ms            Depends on main table size
Drop staging              ~1ms                 Cleanup
Total overhead            ~60-210ms            Minimal cost for safety

Compare to risk of direct append:

  • Direct append failure: Hours to recover 845M row table
  • Duplicate corruption: Potentially unfixable without backups

845M Row Production Scenario

// Daily ETL process for massive time-series table
public async Task LoadDailyPrices()
{
    using var db = dbFactory.Open();

    // Main table: 845,000,000 rows
    // New batch: 70,000 rows (may contain internal duplicates)
    // Expected: ~500 duplicates with main table, ~69,500 new records

    var newBatch = await FetchDailyPricesAsync(); // 70K records

    var sw = Stopwatch.StartNew();
    var insertedCount = db.BulkInsertWithDeduplication(
        newBatch,
        "Timestamp", "Symbol", "ExchangeId"
    );
    sw.Stop();

    logger.LogInformation(
        "Inserted {Inserted} of {Total} records in {Ms}ms (duplicates filtered: {Duplicates})",
        insertedCount,
        newBatch.Count,
        sw.ElapsedMilliseconds,
        newBatch.Count - insertedCount
    );

    // Typical output:
    // Inserted 69,500 of 70,000 records in 180ms (duplicates filtered: 500)
}

Async Support

var insertedCount = await db.BulkInsertWithDeduplicationAsync(
    newRecords,
    "Timestamp", "Symbol", "ExchangeId"
);

When to Use Each Method

Use BulkInsertWithDeduplication() when:

  • ✅ Table has 100M+ rows (indexes can't fit in memory)
  • ✅ You need duplicate prevention without UNIQUE constraints
  • ✅ Main table safety is critical (zero corruption risk)
  • ✅ You're doing ETL/data loading with potential duplicates
  • ✅ You want automatic cleanup and transactional safety

Use BulkInsert() when:

  • ✅ Data is guaranteed unique (no duplicates possible)
  • ✅ Table is small enough for UNIQUE constraints
  • ✅ Maximum speed is priority (no duplicate checking needed)
  • ✅ You're loading into a temporary/staging table

Use InsertAll() when:

  • ✅ Small datasets (< 100 rows)
  • ✅ Need explicit transaction control
  • ✅ Need generated ID values returned

Configuration

Connection Strings

File-based database:

"Data Source=myapp.db"

In-memory database:

"Data Source=:memory:"

Read-only mode:

"Data Source=myapp.db;Read Only=true"

Required Setup

DuckDB requires parameter handling that differs slightly from other databases. The provider includes a BeforeExecFilter that handles this automatically:

// Automatically configured by DuckDbOrmLiteConnectionFactory
// Handles:
// - Parameter name conversion ($ prefix handling)
// - 1-based positional parameter indexing
// - DbType.Currency → DbType.Decimal conversion
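
In spirit, the filter does something like the following (an illustrative sketch only; the factory registers the real filter, so you never write this yourself):

using System.Data;
using ServiceStack.OrmLite;

OrmLiteConfig.BeforeExecFilter = cmd =>
{
    foreach (IDbDataParameter p in cmd.Parameters)
    {
        // DuckDB has no money type, so Currency is mapped to Decimal
        if (p.DbType == DbType.Currency)
            p.DbType = DbType.Decimal;
        // ...plus the $-prefix and 1-based positional-index normalization noted above
    }
};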

Dependencies

This package depends on:

  • DuckDB.NET.Data - the ADO.NET provider for DuckDB
  • ServiceStack.OrmLite - the ORM framework

Both dependencies are automatically installed when you add this package.

Compatibility

  • .NET: 8.0+
  • DuckDB: 1.3.2+
  • ServiceStack.OrmLite: 8.5.2+

Performance

DuckDB is optimized for analytical workloads:

  • Fast aggregations - Columnar storage enables efficient aggregations
  • Vectorized execution - SIMD optimizations for bulk operations
  • Memory efficient - Optimized for large datasets
  • Zero-copy reads - Direct memory access where possible

Limitations

  • TimeSpan: Limited to ~24 hours when using HH:MM:SS format
  • Concurrent writes: DuckDB uses single-writer model

Testing

# Run all tests
dotnet test

# Run specific test category
dotnet test --filter "FullyQualifiedName~AdvancedFeatureTests"

Test coverage:

  • 25 core OrmLite functionality tests
  • 15 advanced feature tests (JOINs, aggregations, edge cases)
  • 17 async/await tests
  • 18 multi-database tests
  • 11 concurrency tests (thread safety, optimistic concurrency)
  • 4 generic factory tests
  • 90 total tests (100% passing)
  • Production-ready error handling and SQL injection prevention

Contributing

Contributions are welcome! Please feel free to submit issues or pull requests.

License

This project is licensed under the MIT License - see the LICENSE.md file for details.


Version History

Version   Downloads   Last Updated
1.5.3     0           10/7/2025
1.5.2     27          10/6/2025
1.5.1     24          10/6/2025
1.5.0     37          10/5/2025
1.4.0     72          10/3/2025
1.3.0     119         10/2/2025
1.1.0     111         10/1/2025
1.0.1     108         10/1/2025
1.0.0     106         10/1/2025

Release Notes

v1.5.1: CRITICAL FIX for BulkInsertWithDeduplication - Now handles internal duplicates within the incoming dataset. Uses the ROW_NUMBER window function for efficient in-database deduplication. 122 tests (100% passing). Production-ready for datasets with duplicate rows.