DuckDB.OrmLite
1.5.1
See the version list below for details.
dotnet add package DuckDB.OrmLite --version 1.5.1
NuGet\Install-Package DuckDB.OrmLite -Version 1.5.1
<PackageReference Include="DuckDB.OrmLite" Version="1.5.1" />
<PackageVersion Include="DuckDB.OrmLite" Version="1.5.1" />
<PackageReference Include="DuckDB.OrmLite" />
paket add DuckDB.OrmLite --version 1.5.1
#r "nuget: DuckDB.OrmLite, 1.5.1"
#:package DuckDB.OrmLite@1.5.1
#addin nuget:?package=DuckDB.OrmLite&version=1.5.1
#tool nuget:?package=DuckDB.OrmLite&version=1.5.1
DuckDB.OrmLite
DuckDB provider for ServiceStack.OrmLite - A fast, simple, and typed ORM for .NET.
Disclaimer
This is an independent, community-maintained provider for ServiceStack.OrmLite. It is not officially maintained or endorsed by ServiceStack.
About
This package enables ServiceStack.OrmLite to work with DuckDB, an in-process analytical database management system. DuckDB excels at:
- Fast analytical queries - Columnar storage and vectorized execution
- In-process - No separate server process required
- SQL standard - Familiar SQL syntax with PostgreSQL compatibility
- Data processing - Native support for Parquet, CSV, JSON
- OLAP workloads - Aggregations, window functions, complex analytics
Features
✅ Full OrmLite Support
- Complete CRUD operations (sync + async)
- LINQ query expressions
- Transactions
- AutoIncrement with sequences and INSERT...RETURNING
- Complex queries (JOINs, aggregations, subqueries)
- Parameterized queries
- Batch operations
- Async/await support (pseudo-async)
- High-performance bulk insert - 10-100x faster using DuckDB's Appender API
- Bulk insert with deduplication - Production-safe staging table pattern for massive tables
- Type-safe LINQ expressions - IntelliSense support for unique column selection
- Multi-database support - Query across multiple DuckDB files transparently
✅ Complete Type Support
- All .NET primitive types
- DateTime, DateTimeOffset, TimeSpan
- Decimal and all integer types
- Guid (UUID)
- byte[] (BLOB)
- Nullable types
✅ Production Ready
- 100 comprehensive tests (100% passing)
- Optimized for DuckDB 1.3.2
- SQL injection prevention
- Robust error handling
- Connection timeout/retry for multi-process scenarios
Installation
dotnet add package DuckDB.OrmLite
Quick Start
using ServiceStack.OrmLite;
using DuckDB.OrmLite;
// Create connection factory
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db");
// Define your models
public class Customer
{
[AutoIncrement]
public int Id { get; set; }
[Required]
public string Name { get; set; }
public string Email { get; set; }
public decimal CreditLimit { get; set; }
public DateTime RegisteredAt { get; set; }
}
// Use OrmLite
using var db = dbFactory.Open();
// Create table
db.CreateTable<Customer>();
// Insert
var customer = new Customer
{
Name = "Acme Corp",
Email = "contact@acme.com",
CreditLimit = 50000,
RegisteredAt = DateTime.UtcNow
};
db.Insert(customer);
// Query with LINQ
var highValueCustomers = db.Select<Customer>(c =>
c.CreditLimit > 10000 && c.RegisteredAt > DateTime.UtcNow.AddMonths(-6)
);
// Update
customer.CreditLimit = 75000;
db.Update(customer);
// Aggregations
var totalCredit = db.Scalar<decimal>(
db.From<Customer>().Select(c => Sql.Sum(c.CreditLimit))
);
// JOINs
var orders = db.Select(db.From<Order>()
.Join<Customer>((o, c) => o.CustomerId == c.Id)
.Where<Customer>(c => c.Name == "Acme Corp")
);
// Transactions
using (var trans = db.OpenTransaction())
{
db.Insert(customer);
db.Insert(order);
trans.Commit();
}
Async/Await Support
DuckDB.OrmLite supports async/await for all operations through ServiceStack.OrmLite's built-in async APIs.
Important: Since DuckDB.NET.Data does not provide native async I/O operations, these are pseudo-async implementations (similar to SQLite). Operations will block the calling thread but provide API compatibility with other OrmLite providers.
Async Examples
using var db = dbFactory.Open();
// SELECT operations
var customers = await db.SelectAsync<Customer>(c => c.CreditLimit > 10000);
var customer = await db.SingleAsync<Customer>(c => c.Id == 1);
var count = await db.CountAsync<Customer>();
// INSERT operations
await db.InsertAsync(new Customer { Name = "New Corp", ... });
await db.InsertAllAsync(customers);
// UPDATE operations
customer.CreditLimit = 75000;
await db.UpdateAsync(customer);
await db.UpdateAllAsync(customers);
// DELETE operations
await db.DeleteAsync<Customer>(c => c.CreditLimit < 1000);
await db.DeleteByIdAsync<Customer>(customerId);
// SQL operations
var results = await db.SqlListAsync<Customer>("SELECT * FROM Customer WHERE CreditLimit > 50000");
var total = await db.SqlScalarAsync<decimal>("SELECT SUM(CreditLimit) FROM Customer");
// Transactions work with async too
using (var trans = db.OpenTransaction())
{
await db.InsertAsync(customer);
await db.InsertAsync(order);
trans.Commit();
}
Async Limitations
⚠️ Not true async I/O: Operations still block the calling thread internally
- Suitable for maintaining consistent async/await code style
- API-compatible with other OrmLite providers
- Not suitable for high-concurrency scenarios expecting true async I/O benefits
- Consider using synchronous methods if async benefits are not needed
Multi-Database Support
Query across multiple DuckDB database files transparently - perfect for time-series data, archival scenarios, or partitioned datasets.
Basic Configuration
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
.WithAdditionalDatabases("archive_2024.db", "archive_2023.db")
.WithMultiDatabaseTables("CmcPrice", "FiatPrice");
// Queries automatically span all databases
using var db = factory.Open();
var allPrices = db.Select<CmcPrice>(x => x.Symbol == "BTC");
// Returns data from main.db, archive_2024.db, and archive_2023.db
// Writes go to main database only
using var writeDb = factory.OpenForWrite();
writeDb.Insert(new CmcPrice { Date = DateTime.Today, Symbol = "ETH", Price = 3500 });
Type-Safe Configuration
// Generic factory - automatically configures multi-database support for CmcPrice
var factory = new DuckDbOrmLiteConnectionFactory<CmcPrice>("Data Source=main.db")
.WithAdditionalDatabases("archive.db");
// Or explicit configuration
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=main.db")
.WithAdditionalDatabases("archive.db")
.WithMultiDatabaseTable<CmcPrice>();
Use Case: Time-Series Data with Daily Updates
// Setup: Current year + archives
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=prices_2025.db")
.WithAdditionalDatabases("prices_2024.db", "prices_2023.db", "prices_2022.db")
.WithMultiDatabaseTables("CmcPrice");
// Read: Query spans all years transparently
using (var db = factory.Open())
{
// Get Bitcoin prices for last 2 years
var btcPrices = db.Select<CmcPrice>(x =>
x.Symbol == "BTC" &&
x.Date >= DateTime.Today.AddYears(-2));
// Aggregations work across all databases
var avgPrice = db.Scalar<decimal>(
db.From<CmcPrice>()
.Where(x => x.Symbol == "BTC")
.Select(x => Sql.Avg(x.Price))
);
}
// Write: New data goes to current year database
using (var writeDb = factory.OpenForWrite())
{
writeDb.Insert(new CmcPrice
{
Date = DateTime.Today,
Symbol = "BTC",
Price = 95000
});
}
How It Works
- Automatic ATTACH: Additional databases are attached on connection open
- Unified Views: Creates
{TableName}_Unified
views withUNION ALL
across all databases - Query Routing: Read queries automatically use unified views; writes go directly to main database
- Zero Code Changes: Application code using
db.Select<T>()
works unchanged
Multi-Database Features
✅ All OrmLite operations work across databases:
- SELECT with WHERE, ORDER BY, LIMIT
- Aggregations (COUNT, SUM, AVG, MAX, MIN)
- JOINs (multi-db table with single-db table)
- Complex LINQ expressions
- Async operations
✅ Smart table detection:
- Only creates views for tables that exist in databases
- Handles tables existing in subset of databases
✅ Flexible configuration:
- Mix multi-db and single-db tables in same factory
- Toggle auto-configuration with
.WithAutoConfigureViews(false)
Multi-Database Limitations
⚠️ Important considerations:
- Schema consistency: Tables must have identical schemas across all databases
- No cross-database transactions: Transactions only work with
OpenForWrite()
(single database) - Read-only archives: Additional databases should be read-only for data consistency
- No automatic deduplication:
UNION ALL
doesn't deduplicate - ensure partitioning prevents duplicates
Connection Timeout and Retry
DuckDB uses an exclusive file lock when opening a database. If another process has the database open, subsequent attempts will fail immediately. The timeout feature provides automatic retry with exponential backoff:
// Default behavior - fail immediately if database is locked
using var db = factory.Open();
// Retry for up to 30 seconds with exponential backoff
using var db = factory.Open(TimeSpan.FromSeconds(30));
// Also available for write connections
using var writeDb = factory.OpenForWrite(TimeSpan.FromSeconds(30));
// Customize retry delays (useful for high-contention scenarios)
var factory = new DuckDbOrmLiteConnectionFactory("Data Source=myapp.db")
{
RetryDelayMs = 100, // Start with 100ms delays (default: 50ms)
MaxRetryDelayMs = 5000 // Cap at 5 seconds (default: 1000ms)
};
How it works:
- Detects lock errors: "Could not set lock", "database is locked", "IO Error"
- Exponential backoff: starts at
RetryDelayMs
, doubles each retry, caps atMaxRetryDelayMs
- Random jitter (up to 25% of delay) to avoid thundering herd
- Throws
TimeoutException
if lock not acquired within timeout
Use cases:
- Multi-process scenarios where processes may temporarily lock the database
- Automated scripts that need to wait for other processes to finish
- Services that need resilient database access
Concurrency notes:
- Within same process: Multiple threads can read/write concurrently (MVCC)
- Across processes: Only one process can open at a time (exclusive lock)
- Use timeout feature for cross-process coordination
Best Practices
Recommended partitioning strategies:
- Time-based: One database per year/month
- Categorical: Separate databases by data type or category
- Archival: Current database + historical archives
Workflow pattern:
// Daily process: Keep current database for writes
using (var writeDb = factory.OpenForWrite())
{
writeDb.InsertAll(todaysData);
}
// Analytics: Query across all time periods
using (var readDb = factory.Open())
{
var historicalTrends = readDb.Select<CmcPrice>(x =>
x.Date >= new DateTime(2020, 1, 1));
}
// Year-end: Rotate current to archive
// 1. Copy prices_2025.db to archive location
// 2. Update factory configuration to include new archive
// 3. Create fresh prices_2026.db for new year
Use Cases
Data Analysis & Reporting
// DuckDB excels at analytical queries
var salesByMonth = db.SqlList<dynamic>(@"
SELECT
DATE_TRUNC('month', OrderDate) as Month,
COUNT(*) as OrderCount,
SUM(TotalAmount) as Revenue
FROM Orders
GROUP BY Month
ORDER BY Month DESC
");
ETL & Data Processing
// High-performance bulk insert (10-100x faster than InsertAll)
db.BulkInsert(largeDataset); // Uses DuckDB's native Appender API
// For smaller datasets or when transaction control is needed
db.InsertAll(smallDataset);
// Process with SQL
db.ExecuteSql(@"
INSERT INTO ProcessedOrders
SELECT * FROM Orders
WHERE Status = 'completed'
AND OrderDate > CURRENT_DATE - INTERVAL '30 days'
");
In-Memory Analytics
// Use in-memory database for fast processing
var dbFactory = new DuckDbOrmLiteConnectionFactory("Data Source=:memory:");
High-Performance Bulk Insert
For maximum performance when inserting large datasets, use BulkInsert()
which leverages DuckDB's native Appender API:
using var db = dbFactory.Open();
var products = new List<Product>();
for (int i = 0; i < 100000; i++)
{
products.Add(new Product {
Id = i,
Name = $"Product {i}",
Price = i * 1.5m
});
}
// BulkInsert is 10-100x faster than InsertAll for large datasets
db.BulkInsert(products); // Uses DuckDB Appender API
Performance Comparison
Method | 1,000 rows | 10,000 rows | 100,000 rows |
---|---|---|---|
InsertAll() |
~100ms | ~1s | ~10s |
BulkInsert() |
~10ms | ~50ms | ~500ms |
Speed improvement | 10x | 20x | 20-100x |
BulkInsert Features
✅ Blazing fast - Uses DuckDB's native Appender API for direct memory-to-database transfer
✅ All data types supported - DateTime, Guid, decimal, byte[], TimeSpan, etc.
✅ Async support - BulkInsertAsync()
available (pseudo-async wrapper)
✅ Simple API - Drop-in replacement for InsertAll()
BulkInsert Limitations
⚠️ Important considerations:
- No transaction participation - Appender auto-commits on completion
- No return values - Generated IDs are not returned (unlike
Insert()
) - All-or-nothing - If any row fails, entire batch fails
- For transactions, use
InsertAll()
- Standard inserts support explicit transactions
BulkInsert vs InsertAll
Use BulkInsert()
when:
- Inserting 100+ rows
- Performance is critical
- You don't need generated ID values returned
- Auto-commit behavior is acceptable
Use InsertAll()
when:
- Inserting < 100 rows
- You need transaction control (explicit BEGIN/COMMIT)
- You need to mix inserts with other operations in a transaction
- You need generated ID values returned
Async Bulk Insert
using var db = dbFactory.Open();
var products = GetLargeDataset();
await db.BulkInsertAsync(products);
// Note: This is pseudo-async (wraps sync operation)
// since DuckDB.NET doesn't provide native async Appender
Bulk Insert with Deduplication (Staging Table Pattern)
For large tables where indexes cannot fit in memory (a known DuckDB limitation), use BulkInsertWithDeduplication()
. This is the RECOMMENDED approach for production use with tables containing hundreds of millions of rows.
Why Staging Tables?
DuckDB's ART indexes must fit in memory. With 845+ million rows, maintaining PRIMARY KEY or UNIQUE constraints becomes impractical. The staging table pattern provides:
✅ Zero risk to main table - Data validated in staging before touching main table ✅ Atomic duplicate detection - SQL JOIN ensures no duplicates ✅ Fast rollback - Just drop staging table on error ✅ Minimal lock time - Main table locked only during final INSERT SELECT ✅ High performance - Uses Appender API for staging table loading
Basic Usage
using var db = dbFactory.Open();
// Your 845M row table with composite unique key
public class CryptoPrice
{
public DateTime Timestamp { get; set; }
public string Symbol { get; set; }
public long ExchangeId { get; set; }
public decimal Price { get; set; }
}
// Load 70,000 new records (internally unique, but may overlap with existing)
var newPrices = LoadNewPrices(); // 70K records
// Option 1: LINQ expression (type-safe, recommended)
var insertedCount = db.BulkInsertWithDeduplication(
newPrices,
x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);
// Option 2: String column names (flexible)
var insertedCount = db.BulkInsertWithDeduplication(
newPrices,
"Timestamp", "Symbol", "ExchangeId"
);
Console.WriteLine($"Inserted {insertedCount} new records (duplicates filtered)");
How It Works
// Step 1: Create temporary staging table (same schema as main)
CREATE TABLE CryptoPrice_Staging_<guid> AS SELECT * FROM CryptoPrice LIMIT 0
// Step 2: BulkInsert into staging (fast, isolated from main table)
// Uses Appender API - 10-100x faster than InsertAll
// Step 3: Atomic INSERT SELECT with deduplication (handles both internal and external duplicates)
INSERT INTO CryptoPrice
WITH DeduplicatedStaging AS (
SELECT *, ROW_NUMBER() OVER (PARTITION BY Timestamp, Symbol, ExchangeId ORDER BY (SELECT NULL)) as rn
FROM CryptoPrice_Staging_<guid>
)
SELECT columns
FROM DeduplicatedStaging s
LEFT JOIN CryptoPrice m ON
s.Timestamp = m.Timestamp AND
s.Symbol = m.Symbol AND
s.ExchangeId = m.ExchangeId
WHERE m.Timestamp IS NULL -- Not in main table
AND s.rn = 1 -- First occurrence from incoming data
// Step 4: DROP staging table (always executed, even on error)
Duplicate Handling:
- Internal duplicates (within incoming data): Keeps first occurrence only
- External duplicates (with main table): Filters out all matches
- Deterministic behavior ensures consistent results
Specifying Unique Columns
Option 1: LINQ Expression (Type-Safe, Recommended)
// Single column
var insertedCount = db.BulkInsertWithDeduplication(
users,
x => x.Email
);
// Multiple columns
var insertedCount = db.BulkInsertWithDeduplication(
prices,
x => new { x.Timestamp, x.Symbol, x.ExchangeId }
);
// Async
var insertedCount = await db.BulkInsertWithDeduplicationAsync(
prices,
x => new { x.Timestamp, x.Symbol }
);
Option 2: String Column Names (Flexible)
var insertedCount = db.BulkInsertWithDeduplication(
prices,
"Timestamp", "Symbol", "ExchangeId"
);
Option 3: Auto-Detect from Attributes
// CompositeIndex attribute
[CompositeIndex(nameof(Timestamp), nameof(Symbol), nameof(ExchangeId), Unique = true)]
public class CryptoPrice { ... }
// Auto-detects unique columns from attribute
var insertedCount = db.BulkInsertWithDeduplication(newPrices);
// Single [Unique] attribute
public class User
{
[Unique]
public string Email { get; set; }
...
}
var insertedCount = db.BulkInsertWithDeduplication(newUsers);
Performance Characteristics
Operation | Time (70K records) | Notes |
---|---|---|
Append to staging | ~5-10ms | Appender API - blazing fast |
INSERT SELECT with JOIN | ~50-200ms | Depends on main table size |
Drop staging | ~1ms | Cleanup |
Total overhead | ~60-210ms | Minimal cost for safety |
Compare to risk of direct append:
- Direct append failure: Hours to recover 845M row table
- Duplicate corruption: Potentially unfixable without backups
845M Row Production Scenario
// Daily ETL process for massive time-series table
public async Task LoadDailyPrices()
{
using var db = dbFactory.Open();
// Main table: 845,000,000 rows
// New batch: 70,000 rows (may contain internal duplicates)
// Expected: ~500 duplicates with main table, ~69,500 new records
var newBatch = await FetchDailyPricesAsync(); // 70K records
var sw = Stopwatch.StartNew();
var insertedCount = db.BulkInsertWithDeduplication(
newBatch,
"Timestamp", "Symbol", "ExchangeId"
);
sw.Stop();
logger.LogInformation(
"Inserted {Inserted} of {Total} records in {Ms}ms (duplicates filtered: {Duplicates})",
insertedCount,
newBatch.Count,
sw.ElapsedMilliseconds,
newBatch.Count - insertedCount
);
// Typical output:
// Inserted 69,500 of 70,000 records in 180ms (duplicates filtered: 500)
}
Async Support
var insertedCount = await db.BulkInsertWithDeduplicationAsync(
newRecords,
"Timestamp", "Symbol", "ExchangeId"
);
When to Use Each Method
Use BulkInsertWithDeduplication()
when:
- ✅ Table has 100M+ rows (indexes can't fit in memory)
- ✅ You need duplicate prevention without UNIQUE constraints
- ✅ Main table safety is critical (zero corruption risk)
- ✅ You're doing ETL/data loading with potential duplicates
- ✅ You want automatic cleanup and transactional safety
Use BulkInsert()
when:
- ✅ Data is guaranteed unique (no duplicates possible)
- ✅ Table is small enough for UNIQUE constraints
- ✅ Maximum speed is priority (no duplicate checking needed)
- ✅ You're loading into a temporary/staging table
Use InsertAll()
when:
- ✅ Small datasets (< 100 rows)
- ✅ Need explicit transaction control
- ✅ Need generated ID values returned
Configuration
Connection Strings
File-based database:
"Data Source=myapp.db"
In-memory database:
"Data Source=:memory:"
Read-only mode:
"Data Source=myapp.db;Read Only=true"
Required Setup
DuckDB requires parameter handling that differs slightly from other databases. The provider includes a BeforeExecFilter
that handles this automatically:
// Automatically configured by DuckDbOrmLiteConnectionFactory
// Handles:
// - Parameter name conversion ($ prefix handling)
// - 1-based positional parameter indexing
// - DbType.Currency → DbType.Decimal conversion
Dependencies
This package depends on:
- ServiceStack.OrmLite (>= 8.5.2) - The ORM framework
- DuckDB.NET.Data.Full (>= 1.3.0) - .NET bindings for DuckDB
Both dependencies are automatically installed when you add this package.
Compatibility
- .NET: 8.0+
- DuckDB: 1.3.2+
- ServiceStack.OrmLite: 8.5.2+
Performance
DuckDB is optimized for analytical workloads:
- Fast aggregations - Columnar storage enables efficient aggregations
- Vectorized execution - SIMD optimizations for bulk operations
- Memory efficient - Optimized for large datasets
- Zero-copy reads - Direct memory access where possible
Limitations
- TimeSpan: Limited to ~24 hours when using HH:MM:SS format
- Concurrent writes: DuckDB uses single-writer model
Documentation
- DuckDB Official Documentation
- ServiceStack.OrmLite Documentation
- Development Documentation - Implementation details and history
Testing
# Run all tests
dotnet test
# Run specific test category
dotnet test --filter "FullyQualifiedName~AdvancedFeatureTests"
Test coverage:
- 25 core OrmLite functionality tests
- 15 advanced feature tests (JOINs, aggregations, edge cases)
- 17 async/await tests
- 18 multi-database tests
- 11 concurrency tests (thread safety, optimistic concurrency)
- 4 generic factory tests
- 90 total tests (100% passing)
- Production-ready error handling and SQL injection prevention
Contributing
Contributions are welcome! Please feel free to submit issues or pull requests.
License
This project is licensed under the MIT License - see the LICENSE.md file for details.
Acknowledgments
- ServiceStack.OrmLite - The excellent ORM framework
- DuckDB - The fast in-process analytical database
- DuckDB.NET - .NET bindings for DuckDB
Support
- Issues: GitHub Issues
- ServiceStack OrmLite: ServiceStack Support
- DuckDB: DuckDB Discord
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
-
net8.0
- DuckDB.NET.Data.Full (>= 1.3.0)
- ServiceStack.OrmLite (>= 8.5.2)
NuGet packages
This package is not used by any NuGet packages.
GitHub repositories
This package is not used by any popular GitHub repositories.
v1.5.1: CRITICAL FIX for BulkInsertWithDeduplication - Now handles internal duplicates within incoming dataset. Uses ROW_NUMBER window function for efficient in-database deduplication. 122 tests (100% passing). Production-ready for datasets with duplicate rows.