NPipeline.Connectors.PostgreSQL 0.8.0

NPipeline.Connectors.PostgreSQL

A PostgreSQL connector for NPipeline data pipelines. Provides source and sink nodes for reading from and writing to PostgreSQL databases with support for convention-based mapping, custom mappers, connection pooling, and streaming.

Installation

dotnet add package NPipeline.Connectors.PostgreSQL

Quick Start

Reading from PostgreSQL

using NPipeline.Connectors.PostgreSQL;
using NPipeline.Pipeline;

// Define your model
public record Customer(int Id, string Name, string Email);

// Create a source node
var connectionString = "Host=localhost;Database=mydb;Username=postgres;Password=password";
var source = new PostgresSourceNode<Customer>(
    connectionString,
    "SELECT id, name, email FROM customers"
);

// Use in a pipeline
var pipeline = new PipelineBuilder()
    .AddSource(source, "customer_source")
    .AddSink<ConsoleSinkNode<Customer>, Customer>("console_sink")
    .Build();

await pipeline.RunAsync();

Writing to PostgreSQL

using NPipeline.Connectors.PostgreSQL;
using NPipeline.Pipeline;

// Define your model
public record Customer(int Id, string Name, string Email);

// Create a sink node
var connectionString = "Host=localhost;Database=mydb;Username=postgres;Password=password";
var sink = new PostgresSinkNode<Customer>(
    connectionString,
    "customers",
    writeStrategy: PostgresWriteStrategy.Batch
);

// Use in a pipeline
var pipeline = new PipelineBuilder()
    .AddSource<InMemorySourceNode<Customer>, Customer>("source")
    .AddSink(sink, "customer_sink")
    .Build();

await pipeline.RunAsync();

Key Features

  • Streaming reads - Process large result sets with minimal memory usage
  • Batch writes - High-performance bulk inserts with configurable batch sizes
  • Connection pooling - Efficient connection management via dependency injection
  • Convention-based mapping - Automatic PascalCase to snake_case conversion
  • Custom mappers - Full control over row-to-object mapping
  • Retry logic - Automatic retry for transient errors
  • Checkpointing - In-memory recovery from transient failures
  • SSL/TLS support - Secure database connections
  • SQL injection prevention - Identifier validation enabled by default

Configuration

PostgresConfiguration

Configure connector behavior with PostgresConfiguration:

var configuration = new PostgresConfiguration
{
    ConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=password",
    StreamResults = true,
    FetchSize = 1_000,
    BatchSize = 1_000,
    MaxBatchSize = 5_000,
    UseTransaction = true,
    MaxRetryAttempts = 3,
    RetryDelay = TimeSpan.FromSeconds(2),
    ValidateIdentifiers = true,
    CheckpointStrategy = CheckpointStrategy.InMemory,
    CommandTimeout = 30
};

Connection String

The connection string supports all Npgsql options:

Host=localhost;Port=5432;Database=mydb;Username=postgres;Password=password;Timeout=15;Pooling=true;SslMode=Require

Mapping

Convention-Based Mapping

Properties are automatically mapped to columns using snake_case conversion:

public record Customer(
    int CustomerId,        // Maps to customer_id
    string FirstName,      // Maps to first_name
    string EmailAddress    // Maps to email_address
);
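The convention can be illustrated with a short sketch (Python for brevity; this is not the connector's implementation, and its handling of edge cases such as acronyms may differ):

```python
import re

def to_snake_case(name: str) -> str:
    """Convert a PascalCase property name to a snake_case column name."""
    # Insert an underscore before each capital that follows a lowercase
    # letter or digit, then lowercase the whole string.
    return re.sub(r"(?<=[a-z0-9])([A-Z])", r"_\1", name).lower()

print(to_snake_case("CustomerId"))    # customer_id
print(to_snake_case("EmailAddress"))  # email_address
```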

Attribute-Based Mapping

Override default mapping with attributes:

using NPipeline.Connectors.PostgreSQL.Mapping;

public record Customer(
    [PostgresColumn("cust_id", PrimaryKey = true)] int Id,
    [PostgresColumn("full_name")] string Name,
    [PostgresIgnore] string TemporaryField
);

Custom Mappers

For complete control, provide a custom mapper function:

var source = new PostgresSourceNode<Customer>(
    connectionString,
    "SELECT id, name, email FROM customers",
    rowMapper: row => new Customer(
        row.GetInt32(row.GetOrdinal("id")),
        row.GetString(row.GetOrdinal("name")),
        row.GetString(row.GetOrdinal("email"))
    )
);

Write Strategies

Per-Row Strategy

Writes each row individually. Best for:

  • Small batches
  • Real-time processing
  • Per-row error handling

var sink = new PostgresSinkNode<Customer>(
    connectionString,
    "customers",
    writeStrategy: PostgresWriteStrategy.PerRow
);

Batch Strategy

Buffers rows and issues a single multi-value INSERT. Best for:

  • Large datasets
  • Bulk imports
  • High-throughput scenarios

var configuration = new PostgresConfiguration
{
    BatchSize = 1_000,
    MaxBatchSize = 5_000,
    UseTransaction = true
};

var sink = new PostgresSinkNode<Customer>(
    connectionString,
    "customers",
    writeStrategy: PostgresWriteStrategy.Batch,
    configuration: configuration
);
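The buffering behavior behind the batch strategy can be sketched as follows (an illustrative Python sketch, not the connector's code): rows accumulate until BatchSize is reached, each full batch is flushed as one multi-value INSERT, and any remaining partial batch is flushed at the end.

```python
def chunk(rows, batch_size):
    """Yield successive batches of at most batch_size rows."""
    batch = []
    for row in rows:
        batch.append(row)
        if len(batch) == batch_size:
            yield batch  # flush as one multi-value INSERT
            batch = []
    if batch:
        yield batch  # flush the final partial batch

# Each batch becomes one statement:
# INSERT INTO customers (id, name, email) VALUES (...), (...), ...
batches = list(chunk(range(2_500), 1_000))
print([len(b) for b in batches])  # [1000, 1000, 500]
```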

Dependency Injection

Register the connector with dependency injection for production applications:

using Microsoft.Extensions.DependencyInjection;
using NPipeline.Connectors.PostgreSQL.DependencyInjection;

var services = new ServiceCollection()
    .AddPostgresConnector(options =>
    {
        options.DefaultConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=password";
        options.AddOrUpdateConnection("analytics", "Host=localhost;Database=analytics;Username=postgres;Password=postgres");
        options.DefaultConfiguration = new PostgresConfiguration
        {
            StreamResults = true,
            FetchSize = 1_000,
            BatchSize = 1_000
        };
    })
    .BuildServiceProvider();

var pool = services.GetRequiredService<IPostgresConnectionPool>();
var sourceFactory = services.GetRequiredService<PostgresSourceNodeFactory>();
var sinkFactory = services.GetRequiredService<PostgresSinkNodeFactory>();

Using Named Connections

var source = new PostgresSourceNode<Customer>(
    pool,
    "SELECT * FROM customers",
    connectionName: "analytics"
);

Streaming

Enable streaming for large result sets to reduce memory usage:

var configuration = new PostgresConfiguration
{
    StreamResults = true,
    FetchSize = 1_000
};

var source = new PostgresSourceNode<Customer>(
    connectionString,
    "SELECT * FROM large_table",
    configuration: configuration
);

Why streaming matters: Without streaming, the entire result set is loaded into memory. Streaming fetches rows in batches, allowing you to process millions of rows without memory issues.

Checkpointing

Enable in-memory checkpointing to recover from transient failures:

var configuration = new PostgresConfiguration
{
    CheckpointStrategy = CheckpointStrategy.InMemory,
    StreamResults = true
};

The connector tracks the last successfully processed row ID. If a transient failure occurs, processing resumes from the last checkpoint rather than restarting from the beginning.

Analyzers

The PostgreSQL connector includes a companion analyzer package that provides compile-time diagnostics to help prevent common mistakes when using checkpointing.

Installation

dotnet add package NPipeline.Connectors.PostgreSQL.Analyzers

NP9501: Checkpointing requires ORDER BY clause

Category: Reliability
Default Severity: Warning

When using checkpointing with PostgreSQL source nodes, the SQL query must include an ORDER BY clause on a unique, monotonically increasing column. This ensures consistent row ordering across checkpoint restarts. Without proper ordering, checkpointing may skip rows or process duplicates.

Example

// ❌ Warning: Missing ORDER BY clause
var source = new PostgresSourceNode<MyRecord>(
    connectionString,
    "SELECT id, name, created_at FROM my_table",
    configuration: new PostgresConfiguration
    {
        CheckpointStrategy = CheckpointStrategy.Offset
    }
);

// ✅ Correct: Includes ORDER BY clause
var source = new PostgresSourceNode<MyRecord>(
    connectionString,
    "SELECT id, name, created_at FROM my_table ORDER BY id",
    configuration: new PostgresConfiguration
    {
        CheckpointStrategy = CheckpointStrategy.Offset
    }
);

Why This Matters

Checkpointing tracks the position of processed rows to enable recovery from failures. Without a consistent ORDER BY clause:

  • Data Loss: Rows may be skipped during recovery
  • Data Duplication: Rows may be processed multiple times
  • Inconsistent State: Checkpoint positions become unreliable

Use a unique, monotonically increasing column such as:

  • id (primary key)
  • Any other auto-incrementing or sequential column
  • A created_at or updated_at timestamp, provided its values are unique (rows that share a timestamp can still be skipped or duplicated on resume)

For more details, see the PostgreSQL Analyzer documentation.

Error Handling

Retry Configuration

Configure retries for transient failures:

var configuration = new PostgresConfiguration
{
    MaxRetryAttempts = 3,
    RetryDelay = TimeSpan.FromSeconds(2)
};

Custom Exception Handling

using Npgsql; // NpgsqlException lives in the Npgsql namespace

try
{
    await pipeline.RunAsync();
}
catch (NpgsqlException ex) when (ex.IsTransient)
{
    // Wait briefly, then retry the pipeline once
    await Task.Delay(TimeSpan.FromSeconds(5));
    await pipeline.RunAsync();
}

SSL/TLS Configuration

Configure SSL/TLS for secure connections:

var configuration = new PostgresConfiguration
{
    // SSL mode is configured via connection string
    ConnectionString = "Host=localhost;Database=mydb;Username=postgres;Password=password;SslMode=Require"
};

Available SSL modes: Disable, Allow, Prefer, Require, VerifyCa, VerifyFull

Prepared Statements

The connector uses prepared statements by default (UsePreparedStatements = true). Prepared statements:

  • Reduce query parsing overhead on the database server
  • Improve performance for repeated query patterns (same query, different parameters)
  • Provide automatic SQL injection protection

When to Disable Prepared Statements

Consider disabling UsePreparedStatements only for:

  • Ad-hoc queries that are dynamically generated and never repeated
  • Very complex queries that may not benefit from preparation
  • Testing scenarios where you need to debug query generation

Performance Impact

Scenario                                Prepared Statements   Performance Impact
Repeated inserts (same query pattern)   Enabled               10-30% faster
Ad-hoc queries (different each time)    Enabled               5-10% overhead
One-time bulk operations                Disabled              No impact

Performance Tips

  1. Use batch writes - 10-100x faster than per-row for bulk operations
  2. Enable streaming - Essential for large result sets
  3. Tune batch size - 1,000-5,000 rows provides a good balance between throughput and latency
  4. Adjust fetch size - 1,000-5,000 rows works well for most workloads
  5. Use connection pooling - Leverage dependency injection for efficient connection management

Security

  • Identifier validation - Enabled by default to prevent SQL injection
  • Parameterized queries - All queries use parameterized statements
  • SSL/TLS support - Encrypt connections to database

Documentation

For comprehensive documentation including advanced scenarios, configuration reference, and best practices, see the PostgreSQL Connector documentation.

License

MIT License - see LICENSE file for details.

Target Frameworks

Compatible: net8.0, net9.0, net10.0. Platform-specific variants (android, browser, ios, maccatalyst, macos, tvos, windows) are computed for each compatible framework.
