NdjsonDelta 2.0.0

NdjsonDelta

A .NET library to compute the delta (added, removed, changed objects) between two NDJSON (Newline Delimited JSON) files or strings. Designed for easy integration and NuGet distribution.

Features

  • Compare two NDJSON sources and get added, removed, and changed objects
  • Simple API for file or string input
  • Robust JSON parsing with error handling
  • Well-documented public methods
  • Lightweight and focused on core delta functionality

Usage

  1. Add the NuGet package to your project (see Installation).
  2. Use the NdjsonDeltaCalculator class to compute deltas between NDJSON sources.

Sample Program

Basic Example: Simple Content-Based Comparison

using System;
using NdjsonDelta;

class Program
{
    static void Main()
    {
        var calculator = new NdjsonDeltaCalculator();
        
        // Simple approach: Use content hash as unique key (works with any JSON structure)
        NdjsonDeltaResult result = calculator.ComputeDeltaFromFiles(
            "data_v1.ndjson", "data_v2.ndjson", 
            obj => obj.GetRawText().GetHashCode().ToString());

        Console.WriteLine("=== Data Changes ===");
        Console.WriteLine($"Added: {result.Added.Count}");
        Console.WriteLine($"Removed: {result.Removed.Count}");
        Console.WriteLine($"Changed: {result.Changed.Count}");
    }
}

Advanced Example: Azure Digital Twin Topology Versions

Suppose you have two NDJSON files containing Azure Digital Twin models and instances:

topology_v1.ndjson (Previous version)

{"Section": "Header"}
{"fileVersion":"1.0.0","author":"XYZ","organization":"XYZ"}
{"Section": "Models"}
{"@context":"dtmi:dtdl:context;2","@id":"dtmi:com:xyzindustries:edge:asset;1","@type":"Interface","displayName":"Asset"}
{"Section": "Twins"}
{"$dtId":"DfgA10","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}
{"$dtId":"GhjA14","$metadata":{"$model":"dtmi:com:xyzindustries:edge:ghj;1"}}
{"$dtId":"Plant301","$metadata":{"$model":"dtmi:com:xyzindustries:edge:plant;1"}}

topology_v2.ndjson (Current version)

{"Section": "Header"}
{"fileVersion":"1.0.1","author":"XYZ","organization":"XYZ"}
{"Section": "Models"}
{"@context":"dtmi:dtdl:context;2","@id":"dtmi:com:xyzindustries:edge:asset;1","@type":"Interface","displayName":"Asset"}
{"Section": "Twins"}
{"$dtId":"DfgA10","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}
{"$dtId":"GhjA15","$metadata":{"$model":"dtmi:com:xyzindustries:edge:ghj;1"}}
{"$dtId":"Plant301","$metadata":{"$model":"dtmi:com:xyzindustries:edge:plant;1"}}
{"$dtId":"DfgA11","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}

Advanced C# code with property-based keys:

using System;
using System.Text.Json;
using NdjsonDelta;

class Program
{
    static void Main()
    {
        var calculator = new NdjsonDeltaCalculator();
        
        // Advanced approach: Use specific properties for more readable keys
        NdjsonDeltaResult result = calculator.ComputeDeltaFromFiles(
            "topology_v1.ndjson", "topology_v2.ndjson", 
            obj => {
                // Use $dtId for twins
                if (obj.TryGetProperty("$dtId", out var dtId))
                    return dtId.GetString();
                // Use @id for models  
                if (obj.TryGetProperty("@id", out var id))
                    return id.GetString();
                // Use "fileVersion" for header info
                if (obj.TryGetProperty("fileVersion", out var version))
                    return "fileVersion";
                // Fallback to content hash for other objects
                return obj.GetRawText().GetHashCode().ToString();
            });

        Console.WriteLine("=== Digital Twin Topology Changes ===");
        Console.WriteLine($"Added Twins/Models: {result.Added.Count}");
        Console.WriteLine($"Removed Twins/Models: {result.Removed.Count}");
        Console.WriteLine($"Changed Items: {result.Changed.Count}");
        
        foreach (var item in result.Added)
        {
            if (item.TryGetProperty("$dtId", out var dtId))
                Console.WriteLine($"Added Twin: {dtId.GetString()}");
        }
    }
}

Expected Output (the order of the added twins may vary):

=== Digital Twin Topology Changes ===
Added Twins/Models: 2
Removed Twins/Models: 1
Changed Items: 1
Added Twin: GhjA15
Added Twin: DfgA11
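
The counts above follow from matching objects by key: GhjA15 and DfgA11 exist only in the new file, GhjA14 exists only in the old file, and the header line keeps the key "fileVersion" while its content differs. As a rough illustration only (this is not NdjsonDelta's actual implementation, and DeltaSketch and its members are hypothetical names), key-based matching can be sketched with two dictionaries:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical illustration of key-based matching; NOT the library's code.
static class DeltaSketch
{
    // Parse each non-empty line as JSON and index its raw text by the selected key.
    static Dictionary<string, string> Index(string ndjson, Func<JsonElement, string> keySelector)
    {
        var map = new Dictionary<string, string>();
        foreach (var line in ndjson.Split('\n', StringSplitOptions.RemoveEmptyEntries))
        {
            var trimmed = line.Trim();
            if (trimmed.Length == 0) continue;
            using var doc = JsonDocument.Parse(trimmed);
            // Later lines with the same key overwrite earlier ones in this sketch.
            map[keySelector(doc.RootElement)] = doc.RootElement.GetRawText();
        }
        return map;
    }

    public static (List<string> Added, List<string> Removed, List<string> Changed) Compute(
        string oldNdjson, string newNdjson, Func<JsonElement, string> keySelector)
    {
        var oldMap = Index(oldNdjson, keySelector);
        var newMap = Index(newNdjson, keySelector);

        // Keys only in the new input are added; keys only in the old input are removed.
        var added = newMap.Keys.Where(k => !oldMap.ContainsKey(k)).ToList();
        var removed = oldMap.Keys.Where(k => !newMap.ContainsKey(k)).ToList();
        // Keys present in both inputs count as changed when the raw text differs.
        var changed = newMap.Keys
            .Where(k => oldMap.TryGetValue(k, out var oldText) && oldText != newMap[k])
            .ToList();
        return (added, removed, changed);
    }
}

class Program
{
    static void Main()
    {
        string v1 = "{\"id\":\"a\",\"v\":1}\n{\"id\":\"b\",\"v\":1}";
        string v2 = "{\"id\":\"a\",\"v\":2}\n{\"id\":\"c\",\"v\":1}";
        var (added, removed, changed) = DeltaSketch.Compute(
            v1, v2, obj => obj.GetProperty("id").GetString() ?? string.Empty);
        // Prints: Added: 1, Removed: 1, Changed: 1
        Console.WriteLine($"Added: {added.Count}, Removed: {removed.Count}, Changed: {changed.Count}");
    }
}
```

The sketch also shows why the key selector matters: an object can only be reported as changed if its key stays the same while its content differs.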

String-Based Comparison

Scenario 1: Compare Two NDJSON Strings (Current vs Base Version)

using System;
using NdjsonDelta;

class Program
{
    static void Main()
    {
        var calculator = new NdjsonDeltaCalculator();
        
        // Current version NDJSON content (as string)
        string currentVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA15"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$dtId"":""DfgA11"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";

        // Base version NDJSON content (as string)
        string baseVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA14"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";

        // Compare both strings to find differences (twins and relationships)
        NdjsonDeltaResult result = calculator.ComputeDeltaFromStrings(
            baseVersionNdjsonString, 
            currentVersionNdjsonString,
            obj => {
                // Handle twins
                if (obj.TryGetProperty("$dtId", out var dtId))
                    return $"twin:{dtId.GetString()}";
                // Handle relationships
                if (obj.TryGetProperty("$sourceId", out var sourceId) && 
                    obj.TryGetProperty("$targetId", out var targetId) &&
                    obj.TryGetProperty("$relationshipName", out var relName))
                    return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
                return obj.GetRawText().GetHashCode().ToString();
            });

        Console.WriteLine("=== Twin and Relationship Changes ===");
        Console.WriteLine($"Added: {result.Added.Count}, Removed: {result.Removed.Count}, Changed: {result.Changed.Count}");
        
        foreach (var item in result.Added)
        {
            if (item.TryGetProperty("$dtId", out var dtId))
                Console.WriteLine($"Added Twin: {dtId.GetString()}");
            else if (item.TryGetProperty("$sourceId", out var sourceId))
                Console.WriteLine($"Added Relationship: {sourceId.GetString()} -> {item.GetProperty("$targetId").GetString()}");
        }
    }
}

Scenario 2: Process Current Version Only (No Base Version)

using System;
using NdjsonDelta;

class Program
{
    static void Main()
    {
        var calculator = new NdjsonDeltaCalculator();
        
        // Current version NDJSON content (as string)
        string currentVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA15"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}
{""$sourceId"":""GhjA15"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";

        // Base version is null/empty - get all twins and relationships from current version
        string baseVersionNdjsonString = null; // or string.Empty
        
        // When base is null/empty, all items from current will be in "Added" collection
        NdjsonDeltaResult result = calculator.ComputeDeltaFromStrings(
            baseVersionNdjsonString ?? string.Empty, 
            currentVersionNdjsonString,
            obj => {
                // Handle twins
                if (obj.TryGetProperty("$dtId", out var dtId))
                    return $"twin:{dtId.GetString()}";
                // Handle relationships
                if (obj.TryGetProperty("$sourceId", out var sourceId) && 
                    obj.TryGetProperty("$targetId", out var targetId) &&
                    obj.TryGetProperty("$relationshipName", out var relName))
                    return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
                return obj.GetRawText().GetHashCode().ToString();
            });

        Console.WriteLine("=== All Twins and Relationships ===");
        Console.WriteLine($"Total Items: {result.Added.Count}");
        
        foreach (var item in result.Added)
        {
            if (item.TryGetProperty("$dtId", out var dtId))
                Console.WriteLine($"Twin: {dtId.GetString()}");
            else if (item.TryGetProperty("$sourceId", out var sourceId))
                Console.WriteLine($"Relationship: {sourceId.GetString()} -[{item.GetProperty("$relationshipName").GetString()}]-> {item.GetProperty("$targetId").GetString()}");
        }
    }
}

Usage with Azure Blob Storage

Your application handles blob access; the NuGet package handles the delta calculation:

using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using NdjsonDelta;

public class TopologyComparisonService
{
    private readonly BlobServiceClient _blobServiceClient;
    private readonly NdjsonDeltaCalculator _deltaCalculator;

    public TopologyComparisonService(string connectionString)
    {
        _blobServiceClient = new BlobServiceClient(connectionString);
        _deltaCalculator = new NdjsonDeltaCalculator();
    }

    public async Task<NdjsonDeltaResult> CompareTopologyFromBlobs(
        string baseContainer, string baseBlob,
        string currentContainer, string currentBlob)
    {
        // Application downloads blob content as strings
        string baseContent = await DownloadBlobAsString(baseContainer, baseBlob);
        string currentContent = await DownloadBlobAsString(currentContainer, currentBlob);
        
        // NuGet package handles the delta calculation
        return _deltaCalculator.ComputeDeltaFromStrings(
            baseContent,
            currentContent,
            obj => {
                // Handle twins
                if (obj.TryGetProperty("$dtId", out var dtId))
                    return $"twin:{dtId.GetString()}";
                // Handle relationships
                if (obj.TryGetProperty("$sourceId", out var sourceId) && 
                    obj.TryGetProperty("$targetId", out var targetId) &&
                    obj.TryGetProperty("$relationshipName", out var relName))
                    return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
                return obj.GetRawText().GetHashCode().ToString();
            });
    }

    private async Task<string> DownloadBlobAsString(string containerName, string blobName)
    {
        var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
        var blobClient = containerClient.GetBlobClient(blobName);
        
        var response = await blobClient.DownloadContentAsync();
        return response.Value.Content.ToString();
    }
}

Usage example:

class Program
{
    static async Task Main()
    {
        var service = new TopologyComparisonService("your_connection_string");
        
        var result = await service.CompareTopologyFromBlobs(
            "topology-archive", "topology_20240301.ndjson",
            "topology-current", "topology_20240401.ndjson");
            
        Console.WriteLine($"Changes detected: {result.Added.Count + result.Removed.Count + result.Changed.Count}");
    }
}

Key Selector Best Practices

The key selector function is crucial for accurate delta calculation. Choose the right strategy based on your data structure and requirements:

1. Content Hash Approach

When to use:

  • You're new to the library and want to get started quickly
  • Your JSON objects don't have obvious unique identifiers
  • You want to detect any content changes (even minor formatting differences)
  • You're comparing heterogeneous data with varying structures

Pros:

  • Simple to implement
  • Works with any JSON structure
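  • Detects all content changes (because the key is derived from the content, a modified object typically surfaces as one Removed plus one Added entry rather than as Changed)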

Cons:

  • Keys are not human-readable
  • Sensitive to formatting changes (whitespace, property order)
  • Harder to debug when issues arise

// Simple approach - works for any JSON data
obj => obj.GetRawText().GetHashCode().ToString()
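
One caveat with the selector above: string.GetHashCode() returns a 32-bit value and, on .NET Core and later, is randomized per process, so the keys differ between runs. If content keys need to be logged or compared across runs, a cryptographic digest is one option. The sketch below is an assumption-laden illustration, not part of NdjsonDelta; ContentKey, Sha256Key, and NormalizedSha256Key are hypothetical names. The normalized variant re-serializes the element to drop insignificant whitespace; property order still affects the key.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

// Hypothetical helper; not part of the NdjsonDelta package.
static class ContentKey
{
    // Hex-encoded SHA-256 digest of a string's UTF-8 bytes.
    static string Digest(string text) =>
        Convert.ToHexString(SHA256.HashData(Encoding.UTF8.GetBytes(text)));

    // Stable across runs, but still sensitive to whitespace and property order.
    public static string Sha256Key(JsonElement obj) => Digest(obj.GetRawText());

    // Re-serializing writes compact JSON, so whitespace differences disappear.
    public static string NormalizedSha256Key(JsonElement obj) =>
        Digest(JsonSerializer.Serialize(obj));
}

class Program
{
    static void Main()
    {
        using var compact = JsonDocument.Parse("{\"a\":1}");
        using var spaced = JsonDocument.Parse("{\"a\": 1}");

        // The raw-text keys differ because the whitespace differs.
        Console.WriteLine(ContentKey.Sha256Key(compact.RootElement)
            == ContentKey.Sha256Key(spaced.RootElement));
        // The normalized keys match.
        Console.WriteLine(ContentKey.NormalizedSha256Key(compact.RootElement)
            == ContentKey.NormalizedSha256Key(spaced.RootElement));
    }
}
```

Either helper can be passed as the key selector, for example obj => ContentKey.Sha256Key(obj).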

2. Property-Based Approach

When to use:

  • Your data has natural unique identifiers (IDs, names, etc.)
  • You want human-readable keys for debugging
  • You need to ignore formatting differences
  • You're working with structured data like database records or API responses

Pros:

  • Human-readable keys for easier debugging
  • More stable across formatting changes
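  • Better performance for large datasets (no need to hash each object's full raw text)
  • More precise change detection (a modified object keeps its key and is reported as Changed)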

Cons:

  • Requires knowledge of your data structure
  • More complex implementation
  • Need fallback strategy for objects without identifiers

// Property-based approach for Digital Twins
obj => {
    if (obj.TryGetProperty("$dtId", out var dtId))
        return $"twin:{dtId.GetString()}";
    if (obj.TryGetProperty("@id", out var id))
        return $"model:{id.GetString()}";
    // Fallback to hash for unknown structures
    return obj.GetRawText().GetHashCode().ToString();
}

3. Hybrid Approach (Best of both worlds)

When to use:

  • You have mixed data types in your NDJSON
  • Some objects have identifiers, others don't
  • You want maximum flexibility

// Hybrid approach with multiple fallbacks
obj => {
    // Try common ID fields first
    if (obj.TryGetProperty("id", out var id))
        return $"id:{id.GetString()}";
    if (obj.TryGetProperty("$dtId", out var dtId))
        return $"twin:{dtId.GetString()}";
    if (obj.TryGetProperty("name", out var name))
        return $"name:{name.GetString()}";
    
    // For section headers or metadata
    if (obj.TryGetProperty("Section", out var section))
        return $"section:{section.GetString()}";
    
    // Final fallback to content hash
    return $"hash:{obj.GetRawText().GetHashCode()}";
}

4. Composite Key Strategy

When to use:

  • Single properties aren't unique enough
  • You need to combine multiple fields for uniqueness
  • Working with relational data (like relationships between entities)

// Composite keys for relationships
obj => {
    // For relationships, combine source, target, and relationship type
    if (obj.TryGetProperty("$sourceId", out var sourceId) && 
        obj.TryGetProperty("$targetId", out var targetId) &&
        obj.TryGetProperty("$relationshipName", out var relName))
        return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
    
    // For twins, use the twin ID
    if (obj.TryGetProperty("$dtId", out var dtId))
        return $"twin:{dtId.GetString()}";
    
    return obj.GetRawText().GetHashCode().ToString();
}
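
One thing to watch with composite keys is the separator. If identifiers can themselves contain the separator character (a hyphen in the selector above), two different relationships can produce the same key. The sketch below illustrates the problem and one possible workaround, serializing the parts as a JSON array so each part is quoted and delimited. CompositeKey, Dashed, and Unambiguous are hypothetical names used only for this illustration.

```csharp
using System;
using System.Text.Json;

// Hypothetical helper; not part of the NdjsonDelta package.
static class CompositeKey
{
    // Joining with '-' is ambiguous when a part itself contains '-'.
    public static string Dashed(params string[] parts) =>
        "rel:" + string.Join("-", parts);

    // A JSON array quotes and delimits every part, so boundaries stay distinct.
    public static string Unambiguous(params string[] parts) =>
        "rel:" + JsonSerializer.Serialize(parts);
}

class Program
{
    static void Main()
    {
        // Two different (source, relationship, target) triples.
        string[] first = { "a", "r-b", "c" };
        string[] second = { "a-r", "b", "c" };

        // Prints True: both collapse to "rel:a-r-b-c".
        Console.WriteLine(CompositeKey.Dashed(first) == CompositeKey.Dashed(second));
        // Prints False: the array form keeps the parts apart.
        Console.WriteLine(CompositeKey.Unambiguous(first) == CompositeKey.Unambiguous(second));
    }
}
```

If the identifiers are known never to contain the separator, the simpler interpolated form is fine.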

Performance Considerations

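  • Hash-based keys: Fast to compute, but string.GetHashCode() yields only 32 bits, so collisions become more likely as datasets grow; on .NET Core and later it is also randomized per process, so do not persist these keys or compare them across runs
  • Property-based keys: Avoid allocating each object's full raw text via GetRawText() just to build a key, which helps on large datasets
  • String concatenation: Prefer $ string interpolation; it is easier to read than + concatenation and typically at least as fast on .NET 6 and later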
  • Caching: For repeated comparisons of the same data, consider caching key extraction results

Debugging Tips

  1. Start simple: Begin with hash-based approach, then optimize
  2. Log your keys: Print out a few generated keys to verify they look correct
  3. Test edge cases: Verify behavior with missing properties, null values, empty objects
  4. Validate uniqueness: Ensure your key strategy produces unique keys for different objects
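
Tips 2 and 4 can be combined into a small pre-flight check that runs a key selector over an NDJSON string and reports any key produced more than once. This is a minimal sketch using only System.Text.Json; KeyDiagnostics and FindDuplicateKeys are hypothetical names and are not part of NdjsonDelta.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical diagnostic helper; not part of the NdjsonDelta package.
static class KeyDiagnostics
{
    // Returns every key that the selector produced for more than one line.
    public static List<string> FindDuplicateKeys(
        string ndjson, Func<JsonElement, string> keySelector)
    {
        var counts = new Dictionary<string, int>();
        foreach (var line in ndjson.Split('\n', StringSplitOptions.RemoveEmptyEntries))
        {
            if (string.IsNullOrWhiteSpace(line)) continue;
            using var doc = JsonDocument.Parse(line);
            string key = keySelector(doc.RootElement);
            counts[key] = counts.TryGetValue(key, out int n) ? n + 1 : 1;
        }
        return counts.Where(p => p.Value > 1).Select(p => p.Key).ToList();
    }
}

class Program
{
    static void Main()
    {
        string ndjson =
            "{\"$dtId\":\"DfgA10\"}\n" +
            "{\"Section\":\"Twins\"}\n" +
            "{\"Section\":\"Models\"}";

        // A careless fallback maps every object without $dtId to the same key.
        var duplicates = KeyDiagnostics.FindDuplicateKeys(ndjson, obj =>
            obj.TryGetProperty("$dtId", out var id)
                ? $"twin:{id.GetString()}"
                : "unknown");

        // Prints: Duplicate keys: unknown
        Console.WriteLine($"Duplicate keys: {string.Join(", ", duplicates)}");
    }
}
```

An empty result means the selector produced a distinct key for every line of that input.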

Installation

dotnet add package NdjsonDelta

Dependencies

This package has minimal dependencies:

  • .NET 6.0 or .NET 8.0
  • System.Text.Json (built-in)

For Azure Blob Storage integration, your application should install:

dotnet add package Azure.Storage.Blobs

License

MIT License

Contributing

Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.


Version  Downloads  Last Updated
2.0.0    583        7/2/2025
1.0.4    148        6/30/2025
1.0.3    222        6/30/2025  (deprecated: no longer maintained and has critical bugs)
1.0.2    147        6/30/2025
1.0.1    152        6/30/2025
1.0.0    148        6/30/2025