NdjsonDelta 2.0.0
A .NET library to compute the delta (added, removed, changed objects) between two NDJSON (Newline Delimited JSON) files or strings. Designed for easy integration and NuGet distribution.
Features
- Compare two NDJSON sources and get added, removed, and changed objects
- Simple API for file or string input
- Robust JSON parsing with error handling
- Well-documented public methods
- Lightweight and focused on core delta functionality
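To make the added/removed/changed terminology concrete, here is a minimal sketch of a keyed delta over two in-memory maps. It illustrates the concept only and is not NdjsonDelta's implementation; `KeyedDeltaSketch` and `Diff` are hypothetical names introduced for this example.

```csharp
using System;
using System.Collections.Generic;

public static class KeyedDeltaSketch
{
    // Compares two key -> content maps and classifies every key.
    public static (List<string> Added, List<string> Removed, List<string> Changed) Diff(
        IReadOnlyDictionary<string, string> baseItems,
        IReadOnlyDictionary<string, string> currentItems)
    {
        var added = new List<string>();
        var removed = new List<string>();
        var changed = new List<string>();

        foreach (var (key, content) in currentItems)
        {
            if (!baseItems.TryGetValue(key, out var oldContent))
                added.Add(key);      // key exists only in the current version
            else if (!string.Equals(oldContent, content, StringComparison.Ordinal))
                changed.Add(key);    // same key, different content
        }

        foreach (var key in baseItems.Keys)
        {
            if (!currentItems.ContainsKey(key))
                removed.Add(key);    // key exists only in the base version
        }

        return (added, removed, changed);
    }

    public static void Main()
    {
        var v1 = new Dictionary<string, string> { ["A"] = "{\"v\":1}", ["B"] = "{\"v\":1}" };
        var v2 = new Dictionary<string, string> { ["A"] = "{\"v\":2}", ["C"] = "{\"v\":1}" };

        var (added, removed, changed) = Diff(v1, v2);
        Console.WriteLine($"Added: {string.Join(",", added)}");     // Added: C
        Console.WriteLine($"Removed: {string.Join(",", removed)}"); // Removed: B
        Console.WriteLine($"Changed: {string.Join(",", changed)}"); // Changed: A
    }
}
```

The sketch shows why the key selector matters: an object can only be classified as changed when both versions map it to the same key.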
Usage
- Add the NuGet package to your project (see Installation).
- Use the NdjsonDeltaCalculator class to compute deltas between NDJSON sources.
Sample Program
Basic Example: Simple Content-Based Comparison
using System;
using NdjsonDelta;
class Program
{
static void Main()
{
var calculator = new NdjsonDeltaCalculator();
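// Simple approach: use a hash of the raw JSON text as the key (works with any JSON structure).
// Note: an edited object gets a new key, so it shows up as one removal plus one addition, not as a change.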
NdjsonDeltaResult result = calculator.ComputeDeltaFromFiles(
"data_v1.ndjson", "data_v2.ndjson",
obj => obj.GetRawText().GetHashCode().ToString());
Console.WriteLine("=== Data Changes ===");
Console.WriteLine($"Added: {result.Added.Count}");
Console.WriteLine($"Removed: {result.Removed.Count}");
Console.WriteLine($"Changed: {result.Changed.Count}");
}
}
Advanced Example: Azure Digital Twin Topology Versions
Suppose you have two NDJSON files containing Azure Digital Twin models and instances:
topology_v1.ndjson (Previous version)
{"Section": "Header"}
{"fileVersion":"1.0.0","author":"XYZ","organization":"XYZ"}
{"Section": "Models"}
{"@context":"dtmi:dtdl:context;2","@id":"dtmi:com:xyzindustries:edge:asset;1","@type":"Interface","displayName":"Asset"}
{"Section": "Twins"}
{"$dtId":"DfgA10","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}
{"$dtId":"GhjA14","$metadata":{"$model":"dtmi:com:xyzindustries:edge:ghj;1"}}
{"$dtId":"Plant301","$metadata":{"$model":"dtmi:com:xyzindustries:edge:plant;1"}}
topology_v2.ndjson (Current version)
{"Section": "Header"}
{"fileVersion":"1.0.1","author":"XYZ","organization":"XYZ"}
{"Section": "Models"}
{"@context":"dtmi:dtdl:context;2","@id":"dtmi:com:xyzindustries:edge:asset;1","@type":"Interface","displayName":"Asset"}
{"Section": "Twins"}
{"$dtId":"DfgA10","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}
{"$dtId":"GhjA15","$metadata":{"$model":"dtmi:com:xyzindustries:edge:ghj;1"}}
{"$dtId":"Plant301","$metadata":{"$model":"dtmi:com:xyzindustries:edge:plant;1"}}
{"$dtId":"DfgA11","$metadata":{"$model":"dtmi:com:xyzindustries:edge:dfg;1"}}
Advanced C# code with property-based keys:
using System;
using System.Text.Json;
using NdjsonDelta;
class Program
{
static void Main()
{
var calculator = new NdjsonDeltaCalculator();
// Advanced approach: Use specific properties for more readable keys
NdjsonDeltaResult result = calculator.ComputeDeltaFromFiles(
"topology_v1.ndjson", "topology_v2.ndjson",
obj => {
// Use $dtId for twins
if (obj.TryGetProperty("$dtId", out var dtId))
return dtId.GetString();
// Use @id for models
if (obj.TryGetProperty("@id", out var id))
return id.GetString();
// Use a constant key for the header record so that a version bump is reported as Changed
if (obj.TryGetProperty("fileVersion", out _))
return "fileVersion";
// Fallback to content hash for other objects
return obj.GetRawText().GetHashCode().ToString();
});
Console.WriteLine("=== Digital Twin Topology Changes ===");
Console.WriteLine($"Added Twins/Models: {result.Added.Count}");
Console.WriteLine($"Removed Twins/Models: {result.Removed.Count}");
Console.WriteLine($"Changed Items: {result.Changed.Count}");
foreach (var item in result.Added)
{
if (item.TryGetProperty("$dtId", out var dtId))
Console.WriteLine($"Added Twin: {dtId.GetString()}");
}
}
}
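Expected output (the loop prints every added twin; the order shown assumes items are returned in file order):
=== Digital Twin Topology Changes ===
Added Twins/Models: 2
Removed Twins/Models: 1
Changed Items: 1
Added Twin: GhjA15
Added Twin: DfgA11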
String-Based Comparison
Scenario 1: Compare Two NDJSON Strings (Current vs Base Version)
using System;
using NdjsonDelta;
class Program
{
static void Main()
{
var calculator = new NdjsonDeltaCalculator();
// Current version NDJSON content (as string)
string currentVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA15"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$dtId"":""DfgA11"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";
// Base version NDJSON content (as string)
string baseVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA14"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";
// Compare both strings to find differences (twins and relationships)
NdjsonDeltaResult result = calculator.ComputeDeltaFromStrings(
baseVersionNdjsonString,
currentVersionNdjsonString,
obj => {
// Handle twins
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
// Handle relationships
if (obj.TryGetProperty("$sourceId", out var sourceId) &&
obj.TryGetProperty("$targetId", out var targetId) &&
obj.TryGetProperty("$relationshipName", out var relName))
return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
return obj.GetRawText().GetHashCode().ToString();
});
Console.WriteLine("=== Twin and Relationship Changes ===");
Console.WriteLine($"Added: {result.Added.Count}, Removed: {result.Removed.Count}, Changed: {result.Changed.Count}");
foreach (var item in result.Added)
{
if (item.TryGetProperty("$dtId", out var dtId))
Console.WriteLine($"Added Twin: {dtId.GetString()}");
else if (item.TryGetProperty("$sourceId", out var sourceId))
Console.WriteLine($"Added Relationship: {sourceId.GetString()} -> {item.GetProperty("$targetId").GetString()}");
}
}
}
Scenario 2: Process Current Version Only (No Base Version)
using System;
using NdjsonDelta;
class Program
{
static void Main()
{
var calculator = new NdjsonDeltaCalculator();
// Current version NDJSON content (as string)
string currentVersionNdjsonString = @"{""$dtId"":""DfgA10"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:dfg;1""}}
{""$dtId"":""GhjA15"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:ghj;1""}}
{""$dtId"":""Plant301"",""$metadata"":{""$model"":""dtmi:com:xyzindustries:edge:plant;1""}}
{""$sourceId"":""DfgA10"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}
{""$sourceId"":""GhjA15"",""$targetId"":""Plant301"",""$relationshipName"":""isPartOf""}";
// Base version is null/empty - get all twins and relationships from current version
string? baseVersionNdjsonString = null; // or string.Empty
// When base is null/empty, all items from current will be in "Added" collection
NdjsonDeltaResult result = calculator.ComputeDeltaFromStrings(
baseVersionNdjsonString ?? string.Empty,
currentVersionNdjsonString,
obj => {
// Handle twins
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
// Handle relationships
if (obj.TryGetProperty("$sourceId", out var sourceId) &&
obj.TryGetProperty("$targetId", out var targetId) &&
obj.TryGetProperty("$relationshipName", out var relName))
return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
return obj.GetRawText().GetHashCode().ToString();
});
Console.WriteLine("=== All Twins and Relationships ===");
Console.WriteLine($"Total Items: {result.Added.Count}");
foreach (var item in result.Added)
{
if (item.TryGetProperty("$dtId", out var dtId))
Console.WriteLine($"Twin: {dtId.GetString()}");
else if (item.TryGetProperty("$sourceId", out var sourceId))
Console.WriteLine($"Relationship: {sourceId.GetString()} -[{item.GetProperty("$relationshipName").GetString()}]-> {item.GetProperty("$targetId").GetString()}");
}
}
}
Usage with Azure Blob Storage
Your application handles blob access; the NuGet package handles the delta calculation:
using System;
using System.Threading.Tasks;
using Azure.Storage.Blobs;
using NdjsonDelta;
public class TopologyComparisonService
{
private readonly BlobServiceClient _blobServiceClient;
private readonly NdjsonDeltaCalculator _deltaCalculator;
public TopologyComparisonService(string connectionString)
{
_blobServiceClient = new BlobServiceClient(connectionString);
_deltaCalculator = new NdjsonDeltaCalculator();
}
public async Task<NdjsonDeltaResult> CompareTopologyFromBlobs(
string baseContainer, string baseBlob,
string currentContainer, string currentBlob)
{
// Application downloads blob content as strings
string baseContent = await DownloadBlobAsString(baseContainer, baseBlob);
string currentContent = await DownloadBlobAsString(currentContainer, currentBlob);
// NuGet package handles the delta calculation
return _deltaCalculator.ComputeDeltaFromStrings(
baseContent,
currentContent,
obj => {
// Handle twins
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
// Handle relationships
if (obj.TryGetProperty("$sourceId", out var sourceId) &&
obj.TryGetProperty("$targetId", out var targetId) &&
obj.TryGetProperty("$relationshipName", out var relName))
return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
return obj.GetRawText().GetHashCode().ToString();
});
}
private async Task<string> DownloadBlobAsString(string containerName, string blobName)
{
var containerClient = _blobServiceClient.GetBlobContainerClient(containerName);
var blobClient = containerClient.GetBlobClient(blobName);
var response = await blobClient.DownloadContentAsync();
return response.Value.Content.ToString();
}
}
Usage example:
using System;
using System.Threading.Tasks;
class Program
{
static async Task Main()
{
var service = new TopologyComparisonService("your_connection_string");
var result = await service.CompareTopologyFromBlobs(
"topology-archive", "topology_20240301.ndjson",
"topology-current", "topology_20240401.ndjson");
Console.WriteLine($"Changes detected: {result.Added.Count + result.Removed.Count + result.Changed.Count}");
}
}
Key Selector Best Practices
The key selector function is crucial for accurate delta calculation. Choose the right strategy based on your data structure and requirements:
1. Simple Hash-Based Approach (Recommended for beginners)
When to use:
- You're new to the library and want to get started quickly
- Your JSON objects don't have obvious unique identifiers
- You want to detect any content changes (even minor formatting differences)
- You're comparing heterogeneous data with varying structures
Pros:
- Simple to implement
- Works with any JSON structure
- Detects all content changes (each is reported as a removal plus an addition, because an edited object hashes to a new key)
Cons:
- Keys are not human-readable
- Sensitive to formatting changes (whitespace, property order)
- Harder to debug when issues arise
// Simple approach - works for any JSON data
obj => obj.GetRawText().GetHashCode().ToString()
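If content keys need to be stable across process runs, or a 32-bit hash feels too collision-prone, one option is a cryptographic hash of the raw text. The sketch below is an assumption-laden illustration, not part of NdjsonDelta: `ContentKey.FromRawText` is a hypothetical helper built only on `System.Text.Json` and `System.Security.Cryptography`.

```csharp
using System;
using System.Security.Cryptography;
using System.Text;
using System.Text.Json;

public static class ContentKey
{
    // Hypothetical helper (not part of NdjsonDelta):
    // SHA-256 of the element's raw JSON text, returned as a 64-character hex string.
    public static string FromRawText(JsonElement obj)
    {
        byte[] utf8 = Encoding.UTF8.GetBytes(obj.GetRawText());
        return Convert.ToHexString(SHA256.HashData(utf8));
    }

    public static void Main()
    {
        using var a = JsonDocument.Parse("{\"id\":1}");
        using var b = JsonDocument.Parse("{\"id\":1}");
        using var c = JsonDocument.Parse("{ \"id\": 1 }"); // same data, different whitespace

        // Identical text yields identical keys.
        Console.WriteLine(FromRawText(a.RootElement) == FromRawText(b.RootElement)); // True
        // Still formatting-sensitive: GetRawText() preserves the original whitespace.
        Console.WriteLine(FromRawText(a.RootElement) == FromRawText(c.RootElement)); // False
    }
}
```

It can be passed as a key selector in the same position as the hash-based lambda, for example `obj => ContentKey.FromRawText(obj)`. It costs more per object than `GetHashCode()`, so it is a trade of speed for stability.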
2. Property-Based Approach (Recommended for production)
When to use:
- Your data has natural unique identifiers (IDs, names, etc.)
- You want human-readable keys for debugging
- You need to ignore formatting differences
- You're working with structured data like database records or API responses
Pros:
- Human-readable keys for easier debugging
- More stable across formatting changes
- Better performance for large datasets
- More precise change detection: an edited object keeps its key, so it is reported under Changed
Cons:
- Requires knowledge of your data structure
- More complex implementation
- Need fallback strategy for objects without identifiers
// Property-based approach for Digital Twins
obj => {
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
if (obj.TryGetProperty("@id", out var id))
return $"model:{id.GetString()}";
// Fallback to hash for unknown structures
return obj.GetRawText().GetHashCode().ToString();
}
3. Hybrid Approach (Best of both worlds)
When to use:
- You have mixed data types in your NDJSON
- Some objects have identifiers, others don't
- You want maximum flexibility
// Hybrid approach with multiple fallbacks
obj => {
// Try common ID fields first
if (obj.TryGetProperty("id", out var id))
return $"id:{id.GetString()}";
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
if (obj.TryGetProperty("name", out var name))
return $"name:{name.GetString()}";
// For section headers or metadata
if (obj.TryGetProperty("Section", out var section))
return $"section:{section.GetString()}";
// Final fallback to content hash
return $"hash:{obj.GetRawText().GetHashCode()}";
}
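One caveat with property lookups: `JsonElement.TryGetProperty` throws when the element is not a JSON object, and `GetString()` throws when the value is a number, so a numeric `id` or an array line would break the selector. The following is a minimal defensive sketch; `SafeKey.TryGetKeyPart` is a hypothetical helper, and treating numbers as valid IDs is an assumption you may not want.

```csharp
using System;
using System.Text.Json;

public static class SafeKey
{
    // Hypothetical helper (not part of NdjsonDelta): returns false instead of
    // throwing when the property is missing, null, empty, or not usable as an ID.
    public static bool TryGetKeyPart(JsonElement obj, string propertyName, out string value)
    {
        value = string.Empty;
        if (obj.ValueKind != JsonValueKind.Object) return false; // TryGetProperty throws on non-objects
        if (!obj.TryGetProperty(propertyName, out var prop)) return false;

        switch (prop.ValueKind)
        {
            case JsonValueKind.String:
                value = prop.GetString() ?? string.Empty;
                return value.Length > 0;
            case JsonValueKind.Number:
                value = prop.GetRawText(); // e.g. 42 becomes "42"
                return true;
            default:
                return false;              // null, bool, object, array
        }
    }

    public static void Main()
    {
        foreach (var json in new[] { "{\"id\":\"abc\"}", "{\"id\":42}", "{\"id\":null}", "[1,2]" })
        {
            using var doc = JsonDocument.Parse(json);
            bool ok = TryGetKeyPart(doc.RootElement, "id", out var value);
            string shown = ok ? value : "(no key)";
            Console.WriteLine($"{json} -> {shown}");
        }
    }
}
```

Inside a key selector this would read `if (SafeKey.TryGetKeyPart(obj, "id", out var id)) return $"id:{id}";`, falling through to the next strategy when it returns false.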
4. Composite Key Strategy
When to use:
- Single properties aren't unique enough
- You need to combine multiple fields for uniqueness
- Working with relational data (like relationships between entities)
// Composite keys for relationships
obj => {
// For relationships, combine source, target, and relationship type
if (obj.TryGetProperty("$sourceId", out var sourceId) &&
obj.TryGetProperty("$targetId", out var targetId) &&
obj.TryGetProperty("$relationshipName", out var relName))
return $"rel:{sourceId.GetString()}-{relName.GetString()}-{targetId.GetString()}";
// For twins, use the twin ID
if (obj.TryGetProperty("$dtId", out var dtId))
return $"twin:{dtId.GetString()}";
return obj.GetRawText().GetHashCode().ToString();
}
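If the IDs themselves can contain the `-` separator, two different relationships may produce the same composite key (for example, parts `a-b`, `c`, `d` and parts `a`, `b-c`, `d` both become `rel:a-b-c-d`). A length-prefixed join avoids that ambiguity. This is a sketch under that assumption; `CompositeKey.Join` is a hypothetical helper, not a library API.

```csharp
using System;
using System.Text;

public static class CompositeKey
{
    // Hypothetical helper: appends each part as "|<length>:<text>" so that
    // part boundaries stay unambiguous whatever characters the parts contain.
    public static string Join(string prefix, params string[] parts)
    {
        var sb = new StringBuilder(prefix);
        foreach (var part in parts)
            sb.Append('|').Append(part.Length).Append(':').Append(part);
        return sb.ToString();
    }

    public static void Main()
    {
        Console.WriteLine(Join("rel", "a-b", "c", "d")); // rel|3:a-b|1:c|1:d
        Console.WriteLine(Join("rel", "a", "b-c", "d")); // rel|1:a|3:b-c|1:d
    }
}
```

In the relationship branch of a selector, the return value would become `CompositeKey.Join("rel", sourceId.GetString() ?? "", relName.GetString() ?? "", targetId.GetString() ?? "")`. The keys are less pretty than the dash-joined form, so this is only worth it when IDs are not under your control.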
Performance Considerations
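- Hash-based keys: Fast to compute, but string.GetHashCode() yields only a 32-bit value, so collisions become more likely as datasets grow. On .NET Core and later the value also differs between process runs, so do not persist these keys or compare them across runs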
- Property-based keys: Faster lookups, better memory efficiency for large datasets
- String concatenation: Use $ string interpolation to build composite keys; it is more readable than + concatenation and generally performs at least as well
- Caching: For repeated comparisons of the same data, consider caching key extraction results
Debugging Tips
- Start simple: Begin with hash-based approach, then optimize
- Log your keys: Print out a few generated keys to verify they look correct
- Test edge cases: Verify behavior with missing properties, null values, empty objects
- Validate uniqueness: Ensure your key strategy produces unique keys for different objects
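The "validate uniqueness" tip can be checked mechanically before running a delta. The sketch below parses NDJSON line by line with `System.Text.Json` and reports any key produced more than once; `KeyDiagnostics.FindDuplicateKeys` is a hypothetical helper written for this example and does not use NdjsonDelta itself.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

public static class KeyDiagnostics
{
    // Returns every key that the selector produced for more than one NDJSON line.
    public static List<string> FindDuplicateKeys(string ndjson, Func<JsonElement, string> keySelector)
    {
        var counts = new Dictionary<string, int>();
        foreach (var line in ndjson.Split('\n'))
        {
            if (string.IsNullOrWhiteSpace(line)) continue; // skip blank lines
            using var doc = JsonDocument.Parse(line);
            string key = keySelector(doc.RootElement);
            counts[key] = counts.TryGetValue(key, out var seen) ? seen + 1 : 1;
        }
        return counts.Where(kv => kv.Value > 1).Select(kv => kv.Key).ToList();
    }

    public static void Main()
    {
        string ndjson = "{\"Section\":\"Header\"}\n{\"name\":\"x\"}\n{\"name\":\"x\",\"v\":2}";

        // Keying on "name" alone maps two distinct objects to the same key.
        var duplicates = FindDuplicateKeys(
            ndjson,
            obj => obj.TryGetProperty("name", out var name) ? $"name:{name.GetString()}" : obj.GetRawText());

        Console.WriteLine($"Duplicate keys: {string.Join(", ", duplicates)}"); // Duplicate keys: name:x
    }
}
```

Running it once per input file with the same selector you intend to pass to the calculator makes key clashes visible before they distort the delta.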
Installation
dotnet add package NdjsonDelta
Dependencies
This package has minimal dependencies:
- .NET 6.0 or .NET 8.0
- System.Text.Json (built-in)
For Azure Blob Storage integration, your application should install:
dotnet add package Azure.Storage.Blobs
License
MIT License
Contributing
Pull requests are welcome. For major changes, please open an issue first to discuss what you would like to change.
Product | Compatible | Computed |
---|---|---|
.NET | net6.0, net8.0 | net7.0, net9.0, net10.0; the android, ios, maccatalyst, macos, tvos, and windows variants of net6.0 through net10.0; the browser variants of net8.0 through net10.0 |
Dependencies by target framework:
- net6.0: No dependencies.
- net8.0: No dependencies.