FractalDataWorks.Data.Files 0.4.0-preview.6

This is a prerelease version of FractalDataWorks.Data.Files.
The owner has unlisted this package. This could mean that the package is deprecated, has security vulnerabilities or shouldn't be used anymore.

FractalDataWorks.Data.Files

File system container and path implementations for the FractalDataWorks data framework. This package provides container types for working with file-based data sources including CSV, JSON, XML, Parquet, and other file formats.

Overview

FractalDataWorks.Data.Files extends the FractalDataWorks data abstraction layer to support file-based data operations. It provides:

  • FileContainer: Container implementation for file-based data sources
  • FilePath: Path implementation for file system locations with wildcard support
  • FileContainerType: TypeOption for file container type discovery
  • Format Support: CSV, JSON, XML, Parquet, Avro, Protobuf, and extensible format types
  • Operation Mapping: Automatic determination of supported operations based on file format

Target Frameworks: .NET 10.0

Dependencies:

  • FractalDataWorks.Data.Abstractions
  • FractalDataWorks.Collections

Core Components

FileContainer

Represents a file container with schema, format, and path information.

From FileContainer.cs:11-61:

public sealed class FileContainer : ContainerBase
{
    public FileContainer(
        string name,
        FilePath path,
        IContainerSchema schema,
        IFormatType format)
        : base(20, name)
    {
        Path = path ?? throw new ArgumentNullException(nameof(path));
        Schema = schema ?? throw new ArgumentNullException(nameof(schema));
        Format = format ?? throw new ArgumentNullException(nameof(format));
        ContainerType = FileContainerType.Instance;
        SupportedOperations = DetermineOperations(format);
    }

    public override IContainerType ContainerType { get; }
    public override IFormatType Format { get; }
    public override IPath Path { get; }
    public override IContainerSchema Schema { get; }

    private static string[] DetermineOperations(IFormatType format)
    {
        return format.Name switch
        {
            "Json" => new[] { "Query", "Insert", "Update", "Delete" },
            "Xml" => new[] { "Query", "Insert", "Update", "Delete" },
            "Csv" => new[] { "Query", "Insert", "Update" },
            "Parquet" => new[] { "Query" },
            "Avro" => new[] { "Query" },
            "Protobuf" => new[] { "Query" },
            _ => new[] { "Query" }
        };
    }
}

Key Characteristics:

  • Container ID: 20 (unique identifier in container type system)
  • Automatic operation support based on format: full CRUD for JSON and XML; Query, Insert, and Update for CSV; Query-only for Parquet, Avro, Protobuf, and any unrecognized format
  • Schema-aware for structured data validation
  • Format-specific handling for parsing and serialization

FilePath

Represents a path to a file or file pattern with wildcard support.

From FilePath.cs:15-87:

public sealed class FilePath : PathBase, IDataPath<IStorageContainer>
{
    private readonly List<IStorageContainer> _containers;

    public FilePath(
        string path,
        IEnumerable<IStorageContainer>? containers = null)
        : base(3, "FilePath")
    {
        PathValue = path ?? throw new ArgumentNullException(nameof(path));
        _containers = containers?.ToList() ?? new List<IStorageContainer>();
        IsPattern = path.Contains('*') || path.Contains('?');
    }

    public override string PathValue { get; }
    public override string Domain => "File";
    public bool IsPattern { get; }
    public string Directory => Path.GetDirectoryName(PathValue) ?? string.Empty;
    public string FileName => Path.GetFileName(PathValue);
    public string Extension => IsPattern ? string.Empty : Path.GetExtension(PathValue);

    // IDataPath implementation
    public IReadOnlyList<IStorageContainer> Containers => _containers;

    public IStorageContainer? GetContainer(string name) =>
        _containers.FirstOrDefault(c => string.Equals(c.Name, name, StringComparison.OrdinalIgnoreCase));

    public bool ContainsContainer(string name) =>
        _containers.Any(c => string.Equals(c.Name, name, StringComparison.OrdinalIgnoreCase));
}

Features:

  • Path type ID: 3 (unique identifier in path type system)
  • Wildcard pattern detection (* and ? characters)
  • File name and extension extraction
  • Directory path parsing
  • Container association for multi-file scenarios
  • Implements IDataPath<IStorageContainer> for data layer integration
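
The excerpt above shows that FilePath detects wildcard patterns, but not whether anything in the package expands them into concrete files. The following is a minimal sketch of one way to do that with System.IO alone. ExpandPattern is a hypothetical helper, not part of the package, and it assumes wildcards appear only in the file-name portion of the path.

```csharp
using System;
using System.Collections.Generic;
using System.IO;
using System.Linq;

// Hypothetical helper (not a package API): expands a wildcard path into the
// files that currently match it. Wildcards in the directory portion are not
// handled and yield an empty result.
static IReadOnlyList<string> ExpandPattern(string pathValue)
{
    // Same detection rule as FilePath.IsPattern.
    bool isPattern = pathValue.Contains('*') || pathValue.Contains('?');
    if (!isPattern)
    {
        return File.Exists(pathValue) ? new[] { pathValue } : Array.Empty<string>();
    }

    string? directory = Path.GetDirectoryName(pathValue);
    if (string.IsNullOrEmpty(directory))
    {
        directory = "."; // bare pattern such as "*.csv": search the current directory
    }

    if (!Directory.Exists(directory))
    {
        return Array.Empty<string>();
    }

    // The file-name portion (for example "*.csv") becomes the search pattern.
    string searchPattern = Path.GetFileName(pathValue);
    return Directory.EnumerateFiles(directory, searchPattern)
        .OrderBy(f => f, StringComparer.Ordinal)
        .ToList();
}

// Demo against a throwaway directory.
string demoDir = Path.Combine(Path.GetTempPath(), "fdw-demo-" + Guid.NewGuid().ToString("N"));
Directory.CreateDirectory(demoDir);
File.WriteAllText(Path.Combine(demoDir, "orders_q1.csv"), "id,amount");
File.WriteAllText(Path.Combine(demoDir, "orders_q2.csv"), "id,amount");
File.WriteAllText(Path.Combine(demoDir, "notes.json"), "{}");

IReadOnlyList<string> matches = ExpandPattern(Path.Combine(demoDir, "*.csv"));
Console.WriteLine($"Matched {matches.Count} file(s)"); // Matched 2 file(s)
```

With a real FilePath instance, the same helper could be called as ExpandPattern(path.PathValue).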

FileContainerType

TypeOption implementation for file container type discovery.

From FileContainerType.cs:10-28:

[ExcludeFromCodeCoverage]
[TypeOption(typeof(ContainerTypes), "File")]
public sealed class FileContainerType : ContainerTypeBase
{
    public static readonly FileContainerType Instance = new();

    private FileContainerType()
        : base(
            id: 20,
            name: "File",
            displayName: "File",
            description: "File-based data container supporting CSV, JSON, XML, Parquet, and other formats",
            supportsSchemaDiscovery: true)
    {
    }
}

Usage Patterns

Creating File Containers

The following examples demonstrate creating file containers using actual framework APIs.

CSV File Container

using FractalDataWorks.Data.Abstractions;
using FractalDataWorks.Data.Files.Containers;
using FractalDataWorks.Data.Files.Paths;

// Define schema using required init properties
// See FractalDataWorks.Data.Abstractions/Schema/ContainerSchema.cs
var schema = new ContainerSchema
{
    Fields = new List<IField>
    {
        new Field
        {
            Name = "CustomerId",
            FieldType = new SimpleFieldType { TypeName = "String", ClrType = typeof(string) },
            Role = FieldRole.Identity
        },
        new Field
        {
            Name = "OrderDate",
            FieldType = new SimpleFieldType { TypeName = "DateTime", ClrType = typeof(DateTime) },
            Role = FieldRole.Attribute
        },
        new Field
        {
            Name = "Amount",
            FieldType = new SimpleFieldType { TypeName = "Decimal", ClrType = typeof(decimal) },
            Role = FieldRole.Measure
        }
    }
};

// Create file path (see Paths/FilePath.cs)
var csvPath = new FilePath(@"C:\Data\Orders\orders_2024.csv");

// Get format type from TypeCollection (see FractalDataWorks.Data.Abstractions/Formats/FormatTypes.cs)
var csvFormat = FormatTypes.ByName("Csv");

// Create CSV file container
var csvContainer = new FileContainer(
    name: "CustomerOrders",
    path: csvPath,
    schema: schema,
    format: csvFormat);

// Container properties
Console.WriteLine($"Container: {csvContainer.Name}");
Console.WriteLine($"Type: {csvContainer.ContainerType.Name}");       // "File"
Console.WriteLine($"Format: {csvContainer.Format.Name}");            // "Csv"
Console.WriteLine($"Path: {csvContainer.Path.PathValue}");           // Note: PathValue not Value
Console.WriteLine($"Operations: {string.Join(", ", csvContainer.SupportedOperations)}");
// Output: "Query, Insert, Update"

JSON File Container

var jsonPath = new FilePath("./data/customers.json");
var jsonFormat = FormatTypes.ByName("Json");

var jsonContainer = new FileContainer(
    name: "Customers",
    path: jsonPath,
    schema: schema,
    format: jsonFormat);

// JSON supports full CRUD
Console.WriteLine($"Operations: {string.Join(", ", jsonContainer.SupportedOperations)}");
// Output: "Query, Insert, Update, Delete"

Parquet File Container

var parquetPath = new FilePath(@"D:\Sensors\telemetry_2024-11.parquet");
var parquetFormat = FormatTypes.ByName("Parquet");

var parquetContainer = new FileContainer(
    name: "SensorTelemetry",
    path: parquetPath,
    schema: schema,
    format: parquetFormat);

// Parquet files are Query-only (binary format)
Console.WriteLine($"Query supported: {parquetContainer.SupportedOperations.Contains("Query")}");   // true
Console.WriteLine($"Insert supported: {parquetContainer.SupportedOperations.Contains("Insert")}"); // false

File Path Operations

Based on actual API from FilePath.cs:

var path = new FilePath(@"C:\Data\Archive\sales_2024_Q1.csv");

// Access path components (using actual property names)
Console.WriteLine($"Full Path: {path.PathValue}");    // C:\Data\Archive\sales_2024_Q1.csv
Console.WriteLine($"File Name: {path.FileName}");     // sales_2024_Q1.csv
Console.WriteLine($"Directory: {path.Directory}");    // C:\Data\Archive
Console.WriteLine($"Extension: {path.Extension}");    // .csv
Console.WriteLine($"Is Pattern: {path.IsPattern}");   // false

// Wildcard pattern example
var patternPath = new FilePath(@"C:\Data\Archive\*.csv");
Console.WriteLine($"Is Pattern: {patternPath.IsPattern}");  // true
Console.WriteLine($"Extension: {patternPath.Extension}");   // "" (empty for patterns)

Note: FilePath does not include Exists() or IsAbsolute() methods. For file existence checks, use System.IO.File.Exists(path.PathValue) directly.
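
Because FilePath derives Directory, FileName, and Extension from System.IO.Path, the values in the comments above hold on Windows. On Linux and macOS a backslash is an ordinary file-name character, so a Windows-style string is treated as a single file name. The sketch below uses only System.IO to show the difference; building paths with Path.Combine is one way to keep examples portable.

```csharp
using System;
using System.IO;

string windowsStyle = @"C:\Data\Archive\sales_2024_Q1.csv";
string portable = Path.Combine("Data", "Archive", "sales_2024_Q1.csv");

Console.WriteLine($"Separator on this OS: '{Path.DirectorySeparatorChar}'");

// On Windows this prints the file name only; on Linux and macOS it prints
// the whole string, because '\' is not a directory separator there.
Console.WriteLine($"FileName (Windows-style): {Path.GetFileName(windowsStyle)}");

// Path.Combine inserts the separator of the current OS, so these split the
// same way everywhere.
Console.WriteLine($"FileName (portable):  {Path.GetFileName(portable)}");
Console.WriteLine($"Directory (portable): {Path.GetDirectoryName(portable)}");
Console.WriteLine($"Extension (portable): {Path.GetExtension(portable)}");
```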

Format-Based Operation Detection

The actual operation detection is implemented in FileContainer. From FileContainer.cs:46-60:

private static string[] DetermineOperations(IFormatType format)
{
    // Most text-based formats support full CRUD
    // Binary formats are often read-only
    return format.Name switch
    {
        "Json" => new[] { "Query", "Insert", "Update", "Delete" },
        "Xml" => new[] { "Query", "Insert", "Update", "Delete" },
        "Csv" => new[] { "Query", "Insert", "Update" },
        "Parquet" => new[] { "Query" },  // Read-only
        "Avro" => new[] { "Query" },  // Read-only
        "Protobuf" => new[] { "Query" },  // Read-only
        _ => new[] { "Query" }  // Default: read-only
    };
}

Usage:

var csvFormat = FormatTypes.ByName("Csv");
var csvContainer = new FileContainer("Data", csvPath, schema, csvFormat);
var canInsert = csvContainer.SupportedOperations.Contains("Insert"); // true

var parquetFormat = FormatTypes.ByName("Parquet");
var parquetContainer = new FileContainer("Data", parquetPath, schema, parquetFormat);
var canInsert2 = parquetContainer.SupportedOperations.Contains("Insert"); // false
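
A caller may prefer to fail fast rather than test Contains at every call site. The following sketch shows one possible guard. Supports and EnsureSupported are hypothetical helpers, not package APIs, and the literal arrays stand in for container.SupportedOperations using the CSV and Parquet mappings shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical helper: true when the operation name is in the supported set.
static bool Supports(IEnumerable<string> supportedOperations, string operation) =>
    supportedOperations.Contains(operation, StringComparer.Ordinal);

// Hypothetical helper: throws before any write is attempted on a container
// that does not list the requested operation.
static void EnsureSupported(
    IEnumerable<string> supportedOperations,
    string operation,
    string containerName)
{
    if (!Supports(supportedOperations, operation))
    {
        throw new NotSupportedException(
            $"Container '{containerName}' does not support '{operation}'.");
    }
}

// Stand-ins for csvContainer.SupportedOperations and
// parquetContainer.SupportedOperations.
string[] csvOperations = { "Query", "Insert", "Update" };
string[] parquetOperations = { "Query" };

EnsureSupported(csvOperations, "Insert", "CustomerOrders"); // passes silently

try
{
    EnsureSupported(parquetOperations, "Insert", "SensorTelemetry");
}
catch (NotSupportedException ex)
{
    Console.WriteLine(ex.Message);
}
```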

Path Validation Pattern

Example pattern for validating file paths with Railway-Oriented Programming:

public IGenericResult<FilePath> ValidateAndCreate(string pathString)
{
    try
    {
        var path = new FilePath(pathString);

        // Validate extension (patterns have empty extension)
        if (!path.IsPattern && string.IsNullOrEmpty(path.Extension))
        {
            return GenericResult<FilePath>.Failure(
                "File path must include an extension");
        }

        // Validate directory exists
        if (!string.IsNullOrEmpty(path.Directory) &&
            !System.IO.Directory.Exists(path.Directory))
        {
            return GenericResult<FilePath>.Failure(
                $"Directory does not exist: {path.Directory}");
        }

        return GenericResult<FilePath>.Success(path);
    }
    catch (Exception ex)
    {
        return GenericResult<FilePath>.Failure($"Invalid file path: {ex.Message}");
    }
}

Best Practices

  1. Use PathValue Property: Access the path string via path.PathValue, not path.Value
  2. Check IsPattern: Check path.IsPattern before reading Extension; pattern paths return an empty extension
  3. Schema Definition: Always define schemas for structured file formats (CSV, Parquet)
  4. Use FormatTypes.ByName: Look up format types through the FormatTypes TypeCollection, for example FormatTypes.ByName("Csv")
  5. Error Handling: Use Railway-Oriented Programming with IGenericResult for file operations
  6. Path Security: Validate and sanitize user-provided paths to prevent path traversal
  7. Check SupportedOperations: Verify operation support before attempting Insert/Update/Delete
  8. File Existence: Use System.IO.File.Exists(path.PathValue) for existence checks
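
For the path-security item, the sketch below shows one common way to reject path traversal before constructing a FilePath: resolve the user-supplied path against an allowed root and confirm the result stays under that root. TryResolveUnderRoot is a hypothetical helper built on System.IO only. It does not resolve symbolic links, and Path.GetFullPath can still throw for malformed input.

```csharp
using System;
using System.IO;

// Hypothetical helper (not a package API): resolves userPath under
// rootDirectory and rejects anything that escapes the root.
static bool TryResolveUnderRoot(string rootDirectory, string userPath, out string resolved)
{
    resolved = string.Empty;
    if (string.IsNullOrWhiteSpace(userPath))
    {
        return false;
    }

    // Normalize the root and make sure it ends with a separator, so that
    // "/data/root-other" is not mistaken for a child of "/data/root".
    string root = Path.GetFullPath(rootDirectory);
    if (!Path.EndsInDirectorySeparator(root))
    {
        root += Path.DirectorySeparatorChar;
    }

    // GetFullPath collapses "." and ".." segments. If userPath is rooted,
    // Path.Combine returns it unchanged, and the prefix check below rejects it.
    string candidate = Path.GetFullPath(Path.Combine(root, userPath));

    StringComparison comparison = OperatingSystem.IsWindows()
        ? StringComparison.OrdinalIgnoreCase
        : StringComparison.Ordinal;

    if (!candidate.StartsWith(root, comparison))
    {
        return false;
    }

    resolved = candidate;
    return true;
}

string allowedRoot = Path.Combine(Path.GetTempPath(), "fdw-root");

Console.WriteLine(TryResolveUnderRoot(allowedRoot, "orders/2024.csv", out string safePath)); // True
Console.WriteLine(TryResolveUnderRoot(allowedRoot, "../secret.txt", out _));                  // False
```

A path accepted this way could then be passed to the FilePath constructor as safePath.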

Testing Examples

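// Assumes xUnit for [Fact] and Shouldly for the Should* assertions.
using FractalDataWorks.Data.Abstractions;
using FractalDataWorks.Data.Files.Containers;
using FractalDataWorks.Data.Files.Paths;
using Shouldly;
using Xunit;

public class FileContainerTests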
{
    [Fact]
    public void FileContainerStoresCorrectProperties()
    {
        // Arrange
        var path = new FilePath(@"C:\Data\test.csv");
        var schema = new ContainerSchema { Fields = new List<IField>() };
        var format = FormatTypes.ByName("Csv");

        // Act
        var container = new FileContainer("TestData", path, schema, format);

        // Assert
        container.Name.ShouldBe("TestData");
        container.Path.ShouldBe(path);
        container.Schema.ShouldBe(schema);
        container.Format.ShouldBe(format);
        container.ContainerType.Name.ShouldBe("File");
    }

    [Fact]
    public void FilePathExtractsComponentsCorrectly()
    {
        // Arrange & Act
        var path = new FilePath(@"C:\Data\Archive\orders.csv");

        // Assert
        path.FileName.ShouldBe("orders.csv");
        path.Extension.ShouldBe(".csv");
        path.Directory.ShouldContain("Archive");
        path.IsPattern.ShouldBeFalse();
    }

    [Fact]
    public void FilePathDetectsWildcardPattern()
    {
        // Arrange & Act
        var path = new FilePath(@"C:\Data\Archive\*.csv");

        // Assert
        path.IsPattern.ShouldBeTrue();
        path.Extension.ShouldBeEmpty();
    }
}

Dependencies

  • FractalDataWorks.Data.Abstractions: Container and path base classes and interfaces (ContainerBase, PathBase, IStorageContainer, IDataPath)
  • FractalDataWorks.Data.DataStores.Abstractions: Data path interfaces
  • FractalDataWorks.Collections: TypeOption and TypeCollection infrastructure
  • FractalDataWorks.Results: IGenericResult for Railway-Oriented Programming

Relationship to Other Packages

FractalDataWorks.Data.Files (this package)
├── Implements: FractalDataWorks.Data.Abstractions (IStorageContainer, IPath, IDataPath)
├── Used by: FractalDataWorks.Data.DataStores.FileSystem
└── Parallel to: FractalDataWorks.Data.Http (HTTP containers)
