Earl.Crawler.Middleware 0.0.0-alpha.0.95

This is a prerelease version of Earl.Crawler.Middleware.
There is a newer prerelease version of this package available.
See the version list below for details.
.NET CLI:
dotnet add package Earl.Crawler.Middleware --version 0.0.0-alpha.0.95

Package Manager:
NuGet\Install-Package Earl.Crawler.Middleware -Version 0.0.0-alpha.0.95
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="Earl.Crawler.Middleware" Version="0.0.0-alpha.0.95" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add Earl.Crawler.Middleware --version 0.0.0-alpha.0.95

Script & Interactive:
#r "nuget: Earl.Crawler.Middleware, 0.0.0-alpha.0.95"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

Cake:
// Install Earl.Crawler.Middleware as a Cake Addin
#addin nuget:?package=Earl.Crawler.Middleware&version=0.0.0-alpha.0.95&prerelease

// Install Earl.Crawler.Middleware as a Cake Tool
#tool nuget:?package=Earl.Crawler.Middleware&version=0.0.0-alpha.0.95&prerelease

Earl Middleware Layer

The "Earl Middleware Layer" refers to a suite of APIs that enable the composition of code into a series of operations performed against a url during a crawl.

Earl's Middleware pattern is strongly influenced by ASP.NET Core's Middleware pattern; reviewing ASP.NET Core's Middleware documentation is strongly recommended.

Middleware accepts a CrawlUrlContext and a CrawlUrlDelegate: the former represents the current state of the crawl against the current url, and the latter is a reference to the next operation in the pipeline. This behaviour is captured in the ICrawlerMiddleware contract and is analogous to ASP.NET Core's IMiddleware contract.
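
The chaining of a context and a "next" delegate can be illustrated with a minimal, self-contained sketch. It does not use Earl's own types: SketchContext, SketchDelegate, SketchMiddleware and SketchPipeline are hypothetical stand-ins invented for this example, and the way Earl actually builds its pipeline may differ. The sketch only shows the general pattern, in which each middleware receives the current context plus a delegate representing the rest of the pipeline.

```csharp
using System;
using System.Collections.Generic;
using System.Threading.Tasks;

// Hypothetical stand-ins for illustration only; these are NOT Earl's types.
public record SketchContext( Uri Url, List<string> Log );
public delegate Task SketchDelegate( SketchContext context );
public delegate Task SketchMiddleware( SketchContext context, SketchDelegate next );

public static class SketchPipeline
{
    // Folds the middleware from last to first, so that the first
    // registered middleware becomes the outermost call in the chain.
    public static SketchDelegate Build( params SketchMiddleware[] middleware )
    {
        // Terminal step: nothing left to do once every middleware has run.
        SketchDelegate pipeline = _ => Task.CompletedTask;

        for( var i = middleware.Length - 1; i >= 0; i-- )
        {
            var current = middleware[ i ];
            var next = pipeline; // capture the pipeline built so far
            pipeline = context => current( context, next );
        }

        return pipeline;
    }

    // Runs two middleware against one url and returns the recorded order.
    public static async Task<List<string>> DemoAsync()
    {
        var context = new SketchContext( new Uri( "https://example.com/" ), new List<string>() );

        var pipeline = Build(
            async ( ctx, next ) =>
            {
                ctx.Log.Add( "first: before" );
                await next( ctx ); // hand off to the rest of the pipeline
                ctx.Log.Add( "first: after" );
            },
            ( ctx, next ) =>
            {
                ctx.Log.Add( "second" );
                return next( ctx );
            }
        );

        await pipeline( context );
        return context.Log;
    }
}
```

Calling SketchPipeline.DemoAsync() in this sketch records "first: before", "second", "first: after", because the first middleware resumes only once the rest of the pipeline has completed.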

Middleware is configured for a crawl using the Use extension methods, which allow three ways of implementing middleware:

  • Typed Middleware
  • Typed Middleware with Options
  • Delegate Middleware

Typed Middleware

"Typed Middleware" refers to a class that implements the ICrawlerMiddleware contract, for example:

public class CustomMiddleware : ICrawlerMiddleware
{
    public Task InvokeAsync( CrawlUrlContext context, CrawlUrlDelegate next )
    {
        Console.WriteLine( $"Executing typed middleware while crawling {context.Url}" );
        return next( context );
    }
}

// ...

var options = CrawlerOptionsBuilder.CreateDefault()
    .Use<CustomMiddleware>()
    .Build();

await crawler.CrawlAsync( new Uri(...), options );

Typed Middleware with Options

To let consumers of a Middleware supply an object that configures its behaviour, use the ICrawlerMiddleware<TOptions> contract.

When using the ICrawlerMiddleware<TOptions> contract, specify a constructor dependency on an instance of TOptions, and invoke the Use<TMiddleware, TOptions>( this ICrawlerOptionsBuilder builder, TOptions options ) extension method to configure the desired TOptions for a crawl:

public record CustomMiddlewareOptions( string Value );

public class CustomMiddleware : ICrawlerMiddleware<CustomMiddlewareOptions>
{
    private readonly CustomMiddlewareOptions options;

    // Accept options as ctor dependency
    public CustomMiddleware( CustomMiddlewareOptions options )
        => this.options = options;

    public Task InvokeAsync( CrawlUrlContext context, CrawlUrlDelegate next )
    {
        Console.WriteLine( $"Executing typed middleware with option '{options.Value}' while crawling {context.Url}" );
        return next( context );
    }
}

// ...

var options = CrawlerOptionsBuilder.CreateDefault()
    .Use<CustomMiddleware, CustomMiddlewareOptions>( new( "Hello, World!" ) )
    .Build();

await crawler.CrawlAsync( new Uri(...), options );

Delegate Middleware

The final way of implementing Middleware is "Delegate Middleware", which allows an inline delegate to be used:

var options = CrawlerOptionsBuilder.CreateDefault()
    .Use(
        ( CrawlUrlContext context, CrawlUrlDelegate next ) =>
        {
            Console.WriteLine( $"Executing delegate middleware while crawling {context.Url}" );
            return next( context );
        }
    )
    .Build();

await crawler.CrawlAsync( new Uri(...), options );

Delegate Middleware is especially useful for debugging & testing other Middleware in the crawl.

Compatible and additional computed target framework versions
.NET: net6.0 is compatible. The following were computed: net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows.

NuGet packages (2)

Showing the top 2 NuGet packages that depend on Earl.Crawler.Middleware:

Package Downloads
Earl.Crawler.Middleware.UrlScraping

Earl Middleware for scraping and enqueuing urls via the Earl.Crawler.Middleware.Html.Abstractions.IHtmlDocumentFeature when crawling a url.

Earl.Crawler

Earl is a suite of APIs for developing url crawlers & web scrapers driven by a middleware pattern similar to, and strongly influenced by, ASP.NET Core.

GitHub repositories

This package is not used by any popular GitHub repositories.

Version Downloads Last updated
0.0.0-alpha.0.111 150 3/30/2022
0.0.0-alpha.0.110 119 3/30/2022
0.0.0-alpha.0.109 122 3/30/2022
0.0.0-alpha.0.108 124 3/30/2022
0.0.0-alpha.0.107 127 3/30/2022
0.0.0-alpha.0.106 120 3/30/2022
0.0.0-alpha.0.104 125 3/29/2022
0.0.0-alpha.0.103 136 3/27/2022
0.0.0-alpha.0.102 131 3/27/2022
0.0.0-alpha.0.101 133 3/27/2022
0.0.0-alpha.0.100 125 3/26/2022
0.0.0-alpha.0.99 133 3/26/2022
0.0.0-alpha.0.98 129 3/25/2022
0.0.0-alpha.0.97 129 3/25/2022
0.0.0-alpha.0.96 131 3/25/2022
0.0.0-alpha.0.95 131 3/25/2022
0.0.0-alpha.0.94 133 3/25/2022
0.0.0-alpha.0.93 122 3/25/2022
0.0.0-alpha.0.92 127 3/24/2022
0.0.0-alpha.0.91 122 3/24/2022
0.0.0-alpha.0.90 123 3/24/2022
0.0.0-alpha.0.89 121 3/24/2022
0.0.0-alpha.0.88 117 3/23/2022
0.0.0-alpha.0.85 129 3/23/2022
0.0.0-alpha.0.84 123 3/23/2022
0.0.0-alpha.0.83 125 3/23/2022
0.0.0-alpha.0.82 127 3/23/2022
0.0.0-alpha.0.79 128 3/22/2022
0.0.0-alpha.0.78 125 3/22/2022
0.0.0-alpha.0.77 122 3/22/2022
0.0.0-alpha.0.76 122 3/22/2022
0.0.0-alpha.0.74 127 3/22/2022
0.0.0-alpha.0.73 125 3/22/2022
0.0.0-alpha.0.72 120 3/21/2022
0.0.0-alpha.0.71 131 3/21/2022
0.0.0-alpha.0.70 130 3/20/2022
0.0.0-alpha.0.69 131 3/19/2022
0.0.0-alpha.0.67 133 3/19/2022
0.0.0-alpha.0.66 125 3/19/2022
0.0.0-alpha.0.65 130 3/19/2022
0.0.0-alpha.0.62 131 3/19/2022
0.0.0-alpha.0.61 128 3/13/2022
0.0.0-alpha.0.60 128 3/13/2022
0.0.0-alpha.0.59 134 3/11/2022
0.0.0-alpha.0.58 131 3/7/2022
0.0.0-alpha.0.57 126 3/7/2022
0.0.0-alpha.0.56 118 3/7/2022
0.0.0-alpha.0.55 123 3/7/2022
0.0.0-alpha.0.54 119 3/7/2022
0.0.0-alpha.0.53 123 3/6/2022
0.0.0-alpha.0.52 123 3/6/2022
0.0.0-alpha.0.51 125 3/6/2022