WebReaper 3.0.7

dotnet add package WebReaper --version 3.0.7

NuGet\Install-Package WebReaper -Version 3.0.7

This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

<PackageReference Include="WebReaper" Version="3.0.7" />

For projects that support PackageReference, copy this XML node into the project file to reference the package.

<PackageVersion Include="WebReaper" Version="3.0.7" />
                    

                            Directory.Packages.props

<PackageReference Include="WebReaper" />
                    

                            Project file

For projects that support Central Package Management (CPM), copy this XML node into the solution Directory.Packages.props file to version the package.

paket add WebReaper --version 3.0.7

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

#r "nuget: WebReaper, 3.0.7"

#r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

#:package WebReaper@3.0.7

#:package directive can be used in C# file-based apps starting in .NET 10 preview 4. Copy this into a .cs file before any lines of code to reference the package.

#addin nuget:?package=WebReaper&version=3.0.7
                    

                            Install as a Cake Addin

#tool nuget:?package=WebReaper&version=3.0.7
                    

                            Install as a Cake Tool

The NuGet Team does not provide support for this client. Please contact its maintainers for support.

Interface	Description
IScheduler	Reading and writing from the job queue. By default, the in-memory queue is used, but you can provider your implementation
ICrawledLinkTracker	Tracker of visited links. A default implementation is an in-memory tracker. You can provide your own for Redis, MongoDB, etc.
IPageLoader	Loader that takes URL and returns HTML of the page as a string
IContentParser	Takes HTML and schema and returns JSON representation (JObject).
ILinkParser	Takes HTML as a string and returns page links
IScraperSink	Represents a data store for writing the results of web scraping. Takes the JObject as parameter
ISpider	A spider that does the crawling, parsing, and saving of the data

Project	Description
WebReaper	Library for web scraping
WebReaper.ScraperWorkerService	Example of using WebReaper library in a Worker Service .NET project.
WebReaper.DistributedScraperWorkerService	Example of using WebReaper library in a distributed way wih Azure Service Bus
WebReaper.AzureFuncs	Example of using WebReaper library with serverless approach using Azure Functions
WebReaper.ConsoleApplication	Example of using WebReaper library with in a console application

Version	Downloads	Last Updated
3.5.2	2,288	10/19/2024
3.5.1	2,930	8/15/2023
3.5.0	212	8/9/2023
3.4.0	331	4/17/2023
3.3.0	289	4/3/2023
3.2.0	277	4/2/2023
3.1.0	389	2/28/2023
3.0.8	607	11/12/2022
3.0.7	428	11/4/2022
3.0.6	404	11/3/2022
3.0.5	428	10/31/2022
3.0.4	440	10/29/2022
3.0.3	441	10/29/2022
3.0.2	464	10/24/2022
3.0.1	467	10/21/2022
3.0.0	471	10/7/2022

WebReaper 3.0.7

WebReaper

Overview

Install

Requirements

📋 Example:

Features:

Usage examples

API overview

SPA parsing example

Authorization

Distributed web scraping with Serverless approach

StartScrapting

WebReaperSpider

Extensibility

Adding a new sink to persist your data

Intrefaces

Main entities

Repository structure

Coming soon:

Features under consideration

License

net6.0

NuGet packages

GitHub repositories