CrawlSharp 1.0.1
A newer version of this package is available. See the version list below for details.
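- .NET CLI: dotnet add package CrawlSharp --version 1.0.1
- Package Manager: NuGet\Install-Package CrawlSharp -Version 1.0.1
- PackageReference: <PackageReference Include="CrawlSharp" Version="1.0.1" />
- Central Package Management (Directory.Packages.props): <PackageVersion Include="CrawlSharp" Version="1.0.1" />
- Central Package Management (project file): <PackageReference Include="CrawlSharp" />
- Paket CLI: paket add CrawlSharp --version 1.0.1
- Script & Interactive: #r "nuget: CrawlSharp, 1.0.1"
- File-based apps: #:package CrawlSharp@1.0.1
- Cake Addin: #addin nuget:?package=CrawlSharp&version=1.0.1
- Cake Tool: #tool nuget:?package=CrawlSharp&version=1.0.1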
<img src="https://raw.githubusercontent.com/jchristn/CrawlSharp/refs/heads/main/assets/icon.png" width="256" height="256">
CrawlSharp
CrawlSharp is a library and integrated webserver for crawling basic web content.
New in v1.0.x
- Initial release
Bugs, Feedback, or Enhancement Requests
Please feel free to start an issue or a discussion!
Simple Example, Embedded
Embedding CrawlSharp into your application is simple and requires minimal configuration. Refer to the Test project for a full example.
using CrawlSharp;
Settings settings = new Settings();
settings.Crawl.StartUrl = "http://www.mywebpage.com";
WebCrawler crawler = new WebCrawler(settings);
await foreach (WebResource resource in crawler.Crawl())
{
    Console.WriteLine(resource.Status + ": " + resource.Url);
}
Web Resources
Objects crawled using CrawlSharp have the following properties:
- Url - the URL from which the resource was retrieved
- ParentUrl - the URL from which the Url was identified
- Depth - the depth level at which the Url was identified
- Status - the HTTP status code returned when retrieving the Url
- ContentLength - the content length of the body returned when retrieving Url
- ContentType - the content type returned while retrieving Url
- Headers - a NameValueCollection with the headers returned while retrieving Url
- Data - a byte[] containing the data returned while retrieving Url
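As a rough illustration of how these properties might be consumed, the sketch below extends the embedded example to print each one. It uses only the members named above. The Preview helper is hypothetical (not part of CrawlSharp), and decoding Data as UTF-8 is an assumption that will not hold for binary content or other encodings.

```csharp
using System;
using System.Text;
using CrawlSharp;

Settings settings = new Settings();
settings.Crawl.StartUrl = "http://www.mywebpage.com";

WebCrawler crawler = new WebCrawler(settings);

await foreach (WebResource resource in crawler.Crawl())
{
    // Where the resource was found and what the server returned.
    Console.WriteLine($"[depth {resource.Depth}] {resource.Status} {resource.Url}");
    Console.WriteLine($"  parent: {resource.ParentUrl}");
    Console.WriteLine($"  type:   {resource.ContentType} ({resource.ContentLength} bytes)");

    // Headers is a NameValueCollection, so its keys can be enumerated directly.
    if (resource.Headers != null)
    {
        foreach (string key in resource.Headers.AllKeys)
            Console.WriteLine($"  header: {key} = {resource.Headers[key]}");
    }

    // Data is a byte[]; show a short text preview (UTF-8 is an assumption).
    Console.WriteLine("  body:   " + Preview(resource.Data, 80));
}

// Hypothetical helper: decodes the body as UTF-8 and truncates it to maxChars.
static string Preview(byte[] data, int maxChars)
{
    if (data == null || data.Length == 0) return string.Empty;
    string text = Encoding.UTF8.GetString(data);
    return text.Length <= maxChars ? text : text.Substring(0, maxChars);
}
```

The null checks on Headers and Data are defensive; whether CrawlSharp ever leaves them null (for example, on a failed request) is not stated here.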
REST API
CrawlSharp includes a project called CrawlSharp.Server which allows you to deploy a RESTful front-end for CrawlSharp. Refer to REST_API.md and also the Postman collection in the root of this repository for details.
CrawlSharp.Server will by default listen on host localhost and port 8000, meaning it will not accept requests from outside of the machine.
To change this, specify the hostname as the first argument and the port as the second, for example: dotnet CrawlSharp.Server myhostname.com 8888.
$ dotnet CrawlSharp.Server
_ _ _
___ _ __ __ ___ _| | _| || |_
/ __| '__/ _` \ \ /\ / / | |_ .. _|
| (__| | | (_| |\ V V /| | |_ _|
\___|_| \__,_| \_/\_/ |_| |_||_|
(c)2025 Joel Christner
Usage:
crawlsharp [hostname] [port]
Where:
[hostname] is the hostname or IP address on which to listen
[port] is the port number, greater than or equal to zero, and less than 65536
NOTICE
------
Configured to listen on local address 'localhost'
Service will not receive requests from outside of localhost
Webserver started on http://localhost:8000/
2025-03-01 20:39:17 joel-laptop Info [CrawlSharpServer] server started
Refer to REST_API.md for more information about using the RESTful API.
Running in Docker
A Docker image is available on Docker Hub as jchristn/crawlsharp. To run under Docker Compose, use the start scripts (compose-up.sh and compose-up.bat) and stop scripts (compose-down.sh and compose-down.bat) in the Docker directory.
Version History
Please refer to CHANGELOG.md for version history.
| Product | Compatible and additional computed target framework versions |
|---|---|
| .NET | net8.0 is compatible. net8.0-android was computed. net8.0-browser was computed. net8.0-ios was computed. net8.0-maccatalyst was computed. net8.0-macos was computed. net8.0-tvos was computed. net8.0-windows was computed. net9.0 was computed. net9.0-android was computed. net9.0-browser was computed. net9.0-ios was computed. net9.0-maccatalyst was computed. net9.0-macos was computed. net9.0-tvos was computed. net9.0-windows was computed. net10.0 was computed. net10.0-android was computed. net10.0-browser was computed. net10.0-ios was computed. net10.0-maccatalyst was computed. net10.0-macos was computed. net10.0-tvos was computed. net10.0-windows was computed. |
Dependencies (net8.0)
- HtmlAgilityPack (>= 1.11.74)
- RestWrapper (>= 3.1.4)
- SerializationHelper (>= 2.0.3)
| Version | Downloads | Last Updated |
|---|---|---|
| 1.0.15 | 283 | 3 months ago |
| 1.0.14 | 114 | 3 months ago |
| 1.0.13 | 198 | 4 months ago |
| 1.0.12 | 167 | 4 months ago |
| 1.0.11 | 183 | 4 months ago |
| 1.0.10 | 181 | 5 months ago |
| 1.0.9 | 138 | 6 months ago |
| 1.0.8 | 99 | 6 months ago |
| 1.0.7 | 92 | 6 months ago |
| 1.0.6 | 207 | 8 months ago |
| 1.0.5 | 189 | 8 months ago |
| 1.0.4 | 188 | 8 months ago |
| 1.0.3 | 189 | 8 months ago |
| 1.0.2 | 256 | 8 months ago |
| 1.0.1 | 229 | 8 months ago |
| 1.0.0 | 133 | 8 months ago |