Scraping 3.0.0.3
dotnet add package Scraping --version 3.0.0.3
NuGet\Install-Package Scraping -Version 3.0.0.3
<PackageReference Include="Scraping" Version="3.0.0.3" />
paket add Scraping --version 3.0.0.3
#r "nuget: Scraping, 3.0.0.3"
// Install Scraping as a Cake Addin
#addin nuget:?package=Scraping&version=3.0.0.3
// Install Scraping as a Cake Tool
#tool nuget:?package=Scraping&version=3.0.0.3
Scraping-Toolkit
Read this in other languages: English, Portuguese
Overview
Scraping-Toolkit is a fast framework for capturing information from web pages. It can track websites and extract data from, or insert data into, pages. It suits goals ranging from data mining to website monitoring and automated testing.
Prerequisites
How to use
.NET Framework
To install the component, run the Install-Package command below or visit https://www.nuget.org/packages/Scraping/
Install-Package Scraping
.NET Core
To install the component, run the Install-Package command below or visit https://www.nuget.org/packages/Scraping.Core/
Install-Package Scraping.Core
To load a page, supply its URL with FromUrl and then call Load. Optionally, call TryGetComponents to let the tool try to identify the components on the page.
public void LoadComponents()
{
    var ret = new HttpRequestFluent(true)
        .FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .TryGetComponents(Scraping.Enums.TypeComponent.LinkButton | Scraping.Enums.TypeComponent.InputHidden)
        .Load();
}
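The call to TryGetComponents above combines two enum members with the `|` operator. The sketch below shows how that kind of combination works for a C# flags enum in general. `ComponentKind` is a hypothetical stand-in written for this illustration; it is an assumption that Scraping.Enums.TypeComponent behaves the same way, and its real members and values may differ.

```csharp
using System;

// Hypothetical stand-in for Scraping.Enums.TypeComponent (assumption:
// the real enum is a flags enum; its members and values may differ).
[Flags]
public enum ComponentKind
{
    None = 0,
    LinkButton = 1,
    InputHidden = 2,
    InputText = 4,
}

public static class FlagsDemo
{
    // Combine two kinds into a single value with bitwise OR.
    public static ComponentKind Combine() =>
        ComponentKind.LinkButton | ComponentKind.InputHidden;

    public static void Main()
    {
        var wanted = Combine();

        // HasFlag reports whether a given member is part of the combination.
        Console.WriteLine(wanted.HasFlag(ComponentKind.LinkButton)); // True
        Console.WriteLine(wanted.HasFlag(ComponentKind.InputText));  // False
    }
}
```

Each member uses a distinct power of two, so any combination can be stored in one value and tested member by member.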
The tool also includes many extension methods that make parsing easier.
public void AllTags()
{
    var ret = new HttpRequestFluent(true)
        .FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .Load();

    var byClassContain = ret.HtmlPage.GetByClassNameContains("Box mb-3 Box--");
    var byClassEquals = ret.HtmlPage.GetByClassNameEquals("Box mb-3 Box--condensed");
    var byId = ret.HtmlPage.GetById("readme");
}
Examples
The example below uses every method that can be chained before Load. The "test" folder contains many more examples of Load and the extensions. If you have a question or suggestion, contact us or open an issue so we can improve the tool together.
// NameValueCollection requires: using System.Collections.Specialized;
public void LoadPageFull()
{
    var ret = new HttpRequestFluent(true);

    // Subscribe before calling Load so the handler receives the result.
    ret.OnLoad += Ret_OnLoad;

    // Parameters sent with the request.
    NameValueCollection parameters = new NameValueCollection();
    parameters.Add("Name", "Value");

    ret.FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .TryGetComponents(Enums.TypeComponent.ComboBox | Enums.TypeComponent.DataGrid |
                          Enums.TypeComponent.Image | Enums.TypeComponent.InputCheckbox |
                          Enums.TypeComponent.InputHidden | Enums.TypeComponent.InputText |
                          Enums.TypeComponent.LinkButton)
        .RemoveHeader("name")
        .AddHeader("name", "value")
        .KeepAlive(true)
        .WithAccept("Accept")
        .WithAcceptEncoding("Accept-Encoding")
        .WithAcceptLanguage("Accept-Language")
        .WithAutoRedirect(true)
        .WithContentType("ContentType")
        .WithMaxRedirect(2)
        .WithParameters(parameters)
        .WithPreAuthenticate(true)
        .WithReferer("Referer")
        .WithRequestedWith("WithRequestedWidth")
        .WithTimeoutRequest(100)
        .WithUserAgent("User-Agent")
        .Load();
}

private void Ret_OnLoad(object sender, RequestHttpEventArgs e)
{
    // The event arguments expose the parsed page and the raw response.
    var page = e.HtmlPage;
    var response = e.ResponseHttp;
}
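The OnLoad subscription above follows the standard .NET event pattern. The sketch below illustrates that pattern in isolation, without the library. `DemoLoader` and `PageLoadedEventArgs` are hypothetical types invented for this illustration; they are not part of Scraping-Toolkit, and the real RequestHttpEventArgs may expose different members.

```csharp
using System;

// Hypothetical event payload (not the library's RequestHttpEventArgs).
public sealed class PageLoadedEventArgs : EventArgs
{
    public PageLoadedEventArgs(string html) => Html = html;
    public string Html { get; }
}

// Hypothetical loader that raises OnLoad once a page is "loaded".
public sealed class DemoLoader
{
    public event EventHandler<PageLoadedEventArgs> OnLoad;

    // Notify subscribers, if any, passing the loaded markup.
    public void Load(string html) =>
        OnLoad?.Invoke(this, new PageLoadedEventArgs(html));
}

public static class EventDemo
{
    public static int LastLength;

    // Subscribe first, then trigger the load; returns what the handler saw.
    public static int Run(string html)
    {
        var loader = new DemoLoader();
        loader.OnLoad += (sender, e) => LastLength = e.Html.Length;
        loader.Load(html);
        return LastLength;
    }

    public static void Main()
    {
        Console.WriteLine(Run("<html></html>")); // 13
    }
}
```

The handler must be attached before Load is called; a handler added afterwards would miss the notification.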
Contribution
You can contribute to the project as much as you want. Any advice, suggestion, or adjustment is always welcome. Here is a step-by-step guide to submitting your update.
1. Fork the project;
2. Create your feature branch (git checkout -b branch/Example);
3. Commit your updates (git commit -m 'Message of any updates that were made to the program');
Request permission to send your branch.
4. Send to your branch (git push --set-upstream origin branch/Example);
5. Open a Pull Request.
Licences
Distributed under the GNU Licence. See the LICENSE file for more information.
Contact
Otavio Alfenas: @otavioalfenas, E-mail: otavioalfenas@hotmail.com
Leandro Klaiber: @leandroklaiber, E-mail: leandroklaiber@gmail.com
Acknowledgement
Eduardo Chen - https://www.linkedin.com/in/EduardoChen
Edgard Yamashita - https://www.linkedin.com/in/eguilherme
Product | Versions Compatible and additional computed target framework versions. |
---|---|
.NET Framework | net461 is compatible. net462 was computed. net463 was computed. net47 was computed. net471 was computed. net472 was computed. net48 was computed. net481 was computed. |
- HtmlAgilityPack (>= 1.11.18)