Scraping 1.5.1
dotnet add package Scraping --version 1.5.1
NuGet\Install-Package Scraping -Version 1.5.1
<PackageReference Include="Scraping" Version="1.5.1" />
paket add Scraping --version 1.5.1
#r "nuget: Scraping, 1.5.1"
// Install Scraping as a Cake Addin
#addin nuget:?package=Scraping&version=1.5.1
// Install Scraping as a Cake Tool
#tool nuget:?package=Scraping&version=1.5.1
Scraping-Toolkit
Read this in other languages: English, Portuguese
Overview
Scraping-Toolkit is a fast framework for capturing information from web pages. It can be used to track websites and to extract data from, or insert data into, web pages. It suits a wide range of goals, from data mining to website monitoring and automated tests.
Prerequisites
How to use
To install the component, run the Install-Package command or visit https://www.nuget.org/packages/Scraping/
Install-Package Scraping
To load a page, provide the URL (FromUrl) and then call Load. One option is to let the tool try to identify the components on the page (TryGetComponents).
public void LoadComponents()
{
    var ret = new HttpRequestFluent(true)
        .FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .TryGetComponents(Scraping.Enums.TypeComponent.LinkButton | Scraping.Enums.TypeComponent.InputHidden)
        .Load();
}
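The `|` in the TryGetComponents call is a bitwise OR that combines several enum members into a single value. The sketch below illustrates that pattern using only the standard library. `ComponentKind` is a hypothetical stand-in: it assumes TypeComponent is a flags-style enum, and the real enum's members and values may differ.

```csharp
using System;

// Hypothetical stand-in for Scraping.Enums.TypeComponent (an assumption);
// the real enum's members and numeric values may differ.
[Flags]
enum ComponentKind
{
    None = 0,
    LinkButton = 1 << 0,
    InputHidden = 1 << 1,
    InputText = 1 << 2,
}

static class FlagsDemo
{
    static void Main()
    {
        // Combine two kinds into one value with bitwise OR.
        ComponentKind wanted = ComponentKind.LinkButton | ComponentKind.InputHidden;

        // HasFlag reports whether a given member is part of the combination.
        Console.WriteLine(wanted.HasFlag(ComponentKind.LinkButton)); // True
        Console.WriteLine(wanted.HasFlag(ComponentKind.InputText));  // False
    }
}
```

Each member occupies its own bit, so any subset of component types can be passed in one argument.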
The tool also includes many extension methods that make parsing easier.
public void AllTags()
{
    var ret = new HttpRequestFluent(true)
        .FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .Load();
    var byClassContain = ret.HtmlPage.GetByClassNameContains("Box mb-3 Box--");
    var byClassEquals = ret.HtmlPage.GetByClassNameEquals("Box mb-3 Box--condensed");
    var byId = ret.HtmlPage.GetById("readme");
}
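For comparison, the same three kinds of lookup can be written directly against HtmlAgilityPack, which this package lists as a dependency. This is only a sketch of equivalent queries on an inline HTML string. Whether the toolkit's extensions are implemented this way, or match class names in exactly this manner, is an assumption.

```csharp
using System;
using HtmlAgilityPack;

static class LookupSketch
{
    static void Main()
    {
        // Parse a small inline document instead of fetching a page.
        var doc = new HtmlDocument();
        doc.LoadHtml("<div id='readme' class='Box mb-3 Box--condensed'>Hello</div>");

        // Lookup by id.
        HtmlNode byId = doc.GetElementbyId("readme");

        // Class attribute equal to the full string.
        HtmlNodeCollection byClassEquals =
            doc.DocumentNode.SelectNodes("//*[@class='Box mb-3 Box--condensed']");

        // Class attribute containing a substring.
        HtmlNodeCollection byClassContains =
            doc.DocumentNode.SelectNodes("//*[contains(@class, 'Box mb-3 Box--')]");

        // SelectNodes returns null when nothing matches, so guard the counts.
        Console.WriteLine(byId?.InnerText);
        Console.WriteLine(byClassEquals?.Count ?? 0);
        Console.WriteLine(byClassContains?.Count ?? 0);
    }
}
```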
Examples
Below is an example that uses all the methods that can be chained before Load. The "test" folder contains many examples of Load usage and of the extensions. If you have any questions or suggestions, contact us or open an issue so we can improve the tool together.
// Requires: using System.Collections.Specialized; (for NameValueCollection)
public void LoadPageFull()
{
    var ret = new HttpRequestFluent(true);
    ret.OnLoad += Ret_OnLoad;
    NameValueCollection parameters = new NameValueCollection();
    parameters.Add("Name", "Value");
    // The string arguments below are placeholders; replace them with real values.
    ret.FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
        .TryGetComponents(Enums.TypeComponent.ComboBox | Enums.TypeComponent.DataGrid |
                          Enums.TypeComponent.Image | Enums.TypeComponent.InputCheckbox |
                          Enums.TypeComponent.InputHidden | Enums.TypeComponent.InputText |
                          Enums.TypeComponent.LinkButton)
        .RemoveHeader("name")
        .AddHeader("name", "value")
        .KeepAlive(true)
        .WithAccept("Accept")
        .WithAcceptEncoding("Accept-Encoding")
        .WithAcceptLanguage("Accept-Language")
        .WithAutoRedirect(true)
        .WithContentType("ContentType")
        .WithMaxRedirect(2)
        .WithParameters(parameters)
        .WithPreAuthenticate(true)
        .WithReferer("Referer")
        .WithRequestedWith("WithRequestedWidth")
        .WithTimeoutRequest(100)
        .WithUserAgent("User-Agent")
        .Load();
}
private void Ret_OnLoad(object sender, RequestHttpEventArgs e)
{
    // The event arguments expose the loaded page and the HTTP response.
    var page = e.HtmlPage;
    var response = e.ResponseHttp;
}
Contribution
You can contribute to the project as much as you want. Any advice, suggestion, or adjustment is always welcome. Here is a step-by-step guide to submitting your update.
- Fork the project;
- Create your feature branch (git checkout -b branch/Example);
- Commit your updates (git commit -m 'Message of any updates that were made to the program');
- Push to your branch (git push origin branch/Example);
- Open a pull request.
Licences
Distributed under the GNU Licence. See the LICENSE file for more information.
Contact
Otavio Alfenas: @otavioalfenas
E-mail: otavioalfenas@hotmail.com
Leandro Klaiber: @leandroklaiber
E-mail: leandroklaiber@gmail.com
Acknowledgement
Eduardo Chen - https://www.linkedin.com/in/EduardoChen
Edgard Yamashita - https://www.linkedin.com/in/eguilherme
Dependencies
- HtmlAgilityPack (>= 1.11.18)