Scraping.Core 1.0.0

.NET CLI:
dotnet add package Scraping.Core --version 1.0.0

Package Manager:
NuGet\Install-Package Scraping.Core -Version 1.0.0
This command is intended to be used within the Package Manager Console in Visual Studio, as it uses the NuGet module's version of Install-Package.

PackageReference:
<PackageReference Include="Scraping.Core" Version="1.0.0" />
For projects that support PackageReference, copy this XML node into the project file to reference the package.

Paket CLI:
paket add Scraping.Core --version 1.0.0

Script & Interactive:
#r "nuget: Scraping.Core, 1.0.0"
The #r directive can be used in F# Interactive and Polyglot Notebooks. Copy this into the interactive tool or source code of the script to reference the package.

Cake:
// Install Scraping.Core as a Cake Addin
#addin nuget:?package=Scraping.Core&version=1.0.0

// Install Scraping.Core as a Cake Tool
#tool nuget:?package=Scraping.Core&version=1.0.0

Scraping-Toolkit

Read this in other languages: English, Portuguese

Overview

Scraping-Toolkit is a fast framework for capturing information from web pages. It can track websites and extract data from, or insert data into, web pages, which makes it useful for goals ranging from data mining to website monitoring and automated testing.

Prerequisites

HTML Agility Pack or superior

Framework

How to use

To install the component, run the Install-Package command below or download it from https://www.nuget.org/packages/Scraping/

Install-Package Scraping

To load a page, you must provide the URL (FromUrl). Optionally, you can let the tool try to identify the components on the page (TryGetComponents).

public void LoadComponents()
{
	var ret = new HttpRequestFluent(true)
		.FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
		.TryGetComponents(Scraping.Enums.TypeComponent.LinkButton | Scraping.Enums.TypeComponent.InputHidden)
		.Load();
}

The tool also includes many extension methods that make parsing easier.

public void AllTags()
{
	var ret = new HttpRequestFluent(true)
		.FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
		.Load();
	var byClassContain = ret.HtmlPage.GetByClassNameContains("Box mb-3 Box--");
	var byClassEquals = ret.HtmlPage.GetByClassNameEquals("Box mb-3 Box--condensed");
	var byId = ret.HtmlPage.GetById("readme");
}
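
For orientation, here is a minimal sketch of what lookups of this kind might look like when written directly against HTML Agility Pack, the library listed under Prerequisites. It is not the toolkit's implementation: the class name LookupSketch, the inline HTML sample, and the assumption that "contains" and "equals" compare against the whole class attribute string are all illustrative guesses.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack;

// Hypothetical sketch: class, method and sample HTML are not part of Scraping-Toolkit.
public static class LookupSketch
{
    public static void Main()
    {
        // Small inline page so the example does not depend on a live website.
        const string html =
            "<div id=\"readme\" class=\"Box mb-3 Box--condensed\">Hello</div>" +
            "<div class=\"Box mb-3 Box--blue\">Other</div>";

        var doc = new HtmlDocument();
        doc.LoadHtml(html);

        // Lookup by id, using HTML Agility Pack's built-in method.
        HtmlNode byId = doc.GetElementbyId("readme");

        // Assumption: "contains" means the class attribute contains the given text.
        var byClassContains = doc.DocumentNode.Descendants()
            .Where(n => n.GetAttributeValue("class", "").Contains("Box mb-3 Box--"))
            .ToList();

        // Assumption: "equals" means the class attribute matches the given text exactly.
        var byClassEquals = doc.DocumentNode.Descendants()
            .Where(n => n.GetAttributeValue("class", "") == "Box mb-3 Box--condensed")
            .ToList();

        Console.WriteLine(byId.InnerText);        // Hello
        Console.WriteLine(byClassContains.Count); // 2
        Console.WriteLine(byClassEquals.Count);   // 1
    }
}
```

The toolkit's own extensions may match class names differently (for example, per class token rather than on the whole attribute), so treat this only as a way to picture the kind of query involved.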

Examples

Below is an example that uses all the methods that can be chained before Load. The "test" folder contains many more examples of Load usage and of the extensions. If you have any questions or suggestions, contact us or open an issue so we can improve the tool together.

public void LoadPageFull()
{
	var ret = new HttpRequestFluent(true);
	// Register the handler that receives the page and response (see Ret_OnLoad).
	ret.OnLoad += Ret_OnLoad;
	// NameValueCollection requires: using System.Collections.Specialized;
	NameValueCollection parameters = new NameValueCollection();
	parameters.Add("Name", "Value");

	ret.FromUrl("https://github.com/otavioalfenas/Scraping-Toolkit")
		.TryGetComponents(Enums.TypeComponent.ComboBox | Enums.TypeComponent.DataGrid |
						Enums.TypeComponent.Image | Enums.TypeComponent.InputCheckbox |
						Enums.TypeComponent.InputHidden | Enums.TypeComponent.InputText |
						Enums.TypeComponent.LinkButton)
		.RemoveHeader("name")
		.AddHeader("name", "value")
		.KeepAlive(true)
		.WithAccept("Accept")
		.WithAcceptEncoding("Accept-Encoding")
		.WithAcceptLanguage("Accept-Language")
		.WithAutoRedirect(true)
		.WithContentType("ContentType")
		.WithMaxRedirect(2)
		.WithParameters(parameters)
		.WithPreAuthenticate(true)
		.WithReferer("Referer")
		.WithRequestedWith("WithRequestedWidth")
		.WithTimeoutRequest(100)
		.WithUserAgent("User-Agent")
		.Load();
}

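private void Ret_OnLoad(object sender, RequestHttpEventArgs e)
{
	// The event arguments expose the loaded page and the HTTP response.
	var htmlPage = e.HtmlPage;
	var responseHttp = e.ResponseHttp;
}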

Contribution

You are welcome to contribute to the project as much as you want. Any advice, suggestion, or adjustment is always welcome. Here is a step-by-step guide for submitting your changes.

  1. Fork the project;
  2. Create your feature branch (git checkout -b branch/Example);
  3. Commit your updates (git commit -m 'Message of any updates that were made to the program');
  4. Request permission to send your branch, then push to it (git push --set-upstream origin branch/Example);
  5. Open a pull request.

Licences

Distributed under the GNU Licence. See the LICENSE file for more information.

Contact

Otavio Alfenas: @otavioalfenas
E-mail: otavioalfenas@hotmail.com

Leandro Klaiber: @leandroklaiber
E-mail: leandroklaiber@gmail.com

Acknowledgement

Eduardo Chen - https://www.linkedin.com/in/EduardoChen
Edgard Yamashita - https://www.linkedin.com/in/eguilherme

Compatible and additional computed target framework versions

.NET Core: netcoreapp3.1 is compatible.
.NET (computed): net5.0, net5.0-windows, net6.0, net6.0-android, net6.0-ios, net6.0-maccatalyst, net6.0-macos, net6.0-tvos, net6.0-windows, net7.0, net7.0-android, net7.0-ios, net7.0-maccatalyst, net7.0-macos, net7.0-tvos, net7.0-windows, net8.0, net8.0-android, net8.0-browser, net8.0-ios, net8.0-maccatalyst, net8.0-macos, net8.0-tvos, net8.0-windows.

Version Downloads Last updated
1.0.0 537 10/2/2020