Introduction to Web Scraping
What is web scraping? “Web scraping is the process of extracting data from websites. It involves using automated tools or scripts to retrieve specific information from web pages, which can then be used for analysis, research, or other purposes.”
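To make the definition concrete, here is a minimal sketch of the "extract specific information from a page" step, written in Python using only the standard library. The sample HTML, the class name LinkAndTitleParser, and the function scrape are all hypothetical names chosen for illustration; the page is inlined as a string so the sketch runs without any network access. A real scraper would first download the page and would usually rely on one of the dedicated tools listed in these notes.

```python
from html.parser import HTMLParser

# Hypothetical sample page, inlined so the sketch runs without network access.
SAMPLE_HTML = """
<html>
  <head><title>Example Store</title></head>
  <body>
    <h1>Products</h1>
    <a href="/items/1">Widget</a>
    <a href="/items/2">Gadget</a>
  </body>
</html>
"""


class LinkAndTitleParser(HTMLParser):
    """Collects the page title and the href of every <a> tag."""

    def __init__(self):
        super().__init__()
        self.title = ""
        self.links = []
        self._in_title = False  # True while we are between <title> and </title>

    def handle_starttag(self, tag, attrs):
        if tag == "title":
            self._in_title = True
        elif tag == "a":
            # attrs is a list of (name, value) pairs; keep only links that have an href.
            href = dict(attrs).get("href")
            if href is not None:
                self.links.append(href)

    def handle_endtag(self, tag):
        if tag == "title":
            self._in_title = False

    def handle_data(self, data):
        # Text nodes are only recorded while inside the <title> element.
        if self._in_title:
            self.title += data


def scrape(html):
    """Return (title, links) extracted from an HTML string."""
    parser = LinkAndTitleParser()
    parser.feed(html)
    parser.close()
    return parser.title.strip(), parser.links


if __name__ == "__main__":
    title, links = scrape(SAMPLE_HTML)
    print(title)
    print(links)
```

The parser only reacts to the tags it cares about and ignores everything else, which is the usual shape of a scraper: locate the elements that hold the data, then pull out their text or attributes.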
• Some common automated web scraping tools are:
Beautiful Soup,
Scrapy,
Selenium,
Octoparse,
WebHarvy,
ParseHub,
Apify,
Puppeteer,
MechanicalSoup,
WinAutomation, etc.
• You can also scrape in C#, using System.Net.Http (part of .NET) to download pages and HtmlAgilityPack (a third-party NuGet package) to parse the HTML.
Understanding HTML Elements and Attributes
A brief summary of some HTML elements and attributes:
Head: The <head> element contains metadata about the document. Examples: the title, the character encoding, links to external stylesheets, and other resources that the browser needs to render the page properly.
Body: The <body> element defines the main content area of a web page. It includes all the visible content that users see when they visit the page, including text, images, links, and other media.
Form: The