Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering


A study on Web Scraping

Authors: Niranjan Krishna 1, Anvith Nayak 2, Sana Badagan 3, Chethan Jetty 4, Dr. Sandhya N 5

Date of Publication: 1st December 2022

Abstract: Web scraping, or web harvesting, is a software technique for extracting information from websites. Web scrapers typically simulate human browsing of the World Wide Web, either by issuing low-level Hypertext Transfer Protocol (HTTP) requests directly or by embedding a full web browser. Web scraping is closely related to web indexing, an information-extraction technique that search engines use to index data on the Web with programmed bots. In comparison, web scraping focuses on transforming unstructured information on the Web (usually in HTML format) into structured information that can be saved and processed in a centralised database.
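
The transformation the abstract describes, from unstructured HTML to structured records, can be illustrated with a minimal sketch. It is not taken from the paper: the names `LinkExtractor`, `extract_links`, and `SAMPLE_HTML` are hypothetical, the input is an inline sample rather than a live page, and only Python's standard `html.parser` module is assumed.

```python
from html.parser import HTMLParser


class LinkExtractor(HTMLParser):
    """Hypothetical helper: collects one record per <a href="..."> element."""

    def __init__(self):
        super().__init__()
        self.records = []   # structured output: list of dicts
        self._href = None   # href of the anchor currently open, if any
        self._text = []     # text fragments seen inside that anchor

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            # attrs is a list of (name, value) pairs
            self._href = dict(attrs).get("href")
            self._text = []

    def handle_data(self, data):
        # Only keep text that appears inside an anchor with an href
        if self._href is not None:
            self._text.append(data)

    def handle_endtag(self, tag):
        if tag == "a" and self._href is not None:
            self.records.append(
                {"text": "".join(self._text).strip(), "href": self._href}
            )
            self._href = None


def extract_links(html):
    """Turn an HTML string into a list of {'text', 'href'} records."""
    parser = LinkExtractor()
    parser.feed(html)
    parser.close()
    return parser.records


# Illustrative sample page (an assumption, not data from the paper)
SAMPLE_HTML = """<html><body>
<h1>Articles</h1>
<ul>
<li><a href="/a1">First article</a></li>
<li><a href="/a2">Second <b>article</b></a></li>
</ul>
</body></html>"""

if __name__ == "__main__":
    for row in extract_links(SAMPLE_HTML):
        print(row)
```

In a real scraper the HTML string would come from an HTTP request (for example via the standard library's `urllib.request.urlopen`) or from a browser automation tool, and the resulting records would be written to a database rather than printed.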

Reference :

    1.  https://academic.oup.com/bib/article/15/5/788/2422275?login=true
    2.  https://www.researchgate.net/profile/Bo-Zhao-3/publication/317177787_Web_Scraping/links/5c293f85a6fdccfc7073192f/WebScraping.pdf
