International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Open Access Journal

ISSN : 2394-2320 (Online)

Monthly Journal for Computer Science and Engineering

Web Crawler: A Crawler for Efficiently Retrieving Relevant Data

Authors: Prabhu Alamkare ¹, Shrikant Giri ², Shubham Bardiya ³, Pradeep Gite ⁴

Date of Publication: 7th April 2016

Abstract: The World Wide Web is a rapidly growing and changing information source. Because of the Web's dynamic nature, it is becoming harder to find relevant and recent information. Web crawlers are among the most crucial parts of a search engine, since they collect pages from the Web. Downloading the most relevant pages from such a large Web is still a major challenge in the field of Information Retrieval Systems. WebCrawler uses a two-stage framework. In the first stage, WebCrawler performs site-based searching to visit a large number of pages. In the second stage, WebCrawler achieves fast in-site searching by extracting the most relevant links with adaptive link ranking. To achieve more accurate results, WebCrawler ranks websites to prioritize highly relevant ones.
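The two-stage idea described in the abstract can be sketched as follows. This is a minimal illustration and not the authors' implementation: the term-overlap scoring, the names `rank_sites` and `AdaptiveLinkRanker`, the feedback step size, and the `a.example` / `b.example` site names are all assumptions made for this sketch. It works on in-memory data only and performs no network access.

```python
import re


def tokens(text):
    """Return the lowercase alphanumeric tokens of a string."""
    return re.findall(r"[a-z0-9]+", text.lower())


def rank_sites(site_summaries, topic_terms):
    """Stage 1 (assumed scoring): order sites by topic-term overlap.

    site_summaries maps a site name to a short text describing it.
    """
    topic = set(topic_terms)
    scored = [(len(topic & set(tokens(summary))), site)
              for site, summary in site_summaries.items()]
    # Highest score first; ties are broken alphabetically for determinism.
    scored.sort(key=lambda pair: (-pair[0], pair[1]))
    return [site for _, site in scored]


class AdaptiveLinkRanker:
    """Stage 2 (assumed scheme): rank in-site links and learn from feedback."""

    def __init__(self, topic_terms):
        # Each topic term starts with weight 1.0; unseen terms count as 0.0.
        self.weights = {term: 1.0 for term in topic_terms}

    def score(self, anchor_text):
        """Sum the weights of the distinct tokens in a link's anchor text."""
        return sum(self.weights.get(t, 0.0) for t in set(tokens(anchor_text)))

    def rank(self, links):
        """Return (url, anchor_text) pairs, most promising first."""
        return sorted(links, key=lambda link: (-self.score(link[1]), link[0]))

    def feedback(self, anchor_text, was_relevant, step=0.5):
        """Raise the weight of terms whose links led to relevant pages."""
        if was_relevant:
            for t in set(tokens(anchor_text)):
                self.weights[t] = self.weights.get(t, 0.0) + step


# Hypothetical example data.
sites = {
    "a.example": "news about web crawler design and search engines",
    "b.example": "cooking recipes",
}
site_order = rank_sites(sites, ["crawler", "search"])

ranker = AdaptiveLinkRanker(["crawler", "search"])
links = [
    ("a.example/recipes", "recipes"),
    ("a.example/crawler", "crawler tutorial"),
]
ranked_links = ranker.rank(links)

# After a link proves relevant, its anchor terms gain weight for later ranking.
ranker.feedback("crawler tutorial", was_relevant=True)
```

In this sketch the "adaptive" part is simply the `feedback` call: terms from anchors that led to relevant pages accumulate weight, so later links containing those terms are ranked higher. A real crawler would likely use richer features (URL path, surrounding text) and a learned model.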
