Akshada K. Dhakade, Deepak C. Dhanwani
Abstract: With the rapid growth of the Internet, a vast number of web pages are available online. Search engines use a component called a web crawler to collect these pages from the web for storage and indexing. Many web pages are autonomous: they are updated independently of their users, so users do not know how often the sources change. The web crawler is the central part of the search engine; it follows hyperlinks from page to page and stores the links it has visited for future use. This paper presents the concepts of web crawlers, their architecture, and their various types.
Keywords: Search Engine, Web Crawler, Crawler Policies, Techniques