Review Papers | Computer Science & Engineering | India | Volume 4 Issue 1, January 2015
eDEW: Effective Data Extraction from Web
Abstract: The Internet has become the most popular medium for accessing the World Wide Web (WWW). With the enormous and growing amount of information on the Internet, accurate and efficient web data extraction has become necessary. Web pages, however, come in various kinds, containing structured, semi-structured, and unstructured data. A web page is composed of many information blocks. Besides informative blocks, web pages often contain distracting elements such as advertisements, copyright notices, and navigation panels, which are called noise. Extracting useful content or information from web pages has become a critical issue for web users and web miners, because the noise on a page can mislead the user. An effective web data extraction method that lets users separate the useful information from the noisy information is therefore urgently required. Web data extraction mainly deals with unstructured and semi-structured data.
Keywords: DOM Tree, Information Extraction, Pattern Tree, Web Mining, Web Data Extraction
Edition: Volume 4 Issue 1, January 2015
Pages: 398 - 401