International Journal of Science and Research (IJSR)

Call for Papers | Fully Refereed | Open Access | Double Blind Peer Reviewed

ISSN: 2319-7064


Review Papers | Computer Science & Engineering | India | Volume 4 Issue 1, January 2015

eDEW: Effective Data Extraction from Web

Shalaka Patil

Abstract: The Internet has become the most popular medium for accessing the World Wide Web (WWW). With the enormous and growing amount of information on the Internet, accurate and efficient web data extraction has become necessary. Web pages vary widely, however, and may contain structured, semi-structured, or unstructured data. A web page is composed of many information blocks. Besides the informative blocks, web pages often contain distracting elements such as advertisements, copyright notices, and navigation panels, which are collectively called noise. Extracting the useful content from web pages has therefore become a critical issue for web users and web miners, since noise on a page can mislead the user. An effective web data extraction method that lets users separate the useful information from the noisy information is urgently required. Web data extraction mainly deals with unstructured and semi-structured data.

Keywords: DOM Tree, Information Extraction, Pattern Tree, Web Mining, Web Data Extraction
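
To make the idea of separating informative blocks from noise more concrete, the following is a minimal sketch in Python using only the standard-library `html.parser` module. It is not the method proposed in the paper, which the abstract does not describe. The noise tag set, the class-name hints, and the names `InformativeTextExtractor` and `extract_informative_text` are all assumptions chosen for illustration.

```python
from html.parser import HTMLParser

# Assumed noise markers (illustrative only, not taken from the paper).
NOISE_TAGS = {"nav", "footer", "aside", "script", "style"}
NOISE_CLASS_HINTS = ("ad", "advert", "copyright", "sidebar")

# Void elements never get a closing tag, so they must not affect nesting depth.
VOID_TAGS = {"area", "base", "br", "col", "embed", "hr", "img",
             "input", "link", "meta", "source", "track", "wbr"}


class InformativeTextExtractor(HTMLParser):
    """Collects text that lies outside any subtree flagged as noise."""

    def __init__(self):
        super().__init__()
        self.noise_depth = 0   # > 0 while inside a noise subtree
        self.blocks = []       # text fragments kept as informative

    def _is_noise(self, tag, attrs):
        classes = (dict(attrs).get("class") or "").lower().split()
        return tag in NOISE_TAGS or any(c in NOISE_CLASS_HINTS for c in classes)

    def handle_starttag(self, tag, attrs):
        if tag in VOID_TAGS:
            return
        # Enter a noise subtree, or go one level deeper inside one.
        if self.noise_depth > 0 or self._is_noise(tag, attrs):
            self.noise_depth += 1

    def handle_endtag(self, tag):
        if tag in VOID_TAGS:
            return
        if self.noise_depth > 0:
            self.noise_depth -= 1

    def handle_data(self, data):
        text = data.strip()
        if self.noise_depth == 0 and text:
            self.blocks.append(text)


def extract_informative_text(html):
    """Return the text fragments found outside the assumed noise blocks."""
    parser = InformativeTextExtractor()
    parser.feed(html)
    parser.close()
    return parser.blocks
```

Calling `extract_informative_text` on a page that has a `<nav>` panel, an advertisement `<div class="ad">`, and a `<footer>` would return only the text of the remaining blocks. The depth counter assumes well-formed, properly nested markup; real pages would likely need a tolerant DOM parser and a less naive noise heuristic.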

Edition: Volume 4 Issue 1, January 2015

Pages: 398 - 401
