Mohamed Omran, O.E. Emam, Laila Abd-Elatif, M. Thabet
Abstract: Text classification is the automated assignment of natural language documents to predefined categories based on their content. It is a primary requirement of text retrieval systems, which retrieve texts in response to a user query, and of text understanding systems, which transform text in some way, such as producing summaries, answering questions, or extracting data. Existing supervised learning algorithms for text classification need a sufficient number of documents to learn accurately. This paper presents an algorithm based on rough set theory for the automatic grouping of PDF documents, with potential application to Web document classification.
Keywords: rough sets, classifier, Elfagr newspaper