Kariuki Paul Wahome, Wekesa Bongo, Dr. Rimiru Richard Maina
Abstract: Numerous studies report that the volume of data in the world is growing exponentially, from terabytes to petabytes and beyond. This growth underscores the need for data mining, the process of discovering previously unknown facts and patterns in data. Data mining is gaining popularity as organizations seek to extract useful information and develop hypotheses from the massive data sets held in their data centers. Preprocessing is essential to the knowledge discovery in databases (KDD) process because it serves as its first stage, while classification is the most common data mining task. This paper uses the WEKA data mining tool, which supports various data mining tasks through different algorithms, to illustrate the importance of data preprocessing and the task of classification. Special focus is given to the procedure followed and the results obtained after carrying out the two processes in WEKA.
Keywords: Data Preprocessing, Classification, Data Mining, WEKA