Survey Paper | Computer Science | Malaysia | Volume 9 Issue 5, May 2020
A Review of Big Data Clustering Methods and Research Issues
Abstract: Data mining is a method for discovering knowledge in a dataset. The world today is becoming data-driven in every domain, including education, health care, security, customer management, and smart cities. Unsupervised learning such as clustering is the most widely used big-data mining technique for grouping large datasets when there is no prior information about the classes in the data. The use of the Internet of Things (wearables, sensors, RFID) and social networks has drastically increased the volume of data in the cyber-physical world, resulting in what is called Big Data. The increase in big data as a result of cloud computing has spurred research on knowledge discovery from this avalanche of data. Clustering is used to extract valuable hidden information from massive, complex data. As an unsupervised method, clustering has an advantage over supervised learning for knowledge discovery in huge datasets when the groups are not known in advance. In this review, we discuss big data mining techniques and then narrow the focus to clustering methods. We also discuss different clustering approaches and the similarity measures used in clustering algorithms. Finally, we discuss the strengths and weaknesses of clustering approaches and the research issues in clustering big data for information discovery.
Keywords: Big Data, Big Data Mining, Clustering, IoT Big Data Clustering, Distance/Similarity Measures, Unsupervised Learning
Edition: Volume 9 Issue 5, May 2020
Pages: 253 - 264