A Review of Big Data Clustering Methods and Research Issues

Nweso Emmanuel Nwogbaga

doi:10.21275/SR20502183559

Abstract: Data mining is a method for discovering knowledge in a dataset. The world today is becoming data-driven in every domain, including education, health care, security, customer management, and smart cities. Unsupervised learning such as clustering is the most widely used big-data mining technique for grouping large datasets when there is no prior information about the classes in the data. The use of the Internet of Things (wearables, sensors, RFID) and social networks has drastically increased the volume of data in the cyber-physical world, resulting in what is called Big Data. The growth of big data, driven in part by cloud computing, has spurred research on knowledge discovery from this avalanche of data. Clustering is used to extract valuable hidden information from massive, complex data. As an unsupervised method, clustering has an advantage over supervised learning for knowledge discovery in huge datasets when the groups are not known in advance. In this review, we discuss big data mining techniques and narrow the focus to clustering methods. We also discuss different clustering approaches and the similarity measures used in clustering algorithms. Finally, we discuss the strengths and weaknesses of clustering approaches and the research issues in clustering big data for information discovery.

Keywords: Big Data, Big Data Mining, Clustering, IoT Big Data Clustering, Distance/Similarity Measures, Unsupervised Learning

How to Cite?: Nweso Emmanuel Nwogbaga, "A Review of Big Data Clustering Methods and Research Issues", Volume 9 Issue 5, May 2020, International Journal of Science and Research (IJSR), Pages: 253-264, https://www.ijsr.net/getabstract.php?paperid=SR20502183559, DOI: https://dx.doi.org/10.21275/SR20502183559
