Research Paper | Computer Science & Engineering | India | Volume 4 Issue 7, July 2015
Backup Anomaly Identification with R and Hadoop
Ravindra Phule, Madhav Ingle
Abstract: In recent years, big data has become a buzzword in business, computer science, information studies, information systems, statistics, and many other fields. As technology advances, the volume of data generated keeps growing. This growth affects individuals and businesses, the private and public sectors, and educational and commercial institutions alike. It is nearly universal and therefore warrants further study. Increasingly large-scale applications are generating unprecedented amounts of data. To apply data mining techniques to collected backup job metadata, we integrate a big data analytics platform with the existing enterprise backup architecture. We build a scalable data mining platform to store, process, and analyze the overall data set with advanced data mining techniques. We leverage open source tools to reduce overall cost while preserving flexibility.
Keywords: Big data, big data analytics, Hadoop, data analytics, kNN, RHadoop, framework
Edition: Volume 4 Issue 7, July 2015
Pages: 2241 - 2244