Harshada Wagaskar, Prof. Gayatri Bhandari
Abstract: Data stream mining is the process of extracting knowledge from continuous, rapidly growing records of data. A data stream is an ordered sequence of instances. These instances can be read only once or a small number of times, using limited storage and computing capabilities. Examples of such data streams include ATM transactions, sensor readings, telephone call records and so forth. Data stream classification poses many difficulties in the data mining field. Four major challenges in data stream classification are addressed here: infinite length, concept-drift, concept-evolution and feature-evolution. A data stream is never-ending in length, hence it is not practical to store and use all the historical data for training. Concept-drift occurs as a result of changes in the underlying concepts. Concept-evolution happens when new classes emerge in the data. An example of concept-evolution is Twitter, where new topics emerge routinely in the stream of messages. Feature-evolution is a frequently occurring process in data streams, where new features emerge and old features fade away. This problem is investigated in this paper, and improved solutions are proposed. The current work additionally addresses the recurring class problem in data streams.
Keywords: Data stream, concept-evolution, novel class, outlier