Meena S.U., P. Parthasarathi
Abstract: Clustering is a fundamental operation in unsupervised document organization and information retrieval. It aims to find intrinsic structure in data and to organize the data into meaningful subgroups, placing instances that are similar to each other in the same cluster and instances that are very different from each other in different clusters. Hierarchical clustering is used to find the cluster relationships between data objects in a data set. A novel multi-viewpoint based similarity measure and two related clustering methods are proposed. The main difference between the novel method and existing ones is that existing measures use only a single viewpoint for clustering, whereas the Multi-Viewpoint Based Similarity measure uses many different viewpoints, which are objects assumed not to be in the same cluster as the two objects being measured. Using multiple viewpoints, a more informative assessment of similarity can be achieved. The two objects to be measured must be in the same cluster, while the points from which the measurement is taken must be outside that cluster. This is called Multiviewpoint-based Similarity, or MVS. Based on this measure, two criterion functions are proposed for document clustering. We compared the resulting clustering algorithm with algorithms that use other similarity measures in order to verify the performance of multiviewpoint clustering.
Keywords: Multi-viewpoint clustering, Document clustering, Hierarchical clustering, Information retrieval
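
The multi-viewpoint idea described in the abstract can be illustrated with a minimal sketch. It assumes one common formulation, not stated in the abstract itself, in which the similarity of two documents in the same cluster is the average, over all viewpoints outside that cluster, of the dot product of the two difference vectors seen from each viewpoint. The names `mvs` and `dot` and the plain-list vector representation are hypothetical choices made for illustration only.

```python
from typing import List, Sequence

Vector = Sequence[float]


def dot(a: Vector, b: Vector) -> float:
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def mvs(d_i: Vector, d_j: Vector, viewpoints: List[Vector]) -> float:
    """Multi-viewpoint similarity of d_i and d_j (assumed formulation).

    `viewpoints` holds the document vectors that lie OUTSIDE the cluster
    containing d_i and d_j. For each viewpoint d_h, the two documents are
    compared through the difference vectors (d_i - d_h) and (d_j - d_h);
    the result is the average of those dot products.
    """
    if not viewpoints:
        raise ValueError("at least one viewpoint outside the cluster is required")
    total = 0.0
    for d_h in viewpoints:
        u = [x - h for x, h in zip(d_i, d_h)]  # d_i as seen from d_h
        v = [x - h for x, h in zip(d_j, d_h)]  # d_j as seen from d_h
        total += dot(u, v)
    return total / len(viewpoints)


# Two documents in one cluster, two viewpoints taken from other clusters.
same_cluster_a = [2.0, 0.0]
same_cluster_b = [0.0, 2.0]
outside = [[0.0, 0.0], [1.0, 1.0]]
score = mvs(same_cluster_a, same_cluster_b, outside)
```

With a single viewpoint at the origin, this sketch reduces to the ordinary dot product of the two documents, which is one way to see how a single-viewpoint measure becomes a special case of the multi-viewpoint one.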