Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Clustering of Labeled And Unlabeled Data By Integrating Pre And Post Clustering Approaches

Authors: Mr. Asadi Srinivasulu (1), Dr. Ch. D. V. Subbarao (2), I. Muralikrishna (3)

Date of Publication: 15th March 2017

Abstract: Clustering is the process of organizing objects into groups whose members are similar in some way and differ significantly from the objects in other groups. There are two approaches, viz. pre-clustering and post-clustering. Pre-clustering is an unsupervised learning approach that assigns labels to objects in unlabeled data. The important pre-clustering approaches that we have considered are Dark Block Extraction (DBE), Cluster Count Extraction (CCE) and co-VAT (Visual Assessment of Cluster Tendency). The present work focuses on the pre-clustering approach. The limitations of these pre-clustering algorithms are: i) DBE cannot handle large data; ii) CCE suffers because of perplexing; iii) co-VAT works only with rectangular data. Our work proposes Extended Dark Block Extraction (EDBE), Extended Cluster Count Extraction (ECCE) and Extended co-VAT to overcome these limitations. Integrating the pre- and post-clustering approaches results in the following five steps: 1) Extracting a VAT image of an input dissimilarity matrix. 2) Performing image segmentation on the VAT image to obtain a binary image, followed by directional morphological filtering. 3) Applying a distance transform to the filtered binary image and smoothing the pixel values on the main diagonal axis of the image to form a smoothed signal. 4) Applying the first-order derivative and the fast Fourier transform to the smoothed signal to detect major peaks and valleys. 5) Applying the post-clustering approach, i.e. the k-means algorithm, to the major peaks and valleys in order to obtain refined clusters. The proposed algorithms, viz. EDBE, ECCE and Extended co-VAT, use VAT in combination with several image processing techniques and are applied to various real-world data sets such as IRIS, WINE and image data sets. These extended approaches use a Reordered Dissimilarity Image (RDI) that highlights potential clusters as a set of 'dark blocks' along the diagonal of the image. The simulation results show that EDBE, ECCE and Extended co-VAT outperform DBE, CCE and co-VAT in terms of time complexity and accuracy on labeled and unlabeled data.
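As an illustration of step 1 only, the following is a minimal Python sketch of how a VAT-style Reordered Dissimilarity Image might be computed from a square dissimilarity matrix. It follows the commonly described VAT ordering (start from one end of the largest dissimilarity, then repeatedly append the unvisited object closest to the objects already ordered). The function names `vat_order` and `vat_image` and the synthetic two-blob data are assumptions made for illustration; this is not the authors' EDBE, ECCE or Extended co-VAT implementation.

```python
import numpy as np


def vat_order(D):
    """Return a VAT-style ordering of the objects of a square dissimilarity matrix D."""
    D = np.asarray(D, dtype=float)
    n = D.shape[0]
    # Start from one of the two objects that realise the largest dissimilarity.
    start, _ = np.unravel_index(np.argmax(D), D.shape)
    order = [int(start)]
    remaining = np.ones(n, dtype=bool)
    remaining[start] = False
    # min_dist[q]: smallest dissimilarity between q and any object already ordered.
    min_dist = D[start].copy()
    for _ in range(n - 1):
        # Pick the unvisited object closest to the ordered set.
        candidates = np.where(remaining, min_dist, np.inf)
        nxt = int(np.argmin(candidates))
        order.append(nxt)
        remaining[nxt] = False
        min_dist = np.minimum(min_dist, D[nxt])
    return np.array(order)


def vat_image(D):
    """Return the reordered dissimilarity matrix (the RDI) and the ordering used."""
    D = np.asarray(D, dtype=float)
    order = vat_order(D)
    return D[np.ix_(order, order)], order


if __name__ == "__main__":
    # Hypothetical data: two well-separated 2-D blobs, shuffled together.
    rng = np.random.default_rng(0)
    points = np.vstack([
        rng.normal(loc=0.0, scale=0.5, size=(20, 2)),
        rng.normal(loc=10.0, scale=0.5, size=(20, 2)),
    ])
    points = points[rng.permutation(len(points))]
    # Pairwise Euclidean dissimilarities.
    D = np.linalg.norm(points[:, None, :] - points[None, :, :], axis=-1)
    rdi, order = vat_image(D)
    print("ordering:", order)
    print("RDI shape:", rdi.shape)
```

When the RDI is rendered as a grayscale image, low dissimilarities appear dark, so well-separated groups show up as dark blocks along the main diagonal. The later steps of the pipeline (segmentation, morphological filtering, distance transform, peak detection and k-means) would operate on that image.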
