Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Web Document Clustering Algorithm and Similarity Measure

Authors: Ms. S. M. Durge (1), Mr. Y. M. Kurwade (2), Dr. V. M. Thakare (3)

Date of Publication: 20th April 2017

Abstract: Clustering is an unsupervised method that divides data into disjoint subsets with high intra-cluster similarity and low inter-cluster similarity. Most approaches to web document clustering assign each object to precisely one of a set of clusters, so that the objects in one cluster are similar to each other. The similarity between objects is based on a measure of the distance between them. This works well when clustering compact and well-separated groups of data, but in many situations the resulting clusters differ from one run to the next. The proposed method uses the k-means++ algorithm, which addresses this problem by spreading the initial centers apart and thereby improves performance.
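The seeding idea mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it assumes each document has already been turned into a numeric vector (for example, term weights) and that squared Euclidean distance is the dissimilarity measure. The names `squared_distance` and `kmeans_pp_init` are hypothetical and chosen only for this illustration.

```python
import random


def squared_distance(a, b):
    """Squared Euclidean distance between two equal-length vectors."""
    return sum((x - y) ** 2 for x, y in zip(a, b))


def kmeans_pp_init(points, k, seed=0):
    """Pick k initial centers with k-means++ style seeding.

    The first center is drawn uniformly at random. Each later center is
    drawn with probability proportional to its squared distance from the
    nearest center chosen so far, which tends to spread the centers apart.
    """
    rng = random.Random(seed)
    centers = [rng.choice(points)]
    # d2[i] is the squared distance from points[i] to its nearest center.
    d2 = [squared_distance(p, centers[0]) for p in points]

    while len(centers) < k:
        total = sum(d2)
        if total == 0:
            # Every point coincides with a chosen center (duplicates only);
            # fall back to a uniform draw so the loop still terminates.
            centers.append(rng.choice(points))
            continue
        # Weighted draw: walk the cumulative weights until they exceed r.
        r = rng.random() * total
        cumulative = 0.0
        chosen = len(points) - 1
        for i, weight in enumerate(d2):
            cumulative += weight
            if cumulative > r:
                chosen = i
                break
        centers.append(points[chosen])
        # Update each point's distance to its nearest center.
        d2 = [
            min(old, squared_distance(p, points[chosen]))
            for p, old in zip(points, d2)
        ]
    return centers


if __name__ == "__main__":
    # Two small, well-separated groups of toy "document vectors".
    docs = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10)]
    print(kmeans_pp_init(docs, k=2, seed=42))
```

The chosen centers would then be passed to the usual k-means assignment and update steps. Because a point that already serves as a center has weight zero, it is not drawn again, which is one reason this seeding is less sensitive to an unlucky start than uniform random initialization.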

