Date of Publication: 19th December 2017
Abstract: Text document clustering is the process of partitioning documents into groups, called clusters, such that documents within a cluster are more similar to one another than to documents in other clusters. It has received growing attention because of its wide applicability in areas such as information retrieval, web mining, and search engines like Google. Clustering determines the similarity among documents and groups them on that basis. Fast, high-quality text document clustering algorithms play a vital role in navigating, summarizing, and organizing information. Conventional clustering algorithms, however, can typically produce only a locally optimal solution; a globally optimal solution can be attained by applying a fast, high-quality optimization technique. The main objective of this research work is to group documents by their content and to improve clustering accuracy. To this end, it uses two existing algorithms, DBSCAN and PSO (particle swarm optimization), and also proposes a hybrid algorithm that combines the two. The outcome is the identification of clusters of documents that share the same content.
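As an illustration of the similarity-based grouping the abstract describes, the following is a minimal sketch of density-based document clustering in plain Python. It is not the authors' implementation and does not include the PSO step or the proposed hybrid. The raw term-frequency representation, the cosine distance, the helper names (`tf_vectors`, `cosine_distance`, `dbscan`), the sample documents, and the values `eps=0.5` and `min_pts=2` are all assumptions chosen for the example.

```python
import math
from collections import Counter


def tf_vectors(docs):
    """Turn each document into a sparse term-frequency vector (assumed representation)."""
    return [Counter(doc.lower().split()) for doc in docs]


def cosine_distance(a, b):
    """Return 1 - cosine similarity between two sparse term-frequency vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return 1.0 - dot / norm if norm else 1.0


def dbscan(vectors, eps, min_pts):
    """Plain DBSCAN: one label per vector, with -1 marking noise."""
    n = len(vectors)
    labels = [None] * n  # None means "not visited yet"

    def neighbours(i):
        # All points within eps of point i (the point itself is included).
        return [j for j in range(n) if cosine_distance(vectors[i], vectors[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        seeds = neighbours(i)
        if len(seeds) < min_pts:
            labels[i] = -1  # provisionally noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in seeds if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # former noise point becomes a border point
            if labels[j] is not None:
                continue
            labels[j] = cluster
            nb = neighbours(j)
            if len(nb) >= min_pts:  # j is a core point, so keep expanding
                queue.extend(nb)
    return labels


# Hypothetical sample documents, chosen only to illustrate the idea.
docs = [
    "particle swarm optimization velocity",
    "particle swarm optimization position",
    "web search engine ranking",
    "web search engine mining",
    "pasta recipe",
]
labels = dbscan(tf_vectors(docs), eps=0.5, min_pts=2)
print(labels)  # -> [0, 0, 1, 1, -1]
```

In this sketch the two optimization documents and the two search-engine documents each form a cluster, while the unrelated document is labelled as noise. The result depends on `eps` and `min_pts`, and choosing them well is the kind of parameter-selection problem an optimizer such as PSO could be applied to; how the paper actually combines the two algorithms is not stated in the abstract.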