Date of Publication: 7th August 2016
Abstract: Huge amounts of data arise naturally in many domains, which poses a great challenge to traditional data mining techniques in terms of efficiency and effectiveness. Various techniques have evolved to extract accurate information from the collected data. Cluster analysis, or clustering, is the task of grouping a set of objects so that objects in the same group are more similar to each other than to those in other groups. Boosting is an iterative process that aims to improve the predictive accuracy of learning algorithms. Combining clustering with boosting improves the quality of the mining process. It is widely recognized that the boosting methodology provides superior results for classification problems, but boosting has some limitations. Different approaches have been introduced to overcome problems in boosting, such as overfitting and the troublesome area problem, in order to improve the performance and quality of the result. Cluster-based boosting addresses the limitations of boosting for supervised learning systems. In this paper, we propose the boost-clustering algorithm, a novel clustering methodology that exploits the general principles of boosting in order to provide a consistent partitioning of a dataset. The methodology is implemented in .NET, and the experimental results show that it supports data in various environments, even in the presence of noise. Good clustering performance is obtained effectively on large data sets.
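The abstract does not spell out the steps of the boost-clustering algorithm, so the following is only a rough illustration of how boosting principles can be carried over to clustering. It is a minimal sketch in Python (the paper's own implementation is in .NET), and every detail is an assumption made for illustration: plain k-means as the base clusterer, a "hardness" score defined as the ratio of the nearest to the second-nearest centroid distance, an exponential weight update, and majority voting after aligning cluster labels to the first round. The names `boost_cluster`, `kmeans`, and `dist` are hypothetical.

```python
import itertools
import math
import random


def dist(p, q):
    """Euclidean distance between two points given as tuples."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(p, q)))


def kmeans(sample, k, rng, iters=20):
    """Plain Lloyd's k-means on the sample; returns k centroids."""
    # Start from k distinct points so no two centroids coincide.
    centroids = rng.sample(sorted(set(sample)), k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in sample:
            nearest = min(range(k), key=lambda c: dist(p, centroids[c]))
            groups[nearest].append(p)
        # An empty cluster keeps its previous centroid.
        centroids = [
            tuple(sum(col) / len(g) for col in zip(*g)) if g else centroids[c]
            for c, g in enumerate(groups)
        ]
    return centroids


def boost_cluster(points, k, rounds=10, seed=0):
    """Illustrative boosting-style clustering (assumed scheme, k >= 2)."""
    rng = random.Random(seed)
    n = len(points)
    weights = [1.0 / n] * n
    votes = [[0] * k for _ in range(n)]
    reference = None
    for _ in range(rounds):
        # Resample the data; hard-to-cluster points are drawn more often.
        sample = rng.choices(points, weights=weights, k=n)
        centroids = kmeans(sample, k, rng)
        if reference is None:
            reference = centroids
        else:
            # Relabel so cluster j of this round matches cluster j of round 1.
            best = min(
                itertools.permutations(range(k)),
                key=lambda perm: sum(
                    dist(reference[j], centroids[perm[j]]) for j in range(k)
                ),
            )
            centroids = [centroids[j] for j in best]
        for i, p in enumerate(points):
            order = sorted(range(k), key=lambda c: dist(p, centroids[c]))
            votes[i][order[0]] += 1
            d1 = dist(p, centroids[order[0]])
            d2 = dist(p, centroids[order[1]])
            # Near 1 when the point sits between two clusters, near 0 otherwise.
            hardness = d1 / d2 if d2 > 0 else 1.0
            weights[i] *= math.exp(hardness)
        total = sum(weights)
        weights = [w / total for w in weights]
    # Final partition: the label each point received most often.
    return [max(range(k), key=lambda c: v[c]) for v in votes]
```

Calling `boost_cluster(points, k=2)` on a list of coordinate tuples returns one cluster label per point. The permutation-based label alignment is only practical for small `k`; a larger number of clusters would need a cheaper matching step.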