Author: Ganesh.T
Date of Publication: 30th March 2018
Abstract: Data usage is increasing day by day. A huge amount of storage is required to handle the millions of tweets and shares posted on social networks (Twitter, Facebook, WhatsApp, and YouTube) every second. Databases play a vital role in data warehousing and data mining. Storing data in a large central repository is known as data warehousing. Search engines today struggle to keep up with search engine optimization techniques, so data analysts are under pressure to fetch data from the data warehouse efficiently. Classification with imbalanced datasets has attracted considerable interest from researchers in recent years. Accordingly, various classification techniques are used to handle large volumes of newly arriving data. Many approaches have been designed to address this problem from different perspectives, such as data pre-processing, algorithm modification, and cost-sensitive learning. Constructing fast and accurate classifiers for large datasets is an important task in data mining and knowledge discovery. This paper reviews various classification techniques in data mining and considers how to improve classifier accuracy.