Open Access Journal

ISSN : 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering

Efficient Document Classification using Phrases Generated by Semi-Supervised Hierarchical Latent Dirichlet Allocation

Authors: Rohit Agrawal, A.S. Jalal, S.C. Agarwal, Himanshu Sharma

Date of Publication: 14th February 2018

Abstract: Many models are available for document classification, such as support vector machines, neural networks, and the Naive Bayes classifier. These models are based on the Bag of Words model, which does not capture the semantic meaning of words. The meaning of a word is better represented by its occurrences and its proximity to other words in a particular document. To preserve this proximity, we use a “Bag of Phrases” model, which can capture the discriminative power of phrases for document classification. We propose a novel method to extract phrases from the corpus using the well-known topic model Semi-Supervised Hierarchical Latent Dirichlet Allocation (SSHLDA). SSHLDA integrates the phrases into the vector space model for document classification. Experiments show that classifiers perform efficiently with this Bag of Phrases model. The experimental results also show that SSHLDA performs better than other related representation models.
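
The difference between the two representations can be illustrated with a minimal sketch. It assumes that contiguous word bigrams stand in for phrases; it does not reproduce the SSHLDA phrase extraction described in the abstract, and the function names `bag_of_words` and `bag_of_phrases` are hypothetical, chosen only for this illustration.

```python
from collections import Counter


def bag_of_words(text):
    """Count individual lowercase tokens; word order is discarded."""
    return Counter(text.lower().split())


def bag_of_phrases(text, n=2):
    """Count contiguous n-word sequences, so word proximity is kept.

    Assumption: plain n-grams are used as a stand-in for phrases.
    """
    tokens = text.lower().split()
    return Counter(
        " ".join(tokens[i:i + n]) for i in range(len(tokens) - n + 1)
    )


# Two toy documents with the same words in a different order.
doc_a = "support vector machine"
doc_b = "machine vector support"

# Bag of Words cannot tell the two documents apart.
same_words = bag_of_words(doc_a) == bag_of_words(doc_b)

# The phrase-based representation can, because it keeps adjacency.
same_phrases = bag_of_phrases(doc_a) == bag_of_phrases(doc_b)

print(same_words, same_phrases)
```

In this toy case the two documents have identical word counts but share no bigram, which is the kind of proximity information a phrase-based vector space model is meant to retain.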
