Open Access Journal

ISSN: 2394-2320 (Online)

International Journal of Engineering Research in Computer Science and Engineering (IJERCSE)

Monthly Journal for Computer Science and Engineering


Efficient Document Classification using Phrases Generated by Semi-Supervised Hierarchical Latent Dirichlet Allocation

Authors: Rohit Agrawal¹, A. S. Jalal², S. C. Agarwal³, Himanshu Sharma⁴

Date of Publication: 14th February 2018

Abstract: Many models are available for document classification, such as support vector machines, neural networks, and the Naive Bayes classifier. These models are based on the bag-of-words representation, which does not capture the semantic meaning of words. Word meaning is better represented by word occurrences and by the proximity of words within a particular document. To preserve this proximity, we use a “bag of phrases” model, which can exploit the discriminative power of phrases for document classification. We propose a novel method to extract phrases from the corpus using the well-known topic model Semi-Supervised Hierarchical Latent Dirichlet Allocation (SSHLDA), and we integrate these phrases into a vector space model for document classification. Experiments show that classifiers perform efficiently with this bag-of-phrases model, and the results also show that SSHLDA outperforms other related representation models.
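
To make the pipeline concrete, here is a minimal sketch in Python. SSHLDA has no standard off-the-shelf implementation, so scikit-learn's unsupervised LatentDirichletAllocation stands in for the topic model here; the toy corpus, labels, and the top-phrase cutoff are illustrative assumptions, not the paper's actual setup.

```python
# Sketch of a bag-of-phrases classification pipeline, assuming:
# - unigrams/bigrams as candidate phrases,
# - plain LDA as a stand-in for SSHLDA (which selects topic-bearing phrases),
# - a toy corpus with made-up labels.
import numpy as np
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.decomposition import LatentDirichletAllocation
from sklearn.svm import LinearSVC

docs = [
    "support vector machine for text classification",
    "naive bayes classifier for spam filtering",
    "latent dirichlet allocation topic model for documents",
    "hierarchical topic model with nested chinese restaurant process",
]
labels = [0, 0, 1, 1]  # hypothetical class labels

# Step 1: candidate phrases = unigrams and bigrams ("bag of phrases").
vectorizer = CountVectorizer(ngram_range=(1, 2))
X = vectorizer.fit_transform(docs)

# Step 2: fit a topic model over the phrase vocabulary. The paper uses
# SSHLDA; unsupervised LDA stands in here as a simplification.
lda = LatentDirichletAllocation(n_components=2, random_state=0)
lda.fit(X)

# Step 3: keep phrases strongly concentrated in one topic (score near 1),
# mimicking the phrase-selection role the topic model plays in the paper.
phrase_scores = lda.components_.max(axis=0) / lda.components_.sum(axis=0)
keep = np.argsort(phrase_scores)[-50:]  # top-scoring phrases (cutoff assumed)
X_phrases = X[:, keep]

# Step 4: train a standard classifier on the reduced phrase vectors.
clf = LinearSVC().fit(X_phrases, labels)
print(clf.predict(X_phrases))
```

In this sketch the topic model acts purely as a phrase filter: the classifier still operates on ordinary count vectors, but over a vocabulary restricted to phrases the topic model finds coherent, which is the core idea behind the bag-of-phrases representation.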

References:

    1. C. J. Burges, “A tutorial on support vector machines for pattern recognition,” Data Mining and Knowledge Discovery, vol. 2, pp. 121–167, 1998.
    2. B. Dasarathy, “Nearest neighbor (NN) norms: NN pattern classification techniques,” IEEE Computer Society Press, 1991.
    3. T. Joachims, “A probabilistic analysis of the Rocchio algorithm with TFIDF for text categorization,” in Proceedings of the 14th International Conference on Machine Learning (ICML), pp. 143–151, 1997.
    4. I. Androutsopoulos, J. Koutsias, and K. V. Chandrinos, “An evaluation of naive Bayesian anti-spam filtering,” arXiv preprint cs/0006013, 2000.
    5. D. M. Blei, A. Y. Ng, and M. I. Jordan, “Latent Dirichlet allocation,” Journal of Machine Learning Research, vol. 3, pp. 993–1022, 2003.
    6. D. Wang, M. Thint, and A. Al-Rubaie, “Semi-supervised latent Dirichlet allocation and its application for document classification,” in Proceedings of the IEEE/WIC/ACM International Conferences on Web Intelligence and Intelligent Agent Technology, vol. 3, pp. 306–310, 2012.
    7. W. Zhang, T. Yoshida, and X. Tang, “Text classification based on multi-word with support vector machine,” Knowledge-Based Systems, vol. 21, pp. 879–886, 2008.
    8. D. Gujraniya and M. N. Murty, “Efficient classification using phrases generated by topic models,” in Proceedings of the 21st International Conference on Pattern Recognition (ICPR), pp. 2331–2334, 2012.
    9. S. Deerwester, S. T. Dumais, G. W. Furnas, T. K. Landauer, and R. Harshman, “Indexing by latent semantic analysis,” Journal of the American Society for Information Science, vol. 41, no. 6, pp. 391–407, 1990.
    10. T. Hofmann, “Probabilistic latent semantic analysis,” in Proceedings of Uncertainty in Artificial Intelligence (UAI), 1999.
    11. C. Chemudugunta, P. Smyth, and M. Steyvers, “Text modeling using unsupervised topic models and concept hierarchies,” arXiv preprint arXiv:0808.0973, 2008.
    12. D. Blei and J. Lafferty, “Correlated topic models,” Advances in Neural Information Processing Systems, vol. 18, p. 147, 2006.
    13. D. M. Blei and J. D. McAuliffe, “Supervised topic models,” in Proceedings of Neural Information Processing Systems (NIPS), 2007.
    14. S. Lacoste-Julien, F. Sha, and M. I. Jordan, “DiscLDA: Discriminative learning for dimensionality reduction and classification,” Advances in Neural Information Processing Systems, vol. 21, 2008.
    15. M. Rosen-Zvi, T. Griffiths, M. Steyvers, and P. Smyth, “The author-topic model for authors and documents,” in Proceedings of the 20th Conference on Uncertainty in Artificial Intelligence, pp. 487–494, AUAI Press, 2004.
    16. D. Ramage, C. D. Manning, and S. Dumais, “Partially labeled topic models for interpretable text mining,” in Proceedings of the 17th ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, pp. 457–465, ACM, 2011.
    17. D. Ramage, D. Hall, R. Nallapati, and C. D. Manning, “Labeled LDA: A supervised topic model for credit attribution in multi-labeled corpora,” in Proceedings of the 2009 Conference on Empirical Methods in Natural Language Processing, vol. 1, pp. 248–256, Association for Computational Linguistics, 2009.
    18. D. Blei, T. L. Griffiths, M. I. Jordan, and J. B. Tenenbaum, “Hierarchical topic models and the nested Chinese restaurant process,” Advances in Neural Information Processing Systems, vol. 16, p. 106, 2004.
    19. W. Li and A. McCallum, “Pachinko allocation: DAG-structured mixture models of topic correlations,” in Proceedings of the 23rd International Conference on Machine Learning, pp. 577–584, ACM, 2006.
    20. D. Mimno, W. Li, and A. McCallum, “Mixtures of hierarchical topics with Pachinko allocation,” in Proceedings of the 24th International Conference on Machine Learning, pp. 633–640, ACM, 2007.
    21. Y. Petinot, K. McKeown, and K. Thadani, “A hierarchical model of web summaries,” in Proceedings of the 49th Annual Meeting of the ACL: Human Language Technologies (short papers), vol. 2, pp. 670–675, ACL, 2011.
