Author: Srinivasa R B.E
Date of Publication: 17th May 2017
Abstract: Demographic attributes of clients, such as gender, age, and education, give e-commerce service providers and specialists important information for merchandising and for personalizing web applications. However, online clients often do not provide this information because of privacy and security concerns. In this work, we propose a method for predicting the gender of clients from their catalogue viewing data on e-commerce systems, such as the date and time of access and the list of categories and products viewed. We use machine learning techniques and investigate a number of characteristics derived from catalogue viewing information to predict the gender of viewers. Experiments were carried out on the datasets. The result, a balanced accuracy of 81.2%, shows that basic characteristics such as viewing time and product/category characteristics, used together with more advanced characteristics such as product/category sequence and transfer characteristics, effectively support gender prediction of clients.
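To make the two technical ideas in the abstract concrete, the following is a minimal Python sketch, not the authors' implementation. It shows one plausible way to derive basic characteristics (viewing time, category counts) and sequence/transfer characteristics (consecutive category pairs) from a single viewing session, and how balanced accuracy is computed as the mean of per-class recall. The function names, feature names, and sample data are all assumptions made for illustration; the paper's actual feature set and classifier are not specified here.

```python
from collections import Counter
from datetime import datetime


def session_features(start, end, categories):
    """Turn one viewing session into a sparse feature dict.

    All field and feature names are hypothetical, not taken from the paper.
    """
    features = Counter()
    # Basic time characteristics: hour of day, weekday, session length.
    features[f"hour={start.hour}"] = 1
    features[f"weekday={start.weekday()}"] = 1
    features["duration_min"] = (end - start).total_seconds() / 60.0
    # Basic category characteristics: how often each category was viewed.
    for cat in categories:
        features[f"cat={cat}"] += 1
    # Sequence/transfer characteristics: consecutive category pairs.
    for prev, cur in zip(categories, categories[1:]):
        features[f"move={prev}->{cur}"] += 1
    return dict(features)


def balanced_accuracy(y_true, y_pred):
    """Mean of per-class recall, so each class counts equally."""
    recalls = []
    for label in sorted(set(y_true)):
        idx = [i for i, y in enumerate(y_true) if y == label]
        hits = sum(1 for i in idx if y_pred[i] == label)
        recalls.append(hits / len(idx))
    return sum(recalls) / len(recalls)


# Made-up sample session and labels, for illustration only.
session = session_features(
    datetime(2017, 5, 17, 20, 15),
    datetime(2017, 5, 17, 20, 45),
    ["shoes", "shoes", "bags"],
)
score = balanced_accuracy(
    ["f", "f", "f", "m"],
    ["f", "f", "m", "m"],
)
```

Balanced accuracy is a natural choice when one gender dominates the data, because a classifier that always predicts the majority class scores only 50% on it, whatever the class ratio.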