International Journal of Advanced and Applied Sciences
Int. j. adv. appl. sci.
Print ISSN: 2313-626X
Volume 4, Issue 8 (August 2017), Pages: 43-49
Title: Hybrid approach for sentiment analysis of Arabic tweets based on deep learning model and features weighting
Author(s): Altyeb Altaher *
Faculty of Computing and Information Technology in Rabigh, King Abdulaziz University, Jeddah, Saudi Arabia
The increasing adoption of social media networks as platforms for sharing opinions has made sentiment analysis and opinion mining an active research area. Sentiment analysis of Twitter in particular has recently attracted considerable attention because of its many practical applications. Many sentiment analysis approaches have been presented for the English language, but efficient approaches are still needed for Arabic, whose structure differs from that of other languages. This paper proposes a hybrid approach for sentiment analysis of Arabic tweets that works in two stages. In the first stage, pre-processing methods such as stop-word removal, tokenization, and stemming are applied, and then two feature weighting algorithms (information gain and chi-square) are used to assign high weights to the most significant features of the Arabic tweets. In the second stage, a deep learning technique is employed to classify the Arabic tweets as either positive or negative. The performance of the proposed approach was compared with that of other classification methods, namely Decision Tree (DT), Neural Networks (NN), and Support Vector Machine (SVM), using a dataset collected from Arabic tweets. The proposed approach outperformed the other methods, achieving the highest accuracy and precision, 90% and 93.7%, respectively.
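The feature weighting stage described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the exact formulas, so the sketch assumes the standard textbook definitions of chi-square and information gain for a binary term-presence feature and a two-class (positive/negative) label. The toy corpus, the labels `"pos"`/`"neg"`, and all function names are hypothetical, and the tweets are assumed to be already tokenized, stop-word filtered, and stemmed.

```python
import math

def term_class_counts(docs, labels, term):
    """Return the 2x2 contingency counts for one term.

    n11: term present, positive   n10: term present, negative
    n01: term absent,  positive   n00: term absent,  negative
    """
    n11 = n10 = n01 = n00 = 0
    for tokens, label in zip(docs, labels):
        present = term in tokens
        positive = label == "pos"
        if present and positive:
            n11 += 1
        elif present:
            n10 += 1
        elif positive:
            n01 += 1
        else:
            n00 += 1
    return n11, n10, n01, n00

def chi_square(n11, n10, n01, n00):
    """Chi-square statistic of the 2x2 term/class table (0.0 if degenerate)."""
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom

def entropy(counts):
    """Shannon entropy in bits of a list of class counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum(c / total * math.log2(c / total) for c in counts if c > 0)

def information_gain(n11, n10, n01, n00):
    """Class entropy minus class entropy conditioned on term presence."""
    n = n11 + n10 + n01 + n00
    h_class = entropy([n11 + n01, n10 + n00])
    h_given_term = ((n11 + n10) / n) * entropy([n11, n10]) \
                 + ((n01 + n00) / n) * entropy([n01, n00])
    return h_class - h_given_term

# Hypothetical toy corpus: each tweet is a set of pre-processed tokens.
docs = [{"good", "service"}, {"good", "phone"},
        {"bad", "service"}, {"bad", "phone"}]
labels = ["pos", "pos", "neg", "neg"]

for term in ("good", "service"):
    counts = term_class_counts(docs, labels, term)
    print(term, chi_square(*counts), information_gain(*counts))
```

In this toy corpus, "good" appears only in positive tweets and therefore receives the maximum weight under both measures, while "service" is evenly split between the classes and receives a weight of zero. The weighted features would then be passed to the classifier in the second stage.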
© 2017 The Authors. Published by IASE.
This is an open access article under the CC BY-NC-ND license (http://creativecommons.org/licenses/by-nc-nd/4.0/).
Keywords: Arabic text, Deep learning, Sentiment analysis, Features weighting, Classification algorithms
Article History: Received 6 May 2017, Received in revised form 13 June 2017, Accepted 3 July 2017
Digital Object Identifier:
Altaher A (2017). Hybrid approach for sentiment analysis of Arabic tweets based on deep learning model and features weighting. International Journal of Advanced and Applied Sciences, 4(8): 43-49