Volume 22 / Issue 5

DOI:   10.3217/jucs-022-05-0691


Sentiment Classification of Spanish Reviews: An Approach based on Feature Selection and Machine Learning Methods

María del Pilar Salas-Zarate (Universidad de Murcia, Spain)

Mario Andres Paredes-Valverde (Universidad de Murcia, Spain)

Jorge Limon-Romero (Universidad Autónoma de Baja California Mexico, Mexico)

Diego Tlapa (Universidad Autónoma de Baja California Mexico, Mexico)

Yolanda Baez-Lopez (Universidad Autónoma de Baja California Mexico, Mexico)

Abstract: Sentiment analysis aims to extract users' opinions from review documents. There are currently two main approaches to sentiment analysis: semantic orientation and machine learning. Approaches based on Machine Learning (ML) methods work over a set of features extracted from the users' opinions. However, the high dimensionality of the feature vector reduces the effectiveness of this approach. To address this, we propose a sentiment classification method based on feature selection mechanisms and ML methods. The method uses a hybrid feature extraction technique based on POS patterns and dependency parsing. The extracted features are semantically enriched through common-sense knowledge bases. A feature selection method is then applied to eliminate noisy and irrelevant features. Finally, a set of classifiers is trained to classify unseen data. To assess the effectiveness of our approach, we conducted an evaluation in the movie and technological product domains and compared our proposal with well-known methods and algorithms used in the sentiment classification field. Our proposal obtained encouraging results, with F-measure values ranging from 0.786 to 0.898 for these domains.
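
As a rough illustration of the feature selection and evaluation steps mentioned in the abstract, the following is a minimal sketch only. The abstract does not say which feature selection method is used, so the chi-square statistic here is an assumption chosen for illustration, and the toy reviews, the feature names, and the helper names `chi_square`, `select_features`, and `f_measure` are all hypothetical.

```python
# Toy corpus (hypothetical): each review is a set of binary features plus a label.
docs = [
    ({"pelicula", "excelente"}, "pos"),
    ({"pelicula", "excelente", "actor"}, "pos"),
    ({"pelicula", "aburrida", "actor"}, "neg"),
    ({"pelicula", "aburrida"}, "neg"),
]


def chi_square(docs, feature, positive="pos"):
    """Chi-square score of a binary feature against a binary class label.

    Assumption: chi-square stands in for the (unspecified) selection method.
    """
    n11 = n10 = n01 = n00 = 0
    for features, label in docs:
        present = feature in features
        is_pos = label == positive
        if present and is_pos:
            n11 += 1  # feature present, positive review
        elif present:
            n10 += 1  # feature present, negative review
        elif is_pos:
            n01 += 1  # feature absent, positive review
        else:
            n00 += 1  # feature absent, negative review
    n = n11 + n10 + n01 + n00
    denom = (n11 + n01) * (n11 + n10) * (n10 + n00) * (n01 + n00)
    if denom == 0:
        # Feature (or class) is constant across the corpus: no information.
        return 0.0
    return n * (n11 * n00 - n10 * n01) ** 2 / denom


def select_features(docs, k):
    """Keep the k features with the highest chi-square score."""
    vocabulary = sorted({f for features, _ in docs for f in features})
    ranked = sorted(vocabulary, key=lambda f: chi_square(docs, f), reverse=True)
    return ranked[:k]


def f_measure(y_true, y_pred, positive="pos"):
    """Balanced F-measure (harmonic mean of precision and recall)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    if tp == 0:
        return 0.0
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


selected = select_features(docs, 2)
```

In this toy corpus, "pelicula" occurs in every review and "actor" is evenly split between the classes, so both score zero and are discarded, which is the kind of noisy or irrelevant feature the selection step is meant to remove.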

Keywords: feature selection methods, machine learning, natural language processing, opinion mining, sentiment analysis

Categories: H.3.3, I.2.2, I.2.7, I.7, L.3.2