PSO-Based Feature Selection for Arabic Text Summarization
Ahmed M. Al-Zahrani (King Saud University, Saudi Arabia)
Hassan Mathkour (King Saud University, Saudi Arabia)
Hassan Abdalla (King Saud University, Saudi Arabia)
Abstract: Feature-based approaches play an important role in extractive summarization and are widely applied. In this paper, we use particle swarm optimization (PSO) to evaluate the effectiveness of different state-of-the-art features used to summarize Arabic text. The PSO is trained on the Essex Arabic Summaries Corpus to determine the best particle, which represents the most appropriate single feature or combination of features among eight informative and structural features regularly used by Arab summarizers. In each PSO iteration, the input text sentences are scored and ranked according to the selected features and their associated weights, and the top-ranking sentences are extracted to form the output summary. The output summary is then compared with a reference summary, with cosine similarity serving as the fitness function. The experimental results indicate that Arabs summarize texts in a simple way, focusing on the first sentence of each paragraph.
Keywords: Arabic text summarization, particle swarm optimization, feature selection, natural language processing
Categories: B.4.4, D.3.3, H.3.1, H.3.6, I.2.6, I.5.4
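Illustrative sketch: the fitness evaluation outlined in the abstract (weight the features, score and rank the sentences, extract the top-ranking ones, and compare the result with a reference summary by cosine similarity) might look roughly as follows for a single particle. This is a minimal sketch under stated assumptions, not the implementation used in this paper: the function names, the representation of a particle as a list of real-valued feature weights, the precomputed feature matrix, the summary length k, and the whitespace bag-of-words tokenization are all hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(text_a, text_b):
    """Cosine similarity between bag-of-words term-frequency vectors.

    Assumption: whitespace tokenization; a real system for Arabic would
    likely normalize and stem tokens first.
    """
    a, b = Counter(text_a.split()), Counter(text_b.split())
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def summarize(sentences, features, weights, k):
    """Score each sentence as a weighted sum of its feature values and
    return the k top-ranking sentences in their original document order.

    features[i][j] is the (hypothetical, precomputed) value of feature j
    for sentence i; weights[j] is the particle's weight for feature j.
    """
    scores = [sum(w * f for w, f in zip(weights, row)) for row in features]
    top = sorted(range(len(sentences)), key=lambda i: scores[i], reverse=True)[:k]
    return [sentences[i] for i in sorted(top)]


def fitness(weights, sentences, features, reference, k):
    """Fitness of one particle: similarity of its summary to the reference."""
    summary = " ".join(summarize(sentences, features, weights, k))
    return cosine_similarity(summary, reference)
```

In a PSO loop, each particle's position would supply `weights`, and `fitness` would be evaluated once per particle per iteration; a weight of zero effectively switches the corresponding feature off.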