Volume 13 / Issue 10

DOI:   10.3217/jucs-013-10-1471

 

Machine Learning-Based Keywords Extraction for Scientific Literature

Chunguo Wu (Jilin University and Beijing Jiaotong University, China)

Maurizio Marchese (University of Trento, Italy)

Jingqing Jiang (Jilin University, China)

Alexander Ivanyukovich (University of Trento, Italy)

Yanchun Liang (Jilin University, China)

Abstract: With the growing interest in the Semantic Web, keywords/metadata extraction is playing an increasingly important role. Keywords extraction from documents is a complex task in natural language processing. Ideally, this task calls for sophisticated semantic analysis. However, the complexity of the problem makes current semantic analysis techniques insufficient. Machine learning methods can support the initial phases of keywords extraction and can thus improve the input to further semantic analysis phases. In this paper we propose a machine learning-based keywords extraction method for a given document domain, namely scientific literature. More specifically, the least squares support vector machine is used as the machine learning method. The proposed method takes advantage of machine learning techniques and moves the complexity of the task to the process of learning from appropriate samples obtained within a domain. Preliminary experiments show that the proposed method is capable of extracting keywords from the domain of scientific literature with promising results.

Keywords: keywords extraction, machine learning, metadata extraction, support vector machine

Categories: H.3.7, H.5.4, M.0, M.7, M.9