|  | Combining Psycho-linguistic, Content-based and Chat-based Features to Detect Predation in Chatrooms
               Javier Parapar (University of A Coruña, Spain)
 
               David E. Losada (Universidade de Santiago de Compostela, Spain)
 
               Álvaro Barreiro (University of A Coruña, Spain)
 
              Abstract: The Digital Age has brought great benefits for the human race but also some drawbacks. Nowadays, people from opposite corners of the world can communicate online via instant messaging services. Unfortunately, this has introduced new kinds of crime. Sexual predators have adapted their predatory strategies to these platforms and, usually, the target victims are kids. The authorities cannot manually track all threats because massive amounts of online conversations take place on a daily basis. Automatic methods for alerting about these crimes need to be designed. This is the main motivation of this paper, where we present a Machine Learning approach to identify suspicious subjects in chat-rooms. We propose novel types of features for representing the chatters and we evaluate different classifiers against the largest benchmark available. This empirical validation shows that our approach is promising for the identification of predatory behaviour. Furthermore, we carefully analyse the characteristics of the learnt classifiers. This preliminary analysis is a first step towards profiling the behaviour of the sexual predators when chatting on the Internet.
             
              Keywords: cybercrime, machine learning, psycho-linguistic analysis, sexual predation, support vector machines, text mining 
             Categories: H.3.0, H.4  |