Volume 24 / Issue 3

DOI:   10.3217/jucs-024-03-0277


Linking User Online Behavior across Domains with Internet Traffic

Yuanyuan Qiao (Beijing University of Posts and Telecommunications, China)

Yan Wu (Beijing University of Posts and Telecommunications, China)

Yaobin He (China Electronics Technology Group Corp., China)

Libo Hao (Beijing University of Posts and Telecommunications, China)

Wenhui Lin (Aisino Corporation, China)

Jie Yang (Beijing University of Posts and Telecommunications, China)

Abstract: We are facing an era of Online With Offline (OWO) in the smart city: almost everyone uses various online services to connect with friends, watch videos, listen to music, download resources, and so on. Our online behaviors are separated across different domains, which can cause serious problems for cross-domain recommendation, advertising, and criminal tracking in the online and offline worlds, since linking the online behaviors that belong to the same natural person is a very challenging task. Existing methods usually tackle the user online behavior linkage problem by estimating the similarity of profile content between two different online services. However, profile content in heterogeneous online services is unreliable or misaligned, and the proposed methods are always limited to several services in a specific domain. To link an individual's online behavior across domains, in this paper we propose user Online Behavior Linkage across Domains (OBLD), a novel hybrid model that links user online behavior across domains using Internet traffic. It derives several significant attributes from users' online behaviors, such as user digital identity, various fingerprints of terminals and browsers, and the spatio-temporal behavior of users, and leverages a supervised classification method to discover the relationship between users' online behaviors. The proposed model also has an unsupervised setting for datasets with no or few labeled data, provided that a certain percentage of user digital identities can be extracted from the original dataset. Using real-world network traffic collected from two large provinces in China, we evaluate the OBLD model; the linkage precision reaches 89% and 97.9% for the two datasets, respectively. Notably, the inputs of OBLD, i.e., network traffic flows, cover all online behavior of users who connect to the Internet through monitored networks, which makes it possible to link the online behaviors of users across the whole online world.

Keywords: across domains, internet traffic, online behavior linkage, user digital identity, user identity linkage

Categories: L.7.0