Title: Cross-device matching approaches: word embedding and supervised learning
Authors: YEONG-SUNG LIN; Hsiao C.-H; Zhang S.-Y; Rung Y.-P; Chen Y.-X.
Type: journal article
Year: 2021
Record dates: 2022-04-26; 2022-04-26
ISSN: 13867857
DOI: 10.1007/s10586-021-03313-4
Scopus ID: 2-s2.0-85107430951
Scopus URL: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85107430951&doi=10.1007%2fs10586-021-03313-4&partnerID=40&md5=cf6da1706ac4fca852a314d33a6f2714
Repository URL: https://scholars.lib.ntu.edu.tw/handle/123456789/608051

Abstract: Due to the rapid development of diversified technology, people may use multiple electronic devices, such as personal computers, tablets, and smartphones, to connect to the Internet in their daily lives. Switching between devices enables a user to use e-commerce on various platforms. The complexity of consumer behavior is directly proportional to the number of devices involved. Additionally, as personal privacy regulations become stricter, user data on the Internet is increasingly anonymized. Thus, determining how devices are related is an indispensable step in achieving precision marketing or developing customized applications. This research uses the dataset provided by the CIKM Cup 2016 Challenge. Each device is represented by features extracted from its browsing logs. The computation cost is reduced by filtering the candidates for a target device instead of comparing all devices in pairs. Latent semantic indexing representations and supervised learning techniques are used to perform the filtering. Word embedding turns the semantics of text into vectors through an unsupervised neural ensemble. Adding feature engineering to the input vectors of the supervised classification enhances the classifier's discrimination. The classifier is used to determine the probability that any two instances belong to the same user. The main benefit of the implementation is that it combines the steps above into a cross-device linking mechanism, providing a baseline that respects computational limits while improving performance.

Copyright: © 2021, The Author(s), under exclusive licence to Springer Science+Business Media, LLC, part of Springer Nature.

Keywords: Cross-device tracking; Latent semantic indexing; Supervised learning; Word embedding; Consumer behavior; Embeddings; Personal computers; Privacy by design; Semantics; Computation costs; Electronic device; Extracting features; Feature engineerings; Latent Semantic Indexing; Personal privacy; Precision marketings; Supervised classification; Electron devices
SDGs: SDG12