https://scholars.lib.ntu.edu.tw/handle/123456789/413130
Title: | Objectionable content filtering by click-through data | Authors: | Lee L.-H.; Juan Y.-C.; Chen H.-H.; HSIN-HSI CHEN |
Keywords: | Click-through mining;Collaborative filtering;Internet censorship | Issue Date: | 2013 | Start (End) Page: | 1581-1584 | Source: | International Conference on Information and Knowledge Management | Abstract: | This paper explores users' browsing intents to predict the category of a user's next access during web surfing, and applies the results to objectionable content filtering. A user's access trail, represented as a sequence of URLs, reveals the contextual information of web browsing behavior. We extract behavioral features from each clicked URL, i.e., hostname, bag-of-words, gTLD, IP, and port, to develop a linear-chain CRF model for context-aware category prediction. Large-scale experiments show that our method achieves an accuracy of 0.9396 for objectionable access identification without requesting the corresponding page content. Error analysis indicates that the proposed model yields a low false positive rate of 0.0571. In real-life filtering simulations, the proposed model achieves a macro-averaged blocking rate of 0.9271 while maintaining a low macro-averaged over-blocking rate of 0.0575 when collaboratively filtering objectionable content as the dynamic web changes over time. Copyright is held by the owner/author(s). |
URI: | https://scholars.lib.ntu.edu.tw/handle/123456789/413130 | ISBN: | 9781450322638 | DOI: | 10.1145/2505515.2507849 |
Appears in Collections: | Department of Computer Science and Information Engineering |
Items in the IR system are protected by copyright, with all rights reserved, unless their copyright terms are specifically indicated otherwise.
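
The abstract lists the per-URL features (hostname, bag-of-words, gTLD, IP, and port) without saying how they are computed. The following is a minimal sketch of one possible extraction step using only the Python standard library. It is not the authors' implementation: the function name `url_features`, the dictionary keys, the choice to tokenize the path and query string, the reading of "IP" as a flag for an IP-literal host, and the use of the last hostname label as the TLD (which may be a ccTLD rather than a gTLD) are all assumptions made for illustration.

```python
import ipaddress
import re
from collections import Counter
from urllib.parse import urlsplit

# Default ports assumed when the URL does not state one explicitly.
DEFAULT_PORTS = {"http": 80, "https": 443}


def url_features(url):
    """Return a dict of illustrative features for a single clicked URL."""
    parts = urlsplit(url)
    host = parts.hostname or ""

    # Assumption: the "IP" feature is modeled as "the host is an IP literal".
    try:
        ipaddress.ip_address(host)
        host_is_ip = True
    except ValueError:
        host_is_ip = False

    # Assumption: the TLD is the last dot-separated label of the hostname.
    tld = "" if host_is_ip or "." not in host else host.rsplit(".", 1)[1]

    # Explicit port if present, otherwise the scheme's default (or None).
    port = parts.port or DEFAULT_PORTS.get(parts.scheme)

    # Bag-of-words over the path and query string, lowercased.
    text = (parts.path + " " + parts.query).lower()
    tokens = [t for t in re.split(r"[^a-z0-9]+", text) if t]

    return {
        "hostname": host,
        "tld": tld,
        "host_is_ip": host_is_ip,
        "port": port,
        "words": dict(Counter(tokens)),
    }


def trail_features(urls):
    """Map an access trail (a sequence of URLs) to a sequence of feature dicts."""
    return [url_features(u) for u in urls]
```

A trail converted this way is a list of feature dictionaries, one per click, which is the general shape of input that sequence labelers such as linear-chain CRFs consume. Nothing here fetches page content, in keeping with the abstract's statement that prediction works from the URLs alone.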