Collaborative cyberporn filtering with collective intelligence
Journal
34th International ACM SIGIR Conference on Research and Development in Information Retrieval
Pages
1153-1154
ISBN
9781450309349
Date Issued
2011
Author(s)
Lee L.-H.
Abstract
This paper presents a user intent method for generating blacklists for collaborative cyberporn filtering. It proposes a novel porn detection framework that finds new pornographic web pages by mining user search behavior. The framework uses users' clicks in search query logs to select suspected web pages, with no extra human effort to label training data, and determines their categories from URL host name and path information, without using web page content. We use an MSN porn data set to evaluate the effectiveness of our method. The user intent approach achieves high precision while maintaining a favorably low false positive rate. In addition, a real-life filtering simulation shows that the user intent method, with its accumulative update strategy, achieves a blocking rate of 43.36% while keeping the over-blocking rate steadily below 7%.
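The two steps named in the abstract, selecting suspected pages from clicks and describing them by URL host name and path only, can be illustrated with a minimal Python sketch. It is not the paper's implementation: the function names, the (query, clicked URL) log format, the seed set of porn-intent queries, and the tokenization rule are all assumptions made for illustration.

```python
import re
from urllib.parse import urlparse


def suspected_urls(click_log, seed_queries):
    """Collect URLs clicked after a seed query, in first-seen order.

    click_log: iterable of (query, clicked_url) pairs (assumed log format).
    seed_queries: set of lowercase queries assumed to signal porn-seeking intent.
    """
    seen, suspects = set(), []
    for query, url in click_log:
        if query.lower() in seed_queries and url not in seen:
            seen.add(url)
            suspects.append(url)
    return suspects


def url_tokens(url):
    """Split the host name and path into lowercase alphanumeric tokens.

    Only the URL string is inspected; the page content is never fetched.
    """
    parsed = urlparse(url)
    host = re.findall(r"[a-z0-9]+", (parsed.hostname or "").lower())
    path = re.findall(r"[a-z0-9]+", parsed.path.lower())
    return {"host": host, "path": path}
```

The token lists returned by url_tokens could serve as features for any classifier that assigns a category to each suspected page; the abstract does not specify which classifier is used, so none is shown here.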
Subjects
Pornographic blacklists
Query log analysis
Searches-and-clicks
Type
conference paper
