Link-based Web Site Analysis
Date Issued
2004
Date
2004
Author(s)
Tsai, Yu-Li
DOI
zh-TW
Abstract
The growth of the World Wide Web has brought some unexpected social problems, one of which is the influx of material unsuitable for children, such as pornography and hate-group content.
How to shield impressionable minds from such material has become a challenge for computer scientists. One common approach is to build a content filtering tool that blocks websites containing improper information from being transmitted to the browser. Most content filtering software uses keyword comparison or content analysis to identify such websites. Although these methods are effective to some extent, they have drawbacks. For instance, the same word may represent different concepts in different cultures, which can lead to misdetection. When a purely text-based mechanism is applied across different cultural environments, blocking sites by mistake or failing to block the intended sites becomes a critical issue.
In this thesis, we propose a new approach to website analysis. Our method is based on the observation that related websites tend to refer to each other through hyperlinks. A graph-based algorithm that exploits this property has been designed and implemented. Using the collection of pornographic sites as an example, we show that the algorithm is efficient and effective in finding related sites. Additional experiments on butterfly-related and gun-related websites have also produced satisfactory results.
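The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea that related sites link to each other, not the method implemented in the thesis. It assumes a pre-crawled link graph given as a dictionary, and the function name `collect_related_sites`, the `min_links` threshold, and the example host names are all hypothetical.

```python
from collections import deque


def collect_related_sites(links, seeds, min_links=2):
    """Grow a set of related sites from seed sites over a hyperlink graph.

    links     -- dict mapping each site to the set of sites it links to
                 (assumed to come from an earlier crawl)
    seeds     -- sites already known to belong to the topic
    min_links -- hypothetical threshold: a site is admitted once it is
                 connected to at least this many already-admitted sites
    """
    # Treat a hyperlink in either direction as a connection between two sites.
    neighbors = {}
    for src, targets in links.items():
        for dst in targets:
            if src == dst:
                continue  # ignore self-links
            neighbors.setdefault(src, set()).add(dst)
            neighbors.setdefault(dst, set()).add(src)

    related = set(seeds)
    queue = deque(related)
    while queue:
        site = queue.popleft()
        for candidate in neighbors.get(site, ()):
            if candidate in related:
                continue
            # Admit the candidate only if enough admitted sites connect to it.
            if len(neighbors[candidate] & related) >= min_links:
                related.add(candidate)
                queue.append(candidate)  # its neighbors get re-examined later
    return related


# Hypothetical toy graph: four densely interlinked sites and one outlier.
example_links = {
    "a.example": {"b.example", "c.example"},
    "b.example": {"a.example", "c.example"},
    "c.example": {"d.example"},
    "d.example": {"a.example"},
    "news.example": {"a.example"},
}
print(sorted(collect_related_sites(example_links, {"a.example", "b.example"})))
```

In this toy graph, `news.example` is connected to only one admitted site and is therefore left out, while `c.example` and `d.example` are pulled in through their multiple connections. Raising `min_links` makes the collection stricter, and lowering it to 1 reduces the procedure to plain reachability from the seeds.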
Subjects
鏈結 (link)
網站 (website)
網址 (web address)
link
URL
hyper-link
Type
thesis
File(s)
Name
ntu-93-R91922102-1.pdf
Size
23.31 KB
Format
Adobe PDF
Checksum
(MD5):e63837b2b0f39106d7305c32b70a0924
