A Study on Automatically Generating a Taxonomy for Specialized Knowledge Networks Using Existing Classification Schemes

2004-11-01
2024-05-18
https://scholars.lib.ntu.edu.tw/handle/123456789/703727

Abstract: This research project aims to design an automated approach that lets governments, enterprises, and other organizations quickly collect relevant information from the web and, following the organization's needs and a specified classification scheme, build a knowledge base that supports the organization's development. The design uses several web page classification hierarchies that already exist on the web, such as Yahoo and DMOZ, together with search engines, to collect pages related to the organization's knowledge structure. The collected pages are then clustered according to their content and their link structure, using the proposed Guided Clustering Method. Unlike previous approaches, the clustering result is intended to reflect more than content similarity: it follows the organization's predefined structure and draws on manual classification information already attached to pages, such as Yahoo's category labels or annotations by domain experts, so that the resulting classification better fits the organization's knowledge management needs. The project seeks to move beyond the traditional requirement of training a classifier on a large corpus, and attempts to organize web information into a database given only an incomplete predefined knowledge structure.

Keywords: Knowledge Network; Web Page Clustering