Improved Summarization of Chinese Spoken Documents by Probabilistic Latent Semantic Analysis (PLSA) with Further Analysis and Integrated Scoring
Journal
2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006, Proceedings
Pages
26-29
Date Issued
2006
Author(s)
Sheng-yi Kong
Abstract
In a previous paper [1], two new scoring measures obtained from Probabilistic Latent Semantic Analysis (PLSA), Topic Significance (TS) and Topic Entropy (TE), were shown to outperform the very successful baseline Significance Score (SS) in selecting important sentences for the summarization of spoken documents. In this paper, extensive experiments using ROUGE scores were analyzed in detail with respect to different parameters at different summarization ratios. It was also found that integrating these two scoring measures offered further improvements, and that special consideration of the structure of the Chinese language was also helpful when summarizing Chinese spoken documents. ©2006 IEEE.
Event(s)
2006 IEEE ACL Spoken Language Technology Workshop, SLT 2006
Subjects
Probabilistic latent semantic analysis; Spoken document; Summarization
Other Subjects
Image retrieval; Information theory; Learning systems; Probability; Semantics; Chinese language; Probabilistic latent semantic analysis (PLSA); Scoring measures; Spoken documents; Spoken languages; Linguistics; Summarization
Description
Aruba
Type
conference paper
File(s)
Name
25.pdf
Size
23.23 KB
Format
Adobe PDF
Checksum
(MD5):2561a740b31f298f758dd8135604d28f