Design of Personal Preference Inference from Questionnaire Data with Exemplary Application

Date Issued
2015
Author(s)
Sung, Ming-Chieh
URI
http://ntur.lib.ntu.edu.tw//handle/246246/276674
Abstract
In a rapidly developing digital society, computer-based identification and inference of personal preference is more important than ever for predicting market trends and tailoring services to customers. Questionnaires are often used as a direct way to assess personal preference. Current methods of questionnaire analysis, however, can only derive preferences that are stated directly in the questionnaires. Predicting a person's preferential answer to a new question requires a methodology that profiles the person and performs inference over a knowledge base of existing questionnaire data. This thesis designs a semantic-based methodology, the “Questionnaire data-based Personal preference Inference Engine” (QPIE), which predicts the preferential answer to a new question by analyzing the relationships between the new question and the existing questions. These relationships cover the semantic meaning of each question and its associated answer. QPIE innovatively integrates existing methods and their publicly available tools into an implemented system, and solves four challenges that arise in personal preference inference: i) constructing a knowledge base of questionnaires, including the personal preference profile derived from answers and the meaning of the questions; ii) representing the meaning of questions and answers numerically for further computer processing; iii) inferring semantic relationships between the existing questions and the new question; and iv) predicting the preferential answer. The design of QPIE consists of the following four parts, one per challenge:

(1) Semantic abstraction of single-sentence questions. It is challenging to extract, by computer processing, keywords that properly represent the meaning of a question. QPIE first exploits the grammatical structure of the sentence, adopting a probabilistic natural language parser, the Stanford Dependency Parser, to derive the dependency parse tree of each question.
Based on the parsing result, a Syntax-based Keyword Extraction Algorithm (SKEA) identifies keywords that represent the meaning of each single-sentence question.

(2) Numerical representation of a single-sentence question in “semantic” space. QPIE then applies word2vec to encode each keyword of a question as a numeric vector based on its semantics. Word2vec is a class of neural-network models that assigns each word a set of numerical coordinates in a semantic space learned from an unlabeled corpus. These word vectors serve as the foundation for calculating semantic similarity. Treating a sentence as a sequence of its syntax-based keywords, QPIE encodes the semantics of a single-sentence question by concatenating the vectors of those keywords into one vector.

(3) Semantic inference among questions. Once the semantic vectors of the questions are available, QPIE classifies the questions according to their respective answers, one class per preferential answer choice. To infer the preferential answer to a new question, QPIE adopts a support vector machine (SVM) as a probabilistic classifier. Using the semantic vectors of the questions, the SVM calculates the similarity of the new question to the existing questions in each class and the preference probability of choosing that class's answer.

(4) Preferential answer prediction for new questions based on real questionnaire data. A reference implementation realizes the QPIE methodology as a system that combines existing tools (the Stanford Dependency Parser, word2vec, and LIBSVM) with the newly designed SKEA. The components are integrated by sharing a data folder between MATLAB® and VirtualBox®. The training and testing data sets together consist of 44 single-sentence questions, each with the same four possible choices, {never, seldom, sometimes, often}, selected from the Taiwan Communication Survey.
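The encoding step described in part (2) can be illustrated with a minimal sketch. Everything specific here is an assumption made for illustration: the three-dimensional `TOY_VECTORS` stand in for real word2vec vectors, the fixed number of keyword slots and the zero-padding are one possible way to obtain equal-length vectors (the abstract does not say how questions with different keyword counts are handled), and cosine similarity is used only to show that concatenated vectors carry semantic closeness. It is not the SVM classifier that QPIE uses.

```python
import math

# Hypothetical 3-dimensional word vectors; a real system would load vectors
# trained by word2vec on a large unlabeled corpus.
TOY_VECTORS = {
    "watch": [0.9, 0.1, 0.0],
    "television": [0.8, 0.2, 0.1],
    "video": [0.7, 0.3, 0.1],
    "read": [0.1, 0.9, 0.0],
    "newspaper": [0.2, 0.8, 0.1],
}
DIM = 3           # dimensionality of each toy word vector
MAX_KEYWORDS = 2  # assumed fixed number of keyword slots per question


def encode_question(keywords):
    """Concatenate keyword vectors into one fixed-length question vector."""
    vector = []
    for keyword in keywords[:MAX_KEYWORDS]:
        vector.extend(TOY_VECTORS[keyword])
    # Zero-pad questions with fewer keywords than slots (an assumption).
    vector.extend([0.0] * (DIM * MAX_KEYWORDS - len(vector)))
    return vector


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Keyword lists as a keyword extractor might return them (hypothetical).
q_tv = encode_question(["watch", "television"])
q_video = encode_question(["watch", "video"])
q_paper = encode_question(["read", "newspaper"])

print(len(q_tv))  # 2 keyword slots x 3 dimensions
print(cosine_similarity(q_tv, q_video) > cosine_similarity(q_tv, q_paper))
```

With these toy values, the question about watching television lies closer to the one about watching video than to the one about reading a newspaper, which is the kind of structure a classifier trained on such vectors can exploit.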
In Experiment 1, higher preference probability is shown to be related to higher semantic similarity between training and testing questions. In Experiment 2, QPIE outperforms the random-guess approach by a statistically significant margin, with a personal average accuracy of 66.65% over 1,313 people in predicting the answers to 14 testing questions.

The contribution of this thesis is an innovative design of a semantic-based methodology, QPIE, that enriches questionnaire analysis with the capacity for personal preference inference: it predicts personal preferential answers to new questions from the semantic relationships among questions. Based on this design, an integrated system is developed and evaluated by prediction accuracy. In addition, the experiments show that the preference probability for a new question reflects its semantic similarity to existing questions, and further analysis of the preference probabilities offers insight into personal patterns. Specifically, the contributions are: (1) abstracting the meaning of single-sentence, multiple-choice questions; (2) representing each question numerically for computer processing based on externally trained word vectors; (3) semantic inference from the numerical representation of questions using an SVM model; (4) a reference implementation of QPIE as a system; (5) verification that the preference probability reflects semantic similarity; and (6) a personal average accuracy of 66.65%, significantly higher than random guessing (47.66%).
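The accuracy figure reported above can be read as follows, under the assumption that "personal average accuracy" means the per-person fraction of correctly predicted testing questions, averaged over all people. The abstract does not define the metric, so this sketch shows only that interpretation; the function names and the toy answers are hypothetical.

```python
def personal_accuracy(predicted, actual):
    """Fraction of testing questions where the prediction matches one person's answer."""
    if len(predicted) != len(actual):
        raise ValueError("predicted and actual must have the same length")
    correct = sum(p == a for p, a in zip(predicted, actual))
    return correct / len(actual)


def personal_average_accuracy(predictions_by_person, answers_by_person):
    """Mean of the per-person accuracies (assumed meaning of the metric)."""
    accuracies = [
        personal_accuracy(predictions_by_person[person], answers_by_person[person])
        for person in answers_by_person
    ]
    return sum(accuracies) / len(accuracies)


# Toy data: two people, four testing questions each (hypothetical values).
answers = {
    "p1": ["never", "often", "seldom", "sometimes"],
    "p2": ["often", "often", "never", "seldom"],
}
predictions = {
    "p1": ["never", "often", "seldom", "often"],      # 3 of 4 correct
    "p2": ["often", "seldom", "never", "sometimes"],  # 2 of 4 correct
}

print(personal_average_accuracy(predictions, answers))  # (0.75 + 0.5) / 2 = 0.625
```

Averaging per person rather than pooling all answers gives each respondent equal weight, which matters when respondents answer different numbers of questions.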
Subjects
personal preference
inference
questionnaire
dependency parser
syntax-based keyword
semantics
classification
SVM
Type
thesis
File(s)
Name
ntu-104-R02921016-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5): 3c14a35eec6d0e930bd7e036acce649a
