Please use this identifier to cite or link to this item: https://scholars.lib.ntu.edu.tw/handle/123456789/572822
Title: Large expert-curated database for benchmarking document similarity detection in biomedical literature search
Authors: Brown P; EN-CHENG YANG; et al.
Keywords: article; benchmarking; human; medical research; Medical Subject Headings; Medline; nonhuman; okapi; plant seed; search engine; statistical bias; systematic review
Issue Date: 2019
Journal Volume: 2019
Start page/Pages: 1-67
Source: Database
Abstract: 
Document recommendation systems for locating relevant literature have mostly relied on methods developed a decade ago. This is largely due to the lack of a large offline gold-standard benchmark of relevant documents that cover a variety of research fields such that newly developed literature search techniques can be compared, improved and translated into practice. To overcome this bottleneck, we have established the RElevant LIterature SearcH consortium consisting of more than 1500 scientists from 84 countries, who have collectively annotated the relevance of over 180 000 PubMed-listed articles with regard to their respective seed (input) article/s. The majority of annotations were contributed by highly experienced, original authors of the seed articles. The collected data cover 76% of all unique PubMed Medical Subject Headings descriptors. No systematic biases were observed across different experience levels, research fields or time spent on annotations. More importantly, annotations of the same document pairs contributed by different scientists were highly concordant. We further show that the three representative baseline methods used to generate recommended articles for evaluation (Okapi Best Matching 25, Term Frequency-Inverse Document Frequency and PubMed Related Articles) had similar overall performances. Additionally, we found that these methods each tend to produce distinct collections of recommended articles, suggesting that a hybrid method may be required to completely capture all relevant articles. The established database server located at https://relishdb.ict.griffith.edu.au is freely available for the downloading of annotation data and the blind testing of new methods. We expect that this benchmark will be useful for stimulating the development of new powerful techniques for title and title/abstract-based search engines for relevant articles in biomedical science. © The Author(s) 2019. Published by Oxford University Press.
URI: https://www.scopus.com/inward/record.uri?eid=2-s2.0-85082592913&doi=10.1093%2fdatabase%2fbaz085&partnerID=40&md5=3ac12f90c94597a4b51ea7b10caa0af0
https://scholars.lib.ntu.edu.tw/handle/123456789/572822
ISSN: 1758-0463
DOI: 10.1093/database/baz085
Appears in Collections: Department of Entomology (昆蟲學系)
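
The abstract names Okapi Best Matching 25 (BM25) as one of the three baseline methods used to generate recommended articles for a seed article. As a rough illustration of how such a baseline ranks candidates, here is a minimal sketch of the standard BM25 formula. It is not the consortium's implementation: the function names, the whitespace tokenizer, the parameter defaults (`k1=1.5`, `b=0.75`) and the toy texts are all assumptions made for this example.

```python
import math
from collections import Counter


def tokenize(text):
    """Lowercase and split on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def bm25_scores(seed, candidates, k1=1.5, b=0.75):
    """Score each candidate text against the seed text with Okapi BM25.

    k1 and b are common default values, assumed here for illustration.
    """
    docs = [tokenize(c) for c in candidates]
    if not docs:
        return []
    n_docs = len(docs)
    # Average candidate length; fall back to 1 if every candidate is empty.
    avg_len = sum(len(d) for d in docs) / n_docs or 1.0
    # Document frequency: number of candidates containing each term.
    df = Counter(term for d in docs for term in set(d))
    query_terms = set(tokenize(seed))

    scores = []
    for d in docs:
        tf = Counter(d)
        score = 0.0
        for term in query_terms:
            if term not in tf:
                continue
            # Smoothed inverse document frequency (always positive).
            idf = math.log(1 + (n_docs - df[term] + 0.5) / (df[term] + 0.5))
            # Term-frequency saturation with length normalization.
            norm = tf[term] + k1 * (1 - b + b * len(d) / avg_len)
            score += idf * tf[term] * (k1 + 1) / norm
        scores.append(score)
    return scores


# Hypothetical seed and candidate texts, not taken from the database.
seed = "benchmark for document similarity in biomedical literature search"
candidates = [
    "a benchmark database for literature search methods",
    "honey bee foraging behaviour under pesticide exposure",
    "document similarity detection with term weighting",
]
ranked = sorted(zip(bm25_scores(seed, candidates), candidates), reverse=True)
for score, text in ranked:
    print(f"{score:.3f}  {text}")
```

A candidate that shares no terms with the seed scores zero, so it falls to the bottom of the ranking. The abstract's observation that the three baselines return distinct sets of articles is what motivates combining several such scorers.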


Scopus citations: 7 (checked on Aug 11, 2022)
Web of Science citations: 8 (checked on Jul 24, 2022)
Page views: 7 (checked on Jun 19, 2022)

