Automated Classification of Taiwanese Land Deeds
Date Issued
2008
Date
2008
Author(s)
Lu, Chia-Ching
Abstract
Before the Japanese modernized land administration during their occupation of Taiwan (1895 to 1945), handwritten land deeds were the only proof of the sale or leasing of land. Land deeds are thus an important primary source for studying Taiwanese society before 1895. Collaborating with the National Taiwan University Library, the Digital Archives Laboratory of the Department of Computer Science of NTU built a full-text digital library of primary historical documents, the Taiwan History Digital Library (THDL), which includes, among other things, 21,399 land deeds in searchable full text. We believe that it is the largest database of its kind in existence. To make the contents easier to understand and use, we attempt in this thesis to categorize the collection. The difficulty arises from the fact that the land deeds in THDL came from different sources. Although most of them (21,121) also contain metadata, the metadata were produced by different people using different standards. Thus, one cannot easily classify the deeds using the descriptions provided in the metadata. We first studied existing classification schemes and chose the one that seemed most suitable for our purpose, which classifies land deeds into 14 categories. (To simplify the task, we considered only the deeds with metadata.) We then designed an algorithm that takes each collection and reclassifies its contents according to the 14 categories. Our method successfully classified 20,698 of the land deeds. The remaining 423 required examination by experts. We also discovered that two more categories, zugu (租榖), rental charges in rice, and qiwei (契尾), official certification of a land transaction, could be added to better capture the nature of the land deeds.
Subjects
Taiwan History
land deeds
category
metadata
digital archives
Type
thesis
File(s)
Name
ntu-97-R95922086-1.pdf
Size
23.32 KB
Format
Adobe PDF
Checksum
(MD5):3a9ea6a5e890fd37c4e890882a928291
