
Development of Historical Text Information Extraction Techniques for Compiling the Historical Events Recorded in Shi Ji

Basic information
Project identifier AS-ASCDC-111-206
Conducted by Research Center for Humanities and Social Sciences
Director

Overview

Building on work completed in recent years under the digital humanities subproject "Development of Big Historical Text Information Extraction Techniques for Compiling Research-oriented Knowledge Bases," this project aims to integrate the methodologies, approaches, and tools of information technology and the humanities in support of digital humanities research. Using natural language processing, the system can efficiently and automatically recognize keywords such as person names and place names in an extensive corpus of historical texts and systematically present their temporal and spatial characteristics. It can also infer semantic relationships between passages by cross-referencing and comparing multiple paragraphs, which provides the basis for a historical event extraction method that supports the development of humanities research topics.

This project focuses on analyzing the historical events in the Records of the Grand Historian (Chinese title: Shi Ji). We will improve the data preprocessing methods, enhance the performance of the Named Entity Recognition algorithm, and deconstruct the patterns of events referenced across multiple paragraphs of text. Using deep learning techniques, we will also explore event extraction for textual semantic reasoning and implement textual entailment analysis, event detection, and recognition of complex event composition patterns. This work will use expert research data as a training corpus and, by integrating multiple additional texts, will verify the practicality of the analysis model.

The method developed in this research is expected to accommodate different historical texts. It should extract historical events effectively, help analyze their causes and effects, and strengthen the integrated application of temporal and spatial information and historical knowledge. It is also designed to improve the visualization of historical events on maps and to expand the research scope of integrated spatiotemporal-textual data systems in digital humanities.

Find out more

Historical Text Information Extraction from the Perspective of Digital Humanities
