
Core Technologies

Research efforts include multimedia technology, language technology, and database & metadata technology.

 

Efforts

1. Linked Data Technology

■ Transforming metadata into interlinked units on the semantic web crossing the boundaries of languages and nations
Linked Open Data (LOD) has recently become an important approach used to interlink data on the semantic web. LOD breaks down the original structure of metadata into smaller, semantically-interconnectible bits that can be understood and processed automatically by computers. Publishing in LOD involves an evolution in protocols and licensing options, enabling data from different resources to be effectively interconnected and queried, thus making it more useful across the globe.

Applying LOD may lead to significant research findings with a tightly-connected data pool from various fields. To achieve this, ASCDC is currently converting the Taiwan e-Learning and Digital Archives Program (TELDAP) Union Catalog into the LOD model so that it can be integrated with data from other open platforms across institutions, countries, and languages, thereby improving the dissemination and reuse of information.
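
As a rough illustration of how a metadata record can be broken into interlinkable statements, the sketch below turns a flat record into subject-predicate-object triples and finds the links to a second dataset. The record, the `example.org` URIs, and the helper `record_to_triples` are hypothetical; only the Dublin Core and FOAF namespaces are real. It is not ASCDC's conversion pipeline.

```python
DCTERMS = "http://purl.org/dc/terms/"   # Dublin Core terms namespace
FOAF = "http://xmlns.com/foaf/0.1/"     # FOAF namespace


def record_to_triples(subject_uri, record):
    """Split a flat metadata record into (subject, predicate, object) triples."""
    triples = []
    for field, values in record.items():
        # A field may hold a single value or a list of values.
        if not isinstance(values, list):
            values = [values]
        for value in values:
            triples.append((subject_uri, DCTERMS + field, value))
    return triples


# Hypothetical catalog record; every example.org URI is a placeholder.
record = {
    "title": "Landscape with Bridge",
    "creator": "http://example.org/artist/42",
    "subject": [
        "http://example.org/vocab/painting",
        "http://example.org/vocab/bridge",
    ],
}
triples = record_to_triples("http://example.org/item/1", record)

# A second (hypothetical) dataset that describes the same artist URI.
other_dataset = [("http://example.org/artist/42", FOAF + "name", "Example Artist")]

# Interlinking: an object in one dataset is a subject in another.
subjects_elsewhere = {s for s, _, _ in other_dataset}
links = [t for t in triples if t[2] in subjects_elsewhere]

for s, p, o in links:
    print(s, p, o)
```

Because the creator is a URI rather than a text string, a query can follow it from the catalog record into any other dataset that describes the same resource.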

ASCDC platforms: LODLab; LODDatasets; Linked Taiwan Artists
Collaborating groups: The Getty Research Institute (AAT-Taiwan); Institute of History and Philology (The Wooden Slips Character Dictionary – Database of Juyan Han Wooden Slips from the Institute of History and Philology Collections)
Adopting groups: The Construction of Asian Buddhist Art Thesaurus and Information System Project, MOST (Asian Buddhist Art Thesaurus and Information System)

 

2. Digital Museum

■ Collect, curate, and present your own or your institution’s collections and research
The Open Museum allows organizations or individuals to upload and manage digital collections. You can integrate open data from other organizations’ treasured archives, carry out digital curation, or connect to international platforms, in order to provide more narrative context for your collection. The platform provides timelines, story maps, and data visualization tools. Curators can adapt these tools to different types of material and narrative settings to bring out the diversity of a collection’s storylines.

ASCDC platform: Open Museum
Collaborating groups: Pingtung County Government (Pingtung County Digital Archive); Taiwan Film and Audiovisual Institute (Taiwan Film Open Museum); Biodiversity Research Center (Biodiversity Digital Museum); Institute of History and Philology (Museum of the Institute of History and Philology on the Open Museum); Deng Yu-xian’s grandson Deng Tai-chao (Deng Yu-xian Digital Archives); Taipei Chinese Center, International PEN (Taipei Chinese Center, International PEN on the Open Museum); Taiwan Music Institute, National Center for Traditional Arts (Taiwan Music Institute on the Open Museum)

 

3. Digital Humanities Research Technology

■ Create a cloud platform with both open access and a multi-person collaborative research mechanism
This platform is an open, cloud-based text repository that assists humanities scholars in analyzing large quantities of data. It contains billions of characters of open text imported from international contributors, and is equipped with textual analysis tools such as automated text markup, term frequency analysis, term co-occurrence analysis, text similarity comparison, related term analysis, social network analysis, and spatiotemporal data visualization. You can not only access texts imported by others but also upload your own texts and authority terms to this platform, where you can allow colleagues to share and co-edit with you. The computational model introduces new possibilities for investigation.
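
Two of the analyses listed above, term frequency and term co-occurrence, can be sketched in a few lines. This is a minimal illustration on already-tokenized toy documents, not the platform's implementation; the function names and the sample data are assumptions.

```python
from collections import Counter
from itertools import combinations


def term_frequency(docs):
    """Count how often each term occurs across all documents."""
    counts = Counter()
    for tokens in docs:
        counts.update(tokens)
    return counts


def cooccurrence(docs):
    """Count, for each unordered term pair, the documents containing both terms."""
    pairs = Counter()
    for tokens in docs:
        # Sorting makes each pair key independent of token order.
        for a, b in combinations(sorted(set(tokens)), 2):
            pairs[(a, b)] += 1
    return pairs


# Toy documents, already split into terms.
docs = [
    ["emperor", "edict", "flood"],
    ["emperor", "flood", "relief"],
    ["edict", "tax"],
]
tf = term_frequency(docs)
co = cooccurrence(docs)

print(tf["emperor"])             # 2
print(co[("emperor", "flood")])  # 2
```

Pair keys are stored in sorted order, so a lookup must use `("emperor", "flood")` rather than `("flood", "emperor")`.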

ASCDC platform: Digital Humanities Research Platform
Collaborating groups: The Scripta Sinica Research Group, Institute of History and Philology; Institute of Taiwan History; Institute of Modern History; Institute of Chinese Literature and Philosophy; Institute of Information Science
Adopting groups: “Digital Research on the Image of the Periphery During Wei-Jin Southern and Northern Dynasties” project; “Digital Humanities and Taiwan Elites Study: Case Study of the ITH Archives” project; “Discourses and Treatment Strategies for ‘Mei (Fairy) Illness’ in Traditional China” project; “Digital Humanities and Wooden Slips Research: Characters Interpretation and Wooden Slips Restoration” project

 

4. Optical Character Recognition (OCR)

■ Making machines read faster and smarter
We use machine learning to develop a computational reader specialized in transcribing digitized images of rare books into electronic texts, in order to accelerate the overall process of digitization. Results show that our program's OCR success rate exceeds 90%, higher than that of existing open-source software and commercial interfaces. By reducing time-consuming manual labor, this automation will enable more texts to be queried and used.

In addition, we have developed an online editing tool so that users can upload an image, perform OCR, and then manually revise the outcome. The corrected text can then serve as training data for the OCR program.
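
The text above does not say how the OCR success rate is measured. One common measure is character-level accuracy, computed from the edit distance between the OCR output and a manually corrected reference. The sketch below assumes that definition; the function names and the sample strings are illustrative only.

```python
def edit_distance(a, b):
    """Levenshtein distance: minimum insertions, deletions, and substitutions."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # delete ca
                            curr[j - 1] + 1,      # insert cb
                            prev[j - 1] + cost))  # substitute or match
        prev = curr
    return prev[-1]


def character_accuracy(reference, ocr_output):
    """Return 1 - (edit distance / reference length), floored at 0."""
    if not reference:
        raise ValueError("reference must not be empty")
    return max(0.0, 1.0 - edit_distance(reference, ocr_output) / len(reference))


# Hypothetical proofread reference and OCR output; one character differs.
reference = "天地玄黃宇宙洪荒日月盈昃"
ocr_output = "天地玄黄宇宙洪荒日月盈昃"

print(round(character_accuracy(reference, ocr_output), 4))
```

User corrections made in the editing tool supply exactly this kind of reference text, which is why they can double as evaluation and training data.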

Collaborating group: The Scripta Sinica Research Group, Institute of History and Philology
ASCDC platform: OCR and Proofreading Platform

 

5. Web Image & Text Annotation

■ Our image annotation tools live up to the standard of the International Image Interoperability Framework (IIIF)
We make online image resources open to direct annotation, correction, and revision. The process is straightforward: users pick a passage or an image region and annotate its content as they see fit. Using IIIF makes the resources interoperable with other IIIF-compliant programming interfaces.
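
IIIF annotations follow the W3C Web Annotation data model, in which a region of an image is addressed by appending an `#xywh=` fragment to the canvas URI. The sketch below builds such an annotation as JSON. The URIs, the helper `make_annotation`, and the sample text are hypothetical, and the output is not taken from the Open Museum.

```python
import json


def make_annotation(anno_id, canvas_id, xywh, text, language="en"):
    """Build a Web Annotation that comments on a rectangular canvas region."""
    x, y, w, h = xywh
    return {
        "@context": "http://www.w3.org/ns/anno.jsonld",
        "id": anno_id,
        "type": "Annotation",
        "motivation": "commenting",
        "body": {
            "type": "TextualBody",
            "value": text,
            "language": language,
            "format": "text/plain",
        },
        # The fragment selects x, y, width, height in canvas coordinates.
        "target": f"{canvas_id}#xywh={x},{y},{w},{h}",
    }


# Placeholder identifiers for illustration.
anno = make_annotation(
    "https://example.org/anno/1",
    "https://example.org/iiif/slip-001/canvas/1",
    (100, 200, 300, 150),
    "Possible variant form of the character here.",
)
print(json.dumps(anno, indent=2))
```

Because the target is a plain URI with a fragment, any IIIF-aware viewer that loads the same canvas can place the annotation without knowing which tool created it.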

ASCDC platform: Open Museum
Collaborating groups: Institute of History and Philology (The Wooden Slips Character Dictionary – Database of Juyan Han Wooden Slips from the Institute of History and Philology Collections); Nara National Research Institute for Cultural Properties; Historiographical Institute, The University of Tokyo; National Institute for Japanese Language and Linguistics; National Institute of Japanese Literature; Institute for Research in Humanities, Kyoto University (Multi-database Search System for Historical Chinese Characters)
Adopting groups: The Construction of Asian Buddhist Art Thesaurus and Information System Project, MOST (Asian Buddhist Art Thesaurus and Information System)

 

6. Object Detection and Image Search

■ Machine recognition and retrieval of objects in images
By combining image processing with machine learning technology, we can extract data on specific objects within images, localize their positions, and make them searchable. Users can also upload or select an image in order to search for similar images.
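
Similar-image search is often implemented by comparing feature vectors extracted from each image. The sketch below ranks a small index by cosine similarity to a query vector. The three-dimensional vectors and file names are made up for illustration; a real system would obtain much larger vectors from a trained model, and the source does not specify which similarity measure is used.

```python
import math


def cosine_similarity(u, v):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def most_similar(query, index, top_k=2):
    """Return the names of the top_k indexed images closest to the query."""
    scored = [(cosine_similarity(query, vec), name) for name, vec in index.items()]
    scored.sort(reverse=True)
    return [name for _, name in scored[:top_k]]


# Hypothetical feature vectors keyed by image file name.
index = {
    "slip_001.jpg": [0.9, 0.1, 0.0],
    "slip_002.jpg": [0.8, 0.2, 0.1],
    "seal_001.jpg": [0.0, 0.1, 0.9],
}
query = [1.0, 0.1, 0.0]

print(most_similar(query, index))  # ['slip_001.jpg', 'slip_002.jpg']
```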

Adopting platform: The Wooden Slips Character Dictionary – Database of Juyan Han Wooden Slips from the Institute of History and Philology Collections
Adopting group: “‘Wooden Slips Character Dictionary’ Development Project – An Exploration of the Structure and Technology of Open Source Database”

 

7. Geographical Information System (GIS)

■ Do your fieldwork on the go: upload and annotate data with GIS
Our development of GIS technology has produced the GIS for Religious Landscape in Taiwan platform, which combines historical literature, fieldwork, and collaboration between experts and locals. It enables users to locate all kinds of religious sites in Taiwan and to get a sense of their distribution in space and their development through time, ushering in a new methodology for digital humanities.

ASCDC platforms: GIS for Religious Landscape in Taiwan; Walking into the Past & Present app series (for Taipei, Taichung, Tainan)
Collaborating groups: Department of Religious Studies, Fu-Jen Catholic University; I-Kuan Tao College

 

8. Text Lexical Entity Recognition and Event Classification Technology

■ Using textual analysis methods for name recognition and event classification
This technology was researched and developed by "The Institute of History and Philology's Project to Digitally Innovate Academic Settings." It first applies the CRF (Conditional Random Field) method, then uses BERT (Bidirectional Encoder Representations from Transformers) word embeddings to extract the essential terms related to events in the text, and finally groups the events with algorithms such as K-means. Combined with the personal attribute data in the "Database of Names and Biographies," this yields an automatic named entity recognition and linking model that requires no manual annotation and can quickly retrieve personal names and link them to external databases.
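
The final grouping step can be sketched with a plain K-means loop. In this minimal illustration, hand-made two-dimensional vectors stand in for BERT embeddings of event terms, the CRF and embedding stages are omitted, and the initial centroids are fixed so the result is reproducible. All names and data are assumptions rather than the project's code.

```python
def squared_distance(p, q):
    return sum((a - b) ** 2 for a, b in zip(p, q))


def nearest(point, centroids):
    """Index of the centroid closest to the point."""
    return min(range(len(centroids)),
               key=lambda k: squared_distance(point, centroids[k]))


def kmeans(points, initial_centroids, iterations=10):
    """Plain K-means; returns one cluster label per point."""
    centroids = [list(c) for c in initial_centroids]  # do not mutate the input
    labels = [0] * len(points)
    for _ in range(iterations):
        # Assignment step: attach each point to its nearest centroid.
        labels = [nearest(point, centroids) for point in points]
        # Update step: move each centroid to the mean of its members.
        for k in range(len(centroids)):
            members = [p for p, lab in zip(points, labels) if lab == k]
            if members:
                centroids[k] = [sum(dim) / len(members) for dim in zip(*members)]
    return labels


# Toy "embeddings" for four event descriptions (hypothetical values).
events = ["flood in the south", "river dike breach", "garrison revolt", "border raid"]
vectors = [[0.9, 0.1], [0.8, 0.2], [0.1, 0.9], [0.2, 0.8]]

labels = kmeans(vectors, initial_centroids=[vectors[0], vectors[2]])
for event, label in zip(events, labels):
    print(label, event)
```

With real embeddings the number of clusters and the initialization both matter; the fixed starting centroids here only keep the example deterministic.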

Collaborating Groups: The Scripta Sinica Research Group, Institute of History and Philology; Grand Secretariat Archives Project, Institute of History and Philology; Center for GIS, RCHSS, Academia Sinica; Intelligent Information Service Research Lab, Department of Computer Science & Information Engineering, National Central University
Adopting Platforms: Ming Shilu Weiso Event Retrieval System; Chinese Historical Climate Geographic Information Retrieval System

 

9. Triple-Based Semantic-Relationship Automatic Frame Technology

■ Using language understanding technology to automatically extract and analyze events involving historical figures
Because manual labeling is time-consuming and laborious, the "Information Extraction for Ancient Chinese" project developed this technique, which automatically identifies and retrieves the names of people, officials, events, and so on, in order to build a knowledge graph from semantic-relationship triples (in the form <subject, verb, object>). It uses language understanding techniques such as "Analysis" and "Anaphora Resolution" to analyze the text of the Qing Shilu. Besides automatically extracting and analyzing the events associated with each figure (the relationship between people and events, that is, what a person has done), it can also display the sentiment distribution of articles and sentences.
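
Once triples of the form <subject, verb, object> have been extracted, assembling them into a queryable graph is straightforward. The sketch below assumes the triples are already available and uses placeholder names; the extraction itself, which is the hard part, is not shown.

```python
from collections import defaultdict


def build_graph(triples):
    """Index triples by subject so each node lists its outgoing (verb, object) edges."""
    graph = defaultdict(list)
    for subject, verb, obj in triples:
        graph[subject].append((verb, obj))
    return graph


def actions_of(graph, person):
    """Answer "what has this person done?" from the graph."""
    return graph.get(person, [])


# Placeholder triples standing in for extracted events.
triples = [
    ("Person A", "appointed", "Person B"),
    ("Person B", "governed", "Province X"),
    ("Person A", "issued", "Edict Y"),
]
graph = build_graph(triples)

print(actions_of(graph, "Person A"))
# [('appointed', 'Person B'), ('issued', 'Edict Y')]
```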

Collaborating Group: Institute of History and Philology, Academia Sinica
Adopting Platform: Institute of History and Philology, Academia Sinica

 

10. Word Segmentation for Old, Middle, Early Mandarin, and Standard Chinese

■ Employing zero-shot learning and capable of transfer learning when tasked with language materials from different time periods, this technology can correctly segment words at an average rate of 90%!
Whether the Chinese-language material is “old Chinese” (pre-Qin period to the Western Han dynasty), “middle Chinese” (Eastern Han, Wei, Jin, and Northern and Southern dynasties), “early Mandarin Chinese” (Tang dynasty and the Five Dynasties onwards), or “standard Chinese,” this technology supports transfer learning experiments and modeling for language models, word segmentation, and lexical markers. In recent experiments, a semantic analysis model for standard Chinese (vernacular) was trained successfully and could even be transferred to ancient Chinese texts (classical Chinese). Experiments with word segmentation markers have also found that language materials from different periods help the model learn the segmentation task, demonstrating the model’s transferability.
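
The text does not specify how the segmentation rate is computed. A standard way to score a segmenter is word-level precision, recall, and F1 against a gold segmentation, where a predicted word counts as correct only if both of its boundaries match. The sketch below assumes that metric; the function names and the sample sentence split are illustrative.

```python
def spans(words):
    """Convert a word list into a set of (start, end) character spans."""
    result, start = set(), 0
    for word in words:
        result.add((start, start + len(word)))
        start += len(word)
    return result


def segmentation_f1(gold, predicted):
    """Word-level F1 between two segmentations of the same text."""
    if not gold or not predicted:
        raise ValueError("segmentations must not be empty")
    if "".join(gold) != "".join(predicted):
        raise ValueError("segmentations must cover the same text")
    gold_spans, pred_spans = spans(gold), spans(predicted)
    correct = len(gold_spans & pred_spans)
    if correct == 0:
        return 0.0
    precision = correct / len(pred_spans)
    recall = correct / len(gold_spans)
    return 2 * precision * recall / (precision + recall)


# Illustrative gold and predicted segmentations of the same five characters.
gold = ["學", "而", "時", "習", "之"]
predicted = ["學", "而", "時習", "之"]

print(round(segmentation_f1(gold, predicted), 3))  # 0.667
```

Here three of the four predicted words match the gold spans, giving precision 0.75 and recall 0.6.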

Collaborating Groups: Institute of History and Philology, Academia Sinica
Adopting groups: Institute of History and Philology, Academia Sinica

 

11. Metadata Architecture & Application

■ Enabling richer access to metadata
When an object is archived digitally, it needs a set of metadata that represents it in computational terms within a standardized data structure. The organization of metadata is therefore key to efficient retrieval, display, management, and use, so that the data can be preserved in an open, interoperable, and sustainable environment.

Find out more: https://metadata.teldap.tw/standard/standard-frame.html
Adopting groups: National Digital Archive Program (NDAP); Taiwan e-Learning and Digital Archives Program (TELDAP); Ministry of Education (MOE Resources for Teaching); Institute of History and Philology; Institute of Ethnology

 

12. Union Catalog System

■ Over 580 million digitized items are available on over 770 sites – accessible via one portal
Search among the archives built during the Taiwan e-Learning and Digital Archives Program (TELDAP) via the Union Catalog. You can switch between basic search and advanced search, with search criteria including subject, time, archival location, participants and contributors of the Program, etc.

ASCDC Platform: Digital Taiwan

 

13. Digital Archives Construction & Maintenance

■ Archive components allow for a fast and streamlined process for building databases
This technology componentizes modules that are shared by digital archival management systems so that they can be reused to facilitate the construction of new archives. With these components, repetitive building processes can be reduced and system maintenance runs more smoothly. Current components include: string processing, XML file processing, date and time processing, missing Chinese character processing, dynamic HTML webpage processing, permissions control processing, multimedia image processing, database access, file transfer processing, system logs, authority term and code management, etc.

Adopting groups: Institute of History and Philology (Digital Archives System for the Rare Books of the Fu Ssu-Nien Library, Database for the Seals of the Fu Ssu-Nien Library, IHP Name Authority Files, Digital Archives of Archaeological Data, Rubbings Data, Digital Archive of the Han Wooden Slips, Digital Archives of Bronze Images and Inscriptions); Institute of Ethnology (Taiwan Ethnography Video and Audio Archive)

 

14. System Preservation & Maintenance

■ A powerful back-up
This technology uses virtualization to protect websites, databases, and programs. It provides archive managers with back-up and maintenance systems.

ASCDC Platform: Digital Taiwan - Culture & Nature

 

15. Restitution of Missing Chinese Characters

■ A lost-and-found for missing Chinese characters
With the assistance of the Home of Chinese Document Processing Lab, our exchange code draws on character formation methods to tackle the problem of missing characters in Chinese-language electronic texts in digital archives. Based on the rules of Chinese character formation, we divide a character into small image components, which are crucial to digitally restituting missing characters.
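
The idea of describing a character by its components can be sketched with Unicode ideographic description characters, where `⿰` marks a left-right layout and `⿱` a top-bottom layout. This notation is a stand-in chosen for illustration; the lab's own exchange code is not shown in the text, and the small table and function names below are assumptions.

```python
LEFT_RIGHT = "\u2ff0"  # ⿰ left-to-right composition
TOP_BOTTOM = "\u2ff1"  # ⿱ top-to-bottom composition

# Tiny hand-made table: character -> (layout operator, component, component).
DECOMPOSITIONS = {
    "明": (LEFT_RIGHT, "日", "月"),
    "好": (LEFT_RIGHT, "女", "子"),
    "男": (TOP_BOTTOM, "田", "力"),
}


def describe(char):
    """Return a description sequence, or the character itself if it is not in the table."""
    entry = DECOMPOSITIONS.get(char)
    if entry is None:
        return char
    operator, *parts = entry
    return operator + "".join(parts)


def characters_with(component):
    """List the table's characters that contain the given component."""
    return sorted(c for c, (_, *parts) in DECOMPOSITIONS.items() if component in parts)


print(describe("明"))        # ⿰日月
print(characters_with("子"))  # ['好']
```

A character missing from a font or encoding can then be stored and exchanged as its layout plus components, and rendered by composing the component images.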

Find out more: Missing Character Processor, Home of Chinese Document Processing Lab
Adopting groups: Institute of History and Philology (Digital Archives System for the Rare Books of the Fu Ssu-Nien Library, Database for the Seals of the Fu Ssu-Nien Library, IHP Name Authority Files, Digital Archives of Archaeological Data, Rubbings Data, Digital Archive of the Han Wooden Slips)

 

16. Digital Image Archiving

■ A multimedia data manager
This technology specializes in converting HD images of digitized objects into a browser-friendly version online with an animated watermark and other functions to ensure high-quality presentation and rights protection.

Find out more: Multimedia Center
Adopting groups: Institute of History and Philology (Digital Archives System for the Rare Books of the Fu Ssu-Nien Library, Database for the Seals of the Fu Ssu-Nien Library, IHP Name Authority Files, Digital Archives of Archaeological Data, Rubbings Data, Digital Archive of the Han Wooden Slips)

 

17. Video Archiving Technology

■ Publish your own video on Vcenter
The National Digital Archives Program has developed advanced technologies for managing digital video archives. With these technologies, we can build indexing systems to quickly retrieve and add value to digital video content.

We develop tools enabling the user to upload a video, convert it, stream it, edit it, and archive it – and all the tools are unified on Vcenter. There are many things that one can do with Vcenter: video format transformation, video shot detection, video abstract extraction, key frame extraction, metadata searching, full text searching, voice searching, streaming video format, and a nonlinear online editing tool. Watermarks, subtitles, bookmarks and GIS technology are also available on the site.

Go to Vcenter

 

18. Web Album & Social Media

■ Share your photos on social media
iPicbox is a site that allows users to manage their photos before publishing them on social media such as Facebook or Plurk. On iPicbox, you can also share your albums with people and design how the pictures are to be viewed.

Go to iPicbox

 
