Research efforts include multimedia technology, language technology, and database & metadata technology.


1. VCenter

To disseminate the large volume of high-quality digital content in TELDAP, and to encourage users to archive and share the digital content they generate, we are currently developing several services that implement Web 2.0 concepts, including VCenter, a video archiving and sharing platform based on video streaming technology.

2. Video Enhancement

Video Enhancement covers three topics: tone reproduction, video enhancement, and video inpainting. Tone reproduction is a new technique for the National Digital Archives Program: we developed a new method to correct the low-contrast problem of old photos.
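The team's tone reproduction method is not described here, so the following is only a minimal sketch of the general idea of contrast correction, using plain linear contrast stretching as a stand-in technique. The function name `stretch_contrast` and the sample pixel values are assumptions made for illustration.

```python
def stretch_contrast(pixels, low=0, high=255):
    """Linearly rescale grayscale values so they span [low, high].

    This is generic contrast stretching, not the project's own method.
    `pixels` is a list of rows, each row a list of integer gray levels.
    """
    flat = [v for row in pixels for v in row]
    darkest, brightest = min(flat), max(flat)
    if brightest == darkest:
        # A perfectly flat image has no contrast to stretch; return a copy.
        return [list(row) for row in pixels]
    scale = (high - low) / (brightest - darkest)
    return [[round(low + (v - darkest) * scale) for v in row] for row in pixels]


# Hypothetical faded photo: every value sits in the narrow band 100..130.
faded = [
    [100, 110, 120],
    [105, 115, 125],
    [110, 120, 130],
]
restored = stretch_contrast(faded)
```

After stretching, the darkest pixel maps to 0 and the brightest to 255, so the narrow band of gray levels is spread over the full range.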

3. Audio Processing and Retrieval

Our goal is to develop methods for analyzing, extracting, recognizing, indexing, and retrieving information from audio data such as speech and music. We have developed several basic technologies as well as prototype retrieval systems. SoVideo is a web-based TV news retrieval system, which accepts keywords and returns a ranked list of relevant TV news stories. The system is based on technologies such as large vocabulary continuous speech recognition for Mandarin Chinese, automatic story segmentation, and information retrieval. SoMusic is a query-by-singing karaoke music retrieval system, which accepts singing queries and returns a ranked list of relevant songs. The system is based on technologies such as melody extraction, phrase onset detection, and melody matching.
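As a rough illustration of the melody matching step in a query-by-singing system, the sketch below ranks songs by dynamic time warping (DTW) distance between pitch sequences. DTW is one common choice for this task and is an assumption here, not necessarily what SoMusic uses. The song names, the pitch values (MIDI note numbers), and the helpers `dtw_distance`, `center`, and `rank_songs` are all hypothetical.

```python
def dtw_distance(a, b):
    """Dynamic time warping distance between two numeric sequences."""
    inf = float("inf")
    n, m = len(a), len(b)
    # cost[i][j] holds the best alignment cost of a[:i] against b[:j].
    cost = [[inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            d = abs(a[i - 1] - b[j - 1])
            cost[i][j] = d + min(cost[i - 1][j], cost[i][j - 1], cost[i - 1][j - 1])
    return cost[n][m]


def center(pitches):
    """Subtract the mean pitch so a query sung in another key can still match."""
    mean = sum(pitches) / len(pitches)
    return [p - mean for p in pitches]


def rank_songs(query, songs):
    """Return song titles sorted by ascending DTW distance to the query."""
    q = center(query)
    scored = [(dtw_distance(q, center(melody)), title) for title, melody in songs.items()]
    return [title for _, title in sorted(scored)]


# Hypothetical melody database: one rising phrase, one falling phrase.
songs = {
    "song_a": [60, 62, 64, 65, 67, 67],
    "song_b": [67, 65, 64, 62, 60, 60],
}
# A sung query: song_a's contour, two semitones higher, first note held longer.
query = [62, 62, 64, 66, 67, 69, 69]
ranking = rank_songs(query, songs)
```

Centering the pitches tolerates a singer starting in a different key, and the warping tolerates uneven tempo, which are two typical problems with sung queries.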

4. Resolving the Unencoded Chinese Character Problem

The current Chinese character interchange code is inadequate for daily applications. It cannot meet users’ needs because of the so-called missing character problem. In this project, we use our knowledge of Chinese glyph structures to resolve the problem of missing characters, since the structure of a glyph is the best symbol of itself. Thus far, the project has compiled a Chinese Glyph Structure Database containing 115,197 ancient and modern Chinese characters. Knowledge of Chinese glyph structures can also be used to improve search techniques for characters.
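To make the idea of describing and searching characters by glyph structure concrete, here is a minimal sketch. The four-entry table is a hypothetical miniature, not an excerpt from the Chinese Glyph Structure Database, and the function names are assumptions. The layout operators ⿰ (left to right) and ⿱ (above to below) are the Unicode Ideographic Description Characters, used here only as a convenient notation.

```python
# Hypothetical miniature glyph-structure table: (layout operator, part, part).
GLYPHS = {
    "好": ("⿰", "女", "子"),
    "林": ("⿰", "木", "木"),
    "森": ("⿱", "木", "林"),
    "李": ("⿱", "木", "子"),
}


def structure(char):
    """Expand a character into a structure string of basic components."""
    if char not in GLYPHS:
        return char  # not in the table: treat it as a basic component
    op, *parts = GLYPHS[char]
    return op + "".join(structure(p) for p in parts)


def components(char):
    """Return every part of a character, including nested parts."""
    if char not in GLYPHS:
        return set()
    found = set()
    for part in GLYPHS[char][1:]:
        found.add(part)
        found |= components(part)
    return found


def find_by_component(component):
    """Search the table for characters that contain the given component."""
    return sorted(c for c in GLYPHS if component in components(c))


wood_chars = find_by_component("木")
forest = structure("森")
```

A character with no code point could still be written down and searched through such a structure string, which is the sense in which the structure of a glyph can stand in for the glyph itself.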

5. Chinese Natural Language Understanding

We are developing a robust Chinese parsing system with semantic composition capability under the representational framework of E-HowNet. The techniques developed in this project may support many high-level intelligent systems, such as dialogue systems, intelligent interfaces, and machine translation.

6. Document Image Analysis and Recognition

Our research will focus on developing and applying machine learning techniques to problems that arise in the automatic recognition of legacy newspapers, document layout analysis, text/ontology categorization, and language identification in multilingual documents. In the legacy document application, we are interested in recognizing fragmentary characters that appear in newspapers printed in the 1950s and 1960s.

7. Database Technology Development Team

Following the service-oriented architecture (SOA) concept, the Database Technology Development Team (DTDT) will divide its DADT into a few REST web services and separate out the interface facilities. Based on the Model-View-Controller (MVC) design pattern and the Ruby on Rails (RoR) framework, the new DADT will make it quick and easy to create multi-table database applications, even when the schema is complex. It will also include more advanced Web 2.0 features. Because it is built on RoR, the new DADT will work across various platforms, browsers, and DBMSs.

8. Metadata Architecture and Application Team

The MAAT was established in 2001 to offer metadata services and planning for thematic collection-based projects, in line with the objectives of the National Digital Archives Program (NDAP). When NDAP and the Taiwan eLearning Program were integrated into the Taiwan e-Learning and Digital Archives Program (TELDAP) in 2008, the MAAT expanded its services and tasks into the e-learning domain.