Billed as the Internet's largest collection of Chinese documents, the "Chinese Electronic Document Archive" of the Academia Sinica opened this month, and its materials are now available to the public.
Under development for 13 years, the archive has reached a size of 120 million characters. It holds various collections, including the classic "Twenty-Five Books of History," the "Thirteen Classics" of Chinese philosophy and literature, The Literary Mind and the Carving of Dragons, and a collection of documents relating to local Taiwanese affairs. This large archive, which is equipped with search and index functions, provides the academic, cultural and educational worlds with a shuttle to the past. It is a revolutionary tool that will be of great service in the fields of literature, history, philosophy, political science, economics and sociology.
As early as 1984, to meet research needs, the Academia Sinica's Computing Center and its Institute of History and Philology began to develop an archive of passages relating to food and implements in the Twenty-Five Books of History. There were several expected advantages: On-line documents would be impervious to wear and tear and easy to store, and they could be collected in large quantities that would be easy to search and index, thus raising the efficiency of research. With rapid advances in computer technology and networking, the scale of the project grew larger and larger, and in 1990 the Twenty-Five Books of History were ready to go on line.
Originally the archive was conceived as a research tool to be used solely by members of the Academia Sinica, but access was later granted to outside scholars and educational institutions that were engaged in cooperative efforts with the Academia Sinica or were willing to pay for access. So far, it is available on line at more than 10 universities in Taiwan, and abroad at the Harvard-Yenching Library and the Chinese department at Heidelberg University. In 1995 the Internet took off, and one archive after another at the Academia Sinica went on line, but outsiders were still not granted full access. With calls for open access growing louder, Academia Sinica President Lee Yuan-tseh finally called for completely open access in March of this year.
The Academia Sinica is opening up access in two ways. The first is free access, which is being provided for seven different collections: the Twenty-Five Books of History; a collection of local Taiwan historical materials; central Qing government files about Taiwan; The Literary Mind and the Carving of Dragons, including annotations and comparisons between different editions; three commentaries on the Buddhist sutras; a collection of Qing dynasty history and biography; and a database of materials relating to ancient Chinese philology. Altogether these collections contain more than 60 million characters. But there are still some limitations: For instance, one can only check 30 documents per search, and in viewing documents in the Twenty-Five Books of History, one cannot scroll through a document from start to finish.
The other way is access for a fee. Apart from being free of the restrictions mentioned above, paying users are given access to the Thirteen Classics of Chinese Literature and Philosophy, the works of various Zhou dynasty thinkers, a collection of 34 historical documents, and a Tang dynasty compendium of Buddhist texts, the Dacang Jing. At this stage of development only domestic institutions are eligible for access. Access to all the materials for a single computer costs only NT$4,000. Access for a network of computers (with a limit of 250 individual machines) costs NT$25,000 a year.
Rate of error: less than one in 10,000
Materials that have been copied and republished invariably contain mistakes, and are never as valuable to researchers as the originals. For this huge archive, how has the Academia Sinica been able to overcome this problem and attain a rate of error that is conservatively estimated to be no higher than one in 10,000?
The Computing Center explains that each text was originally entered twice, by two different typists, into separate computer files. A computer then compared the two files, and the discrepancies it found were corrected manually. After this first correction, two different readers compared the text against the original to check for remaining mistakes. The Academia Sinica is quite confident in stating that researchers of early and medieval imperial Chinese history, for instance, can regard the on-line Twenty-Five Books of History as primary sources. For researchers of late imperial China from the Song dynasty on, historical materials are abundant, so the materials in this archive are not the only ones available, but they certainly should not be overlooked.
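The double-entry check described above can be illustrated with a minimal sketch. This is not the Computing Center's actual program, which the article does not describe; the function name `find_discrepancies` and the sample strings are hypothetical, and Python's standard `difflib` module stands in for whatever comparison method was really used. The idea is simply that two typists rarely make the same mistake in the same place, so any point where their files differ is worth a human look.

```python
from difflib import SequenceMatcher


def find_discrepancies(entry_a: str, entry_b: str):
    """Return the spans where two independent transcriptions disagree.

    Each result is (kind, offset_in_a, text_in_a, text_in_b), where kind is
    'replace', 'delete' or 'insert' as reported by difflib.
    """
    # autojunk=False so that frequently repeated characters in long texts
    # are not silently ignored by difflib's heuristics.
    matcher = SequenceMatcher(None, entry_a, entry_b, autojunk=False)
    discrepancies = []
    for kind, a_start, a_end, b_start, b_end in matcher.get_opcodes():
        if kind != "equal":
            discrepancies.append(
                (kind, a_start, entry_a[a_start:a_end], entry_b[b_start:b_end])
            )
    return discrepancies


if __name__ == "__main__":
    # Hypothetical sample: the second typist mis-keyed the final character.
    typist_one = "學而時習之"
    typist_two = "學而時習乏"
    for kind, offset, text_a, text_b in find_discrepancies(typist_one, typist_two):
        print(f"{kind} at character {offset}: {text_a!r} vs {text_b!r}")
```

Every flagged span would then go to a person for correction against the printed original, which is the manual step the Computing Center describes. Errors that both typists happen to make identically pass through this comparison unnoticed, which is presumably why two further readers proofread the result.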
The on-line Twenty-Five Books of History were based on the mainland's Zhonghua Shuju editions, which are regarded as the most authoritative. From the ancient Historical Records to the Qing dynasty histories, passages are grouped in various categories, such as imperial biography, local affairs, chronologies, biographies of commoners, etc. The collection comes to nearly 40 million characters all told. Not only did the Institute of History and Philology handle this difficult task with great skill, but it also found errors in the original. You can look at these findings by pressing the first icon in the table of contents: "Explanations of computer-generated corrections of copies of historical documents."
The "Taiwan Local Records Archive" is another important collection that was put on line relatively early. Most of its materials come from Wenxian Congkan, which was published by the Bank of Taiwan Economic Research Chamber. It mostly consists of documents kept by Qing dynasty government units of various levels (fu, xian, and ting). It holds a great variety of materials, including information about geography; climate; local histories; government, economic, cultural, educational and military establishments; as well as local vocabulary and expressions. And because it is information that comes from the bottom, from the localities rather than the center, it is an essential tool for research on Taiwan, not just for historians, but for archaeologists, linguists, sociologists, economists and political scientists as well.
Another collection that has recently been completed is the "Taiwan Files Collection," which presents Taiwan-related materials from another perspective. They are largely taken from the Qing dynasty central government files, including ministerial reports, imperial annotations and other documents from the ministries in charge of foreign affairs, defense, and the interior. This comprehensive set of documents ranges in time from the reign of Kangxi (whose rule began in 1662) to the reign of Guangxu (who died in 1908). Besides documents from the central government, it also includes information from newspaper reports about diplomatic events. This archive also includes information from the Bank of Taiwan's Wenxian Congkan (with the addition of punctuation).
Among the archives which one must pay to see, the on-line Thirteen Classics of Chinese Literature and Philosophy is based on the relatively accessible and well-reviewed Ruanyuan edition (which lacks punctuation). The "34 Classical Texts" brings together various pre-Han dynasty collections, including the philosophical work Guigu Zi, the medical texts Jingui Yaolue and Huangdi Neijing, Miscellanies of the Western Capital, and the Song-dynasty Confucian text Zhuzi Yulei. The collection "18 Classical Texts" is largely composed of works from China's middle ages, from the Tang dynasty onward, including notes on The Classic of Mountains and Seas, Tong Dian, and Tang Hui Yao (itself a collection of documents).
Other institutes at the Academia Sinica have also been bringing collections of Chinese documents on line. These collections contain more than 70 million characters in all, says Chen Juo-shui, head of the Institute of History and Philology's on-line archive, and documents are being added at a pace of about 8-10 million characters a year. Will access be granted to overseas users in the future? According to our sources, Academia Sinica President Lee Yuan-tseh wants to move step by step toward completely open access, so as "to bring a balance to Taiwan's long-term informational trade deficit."
The Academia Sinica's website is at http://www.sinica.edu.tw. Select the "databases" icon to proceed to the archives.