## Database sources

The structure of Chinese words and their earliest textual attestations is adopted from _Computerlinguistische Datierung schriftsprachlicher chinesischer Texte_ [^1]. The information is parsed from a UTF-8 plain-text version of _Hanyu da cidian_ 漢語大詞典 (_HDC_) [^2] and enriched with data from other sources, such as _CBDB_ [^3] and a corpus of _Early Chinese Texts_ (see »Loewe corpus« in [^4]).

Several sources were parsed to document word categories, the words within each category, and equivalence or synonymy relationships, mainly:

- The _Erya_ 爾雅
- The
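
As an illustration of how synonymy relationships might be extracted from a source such as the _Erya_, the following is a minimal sketch, not the project's actual parser. It assumes a punctuated input line of the form "headword、headword、…，gloss也。"; the names `SynonymGroup`, `GLOSS_PATTERN` and `parse_gloss` are hypothetical. The sample line is the opening gloss of the _Erya_; punctuation and character variants differ between editions.

```python
from __future__ import annotations

import re
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class SynonymGroup:
    """Headwords that a gloss line equates with one glossing word."""
    headwords: tuple
    gloss: str

    def pairs(self) -> list:
        """Return (headword, gloss) pairs, one per equivalence."""
        return [(word, self.gloss) for word in self.headwords]


# Assumed line shape: "<w1>、<w2>、…，<gloss>也。" (hypothetical input format)
GLOSS_PATTERN = re.compile(r"^(?P<heads>[^，]+)，(?P<gloss>[^，。]+?)也。?$")


def parse_gloss(line: str) -> Optional[SynonymGroup]:
    """Parse one gloss line; return None if it does not fit the assumed shape."""
    match = GLOSS_PATTERN.match(line.strip())
    if match is None:
        return None
    heads = tuple(w for w in match.group("heads").split("、") if w)
    return SynonymGroup(headwords=heads, gloss=match.group("gloss"))


example = "初、哉、首、基、肇、祖、元、胎、俶、落、權輿，始也。"
group = parse_gloss(example)
if group is not None:
    for headword, gloss in group.pairs():
        print(f"{headword} ≈ {gloss}")
```

Lines that do not match the assumed pattern return `None`, so they can be collected for manual review rather than silently mis-parsed.
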
## Biographical and bibliographical information

A database of texts, commentaries and people of Chinese history relevant to the project was started separately and later merged into the semantic database.

## References

[^1]: Schalmey, T., 2022: _Computerlinguistische Datierung schriftsprachlicher chinesischer Texte_. Diss., Universität Trier. Forthcoming.

[^2]: _HDC_ = Luo Zhufeng 羅竹風, ed., _Hanyu da cidian_ 漢語大詞典, 13 vols., Shanghai 上海: Cishu chubanshe 辭書出版社, 1986–1994.

[^3]: _CBDB_ = Fuller, Michael A., ed., _China Biographical Database Project_, 2017, https://projects.iq.harvard.edu/cbdb

[^4]: Schalmey, T., 2021: “Raw frequency data: Thoughts on ‘Reliable’ Learner's Vocabularies for Classical and Literary Chinese”, https://doi.org/10.5281/zenodo.5638881

## TODO

How the information is added to the database