The field of Tibetan technology is a perfect example of a multicultural, multidisciplinary, decentralized network. Most of the time, progress is driven by discreet behind-the-scenes interactions. Today we would like to document one such example and put the spotlight on some of the behind-the-scenes figures moving the field forward.
The story starts with a major breakthrough by the Tibetan Manuscripts Project Vienna (TMPV) and Christian Luczanits: the identification and scanning, in Western Nepal, of two well-preserved early (13th-14th c.) collections of Buddhist canonical texts, the Namgyal collection and the Drakmar collection. Both are major discoveries in their own right.
For more on the discovery and digitization process, see:
- the open access book Two Illuminated Text Collections of Namgyal Monastery by Christian Luczanits (SOAS) and Markus Viehbeck (TMPV)
- TMPV blog entry about the field work in Drakmar in 2022
- TMPV blog entry about the field work in Drakmar in 2023
The Namgyal collection, for instance, is so old that it likely predates the standardized form of the Kanjur, offering a unique glimpse into the very process by which the Tibetan Buddhist Canon was created. The Drakmar collection contains a previously unknown version of the Old Tantra Collection, a rare find for scholars. These discoveries are reshaping our understanding of early Buddhist literature and the history of the Tibetan canon.
These two collections were then made available online in open access on the BDRC website (MW2KG229028 for Namgyal, MW1BL10 and MW1BL6 for Drakmar), thanks to a very productive collaboration agreement between TMPV and BDRC, and to all parties’ commitment to open access.
During our exploration of OCR, we found that our system performed poorly on the Namgyal collection. We nevertheless identified it as an important target, as it represents one of the earliest large-scale collections of canonical texts, predating the better-researched Derge edition by nearly 400 years.
In order to make systematic and measurable progress on its OCR system, BDRC hired Pentsok, a very well-connected specialist in Tibetan digital fonts, to create a typology of Tibetan writing styles (stay tuned for more).
Bhod Dhorchang (pseudonym) is a young graphic and type designer from Tsolho, Amdo. He has designed nine new Tibetan fonts, and has turned the Uchen hand of the Tibetan polymath Gedun Chophel and a Dunhuang manuscript style into digital fonts, under the font family name “Bhozuk”. In his type design work, he is interested in challenging Tibetans’ perception of what Tibetan fonts can be in the contemporary era. After exploring experimental fonts, he has turned to recreating, as digital fonts, historical writing styles that fell out of fashion.
He showed Pentsok some manuscript fragments in a very old script style, which he had received from Western Tibetologists but which were not enough for him to create all the glyphs needed for a working font. Pentsok recognized the script style as the one found in the old manuscripts of the Namgyal and Drakmar collections and provided him with one volume from each collection. Pentsok also advised him on the creation of the font.
They produced the font Bhozuk Katenma as a side project and released it in open access (download link) on August 13.
Pentsok wrote a thorough review of the font, in Tibetan.
In turn, this font will allow BDRC to generate synthetic data (essentially, images of text rendered with the font) for its future exploration of OCR systems, and hopefully to produce a high-quality transcription of the very precious Namgyal and Drakmar collections. Ideally, TMPV will then use these transcriptions and integrate them into the rKTs database, bringing the story full circle.
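For readers curious about what generating synthetic data involves, here is a minimal sketch of the bookkeeping side, under stated assumptions: it is not BDRC's actual pipeline. Every name in it (`SampleSpec`, `plan_samples`, the file naming, the font sizes and the skew range) is hypothetical and chosen only for illustration. The point it shows is that each synthetic image is paired with a transcription that is known by construction, since the image is produced from the text rather than the other way around.

```python
import random
from dataclasses import dataclass


@dataclass(frozen=True)
class SampleSpec:
    """Recipe for one synthetic training image and its known transcription."""
    image_name: str      # file a renderer would write (hypothetical naming)
    text: str            # ground-truth transcription, known by construction
    font_size: int       # size at which the line would be rendered
    rotation_deg: float  # small skew imitating an imperfect scan


def plan_samples(lines, variants_per_line=3, seed=0):
    """Pair every text line with several randomized rendering settings."""
    rng = random.Random(seed)  # seeded so the same plan can be regenerated
    specs = []
    for i, text in enumerate(lines):
        for v in range(variants_per_line):
            specs.append(SampleSpec(
                image_name=f"line{i:05d}_v{v}.png",
                text=text,
                font_size=rng.choice([24, 32, 40]),          # assumed sizes
                rotation_deg=round(rng.uniform(-2.0, 2.0), 2),  # assumed range
            ))
    return specs


if __name__ == "__main__":
    # One Tibetan line, two rendering variants.
    for spec in plan_samples(["བཀྲ་ཤིས་བདེ་ལེགས།"], variants_per_line=2):
        print(spec)
```

The rendering step itself is deliberately left out. Tibetan stacked letters need a renderer with complex text shaping, so that part depends on the tooling chosen.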
This type of behind-the-scenes interaction is what makes the field so vibrant, and we’re very proud to be a modest node in this network!