Posts

Tibetan Unicode normalization

In this blog post we will document one of the many wonderful technical things developed in the BDRC-MonlamAI OCR project (that resulted in the Tibetan OCR desktop app). Introduction: Tibetan Unicode Normalization? During our experiments with Tibetan encoders for OCR, we created an encoder based on Tibetan stacks. The idea is to make the model “see” the data in a way that is optimal. An intuition we tested is that it would be useful for the model to “see” Tibetan stacks, or glyphs. For instance ...
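As a rough illustration of what splitting text into stacks can look like, here is a minimal Python sketch. It is not the encoder from the OCR project: the function name `split_stacks` is hypothetical, and the rule it uses (attach every Unicode combining mark to the preceding base character) is a simplifying assumption that ignores edge cases such as unnormalized or malformed sequences.

```python
import unicodedata


def split_stacks(text: str) -> list[str]:
    """Group each base character with the combining marks that follow it.

    Assumption for this sketch: Tibetan subjoined consonants and vowel
    signs all have a Unicode general category starting with "M", so a
    stack is approximated as one base character plus its trailing marks.
    """
    stacks: list[str] = []
    for ch in text:
        if stacks and unicodedata.category(ch).startswith("M"):
            # Subjoined letter or vowel sign: extend the current stack.
            stacks[-1] += ch
        else:
            # Base letter, punctuation, digit, etc.: start a new unit.
            stacks.append(ch)
    return stacks


# U+0F56 U+0F66 U+0F92 U+0FB2 U+0F74 U+0F56 U+0F66 (Wylie: bsgrubs)
word = "\u0f56\u0f66\u0f92\u0fb2\u0f74\u0f56\u0f66"
print(split_stacks(word))
```

For this sample word the function returns four units: the second one holds the base letter together with its two subjoined consonants and the vowel sign, while the other three are single letters.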

Sorting Tibetan in LibreOffice

Here is a quick tip to sort Tibetan in a recent LibreOffice using the rules created by BDRC (see blog post about the sorting app for the context). First, download and install a recent LibreOffice; we tested version 25.8 on Linux, but this should work on other platforms too. The most important part is to set the language of the document to Tibetan (PRC) or Tibetan (India). Here’s a method you can use: click on ...

A little app for sorting Tibetan alphabetically

BDRC just released a minimalistic page to sort Tibetan in alphabetical order, in Unicode or Wylie: https://buda-base.github.io/tibetan-sort-js/ The app has very limited functionality, but options for sorting Tibetan simply and online are very sparse, so we thought it would benefit our users to release it! Any feedback can be given on the issue tracker. An old problem Sorting Tibetan automatically is something that was worked on as early as the 1980s, long before Tibetan Unicode was normalized. The earliest documented project was led by Yoshiro Imaeda (then working for the French CNRS) in Bhutan in the late 80s. Among other things, the project set up a 4th Dimension database for the catalog of the National Library of Bhutan, which included alphabetical sorting as a feature. ...

It is still possible to run Corel WordPerfect 3.5 thanks to the amazing preservation work of the Internet Archive

Report from the digital field: archeology and WordPerfect 3.5

BDRC is always active behind the scenes to foster a collaborative and passionate network of institutions and individuals providing practitioners, scholars and translators with the material they need. The story of this blog post starts in the mid-1990s, when a project was undertaken by His Eminence Khochhen Tulku (b. 1937) in Dehradun to publish the Collected Works of the great master Terdak Lingpa (1646-1714). The project resulted in an impressive 16-volume publication in 1998, scanned and put online in open access by BDRC under the number MW22096. ...

The font integrated in the original manuscript, designed by Pentsok W. Rtsang

Interconnections in Tibetan technology: a case study

The field of Tibetan technology is a perfect example of a multi-cultural, multi-disciplinary decentralized network. Most of the time progress is driven by discreet behind-the-scenes interactions, but today we would like to document one such example and put the spotlight on some of the behind-the-scenes figures making the field move forward. The story starts with a major breakthrough by the Tibetan Manuscripts Project Vienna (TMPV) and Christian Luczanits: identifying and scanning in Western Nepal two well-preserved early (13-14th c.) collections of Buddhist Canonical texts, the Namgyal collection and the Drakmar collection. Both are major discoveries in their own right. ...