Converters in Jskad
In recent versions of Jskad, the 'Tools' menu has an option 'Launch Converter...'. If you use that option, you will find a set of first-class converters that can convert digital Tibetan from one form to another. (A command-line interface is also available; see below.)
Some of the converters there are based on Jskad technology, but all are first-class in the sense that they are well thought-out, well tested, and handle errors nicely. Certain features in Jskad are quite buggy; for example, its keyboards do not always work as desired, and even when they do, they silently drop certain input characters. Do not worry that the converters described here suffer from these flaws; not one character of input is ever silently dropped. It is the intention of the developers that a Buddhist canon could one day be entrusted to these converters. Before you do that, though, please contact the developers to be sure that this documentation is up to date and to develop a custom validation and verification plan. None of the converters has yet been hand-validated on a real text of any size, but extensive unit testing has been performed for each conversion at every stage of development.
The following converters are available:
- ACIP->Unicode (Text->Text)
- ACIP->Tibetan Machine Web (Text->RTF)
- EWTS->Unicode (Text->Text)
- EWTS->Tibetan Machine Web (Text->RTF)
- TMW->ACIP (RTF->RTF)
- TMW->ACIP (RTF->Text)
- TM->TMW (RTF->RTF)
- TMW->TM (RTF->RTF)
- TMW->Unicode (RTF->RTF)
- TMW->EWTS (RTF->RTF)
- TMW->EWTS (RTF->Text)
Note that the EWTS->Unicode and EWTS->TMW converters are still in development; at present, Wylie Word 2.0 has better EWTS support.
The abbreviations above have the following meanings:
- RTF: Rich Text Format.
- Text: an unformatted text file (in one of several encodings).
- TMW: the Tibetan Machine Web font.
- TM: the Tibetan Machine font.
- Unicode: the Tibetan Unicode characters, mainly those in the range U+0F00-U+0FFF, though other Unicode characters are sometimes included.
- EWTS: Tibetan encoded using the Extended Wylie Transliteration Scheme, a Roman transliteration scheme.
- ACIP: Tibetan encoded using the Asian Classics Input Project (ACIP) Tibetan Input Code, another Roman transliteration scheme.
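As an illustration of the Unicode range mentioned above, the following Java sketch lists the code points in a string that fall outside U+0F00-U+0FFF, which may serve as a rough sanity check on converter output. The class TibetanRangeCheck and its methods are hypothetical names invented for this example; they are not part of Jskad. Whitespace is skipped on the assumption that spaces and line breaks are expected in any text file.

```java
import java.util.List;
import java.util.stream.Collectors;

/** Hypothetical helper, not part of Jskad. */
final class TibetanRangeCheck {
    // Bounds of the Tibetan range named in this document.
    static final int TIBETAN_START = 0x0F00;
    static final int TIBETAN_END = 0x0FFF;

    /** Returns true if the code point lies in U+0F00-U+0FFF. */
    static boolean isInTibetanBlock(int codePoint) {
        return codePoint >= TIBETAN_START && codePoint <= TIBETAN_END;
    }

    /** Collects every non-whitespace code point outside the Tibetan range. */
    static List<Integer> codePointsOutsideTibetanBlock(String text) {
        return text.codePoints()
                .filter(cp -> !Character.isWhitespace(cp))
                .filter(cp -> !isInTibetanBlock(cp))
                .boxed()
                .collect(Collectors.toList());
    }

    public static void main(String[] args) {
        // Three Tibetan letters, a tsheg, a space, and an ASCII digit.
        String sample = "\u0F56\u0F40\u0FB2\u0F0B 1";
        for (int cp : codePointsOutsideTibetanBlock(sample)) {
            System.out.printf("U+%04X%n", cp);
        }
    }
}
```

Because the converters may legitimately emit characters outside this range, a non-empty result is a prompt for inspection rather than proof of an error.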
Invoking the Converters
The converters have a user-friendly GUI that tells you when things go wrong, whether the problem is the lack of a needed glyph in the output font or your having selected the wrong conversion. The GUI is not properly documented here, and probably will not be until you contact the developers and ask them to document it.
To use the GUI, first launch Jskad itself. Then select 'Launch Converter...' from the 'Tools' menu. From there it is, we hope, self-explanatory.
For batch conversions of many files, a command-line interface to the converters may be more suitable than the GUI. In the same JAR file as Jskad, power users will find a command-line utility that can do everything the converter GUI can do. To learn how to invoke it, read the output of this invocation:
java -cp "c:\my thdl tools\Jskad.jar" org.thdl.tib.input.TibetanConverter --help
where you must replace "c:\my thdl tools\Jskad.jar" with the appropriate path on your system.
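When the help invocation is launched from another program, for example as a first step toward scripting batch conversions, passing each argument separately avoids quoting problems with paths that contain spaces. The following is a minimal Java sketch under that assumption. The class ConverterHelp and the default path "Jskad.jar" are hypothetical; only the class name org.thdl.tib.input.TibetanConverter and the --help option come from this document.

```java
import java.io.IOException;
import java.util.List;

/** Hypothetical launcher, not part of Jskad. */
final class ConverterHelp {
    /** Builds the argument list for the documented --help invocation. */
    static List<String> helpCommand(String jarPath) {
        // Each element is passed to the JVM as one argument, so a path
        // containing spaces needs no extra quoting.
        return List.of("java", "-cp", jarPath,
                "org.thdl.tib.input.TibetanConverter", "--help");
    }

    public static void main(String[] args) throws IOException, InterruptedException {
        // Placeholder default; pass the real location of Jskad.jar as the first argument.
        String jarPath = args.length > 0 ? args[0] : "Jskad.jar";
        Process process = new ProcessBuilder(helpCommand(jarPath))
                .inheritIO() // show the converter's help text in this console
                .start();
        System.exit(process.waitFor());
    }
}
```

The options needed for an actual conversion are deliberately not shown here; take them from the help output itself.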
Both the converters and this document are released under the THDL Open Community License Version 1.0.