Package org.thdl.tib.scanner

Provides the classes to take Tibetan language passages and divide the passages up into their component phrases and words, and display corresponding dictionary definitions.


Interface Summary
SyllableListTree The generic interface for accessing dictionaries.
TibetanScanner Defines the core methods required to provide access to a dictionary, whether local or remote.
 

Class Summary
AboutDialog Window that displays copyright information.
AcipToWylie Provides an interface to convert Tibetan text transliterated in the ACIP scheme to THDL's Extended Wylie scheme.
AlmostDefaultTableCellRenderer Used by DictionaryTable to display multiple lines of text (in Roman script) in a single cell.
AppletScannerFilter Inputs a Tibetan text and displays the words with their definitions through a graphical interface running in a browser over the Internet.
BinaryFileGenerator Converts Tibetan dictionaries stored in text files into a binary file tree structure format, to be used by some implementations of the SyllableListTree.
CachedSyllableListTree Provides the recommended implementation of the SyllableListTree (currently the most efficient memory-speed combination): it loads only the "trunk" of the tree from file into memory and resorts to the disk when searching the rest of the tree.
ConsoleScannerFilter Inputs a Tibetan text and displays the words with their definitions on the console of a shell.
Definitions Stores the multiple definitions (corresponding to various dictionaries) for a single Tibetan word.
DictionaryListSelectionListener Used by the DictionaryTable to display the full definition of a Tibetan word shown in the table when its row is clicked.
DictionarySource Specifies a subset of dictionaries among a set of dictionaries.
DictionaryTable Table of two columns that displays a Tibetan word or phrase (in either Tibetan or Roman script) and the first couple of lines of its definitions.
DictionaryTableModel Stores the words being displayed in a DictionaryTable.
DuffCellRenderer Used by DictionaryTable to display a Tibetan word or phrase (in either Roman or Tibetan script) in a single cell.
DuffScannerPanel Graphical interface to be used by applications and applets to input a Tibetan text (in Roman or Tibetan script) and display the words (in Roman or Tibetan script) with their definitions (in Roman script).
FileSyllableListTree Searches the words directly in a file; not the preferred implementation.
Link Used by LinkedList to provide the implementation of a simple dynamic linked list.
LinkedList Implementation of a simple dynamic linked list.
ListIterator Used by LinkedList to provide the implementation of a simple dynamic linked list.
LocalTibetanScanner Loads a dictionary stored in tree format and searches for words recursively.
Manipulate Miscellaneous static methods for the manipulation of Tibetan text.
MemorySyllableListTree Loads the whole dictionary into memory; not the preferred implementation.
OnLineScannerFilter Interface to provide access to an on-line dictionary through an HTML form; inputs Tibetan text (Roman script only) and displays the words (Roman or Tibetan script) with their definitions.
PunctuationMark Currently used only by LocalTibetanScanner to separate "paragraphs"; eventually it will be one of many tokens representing grammatical parts of sentences that the parser will interpret.
RemoteScannerFilter Running on the server, receives the Tibetan text from applets and applications running on the client and sends them the words with their definitions over the Internet.
RemoteTibetanScanner Used by applets and applications to access remote on-line dictionaries.
ScannerPanel Graphical interface to be used by applications and applets to input a Tibetan text and display the words with their definitions.
SimpleScannerPanel A non-Swing graphical interface to be used by applications running on platforms that don't support Swing, to input a Tibetan text (in Roman script only) and display the words (in Roman script only) with their definitions (in Roman script).
Token Represents a basic grammatical unit; it may seem unnecessary now, but it will make sense once the parser is developed.
WindowScannerFilter Provides a graphical interface to input Tibetan text (Roman or Tibetan script) and display the words (Roman or Tibetan script) with their definitions.
Word Tibetan word with its corresponding definitions.
 

Package org.thdl.tib.scanner Description

Provides the classes to take Tibetan language passages and divide the passages up into their component phrases and words, and display corresponding dictionary definitions.

This tool helps Tibetan-to-English translators partially automate the translation process. In Tibetan, word boundaries are not marked the way spaces mark them in English. Instead, a punctuation mark called a "tsheg" separates each syllable. Thus, while syllable boundaries are explicit, word boundaries are often unclear. One of the main difficulties beginning students have when translating Tibetan texts is therefore figuring out where each word ends and the next begins, and deciding which series of syllables to look up in the dictionary as a single word or as a larger compound phrase. This entails the very time-consuming process of looking up multiple combinations of syllables to determine which are found in a given dictionary.

The tool partially automates that process by breaking up a sentence or paragraph entered in Extended Wylie or Tibetan script into the largest component parts it can find in multiple dictionary databases. For each component found, it displays the stored definitions and related information. This often yields only the definition of a long phrase rather than of its component words, but one can also look up the syllables of that phrase one by one.
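
The "largest component parts" strategy described above can be illustrated with a greedy longest-match loop. The following is only a minimal sketch under stated assumptions, not the code used by the classes in this package: the class name LongestMatchSketch, the method segment, and the use of a plain Map as the dictionary are all hypothetical, and the input is assumed to be Extended Wylie in which a space stands for the tsheg. The three dictionary entries are illustrative only.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical illustration; not part of org.thdl.tib.scanner.
class LongestMatchSketch {

    /**
     * Splits a passage into the longest runs of syllables that appear as
     * keys in the dictionary. A syllable that starts no known entry is
     * emitted on its own so that scanning can continue.
     */
    static List<String> segment(String passage, Map<String, String> dictionary) {
        List<String> parts = new ArrayList<>();
        String trimmed = passage.trim();
        if (trimmed.isEmpty()) {
            return parts;
        }
        // Assumption: whitespace separates syllables in the transliteration.
        String[] syllables = trimmed.split("\\s+");
        int start = 0;
        while (start < syllables.length) {
            int end = syllables.length;
            String match = null;
            // Try the longest candidate first, then drop one syllable at a time.
            while (end > start) {
                String candidate =
                        String.join(" ", Arrays.copyOfRange(syllables, start, end));
                if (dictionary.containsKey(candidate)) {
                    match = candidate;
                    break;
                }
                end--;
            }
            if (match == null) {
                // No entry starts here: emit the single syllable unmatched.
                match = syllables[start];
                end = start + 1;
            }
            parts.add(match);
            start = end;
        }
        return parts;
    }

    public static void main(String[] args) {
        // Illustrative entries only; a real dictionary would be far larger.
        Map<String, String> dictionary = new HashMap<>();
        dictionary.put("sangs rgyas", "buddha");
        dictionary.put("sangs", "purified");
        dictionary.put("chos", "dharma");

        for (String part : segment("sangs rgyas kyi chos", dictionary)) {
            String definition = dictionary.get(part);
            System.out.println(part + " : "
                    + (definition != null ? definition : "(not found)"));
        }
    }
}
```

Because the longest candidate is tried first, "sangs rgyas" is reported as one phrase even though "sangs" is also an entry, which matches the behaviour noted above: the long phrase is returned rather than its component words. Rebuilding each candidate string is wasteful for long passages; a tree keyed by syllable, as the SyllableListTree name suggests, would avoid that cost.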

The tool can run on-line through a:

The tool can also run off-line in:

The classes designed to be run from the command-line are:

Notes on Input:

Author: Andrés Montano Pellegrini

Related Documentation

See Also:
org.thdl.tib.text, org.thdl.tib.input


These API docs were created 02/02/2003 08:19 PM.
Copyright © 2001-2002 Tibetan and Himalayan Digital Library. All Rights Reserved.