Converts Tibetan dictionaries stored in text files
into a binary file tree structure format, to be used
by some implementations of the SyllableListTree.
Syntax (Dictionary files are assumed to be .txt. Don't include extensions!):
- If no delimiter option is given, it is assumed that each line is one entry
(no multiple-line entries) and that the definiendum and its definition are
separated by '-' (a dash). Although it is not required, it is highly
recommended to put a space before and after the dash (to eliminate any
possible ambiguity with regard to the transliteration of reverse vowels in
Extended Wylie). Sample entries for the dictionary are:
bkra shis - 1) auspiciousness, good luck, good fortune, goodness, prosperity, happiness. 2) auspicious, favorable, fortunate, successful, felicitous, lucky. 3) verse of auspiciousness; benediction, blessing. 4) a personal name.
bde legs - 1) goodness, happiness, well-being, wellfare, auspiciousness, good fortune. 2) well, fine.
If this were the content of a file called "my-glossary.txt", the
binary tree file would be generated with the command:
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.BinaryFileGenerator my-glossary
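The reason for the recommended spaces can be illustrated with a small sketch. It is not the generator's actual parsing code; the class name DashEntrySketch and its split method are hypothetical, and the rule "prefer a spaced dash, fall back to the first bare dash" is an assumption made for illustration. Because Extended Wylie writes reversed vowels with a dash inside the word (for instance "-i"), a bare dash may belong to the definiendum, whereas a dash surrounded by spaces is unambiguous.

```java
import java.util.Arrays;

final class DashEntrySketch {
    /**
     * Splits one dictionary line into {definiendum, definition}.
     * Assumption (not taken from BinaryFileGenerator): a spaced dash " - "
     * is preferred as the separator; the first bare '-' is only a fallback.
     */
    static String[] split(String line) {
        int pos = line.indexOf(" - ");
        int width = 3;
        if (pos < 0) {
            // No spaced dash: fall back to the first bare dash, which may be wrong
            // if the word itself contains a dash (e.g. a reversed vowel).
            pos = line.indexOf('-');
            width = 1;
        }
        if (pos < 0) {
            throw new IllegalArgumentException("no dash delimiter in: " + line);
        }
        String word = line.substring(0, pos).trim();
        String def = line.substring(pos + width).trim();
        return new String[] { word, def };
    }

    public static void main(String[] args) {
        // The hyphen in "well-being" is not mistaken for the delimiter.
        System.out.println(Arrays.toString(
                split("bde legs - 1) goodness, happiness, well-being. 2) well, fine.")));
    }
}
```

With the spaced form, hyphens inside the definition (as in "well-being") or inside the transliterated word are left alone.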
- -tab: it is assumed that each line is an entry (no multiple-line
entries) and the definition and definiendum are separated by '\t' (a horizontal tab).
One tab is enough; there is no need to "align" the definitions in your
word processor. Sample entries for the dictionary are:
bkra shis 1) auspiciousness, good luck, good fortune, goodness, prosperity, happiness. 2) auspicious, favorable, fortunate, successful, felicitous, lucky. 3) verse of auspiciousness; benediction, blessing. 4) a personal name.
bde legs 1) goodness, happiness, well-being, wellfare, auspiciousness, good fortune. 2) well, fine.
Here, the binary tree file would be generated with the command:
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.BinaryFileGenerator -tab my-glossary
- -string: it is assumed that each line is an entry (no multiple-line
entries) and the definition and definiendum are separated by the character or
string of characters specified by the user. The delimiter itself is written
directly after the dash (for example, -** when entries are separated by "**").
Sample entries for the dictionary are:
bkra shis ** 1) auspiciousness, good luck, good fortune, goodness, prosperity, happiness. 2) auspicious, favorable, fortunate, successful, felicitous, lucky. 3) verse of auspiciousness; benediction, blessing. 4) a personal name.
bde legs ** 1) goodness, happiness, well-being, wellfare, auspiciousness, good fortune. 2) well, fine.
Here, the binary tree file would be generated with the command:
java -cp DictionarySearchStandalone.jar org.thdl.tib.scanner.BinaryFileGenerator -** my-glossary
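The tab and user-defined delimiters can be handled by one routine. The following is a minimal sketch under stated assumptions, not the generator's own code: the class DelimiterSketch and its split method are hypothetical names, and splitting at the first occurrence of the delimiter is an assumption. It uses a literal search rather than String.split, because String.split interprets its argument as a regular expression and a delimiter such as "**" is not a valid pattern.

```java
import java.util.Arrays;

final class DelimiterSketch {
    /**
     * Splits a line at the first literal occurrence of the delimiter.
     * Returns {definiendum, definition}, or null if the delimiter is absent.
     * Hypothetical helper for illustration only.
     */
    static String[] split(String line, String delimiter) {
        int pos = line.indexOf(delimiter);   // literal match, no regex involved
        if (pos < 0) {
            return null;
        }
        return new String[] {
            line.substring(0, pos).trim(),
            // trim() also discards any extra tabs used to "align" definitions
            line.substring(pos + delimiter.length()).trim()
        };
    }

    public static void main(String[] args) {
        System.out.println(Arrays.toString(split("bde legs ** 1) goodness. 2) well, fine.", "**")));
        System.out.println(Arrays.toString(split("bde legs\t1) goodness. 2) well, fine.", "\t")));
    }
}
```

If a regular-expression API is needed anyway, wrapping the delimiter in java.util.regex.Pattern.quote avoids the same problem.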
- -acip: it is assumed that the electronic file is a transliteration of
a Tibetan dictionary. The option is called "acip" because it accepts Acip's comment
codes ('@' to mark page numbers, brackets to mark comments, etc.). Nevertheless,
it still requires the files to be in Extended Wylie, so if your file is in
Acip's transliteration scheme, make sure to run org.thdl.tib.scanner.AcipToWylie
first. Definitions here can span multiple lines, but with no blank lines in
between. It is assumed that the definiendum starts after a blank line (except at
the beginning of a new page, where the text may start with the last part of the
previous definition) and runs up to the shad (except when the shad is omitted
because of grammar rules, for instance after a "ga" suffix without a secondary
suffix). Each time a new letter starts, it should be clearly marked in brackets
('[', ']'), parentheses ('(', ')') or curly braces ('{', '}'). A sample entry
for the dictionary is:
@1
(ka)
ka ba/ gdung 'degs don byed nus pa/
rkyen/ grogs byed
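The -acip rules described above can be sketched roughly as follows. This is a simplified illustration, not the generator's implementation: the class AcipEntrySketch is a hypothetical name, the shad is assumed to appear as '/' in the transliteration, and the two exceptions mentioned in the description (a page that begins with the tail of the previous definition after a blank line, and a shad omitted for grammatical reasons) are deliberately not handled. The input in main is arranged with explicit blank lines purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;

final class AcipEntrySketch {
    /** Returns {definiendum, definition} pairs; entries are separated by blank lines. */
    static List<String[]> parse(String text) {
        List<String[]> entries = new ArrayList<>();
        StringBuilder current = new StringBuilder();
        for (String raw : text.split("\n", -1)) {
            String line = raw.trim();
            if (line.startsWith("@") || isMarker(line)) {
                continue;                    // page number or letter heading: skip
            }
            if (line.isEmpty()) {
                flush(current, entries);     // a blank line closes the current entry
            } else {
                if (current.length() > 0) {
                    current.append(' ');     // join the lines of a multi-line definition
                }
                current.append(line);
            }
        }
        flush(current, entries);
        return entries;
    }

    /** True for a line wholly enclosed in [], () or {}. */
    private static boolean isMarker(String line) {
        if (line.length() < 2) {
            return false;
        }
        char first = line.charAt(0);
        char last = line.charAt(line.length() - 1);
        return (first == '[' && last == ']')
            || (first == '(' && last == ')')
            || (first == '{' && last == '}');
    }

    /** Splits the accumulated entry at the first shad ('/', an assumption) and stores it. */
    private static void flush(StringBuilder current, List<String[]> entries) {
        if (current.length() == 0) {
            return;
        }
        String entry = current.toString();
        current.setLength(0);
        int shad = entry.indexOf('/');
        if (shad < 0) {
            entries.add(new String[] { entry, "" });   // no shad found: keep the text as the word
        } else {
            entries.add(new String[] {
                entry.substring(0, shad).trim(),
                entry.substring(shad + 1).trim()
            });
        }
    }

    public static void main(String[] args) {
        String sample = "@1\n(ka)\n\nka ba/ gdung 'degs don byed\nnus pa/\n\nrkyen/ grogs byed\n";
        for (String[] e : parse(sample)) {
            System.out.println(e[0] + " => " + e[1]);
        }
    }
}
```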
- Author:
- Andrés Montano Pellegrini
- See Also:
- SyllableListTree, FileSyllableListTree, CachedSyllableListTree
posHijos
private long posHijos
sil
private String sil
def
private String[] def
delimiter
private static String delimiter
delimiterType
private static int delimiterType
delimiterGeneric
private static final int delimiterGeneric
- See Also:
- Constant Field Values
delimiterAcip
private static final int delimiterAcip
- See Also:
- Constant Field Values
delimiterDash
private static final int delimiterDash
- See Also:
- Constant Field Values
sourceDef
private org.thdl.tib.scanner.DictionarySource sourceDef
- Number of the source dictionary. If 0, the entry is a partial word (no definition).
wordRaf
public static RandomAccessFile wordRaf
defRaf
private static RandomAccessFile defRaf
BinaryFileGenerator
public BinaryFileGenerator()
BinaryFileGenerator
public BinaryFileGenerator(String sil,
String def,
int numDef)
toString
public String toString()
- Overrides:
toString
in class Object
deleteQuotes
private static String deleteQuotes(String s)
addFile
public void addFile(String archivo,
int defNum)
throws Exception
- Throws:
Exception
add
private void add(String word,
String def,
int defNum)
addMoreDef
private void addMoreDef(String def,
int numDef)
equals
public boolean equals(Object o)
- Overrides:
equals
in class Object
printMe
private void printMe(boolean hasNext)
throws Exception
- Throws:
Exception
print
private void print()
throws Exception
- Throws:
Exception
printSintax
private static void printSintax()
main
public static void main(String[] args)
throws Exception
- Throws:
Exception
These API docs were created 02/02/2003 08:20 PM.
Copyright © 2001-2002 Tibetan and Himalayan Digital Library. All Rights Reserved.