xindy

A Flexible Indexing System


Re: Xindy newby intro.



Alan Eugene Davis writes:
 > 
 > First an introduction.  I have been using latex to typeset a lexicon
 > of animal names in Chuukese.  I wanted to be able to specify an
 > arbitrary collating sequence.  With a good bit of help, and over
 > several years of time, I wrote an emacs lisp sort function to sort
 > according to a more or less arbitrary sequence.  Xindy now proposes to
 > do what I have had to kludge to do.
 > 
 > I would like to inquire whether the sort functions themselves are
 > accessible and useable outside xindy.  This could be useful as a
 > generalized sort function.  In fact, it would seem more useful in that
 > context.  Xindy would use the same routine as anyone was using to sort
 > this language.  I think it would be a service to native speakers to
 > enable native linguists in Chuuk to tinker, and come up with what
 > looked right.  Or just teachers.  I like my collating sequence for me,
 > but I am not a native speaker.  I did what xindy is doing---make  "a
 > and a equivalent.  (Rather make "a look to the sort program like an
 > a).  I think that gnu sort may be amenable to rewriting, but I am not
 > a C programmer, so I gave up.

Currently the implementation of the sort routines consists of two
parts.

a) A part written in C that implements tables of sort rules for
   efficient lookup. Its interface allows you to

   - insert rules (character, string, and regexp) into a table (each
     run is realized with one such table),

   - take a character sequence, apply all rules stored in a
     particular table, and return the rewritten character sequence.

   This is the core part which is specified in the following CLISP
   Foreign Function Interface Definition Language:
     
     (def-c-call-out add-keyword-sort-rule
         (:name "add_sort_rule")
         (:arguments (run int)
                     (left c-string)
                     (right c-string)
                     (isreject int)
                     (ruletype int))
         (:return-type int))
     
     (def-c-call-out gen-keyword-sortkey
         (:name "gen_sortkey")
         (:arguments (key c-string)
                     (run int))
         (:return-type c-string :malloc-free))
     
   This part exists as a separate C library (libordrules) with include
   files and everything else needed to use it in another environment,
   so reusing it should not be a problem. Additionally, you need the
   GNU Rx library for the regexp support.

b) The Common Lisp part handles everything else: the management
   tasks that are not performance-critical.
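
To make the division of labour in (a) more concrete, here is a minimal
Python sketch of a rule table: rules are inserted into a table, and a key
is rewritten by applying them. It is not the libordrules code. The class
name, the rule-kind constants, and the matching strategy (first matching
rule at each position, scanning left to right) are assumptions made for
illustration, and the `isreject` argument is left out entirely.

```python
import re

# Hypothetical rule kinds; the C interface passes an integer `ruletype`.
STRING_RULE = "string"
REGEXP_RULE = "regexp"


class RuleTable:
    """One table of rewrite rules, standing in for one sort run."""

    def __init__(self):
        self.rules = []  # (compiled pattern, replacement), in insertion order

    def add_rule(self, left, right, kind=STRING_RULE):
        # String rules match literally; regexp rules are compiled as given.
        pattern = left if kind == REGEXP_RULE else re.escape(left)
        self.rules.append((re.compile(pattern), right))

    def gen_sortkey(self, key):
        # Assumed strategy: at each position, use the first rule that
        # matches; if none matches, copy the character unchanged.
        out = []
        pos = 0
        while pos < len(key):
            for pattern, right in self.rules:
                m = pattern.match(key, pos)
                if m and m.end() > pos:  # ignore empty matches
                    out.append(right)
                    pos = m.end()
                    break
            else:
                out.append(key[pos])
                pos += 1
        return "".join(out)


table = RuleTable()
table.add_rule('"a', "a")                  # make "a look like a
table.add_rule("[0-9]+", "", REGEXP_RULE)  # drop digits from the key
sortkey = table.gen_sortkey('b"ar12')
```

With several runs, one such table would be kept per run and the key
rewritten once for each of them.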

Coming back to your question: it should be easy to implement a new
frontend to that library that reads the rules from one file and a
stream of words from another, and returns all the words in rewritten
form, bypassing xindy entirely. Adapting GNU sort to sort according
to this scheme could also be worth a try, though I haven't looked at
its implementation yet.
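
Such a frontend could look roughly like the following Python sketch. It
does not call libordrules. The rule-file format (one `left<TAB>right`
pair per line, `#` for comments) and all function names are hypothetical,
and only plain string rules are handled.

```python
import io


def read_rules(stream):
    """Parse rules from lines of the assumed form 'left<TAB>right'."""
    rules = []
    for line in stream:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue  # skip blank lines and comments
        left, _, right = line.partition("\t")
        rules.append((left, right))
    return rules


def rewrite(word, rules):
    """Rewrite a word left to right, using the first rule that matches."""
    out, pos = [], 0
    while pos < len(word):
        for left, right in rules:
            if left and word.startswith(left, pos):
                out.append(right)
                pos += len(left)
                break
        else:
            out.append(word[pos])  # no rule applies: copy the character
            pos += 1
    return "".join(out)


def sort_words(words, rules):
    """Sort by the rewritten form while keeping the original spelling."""
    return sorted(words, key=lambda w: rewrite(w, rules))


if __name__ == "__main__":
    rule_file = io.StringIO('# umlauts sort like plain vowels\n"a\ta\n"o\to\n')
    rules = read_rules(rule_file)
    for word in sort_words(['"ober', "zebra", '"apfel', "bar"], rules):
        print(rewrite(word, rules), word, sep="\t")
```

Printing the rewritten form next to the original word is one way to let
an unmodified sort program do the ordering: sort on the first column,
then cut it away.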

 > I wrote routines to sort the index in that order then had to
 > reassemble the index.  So I like the potential to do it another way,
 > with xindy, with a few keystrokes of initial set up.  

This depends heavily on the complexity of your rules. As long as the
rules of the Chuukese language (BTW: where is it spoken?) are
expressible in the current scheme, this shouldn't be a problem. If you
need any help writing rules, contact us.
 
 > Also I would like to ask about possibility for several indices: I'd
 > like to have a scientific name index as well as a headword index.
 > Excuse me if this is in the manual, as I have only quickly looked it
 > over.

We are working on this, and especially on LaTeX interfaces for this
purpose. For the moment I'd suggest filtering the relevant data with
something like sed/awk/perl and running xindy for each index separately.

Give it a try and share your experiences with us.
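
As an illustration of that filtering step, here is a small Python sketch
in place of sed/awk/perl. It assumes the entries are still in LaTeX's
`\indexentry{key}{page}` form and that scientific names were tagged in
the document with a hypothetical `sci:` prefix. Neither the prefix nor
the function name comes from xindy.

```python
import re

# Matches one LaTeX .idx line: \indexentry{key}{page}
ENTRY = re.compile(r"\\indexentry\{(?P<key>.*)\}\{(?P<page>[^{}]*)\}\s*$")


def split_entries(lines, marker="sci:"):
    """Separate entries whose key starts with `marker` from all others.

    The marker is stripped from the entries that carried it, so each
    resulting list can be processed as an ordinary index of its own.
    """
    marked, unmarked = [], []
    for line in lines:
        m = ENTRY.match(line)
        if m and m.group("key").startswith(marker):
            key = m.group("key")[len(marker):]
            marked.append("\\indexentry{%s}{%s}\n" % (key, m.group("page")))
        else:
            unmarked.append(line)
    return marked, unmarked


lines = [
    "\\indexentry{sci:Thunnus albacares}{12}\n",
    "\\indexentry{tuna}{12}\n",
]
scientific, headwords = split_entries(lines)
```

Each of the two lists would then be written to its own file and run
through xindy separately.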

Cheers,
Roger

-- 
======================================================================
Roger Kehr			   kehr@iti.informatik.tu-darmstadt.de
Computer Science Department         Darmstadt University of Technology