Hello out there,

I have followed your discussion and would like to bring up another topic once more. As I have learned from several discussions and readings, the sorting problem is still not solved very well. Specifying sorting rules can be a tedious task that is error-prone in many cases. Based on an evaluation of the ISO standard (see our homepage) I have developed the following concept, which I partially implemented this weekend.

1. Letters are entities that own properties which can be used for sorting purposes. A letter can be defined with the following declaration:

     (define-letter "umlaut-u with circumflex" (:case lower) (:accent circ) (:letter "u"))

   This defines the letter "umlaut-u with circumflex" with the properties given above. Another example is

     (define-letter "umlaut-U with trema" (:case upper) (:accent trema) (:letter "u"))

2. Sorting is done on a sequence of partial orderings that together should result in a total order. Partial orders can be defined with definitions such as

     (define-partial-order :letter ("a" "b" "c" ... "u" "v" ...))
     (define-partial-order :case (upper lower))
     (define-partial-order :accent (trema acute circ tilde hat))

   The names of the partial orders refer directly to the property names above.

3. A total order can be specified with the declaration

     (define-total-order (:letter) (:accent backwards) (:case))

   This sorts a word (a sequence of letters) first according to the weights given by the partial order :letter, then according to the weights from :accent, compared backwards (this is the French sorting order), and finally according to :case.

As long as our sorting model is based on this scheme, we are done. Still missing is an appropriate mapping that transforms a string (a sequence of chars) into a sequence of letters (which have now become real objects).
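To make points 1 to 3 concrete, here is a minimal Python sketch of the scheme. It is only an illustration, not the implementation mentioned above. The names `Letter`, `PARTIAL_ORDERS`, `TOTAL_ORDER`, `weight` and `sort_key` are invented for this example, and letting a letter without an accent sort before every listed accent is an assumption.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Letter:
    """A letter object carrying the properties used for sorting."""
    letter: str
    case: str = "lower"
    accent: Optional[str] = None


# One partial order per property: the position in the list is the weight.
PARTIAL_ORDERS = {
    "letter": list("abcdefghijklmnopqrstuvwxyz"),
    "case": ["upper", "lower"],
    "accent": ["trema", "acute", "circ", "tilde", "hat"],
}

# The total order: (property name, compare backwards?)
TOTAL_ORDER = [("letter", False), ("accent", True), ("case", False)]


def weight(prop, value):
    # Assumption: a missing property value sorts before all listed values.
    if value is None:
        return 0
    return PARTIAL_ORDERS[prop].index(value) + 1


def sort_key(word):
    """Build one tuple of weights per level of the total order."""
    key = []
    for prop, backwards in TOTAL_ORDER:
        weights = [weight(prop, getattr(l, prop)) for l in word]
        if backwards:
            weights.reverse()  # compare from the end of the word
        key.append(tuple(weights))
    return tuple(key)


words = {
    "côté": [Letter("c"), Letter("o", accent="circ"), Letter("t"), Letter("e", accent="acute")],
    "coté": [Letter("c"), Letter("o"), Letter("t"), Letter("e", accent="acute")],
    "côte": [Letter("c"), Letter("o", accent="circ"), Letter("t"), Letter("e")],
    "cote": [Letter("c"), Letter("o"), Letter("t"), Letter("e")],
}
ordered = sorted(words, key=lambda name: sort_key(words[name]))
print(ordered)  # ['cote', 'côte', 'coté', 'côté']
```

All four words tie on the :letter level, so the backwards comparison of accents decides the order: the accent closest to the end of the word counts most.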
Such a mapping could look like:

     (define-mapping "umlaut-u" ("\~"u" "ü"))
     (define-mapping "umlaut-A" ("\~"A" "Ä"))

[I hope you can see the ISO-Latin chars as well]

What I was just discussing with Gabor is the problem of markup (once more). Indexes often contain commands such as "\index" (see for example the LaTeX Companion), for which different index entries must be specified: the command "\index" and the word "index" are sorted as

     a) <i markup=cmd><n markup=cmd><d markup=cmd><e markup=cmd><x markup=cmd>

versus

     b) <i><n><d><e><x>

Here the <...> notation denotes a letter object with additional properties. A partial order

     (define-partial-order :markup (cmd other))

can then be used to resolve the remaining ambiguities. The question remains how to define the mapping

     "\index" -> a)
     "index"  -> b)

Two schemes seem possible:

1. The mapping is based on string or regexp transformations (such as the current sort-rules), extended with mapping rules. Informally, we could say that "\index" must be written as "\cmd{index}" and that there is a mapping rule

     (define-mapping "\cmd{(.*)}" "\1" :with-property (:markup cmd))

   indicating that the replacement text "\1" is further mapped onto letters that carry the additional property (:markup cmd). This needs a flexible and dynamically configurable parser (not too hard to implement).

2. We tackle the problem the other way around. This relates to the discussion about the \indexindy command. Something like

     \indexindy[markup=texttt,...]{foo}

   instead of

     \indexindy[...]{\texttt{foo}}

   could solve the problem. The markup is no longer embedded in the plain keyword, so a scanner is no longer necessary. The markup can be produced in the markup backend with something like

     (markup-keyword :markup "texttt" :open "\texttt{" :close "}")

   This would effectively yield the same results. Its drawback is that no more than one markup can be associated with a keyword, though needing more than one seems to be rare.

Any comments on this topic are very welcome.
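For scheme 1, a rough Python sketch of how a mapping rule with a property might behave is shown here. It is an illustration under several assumptions, not a proposal for the real parser: the names `Letter`, `CMD_RULE`, `to_letters` and `sort_key` are invented, the rule is applied only when it matches the whole entry, plain text gets the markup value `other` by default, and plain character comparison stands in for the :letter partial order.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Letter:
    """A letter object with an additional markup property."""
    letter: str
    markup: str = "other"  # assumption: default for unmarked text


# Stand-in for: (define-mapping "\cmd{(.*)}" "\1" :with-property (:markup cmd))
CMD_RULE = re.compile(r"\\cmd\{(.*)\}")

# Stand-in for: (define-partial-order :markup (cmd other))
MARKUP_ORDER = ["cmd", "other"]


def to_letters(text):
    """Map a string onto letter objects, attaching (:markup cmd) if the rule matches."""
    match = CMD_RULE.fullmatch(text)
    if match:
        return [Letter(ch, markup="cmd") for ch in match.group(1)]
    return [Letter(ch) for ch in text]


def sort_key(word):
    # First level: the letters themselves; second level: the markup weights.
    letters = tuple(l.letter for l in word)
    markup = tuple(MARKUP_ORDER.index(l.markup) for l in word)
    return (letters, markup)


entries = ["index", r"\cmd{index}"]
ordered = sorted(entries, key=lambda s: sort_key(to_letters(s)))
print(ordered)  # the \cmd{index} entry comes first, then the plain word
```

Both entries map onto the same letters i, n, d, e, x, so only the :markup level separates them, and `cmd` precedes `other` in that partial order.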
Please let me know which solution you prefer. If there are open questions, ask me. Maybe I'm so deep into this stuff that my explanations are not understandable :)

Thanks for your patience.

Bye

--
======================================================================
Roger Kehr
kehr@iti.informatik.th-darmstadt.de
Computer Science Department
Technical University of Darmstadt