Unicode doesn't seem to distinguish between tréma and umlaut, but I need to distinguish them. What should I do?

A. For some purposes, it may be necessary to maintain a distinction between tréma and umlaut, for example, in bibliographic records kept by the German library network. For the Latin script, the Unicode Standard does not distinguish identically appearing diacritical marks with different functions: both tréma and umlaut are represented by the same diaeresis character. Encoding them separately would cause confusion in implementations and among users. The character U+034F COMBINING GRAPHEME JOINER (CGJ) may be used to make the sorting, searching, and data mapping distinctions required for umlaut versus tréma. The semantics of CGJ are such that it should affect only searching and sorting, and only on systems that have been tailored to distinguish it; it is otherwise ignored in interpretation. The CGJ character was encoded with this purpose in mind. A sequence consisting of a base letter followed by U+0308 COMBINING DIAERESIS and the same sequence with CGJ inserted before the diaeresis are not canonically equivalent. This means that the distinction will not be normalized away on conversion in and out of bibliographic systems. This eases the interoperabili
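The non-equivalence described above can be checked with a short Python sketch using the standard `unicodedata` module. The helper names (`with_cgj`, `without_cgj`, `canonically_equivalent`, `strip_cgj`) are hypothetical and exist only for this illustration, and which of the two functions (tréma or umlaut) gets the CGJ-marked form is assumed to be a convention chosen by the application.

```python
import unicodedata

CGJ = "\u034F"        # U+034F COMBINING GRAPHEME JOINER
DIAERESIS = "\u0308"  # U+0308 COMBINING DIAERESIS (used for both tréma and umlaut)


def without_cgj(base: str) -> str:
    """Base letter followed directly by the combining diaeresis."""
    return base + DIAERESIS


def with_cgj(base: str) -> str:
    """Base letter, then CGJ, then the combining diaeresis (the marked form)."""
    return base + CGJ + DIAERESIS


def canonically_equivalent(s: str, t: str) -> bool:
    """Two strings are canonically equivalent if their NFD forms are identical."""
    return unicodedata.normalize("NFD", s) == unicodedata.normalize("NFD", t)


def strip_cgj(text: str) -> str:
    """Drop the CGJ marker, e.g. for a system that does not need the distinction."""
    return text.replace(CGJ, "")


plain = without_cgj("a")
marked = with_cgj("a")

# The unmarked sequence is equivalent to precomposed U+00E4; the marked one is not.
print(canonically_equivalent(plain, "\u00E4"))
print(canonically_equivalent(plain, marked))

# NFC leaves the marked sequence as three code points, so the marker survives.
print([f"U+{ord(c):04X}" for c in unicodedata.normalize("NFC", marked)])
```

Because CGJ blocks composition with the base letter, the marked form survives both NFC and NFD unchanged, while a system that does not care about the distinction can call something like `strip_cgj` before comparing or displaying text.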
