Message-ID: <4AB01DF1.80304@bcs.org.uk>
Date: Wed, 16 Sep 2009 00:06:25 +0100
From: Peter Knaggs
Newsgroups: comp.lang.forth
To: forth200x@yahoogroups.com
Subject: RfD: Internationalisation

ANS Forth Internationalisation
==============================

2009-09-16  Remove SUBSTITUTE, now subject to a separate proposal.
2009-09-02  Converted into text file.
2007-06-26  Updated rationale section, LOCALE@, and minor wordsmithing
2005-10-23  Added GET-ENCODING and SET-ENCODING,
            Changed stack action of SUBSTITUTE.
2001-04-25  Added GET-ESCAPE to provide restoration capabilities.
2001-03-25  Minor text changes
1999-06-21  Wordsmithed at ANS meeting
1999-06-20  Tightened up some wording
            Added references to more standards.
1999-06-14  Added an ambiguous condition to SUBSTITUTE.
            Changed COUNTRY and LANGUAGE to SET-COUNTRY and
            SET-LANGUAGE returning an ior.
1999-05-30  Derived from parallel discussion document

Problem
=======

Forth applications designed to run in many countries and languages
cannot yet make enough assumptions about strings and character sets
to be portable. The LOCALE word set is designed to provide words for
portable internationalisation of application programs. The proposal
does not attempt to cover text processing in general, but only to
permit conversion of a limited set of application-defined text for
internationalisation. The proposal is based on techniques used in
large Forth applications for many years.

In practice, many applications are not localised by the software
developer, but by their agents in other countries. The LOCALE word
set permits the software developer to provide tools that will produce
text files that can be edited and converted to another language
locally without dependency on computer language or operating system
specific tools such as resource compilers and managers. At the same
time, the proposed word set does not inhibit the use of sets of
statically compiled strings for each language; it simply does not
define the mechanism.

The basis of the LOCALE word set is that all strings for
internationalisation are compiled as LOCALE structures, and all
access to the strings is through these structures. It appears that
the following word set is adequate in the first place. The word set
is designed to cope with character sets that are of different size
to the native set. The word set is split into a base and extension
sets to indicate what factors need to be language sensitive. It is
also likely that all LOCALE structures will need to be linked in case
reindexing of hash tables or other internal structures is necessary.

The word L" is proposed for language sensitive strings, and behaves
in a similar way to the ANS word C", but returns a string identifier
known as a locale string identifier (lsid) from which the required
language string can be extracted. The reason for this is so that text
information in the native development language is still available in
the source, making source maintenance easier because the intention of
the string is still available to the developer. In addition, the
Forth compiler can be extended to produce a text file containing the
native strings.

The number of items to be displayed which are, or may be, language
sensitive is large. Not all applications may need to deal with all of
them. In addition, many applications need to be able to perform text
substitution, for example:

    Your balance at