Update 07/24/10: The YALT code has been completely rewritten, and the following article doesn't reflect this evolution. YALT 2 encapsulates an L10N object in a more compact way, improves compatibility with session-targeted scripts, and allows the developer to escape Unicode characters with the \u.... syntax (in order to keep the entire script ASCII-encoded). You can also get the active locale as a string by using L10N.locale. Finally, YALT 2.1 abandons the file auto-parsing approach and provides the ability to pass string arguments through %1, %2... placeholders.
Check out the updated article.

ARCHIVED ARTICLE:

Localization (frequently abbreviated to the numeronym L10N) is the process of translating a software interface into appropriate languages or adapting a language for a specific country. A localized application automatically displays its menus, dialogs and message strings in the language of the user's locale.

From a scripting point of view, it is easier to maintain a single file than to deploy multiple translated versions of the same script. Ideally, we would like to code the whole interface using default strings (often in international English) and to attach a translation file to each locale. This mechanism has been very popular since the advent of the “PO format” from the GNU Gettext translation system. It is also used in many script libraries, from PHP to JavaScript.

However, in the InDesign JS context, and to avoid file inclusion issues, we prefer to have a single file containing both the script and the L10N data.

A few words on ExtendScript localization

ExtendScript allows you to localize the text in your script, using “localization objects”: “A localization object is a JavaScript object literal whose property names are locale names and whose property values are the localized text strings.” A locale name is a standard 5-character code in the form LL_RR (for example: en_US, fr_FR...), where LL is an ISO 639 language specifier, and RR an ISO 3166 region specifier.
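
To make the LL_RR pattern concrete, here is a minimal sketch that splits a locale name into its language and region parts. The helper name parseLocaleName is hypothetical (it is not part of ExtendScript), and the check only covers the two-letter forms described above.

```javascript
// Split a locale name of the form LL_RR into its two parts.
// Returns null when the name does not follow that pattern.
// NOTE: parseLocaleName is a hypothetical helper, not an ExtendScript API.
function parseLocaleName(name) {
    // LL: two lowercase letters (language), RR: two uppercase letters (region)
    var m = /^([a-z]{2})_([A-Z]{2})$/.exec(name);
    return m ? { language: m[1], region: m[2] } : null;
}

var loc = parseLocaleName("fr_FR");
// loc.language is "fr" and loc.region is "FR"
```

A name that does not match the pattern, such as "french", yields null, which lets the caller decide how to fall back.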

When you set the localize property of the Dollar ($) object to true, you enable the extended localization features of the built-in toString method. Then you can use this kind of syntax:

// enable ExtendScript L10N automatism
$.localize = true;
 
// create a localization object (for 2 locales)
var locMsg = {
    en_US: "Hello, world!",
    fr_FR: "Bonjour tout le monde !"
    };
 
// displays the localized message
// [ implicit call to toString() ]
alert( locMsg );
 

As explained in the Adobe JavaScript Tools Guide, ExtendScript matches the current locale and platform to one of the object properties and uses the associated string. On a French system, for example, the property fr_FR: "Bonjour tout le monde !" will be converted into the string "Bonjour tout le monde !".
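
The matching step can be pictured with a simplified emulation in plain JavaScript. This is only a sketch: the function pickLocalized and its explicit fallbackLocale parameter are assumptions made for illustration, and ExtendScript's real lookup (which also considers the platform) is not reproduced here.

```javascript
// Simplified emulation of a localization-object lookup.
// ASSUMPTION: exact locale match first, then a caller-supplied fallback locale.
// This is not ExtendScript's actual algorithm.
function pickLocalized(locObj, locale, fallbackLocale) {
    if (locObj.hasOwnProperty(locale)) {
        return locObj[locale];
    }
    if (locObj.hasOwnProperty(fallbackLocale)) {
        return locObj[fallbackLocale];
    }
    return undefined; // nothing suitable in this localization object
}

var locMsg = {
    en_US: "Hello, world!",
    fr_FR: "Bonjour tout le monde !"
};

var onFrenchSystem = pickLocalized(locMsg, "fr_FR", "en_US");
var onGermanSystem = pickLocalized(locMsg, "de_DE", "en_US");
// onFrenchSystem is "Bonjour tout le monde !"
// onGermanSystem is "Hello, world!" (no de_DE entry, so the fallback is used)
```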

The ExtendScript approach is very intuitive, but it requires the scripter to build an object each time they want to display something on the screen. Suppose that your script provides a lush user interface with countless widget captions. You may think that it is easy to centralize all the L10N objects in a global array and to send localization requests to the Magical Translator. It won't be that easy! How can we set the keys of such an array?

Reinventing the wheel

In the first French/English scripts I wrote for InDesign (like IndexBrutal or EanDesign), I got into the habit of setting a number of L10N global variables in a special section named “LINGUISTICS”. Each entry was structured as an array of two strings, the English one and the French one, for example:

var ERR_FILE_OPENING =
    [
    "Unable to open the file",
    "Impossible d'ouvrir le fichier"
    ];
 

Then I declared a LANG variable, defined by:

var LANG = (app.locale == Locale.frenchLocale) ? 1 : 0;
 

So the script was able to access the appropriate translated string, using the syntax ERR_FILE_OPENING[LANG].

But I realized that this method merely replaced the ExtendScript localization objects with just as many localization arrays (and mnemonics)!

The “YALT” Approach

Now I would like to introduce another localization framework. The idea basically consists of using the JS comment syntax to store all the localized strings in one block at the beginning of the script file. Like this:

//======================================
// <L10N> :: FRENCH_LOCALE :: GERMAN_LOCALE
//======================================
// Yes :: Oui :: Ja
// I love :: J'aime :: Ich liebe
// This is a translated text :: Ceci est un texte traduit :: Dies ist ein Text in Übersetzung
// </L10N> ::
 

Since all lines begin with double slashes (//), the “code” above is perfectly silent and harmless. JavaScript can't interpret it, but we can! It is obviously a translation table. The first row provides a generic header for our localization section, specifying the available languages for the script:

//======================================
// <L10N> :: FRENCH_LOCALE :: GERMAN_LOCALE
//======================================
 

" :: " (space + double colon + space) is an arbitrary separator. The first field (<L10N>) has a special meaning: it acts as an opening tag for the L10N process. You can also see it as a placeholder for the first column, corresponding to the default locale. The default locale/language is applied when the user locale is not supported by the script or when a string has no localized counterpart. This is the reason why the first column contains the English (=default) strings, considered as the table keys:

// Yes :: Oui :: Ja
// I love :: J'aime :: Ich liebe
// etc.
 

The first line above means that "Yes" is a default string/key to be translated into "Oui" in French locale, and into "Ja" in German locale. For any other locale, "Yes" will remain "Yes", unless you add a new column (for example, " :: SPANISH_LOCALE") and the translated fields. Each new line specifies a key and the corresponding localized strings, until we reach the </L10N> :: end marker.
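
The rules above (header row, " :: " separator, first column as key, fallback to the default string) can be sketched in a few lines of plain JavaScript. This is an illustrative reading of the table format only, not necessarily the way YALT itself does it: the names stripComment, parseL10N and translate are hypothetical, and the source text is passed in as an ordinary string.

```javascript
var SEP = " :: "; // space + double colon + space

// Remove the leading "//" and the surrounding whitespace from a comment line.
function stripComment(line) {
    return line.replace(/^\s*\/\/\s*/, "").replace(/\s+$/, "");
}

// Parse the commented <L10N> block found in `source`.
// Returns { locales: [...], table: { key: [translation, ...] } }.
function parseL10N(source) {
    var lines = source.split("\n");
    var result = { locales: [], table: {} };
    var inside = false;
    for (var i = 0; i < lines.length; i++) {
        var fields = stripComment(lines[i]).split(SEP);
        if (!inside) {
            if (fields[0] === "<L10N>") {            // opening tag = header row
                result.locales = fields.slice(1);    // remaining fields = locale columns
                inside = true;
            }
            continue;
        }
        if (fields[0].indexOf("</L10N>") === 0) break; // end marker
        if (fields.length < 2) continue;               // ruler line or key without translations
        result.table[fields[0]] = fields.slice(1);     // first column is the key
    }
    return result;
}

// Return the translation of `key` for `localeName`, or the key itself
// when the locale is not supported or the string has no counterpart.
function translate(l10n, key, localeName) {
    var col = -1;
    for (var i = 0; i < l10n.locales.length; i++) {
        if (l10n.locales[i] === localeName) { col = i; break; }
    }
    var row = l10n.table.hasOwnProperty(key) ? l10n.table[key] : null;
    if (col < 0 || !row || !row[col]) return key;
    return row[col];
}

var source = [
    "//======================================",
    "// <L10N> :: FRENCH_LOCALE :: GERMAN_LOCALE",
    "//======================================",
    "// Yes :: Oui :: Ja",
    "// I love :: J'aime :: Ich liebe",
    "// </L10N> ::",
    "var x = 1;"
].join("\n");

var l10n = parseL10N(source);
var german = translate(l10n, "Yes", "GERMAN_LOCALE");    // "Ja"
var spanish = translate(l10n, "Yes", "SPANISH_LOCALE");  // "Yes" (unsupported locale)
var missing = translate(l10n, "No", "FRENCH_LOCALE");    // "No" (no entry for this key)
```

Here the table lives in a string literal; a real script would still have to obtain its own source text and map the user's locale to a column name, which this sketch leaves open.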

OK, this virtual syntax is pretty cool, but how can we make it work in a real script?

Part 2