Saturday, March 27, 2010

I18N, L10N, G11N and things related

Hi y’all:



I am working on a bilingual project and will soon be getting
into the stage where data has to be shown in more than just
English. But this may not be so simple, at least for me, since I
have never worked with databases in this context.



Imagine a table that lists categories of some kind. I have
thought about the following:



1. Have one table of categories for each language - that
sounds crazy.



2. Have a different page/template for each language –
even crazier!



3. A resource bundle with all the category names in two or
more languages; then, in the database, store NOT the category
names but the resource bundle key names, and write a CFC with a
function that retrieves the key name from the database and
extracts the key’s value from the appropriate rb.properties
(bundle) file.
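
For illustration only, option 3 might look roughly like the sketch below. It is written in Python rather than CFML so that it runs standalone, and it is not the CFC itself. The key names (category.food, category.travel), the bundle contents and the function names are all assumptions made up for the example, and the parser handles only plain key=value lines, not the full .properties format.

```python
def parse_properties(text):
    """Parse a minimal subset of the .properties format: key=value lines and comments."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comment lines
        key, _, value = line.partition("=")
        entries[key.strip()] = value.strip()
    return entries


# Hypothetical bundle contents; in practice these would be read from
# one rb.properties file per locale.
BUNDLES = {
    "en_US": parse_properties("category.food=Food & Meals\ncategory.travel=Travel"),
    "es_CO": parse_properties("category.food=Comida\ncategory.travel=Viajes"),
}

# Hypothetical result of the database query: the table stores keys, not labels.
keys_from_db = ["category.food", "category.travel"]


def category_label(key, locale, default_locale="en_US"):
    """Look up a key in the locale's bundle, then the default bundle, then fall back to the key."""
    bundle = BUNDLES.get(locale, BUNDLES[default_locale])
    if key in bundle:
        return bundle[key]
    return BUNDLES[default_locale].get(key, key)


labels = [category_label(key, "es_CO") for key in keys_from_db]
```

Falling back to the default locale and then to the raw key means a missing translation shows up as something visible on the page rather than as an error.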



Of the above, I am inclined to go for #3. But having no
experience in this, I would like to hear from someone about how to
approach this, as I want to do some good planning before embarking
on something that may turn out to be the wrong solution.



Any ideas? Paul Hastings, where are you?



Regards,



Carlos

Why not put the category names for every language in one
table? Something like this:



Category, Language, Text

Food, EN, Food & Meals

Food, IT, Cibo & Cena



We've built out one of our major sites in English, Spanish
and French using the above method. The resource bundle seems like a
good idea, especially if you are going to be sharing those resources
with a non-CF application, but it seems like a database solution
might give you more maintainability.

insuractive wrote:

> Why not put all category names for each language all in 1 table?
> Something like this:
>
> Category: Language: Text
> Food EN Food & Meals
> Food IT Cibo & Cena
>
> We've built out one of our major sites in english, spanish and
> french using the above method. The resource bundle seems like a
> good idea especially if you are going to be sharing those resources
> with a non-CF application, but it seems like a database solution
> might give you more maintainability.



on the contrary, there's a whole ecosystem that's built up around
rb. most professional translation services will either supply an rb
or something that can easily be imported into an rb (very often
XLIFF). there are several very good tools that can manage rb, from
IBM's icu4j rbManager to jason sheedy's cf-based rb manager. for
complex projects, a good rb management tool is a *must*. if you
skimp on professional translation services (which is not often a
good idea), a good rb management tool will really help in managing
in-house translations, tracking what's been translated for what
locales/languages, etc.




Hello, Michael



Thank you very much for replying; due to connection problems,
I had not been able to look at the forums since I posted two days
ago.



I use RbManager, and with all its bugs and quirks, it’s
one of my favorite applications to work with. I agree about the
translations thing; I have yet to see a decent machine
translation.



I do want to look into your database suggestion (and when I
said separate ‘tables’ in my previous post, I meant
‘columns’). But I don’t quite understand how it
works – I was thinking a column for each language, but it
looks to me like you have all languages in one column; and how does
the ampersand come into the picture?



If you have the time, I would really appreciate a little more
explanation about this – I am by no means an expert: most of
what I know about this subject is from Paul Hastings’s chapter
in Ben Forta’s advanced CF book.



I would be curious to look at your trilingual site; if
that’s OK, would you post the URL? Here is the little site I
made using RbManager and CF:
http://www.dircolombia.com/.




If you wish, you are welcome to contact me directly. Hope to
hear something.



Best regards,



Carlos – carlos@timos.com


Hey Carlos,

I can see why the example I used was a little confusing.
First let me start by trying to clarify what I meant. BTW - the
ampersands were just part of the category name, not really anything
to do with the table formatting.



The idea I proposed was to have a table with 3 columns:

Category (your category key goes in here)

Language (what language the text is in)

Category_Text (your translated category text)



So instead of needing a new column for each language, you can
use 1 field (Language) to ID what language the translated text is
in:



Record 1:

Category: Food

Language: English

Category_Text: Food and Meals



Record 2:

Category: Food

Language: Italian

Category_Text: Cibo e Cena



You then can use your SQL statements to pull up the correct
Category_Text for a given language.
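
A minimal sketch of that three-column layout and the lookup query, using Python's built-in sqlite3 module so it can be run as-is. The table name category_translations and the function name are assumptions chosen for the example; the two rows are the records listed above.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE category_translations (
        category      TEXT NOT NULL,  -- the category key
        language      TEXT NOT NULL,  -- language of the translated text
        category_text TEXT NOT NULL,  -- the translated label
        PRIMARY KEY (category, language)
    )
    """
)
conn.executemany(
    "INSERT INTO category_translations VALUES (?, ?, ?)",
    [
        ("Food", "English", "Food and Meals"),
        ("Food", "Italian", "Cibo e Cena"),
    ],
)


def category_text(conn, category, language):
    """Return the translated label for one category, or None if there is no row."""
    row = conn.execute(
        "SELECT category_text FROM category_translations "
        "WHERE category = ? AND language = ?",
        (category, language),
    ).fetchone()
    return row[0] if row else None


italian_label = category_text(conn, "Food", "Italian")
```

The composite primary key on (category, language) keeps each category to one label per language, and adding a language only means adding rows, not altering the table.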



Unfortunately, the site we had running in 3 languages has now
been switched back to English only (due to a business decision
from the higher-ups) - the cost of maintaining current
professional translations wound up being a little too high
for the benefit received.



I think you should definitely check out PaulH's post about rb
resources. I had not worked that much with rb's in the past and
PaulH definitely has a little more experience in that area.

Hi Michael:



Got it! Now your table structure is clear to me.



But I am inclined to keep the key values in the RB, where
they would reside with the rest of the resource keys for the whole
site; that seems more modular, easier to move around and share.



In the case of my Categories table (which is now in Spanish),
I wouldn’t have to do anything to the database, as the names
now in there could become the key names (and Spanish labels in the
RB). And on looking, I realized that the CFC methods I already
have will work for what I need.



I am still hoping paulH will respond, as there may be things
I haven’t contemplated. He rules on this subject.



Too bad your site has reverted to just one language. I
don’t know what your product is or who the audience is, and I
don’t know about the French/Italian numbers, but it seems to
me the Spanish and Japanese markets should not be ignored by anyone
in the USA selling on the www.



Anyway, thank you very much for your help; it gave me an
alternative I hadn’t thought about for future jobs.



Have fun.



Carlos


Happy to help, Carlos. Good luck on your project.

Hello, Michael (insuractive), et al:



It looks like I have worked out my language issues with
<cfselect>, and your assistance was instrumental in coming up
with the solution.



After exploring the thing, I opted for going the database
route as you originally suggested, with some modifications. So far,
so good; I have tested it several times and it looks like
it’s working like a charm. If you have the time to check it
out, I would like to know what you think – and that goes for
anyone else interested in this subject.



I have posted the explanation of what I did on
my blog, under the title
“I18N/G11N/L10N with CFSELECT”. Again, thank you very
much for your invaluable help.




Best regards,



Carlos


Glad I could help.

cheftimo,



Nice work on the Colombia site. I need some clarification
though.



Did you end up using resource bundles?

Is the translated text stored in the db or in a resource
bundle?

How did you break out each page's translation: with a db value
pointing to the appropriate RB?



Thanks for your help,



Jeff


Hello Jeff See4th:



I use both. In most of the site, the translated text is in a
resource bundle (RB), but in the template where my category list is
displayed by a <CFSELECT>, the translations are in a db, and
the appropriate labels are fetched for the current locale.



However, my approach differs from that suggested by
‘insuractive’ in his answer at the top of the forum
thread: I have a column of labels for each language and the columns
are named according to the locale. In the Colombian site case, the
columns are named ‘categoryName_en_US’ and
‘categoryName_es_CO’. The list of category names
fetched depends on the locale that is being used when the query is
executed. Doing it this way, when you work with the table directly
in the db, you can see (or edit) the labels side by side.
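
A rough sketch of the column-per-locale idea, in Python with sqlite3 rather than CFML so that it runs standalone. Only the two column names come from the description above; the table name, the sample rows and the function name are assumptions. Because a column name cannot be passed as a bound query parameter, the sketch picks it from a fixed whitelist instead of building it from user input.

```python
import sqlite3

# Whitelist of supported locales and the label column each one maps to.
LABEL_COLUMNS = {
    "en_US": "categoryName_en_US",
    "es_CO": "categoryName_es_CO",
}

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE categories ("
    "id INTEGER PRIMARY KEY, "
    "categoryName_en_US TEXT, "
    "categoryName_es_CO TEXT)"
)
# Hypothetical sample rows: both labels for a category sit side by side.
conn.executemany(
    "INSERT INTO categories (categoryName_en_US, categoryName_es_CO) VALUES (?, ?)",
    [("Hotels", "Hoteles"), ("Restaurants", "Restaurantes")],
)


def category_labels(conn, locale):
    """Return (id, label) pairs using the label column for the given locale."""
    column = LABEL_COLUMNS[locale]  # raises KeyError for an unsupported locale
    query = f"SELECT id, {column} FROM categories ORDER BY id"
    return conn.execute(query).fetchall()


spanish_options = category_labels(conn, "es_CO")
```

The (id, label) pairs are the kind of data a select list needs: the id stays the same across locales and only the visible label changes.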



Look at this other site, also bilingual -
http://www.timos.com/. See the
restaurant reviews, the recipes and the cartoon lists – in
all these I use the db because the pages are generated dynamically.
All other text in the site is in an RB.



I have used two different techniques:



1 – If I think visitors are likely to switch locales
more than once or twice, my query retrieves both columns and I cache
it, then create a local variable to store the column name I want to
display, and use the CF ‘Evaluate()’ function to read
the data.



2 – If I think visitors are going to pick a
language/locale and stick with it, my query just retrieves the
appropriate column when the locale is selected – in this
case, the query is not cached, but nothing has to be evaluated.
Evaluate() takes more overhead.



If you end up wanting to get the actual code, contact me
directly – carlos@timos.com – and we will work
something out.



I hope this helps.



Carlos
