First Non-Latin TLDs Go Online Today
eldavojohn writes "ICANN today switched on the country code top level domains for Egypt, Saudi Arabia, and the United Arab Emirates, which are the first non-Latin TLDs available and are also fully readable right to left. Slashdot does not support them but you can find the TLDs in the BBC article. ICANN said it had 21 more requests for TLDs in 11 different languages. A quick note — if you do not have the language packs installed, you may experience unpredictable browser behavior in the URL bar. Right now countries like China and Thailand have implemented workarounds to achieve the same effect."
Why not post example (Score:4, Interesting)
Why did the BBC article not include a link to a valid non-latin URL so we could see how our browsers cope? Even if the page is not understandable, it would be nice to know that the pages load.
Re:Non-latin TLDs? (Score:5, Interesting)
While every keyboard can type A-Za-z, that's not true of Chinese or Arabic, so sites using those TLDs will be effectively off-limits to those that aren't "native".
For now, I hope so. Imagine a RTL domain name, coupled with a phishing email telling recipients to visit moc.tfosorcim.[NEWGTLD] that renders as [NEWGTLD].microsoft.com. Won't that be fun?
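One way a mail filter or browser could flag the kind of name described above is to check whether a hostname mixes right-to-left and left-to-right labels. The following is only a minimal sketch using Python's standard unicodedata module; the function names are hypothetical, and the Arabic label written as escapes stands in for [NEWGTLD] as an assumption for illustration. Real IDNA processing applies stricter bidi rules than this.

```python
import unicodedata

# Bidi classes for strongly right-to-left characters:
# "R" (e.g. Hebrew letters) and "AL" (Arabic letters).
RTL_CLASSES = {"R", "AL"}


def label_directions(hostname):
    """Return (label, has_rtl, has_ltr) for each dot-separated label."""
    report = []
    for label in hostname.rstrip(".").split("."):
        classes = {unicodedata.bidirectional(ch) for ch in label}
        report.append((label, bool(classes & RTL_CLASSES), "L" in classes))
    return report


def mixes_directions(hostname):
    """True if the hostname contains both RTL and LTR characters."""
    rows = label_directions(hostname)
    has_rtl = any(rtl for _, rtl, _ in rows)
    has_ltr = any(ltr for _, _, ltr in rows)
    return has_rtl and has_ltr


if __name__ == "__main__":
    # Hypothetical stand-in for an RTL TLD, written with escapes.
    rtl_tld = "\u0645\u0635\u0631"
    print(mixes_directions("microsoft.com"))
    print(mixes_directions("moc.tfosorcim." + rtl_tld))
```

A name that trips this check is not necessarily malicious; it is just a candidate for showing the ASCII form instead of the rendered one.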
Re:Social media IDN fail (Score:3, Interesting)
I have and use .info and .name domains too, but have not seen any problems with them (yet). Maybe some programs don't check RFC-822 (or whatever it is called nowadays) addresses as they should, but this is not new.
Re:Really? (Score:5, Interesting)
Re:Non-latin TLDs? (Score:3, Interesting)
When was the last time you had to type in a relatively unknown URL? (not things like google, gmail, your bank, etc.)
For that matter, when was the last time you had to type a URL of a site in a language which is off-limits to you anyway?...
This might help greatly in popularizing the internet in a large part of the so-called "developing countries", especially since the biggest changes can be expected when ordinary people get the hang of it; they are much more likely to be fluent only in their native language and script. Or imagine the uptake of the internet in the Latin-script world if all URLs were in, say, the Georgian alphabet [wikipedia.org].
ever hear of facebook? twitter? (Score:3, Interesting)
i'm not at all implying that other people care about USA-centric crap, but i'm saying they most definitely are interested in tech that often starts in the usa
there's also the network effect
http://en.wikipedia.org/wiki/Network_effect [wikipedia.org]
more people using a given website simply makes it more compelling, because how many people are in a given social website often defines how useful that site actually is. this puts languages other than english at an automatic, and continuing, disadvantage
even internet tech that started outside the usa, if it gained an international following, say the chan message boards from japan (4chan), icq in israel, or chatroulette in russia, they all migrated to the english web as an inevitable aspect of becoming an international success, and even though they of course have multilanguage abilities and continue to be used in multilanguage ways, their english manifestations are their largest elements
then there is the bizarre phenomenon of paleolithic tech that gets born in the usa, and mostly forgotten there, but continues to live on in other areas
google's orkut started in the usa, but faded, but is huge in brazil, and also india. google relocated orkut from california to belo horizonte
remember friendster? it's still alive and well in malaysia, philippines, indonesia. a malaysian company in fact recently purchased friendster
all i'm saying is we're talking about technology, not culture, and no one believes that being usa-centric is the point or even an aspect of being rooted in the english language
Re:Non-latin TLDs? (Score:2, Interesting)
This way lots of non-English speakers, and even users of non-Latin alphabets, can use the Internet much better than they could before. Only half of the world uses Latin script. The other half is more or less excluded because of the limitations on domain names: just to be able to use the Internet they first have to learn a foreign script (and a phonetic one at that; Chinese, for example, is not phonetic, so that in itself is a huge challenge for a Chinese reader).
But you probably never set foot outside of your own country, let alone into a place where people actually use such a non-Latin script. If you did so you may start to understand why this is a Very Good Thing.
And the Internet is not going to be more "fragmented". When was the last time you visited, say, a Swedish or French or Hungarian web site? Those do not even use a different script, yet I have the feeling that you still cannot read what is written there, so the web is already "fragmented" for you. And as long as you do not learn how to read Arabic or Japanese or Chinese, you are not likely to want to visit any of the web sites that use native Arabic, Japanese or Chinese characters in their names.
Where to go to register a domain? (Score:2, Interesting)
How it works (Score:5, Interesting)
As I maintain my own DNS servers and such, I was curious how this worked. Here's what I learned with 15 minutes of research:
My first stop was to see the root.zone [internic.net] and I looked for these new TLDs, curious to see how they would show up in a Latin-based zone file. Ah, I spotted these odd XN-- zones and then knew what to dig into more.
Take for instance (I pasted a Unicode domain, but Slashcode won't show it) which is handled by ns[1-3].dotmasr.eg.:
$ dig ns (Unicode domain)
; <<>> DiG 9.6.2-P1-RedHat-9.6.2-3.P1.fc12 <<>> ns (Unicode domain)
;; QUESTION SECTION:
;(Unicode domain). IN NS
;; ANSWER SECTION:
(Unicode domain). 3600 IN NS ns1.dotmasr.eg.
(Unicode domain). 3600 IN NS ns2.dotmasr.eg.
(Unicode domain). 3600 IN NS ns3.dotmasr.eg.
If you look in the root.zone file, you will see that the ASCII/Latin version of this zone is really XN--WGBH1C.:
XN--WGBH1C. NS NS1.DOTMASR.EG.
XN--WGBH1C. NS NS2.DOTMASR.EG.
XN--WGBH1C. NS NS3.DOTMASR.EG.
TLD Reserved Domains [wikipedia.org] has a list of the current mappings. ToASCII and ToUnicode [wikipedia.org] are the operations that convert back and forth; that article links to RFC 3490 [ietf.org], which has the nitty-gritty details.
(meh, Slashcode doesn't support Unicode encoding, but I can see the Unicode domain name I am pasting in before I hit Preview in Firefox)
Also, the whole switching from left to right in Latin characters to right to left in some Unicode scripts is odd when trying to edit!
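The ToASCII/ToUnicode round trip described in this comment can be tried with Python's built-in "idna" codec, which implements the RFC 3490 flavor of IDNA. This is only a sketch: the helper names are made up for illustration, and the three Arabic code points are my assumption for the Unicode form of the Egyptian TLD that Slashcode would not display.

```python
# Sketch: convert an internationalized label to its ASCII "xn--" form and
# back, using the standard library "idna" codec (IDNA 2003, RFC 3490).

# Assumed Unicode form of the Egyptian TLD, written as escapes so the
# source stays pure ASCII and is not reordered by a bidi-aware editor.
EGYPT_TLD = "\u0645\u0635\u0631"


def to_ascii(name):
    """ToASCII: Unicode domain name -> ASCII-compatible encoding."""
    return name.encode("idna").decode("ascii")


def to_unicode(name):
    """ToUnicode: ASCII-compatible encoding -> Unicode domain name."""
    # DNS names are case-insensitive and zone files often use upper case
    # with a trailing root dot, so normalise both before decoding.
    return name.rstrip(".").lower().encode("ascii").decode("idna")


if __name__ == "__main__":
    print(to_ascii(EGYPT_TLD))                     # xn--wgbh1c
    print(to_unicode("XN--WGBH1C.") == EGYPT_TLD)  # True
```

The resolver only ever sees the "xn--" form, which is why the root zone file can stay plain ASCII.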
you do see the irony (Score:2, Interesting)
of posting what you just wrote in english, on a usa-started and hosted website
as a dutchman though, you are very much within the western world, which is even more english dominated than the wider world, and your perfect english is an example of that
but as i sit here in midtown manhattan staring out at brooklyn (from breukelen in utrecht), read about the yankees baseball team (from jon quesa: "johnny cheese", how the dutch derisively referred to the english dairy farmers), and contemplate all the kills in the area (creeks), and all the roosevelts in our presidencies, i know that linguistic and cultural influence is a very relative thing indeed
Re:Really? (Score:5, Interesting)
Re:Thats all good (Score:3, Interesting)