International URLs Pass First Test
Off the Rails writes "The BBC reports on the results of a successful test of non-ASCII domain names on Internet-equivalent hardware (pdf) carried out last October. The next stage is to plug the system into the net, and if it still works, it could go live sometime next year. 'Early work on the technical feasibility of using non-English character sets suggested that the address system would cope with the introduction of international characters; tests were called for to ensure this was the case ... Also needed are policy decisions by Icann on how the internationalised domain names fit in and work with the existing rules governing the running of the address books. Icann is under pressure to get the international domain names working because some nations, in particular China, are working on their own technology to support their own character sets.'"
Great (Score:5, Funny)
Re:Great (Score:5, Funny)
Re: (Score:3, Funny)
Re: (Score:2, Insightful)
Re: (Score:3, Funny)
Re: (Score:1, Funny)
It is? You vanilla people are weird.
Re: (Score:3, Funny)
Re: (Score:2)
Phishing just got a lot more interesting (Score:5, Funny)
Re:Phishing just got a lot more interesting (Score:4, Informative)
This is already happening. A common example is the Cyrillic lowercase "а", which looks almost exactly like the Latin "a" in most fonts.
See http://en.wikipedia.org/wiki/IDN_homograph_attack [wikipedia.org] for more information.
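To make the lookalike concrete, here is a minimal Python sketch that compares the two letters by code point. The `describe` helper and the `example.com` lookalike are made up for illustration; only the standard `unicodedata` module is assumed.

```python
import unicodedata

latin_a = "\u0061"     # the ordinary ASCII letter
cyrillic_a = "\u0430"  # the Cyrillic lookalike

def describe(ch):
    """Return the code point and official Unicode name of one character."""
    return f"U+{ord(ch):04X} {unicodedata.name(ch)}"

# Hypothetical lookalike: "example.com" with its "a" swapped for the Cyrillic letter.
real = "example.com"
fake = "ex" + cyrillic_a + "mple.com"

print(describe(latin_a))
print(describe(cyrillic_a))
print("same string?", real == fake)
```

The two names have the same length and render almost identically, but they compare unequal, so to the DNS they are simply different names.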
Re:Phishing just got a lot more interesting (Score:5, Funny)
Re:Phishing just got a lot more interesting (Score:5, Funny)
Didn't you even read the post? When it's lowercase. Duh.
Re: (Score:3, Informative)
Re: (Score:2, Informative)
Re: (Score:1)
Re: (Score:2)
Re: (Score:2)
To me, the best approach would be to limit the character sets that can appear below any given TLD. So, having simplified Chinese under ".cn" would be fine, but not under ".com" or ".us" -- The idea be
Re: (Score:2)
Passwords aren't any better, done right. (Score:2)
The problem of certificate management is, IMO, actually more tractable than the problem of password management. There are lots of ways that you could allow people to move certificat
Great (Score:1)
Re: (Score:2)
Dibs! (Score:4, Funny)
Re: (Score:3, Informative)
Umm, you do realise this was registered in 2005? Such domains already exist and can be registered today.
The technical test is about having Internationalised Domain Names at the top-level, or root, of the DNS. So then you can have
Re:Dibs! (Score:5, Funny)
So we could theoretically have sex at any level... but this is slashdot, so it's not likely to happen for anyone around here.
Phishing (Score:2, Redundant)
Re: (Score:2)
They'll do the same as is done right now: very little. If you're a company in this day-and-age, you have to register as many variants of your name as you can to ensure that phishers/domain squatters don't get undue traffic from your name. On the other hand, phishers don't necessarily need domain names that are close to their target domain; people don't generally read URLs that closely, just clicking on links they are sent. That's why phishing is still effective despite all the negative publicity.
They could split unicode into sections (Score:3, Insightful)
Then only allow names and queries all from the same character set.
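A rough sketch of what such a same-script rule could look like at registration time, in Python. This is only an approximation under a loud assumption: the standard library does not expose the Unicode script property, so the first word of each character's Unicode name (LATIN, CYRILLIC, GREEK, ...) stands in for it. The function names are hypothetical.

```python
import unicodedata

def scripts_in_label(label):
    """Return a rough set of the scripts used in one DNS label.

    Assumption: the first word of the Unicode character name is used
    as a stand-in for the real script property.
    """
    scripts = set()
    for ch in label:
        if ch.isdigit() or ch == "-":
            continue  # digits and hyphens are shared by every script
        scripts.add(unicodedata.name(ch, "UNKNOWN").split()[0])
    return scripts

def is_single_script(label):
    """True if every letter in the label appears to come from one script."""
    return len(scripts_in_label(label)) <= 1

print(is_single_script("example"))        # all Latin
print(is_single_script("ex\u0430mple"))   # Latin with one Cyrillic letter
```

A check like this only catches mixing. A name spelled entirely in one lookalike script would still pass, so it would have to be combined with other measures.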
Multiple character sets in one URL? (Score:2)
I mean, first of all, in order to use non-Latin characters at all, you have to have some way of transmitting which character set / codepage you want to use. I can't find any place in TFA where they actually describe how this is going to work (although I didn't read the PDF, so perhaps it
Re: (Score:2)
Re: (Score:2)
However, the GGP was specifically talking about the possibility of using multiple character sets in the same URL, which I think would be wholly impractical, and unnecessary given the widespread use of the UCS.
Speaking out of ignorance here: surely some languages require multiple character sets? If you're using English, French, German, or Spanish, fine, ASCII will do fine. But if you're using Polish, Lithuanian, Czech, Māori, or Turkish, you're going to have to use a combination of basic Roman characters plus characters with a variety of diacritics or other modifications, aren't you? I guess there's something I'm missing.
Anyway, even presuming I am missing something, it looks to me like it'd still be pretty e
Re: (Score:3, Informative)
This has actually been discussed to some extent for years. One method is to only allow domains to be registered or displayed in a single language character set, such that a domain name can use latin characters or greek characters, but not both. This can be enforced at registration or when displayed in the browser (the browser can highlight improper URLs). This does not prevent attacks where the entire spelling of the domain is available in an alternate character set. One solution is for the browser to some
Maybe not.. (Score:1)
Re:Maybe not.. (Score:4, Insightful)
Re: (Score:2)
Re: (Score:3, Informative)
Re: (Score:2)
Re: (Score:2)
What about security issues? (Score:1, Redundant)
Testing functionality and behaviour with "good" names is an easy bar to hurdle.
Re: (Score:2)
Re: (Score:3, Insightful)
Phishing attacks mostly work not because people can't see a minute difference between two lookalike letters; they work because as long as nothing is utterly, obviously, grossly out of order people just assume they're in the right place. You can have domain names that aren't even close to the real one, and websites with only superficial similarities to the original and a lot of peop
Re: (Score:2)
Indeed. This makes an existing problem much much bigger.
Phishing attacks mostly work not because people can't see a minute difference between two lookalike letters; they work because as long as nothing is utterly, obviously, grossly out of order people just assume they're in the right place.
And what people see as "obviously out of order" changes as people learn about phishing. It's
Re: (Score:2)
The concern I have with IDNs is that they will make it too easy to produce "lookalike" domains, like "mcrosoft.com".
This really seems like a pretty minor issue to me. Browsers would just need to adopt a policy of flagging URIs with mixed language character sets, highlighting that character in red or something. More dangerous is the new domain land grab as companies grab legitimate domains in other languages that natives feel the real company simply must own, but which the parent company probably does not. This can be addressed by a certificate scheme that ties identity verification to the site, however, and such a sche
Re: (Score:2)
It's not a minor issue, and it's not an insoluble issue, but it's one that needs to be positively and aggressively addressed.
And it's not just browsers: you need to flag these characters in any application that renders internationalized text with or without HTML being an intermediary. Alternatively, registered domai
Re: (Score:2)
Not if it's actually implemented.
But given some of the ratbags running domain registrars, you think they'll bother?
the Japanese set contains the full alphanumeric alphabet
There are always a few special cases. You just deal with them... for example, deny names using just those characters.
Re: (Score:2)
In practice it means "national" URLs. (Score:2)
Re: (Score:3, Informative)
But you will still be able to click them. IDN support is available in most popular browser (although disabled for security issues).
Re: (Score:2)
What browser are you referring to? IDN support is in Firefox, IE, Opera etc. and not disabled, so I am wondering what this most popular browser you are referring to is...
Re: (Score:2)
Those who will have these "international" URLs will almost all be using their national keyboards so they will not be familiar with the US keyboard layout... or other foreign layouts. And umlauts were just one example... what about "ç" (had to paste it myself..) or "". How would they be certain how they're mapped in a fo
Re: (Score:2)
type a " and e and you will get ë
Re: (Score:2)
How often do you ever type in a URL in the first place? You get the link from another web site, from Google, in an email or wherever. And AFAIK, the fallback representation is no less readable and typeable than many current domain names.
Besides, if the website is already in the country's language, you won't be too likely to be interested in it anyway unless you know it (and, presumably, know how t
Re: (Score:2)
And I said "If y
Re: (Score:1)
Re: (Score:2)
No, it means you can't rely on them. Which is pretty much the same for any new technology that requires client support.
A practical way of using these domains is to set up an ASCII one that you advertise, and redirect to the canonical one with the umlauts etc. That way native speakers aren't alienated by the mangling of their language and don't get errors when they type in the rea
Re: (Score:2)
It's not just a "doing business with Americans" (or other Westerners) problem, it's a 'doing business with anyone outside your area' problem. ASCII is the only character set where you have a good chance of ensuring that some other person will be able to type it. I.e.,
Can we have "/..org" now ? (Score:1)
Re: (Score:2)
Re: (Score:2)
I'm sure there's plenty of Unicode characters which look like a period too, so yeah, if you just want it to look like it you're probably fine. At worst the dot could be replaced with a dot at half-line height (which would probably be more accurate to the word "dot" anyway ;) ).
They are not "international urls" (Score:1)
Couldn't they just have encoded it? (Score:2)
Re: (Score:2)
Already been done. See Punycode (RFC3492). The problem with encoding schemes, though, is that they aren't memorable, and hence are problematic to type into, say, the location bar of a browser.
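For a feel of what the encoded form looks like, here is a small Python sketch using the standard library's built-in "punycode" codec on a single label. The label "münchen" is just an illustrative choice.

```python
# Encode one label with the raw Punycode algorithm (RFC 3492).
label = "münchen"
encoded = label.encode("punycode")       # ASCII-only bytes
decoded = encoded.decode("punycode")     # back to the original text

print(encoded.decode("ascii"))
print(decoded == label)
```

The ASCII letters are kept in order, and the position and identity of the "ü" are packed into the suffix after the last hyphen, which is why the result is hard to remember or type by hand.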
Re: (Score:2)
Re: (Score:2)
BIND seems to handle it just fine; I don't know of any problems with UTF-8 in BIND. I still don't get why punycode was invented, and this is the one issue where I agree with Daniel J. Bernstein. See his page [cr.yp.to] on the issue.
Re: (Score:1)
This is exactly what is happening behind the scenes AFAIK. It's called Punycode.
See http://en.wikipedia.org/wiki/Punycode [wikipedia.org]
English "X" vs. Cyrillic "khah" (Score:4, Insightful)
Romanization as DNS lingua franca (Score:2, Interesting)
Re: (Score:3, Interesting)
Re: (Score:2)
Re: (Score:3, Insightful)
Re: (Score:2)
My suspicion is that the only way to deal with this is to completely disallow mixing of languages in the same URL (or at least in the domain name, which should be enforced by the registrars). Anything less leaves far too much room for abuse. Imagine the field-day phishers would have with this: register www.bankofamerica.com with an omicron and a digamma (ok, the lower case wouldn't look right - you know what I mean), and you control a domain visually indistinguishable
Glyph Masquerade (Score:2)
A good complement to the new system to preempt the huge coming problem of "glyph masquerade" would be registrations including a list of the domain name translated into different languages. Or at least a declaration of the home language. Without enforcement (ICANN doesn't even enforce name/address veracity) it won't be pr
Re: (Score:2)
Perhaps you need an automatic translation from English to, say, duh.
Security minded questions (Score:3, Interesting)
Some Unanswered Questions About IDNs ... (Score:3, Interesting)
Excerpt from a post of mine on DNForum regarding IDNs:
http://www.dnforum.com/showthread.php?p=732080 [dnforum.com]
I'm running into a lot of issues that many IDN folks aren't discussing - probably because they've not considered them
Various issues / threats / questions:
- The existence of numerous diverse dialects, even totally different languages, etc. in the same country
- An IDN that contains Western European characters that very closely matches a non-IDN
- Trademark issues
- Language variant (more applicable to Asian languages, etc.) related issues
- What happens when a language variant table changes? How are conflicts handled?
- What happens if a character variant (an IDN [IDL package] technically can comprise multiple character variants [code points]) is released?
- What happens if a reserved character variant is changed to a preferred character variant? While such a change would have little to no effect on affected IDNs (IDL packages), it could result in the appearance of some IDNs changing
- How reliable, especially for those in languages with numerous character variants, will IDN domain resolution be?
- How well will IDN resolution APIs be regulated?
Rambling on, but there are a lot of things that one needs to be aware of with IDNs.
H4x0rs out there rejoice... (Score:1)
Imagine it with different ANSI colors for each char.
Balkanising the internet? (Score:4, Interesting)
If non-Roman domain names become popular, will I still be able to access them, or will they disappear behind untypeable URLs? A search engine may be able to mitigate this problem somewhat, but ATM I sometimes get search results for Japanese-language pages only because my search term is present in the URL.
1: yes, a site can still be useful in this case and no, despite the stereotype it's not just for porn.
Re: (Score:2)
Imagine all the Japanese who don't know English, but have to learn/type English domain names. Very unintuitive for them.
My concern would be for all the internet filtering and firewalling software which explicitly only allows ASCII in HTTP headers.
Re: (Score:3, Informative)
IDN encoding is pure ASCII, in a similar way that MIME email attachments are. The protocol layer never sees anything other than letters, numbers and hyphens. All IDN encoded domains are prepended with "xn--" so that end-user interfaces can tell them apart and convert them back and forth.
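As a quick illustration of that point, here is a minimal Python sketch using the standard library's "idna" codec, which implements the older IDNA 2003 rules. The `to_wire` helper, the `LDH` pattern and the name "bücher.example" are made up for the example.

```python
import re

# Letters, digits and hyphens only: what the protocol layer gets to see.
LDH = re.compile(r"^[a-z0-9-]+$")

def to_wire(name):
    """Convert a Unicode domain name to its ASCII form, label by label."""
    return name.encode("idna").decode("ascii")

wire = to_wire("bücher.example")
labels = wire.split(".")

print(wire)
print(all(LDH.match(label) for label in labels))
print(wire.encode("ascii").decode("idna"))  # convert back for display
```

Only the label that contained a non-ASCII character picks up the "xn--" prefix; plain ASCII labels go through unchanged, which is how software can tell the two apart.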
Re: (Score:2)
Re: (Score:3, Interesting)
Bad example.
The Japanese are probably the *least* likely of any non-English speaking country to use non-Roman URLs. The fact is the standard Japanese keyboard is the same exact QWERTY keyboard we use. They can type Japanese through software, which is how they normally work when writing to each other, but there's nothing "non-intuitive" in using an English keyboard in the way that it was
Re: (Score:1)
First Test? (Score:2)
I heard of this long time ago (Score:2)
Some have been working on this for a while... (Score:1)
As a side note, it's interesting that Slashdot says this link is at cr.yp.to.
fingering fun (Score:1)
if there is going to be some traditional ASCII alternative url.. then just what are we doing?
i am all for versatility, but there is always talk about unification, this would just segregate the web into 'things i can type' and 'things i can't'
and considering that html is in american, and that most people take into account that english is a very common language when designing a page, are we not just creating some no
Re: (Score:2)
You're not. Same way there is no convenient way to write English chars on a Russian keyboard. There's nothing to do but switch charset and try to remember where the characters were.
The more important question is, why should countries with completely different alphabets than us be forced to use our alphabet? Right now, Russians ha
Re: (Score:2)
so how am i, on my gb keyboard suppose to conveniently type in all sorts of foreign characters?
What are you saying -- that you actually TYPE URLs into the address bar? Have you never heard of del.icio.us? Or bookmarks? Or clicking on a link?
Already done (Score:3, Interesting)
Once again, committees lag behind actual problems and actual solutions.
Now if you'll excuse me I'll go back to browsing
(I seem to recall that
Hogwash and a waste of time... (Score:2)
I don't know (Score:2)
Re: (Score:2, Interesting)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
I don't see this as being very popular. Does the average Internet user know how to get an umlaut to display?
Yeah. All those people of the world who speak languages that use those characters have no clue how to actually type them in. Are you freaking stupid?
Re: (Score:2)
Re: (Score:2)
Re: (Score:2)
Has anyone found a list of the new characters they are planning to allow? There are loads of ASCII ones currently banned, and I'd like to know if this would allow a backdoor to registering some English domains that I might want, such as Andy_R.com
Re: (Score:2)
Re: (Score:2)
Strangely, accessing ©.com in IE directed me to an advertisement for VeriSign's IDN client software [idnnow.com]. xn--gba.com works just fine in IE though.