Unicode 7.0 Released, Supporting 23 New Scripts

An anonymous reader writes "The newest major version of the Unicode Standard was released today, adding 2,834 new characters, including two new currency symbols and 250 emoji. The inclusion of 23 new scripts is the largest addition of writing systems to Unicode since version 1.0 was published with Unicode's original 24 scripts. Among the new scripts are Linear A, Grantha, Siddham, Mende Kikakui, and the first shorthand encoded in Unicode, Duployan."
  • by Fubari ( 196373 ) on Monday June 16, 2014 @09:33PM (#47251149)
    Fragmented? I haven't heard of any Unicode forks. The people at the Unicode Consortium seem like they're doing OK. Unicode seems pretty backwards compatible; have any of the newer versions overwritten or changed the meaning of older versions (e.g. caused damage)? That isn't true for various ASCII encodings, which are an i18n abomination on the high-bit characters. Or for EBCDIC, which isn't even self-compatible. One of the things I love about Unicode is that the characters (glyphs) stay where you put them, and don't transmute depending on what locale a program happens to run in.

    The larger Unicode becomes, the more fragmented the implementations will be.

    Maybe instead of fragmented, you mean there will be font sets that can't render all of Unicode's characters?
    *shrug* Even if that were a problem, the underlying data is intact and undamaged and will be viewable once a suitable font library is obtained.

    The more fragmented it is, the more errors and incompatibilities will compound. It will get less and less useful, and more and more bulky, and will eventually be as useful as Flash. (Well, it may not be that bad, but still, Flash was all things to all people, and almost universally installed, until it wasn't.)

    Can you give me an example of an incompatibility? I'm not saying there are none, just that I don't know of any, and that in general I've been very pleased with Unicode's stability, compared to other encodings, for doing data exchange.
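The locale-dependence the poster describes is easy to demonstrate: in the old 8-bit encodings the same byte decodes to different characters depending on which code page you assume, while a Unicode code point means the same thing everywhere. A minimal sketch in Python (standard library only):

```python
import unicodedata

# The single byte 0xA4 is ambiguous in legacy 8-bit encodings:
b = b"\xa4"
print(b.decode("latin-1"))       # '¤' (CURRENCY SIGN) under ISO-8859-1
print(b.decode("iso8859-15"))    # '€' (EURO SIGN) under ISO-8859-15

# A Unicode code point is unambiguous regardless of locale:
print(unicodedata.name("\u00a4"))  # CURRENCY SIGN
print(unicodedata.name("\u20ac"))  # EURO SIGN
```

The same byte stream "transmutes" between a generic currency sign and the euro depending on the assumed code page, which is exactly the failure mode Unicode's fixed code point assignments avoid.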

  • Re:Why emoji? (Score:5, Interesting)

    by BitZtream ( 692029 ) on Monday June 16, 2014 @11:12PM (#47251693)

    But they're not "standard" even if Unicode claims they are.

    They are standard in reference to Unicode because the Unicode Consortium defines the Unicode standard. Someone has to be the first to define the standard.

    but there is no central body that dictates exactly what they look like, so that pile of poop symbol will vary depending upon which texting app you use it with

    Yes, those are called fonts, and in case you haven't noticed, that was true before digital computers with silicon microprocessors even existed and has been true for thousands of years.

    The apps that use emojis are not coordinating with any standards body or ensuring that the intended meaning is preserved.

    Apple does, which is why the Messages app already matches the new code points. Google Hangouts seems to work fine as well. Both Messages and Hangouts convert even things like :) into the proper Unicode code point and use standard fonts for display. Sure, some half-assed apps may not work correctly, but anyone that supports Unicode and has the fonts will receive them properly already.

    Emoji is somewhat silly, but it's hardly new; just go ask Japan. Just because you're new to the ballgame doesn't mean it's a new ballgame.
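The code-point-versus-glyph distinction in the exchange above can be checked directly: Unicode assigns U+1F4A9 a fixed identity and name, while each platform's font supplies its own drawing. A small illustration in Python:

```python
import unicodedata

ch = "\U0001F4A9"  # the pile of poop symbol discussed above
print(hex(ord(ch)))          # 0x1f4a9
print(unicodedata.name(ch))  # PILE OF POO

# The named-escape form resolves to the same code point, showing that
# the identity lives in the standard, not in any one app's font:
assert ch == "\N{PILE OF POO}"
```

What varies between texting apps is only the glyph a font draws for that code point; the character itself is pinned down by the standard.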

  • The main problem is the broken CJK (Chinese, Japanese, Korean) support, which has caused numerous ad hoc workarounds and hacks to be developed. In a nutshell, all three languages shared some common characters in the past, but over time they diverged. Unfortunately these characters share the same code points in Unicode, even though they are rendered differently depending on the language. A Japanese and a Chinese font will contain different glyphs for the same character.

    It is therefore impossible to mix Chinese and Japanese in the same plain text document. You need extra metadata to tell the editor which parts need Chinese characters and which need Japanese. There are Japanese bands that release songs with Chinese lyrics and vice versa, and books that contain both (e.g. textbooks, dictionaries). Unicode is unable to encode this data adequately.

    Even the web is somewhat broken because of this. If a random web page says it is encoded with Unicode there is no simple way for the browser to choose a Japanese, Korean or Chinese font, and all the major ones just use whatever the user's default is.

    It really isn't clear how this can be fixed now. Unicode could disunify these characters into separate code points, but a lot of existing software would carry on using the old ones. It's a bit of a disaster, but most Westerners don't seem to be aware of it.
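The Han-unification problem described above can be made concrete: a character such as 直 (U+76F4) occupies a single code point even though Chinese and Japanese typography draw it differently, so plain text carries no record of which form is intended. A sketch in Python, with the language tagging shown as illustrative HTML in a string:

```python
import unicodedata

zh = "直"  # as it would appear in a Chinese text
ja = "直"  # as it would appear in a Japanese text

# Both are the very same code point; the language distinction is gone:
assert zh == ja and ord(zh) == 0x76F4
print(unicodedata.name(zh))  # CJK UNIFIED IDEOGRAPH-76F4

# Recovering the intended form requires out-of-band metadata, e.g. HTML
# language tags that let the renderer pick a suitable font:
print('<span lang="zh">直</span> vs. <span lang="ja">直</span>')
```

This is the "extra metadata" the poster refers to: the code point alone cannot express it, which is why mixed Chinese/Japanese plain text is lossy.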
