Google Releases An Open Source Font That Supports 800 Languages (googleblog.com) 175

Posted by EditorDavid on Sunday October 09, 2016 @09:40PM from the Unicode-complete dept.

An anonymous Slashdot reader quotes Hot Hardware: [Google has] been working on the project over the past five years in collaboration with Monotype in hopes of eradicating so-called "tofu" -- the blank boxes you see when a PC or website can't display a particular text -- from the web. Noto, or No more tofu, is Google's answer, and it's available now to download...

"We are thrilled to have played such an important role in what has become one of the most significant type projects of all time," said Scott Landers, president and CEO of Monotype... Monotype played the biggest role, though Google also collaborated with Adobe and had a network of volunteer reviewers. As far as Monotype is concerned, Noto is one of the most expansive typography projects ever undertaken.
There are 110,000 characters, and Google says the project "required design and technical testing in hundreds of languages."

This discussion has been archived. No new comments can be posted.


Keeping up with the emojis (Score:2)

by Megane ( 129182 ) writes:

Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.
- Re: (Score:1, Funny)
  
  by Anonymous Coward writes:
  
  Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.
  The whole rest of the world just needs to learn fuckin' English!
  Signed,
  Provincial Americans Everywhere (by "everywhere" I mean the USA -- clearly there is no "where" else to be! So what if Canadians and other foreigners understand our politics better than we do. That just shows our awesomeness!)
  - Re: Keeping up with the emojis (Score:2, Funny)
    
    by Anonymous Coward writes:
    
    I just need the Klingon word for mocking condescension to belittle you with.
    - Re: (Score:2)
      
      by paiute ( 550198 ) writes:
      
      Ziplock!
      - Re: (Score:2)
        
        by WallyL ( 4154209 ) writes:
        
        You misspelled p'takh!
    - - Re: Keeping up with the emojis (Score:5, Funny)
        
        by Guppy ( 12314 ) writes: on Monday October 10, 2016 @03:36AM (#53045601)
        
        toDSaH
        Wow, Klingons have a word for everything. They're like Space Germans.
        
  - Re: (Score:2)
    
    by mcswell ( 1102107 ) writes:
    
    Why not tell the rest of the world (including you) to learn Chinese, or Spanish, or Bangla? That's easy, right?
    - Re: (Score:3)
      
      by account_deleted ( 4530225 ) writes:
      
      Comment removed based on user account deletion
      - Re: (Score:2)
        
        by account_deleted ( 4530225 ) writes:
        
        Comment removed based on user account deletion
        
        Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        I'd say that's more of a feature than a bug.
    - - Re: (Score:2)
        
        by Hognoxious ( 631665 ) writes:
        
        English is the current lingua franca
        Je vois ce que vous avez fait là.
        
        Re: (Score:2)
        
        by Thanatiel ( 445743 ) writes:
        
        There is no real French equivalent of "I see what you did there" (i.e. a single expression that acknowledges a witty remark).
        Perhaps one could say "joli" (nice), or "bien dit" (well said) maybe with a smiley next to it ... but it does not feel natural and they do not require more than 7 bits per character.
        Most French-speaking kids would probably end up using the English expression without a clue of how to write it or what it exactly means.
      - Re: (Score:2)
        
        by greenfruitsalad ( 2008354 ) writes:
        
        > it's pretty much the only language you can be sure to find somebody that speaks it no matter where you are
        this is why i'm learning spanish.
- Re:Keeping up with the emojis (Score:5, Informative)
  
  by dmoen ( 88623 ) writes: on Sunday October 09, 2016 @10:33PM (#53044673) Homepage
  
  Bitstream Cyberbit was closed source, and had a license incompatible with GPL. Noto is free and open source. The source files for the fonts, and the build tools, are all open.
  Noto is an ongoing open source project that will continue to track the Unicode standard, while Cyberbit implemented Unicode 1.0.1 and then just stopped.
  Noto has Sans and Serif variants in a range of weights and styles, unlike Cyberbit, which had only a single style and weight (serif).
  So that's more than just "the same thing all over again".
  
  - hells teeth (Score:4, Interesting)
    
    by johnjones ( 14274 ) writes: on Monday October 10, 2016 @06:12AM (#53046087) Homepage Journal
    
    honestly
    where are the mathematical fonts and symbols for science ?
    STIX goes some way but why is this not in Noto ?
    why would you send a mathematical explanation into the stars when we can't express those notations on machines we use every day ?
    thanks
    John Jones
    
  - Google management is becoming more and more messy. (Score:2)
    
    by Futurepower(R) ( 558542 ) writes:
    
    Thanks for the explanation.
    
    I notice that Noto Serif is a well-designed font. There is an italic and a bold, but no semi-bold. The Google Noto font download web page [google.com] is a mess. How is NotoSansMandaic-unhinted different from NotoSans? When I look at the font in Windows font preview, I see no difference.
    
    I see many examples of Google management becoming more and more messy.
- Re:Keeping up with the emojis (Score:4, Interesting)
  
  by Anonymous Coward writes: on Sunday October 09, 2016 @11:14PM (#53044795)
  
  Hate to say it but I consider the conversion of all emojis to tofu a feature, not a bug. The tofu neatly summarises the vacuousness of the original abomination... I mean, message.
  
  - Re: (Score:2)
    
    by Michael Woodhams ( 112247 ) writes:
    
    So make your own branch of Noto called NoEmo, in which all emoji are rendered the same (possibly blank, possibly some generic 'this is an emoji' symbol.) It is open, so there is nothing to stop you.
- Re: (Score:2, Insightful)
  
  by DraconPern ( 521756 ) writes:
  
  I think it's more that this is all the glyphs in one font, whereas before you had Chinese, Arabic, etc. all in separate fonts. The other half of the problem Google had was that they didn't have good font rendering in Android, i.e. how you actually render the font. Microsoft, Apple, and Adobe had it figured out a long time ago and all that knowledge is part of the OS. So Google is basically just playing catch-up and open sourcing the data part. Also... do we really want to load that large of
  - Re:Keeping up with the emojis (Score:4, Informative)
    
    by AmiMoJo ( 196126 ) writes: on Monday October 10, 2016 @06:58AM (#53046231) Homepage Journal
    
    There are still multiple font files for different languages, because you can't have a unified "all language" font with Unicode. It's impossible to support Chinese, Japanese and Korean in the same font, for example.
    Android's font rendering is excellent, has been for years. It also helps that many Android phones, even mid range ones from a few years back, have 1080p or better displays that start to rival print for DPI (400-500 PPI on the screen, 3x that horizontally with sub-pixel rendering, vs. 600 DPI for prints).
    Google just want consistency everywhere and the ability to ship one font that covers all possible languages. You still need hacks because of the Unicode flaw mentioned above, but it's a big step nonetheless. AFAIK the only other open source font that tries to do this is GNU Unifont, but it's more functional than pretty.
    
    - Re: (Score:2)
      
      by HideyoshiJP ( 1392619 ) writes:
      
      Yeah, I was curious how they were going to handle language dependent characters that occupy the same unicode space [wikipedia.org].
  - Re: (Score:2)
    
    by thegarbz ( 1787294 ) writes:
    
    Also... do we really want to load that large of a font when most people only use a fraction of the data?
    The problem with this argument is that people only use a fraction of the data right up until the point where they don't, and then everything breaks. I don't speak a word of Japanese but I have Japanese fonts on my computer. Why? Because at some point something important was embedded in a PDF which had some Japanese in it and it refused to render. Up until that point I would have agreed with you, but really the ability to see things how they are supposed to be trumps having a broken view that could construe
- Re: (Score:2)
  
  by unixisc ( 2429386 ) writes:
  
  Isn't a lot of this due to all the new stuff that Unicode keeps adding? I still have a Bitstream Cyberbit font somewhere from... was it back in the late '90s? This is the same thing all over again, just up to date.
  Did Noto need to support emojis?
"Now available to download" link (Score:5, Informative)

by aneroid ( 856995 ) writes: <gmail> on Sunday October 09, 2016 @09:59PM (#53044603) Homepage Journal

https://www.google.com/get/not... [google.com] You're welcome
Came across this a few days ago when I borked my Slackware upgrade. Everything went fine except GUI login; X kept crashing because I deleted the fonts it was trying to use. One of the google search results was Noto.
All fonts = 472.6 MB.

- Re: (Score:1)
  
  by aneroid ( 856995 ) writes:
  
  Forgot to mention - this still doesn't solve the tofu problem since you need to have the font installed to not see tofu. In which case Google Web Fonts [google.com] is still the way to go. You just pick a font which supports your content/language. Or one of the Noto fonts.
  - Re:"Now available to download" link (Score:5, Informative)
    
    by aneroid ( 856995 ) writes: <gmail> on Sunday October 09, 2016 @10:42PM (#53044691) Homepage Journal
    
    1. On the emoji fonts [google.com] there's "Raised Hand With Part Between Middle And Ring Fingers" - WhyTF is that not called "live long and prosper"? Some emoji are named for how they look while others are named for what they mean. A bit inconsistent but I guess that's more of a Unicode consortium issue [unicode.org].
    2. Some of the hand emoji like "White Left Pointing Backhand Index" are all called "white..." even though they've clearly done the race/skin tone colour spectrum a la whatsapp [indiatimes.com].
    2b. The colours are a second Unicode code point (emoji modifier sequence [unicode.org]) on the emoji, ranging from U+1F3FB (white/pale) to U+1F3FF (black/dark). (Btw, that's counterintuitive to programmers since RGB colour codes [google.com] have "#00" being dark and "#FF" being light.) P.S. I haven't decided if the skin colour aspect of emoji is racist or not. There may be some people who found the default yellow emoji racist.
    Answer to #2 [unicode.org]:
    
    Names of symbols such as BLACK MEDIUM SQUARE or WHITE MEDIUM SQUARE are not meant to indicate that the corresponding character must be presented in black or white, respectively; rather, the use of “black” and “white” in the names is generally just to contrast filled versus outline shapes, or a darker color fill versus a lighter color fill. Similarly, in other symbols such as the hands U+261A BLACK LEFT POINTING INDEX and U+261C WHITE LEFT POINTING INDEX, the words “white” and “black” also refer to outlined versus filled, and do not indicate skin color.
    and
    General-purpose emoji for people and body parts should also not be given overly specific images: the general recommendation is to be as neutral as possible regarding race, ethnicity, and gender. Thus for the character U+1F477 CONSTRUCTION WORKER, the recommendation is to use a neutral graphic like (with an orange skin tone) instead of an overly specific image like (with a light skin tone). This includes the emoji modifier base characters listed in Sample Emoji Modifier Bases. The emoji modifiers allow for variations in skin tone to be expressed.
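
As a rough illustration of point 2b above: an emoji modifier sequence is the base emoji code point followed by one of the five modifier code points U+1F3FB through U+1F3FF. The helper name `with_skin_tone` below is made up for this sketch, and whether the pair renders as a single glyph depends on the font and the renderer.

```python
# Emoji modifiers occupy U+1F3FB..U+1F3FF (five tones, light to dark).
MODIFIERS = [chr(cp) for cp in range(0x1F3FB, 0x1F3FF + 1)]

def with_skin_tone(base, tone_index):
    """Append the modifier at tone_index (0 = lightest, 4 = darkest) to a base emoji."""
    return base + MODIFIERS[tone_index]

# U+1F596 is RAISED HAND WITH PART BETWEEN MIDDLE AND RING FINGERS.
VULCAN_SALUTE = "\U0001F596"
sequence = with_skin_tone(VULCAN_SALUTE, 4)

# The result is two code points, not a new single character.
print([f"U+{ord(ch):04X}" for ch in sequence])  # ['U+1F596', 'U+1F3FF']
```

A font without the modifier glyphs would typically show the base emoji followed by a separate swatch or tofu box.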
    
  - Re:"Now available to download" link (Score:4, Insightful)
    
    by ptaff ( 165113 ) writes: on Sunday October 09, 2016 @11:32PM (#53044845) Homepage
    
    Google Web Fonts is still the way to go.
    And helps Google track users one more way. Please be a good hacker and serve fonts from your own domain. Thank you.
    
    - - Re: "Now available to download" link (Score:5, Insightful)
        
        by TheRaven64 ( 641858 ) writes: on Monday October 10, 2016 @04:20AM (#53045749) Journal
        
        It's not always laziness (or tracking, from Google's perspective). Google sets a long cache value for most of these resources. If 10 different sites all host them individually, then someone visiting the site will have to download the fonts 10 times. Alternatively, if they all point to Google then they'll download once and cache the copy locally for the other 9 sites.
        There was a proposal a couple of years ago to embed a cryptographic hash of the resource in the link. This would allow you to specify a download location, but if you've already downloaded the file from another source then you could still use it (it would also make caches more efficient, because you could set an infinite timeout and make clients redownload by having a different hash in the link - clients would keep their copy potentially forever, until you updated the version). I don't know of any browsers that implemented it though.
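
The hash-in-link idea can be sketched in a few lines. The example assumes a SHA-384 digest, base64-encoded and prefixed with the algorithm name, which is the format used by the W3C Subresource Integrity `integrity` attribute. The function names and the sample bytes are invented for illustration; this only shows how such a value could be computed and checked against a cached copy, not how any browser cache actually behaves.

```python
import base64
import hashlib

def integrity_value(resource):
    """Return 'sha384-<base64 digest>' for the given resource bytes."""
    digest = hashlib.sha384(resource).digest()
    return "sha384-" + base64.b64encode(digest).decode("ascii")

def matches(resource, expected):
    """True if a locally cached copy has the hash the link asked for."""
    return integrity_value(resource) == expected

# Placeholder bytes standing in for a downloaded font file.
font_bytes = b"placeholder bytes standing in for a font file"
link_hash = integrity_value(font_bytes)

print(matches(font_bytes, link_hash))          # True
print(matches(font_bytes + b"x", link_hash))   # False
```

Because the hash identifies the content rather than the host, a changed file simply gets a different hash in the link, which is the cache-invalidation trick the comment describes.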
        
        
        Don't favor minor cache savings over tracking. (Score:2)
        
        by jbn-o ( 555068 ) writes:
        
        Storage is cheap and plentiful these days; the caching argument doesn't convince me and minor improvements strike me as possibly nice conveniences but nothing significant. I'd rather promote not centralizing the web and not encouraging doing work with known trackers including Google.
        
        Re: (Score:2)
        
        by thegarbz ( 1787294 ) writes:
        
        Storage is cheap and plentiful these days;
        Tell that to my mobile phone contract struggling under the weight of yet another multi-megabyte website that does not need to be.
        
        Re: (Score:2)
        
        by TheRaven64 ( 641858 ) writes:
        
        Storage is cheap, bandwidth is often not. When you're downloading 500KB of a JavaScript library for multiple different sites, that adds up quickly on mobile devices. It also adds to the page load times - the odds are your users will have cached the thing from Google already, so it doesn't add anything to their load times. Additionally, for JavaScript, it's possible to store commonly-used libraries in pre-parsed form (Safari will keep the bytecode for cached JavaScript libraries for a while), which also i
  - Re: (Score:2)
    
    by Travis Mansbridge ( 830557 ) writes:
    
    In HTML5 you can serve fonts, so it's just a matter of including Noto on sites where tofu might be a problem.
- Re: (Score:2)
  
  by hcs_$reboot ( 1536101 ) writes:
  
  Big. But with some luck, they will be integrated into Chrome, at least the main ones, regular / bold / italics. The size would go down 75+%.
- - Re: (Score:1)
    
    by jonwil ( 467024 ) writes:
    
    By far the vast majority of that download size is taken up by the fonts for the 1000s of characters in Japanese, Korean, Simplified Chinese and Traditional Chinese.
    All the other fonts only total about 10 MB or so.
    - Re:"Now available to download" link (Score:5, Informative)
      
      by Qzukk ( 229616 ) writes: on Sunday October 09, 2016 @10:55PM (#53044741) Journal
      
      Way back when Unicode decided to unify all the CJK glyphs they made several screwups in unifying characters that were not actually the same in each of the languages. Aside from the character looking wrong in Chinese or Japanese (whichever language you don't have installed as default) they may sort differently in different languages so collation is wrong too. More information (note that you'll need a full CJK font and a browser supporting language selection to see the differences). [wikipedia.org]
      Noto's solution was to create a font with every possible glyph, then for systems which can't support identifying the correct glyph based on language, they made versions of the fonts where the default characters are the Japanese versions or the Chinese versions or so on, then for embedded stuff they made versions of the fonts with just one language's characters. Noto's explanation of their CJK fonts [google.com]. In other words, you only need one of the 110MB font files.
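
A sketch of that language-based selection, assuming the family names used by the Noto Sans CJK release ("Noto Sans CJK JP", "KR", "SC", "TC"). The function, the region-to-script table, and the default are hypothetical; a real text stack would make this choice inside the font engine rather than in application code.

```python
# Language-specific Noto Sans CJK families, keyed by language (and script for Chinese).
CJK_FAMILIES = {
    "ja": "Noto Sans CJK JP",
    "ko": "Noto Sans CJK KR",
    "zh-hans": "Noto Sans CJK SC",
    "zh-hant": "Noto Sans CJK TC",
}

# Assumption for this sketch: a region implies a script when the tag names none.
REGION_TO_SCRIPT = {"cn": "hans", "sg": "hans", "tw": "hant", "hk": "hant", "mo": "hant"}

def cjk_family(lang_tag, default="Noto Sans CJK JP"):
    """Pick a font family for a BCP 47 style tag such as 'ja', 'zh-TW' or 'zh-Hans-SG'."""
    parts = lang_tag.lower().replace("_", "-").split("-")
    primary, subtags = parts[0], parts[1:]
    if primary == "zh":
        # An explicit script subtag wins; otherwise guess from the region.
        script = next((p for p in subtags if p in ("hans", "hant")), None)
        if script is None:
            script = next((REGION_TO_SCRIPT[p] for p in subtags if p in REGION_TO_SCRIPT), "hans")
        return CJK_FAMILIES["zh-" + script]
    return CJK_FAMILIES.get(primary, default)

print(cjk_family("ja"))          # Noto Sans CJK JP
print(cjk_family("zh-TW"))       # Noto Sans CJK TC
print(cjk_family("zh-Hans-SG"))  # Noto Sans CJK SC
```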
      
      - Re: (Score:2)
        
        by TheRaven64 ( 641858 ) writes:
        
        Aside from the character looking wrong in Chinese or Japanese (whichever language you don't have installed as default) they may sort differently in different languages so collation is wrong too.
        Collation shouldn't be broken. Collation is always locale-specific. German, English, and French all have different collation orders, even though they're using the same character set (how you sort capitals vs lower case vs accented variants is different in each). The only reason that this would break collation would be if, for example, Japanese sorts Chinese characters differently from the equivalent Kanji (does it? I have no idea).
        
        Re: (Score:2, Informative)
        
        by Anonymous Coward writes:
        
        German and Swedish might be a better example.
        They both have ö and ä, but German orders ö like o and ä like a, while Swedish puts them after z.
        And those very much ARE the same characters.
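
The German/Swedish difference can be shown with two hand-written sort keys. This is a deliberately simplified toy: the word list is invented, the German key only folds umlauts onto their base letters, and the Swedish key only knows the 29-letter alphabet. Real software would more likely rely on a locale-aware collation library.

```python
# Swedish alphabet: å, ä and ö are separate letters sorted after z.
SWEDISH_ALPHABET = "abcdefghijklmnopqrstuvwxyzåäö"

# German dictionary-style sorting: treat umlauts like their base vowels.
GERMAN_FOLD = str.maketrans({"ä": "a", "ö": "o", "ü": "u"})

def german_key(word):
    return word.lower().translate(GERMAN_FOLD)

def swedish_key(word):
    # Only valid for words made of the 29 letters above.
    return [SWEDISH_ALPHABET.index(ch) for ch in word.lower()]

words = ["zebra", "öl", "ost", "äpple", "apa"]

print(sorted(words, key=german_key))   # ['apa', 'äpple', 'öl', 'ost', 'zebra']
print(sorted(words, key=swedish_key))  # ['apa', 'ost', 'zebra', 'äpple', 'öl']
```

Same strings and same code points, but two different orders, which is why collation has to be chosen per locale rather than per character set.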
        
        Re: (Score:2)
        
        by doom ( 14564 ) writes:
        
        utf8mb4
        
        Real programmers avoid using MySQL.
        The "Han Unification" hack does have its problems (often exaggerated, but still there), but I wouldn't say that that's the real problem: I think you're right about needing metadata for every string, and the real question in my mind is why isn't that part of Unicode itself? There used to be a way to embed locale hints in the text, but that was deprecated with Unicode 5. WTF? What exactly were they thinking?
        There's another issue I don't get at all, w
        
        Re: (Score:2)
        
        by TheRaven64 ( 641858 ) writes:
        
        Oh, but of course, we're expected to rewrite every application where every box a user could type an international name or address or text has a separate drop down to select a language. That's totally less exasperating.
        No you're not. If it's a desktop application, you get the locale from the local user's settings. If it's a web application, you get it from the Accept-Language HTTP request header field. And then you just use that. Since POSIX 2008, even libc has contained thread-safe interfaces for locale-aware sorting. If you're using a database that doesn't support locale-aware collation, then I suggest that you find one that doesn't suck: PostgreSQL has had support for it for well over a decade and can use either l
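
For the web case, a minimal sketch of reading the Accept-Language header might look like this. The function name is hypothetical and the parsing is simplified (it ignores wildcards and assumes `q=` is the only parameter); a web framework's own negotiation helpers would normally be preferable.

```python
from typing import List

def preferred_languages(header: str) -> List[str]:
    """Parse an Accept-Language value into tags ordered by descending q-value."""
    weighted = []
    for position, item in enumerate(header.split(",")):
        item = item.strip()
        if not item:
            continue
        tag, _, params = item.partition(";")
        quality = 1.0  # a missing q parameter means full preference
        params = params.strip()
        if params.startswith("q="):
            try:
                quality = float(params[2:])
            except ValueError:
                quality = 0.0  # treat malformed weights as "not acceptable"
        if quality > 0:
            # Sort by quality first, then by original position for ties.
            weighted.append((-quality, position, tag.strip()))
    return [tag for _, _, tag in sorted(weighted)]

print(preferred_languages("sv-SE,sv;q=0.9,en;q=0.7,de;q=0.8"))
# ['sv-SE', 'sv', 'de', 'en']
```

The first entry could then be handed to whatever locale-aware sorting or font selection the application uses.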
      - Re: (Score:2)
        
        by _merlin ( 160982 ) writes:
        
        It's for situations where you allow user input, and don't want to limit them to entering text in a single language. Or if you want to display filenames, or the contents of e-mails, or whatever.
        
        Re: (Score:2)
        
        by AmiMoJo ( 196126 ) writes:
        
        Imagine you were writing software for an airline that operates in East Asia. Naturally you have customers from Japan, China and Korea, and naturally they expect their names to be rendered correctly on your web site and on printed material like tickets and boarding passes. They expect to be able to book online. Note that HTML doesn't allow mixing Japanese and Chinese in the same page, the most you can do is Unicode and the browser is guaranteed to render some characters incorrectly for your international cus
        
        Re: (Score:2)
        
        by Rockoon ( 1252108 ) writes:
        
        Note that HTML doesn't allow mixing Japanese and Chinese in the same page
        Note that the font is half a gigabyte and any web page that attempts to send it off to your browser, because a character might look slightly different otherwise, should be removed from the internet.
        
        Re: (Score:2)
        
        by AmiMoJo ( 196126 ) writes:
        
        All modern operating systems come with Japanese and Chinese fonts. The issue is that each HTML page can only specify one character encoding. If it says "Unicode" it can also specify a language to give the computer a hint as to which font to use, but again only one.
        If you look at pages like Chinese language lessons for Japanese readers they often use images or Flash to render the Chinese text correctly, because the browser can't do it. More recently it became possible to hack it with CSS and font stacks, but
        
        Re:"Now available to download" link (Score:4, Interesting)
        
        by _merlin ( 160982 ) writes: on Monday October 10, 2016 @12:57AM (#53045103) Homepage Journal
        
        Yeah, but it's like "90% of people use 10% of features" - everyone uses a different 10%, so 100% of features are used. Similarly, everyone needs a different combination of languages, so if you're going to use one family of fonts, you want to have massive coverage.
        
        
        Re: (Score:2)
        
        by Qzukk ( 229616 ) writes:
        
        I can't find any indication of what makes those fonts different than the alternatives
        The CJK Unified Ideographs block has 20950 assigned code points [wikipedia.org] most of which are significantly more complicated than Latin script. Add to that katakana, hiragana, hangul, radicals, and so on and there are a lot of characters, making the font significantly larger than fonts for latin-1.
    - Re: (Score:2)
      
      by aneroid ( 856995 ) writes:
      
      All fonts = 472.6 MB.
      That's for all of them. Individual fonts are reasonably and typically sized. Bear in mind, having these many more glyphs for so many languages does require them to be bigger.
      Noto Sans: 657 KB (4 styles, 581 languages)
      Noto Serif: 838 KB (4 styles, 581 languages)
      Noto Mono: 69.5 KB (1 style, 209 languages) # this should have had 581 langs
    - Re: (Score:2)
      
      by jrumney ( 197329 ) writes:
      
      I'd hazard a guess that the color emoji are taking up considerably more room than the fairly standard CJK glyphs that have been shipping in fonts around 3-4MB in size for the last 20 years..
      - Re: (Score:2)
        
        by omnichad ( 1198475 ) writes:
        
        Fonts generally don't have support for color. It's just lines, fills, and ligature instructions. There are just a LOT of languages out there.
        
        Re: (Score:2)
        
        by jrumney ( 197329 ) writes:
        
        You might like to update your knowledge [github.com] of the topic.
        
        Re: (Score:2)
        
        by omnichad ( 1198475 ) writes:
        
        That doesn't negate anything I've said. In fact, what you linked to says that support is rare or even difficult to get working. And that's even on Linux, so that's likely some sort of non-standard extension.
        But you can see on the download page [google.com] that Noto color emoji is only 2.8MB.
- - - Re: (Score:2)
      
      by Hognoxious ( 631665 ) writes:
      
      pyone
      
      Is that a combination of pyrotechnic + phone?
      I think Samsung have a patent on that.
This should have been put together by Unicode (Score:5, Insightful)

by complete loony ( 663508 ) writes: <Jeremy.Lakeman@gmai[ ]om ['l.c' in gap]> on Sunday October 09, 2016 @11:44PM (#53044881)

The Unicode consortium should have published glyphs like these as part of the effort of defining the standard.
Why did it take a separate private company to do this?

- Re: (Score:3)
  
  by speedplane ( 552872 ) writes:
  
  The Unicode consortium should have published glyphs like these as part of the effort of defining the standard.
  Why did it take a separate private company to do this?
  Probably because building a consortium to even define the characters is hard enough and expensive. Getting buy-in from everyone in the consortium to develop high quality glyphs for orphan languages would have reduced overall support. I agree they should have, but I don't think most companies are as generous as Google.
- Re: (Score:2)
  
  by AmiMoJo ( 196126 ) writes:
  
  Unicode doesn't consider renderings, that's why. A lot of characters can be rendered in multiple ways, but there is only one code point for all of them and it's up to the font designer which one they want to use. It's actually a huge problem in Chinese, Japanese and Korean, as well as other languages.
  It's time Unicode was deprecated and we moved on to something better. There is the TRON system that fixes or avoids most of the problems with Unicode, for example. Wouldn't be much of a change for applications
  - Re: (Score:2)
    
    by ColdWetDog ( 752185 ) writes:
    
    It's time Unicode was deprecated and we moved on to something better.
    So we can be even further behind the curve here?
- Re: (Score:3)
  
  by TheRaven64 ( 641858 ) writes:
  
  The entire point of unicode is that the glyphs are separate from the codepoints. The codepoints (defined by the unicode spec) convey semantics, not presentation. There are lots of different (valid) ways of representing each codepoint (if there weren't, then you wouldn't need fonts at all).
  Then along came emojis and the entire clusterfuck that led to.
- - Re: (Score:2)
    
    by praxis ( 19962 ) writes:
    
    When you choose to allocate your time to an open source project, you are choosing to allocate your "private" capital to that project.
    That's true but there are also people whose public time is allocated to an open source project.
No programmers' typeface (Score:5, Insightful)

by tdelaney ( 458893 ) writes: on Sunday October 09, 2016 @11:51PM (#53044895)

They have a monospaced typeface, but it's not useable for programming - doesn't even have a significant distinction between zero and O, let alone any other programmer-friendly features.
Since I presume they're going to want people at Google to use Noto as standard, it seems sensible to me that they create a programmers' version.

Share
twitter facebook
- Re: (Score:2, Insightful)
  
  by Anonymous Coward writes:
  
  I don't see why distinguishing between the zero digit and the letter O is more important for programmers than for anyone else. Sure, programmers might make mistakes when writing code and want to fix them; but that's true for other people writing text that might contain digits and letters, too.
  If anything, distinguishing between the characters is less important for programmers than other people because programmers will already notice the problem when their code won't compile. I think it is very probable not
  - Re:No programmers' typeface (Score:4, Insightful)
    
    by Hypoon ( 1095383 ) writes: on Monday October 10, 2016 @01:43AM (#53045291)
    
    ...because programmers will already notice the problem when their code won't compile.
    Substitutions of the letter 'O' for the number zero in numeric literals, function names, variable names, and other similar constructs will usually generate syntax errors, yes. (This makes me want to create a library called "Input0utput", just for headaches.)
    However, the compiler probably won't notice if you make the substitution within a string or character literal (if the user types "Outbound", but the software is expecting "0utbound", this might be a hard problem to debug). I've only done this once or twice, but it was infuriating. It's one of those few times when commenting out the line and retyping it verbatim will actually fix the problem.
    The fact that the keys are adjacent on QWERTY keyboards doesn't help anything.
    ...but that's true for other people writing text that might contain digits and letters, too.
    I misunderstood this at first. I was picturing something like, "Mr. Orville's appointment is at 1O:OO.", where the substitution is harmless, so I didn't understand. In something like a model number, "MSO001" might be the first (001) release of a Mixed Signal Oscilloscope (MSO). Writing it as "MSOOO1" definitely obfuscates the meaning behind the model number. Of course, "MSO-001" would probably be best, but it's preferable to match the label on the hardware itself. So yes, I see your point.
    But no, I'm firmly of the belief that the average programmer has a greater need (than the average typist) for easily distinguishable characters.
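
The string-literal case above is easy to reproduce. The sketch below uses a small, made-up lookalike table (it is nowhere near a complete confusables list) to flag two strings that differ only by characters that are hard to tell apart in many fonts.

```python
# Hypothetical lookalike table: map each confusable character to one representative.
LOOKALIKES = {"O": "0", "o": "0", "l": "1", "I": "1"}

def normalize(text):
    """Fold lookalike characters onto one representative for comparison."""
    return "".join(LOOKALIKES.get(ch, ch) for ch in text)

def differs_only_by_lookalikes(a, b):
    """True when a and b are different strings that would look alike on screen."""
    return a != b and normalize(a) == normalize(b)

# The compiler or interpreter sees two unrelated strings here.
print("Outbound" == "0utbound")                             # False
print(differs_only_by_lookalikes("Outbound", "0utbound"))   # True
print(differs_only_by_lookalikes("Outbound", "Inbound"))    # False
```

A check like this could run over configuration keys or expected input values to catch the mismatch before someone has to find it by retyping the line.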
    
    - Re: (Score:2)
      
      by AmiMoJo ( 196126 ) writes:
      
      I see this mistake a lot with my girlfriend's handwritten text entries. She writes in Chinese and occasionally inserts Arabic numerals (0123456789). The zero is often interpreted either as a capital O or as a Chinese character that seems to have been adopted from Japanese that is just a perfect circle, used as a substitute for censored characters. It's similar to how newspapers write "sh*t" in English (maybe it's a British thing).
      She knows my Chinese is crap so sometimes writes '9' in Chinese and then selec
      - Re: (Score:2)
        
        by Yvan256 ( 722131 ) writes:
        
        I'm pretty sure AmiMoJo meant arabic numerals [wikipedia.org].
  - Re:No programmers' typeface (Score:4, Insightful)
    
    by Nethead ( 1563 ) writes: <joe@nethead.com> on Monday October 10, 2016 @03:01AM (#53045517) Homepage Journal
    
    Where I find the problem is in randomly generated passwords. I have a large spreadsheet of VPN passwords for users at work where I had to change the password column to an OCR font just to make sure I was giving out the correct code.
    The original C64 had this issue which was worse on the SX64 with its 5" screen. I went as far as to design a custom font and burn it into the font EPROM.
    
    - Re: (Score:2)
      
      by lars_stefan_axelsson ( 236283 ) writes:
      
      Where I find the problem is in randomly generated passwords.
      Yes. KeePassX's "exclude lookalike characters" option when generating is really useful. It doesn't drop that many bits, and for most situations I can just make the PW a bit longer if it's a concern.
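
To put a rough number on "doesn't drop that many bits", here is a sketch that compares a 62-symbol alphanumeric alphabet with one that excludes five lookalikes. The excluded set `0O1lI` is an assumption made for this example, not necessarily the list any particular password manager uses.

```python
import math
import secrets
import string

FULL = string.ascii_letters + string.digits                 # 62 symbols
LOOKALIKE = set("0O1lI")                                     # assumed exclusion list
SAFE = "".join(ch for ch in FULL if ch not in LOOKALIKE)     # 57 symbols

def bits_per_char(alphabet):
    """Entropy contributed by one uniformly chosen character."""
    return math.log2(len(alphabet))

def generate(length, alphabet=SAFE):
    """Generate a random password using a cryptographically secure source."""
    return "".join(secrets.choice(alphabet) for _ in range(length))

print(f"{bits_per_char(FULL):.2f} vs {bits_per_char(SAFE):.2f} bits per character")
print(generate(16))
```

Under these assumptions the loss is roughly 0.12 bits per character, so adding a single extra character more than makes up for it on a 16-character password.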
      Trying to type a "random" generated password with lookalikes is an exercise in futility.
  - Re: (Score:2)
    
    by Ken D ( 100098 ) writes:
    
    I once transcribed a program from a magazine into my first computer... as a hex dump.
    The magazine chose a font where 0, 8, and B were practically identical. That's ~20% of the hexadecimal digit space that's confusing.
    I guess I was a glutton for punishment, because I did get the program to run.
  - - Re: (Score:2)
      
      by locofungus ( 179280 ) writes:
      
      Saying oh for zero is common in (British) English.
      Dialing code for London:
      020 - Oh two oh.
      Start of a telephone number:
      700 - seven double oh.
      International dialing code for the US:
      00 1 - oh oh one. (Don't know why we don't say double oh but I've never heard it said that way.)
      Bus number:
      205 - two oh five
      In normal spoken or written English you can usually determine whether it's a zero or a letter-o from the context and where you can't it rarely matters.
      - Re: (Score:2)
        
        by UberVegeta ( 3450067 ) writes:
        
        00 1 - oh oh one. (Don't know why we don't say double oh but I've never heard it said that way.)
        You mean in the same way that nobody says "double oh seven?"
        
        Re: (Score:2)
        
        by locofungus ( 179280 ) writes:
        
        The international dialing code for Kazakhstan from the UK would be 00 7 (I've just looked it up). I've never heard anyone quote a Kazakhstan telephone number to call from the UK but I would expect them to say oh oh seven, not double oh seven. Apart from anything else, if you did try to tell someone a Kazakhstan telephone number and started double oh seven I'd expect them to not hear the rest of the number while they were laughing.
- Re: (Score:2)
  
  by johannesg ( 664142 ) writes:
  
  Since I presume they're going to want people at Google to use Noto as standard, it seems sensible to me that they create a programmers' version.
  What kind of madness makes you presume Google wants all its employees to use this font?
  Do tell, I'm genuinely curious. Do you also believe they do all their programming on phones running Android? Or do you suppose they might be allowed to use, I don't know, laptops or normal desktops?
- Re: (Score:2)
  
  by TheRaven64 ( 641858 ) writes:
  
  This font is intended as the fallback font. When the currently selected font doesn't have a glyph for the desired codepoint, your font engine will provide a substitute. It will start with similar styles (e.g. sans serif, monospace) and if that fails it will fall back to a generic font that has large coverage. That's the point of this font. If you're using it for most of the glyphs you're rendering, then you're doing it wrong.
  If you want a good font for programming, Adobe released Source Code Pro [github.com] a coup
  - Re: (Score:2)
    
    by hackertourist ( 2202674 ) writes:
    
    Except Source Code Pro only contains English glyphs, so it's useless for e.g. debugging exotic-language XML files. I keep switching between Source Code Pro and Arial Unicode MS, which has pretty good language support.
- - Re: (Score:2)
    
    by PRMan ( 959735 ) writes:
    
    I use Verdana. Everyone hates the fact that I use a proportional font, but we haven't laid out text in decades...
Slashdot solved the tofu problem long ago (Score:1)

by hcs_$reboot ( 1536101 ) writes:

fu
Horrible Mono Font (Score:2)

by brianerst ( 549609 ) writes:

That lowercase 'm' is a horror show. Simply awful.
It's also no good as a coding font (lack of distinction between various problematic glyphs) but that's probably not its audience.
- Re: (Score:2)
  
  by KozmoStevnNaut ( 630146 ) writes:
  
  Yeah, it's a bit naff, and obviously not their main focus. Luckily, there are tons of awesome monospaced fonts out there, and coding rarely needs full Unicode coverage.
- Re: (Score:2)
  
  by omnichad ( 1198475 ) writes:
  
  I don't think it's intended for use as a general-purpose font at all. Just for filling in gaps if the font you're reading in is missing a glyph for a particular codepoint. As an English reader/writer, it's unlikely you'll be seeing an 'm' substituted in.
  Anywhere you would see a square box now for missing characters, this font would render in. Will be really useful for viewing Wikipedia (where I see this the most).
Repairing the Unicode Consortium Clusterfuck (Score:5, Interesting)

by Anonymous Coward writes: on Monday October 10, 2016 @01:32AM (#53045251)

Thank you Google! This is badly needed because the Unicode Consortium screwed up Asian language support badly. The problem started when a bunch of Silicon Valley WASPs got together and formed the Unicode Consortium. Their experts were a joke. They had a foreign language expert who by his own admission couldn't speak the language he was supposedly expert in.

Then without consulting Asian language speakers they decided to combine all the Asian language characters - including those that were physically different. The result was like some elitist looking at the Greek and Roman alphabets and deciding 'a' is a lot like alpha, 'b' a lot like beta, so why not combine the two of them into a single alphabet, then tell you your name isn't Sam, it's "S". (Slashdot probably won't display this but you get the idea.) This affected East, Central and Southeast Asian languages.

This created the absurd situation where some people couldn't even write their names or enter them into databases, prompting the famous "I Can Text You A Pile of Poo, But I Can't Write My Name" https://modelviewculture.com/p... [modelviewculture.com]

When it was pointed out, did the Unicode Consortium admit they fucked up and fix it? Nope. They dug in their heels and insisted each country produce their own font which would display each Unicode character differently to suit their own language. Given the original goals of Unicode this was an amazing backflip. https://en.wikipedia.org/wiki/... [wikipedia.org] https://books.google.com/books... [google.com] https://plus.google.com/+LizHa... [google.com] There are other problems too: the encodings the consortium expected make Asian text take more space than the codepages they were supposed to replace. This was stupid since ASCII was already super efficient for the English language, so what was the point?

If you only write English language software and ASCII is good enough you won't notice any of this, but if you have to write international software it's a nightmare. Yes, you might think adding Unicode support allows your app to run in any language, but it doesn't work like that because of this clusterfuck. You still have to provide different fonts for different countries, and you often have to support old codepages (the various BIG5 variants) as a fallback, which Unicode was supposed to replace. It also makes translation very hard.

But Unicode fixed it eventually? Nope. The Unicode Consortium has continued to ignore it to this very day and instead started churning out stupid emoji: a steaming pile of poo, a taco, and farcical 'equality' emoticons. https://www.theguardian.com/te... [theguardian.com] https://www.theguardian.com/ar... [theguardian.com]

I hope this new font gives us one font which can display all languages and fuck the Unicode Consortium
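The storage-size complaint above is easy to measure. A minimal sketch, using Python's built-in `big5` and `shift_jis` codecs as stand-ins for "the standards they were supposed to replace"; the sample strings are arbitrary illustrations and not taken from the thread.

```python
# Sample strings are arbitrary; each is paired with a legacy codec that can encode it.
SAMPLES = {
    "English": ("Hello, world", "ascii"),
    "Traditional Chinese": ("中文字型", "big5"),
    "Japanese": ("日本語のテキスト", "shift_jis"),
}

def sizes(text: str, legacy: str) -> dict:
    """Encoded size in bytes under the legacy codec, UTF-8 and UTF-16 (no BOM)."""
    return {
        legacy: len(text.encode(legacy)),
        "utf-8": len(text.encode("utf-8")),
        "utf-16": len(text.encode("utf-16-le")),
    }

if __name__ == "__main__":
    for label, (text, legacy) in SAMPLES.items():
        print(f"{label:20} {sizes(text, legacy)}")
```

For these samples, UTF-8 takes three bytes per CJK character where the legacy codecs take two, while UTF-16 matches the legacy size for CJK and doubles it for ASCII. Which trade-off matters depends on the text mix.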

- Re: (Score:2)
  
  by UnknownSoldier ( 67820 ) writes:
  
  Mod Parent +1 Informative !
  I've been running into my own problems with Unicode's shortsightedness.
  2 common glyphs are:
  * mouse pointer (See fa-mouse-pointer [fontawesome.io])
  * cardinal 4 direction arrows (such as used on Windows, Move) (See fa-arrows [fontawesome.io])
  Yet they are nowhere to be found in Unicode.
  You're definitely right - the Unicode Consortium is more interested in fluff crap like emoji than practical stuff.
  If the Unicode Consortium didn't have their heads up their asses we wouldn't even need fonts like Font Awesome [fontawesome.io]
  The funny thing
  - - Re: (Score:2)
      
      by alexhs ( 877055 ) writes:
      
      Arrows are definitely [wikipedia.org] present [wikipedia.org] in Unicode.
      - Re: (Score:2)
        
        by UnknownSoldier ( 67820 ) writes:
        
        And the cardinal 4 direction arrows [wikimedia.org] are _where_ again in Unicode ??
        We're not talking about general arrows, we are talking about a specific arrow. If you don't want to look like a fool, learn to read before replying, please.
- Re: (Score:3)
  
  by KozmoStevnNaut ( 630146 ) writes:
  
  I've been using the Noto font(s) for a while, they're installed by default in Linux Mint (probably Ubuntu and others, too), so I assume this is an incremental release, where they've finally achieved some semblance of full(ish) coverage.
  While I have a couple of minor issues with the font's design (the lowercase 'm' and 0/O distinction in Noto Mono are atrocious), the font is quite nice on the whole. And while I will never personally use all of the myriads of different scripts included, I whole-heartedly appla
- Re:Repairing the Unicode Consortium Clusterfuck (Score:5, Informative)
  
  by AmiMoJo ( 196126 ) writes: on Monday October 10, 2016 @05:44AM (#53046015) Homepage Journal
  
  It's even worse than that. On many systems, e.g. Windows, wchar_t is defined as 16 bits, meaning it can only ever support the Unicode Basic Multilingual Plane without hacks. Since a lot of the fixed CJK characters are outside this plane, software that uses wchar_t usually doesn't support them. Some of this is baked into hardware, for example Unicode uses UTF16,
  I'm seriously thinking about writing an open source library to support TRON encoding. The lack of a good alternative seems to be what is preventing Unicode from being deprecated in favour of something better.
  
  - Re: (Score:2)
    
    by legRoom ( 4450027 ) writes:
    
    On many systems, e.g. Windows, wchar_t is defined as 16 bits, meaning it can only ever support the Unicode Basic Multilingual Plane without hacks.
    True UTF-16 supports non-BMP code points just fine, and is not a "hack". In fact, it's actually slightly easier to do so in UTF-16 than with UTF-8 (the only other common Unicode encoding).
    The real problem is that there is no single concept in Unicode that maps to the "character" of the old, simple ASCII standard with which most programmers are familiar. Depending on the task at hand, the correct substitute under Unicode may be code units, code points, or graphemes. Ignorant and/or lazy programmers who make
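The distinction between code units, code points and what a reader sees as one character can be shown in a few lines. A minimal sketch; `describe` and `surrogates` are hypothetical helper names. Python's standard library has no grapheme segmentation, so the last step uses NFC normalization, which only happens to collapse this one example and is not a general grapheme counter.

```python
import unicodedata

def describe(text: str) -> dict:
    """Count the same string in three different units."""
    return {
        "code_points": len(text),
        "utf16_code_units": len(text.encode("utf-16-le")) // 2,
        "utf8_bytes": len(text.encode("utf-8")),
    }

def surrogates(cp: int) -> tuple:
    """Split a code point above the BMP into its UTF-16 surrogate pair."""
    if not 0x10000 <= cp <= 0x10FFFF:
        raise ValueError("code point is inside the BMP or out of range")
    offset = cp - 0x10000
    return 0xD800 + (offset >> 10), 0xDC00 + (offset & 0x3FF)

han_ext_b = "\U00020000"  # U+20000, a CJK ideograph outside the BMP
combined = "e\u0301"      # 'e' + COMBINING ACUTE ACCENT, displayed as one glyph

if __name__ == "__main__":
    print(describe(han_ext_b))
    print([hex(u) for u in surrogates(0x20000)])
    print(describe(combined))
    print(len(unicodedata.normalize("NFC", combined)))
```

Code that treats one 16-bit unit as one character miscounts the first string, and code that treats one code point as one character miscounts the second.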
    - Re: (Score:2)
      
      by AmiMoJo ( 196126 ) writes:
      
      A lot of developers just throw in Unicode support and assume their software supports all languages. We need something better that actually does that, rather than Unicode.
      - Re: (Score:2)
        
        by legRoom ( 4450027 ) writes:
        
        I get the feeling that you don't understand how numerous, complex, arbitrary, diverse, ambiguous, etc. natural languages are. That phrase, "all languages", doesn't even have a knowable, well-defined meaning, either in theory or in practice.
        It would certainly be possible to improve upon Unicode, if you're willing to sacrifice backwards compatibility. However, it will never achieve your stated goal of guaranteeing support for "all languages" just by "throwing in" a new text processing library.
        Projects that re
        
        Re: (Score:2)
        
        by AmiMoJo ( 196126 ) writes:
        
        That was my point. Developers who aren't experts in languages just assume that if they tick the Unicode box in the compiler options their software supports everything, but in reality that's far from the case.
- Re: (Score:2)
  
  by hackertourist ( 2202674 ) writes:
  
  Note that this new font doesn't fix the 'Han unification' problem. It just provides 3 versions of the font, one for C, one for J and one for K. This sidesteps the clusterfuck (and forces you to select a different font for each language), but does not fix it.
  - Re: (Score:2)
    
    by omnichad ( 1198475 ) writes:
    
    This is a substitution/fallback font - and shouldn't be used for design or UI except where the chosen font is missing a character. If your native language is Chinese, you won't be using this font to view any glyphs that are already included in your Chinese font.
That's great .. there's nothing more annoying (Score:2)

by Chrisq ( 894406 ) writes:

That's great .. there's nothing more annoying than having little rectangles on a web page instead of the proper glyph that you wouldn't understand anyway!
- Re: (Score:2)
  
  by KozmoStevnNaut ( 630146 ) writes:
  
  I'm sure a lot of East Asian people share your annoyance
- Re: (Score:2)
  
  by baka_toroi ( 1194359 ) writes:
  
  Currently the world is focused on more demographics than monolingual English speakers.
Accept headers schmaccept schmeaders. (Score:2)

by Hognoxious ( 631665 ) writes:

I can see why this is important to Google, since they seem to like showing me ads in the wrong language.
- Re: (Score:2)
  
  by ColdWetDog ( 752185 ) writes:
  
  Feature, not a bug. Be quiet.
"Reelelsed"? When? (Score:2)

by Zanadou ( 1043400 ) writes:

http://www.google.com/get/noto/updates [google.com]
Last entry: "September 29, 2015"
Yeah... so it's the same thing I downloaded and installed last year.
I'm so glad Slashdot is catching up...
This is dumb. (Score:2)

by jimbob6 ( 3996847 ) writes:

So what are they using besides "tofu", nothing? Blank spaces?
If I don't have the character set installed that the page is written in, then it's probably because I can't read that language anyway.
And I damn sure don't want to load every character set that exists on the web into my browser. It would run like balls.
If you go to a page that has characters that your browser doesn't understand and you need to get to the information, use Google translate.
tofu is faster (Score:2)

by OrangeTide ( 124937 ) writes:

Don't show me glyphs that I am not trained to read. I'd really rather see square boxes in situations where foreign text was displayed with the wrong font. Wrong font being the font that I'm using.
- Re: (Score:2)
  
  by hcs_$reboot ( 1536101 ) writes:
  
  Try this one [apple.com] instead.
- Re: (Score:2)
  
  by fendragon ( 841926 ) writes:
  
  why is there not a single "Noto serif" font that combines them all? Or how else is one supposed to configure the browser now to give access to all those symbols?
  A single font for all of them, as has been mentioned above, is possible but would be over 400MB, which is a problem for some of us.
  Browsers will search other available fonts for a code point that's not in the current font, so you can install a collection of subset fonts that includes all the characters you are likely to need.
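The per-code-point fallback described above can be sketched as a simple lookup chain. The font names and coverage ranges below are made up for illustration; real engines read coverage from each font's character map and also weigh style and language, which this sketch ignores.

```python
from typing import Dict, List, Optional, Set, Tuple

# Hypothetical coverage tables: font name -> set of supported code points.
FONTS: Dict[str, Set[int]] = {
    "Primary Sans": set(range(0x20, 0x7F)),          # printable ASCII only
    "Fallback CJK": set(range(0x4E00, 0xA000)),      # CJK Unified Ideographs block
    "Fallback Symbols": set(range(0x2190, 0x2200)),  # Arrows block
}

def pick_font(cp: int, chain: List[str]) -> Optional[str]:
    """Return the first font in the chain that covers the code point."""
    for name in chain:
        if cp in FONTS[name]:
            return name
    return None  # nothing covers it: this is where "tofu" would be drawn

def plan(text: str, chain: List[str]) -> List[Tuple[str, Optional[str]]]:
    """Assign a font to every character of the text."""
    return [(ch, pick_font(ord(ch), chain)) for ch in text]

if __name__ == "__main__":
    chain = ["Primary Sans", "Fallback CJK", "Fallback Symbols"]
    for ch, font in plan("a中→ก", chain):
        print(f"U+{ord(ch):04X} -> {font or 'tofu'}")
```

Installing more subset fonts just makes the chain longer; only characters that no installed font covers end up as boxes.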
- Re: (Score:2)
  
  by ledow ( 319597 ) writes:
  
  That's not what Unicode is for.
  If you want Serif or Sans Serif, those are entirely different typefaces.
  If you want monospaced or not, again those are entirely different typefaces.
  All Unicode does - especially when you combine it with TrueType semantics or want a font that works everywhere - is provide characters for everything you might need.
- - - - Re: (Score:2)
        
        by the_B0fh ( 208483 ) writes:
        
        I for one welcome our sharks with lasers on their heads, eating hot grits, suitcase cracking overlords! From Soviet Russia, in the name of longcat.
        Whatever happened to Natalie Portman...?
