Google Builds a Native PDF Reader Into Chrome
An anonymous reader writes "Google's latest Chrome 6 Developer Update comes with a few subtle GUI changes, but there is also a major update under the hood. As its ties with Adobe quite apparently grow stronger, there is not just an integrated Flash player, but also a native PDF reader in the latest version of Chrome 6. Google says the native reader will allow users to interact with PDF files just like they do with regular HTML pages. The reader is included in Chrome versions (Chromium) 6.0.437.1 and higher, and you can use the feature after you have enabled it manually in the plug-ins menu. That is, of course, if you can keep Chrome 6 alive — Windows users have reported frequent crashes, and Google has temporarily suspended the update progress to find out what is going on." The Register has some more details on the PDF plugin and a link to Google's blog post about it.
Awesome (Score:0, Informative)
Re:PDF plugin, OK. PDF built-in? Not so sure... (Score:5, Informative)
I'm not fully qualified to comment on this since I will never be a Chrome user until someone forks off a "stainless steel" release where a group of people have pored over the source code to ensure there is no Google data collecting going on and then compiled it themselves for distribution.
No, I think what you want is the "tinfoil hat" release.
But seriously, it's called Chromium. It's the fully open source project that feeds into Chrome, and it's free of all Google branding and such. For what it's worth though, there's nothing in Chrome that does anything remotely close to what you're afraid of. Feel free to run it for a couple of weeks through a debugging proxy to watch what it does (I have).
Re:PDF files will render as seamlessly as HTML? (Score:5, Informative)
Or am I misunderstanding that feature?
Saying "PDF files will render as seamlessly as HTML" is not the same as "PDF files will render as HTML".
So, yes, I think you misunderstand.
Re:Chrome, you're losing me! (Score:3, Informative)
Faster perhaps, but less memory? Many tests show it uses more memory than other browsers.
http://lifehacker.com/5457242/browser-speed-tests-firefox-36-chrome-4-opera-105-and-extensions [lifehacker.com]
http://dotnetperls.com/chrome-memory [dotnetperls.com]
http://www.whoisandrewwee.com/browsers/verdict-on-google-chrome-memory-hog/ [whoisandrewwee.com]
Re:PDF files will render as seamlessly as HTML? (Score:1, Informative)
There is an extension [google.com] that automatically converts links to PDFs into links to the viewer version. It also handles PPT and "other documents".
Re:PDF is fat (Score:4, Informative)
PDF viewing is very fast on OS X, and Safari has natively displayed PDFs for a long time. I blame Adobe's reader.
Re:Chrome, you're losing me! (Score:5, Informative)
From a security point of view, I'd feel better if Google wrote their own PDF implementation. Far be it from me to read TFA, but I get the impression that this code comes from Adobe, whose software generally makes me nervous.
I've read it for you. The code doesn't come from Adobe; Google wrote it themselves. It also uses Google's new sandboxed plugin API, so it would be less of a security concern even if it did.
(I'm surprised you got two replies who also didn't RTFA.)
Re:PDF is fat (Score:3, Informative)
I know PDF has embedded fonts, but that shouldn't take much room, should it?
Embedded fonts can get pretty big if the software doesn't subset them or a lot of glyphs are used. DejaVu Sans, for example, is over half a megabyte! Some fonts are much bigger (pan-Unicode fonts and CJK fonts, for example).
What are they doing that converts something that would be a 10K ASCII file into a 500K PDF monstrosity?
PDFs will always be a bit bigger than plain text because they control the positioning of stuff exactly and that takes information. It shouldn't be a factor of 50 though unless images are involved.
Once images are involved, the sky's the limit. A single large image can make a PDF huge (and remember that images can be inserted at any resolution, so a huge image can display small!).
One thing about PDFs is that they always embed images and usually embed fonts. This is a mixed blessing: on the one hand it makes the file far more portable than something like HTML, but on the other hand it means you re-download stuff like logos with every PDF you grab.
Can't LaTeX handle it?
LaTeX has its place, but AFAICT it was never designed to be a distribution format. A typical LaTeX document involves a load of files that become figures in the document, and many use LaTeX add-on packages that may or may not be installed.
About the only thing worse than PDFs are raster scans of documents, and those typically aren't served, they're used as an intermediate step towards porting to a more useful format.
That has not been my experience with the large digitisation projects whose output I've seen (e.g. http://ethos.bl.ac.uk/ [bl.ac.uk]). In my experience they do OCR for searchability, but the accuracy isn't good enough for a full conversion, so they produce PDFs with the image visible plus OCR text for copy/paste/search.
It's done because it's a lot easier for computers to search text documents.
AFAICT this is the main reason for doing OCR, at least in large digitisation projects.
And it saves lots of space.
It does if you throw the originals away. But only an idiot would do that without careful proofreading of the OCRed text, and careful proofreading costs a LOT more than storing the original images does.
Re:PDF files will render as seamlessly as HTML? (Score:3, Informative)
They translate PDF to HTML already. Try opening a PDF attachment in Gmail.
Re:You did not RTFA either (Score:5, Informative)
because TFA doesn't explain that Google wrote it themselves. Heck, even the Google blog announcement doesn't explain that Google wrote it themselves. Guess what: it turns out Google did not write it themselves; they're using libpdf.so [chromium.org], which is libpdf [sourceforge.net]
I was referring to the Google blog post, which is linked from the Slashdot summary and thus counts as "TFA".
It says "Currently, we do not support 100% of the advanced PDF features found in Adobe Reader, such as certain types of embedded media" and "We would also like to work with the Adobe Reader team to bring the full PDF feature set to Chrome using the same next generation browser plug-in API", which I took to mean that:
1. it clearly isn't being written by Adobe, and
2. even if Google didn't write it, they are maintaining and improving it, so they "wrote it" in the same sense that Apple "wrote" WebKit.
As for the "libpdf.so" part, I assume you're looking at the part of the code that says:
#if defined(OS_WIN)
cur = cur.Append(FILE_PATH_LITERAL("pdf.dll"));
#elif defined(OS_MACOSX)
cur = cur.Append(FILE_PATH_LITERAL("PDF.plugin"));
#else  // Linux and Chrome OS
cur = cur.Append(FILE_PATH_LITERAL("libpdf.so"));
#endif
That means they're using a file called libpdf.so on Linux. As another one of your replies points out, this is unlikely to be the 9-year-old, unmaintained, incomplete C library you link to, and judging from the Windows and Mac filenames, it is almost certainly a library written (or at least maintained) by Google.
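For anyone who wants to see the same per-platform selection outside the Chromium tree, here is a minimal standalone sketch. It is not Chromium code: PdfPluginFileName and PdfPluginPath are hypothetical names made up for illustration, and it swaps Chromium's OS_WIN / OS_MACOSX build macros for the compiler-predefined _WIN32 and __APPLE__. Only the three file names are taken from the snippet quoted above.

```cpp
#include <string>

// Hypothetical helper (not part of Chromium): picks the plugin file name
// for the platform this code is compiled on. Uses compiler-predefined
// macros in place of Chromium's own OS_* build macros.
std::string PdfPluginFileName() {
#if defined(_WIN32)
  return "pdf.dll";
#elif defined(__APPLE__)
  return "PDF.plugin";
#else  // Linux, Chrome OS, and anything else
  return "libpdf.so";
#endif
}

// Hypothetical helper: appends the plugin file name to a directory,
// adding a path separator only when the directory does not end in one.
std::string PdfPluginPath(const std::string& module_dir) {
#if defined(_WIN32)
  const char kSeparator = '\\';
#else
  const char kSeparator = '/';
#endif
  if (module_dir.empty() || module_dir.back() == kSeparator)
    return module_dir + PdfPluginFileName();
  return module_dir + kSeparator + PdfPluginFileName();
}
```

The point is just that the name is fixed at compile time per platform, so "libpdf.so" only tells you what the Linux build is called, not where the code came from.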
Re:PDF is fat (Score:3, Informative)
How exploitable is/was doc? MS's implementations usually asked you if you wanted to run macros and had macro settings.
I think it only started doing that around Office 2000 or so; I'm pretty sure there were popular releases that already had scripting but gave the user no control over whether it ran when a document was opened. Viruses written in VBA ("macro viruses" was the word for them) certainly constituted a hefty chunk of popular virus registries back in the late 90s.
Re:PDF files will render as seamlessly as HTML? (Score:3, Informative)
And the further problem is that if you want another presentation, you probably need another set of divs than what you have right now.
To a certain extent, yes: you can't reorder divs with CSS, so you have to arrange them in your HTML to fit your planned CSS layout. But that is really not the fault of the web designer, as CSS just doesn't offer any tools to do a better job. Considering what HTML/CSS allow, many webpages do as good a job of separating presentation and content as is possible.