30 Years of Change, 30 Years of PDF (pdfa.org)
PDF Association, in a blog post: We live in a world where the only constant is accelerating change. The twists and turns in the technology landscape over the last 30 years have drained some of the hype from the early days of the consumer digital era. Today we are confronted with all-new, even more disruptive, possibilities. Along with the drama of the internet, the web, broadband, smart-phones, mobile broadband, social media, and AI, the last thirty years have revealed some persistent truths about how people use and think about information and communication. From the vantage-point of 2023 we are positioned to recognize 1993 as a year of two key developments; the first specification of HTML, the language of the web, and the first specification of PDF, the language of documents. Today, both technologies predominate in their respective use cases. They coexist because they meet deeply-related but distinct needs.
Re:How is there not an Open Source PDF alternative (Score:4, Informative)
Alexa, open my.PDF (Score:3)
Also, in before "30 years of 'how do I open a PDF?'".
Re: (Score:2)
LaTeX is THE text formatter for the sciences. One yokel where I work told me that one can add formulas to a Word document. Yes, one could, and one would also spend the next year hunting down symbols and poking them in where they belong. The result will still look like crap because Word is crap.
Re: (Score:2)
(a) LaTeX is a giant pain to use. (I like to say that Knuth turned the art of typesetting into the art of computer programming -- every LaTeX document requires you to write code if it's not completely trivial text.)
(b) The main value of LaTeX is to produce a PDF.
(c) Open source alternatives like LibreOffice are riddled with bugs and misfeatures.
(d) LaTeX itself is a revolt against the byzantine nature of TeX. Leslie Lamport tried to fix it 40 years ago. Not much progress since, other than a million add-ons.
Re:How is there not an Open Source PDF alternative (Score:5, Informative)
It's not meant to be any of the things you've listed.
Just like Word's .doc format or .ppt format, they are internal file formats reliant upon tools to render the human-readable presentation.
As it is based upon PostScript (used in printers to drive creating quality printed materials), the portable aspect means I can give you a PDF that, when you print it on your printer, should match what is printed by my printer.
It is not meant to be used in situations where the things you call out (lightweight, compressible, human-readable) are necessary. And there are plenty of open source formats which do accommodate these requirements.
But PDF is what it is, and for what it is it has survived 30 years in the same way HTML has survived (has undergone changes, but core concepts still remain).
Of course you're welcome to try and create an alternative open-source equivalent, no one will stop you. I can tell you that you're going to be hard-pressed to come up with something that performs the same functions as the PDF does...
Re: (Score:2)
The PDF format became established because it was safe and it worked.
Unfortunately it was created by a company - Adobe I assume - and they kept adding "features" until it was no longer safe. There are Open Source alternatives but they only cover a subset of PDF's capabilities, partially because some are too difficult to implement and partially because some are too dangerous to implement.
Most PDFs I run across are - so to speak - KISS PDFs, they only use the basic functions, unfortunately my bank (for exampl
Re: (Score:2)
PostScript was based on Forth. That's a stack-based language, it might even still be in use somewhere.
Anyhow, as someone mentioned, PostScript used to be good, then Adobe started adding more active code abilities to it, making it a royal pain in the tookus when some yahoo uses those to force you into doing things a certain way. For instance, Adobe has a way of adding some crap that forces you to open a document with at least Adobe Reader or Acrobat. It causes no end of stupidity given Reader leaves drop
Re: (Score:3)
There was https://en.wikipedia.org/wiki/... [wikipedia.org] but I only ever ran into it occasionally.
Re: (Score:2)
Actually DjVu was quite good. BTW, it was developed by the same guys who have given us "deep learning" and the current state of AI, notably Yann LeCun and his buddies.
Re: (Score:3)
Simple answer: There's nothing driving a replacement hard enough to make people change to something else.
Right now PDF has become the standard everywhere. I know if I get my hands on one, I can view it. PC, Tablet, Phone, etc... Everything understands and displays PDF. Trying to implement something new means getting a reader into everything that currently supports PDF. And, if there's no driving reason to abandon PDF... it's not going to happen. Not easily, anyways. Philosophical reasoning isn't
Re:How is there not an Open Source PDF alternative (Score:5, Interesting)
Also it's a bitch to edit. I want to cry every time I have to edit PDFs.
PDFs are for consuming, not creating. Compose your document in something (anything!) else, then render to PDF for distribution so the finished document looks the same on whatever platform it's viewed on, including paper. Need to make changes? Edit your source, then re-render.
If you're trying to edit PDF files, you're doing it wrong.
> HTML is nice, open and editable
And almost never looks the same on multiple platforms, or even different versions of the same platform. It almost seems like HTML and PDF are completely different animals meant for completely different tasks, doesn't it?
Re: (Score:2)
> And almost never looks the same on multiple platforms
Which is the whole damn point. It's supposed to adapt to the capabilities of the platforms.
"Designers" have been struggling to understand this, which is why virtually every modern web page is still 80%+ wasted screen space, only runs on the most popular platforms and browsers without shitting the bed, and runs slower than a block of molasses in the freezer.
Re: (Score:1)
It's expensive to test on all possible clients and client configurations. There are often glitches and nuances that have to be dealt with on a case-by-case basis.
Idealism usually doesn't scale.
Re: (Score:1)
> PDFs are for consuming, not creating. Compose your document in something (anything!) else, then render to PDF for distribution
One can think of them as the "vector equivalent of JPEGs". JPEGs are not meant to be re-edited. But since pixels don't scale up and down as well as vectors, PDF's are used over JPEGs if and when such is desired, which is usually if text is involved.
Re: (Score:1)
> HTML is nice, open and editable
Except it's a PITA to export it to anything else AND not screw up anything in the layout. Or to do automated PDF export from HTML that doesn't f* up the repeating headers (hint: in the browser you can't use the "save page" thing from JS, it always needs the user to do it by hand).
Re:How is there not an Open Source PDF alternative (Score:5, Insightful)
> How is there, after 30 years, not a viable open source PDF alternative?
As PDF is an open standard that does the job beautifully, there is no demand for an alternative.
> PDF SUCKS as a document file.
I disagree. PDF fills the role wonderfully. With it, one can faithfully create an electronic page that exactly mimics the corresponding printed page. I generally despise Adobe, but they did everything right (eventually) with PDF.
> It's bulky, it compresses poorly...
The resulting file size is not a deficiency of the PDF standard, as it has room for expansion of the types of compression it supports. The only time I've seen bloated PDF's is when custom fonts are embedded. Otherwise, the PDF size is rather good. In my job, I generate a lot of customer statements using PDF that contain an embedded font and custom graphics, and the resulting file size is around 80K. Without the embedded font, the file size would be much smaller (similar documents using non-embedded, standard fonts come in at around 15K).
> ...it's not human readable.
Encrypted and/or compressed content and embedded fonts are not human-readable, but PDF's are otherwise plain text.
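To make the "plain text" point concrete, here is a minimal sketch that assembles a one-page PDF by hand, using only uncompressed objects. The function name `build_minimal_pdf`, the US Letter page size, and the choice of the standard Helvetica font are my own assumptions for illustration; nothing here comes from the poster's programs.

```python
def build_minimal_pdf(text: str) -> bytes:
    """Assemble a one-page, uncompressed PDF from plain-text objects."""
    # Escape the characters that are special inside a PDF literal string.
    escaped = text.replace("\\", "\\\\").replace("(", "\\(").replace(")", "\\)")
    # Content stream: begin text, pick font F1 at 24 pt, move to (72, 720), show text.
    stream = f"BT /F1 24 Tf 72 720 Td ({escaped}) Tj ET".encode("latin-1")

    objects = [
        b"<< /Type /Catalog /Pages 2 0 R >>",
        b"<< /Type /Pages /Kids [3 0 R] /Count 1 >>",
        b"<< /Type /Page /Parent 2 0 R /MediaBox [0 0 612 792] "
        b"/Contents 4 0 R /Resources << /Font << /F1 5 0 R >> >> >>",
        b"<< /Length %d >>\nstream\n" % len(stream) + stream + b"\nendstream",
        b"<< /Type /Font /Subtype /Type1 /BaseFont /Helvetica >>",
    ]

    out = bytearray(b"%PDF-1.4\n")
    offsets = []
    for number, body in enumerate(objects, start=1):
        offsets.append(len(out))  # byte offset of this object, needed by the xref table
        out += b"%d 0 obj\n" % number + body + b"\nendobj\n"

    xref_start = len(out)
    out += b"xref\n0 %d\n" % (len(objects) + 1)
    out += b"0000000000 65535 f \n"  # entry 0 is the head of the free list
    for offset in offsets:
        out += b"%010d 00000 n \n" % offset  # each entry is exactly 20 bytes
    out += b"trailer\n<< /Size %d /Root 1 0 R >>\n" % (len(objects) + 1)
    out += b"startxref\n%d\n%%%%EOF\n" % xref_start
    return bytes(out)


if __name__ == "__main__":
    with open("hello.pdf", "wb") as handle:
        handle.write(build_minimal_pdf("Hello, PDF"))
```

Opening the resulting file in a text editor shows every object as readable text; real-world files usually look opaque only because their streams are compressed or encrypted.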
> Adobe keeps a lot of tools to themselves so you have to pay monthly subscriptions to do otherwise basic things (like re-order pages).
The only things stopping you (or anyone else) from creating tools to do the same thing are ability, willingness, and time. Adobe is fully within their rights (legally, morally, and ethically) to charge for their tools, but everything they do (to the best of my knowledge) is with the open, documented PDF specification that is available to everyone.
It's interesting that you should point out reordering pages as something to hold against Adobe, as that is something I and others in my company do on a regular basis using available open source tools. My own programs use iText (actually, its open-source fork that happened when iText went from open source to proprietary) to read PDF, reorder pages, and write them back to new PDF's. Paying for Adobe's software is a choice you make to get the features you want, but you are free to create your own if you so choose. Adobe's PDF software dominance is solely due to its extensive feature set,
Before I discovered iText, I started a short-lived project to do the very thing I eventually settled on iText for. I used the published standards to write basic PDF's containing text and line graphics. I didn't get very far before I discovered iText, so my own project died shortly after it started.
It doesn't surprise me that PDF has dominated for this long, as it does its job very well.
Re: (Score:2)
> It doesn't surprise me that PDF has dominated for this long, as it does its job very well.
If you're willing to continually pay for current Acrobat for ever and ever and ever.... Adobe is committed to making certain it never works for long without feeding that jones.
Re: (Score:2)
> It doesn't surprise me that PDF has dominated for this long, as it does its job very well.
Yeah... among people who greatly prioritize aesthetics over information.
My favorite PDFs are the ones that don't use a standard sheet paper size, especially when in landscape format. User manuals are constantly guilty of this.
Re:How is there not an Open Source PDF alternative (Score:4, Insightful)
> Yeah... among people who greatly prioritize aesthetics over information.
I see a lot of comments like this. Information is not just text. Layout is information as well. It conveys a lot of information. Putting something top-left instead of bottom-right conveys information (ask any newspaper editor... above the fold or below the fold makes a huge difference... yes, I know that is an "old school" example - but it really doesn't matter the medium).
A more extreme example: would you rather have a good graph or a table of values? A good graph will convey the same information much more quickly than just a list of values. Layout matters.
It is much more difficult with electronic information. Designers have to deal with screens that are more than an order of magnitude different in size. HTML is a decent solution to this - to be able to adapt the layout to the screen size - but it is never perfect. "Responsive design" is always a compromise. Either it works good on a large screen or a small screen - rarely both.
PDF is another solution. The designer assumes a certain page size and the PDF renderer does its best no matter where it is. Yes, you get examples where the designer picks an aspect ratio that isn't adapted for someone else's needs (the aspect ratio was chosen because they were designing the manual to be printed and not viewed on a screen).
But my main point: layout is important information. Too many people (especially programmers) think it isn't. (And then they argue over the right placement of braces...)
Re: (Score:2)
> and it's not human readable.
I'm human, and I've been reading PDFs for 30 years without problem.
Re: (Score:3, Funny)
A typical lie by an AI.
Re: (Score:2)
a human... that can read PDFs?? nah, im suspicious. is no one else suspicious of this guy? i mean where did superdave80 even come from? sounds like a computer to me.
Re:How is there not an Open Source PDF alternative (Score:5, Informative)
Don't confuse the PDF format, which is openly specified, with the proprietary tools from Adobe that happen to edit PDF. Here are some FOSS tools that can edit PDF files: LibreOffice Draw, Inkscape, Scribus, KDE Calligra Carbon. For simple reordering / merge / rotate / scale: the CLI suite pdfjam (pdfjam, pdfnup, pdfjoin, pdf90) or the CLI tools that come with the poppler library (pdfunite, pdfseparate).
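As a rough illustration of the reordering case with the poppler tools named above, the sketch below only builds the two command lines: split every page with pdfseparate, then merge the pages back in a new order with pdfunite. The helper name `reorder_commands` and the `page-%d.pdf` naming pattern are hypothetical choices of mine, and the sketch assumes the poppler utilities are installed wherever the commands are eventually run.

```python
def reorder_commands(source: str, new_order: list[int], output: str,
                     pattern: str = "page-%d.pdf") -> list[list[str]]:
    """Return argv lists that split a PDF into pages and merge them in a new order."""
    # pdfseparate writes one file per page, substituting the page number for %d.
    split = ["pdfseparate", source, pattern]
    # pdfunite takes the input files in the desired order, then the output file last.
    pages = [pattern % number for number in new_order]
    merge = ["pdfunite", *pages, output]
    return [split, merge]


if __name__ == "__main__":
    for command in reorder_commands("in.pdf", [3, 1, 2], "out.pdf"):
        print(" ".join(command))
```

Each returned list can be handed to `subprocess.run(command, check=True)`; keeping command construction separate from execution makes the ordering logic easy to check without touching any files.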
Re: (Score:2)
Scribus is telling me that PDF isn't supported, and LibreOffice Draw opens PDFs with no text wrapped and everything overflowing the page and needing a lot of manual adjustment.
I really like Okular for PDFs, only it can't insert images which is the main reason I'd need to edit a PDF.
Re: (Score:2)
The unfortunate truth is that LibreOffice Draw is the best of the FOSS options. Inkscape and Scribus mess up the text boxes more than that.
Re:How is there not an Open Source PDF alternative (Score:4, Interesting)
If your need is PDF annotation with images, then you can try Xournal on Linux, or maybe MuPDF cross-platform (I can't try right now). If you use Okular and just need to add an image somewhere, then Inkscape is also OK, you don't need text boxes. For Scribus one gets better support with -DWITH-PODOFO=ON at build time (with podofo https://github.com/podofo/podo... [github.com] installed on your machine).
PDF bulky? (Score:1)
> PDF SUCKS as a document file. It's bulky, (..)
Depends entirely on the type of contents. Data sheets for electronic components are a good example:
Overview / general description is mostly text, some tables with numbers, etc. This doesn't produce big PDF's. Nowadays they'll begin their life in electronic form, and you get PDF's that are easy to text-search, look sharp etc.
Add some graphs, pie charts, pinouts etc, these type of graphics compress well. But add some embedded fonts, and this can easily add hundreds of KB's to the file size. Same if eg. some b
Re: (Score:2)
PDF SUCKS! Just had to repeat that. The problem is partly that it's a closed standard, but also it _should_ be a read-only document format, and any scripting should be safe and read-only. Instead now it can do more complex things that have opened the door to malware, documents can change, have DRM, etc. Bleh.
Note also that loading it in Adobe is sloooow, even on my fast computer whereas loading in the browser is much faster.
All of this is likely due to Adobe insisting that they be relevant, that people
Re: (Score:2)
But Open Source software can read it and re-ordering pages is fairly easy once you can read it.
Just wondering why XPS didn't get its IE moment... (Score:2)
> ...and the first specification of PDF, the language of documents. Today, both technologies predominate in their respective use cases.
I have always wondered why Microsoft's XPS format for documents never got its Internet Explorer moment - a moment where it became the "default" for lack of a better word, even for a few years!
Re: (Score:3)
XPS came a bit late for MS. Their monopoly ruling really slammed them pretty hard and limited what they were allowed to build into Windows. People forget how important that ruling was to the creation of the modern world we live in now. If it had gone the other way, then Firefox would never have existed, Flash Video would never have existed, iTunes for Windows would never have existed, Google, if it existed at all, would have become an MSN extension, the entire internet as we know it would just be a secondary ex
Re: (Score:2)
Firefox should never have existed. It was nothing more than a removal of perfectly useful features from Seamonkey because a mail reader and IRC client were 'bloat', and within a handful of releases it grew to have a larger runtime footprint than Seamonkey ever had.
Firefox is a great example of something that was all hype, absent any real value. Which is not to say that the Mozilla technology under both browsers was of no value, but FF should not have been a code fork, it should have been a re-brand and refocus.
Re: (Score:2)
There are a couple of things wrong with that.
The Mozilla suite was derived from the Netscape Suite level 4.x, there were then two developments:
- Firefox was split off - as you say - as a lightweight browser-only component.
- The Mozilla Suite was renamed Seamonkey, it consisted of Browser, Mail/News, Web Editor and Calendar (later dropped). At that point it was still possible to install Seamonkey without the Mail/News component. At some point Seamonkey was reduced to importing Firefox changes to fix s
Web kicks WYSIWYG in the balls with metal boots (Score:3)
PDF's proliferate largely because HTML/DOM cannot control text positioning and size accurately. It's unrealistic that every content producer be an expert on "responsive layout" so that the same content auto-flows to fit watches and wide-screen desktops at the same time. Even developers struggle with that goal*. Thus, they use PDF's.
Further, if HTML/DOM at least had the option of precise positioning, then it would be much easier to make and have interactive diagrams, such as click-able flowcharts, class diagrams, ERD's etc. Auto-flow works poorly on charts; they really need direct positioning to do right. (SVG and Canvas lack most of the interactive and form input features needed for such.)
It's time for a new web standard to supplement HTML/DOM: a state-ful GUI markup language that does CRUD & GUI well, and has precise text positioning, at least as an option. It could directly support the missing-or-defective GUI elements of HTML/DOM [reddit.com] so that we don't have to keep reinventing GUI engines using the ill-suited JS+DOM. (JS is a glue language, not an infrastructure-building language.)
* I'm more a full-stack-developer, so don't have time to micromanage the full screwiness of CSS and DOM's auto-flow UI nightmare. An org generally has to hire a UI specialist to do such well, but that's both expensive and creates a dependency on a UI specialist. DOM is inherently too flawed to do GUI's correctly without tons of testing, trial-and-error, or a giant learning curve. Common internal CRUD should NOT require rocket science. Most internal biz apps are not even used on phones nearly often enough to pay the big auto-shrink-tax over. YAGNI & KISS still matter. Stacks that handle too many what-if's become WTF's: Swiss Army Spaghetti. There are ways to get a degree of auto-stretch and WYSIWYG at the same time, but first we need a more controllable UI standard.
Re: (Score:2)
> PDF's proliferate largely because HTML/DOM cannot control text positioning and size accurately.
So true and very frustrating. Same problem making SVGs.
Re: Web kicks WYSIWYG in the balls with metal boot (Score:3)
I thought the idea of HTML was to tag the content and let the browser flow the text and stuff?
How does the content creator know what the browser is set up to display or upon what it is displayed/printed/spoken?
Re: (Score:1)
Sometimes you don't want to let the browser auto-compute the layout because:
1. It isn't smart enough to do what you want, and/or
2. Inconsistent across browser brands/versions, and/or
3. You want to control placement yourself for various reasons (or via a server-side layout engine).
Auto-flow has its use cases, but it shouldn't be the ONLY option.
Re: (Score:3)
> That complaint really disappeared 10-20 years ago. You can be extremely specific with fonts, font sizes, text positioning, etc. It's trivially easy to convert a PDF to standardized HTML, just stick every text object in its own element, and place the element using absolute coordinates, display block, every aspect of the font, etc.
Those who tried that say it doesn't work; they end up having to draw text bit by bit in emulators (or at least vector by vector). OS DPI settings, zoom levels, and other factor
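For readers who want to see what the quoted suggestion amounts to, here is a minimal sketch of the "one absolutely positioned element per text object" idea. The input format (dicts with `x`, `y`, `size`, `font`, `text` in PDF points, origin at the bottom-left) and the function name `text_objects_to_html` are assumptions of mine, not the output of any real PDF parser. The vertical placement is only an approximation, since PDF positions text by its baseline while CSS positions the top of the box, which is one small instance of the mismatch the reply is describing.

```python
from html import escape


def text_objects_to_html(objects, page_width_pt, page_height_pt):
    """Render text objects as absolutely positioned spans inside one page-sized div."""
    spans = []
    for obj in objects:
        # PDF measures y upward from the bottom edge; CSS measures top downward.
        # Subtracting the font size roughly converts a baseline into a box top.
        top = page_height_pt - obj["y"] - obj["size"]
        style = (
            f"position:absolute;left:{obj['x']}pt;top:{top}pt;"
            f"font-family:{obj['font']};font-size:{obj['size']}pt;"
            "white-space:pre"
        )
        spans.append(f'<span style="{style}">{escape(obj["text"])}</span>')
    page_style = (
        f"position:relative;width:{page_width_pt}pt;height:{page_height_pt}pt"
    )
    return f'<div style="{page_style}">' + "".join(spans) + "</div>"


if __name__ == "__main__":
    sample = [{"x": 72, "y": 720, "size": 12, "font": "Helvetica", "text": "a < b"}]
    print(text_objects_to_html(sample, 612, 792))
```

The sketch says nothing about glyph widths, kerning, or fallback fonts, so two browsers can still break or space the same span differently even though its origin is pinned.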
Acrobat Reader originally priced at $50 per user (Score:2)
Acrobat Reader was originally priced at $50 per user until the IRS purchased a right to distribute Reader 1.0.
After that, it became the way to read PDF files.
PDF is not what you think (Score:2)
PDF stands for Proprietary Document Format. Very shortly after it became a document format, it became a very specialized tool for the distribution of malware. It is not a "Document" format, it is a Malware Distribution Format.
One may take action to turn off the Malware Distribution Features and return it to a Document format, but almost nobody can be bothered to do that -- hence PDF is one of the most dangerous file formats in existence, save that Flash crap.
"Many websites are - ..find a specific PDF" (Score:2)
I'll grant that PDF documents are, for better or worse, important, that statement simply doesn't stand up to investigation.
I don't know what percentage of websites are "mostly navigation to help visitors find a specific PDF", but I'll wager a bet it's low.
PDF has a torrid history of being closed source - it was a proprietary format right up to 2008. Sure, that's pretty much half its current lifespan, but the pa