
 



Windows Microsoft Operating Systems Software

Windows is Bloated, Thanks to Adobe's Extensible Metadata Platform (bit.ly) 135

An anonymous reader shares a report: Over the weekend, I put together a little tool that scans executable files for PNG images containing useless Adobe Extensible Metadata Platform (XMP) metadata. I ran it against a vanilla Windows 10 image and was surprised that Windows contains a lot of this stuff. Adobe XMP, generally speaking, is an Adobe technology that serializes metadata like titles, internal identifiers, GPS coordinates, and color information into XML and jams it into things, like images. This data can be extremely valuable in some cases but Windows doesn't need or use this stuff. It just eats up disk space and CPU cycles. Thanks to horrible Adobe Photoshop defaults, it's very easy to unknowingly include this metadata in your final image assets. So easy, almost all the images on this site are chock full of it. But you can appreciate my surprise when a bunch of important Windows binaries showed up in my tool.
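A rough idea of how such a scanner could work, as a minimal Python sketch rather than the author's actual tool: search the binary for the PNG signature, walk the chunks that follow, and count `iTXt` chunks whose keyword is `XML:com.adobe.xmp` (the keyword XMP uses inside PNG). The function names are hypothetical, and CRCs are not verified, so a stray signature in unrelated data could be miscounted as an image.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
XMP_KEYWORD = b"XML:com.adobe.xmp\x00"  # iTXt keyword plus its null terminator


def iter_chunks(blob, start):
    """Yield (type, data) for each chunk of the PNG whose signature is at `start`."""
    pos = start + len(PNG_SIGNATURE)
    while pos + 8 <= len(blob):
        # Chunk layout: 4-byte big-endian length, 4-byte type, data, 4-byte CRC.
        length, ctype = struct.unpack(">I4s", blob[pos:pos + 8])
        data = blob[pos + 8:pos + 8 + length]
        if len(data) < length:
            return  # truncated stream or a false-positive signature
        yield ctype, data
        if ctype == b"IEND":
            return
        pos += 12 + length


def xmp_bytes_in(blob):
    """Return (png_count, xmp_bytes) for all PNG streams found in `blob`."""
    png_count = 0
    xmp_bytes = 0
    start = blob.find(PNG_SIGNATURE)
    while start != -1:
        png_count += 1
        for ctype, data in iter_chunks(blob, start):
            if ctype == b"iTXt" and data.startswith(XMP_KEYWORD):
                xmp_bytes += 12 + len(data)  # data plus length, type and CRC fields
        start = blob.find(PNG_SIGNATURE, start + 1)
    return png_count, xmp_bytes
```

Calling `xmp_bytes_in(open(path, "rb").read())` on an EXE or DLL would give the number of embedded PNG streams and the bytes spent on XMP chunks, under the assumptions above.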
This discussion has been archived. No new comments can be posted.

  • Windows is Bloated (Score:4, Insightful)

    by decipher_saint ( 72686 ) on Wednesday April 26, 2017 @09:43AM (#54304963)

    Full Stop.

    • by caseih ( 160668 )

      To be fair, your average Linux distro is pretty fat too. A basic installation of, say Linux Mint, can still run several GB. Granted the default installs of most Linux distros include a fair amount of utility programs and full-blown applications, such as LibreOffice, that Windows does not include.

      It is pretty embarrassing for MS to have 40% of an EXE consist of this unnecessary XML code.

      • by gmack ( 197796 )

        A basic Linux install also tends to have an office package and other productivity apps so it's not comparable.

        For a better comparison, I just checked my monitoring station and it uses 4 GB of drive space and that includes two web browsers but not an office package. We recently tried to install Windows on one of those machines and abandoned the idea after Windows filled the 14 GB MMC drive to the point where it installed, but barely, and without enough room to install updates.

    • Agreed, but mainly because the system contains many things that are not necessary for it to work yet can't be removed anyway. I don't need "Groove Music", a maps app, or the Xbox app.
      It'd be nice if we could get a barebones OS and then install things as needed. Yeah I know, I also use Linux and I love how you can install some minimal distros without even a UI.
      • As with a lot of annoying Microsoft things these days; the fact that you can't is more of a licensing issue than a technical one.

        On the desktop, Windows 10 LTSB is the de-crapified version you actually want; but haha, volume-licensed enterprise SKUs only!

        If you have the appropriate Windows Server version license; you can install "server core [microsoft.com]" or "nano server [microsoft.com]"; which have even more cut out; but while that can at least be purchased in single units; it's a fairly expensive way to declutter a workstation.
        • Yeah, I know about the LTSB version and it's obvious to anyone that knows a bit about computers that all those apps and bits aren't actually needed to run the OS. I'd actually be willing to pay for a clean version of Win 10 (well, I'd rather they change the UI also but that's another matter) but as you said they won't sell it to individuals.
        • by Karlt1 ( 231423 )

          In fact Microsoft will sell you a decrapified version if you buy a computer directly from them through the Microsoft Store - either a brick-and-mortar store or online.

      • As an entrepreneur, I would like to know that when you say "It'd be nice if we could get a barebones OS" does that mean you'd pay money to have that?

        • Yes, but to be honest I actually meant a clean version of Windows. I use a lot of software, ok, let's be honest, games, that only work on Windows so I can't easily move to other OS.
          Also, clean versions of Linux already exist.
          Anyway, I doubt that's a priority to many people.
  • So Adobe photoshop puts metadata in PNG images that can cause "bloat".

    OK.

    Riddle me this batman: Why the hell should the Explorer.exe binary compiled from C code have 20% of its bytes be from an Adobe photoshop metadata tool? Ditto for a DLL that's not a PNG asset?

    I think this guy ran a program that misinterpreted some bytes in a binary since it's not really designed to be a general-purpose parser and then jumped to a really really dumb conclusion.

    • by Anonymous Coward on Wednesday April 26, 2017 @10:16AM (#54305235)
      Clippy says: It looks like you're going full retard. Would you like to learn about RC, the Microsoft Resource Compiler which can be used to embed PNG images into exes and dlls?
    • Programs have graphic resources embedded inside them all the time. They would take up a lot less space without unnecessary/unusable metadata. This stuff should be automatically stripped during the process of embedding anyway.

      • No they shouldn't. Create a *policy* of stripping metadata and enforce it through code audits. Embedding resources (or not) into a file is a developer decision, not a compiler decision. The compiler has no way of knowing which bytes of the resource you embed are important and which are not, be they strings, PNGs, or anything else.

        • You probably want the metadata on the source - there's a lot of useful information you can store there and it's much more onerous to strip it after every edit. It should only be stripped during compile time. Making this the default but allowing full control is a much better way of handling that.
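For illustration, a build-time stripping step of the kind argued over in this sub-thread could look roughly like the Python sketch below. It copies a PNG stream while dropping the textual chunks (`tEXt`, `zTXt`, `iTXt`, the last of which carries XMP). The function name is hypothetical, and which chunks count as disposable is exactly the policy decision the parent comment says belongs to the developer.

```python
import struct

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"
TEXT_CHUNKS = {b"tEXt", b"zTXt", b"iTXt"}  # textual metadata, including XMP


def strip_text_chunks(png):
    """Return a copy of `png` without textual metadata chunks."""
    if not png.startswith(PNG_SIGNATURE):
        raise ValueError("not a PNG stream")
    out = [PNG_SIGNATURE]
    pos = len(PNG_SIGNATURE)
    while pos + 12 <= len(png):
        (length,) = struct.unpack(">I", png[pos:pos + 4])
        ctype = png[pos + 4:pos + 8]
        end = pos + 12 + length  # length + type + data + CRC
        if ctype not in TEXT_CHUNKS:
            # Each chunk's CRC covers only its own type and data,
            # so kept chunks can be copied through unchanged.
            out.append(png[pos:end])
        pos = end
        if ctype == b"IEND":
            break
    return b"".join(out)
```

The source asset keeps its metadata; only the copy handed to the resource compiler would be stripped.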

    • Because compiled EXE files can have embedded resources, such as sounds, graphics, etc.

      Most commonly for Windows applications, this is icons - usually several variations and versions for different themes, sizes and resolutions.

      =Smidge=

    • Windows binaries can contain embedded images. For example, the start button is an embedded image you can dump from explorer.exe.
  • by Anonymous Coward

    /. doesn't care about bloat...

    the bit.ly link goes to https://www.thurrott.com/windo... [thurrott.com]

  • by JoeyRox ( 2711699 ) on Wednesday April 26, 2017 @09:50AM (#54305023)
    The Windows executable loader doesn't look at this extraneous XMP data so why would it consume CPU cycles?
    • Maybe the Windows executable loader doesn't care. But what about the bloat loader that runs when you first power up the machine?

      Even if it takes no cpu cycles, it is a waste of disk space that could have been used to hold pr0n. Think of the space that could be saved merely by shortening Microsoft to MS everywhere it appears.
    • by dbialac ( 320955 )
      There is the time taken to read the file into memory, but I'm going out on a limb that the calculated 5MB isn't going to make much of a difference were it purged.
      • There is an asymmetric relationship between the time it takes to access data and the time it takes to read, even with fast SSDs with low command overhead. So the additional transfer time of 5MB over a large set of files will be a rounding error relative to the total time spent issuing the I/Os.
      • by ljw1004 ( 764174 )

        There is the time taken to read the file into memory, but I'm going out on a limb that the calculated 5MB isn't going to make much of a difference were it purged.

        It doesn't read the file into memory. It only reads the pages that need to be read. You can have a 100mb file, and if you only attempt to read a 1mb chunk in the middle, then the rest of it won't even be read off disk.
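The same idea can be tried from user code with a memory-mapped file. The Python sketch below maps a 100 MB file and touches only a 1 MB slice in the middle; which pages actually come off disk is up to the operating system, so this is an analogy for what the executable loader does, not a measurement of it. The helper name and file name are made up for the demo.

```python
import mmap
import os
import tempfile

MB = 1024 * 1024


def read_middle(path, offset, size):
    """Map the file and copy out only `size` bytes starting at `offset`."""
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as view:
            # Slicing touches only the pages that back this range.
            return view[offset:offset + size]


# Demo: a 100 MB file with a marker written at the 50 MB mark.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "big.bin")
    with open(path, "wb") as f:
        f.seek(50 * MB)
        f.write(b"marker")
        f.truncate(100 * MB)
    chunk = read_middle(path, 50 * MB, MB)
```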

    • by Luthair ( 847766 )
      Presumably they need to parse the images at some point.
      • They don't need to parse it - it's described as separate sections within the executable.
        • So you are claiming that they don't need to parse the image data - that apparently contains XML metadata. But just blit those raw bytes to the screen and bingo, the image is displayed?

          I would have thought they wouldn't have been retarded enough not to use an image format, say like PNG, but that would require parsing so I guess not.

          • I'm not really sure what you mean by "parsing" in this instance, but a decoder is free to skip over irrelevant chunks (real name for them) in a PNG file and access only the chunks it needs (header, palette, image data, etc.).

          • Windows doesn't display the XMP data from executables, at least not without prompting from the user or a Win32 app.
    • During the very quick process of transferring the data from the CPU to RAM. Though with bus mastering, I'm not sure you'd get even one more CPU cycle even with additional KB of data to be loaded (I'm not sure what goes on down at that level).

    • Because PAGEVIEWS!

  • by Anonymous Coward on Wednesday April 26, 2017 @09:52AM (#54305039)

    As can be seen from the link in his comment section, the total of wasted space his tool found was 5MB. On a whole windows system, comprising several GB.
    Even if his tool didn't just find some false positives, that's basically nothing at all.

    Nothing to see here, move along.

    • by sinij ( 911942 ) on Wednesday April 26, 2017 @11:01AM (#54305609)

      As can be seen from the link in his comment section, the total of wasted space his tool found was 5MB.

      This is well over 7500 punch cards, you insensitive clod. This would cover multiple football fields!

    • Agreed, but just for a fun comparative reference, the size of an entire Windows 3.1 installation was less than 15 MB :)
      • by bmk67 ( 971394 )

        For another fun comparative reference, the average size of a typical hard drive is over three orders of magnitude larger in 2017 compared to 1992.

        Wasting 5MB across a shitton of files is noise - would you care about a stray 5MB file? I sure wouldn't.

        A great deal of it is probably lost in partially allocated blocks, and thus does not use any additional space over and above what the file already uses.
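The point about partially allocated blocks comes down to rounding. A small Python sketch, assuming a 4 KiB cluster (the usual NTFS default, but an assumption here) and hypothetical helper names:

```python
CLUSTER = 4096  # assumed cluster size in bytes


def allocated(size, cluster=CLUSTER):
    """Bytes of disk reserved for a file of `size` bytes."""
    return -(-size // cluster) * cluster  # round up to a whole cluster


def bytes_freed(size, removed, cluster=CLUSTER):
    """Disk space recovered when `removed` bytes are trimmed from the file."""
    return allocated(size, cluster) - allocated(size - removed, cluster)
```

With these numbers, trimming 1000 bytes from a 10000-byte file frees nothing (it still occupies three clusters), while trimming 1000 bytes from a 9000-byte file frees a whole 4096-byte cluster. Whether a given file shrinks on disk depends on where it sits relative to a cluster boundary.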

        • by Anonymous Coward

          Multiply that 5 megs with every windows user and it's a shit ton of space wasted for absolutely nothing.

          • by Raenex ( 947668 )

            Multiply that 5 megs with every windows user and it's a shit ton of space wasted for absolutely nothing.

            So? How many gigs of unused hard drive space is sitting around the average user device?

    • Next up: shocking story of wasteful millionaire that *doesn't* spend hours clipping coupons to save pennies.

  • Ok, Windows users, the rest of us have been telling you how Windows is a nightmare on many levels for years, nay, decades. If you are shocked about anything Windows 10 does, you have been ignoring the people around you ergo you only have yourself to blame at this point. Stop making excuses and make a clean break from Windows forever.

    • by Megol ( 3135005 )

      So who are "us" and who are the Windows 10 users that are shocked? What are they shocked about? What excuses are they making? What nightmares are hidden in the dark bowels of the dreaded Windows world?

      Either you are imagining things or posting to the wrong place...

  • The couple of kilobytes per file for some XML stream is minuscule and immaterial, a few
    megabytes per computer. MS is smart to not step over a dollar to pick up a penny.

    The REAL bloat comes from executable code modules' executable code, the lack of a proper package management system for DLL dependencies, and keeping around multiple preceding revisions of each library with SxS backups as the system is updated by Windows Update, or keeping unnecessary libraries around as software is uninstalled; However, on the

  • by Chrisq ( 894406 ) on Wednesday April 26, 2017 @10:04AM (#54305135)
    Though XMP was developed by Adobe it is now an ISO standard [iso.org]. Also almost every editor or camera will include XMP data, not just photoshop
    • OOXML is also an "ISO standard", and it's both about as standard or useful as Adobe XMP.

      • by thegarbz ( 1787294 ) on Wednesday April 26, 2017 @12:26PM (#54306487)

        Except that Adobe actually made the standard open rather than just open enough to break, implemented the standard in their own software in the same way as the public toolkit they made available under the BSD license rather than killing it with their own software, and promoted it to get widespread and compatible acceptance for both proprietary and open source software that touches media.

        The only thing XMP and OOXML share in common is the first three letters in their standard designation.

  • ...my iPhone keeps running out of space. All of the photos I take with it are storing that dad-gum EXIF data. *sarcasm inserted*

  • ...I wrote a tool to find duplicate files in Windows NT.

    I was very careful not to include links (believe it or not NT supported the equivalent of unix Hard Links and SoftLinks). Once suspicious duplicates were found, I first sorted by file size, then checksummed files of the same size, and finally ran a bytewise comparison on files with matching checksums to ensure an exact match before naming culprits.

    It was always a source of amazement to find the exact same file located in at lea
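The pipeline described in this comment (skip links, group by size, checksum, then confirm bytewise) can be sketched in Python as follows. This is not the poster's NT tool: the function names, the choice of SHA-256, and the hard-link check via `st_nlink`/`st_ino` are all assumptions made for the example.

```python
import filecmp
import hashlib
import os
from collections import defaultdict


def file_digest(path, chunk_size=1 << 20):
    """SHA-256 of a file, read in chunks to keep memory use flat."""
    h = hashlib.sha256()
    with open(path, "rb") as f:
        for block in iter(lambda: f.read(chunk_size), b""):
            h.update(block)
    return h.hexdigest()


def find_duplicates(root):
    """Return lists of paths under `root` whose contents are byte-identical."""
    by_size = defaultdict(list)
    seen_ids = set()
    for dirpath, _dirs, names in os.walk(root):
        for name in names:
            path = os.path.join(dirpath, name)
            if os.path.islink(path):
                continue  # skip symbolic links
            st = os.stat(path)
            if st.st_nlink > 1:
                file_id = (st.st_dev, st.st_ino)
                if file_id in seen_ids:
                    continue  # hard link to a file already counted
                seen_ids.add(file_id)
            by_size[st.st_size].append(path)

    groups = []
    for paths in by_size.values():
        if len(paths) < 2:
            continue  # a unique size cannot have a duplicate
        by_hash = defaultdict(list)
        for path in paths:
            by_hash[file_digest(path)].append(path)
        for candidates in by_hash.values():
            # Confirm checksum matches with a full bytewise comparison.
            while len(candidates) > 1:
                first, rest = candidates[0], candidates[1:]
                same = [first] + [p for p in rest
                                  if filecmp.cmp(first, p, shallow=False)]
                if len(same) > 1:
                    groups.append(sorted(same))
                candidates = [p for p in rest if p not in same]
    return groups
```

Ordering the checks from cheapest to most expensive means most files are ruled out by size alone and never read at all.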
    • (believe it or not NT supported the equivalent of unix Hard Links and SoftLinks)

      Do you have a source for this? I'm only finding symbolic links in XP from googling. And that was only kernel mode; user-mode symlinks debuted in Vista.

      NTFS junction points
      The Windows 2000 version of NTFS introduced reparse points, which enabled, among other things, the use of Volume Mount Points and junction points. Junction points are for directories only, and moreover, local directories only; junction points to remote shares are unsupported.[12] The Windows 2000 and XP Resource Kits include a program called linkd to create junction points; a more powerful one named Junction was distributed by Sysinternals' Mark Russinovich.

      Not all standard applications support reparse points. Most noticeably, Backup suffers from this problem and will issue an error message 0x80070003[13] when the folders to be backed up contain a reparse point.

      Shortcuts
      Shortcuts, which are supported by the graphical file browsers of some operating systems, may resemble symbolic links but differ in a number of important ways.

      Symbolic links to directories or volumes, called junction points and mount points, were introduced with NTFS 3.0 that shipped with Windows 2000. From NTFS 3.1 onwards, symbolic links can be created for any kind of file system object. NTFS 3.1 was introduced together with Windows XP, but the functionality was not made available (through ntfs.sys) to user mode applications. Third-party filter drivers – such as Masatoshi Kimura's open-source senable driver – could however be installed to make the feature available in user mode as well. The ntfs.sys released with Windows Vista made the functionality available to user mode applications by default.

  • https://imageoptim.com/ [imageoptim.com]

    The bloat is so bad that for a long time I had co-workers who only wanted to use GIF because "PNG files are much bigger". They just could not accept that Adobe was the problem until I ran their Photoshop-saved PNGs into ImageOptim, giving us PNG files smaller than the GIF version.

  • Okay, we've got file header data being sent to MS (who recently disclosed the data collected confirming this), pushing ads through its "live tile" interface, the Superfetch (whose effectiveness is in doubt but its use of memory is not) and monitoring services to ensure you don't copy media data MS partners don't like. Come on. This was bloatware before Adobe. Now Adobe has long had more bloatware and adware, plus the new ad javascript tags Adobe is putting out on websites. MS Windows takes a lot of your c
  • by sootman ( 158191 ) on Wednesday April 26, 2017 @10:38AM (#54305429) Homepage Journal

    "Thanks to horrible Adobe Photoshop defaults, it's very easy to unknowingly include this metadata in your final image assets."

    If you're saving for the web, use the "save for web and devices" option and it should strip out most, if not all, extraneous data. That's why it's there. If you just do File -> Save As it'll include other stuff.

    • If you're saving for any embedded asset. It's good for more than just web, because you can evaluate the output and make sure the quality is what you want.

    • And after the people at marketing have sent those .png files to us devs I'll run them through pngcrush [wikipedia.org] to trim them to ~70% of the original size. Then they're ready for the web.
    • by Falos ( 2905315 )

      While in that pane, I have a dropdown that says "Save as HTML and image" or "Save as image only".

      I run an old photoshop7.0 though. Has all the features I'll ever use but with instant bootup.

  • Over the weekend, somebody put together a useless tool that scans executable files for PNG images containing useless Adobe Extensible Metadata Platform (XMP) metadata. Some small amount of extra data was found, and a report was written about it. The report might be useful to somebody, but slashdot doesn't need or use this stuff. Thanks to editors and reviewers who don't pay any attention, it's very easy for these reports to get published amongst actual news. So easy, many news sites like reddit are chock

  • This is what I consider Fake News. Someone is seriously concerned about the meta data in their image files taking up space? Are they still using floppy disks? How is a few MB of meta data considered bloat? Dumb

  • We have finally solved one of the biggest mysteries for the past 30 years: Why is Microsoft Windows so bloated? Adobe, those bastards. I suspect that was Adobe's plan all along. It's clear Adobe sought to sabotage Microsoft's efforts so that they could supplant them in the OS market with their own operating system.
  • Windows 10 does come with lots of location and tracking services. This must all be part of the implementation. Add location XML tags to all of the files and they can track who uploads illegal copies of Windows to a torrent - then automate sending the authorities to your house.

  • I wonder how much of this stuff is really leftover Adobe metadata and how much is components of malware?

    With 20% to 40% of the code/data space of major applications composed of "along for the ride" data that's never interpreted, there's a LOT of room for malware to park itself, its redundant copies, its resources, and its purloined data without having to actually create files of its own.

  • Thanks to horrible Adobe Photoshop defaults

    Yeah it's absolutely horrible that an image editor faithfully saves data the same way it opens it without silently stripping things out. Oh the humanity of opening and saving a picture I took and finding out the metadata which was originally recorded is still intact.

    What's even worse is that there's a dedicated save dialogue to share data which gives people the option of not destroying their metadata when they hit save. What a horrible horrible idea.

    • I know. Next you'll be telling me people don't like the fact I forward-on any word file I get as a stripped down plain text file. None of that fancy unicode business either, use proper English I say.

  • The entire bloat on disk is about 5MiB in size. For an OS around 20GiB, that is less than 0.1% bloat from this issue. It is also only an issue with a small handful of files (we're talking like 5-10 files total, a bunch of which are just SxS copies of explorer.exe)

    This should have just been reported as a simple bug in explorer.exe, not turned into a witch hunt claiming to be THE BLOAT of Windows. This is next to nothing overall.

    Source: The guy uploaded his entire scan dump: https://gist.githubusercontent... [githubusercontent.com]

  • If I post a picture (more so of the kids) I'll scrub it of EXIF data, didn't know of adobe XML.
    https://en.wikipedia.org/wiki/... [wikipedia.org]

  • From 4.7 MB down to 1.3 MB. I don't think binaries should be compressible that much.
    • A binary that was compiled with a lot of code in-lining will be very compressible, but will execute faster than one with less duplication and more branches. It can also use less memory at runtime. While it will consume more code space, it can run with fewer stack frames.
