Scaling Algorithm Bug In Gimp, Photoshop, Others
Wescotte writes "There is an important error in most image scaling algorithms. All software tested has the problem: The Gimp, Adobe Photoshop, CinePaint, Nip2, ImageMagick, GQview, Eye of Gnome, Paint, and Krita. The problem exists across three different operating systems: Linux, Mac OS X, and Windows. (These exceptions have subsequently been reported; this software does not suffer from the problem: the Netpbm toolkit for graphic manipulations, the developing GEGL toolkit, 32-bit encoded images in Photoshop CS3, the latest version of Image Analyzer, the image exporters in Aperture 1.5.6, the latest version of Rendera, Adobe Lightroom 1.4.1, Pixelmator for Mac OS X, Paint Shop Pro X2, and the Preview app in Mac OS X starting from version 10.6.) Photographs scaled with the affected software are degraded because the algorithms fail to account for monitor gamma. The degradation is often faint, but most pictures probably contain at least one area where it is clearly visible. I believe this has been happening since the first versions of these programs, maybe 20 years ago."
short version (Score:5, Informative)
Most scaling algorithms treat brightness as a linear space: e.g., when downscaling to 1/2 the size in each dimension, they collapse 4 pixels into 1 by setting that pixel to the numerical average of the original 4. But most images are displayed under the assumption that brightness is a nonlinear space, i.e. gamma > 1. Therefore scaling changes the perceived brightness, an unexpected result.
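A minimal sketch of the difference in Python, assuming a pure 2.2 power law stands in for the real (piecewise) sRGB curve:

```python
import numpy as np

GAMMA = 2.2  # assumption: pure power law standing in for the real sRGB curve

def downscale_2x_naive(img):
    # Average each 2x2 block directly on the gamma-encoded 8-bit values.
    h, w = img.shape
    return img.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))

def downscale_2x_linear(img):
    # Decode to linear light, average each 2x2 block, then re-encode.
    linear = (img / 255.0) ** GAMMA
    h, w = img.shape
    avg = linear.reshape(h // 2, 2, w // 2, 2).mean(axis=(1, 3))
    return 255.0 * avg ** (1.0 / GAMMA)

checker = np.tile([[0, 255], [255, 0]], (4, 4))  # 8x8 black/white checkerboard
print(downscale_2x_naive(checker)[0, 0])   # 127.5: too dark on screen
print(downscale_2x_linear(checker)[0, 0])  # ~186: matches the perceived brightness
```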
Re:Monitor gamma? (Score:5, Informative)
The data in the pictures is not linear data. It assumes that it will be displayed on a system that introduces a gamma of 2.2. (If your display system does not do that physically, it should correct for this.) That is, a gray 127 should not display as halfway between a white 255 and a black zero, in terms of light output. (It should *appear* halfway between them visually, because your eyes aren't linear — that's (part of) why gamma is in use in the first place.) So, a checkerboard pattern of white / black squares will have half the luminosity of the white squares. When scaling down, software will turn it into a bunch of gray pixels. But they should be gray pixels of value 186, not 127.
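A quick back-of-the-envelope check of that 186 figure, assuming a pure 2.2 power law:

```python
# Average black (0) and white (255) in linear light, then re-encode.
linear_avg = ((0 / 255) ** 2.2 + (255 / 255) ** 2.2) / 2   # = 0.5 in linear light
print(round(255 * linear_avg ** (1 / 2.2)))                # 186, not 127
```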
The page is not well written, but his example images make the issue very clear. It's not about your monitor gamma; it's about the "standard gamma" that all image files assume your monitor has.
Re:This isn't really a bug (Score:5, Informative)
But most people who use images expect to look at them eventually. And most image files are meant to be viewed at gamma 2.2. (Printer drivers will at least approximately emulate a gamma of 2.2, and LCDs emulate it intentionally.) If you view the image at some other gamma, you don't see quite what was intended.
Another way of looking at it is that most standard image formats are stored with a nonlinear representation, and people who do math should realize that. For an untagged image, gamma=2.2 is a good bet. gamma=1.0 is a terrible bet.
Of course, if we really want our software to do a good job, then that software should be aware that specifying colors like #FF0000 isn't a good idea -- they look very different on different screens. What the user probably meant to do was specify a particular color, which means that the numbers need to be marked with a color space. (For a great demo, get an HP LP2475w or some other good wide-gamut display, don't install a profile, and look at anyone's photo album. Everyone looks freakishly red-faced.)
Re:short version (Score:3, Informative)
Brightness by itself is not a function, so it can't be linear or nonlinear.
Displayed luminosity is a function of the data value in the image file, which is what the OP meant by brightness. And it most certainly can be linear or non-linear.
But then, I suppose you already knew that.
Re:Monitor gamma? (Score:5, Informative)
Actually, any well-specified file format will specify the gamma. Not all allow you to set it per-file, but they do specify it. Normally this is a line in the spec that reads something like "color values use the sRGB color space," which specifies a gamma of roughly 2.2. And sRGB, with its nearly-2.2 gamma, has become so standard that assuming anything else (in the absence of a clear spec) would be idiotic.
Re:Oh dear. Linear color space again, 11 years lat (Score:4, Informative)
The example images that make it really clear are contrived cases. But the scaled photos all change enough to be worth noticing and caring about if you're a serious amateur photographer (never mind a professional). And they don't look particularly unusual to me (I haven't looked for odd trickery, but I assume he's being honest here).
Old news (and workaround) (Score:5, Informative)
Ok, so he made a very informative page about it, but this is still a well-known effect. It affects practically everything you can do in image editing: blurs, etc. Most people neither notice nor care. It's rooted in the fact that most images come with undefined black and white points and a gamma chosen for artistic effect rather than physical accuracy, so correctly converting to linear gamma is hardly ever possible. You can still correct for monitor gamma to avoid some rarely seen inconsistencies and artifacts, but most people don't even notice, so why bother? However, Photoshop does have everything you need to avoid the effect completely, even in the ancient Photoshop 6.0.
Here's how to properly resize in Photoshop:
1. Convert mode to 16 bit (to avoid tone aliasing in the next step; it has no other influence on the calculations)
2. Convert to profile, select "Custom RGB", set Gamma to 1.0 (this converts the internal image data to linear gamma, no visible change because the image is color managed and corrected back to monitor gamma on the fly)
3. Image Size
4. Convert to profile, select "Custom RGB", set Gamma to 2.2 (default)
5. Convert mode to 8 bit
Done. You can substitute your favorite image filter for the image resize. Unsharp mask works much better at gamma 1.0, for example. Of course you can use several filters before converting back to monitor gamma and 8 bit.
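The same recipe is easy to script outside Photoshop. A minimal sketch with NumPy, SciPy, and Pillow (the pure 2.2 power law and the file names are stand-ins for the exact ICC profile conversions the steps above perform):

```python
import numpy as np
from PIL import Image
from scipy.ndimage import zoom

GAMMA = 2.2  # stand-in for the "Custom RGB" profile conversions in steps 2 and 4

def resize_linear_light(path, scale):
    img = np.asarray(Image.open(path).convert("RGB"), dtype=np.float64)
    linear = (img / 255.0) ** GAMMA                  # steps 1-2: high precision, gamma 1.0
    resized = zoom(linear, (scale, scale, 1), order=1)             # step 3: bilinear resample
    encoded = 255.0 * np.clip(resized, 0.0, 1.0) ** (1.0 / GAMMA)  # step 4: back to gamma 2.2
    return Image.fromarray(encoded.round().astype(np.uint8))       # step 5: back to 8 bit

resize_linear_light("photo.png", 0.5).save("photo_small.png")
```

As above, any filter can be substituted for the resample in step 3.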
Re:Monitor gamma? (Score:3, Informative)
Re:A great demo... (Score:5, Informative)
If you're looking for lcd test images, http://www.lagom.nl/lcd-test/ [lagom.nl] is probably better. It's got a whole bunch of images dedicated to various monitor problems, along with explanations.
Re:Author expands scaling definition (Score:3, Informative)
You're absolutely correct, AC. The reported issue isn't about a linear/nonlinear gamma bug at all - it's an averaging side effect.
The sample Dalai Lama image on TFA's page is intentionally constructed of interlaced lines of red and green data to thwart the averaging of source data used in common scaling algorithms. If you use the Gimp with the "None" scaling method, which just picks up every other row and column when scaling by 50% (instead of trying to average 2x2 grids), you get a mostly-green image instead of the advertised gray one.
Re:Oh dear. Linear color space again, 11 years lat (Score:4, Informative)
It's basically an implementation issue. The algorithms may be fine as intended ... in linear space. The programmers who implemented them didn't understand linear vs. gamma, or didn't care, or had a fire-breathing PHB on their back. Hence we get junk software.
At least all MY image processing code always works in linear space. But merely converting 8-bit gamma to 8-bit linear is no good, because that introduces serious quantization artifacts (major banding effects happen). So I convert the 8-bit gamma values to at least 30- or 31-bit integers if I need processing speed, or all the way to double-precision floating point if I need to be as close to correct as possible. After processing, I convert back to 8-bit gamma. Even then, you can't totally eliminate the banding effects that result from ending up in 8 bits. If you can get more bits from your camera's raw images, those are the best to use. Apparently many JPEG compressors also do their DCT calculations in the non-unit gamma space instead of linear space (which somewhat reduces the effectiveness of the compression, and may add more compression artifacts).
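The banding point is easy to demonstrate; a quick sketch, again assuming a pure 2.2 power law:

```python
import numpy as np

codes = np.arange(256)
# The round trip warned against above: 8-bit gamma -> 8-bit linear.
linear8 = np.round(255 * (codes / 255.0) ** 2.2).astype(int)
print(len(np.unique(linear8[:64])))   # 13: the darkest 64 gamma codes collapse
                                      # into about a dozen linear codes (banding)

# With more bits in the linear working space, essentially nothing is lost:
linear16 = np.round(65535 * (codes / 255.0) ** 2.2).astype(int)
print(len(np.unique(linear16[:64])))  # 63 (only codes 0 and 1 collide at 16 bits)
```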
Correct, yes. Expected, maybe. Desired, no. (Score:4, Informative)
I think the author specifically isn't stating whether the scaling is correct or not (it is); the whole story doesn't really relate to scaling at all, but to color space and how it affects, among other things, scaling. Yes, with filtering; scaling without filtering can hardly be called scaling at all, as you're just discarding data, and for anything but powers of 2 (4x, 2x, 0.5x, 0.25x, etc.) that would have a whole 'nother set of problems.
The author, I think, is suggesting, quite rightly, that while the scaling is correct, it isn't what's desired: the desired result for scaling down is the same visual image you would get by simply standing further back.
(Although at some point the resolution limits of the display, and of the image being presented on it, prevent that concept from being applied in reverse, as "moving your eyeballs closer to the screen" for scaling up.)
Re:Oh calm down.. (Score:2, Informative)
Yes, because everyone in the world who posts to the internet is a native English speaker.
Having read it, I'm willing to bet English isn't his first language.
Re:Monitor gamma? (Score:5, Informative)
It seems crazy to me to embed a particular Gamma value into an image. ...In fact it seems so crazy I must be missing something. Am I?
The article actually touches on this point. The sensitivity of the human eye isn't linear. If you use a linear scale to store luminosity information for an image, you waste a lot of bit depth at high luminosities - the eye has difficulty distinguishing between very bright and very bright plus a little tiny bit. On the other hand, the eye is very good at telling the difference between very dark and black. You need a lot of finely-graduated steps at low luminosity or else your shadows get jaggy.
If you uniformly (linearly) space out luminosities on an 8-bit (256-shade) scale, you store a lot of uninteresting information at the high end and lose visible detail at the low end. A scale with a gamma of 2.2 (typical these days) fits a full twenty-eight grey values between 0 and 1 on our hypothetical linear scale. To maintain that kind of luminosity resolution (down where it matters), you'd have to store an extra five bits on your linear scale. That's an extra sixty percent in storage cost.
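The arithmetic is easy to check. A sketch assuming a pure power law (the twenty-eight figure above suggests a slightly higher gamma or different rounding, so treat the exact counts loosely):

```python
import math

gamma = 2.2
# 8-bit gamma codes whose luminosity falls below the first step of a
# linear 8-bit scale (i.e. below 1/255 of full white):
dark_codes = sum(1 for v in range(256) if (v / 255) ** gamma < 1 / 255)
print(dark_codes)                   # 21 at gamma 2.2 (28 needs a gamma nearer 2.5)
# A linear scale matching that resolution in the darks needs ~255 * 21 levels:
print(math.log2(255 * dark_codes))  # ~12.4 bits, i.e. the "extra five bits" above
```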
Re:Nitpicking (Score:1, Informative)
Urrgh... it's "so many programs" or "so many software packages" ... you don't have "one software" -- you have a piece of software. It's a collective noun like "hardware" and "clothing." There is no word, "softwares."
Re:Monitor gamma? (Score:2, Informative)
Yes, you are missing something. Human perception isn't linear either. Twice the amount of light does not look twice as bright. Our eyes see differences between dark tones more clearly. The result is that we need many more dark tones than light tones for an "evenly" distributed tone curve (one where two neighboring light tones appear to differ in brightness by as much as two neighboring dark tones). A physically linear gradient has its perceptual halftone shifted close to the black point.
One consequence is that if you store an image with linear gamma, you need more bits to cover the same dynamic range with the same minimal distance between two dark tones. You can immediately see the decrease in resolution for the dark tones when you create an 8-bit image with a black-white gradient in Photoshop and then convert this image to a color profile with gamma 1.0.
So not only is the 2.2 gamma which is used in the sRGB standard a sensible choice for the display technology of yesteryear, it also makes better use of the allocated bits than a gamma 1.0 image would.
Editing in RGB is wrong too (Score:5, Informative)
Sound makes a good analogy. When you play music through any given combination of source, amp and speakers, it sounds different. Sometimes we actually like a particular type of sonic "distortion". It's never exactly like the "original" live music, though.
Likewise, any graphics manipulation is "distorting" the original. In fact, when I take a digital image and run it through Lightroom, do a range expansion/equalization, and do a bunch of tweaks to make the image look good, I'm making much larger changes than those little scaling problems listed in the article. The point is, do you think the result looks good?
There are other important variables, such as what colors are next to other colors in the image, how long you look at the image, what else is around you, how tired you are, etc. There's no such thing as color fidelity; there are only approximations to it. Color is hard, and I mean really hard. See Hunt, "The Reproduction of Colour", or any number of other fine texts to learn more.
Re:HA! (Score:5, Informative)
This is one of many reasons why creative professionals prefer Macs over PCs (and I'm not saying this as platform evangelism): for one, you'd be hard-pressed to disagree that Mac OS X's font rendering, kerning, and anti-aliasing are far superior to those provided by Windows when presented with side-by-side examples.
Mac OS X's font rendering is different [joelonsoftware.com], but calling it "far superior" is simply platform evangelism.
OS X renders text so that the on-screen representation looks more like the printed representation, which is good for tasks like designing print advertisements (where you want to approximate the finished product as closely as possible). Windows takes liberties with the shape and spacing of on-screen text in order to line it up with the pixel grid, which is good for tasks like word processing and programming (where legibility on screen is more important). When you're used to Windows, Mac text looks blurry; when you're used to the Mac, I imagine Windows text looks thin and lanky.
Re:Author expands scaling definition (Score:3, Informative)
Actually, a good scaling algorithm should apply a lowpass filter when downscaling. This is similar to downsampling digital audio, where you need to filter out frequencies above half the new sampling rate. Leaving these higher frequencies in would cause noise, because they cannot be faithfully represented in a lower-resolution file.
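A toy one-dimensional version of the audio analogy:

```python
# The highest representable frequency: alternating samples, like the
# pixel-alternating test images in TFA.
sig = [0, 255] * 8

naive = sig[::2]                   # decimate with no filter: keep every other sample
lowpass = [(a + b) // 2 for a, b in zip(sig[::2], sig[1::2])]  # average pairs first

print(naive)    # [0, 0, ...]: the pattern aliases to solid black
print(lowpass)  # [127, 127, ...]: the unresolvable detail becomes mid-gray
                # (and per TFA, gamma-correct averaging would make that 186)
```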
Tux Paint has the code (Score:3, Informative)
The "magic" tools are done right. Scaling (for stamps) needs fixing.
It's GPL. Grab the code if you want it: rgblinear.c and rgblinear.h have what you need.
(and yes, the difference is very noticeable for special-effect paint tools)
Re:Editing in RGB is wrong too (Score:3, Informative)
Actually, different color spaces are OK, so long as they are just linear transformations from cone space. That is the case for (linear) RGB, XYZ, and a few others (though not HSV, whose conversion involves max/min operations and so isn't linear). As long as the transformation is linear (i.e. just a matrix times the color vector, giving you a color vector in cone space), you can apply any linear operation (such as scaling, blurring, and other weighted sums), and the operation commutes with the transformation.
For example, say LMS = M * RGB and you want to average two pixels. Then you have
RGB_avg = M^-1 * (1/2) * (M*RGB1 + M*RGB2) = (M^-1 * M) * (1/2) * (RGB1 + RGB2) = (1/2) * (RGB1 + RGB2)
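A quick numerical sanity check, using a hypothetical invertible matrix M standing in for an RGB-to-LMS transform:

```python
import numpy as np

rng = np.random.default_rng(0)
M = rng.random((3, 3)) + np.eye(3)         # hypothetical RGB -> LMS transform
rgb1, rgb2 = rng.random(3), rng.random(3)  # two random linear-RGB pixels

avg_via_lms = np.linalg.inv(M) @ ((M @ rgb1 + M @ rgb2) / 2)
avg_in_rgb = (rgb1 + rgb2) / 2
print(np.allclose(avg_via_lms, avg_in_rgb))  # True: averaging commutes with M
```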
Re:Monitor gamma? (Score:5, Informative)
meanwhile, I see a grey rectangle in firefox, and I still don't get what that signifies.
Right-click the image, then click view image. You'll see the image full-scale, like the first image. Scaling it down 50% shouldn't make it gray.
Re:What about Irfanview and Picasa? (Score:3, Informative)
KDE's KolourPaint (MS Paint clone) gets it right! Yay KDE!
Re:HA! (Score:5, Informative)
No, I'm pretty sure that font display is measured by fidelity to the creator's intention and design
And I'm pretty sure you're wrong. Sorry. You might choose to measure font rendering that way--fidelity at any cost, even if it means reducing text to illegible blobs--but I don't, and evidently Microsoft doesn't either. In fact, I'd wager that most people don't.
The technical ins and outs of photo editing and display are all about fidelity; why would that not be the case with typeface rendering and display?
Because in many (if not most) cases, the primary purpose of rendering text is to make that text readable to the person sitting in front of the screen, and the formatting of that text is a secondary concern. In such cases, making "click here" distinguishable from "dick here" is more important than preserving the font designer's artistic vision.
So the question is, if Microsoft had bothered to put some effort into proper rendering, would there be any meaningful loss of legibility?
Empirically, the answer is obviously "no". As even that article points out, the determining factor is familiarity.
Again, this is only true if you ignore what happens at small sizes.
Why the article pretends that Microsoft's decision was anything other than lack of interest in fine-tuning is rather curious. [...] Pixel grid rendering is simply easier to implement
Nope, you've got it backwards: Microsoft's approach involves more fine-tuning. Apple's approach is easier to implement: just scale the outline to the appropriate size and fill it in.
Again, font rendering is measured by fidelity to the creator's intent. If the typeface is illegible because of blurring, then it's a poor typeface.
Nonsense. A font that looks fine when rendered in a 15-pixel-high line (on a 300 DPI printer) may look illegible when rendered unchanged in a 5-pixel-high line (on screen). That doesn't mean the typeface is poor, it just means it's too intricate for such a low resolution.
But Microsoft's decision to abandon the design in favor of a simplistic snap-to-grid renderer does nothing to improve legibility.
You seem to be confused about how that renderer works. It's far from "simplistic snap-to-grid": simply snapping the font's vertices to a pixel grid would produce garbage. What it actually does is apply the TrueType hints found in the font file.
The OS X font renderer ignores nearly all the hints, which is why the outline it renders at small sizes looks the same as the one it renders at large sizes: the hints are placed there by the font creator to tell renderers how to distort the outline at small sizes.
(Indeed, Microsoft's ClearType renderer pays less attention to the hints than their old one, because the greater resolution of subpixel rendering makes horizontal pixel-fitting less important.)
I happen to be of the print publication era, and I cannot stand the state of Microsoft and Linux font rendering. I'm happy to concede that some people have a personal and subjective preference for that system, but the reality is that it is objectively inferior in every way.
No, not in every way; only in the particular way you've chosen to focus on. Your personal, subjective preference is showing.
Re:Monitor gamma? (Score:3, Informative)
They did exactly that. It's called sRGB, and these days (nearly) all monitors claim to follow it. Some have better color than others, but they're all nominally sRGB. The ones that don't follow sRGB are well-specified as doing something else (and expensive), and purchased by people that know what to do with them. The (somewhat) arbitrary standard is a gamma of roughly 2.2. Monitors that don't produce that naturally fake it with software (in the monitor, not the computer).
The problem is that computer software doesn't expect linear data, when it comes to what color is what, because the data never has been linear. Basically, 256 shades of brightness is enough, but only if they aren't linearly spaced. There's a huge difference between a gray that's 1% of the white luminosity and 0%, but you can't see the difference between 99% and 100%. Using a gamma curve fixes this by roughly equalizing the apparent change in brightness per step in the stored value. So the whole toolchain is built around a gamma of roughly 2.2, except when it comes to things like image manipulation.
Re:Monitor gamma? (Score:4, Informative)
IMO, what it actually means is that the so-called image is deliberately designed to be as catastrophically horrible as possible when scaled down.
Yes, the image is designed to exploit the bug in a way that makes it very obvious that the scaling is wrong; but no, the gray rectangle is not the correct result. The problem is that the browser or image application assumes that brightness is stored on a linear scale, while in reality it's stored on an exponential one. Thus, when you scale or apply other filters, the brightness gets messed up. In real photos it will be less noticeable than here, but it still happens.
The correct way would be to transform the image to a linear scale before applying the filter, and then restore it to the exponential one for display. A simple example of how to do that in Gimp (using Gaussian blur instead of scaling, but it's the same bug) is this [seul.org]. The left image is the original from the article, the middle one has the blur applied directly, and the right one applies a gamma of 0.5 to get to a linear scale, then the blur, then a gamma of 2 to restore the original scale. As you can easily see, the right one looks like the left one, but blurred, while the middle one bears no resemblance to the original; all the information got lost due to the bug. In practice you would need a higher bit depth to make this trick practical, as otherwise you would end up with banding artifacts.
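For anyone who wants to reproduce that outside Gimp, here's a sketch of the same sandwich (the gamma-2.0 pair matches the 0.5/2.0 values above; the file name is hypothetical, and the float intermediate avoids the banding just mentioned):

```python
import numpy as np
from PIL import Image
from scipy.ndimage import gaussian_filter

img = np.asarray(Image.open("checkerboard.png").convert("L"), np.float64) / 255.0

wrong = gaussian_filter(img, sigma=4)                # blur the encoded values directly
right = gaussian_filter(img ** 2.0, sigma=4) ** 0.5  # "gamma 0.5", blur, "gamma 2"

Image.fromarray(np.uint8((wrong * 255).round())).save("blur_wrong.png")
Image.fromarray(np.uint8((right * 255).round())).save("blur_right.png")
```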
Re:short version (Score:3, Informative)
No, it's an algorithm that's just plain wrong. It's doing linear calculations on values that represent an exponential curve. It's a pretty big screw-up, made by almost everyone who designs resampling algorithms. Given that graphics people don't usually write the software, the fault isn't theirs. People who try to blend colors 0 and 255 in software need to know that the result should be 186, not 127.
Re:HA! (Score:3, Informative)
By now it's clear that you have no interest or understanding of any application of digital text other than what you've encountered in your own limited experience, and that you lack the technical knowledge to evaluate or discuss the workings of any font rendering algorithm.
Between your ignorance, your accusations, your nonsensical demands (don't use a bitmap image to demonstrate the renderer's output), and your elitist insistence that you are the sole arbiter of what a font renderer is supposed to aim for as well as how a person is supposed to edit documents (always zooming in rather than reading text at small sizes; choosing a print font based on what looks good on screen, rather than using software that makes their preferred print font legible on screen), I see no reason to continue this discussion. Have a nice day.