Reverse Engineering of a Graphics Format?
Jimbo God of Unix asks: "I recently purchased a color laser (Samsung CLP-500) because it claimed to have Linux compatibility. It does, mostly. However, I was irritated to find that the drivers are proprietary (splc, Samsung Printer Language - Color) and somewhat cranky to get working. I was hoping to find some good resources on reverse engineering the graphics format used to drive the printer. I've managed to mostly dissect the file format, so I think I can get the graphics data out, but I don't really know how to proceed to the next step. Are there any good resources for figuring out how to reverse engineer the graphics format? Are there any tools out there that will help me analyze the format (other than hexdump) or tell me if it's close to something else so I don't have to do as much work?"
"I have something of an advantage since I can compare the output from the Windows driver to the Linux driver, and I was able to dissect the Windows output file from the info gleaned by dissecting the Linux output file. But I'm kind of stuck at the moment and there don't seem to be too many documents or tools out there for dissecting graphics data.
I thought this might be useful for reverse engineering some of the proprietary image compression formats for web cams as well, but that's a project for another day."
A few general hints? (Score:5, Informative)
I believe the old wardialing tool Toneloc had a mode, or a companion program, to display logs this way. It was easy to see things like "numbers ending in -0100 never get answered" as a vertical red line, for instance.
The important things would be an adjustable margin, to "wrap" the pixels at varying widths, and adjustable bit depth, so you can discover odd packings that might not otherwise be apparent.
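A minimal sketch of that kind of viewer, assuming the dump is already available as raw bytes. The function names (`unpack_samples`, `wrap_rows`, `render_ascii`) and the ASCII ramp are illustrative choices, not part of any existing tool. It unpacks the bytes at a chosen bit depth, wraps them at a chosen width, and prints a rough picture, so you can step through widths and depths until a pattern lines up.

```python
def unpack_samples(data: bytes, bits: int) -> list[int]:
    """Split each byte into samples of `bits` bits (MSB first), scaled to 0..255."""
    if bits not in (1, 2, 4, 8):
        raise ValueError("bits must be 1, 2, 4 or 8")
    per_byte = 8 // bits
    max_val = (1 << bits) - 1
    samples = []
    for byte in data:
        for i in range(per_byte):
            shift = 8 - bits * (i + 1)
            samples.append(((byte >> shift) & max_val) * 255 // max_val)
    return samples


def wrap_rows(samples: list[int], width: int) -> list[list[int]]:
    """Wrap a flat sample list into rows of `width` samples (last row may be short)."""
    if width <= 0:
        raise ValueError("width must be positive")
    return [samples[i:i + width] for i in range(0, len(samples), width)]


def render_ascii(rows: list[list[int]], ramp: str = " .:-=+*#%@") -> str:
    """Map each 0..255 sample to a character from dark-to-bright `ramp`."""
    last = len(ramp) - 1
    return "\n".join("".join(ramp[v * last // 255] for v in row) for row in rows)


if __name__ == "__main__":
    # Hypothetical data: a 1-bit diagonal stripe that only lines up at width 8.
    data = bytes([0x80, 0x40, 0x20, 0x10, 0x08, 0x04, 0x02, 0x01])
    for width in (7, 8, 9):
        print(f"width={width}")
        print(render_ascii(wrap_rows(unpack_samples(data, 1), width)))
```

Trying several widths in a loop, as in the demo at the bottom, is the point: the wrong width shows a sheared smear, the right one shows a clean diagonal.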
If the data might be compressed, have a look at the article Hacking Data Compression [fadden.com] for a great, if slightly dated, conceptual overview.
ERANAI (I am not a reverse engineer), but I hope this helps. Let us know if you have any luck!
I think you mean ANAIE (Score:1)
Wouldn't that be IANAR? (Score:2)
Re:Wouldn't that be IANAR? (Score:1)
Re:Wouldn't that be IANAR? (Score:2)
Re:A few general hints? (Score:3, Informative)
Re:A few general hints? (Score:2)
Now, if there were something similar for filesystems...
Reverse Engineering a proprietary format? (Score:2, Insightful)
and reside in the USA?
My next step would be to get a good lawyer to find out if what you are doing will open yourself up to potential legal action.
Re:Reverse Engineering a proprietary format? (Score:2)
Whatever you do anyways always opens you to potential legal action.
+1, Shared Attitude (Score:2)
So I become a martyr to some random cause? Even the Bible says some evil must come, but I hope one of the DMCA framers is reading this, so they can also read the followup to it [biblegateway.com], or this concise description of the RIAA's activities [biblegateway.com].
Re:+1, Shared Attitude (Score:2)
2) The only thing interfering with this sort of reverse engineering is people like you three yapping about "the enemy" and becoming a "martyr" and generating this sort of FUD every time someone with real skills wants to do a little digging.
Check linuxprinting.org first (Score:5, Informative)
I can't help you with your question, I have no experience with reverse engineering.
But for others who don't want to have the same problem: you should have checked www.linuxprinting.org [linuxprinting.org], which says of the Samsung CLP-500:
Samsung supports this printer with proprietary drivers which come with the printer on its driver CD or can be downloaded on the web sites of Samsung. Unfortunately, these drivers do not work necessarily with all Linux distributions and there are no free drivers available. As it is also not sure whether Samsung will update their drivers for future Linux versions, this printer cannot be recommended.
I would try to get the proprietary driver to work, basically by getting the distro it was made for, or at least finding out why it works there but not on your distro - probably it needs some specific kernel image that it was compiled with, which would suck...
Re:Check linuxprinting.org first (Score:2, Interesting)
This is a problem if you only have one computer since you pro
Re:Check linuxprinting.org first (Score:1)
Lesson Learned (Score:4, Interesting)
Unless you're just adventurous that way, and want to write drivers.
Re:Lesson Learned (Score:2)
This isn't really helpful, but... (Score:5, Informative)
I don't understand how companies can sell printers that don't support Postscript. On the other hand, this seems to be a case where a company heard complaints from its customers, and corrected their bad practices (the toner issue, and Postscript support).
Re:This isn't really helpful, but... (Score:1)
Re:This isn't really helpful, but... (Score:2)
Actually, Brother doesn't support Postscript per-se, but does support Brotherscript. (which seems to work fine for me)
I have a Brother 5170DN, which is a wonderful network printer. Just plug it into your network and it works as an independent network client.
Re:This isn't really helpful, but... (Score:2)
Re:This isn't really helpful, but... (Score:2)
For the exact same reason that companies can sell software modems (Winmodems) rather than the real thing. It is just something to watch out for as a *nix user.
Re:This isn't really helpful, but... (Score:2, Interesting)
Because Adobe charges rip-off rates for the right to call it PostScript. We stopped making printers when over half of the cost of our printer was the stupid fees to Adobe.
Re:This isn't really helpful, but... (Score:4, Insightful)
I don't understand how companies can sell printers that don't support Postscript.
Easy. It costs money to develop, license and ship a postscript based printer, and your typical home user doing nothing fancy and running windows doesn't need it - but they're very sensitive to up-front price.
The real question is why do people buy non-postscript printers when they know that their operating system will work trivially with a postscript printer, but will require a lot of effort to work (often badly) with a non-postscript printer?
The same line of reasoning explains the "demo" toner cartridges shipped with low-end printers. Your typical home user is very sensitive to up-front price (and probably never looks at the per-page cost). If the population of people buying printers wanted manufacturers to behave this way they just need to, en masse, be less stupid.
Re:This isn't really helpful, but... (Score:2)
I don't understand how companies can sell printers that don't support Postscript. On the other hand, this seems to be a case where a company heard complaints from its customers, and corrected their bad practices (the toner issue, and Postscript support).
Consumer-grade lasers are something of a new market, one that manufacturers are being very careful about wading into. The more expensive business-class laser printers are big money and manufacturers don't want to see their business-class customers downgra
Re:This isn't really helpful, but... (Score:2)
BTW, the 2430DL, which claims Linux support, is actually a ZjScript printer. It's just that Minolta wrote a binary-only, RedShat 8+, SuSE 8+ compatible driver for it. My guess is that it would work on a 2300DL, unless it's got code to check for it being used with a 2300DL, too.
Re:This isn't really helpful, but... (Score:2)
trial, error, and compare (Score:4, Informative)
If you cannot get specs you really only have one choice: trial, error, and compare. Print a blank page, then print a page with one pixel. Then print with two pixels. Start simple and make things more complex.
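The compare step can be sketched as a byte-level diff between two captures, for example the blank page and the one-pixel page. This is only a sketch: `diff_ranges` is a made-up helper, and it assumes the two dumps are byte-aligned, which will not hold past the first difference if the format is compressed.

```python
def diff_ranges(a: bytes, b: bytes) -> list[tuple[int, int]]:
    """Return half-open (start, end) offset ranges where the two dumps differ.

    Bytes are compared position by position over the common length; if the
    lengths differ, the extra tail is reported as one final range.
    """
    common = min(len(a), len(b))
    ranges = []
    start = None
    for i in range(common):
        if a[i] != b[i]:
            if start is None:
                start = i          # a run of differing bytes begins here
        elif start is not None:
            ranges.append((start, i))
            start = None
    if start is not None:
        ranges.append((start, common))
    if len(a) != len(b):
        ranges.append((common, max(len(a), len(b))))
    return ranges


if __name__ == "__main__":
    # Hypothetical captures; in practice read the two spool files from disk.
    blank = bytes(16)
    one_pixel = bytes(5) + b"\x80" + bytes(10)
    for start, end in diff_ranges(blank, one_pixel):
        print(f"differs at 0x{start:04x}..0x{end:04x}")
```

A single small range that moves predictably as the pixel moves suggests a plain bitmap; differences that ripple through the rest of the file point toward compression.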
It helps greatly if you buy (or build if you can) some sort of hardware trace tool. I've used this for SCSI devices before, good ones will give you all the data that is transferred to/from the device in question.
If this was simple everyone would do it. However it is complex, and generally boring. A half functioning driver is worthless.
P.S. a better idea would be to return this printer now while you still can. Buy a printer that supports postscript. That hits the bottom line of companies who pull these tricks and in the end is worth more to the linux community.
Re:trial, error, and compare (Score:2)
P.S. a better idea would be to return this printer now while you still can. Buy a printer that supports postscript. That hits the bottom line of companies who pull these tricks and in the end is worth more to the linux community.
Except, of course, that it doesn't hit their bottom line because most printers with postscript support are priced higher than the new consumer-level lasers that are starting to come out.
Re:trial, error, and compare (Score:2)
So it hits the bottom line harder. They still have to pay money to develop that cheap printer, but they don't get many sales because people buy postscript anyway, which means the postscript printers have a better return on investment.
PGM (Score:3, Informative)
Useful for those wanting to muck with images directly from code. I learned about that last week, and I'm having fun with neural nets
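For anyone who has not used it, a binary PGM file is just a short text header followed by raw grayscale bytes, so a candidate dump can be wrapped into one with a few lines. The sketch below assumes 8-bit samples and a guessed width; `to_pgm` and the output filename are made-up names for illustration.

```python
def to_pgm(samples: bytes, width: int) -> bytes:
    """Wrap raw 8-bit grayscale samples in a binary PGM (P5) container.

    Samples that do not fill a complete row at the given width are dropped.
    """
    if width <= 0:
        raise ValueError("width must be positive")
    height = len(samples) // width
    if height == 0:
        raise ValueError("not enough samples for a single row")
    header = f"P5\n{width} {height}\n255\n".encode("ascii")
    return header + samples[:width * height]


if __name__ == "__main__":
    # Hypothetical data: a 16x16 horizontal gradient.
    raw = bytes(x * 17 for _ in range(16) for x in range(16))
    with open("guess.pgm", "wb") as f:
        f.write(to_pgm(raw, 16))
```

Most image viewers open the result directly, which makes it cheap to regenerate the file for each width you want to try.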
start with a known pattern .. (Score:3, Insightful)
if you can see the 'obvious' change in pattern in the file, you've got a lead. but the important thing is to start from the very beginning with something you know
It's probably a flakey printer (Score:2)
If the linux driver is flakey, it's probably because the printer's firmware is itself flakey, and the Windows driver just contains innumerable hacks to get around the problems that keep cropping up.
Take the thing back, complain that you can't get it working under Linux, and buy a different one.
Be Methodical (Score:5, Informative)
The best way to reverse engineer a graphics format is to use a collection of sample images to get a high level idea of what is going on. Choose the images in a way that will give you the most information.
Make sure the printouts are always the same size, layout, color depth, margins, etc. It does no good to compare an A4 grayscale image to a color letter sized one.
If you're operating under the assumption that it's a simple bitmap, the following may work.
1. Is it compressed?
Print out a page with some dots on a colored background.
Print out a page with more dots on it.
Are they the same size?
If so, it's most likely a bitmap.
If not, it's probably compressed.
What type of compression is it?
Print out a page which is half white, half another color.
Print out another page which is checkered (with *very* small squares) half white, half the other color.
Is one smaller than the other? If so, it may be compressed. If it is, it *could* be Jim-Bob's compression algorithm, but programmers are lazy so it's most likely something off-the-shelf.
If it's the half-and-half print that is smaller, it's either RLE or something like JPG (most likely RLE as JPG is lossy -- compare a gradient print to find out if it's RLE or not).
If it's the checkered print then it's probably LZW.
If neither is smaller, re-evaluate your compression assessment.
2. Create a decompressor to test your decompression theory.
Print a colored page.
Print a second colored page with a couple of changes.
If you can't create two data dumps of (relatively) equal size from the input data, you're probably wrong about the compression algorithm.
If they are the same size you may be going in the right direction. (If they're exactly the same size be very happy).
3. Guesstimate packing.
Print a cyan* page, a yellow page, a magenta page and a white page. Take a look at the first four bytes or 16 bit words. If you've got clearly observable patterns (ff 0 0 0; 0 ff 0 0; 0 0 ff 0; etc.) you're in luck. If not try to work out the packing order. Just keep in mind, if it's a bitmap, and you've got the decompression down, and the page is one color *eventually* you will find a repeating pattern that represents that color.
4. Visualize the decompressed data.
The best way from this point is to find a way to visualize what you've got. In the past I had stock BMP code that I would use to generate a new displayable image, but I've also created custom apps to display it.
If the resulting image looks right but is a funky color, it's packing.
If the resulting image looks like it *could* be close but has a lot of shear, play with your assumed width and height.
If it looks like static and you've previously determined that you're dealing with 16 bit values, try changing the byte order and try again.
5. Lather, rinse and repeat.
Despite what the naysayers want you to do, don't give up. Figuring out someone's attempt to hide data from you is a reward you give yourself. Even if it takes days or weeks, when the light goes on and you think, "Ah ha! I've got you now you bastard!", it makes the time worth it -- at least for me it always did.
Besides, if you do get it working, you can release it and make Open Source better by your efforts.
* Remember, it's cmyk, not rgb.
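A few of the checks above are mechanical enough to script. The sketch below is only an illustration of steps 1, 3 and 4: the helper names, the 5% size tolerance and the 64-byte search limit are assumptions made for the example, not anything taken from the printer or its drivers.

```python
from typing import Optional


def guess_compression(half_size: int, checker_size: int,
                      tolerance: float = 0.05) -> str:
    """Step 1: compare the half-and-half capture size to the checkered one."""
    bigger = max(half_size, checker_size)
    if bigger <= 0:
        raise ValueError("capture sizes must be positive")
    if abs(half_size - checker_size) <= tolerance * bigger:
        return "no clear difference"      # re-evaluate the compression guess
    if half_size < checker_size:
        return "RLE or JPEG-like"
    return "probably LZW"


def shortest_period(data: bytes, max_period: int = 64) -> Optional[int]:
    """Step 3: smallest p with data[i] == data[i + p] everywhere, or None."""
    for p in range(1, min(max_period, len(data) // 2) + 1):
        if all(data[i] == data[i + p] for i in range(len(data) - p)):
            return p
    return None


def swap16(data: bytes) -> bytes:
    """Step 4: swap the two bytes of every 16-bit word."""
    if len(data) % 2:
        raise ValueError("need an even number of bytes")
    out = bytearray(data)
    out[0::2], out[1::2] = data[1::2], data[0::2]
    return bytes(out)


if __name__ == "__main__":
    # Hypothetical solid-cyan page stored as 4 bytes per pixel.
    page = bytes([0xFF, 0x00, 0x00, 0x00]) * 32
    print(shortest_period(page))                 # length of the repeating unit
    print(guess_compression(1200, 48000))
    print(swap16(b"\x12\x34\xab\xcd").hex())
```

On a decompressed single-color page, the period returned by `shortest_period` is a reasonable first guess at the bytes per pixel, and the values inside one period hint at the channel order.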
Re:Be Methodical (Score:1)
If it's a head control language or something you might be in trouble, but if it's simply an image being sent you should be able to figure it out eventually.
I decoded a head control language once. Eventually I had great black and white, but color was really tough, since you had to interleave the data based on when each head was going to pass over an area. I gave up when I bought another really nice printer for ~$200. Still, for years later, I got e-mail thanking me for my black-and-white GP
Re:Be Methodical (Score:2)
Contact the people who write ghostscript (Score:3, Insightful)
They know printers. They know lots about printers, and printer languages. My guess is that they'll be thrilled to get an opportunity to hack another printer into working. I know when I bought a printer that has "PS Support", it had a postscript driver in software that talked a proprietary protocol to the printer. They would have gladly written the output driver for it, but they didn't know how it worked.
Maybe if you know how it works, you'll be able to get them to do something with it.
Kirby
Quick googling found this (Score:4, Informative)
There's a bunch of info on the CLP-500 here that might help. There are lots and lots of comments from users with both good and bad results and the distros they used.
Check this out:
Good Luck!
return it... (Score:2, Insightful)
really folks, when you buy a printer don't just look at features and speed. look at the printer languages it features. if it only features a proprietary language (like yours does), be prepared for what you are getting into. pcl5 is okay, but postscript 3 is where it's at
linuxprinting.org (Score:2)
Color laser printer, max. 1200x1200 dpi, this is a Paperweight
Doh!
fft (Score:1)
seriously though, it's not worth it to write a driver for a single instance of a device. and if you don't have adequate documentation, the bar has to be even higher to make it worthwhile. if it's really that trivial for you to do, you should get a real job doing it, throw
Your Samsung Printer (Score:4, Informative)
'K? [I think that covers most of the current crop of printers]. Next time, buy a PostScript device.
Ratboy
Disassemble the driver (Score:3, Informative)
For a graphics format, however, I'd be inclined to go for disassembly of the proprietary driver. Perhaps you could try various test cases (scan a white sheet of paper, what's the data look like? Try a black, red, green, blue.. etc). But if it's compressed with some unknown algorithm (like the Audio codec that I've reversed) I don't like your chances of getting it that way.
There are a bunch of disassemblers around, I have written my own (which isn't available publicly cause it's still too shit) but I would highly recommend Datarescue's IDA. Old versions work fine in wine.
However, something to be mindful of: Just rewriting their binary driver in C is copyright violation, make sure you properly document the spec and then do a cleanroom implementation.
David.
Re:Disassemble the driver (Score:3, Informative)
However, something to be mindful of: Just rewriting their binary driver in C is copyright violation,
Well, that may depend on the EULA. But assume either that the EULA doesn't forbid reverse engineering, or that you're willing to bet that the purpose of your work will qualify as 'intercompatibility' in court. (It should, but not everyone wants to take that risk.)
Anyway, if you rewrite it without duplicating their code, you're not infringing their copyright.
make sure you p
Re:Disassemble the driver (Score:2)
Not entirely correct, it is going to depend on what country you are in, and whether click-wrap EULAs actually have any legal credence. I'm not in the US of A.
That is not clean-room, since the developer is already 'tainted' by having seen t
french cafe analogy (Score:1)
Render to a raster buffer (Score:2)
Using a Windows printer driver in Wine (Score:2, Informative)