Technology

Tighter Video Compression With Wavelets 156

RickMuller writes: "There is a Caltech Press Release here that talks about a new 3D video compression algorithm by Caltech's Peter Schroeder and Bell Labs' Wim Sweldens that they claim is 12 times smaller than MPEG4 and 6 times smaller than the previously best published algorithm. The algorithm uses wavelets for the data compression. Potential applications in real estate (digital walk-throughs of houses) are cited in the article. Anyone figure out a way to wire this stuff up to Q3 Arena yet? The results were presented in a talk at SIGGRAPH 2000 in New Orleans."
This discussion has been archived. No new comments can be posted.

  • isn't it kinda bogus to compare this to MPEG4, since MPEGs are 2D and this is for compressing 3D images?

    -Spazimodo

    Fsck the millennium, we want it now.
  • It is probably safe to assume that the MPAA will somehow try to stop this.

  • [new algorithm] is 12 times smaller than MPEG4 and 6 times smaller than the previously best published algorithm

    In other words we could already do twice better than MPEG4. This would be very significant for downloads, yet I don't see videos twice as compressed as MPEG4 on the net...do you? Somehow I think it will be a long time until this is put into standards and implemented. Sad.

  • Yes, but if they came up with a method that can store 3D data in a smaller space than 2D data, isn't that fairly impressive in itself?

    -cpd
  • Don't get me wrong, compression is important, but with DSL, cable, and Ethernet connections, massive compression doesn't seem as important as it was, say, when everything was connected with a 14.4 modem. I think compression research will fade and speed will increase in the coming years...
  • by Tower ( 37395 ) on Friday July 28, 2000 @09:46AM (#896803)
    "You'll also be able to see how it will look after you knock out a wall, reapaint the rooms, and drop in new furniture from a 3-D catalogue"

    ...but will it allow wireframe/noclip mode, so I can track the plumbing, electrical, and network connections through the walls?
    --
  • I was under the impression that most wavelet compression algorithms are patented. Of course, this is true of various MPEG formats (including MP3), and that hasn't slowed them down.

    What's the patent issue in this case?
  • this can only mean one thing - download full length hollywood films which are only 50 megs instead of 300!

    hooray!

    --
  • by Anonymous Coward
    Does anybody have a layman's definition as to what a wavelet actually is?
  • How long will it be before somebody hacks/cracks/pirates this and it becomes DivX ;-) 2 or whatever?
  • Yeah, all your friends are absolutely DROOLING to see the latest epic played out in 480x320 resolution on a 17-inch screen with a few little speakers. But if you turn the lights off and sit real close, it's just like being in a movie theater!
  • I've been hearing about how much better wavelet compression is for a long time (5 to 10 years, perhaps). The emphasis has often been on sound, mainly because it's simpler to work with than video. However, wavelet-based audio formats aren't in wide use. Why not?
  • Nah - everyone will have a cluster... all those home networks with net appliances...

    "My kitchen cluster is comprised of 7 dual-processor GHz 21464 Alpha devices: my toaster, microwave, icemaker, blender, coffee maker, dishwasher, and Mr. Popeil GigaRotisserie! And that's not counting the 16-way NUMA fridge!!!"

    Gotta love those $500 electric bills...

    --
  • Analog Devices has had a part available [analog.com] for a while that is based on wavelet coding. They claim 350:1 realtime video compression, which sounds quite impressive!

    SuperID

  • Before everyone starts talking about the MPAA and piracy, and all of the other wonderful uses of this technology, read the story. It has nothing to do with compressing video. It compresses 3D scenes. Think fast VRML, not fast movies. Unless the movie is 3D CG, this won't matter.
  • by fence ( 70444 ) on Friday July 28, 2000 @09:51AM (#896813) Homepage
    a new 3D video compression algorithm by Caltech's Peter Schroeder and Bell Labs' Wim Sweldens that they claim is 12 times smaller than MPEG4 and 6 times smaller than the previously best published algorithm.

    That's great that the algorithm is smaller, but what we really want is smaller data

    ---
    Interested in the Colorado Lottery?
  • isn't mpeg4 strictly for compressing 2-D video? this thing sounds more like it's for compressing 3d meshes & stuff... and meshes are almost always smaller than the rendered videos of them... so of course this thing gets better compression!
  • Think of it this way...

    With what you've got right now, you're going to want to do more and more. With each boost of bandwidth, you're going to want to do more. Even with xDSL and Cable, raw, uncompressed video would choke all but the fattest of those pipes. And if you could do that, I'm pretty sure you'd want to do something else at the same time or someone else would.

    No, so long as we keep wanting more out of what we've got, we're going to come up with clever ways of reducing the amount of info that we need to carry across from one point to another.
  • The article mentions that the wavelet concept started in motion back in the early 80's. Reasons that you haven't seen wavelets in action that much might include:

    [] No standard, just a few people playing with things, so they never introduced a product.
    [] More obscure than 'typical' compression, more effort and background is required to implement it.
    [] Cowboy Neal (not really, this was just starting to look like a poll)

    --
  • by kashent ( 213325 ) on Friday July 28, 2000 @09:53AM (#896817)
    This isn't going to make movies any smaller to download.
    What it's going to do is make 3D worlds smaller to download.
    It's not the compression technique that will allow you to view in complete 3D the inside of a house, but the fact that you can record a 3D model of a house and still have it small enough to download.
    The biggest improvement would probably be for VRML-type technologies. And it's not going to make Quake faster, but it could possibly let someone on a 28.8 use a customized skin that can be quickly sent to all other computers. Most people download Quake worlds before they start playing rather than on the fly. -Kashent
  • Yup, I love it when the guy wiggles the camcorder, too... or somebody coughs.


    --
  • by captaineo ( 87164 ) on Friday July 28, 2000 @09:54AM (#896819)
    This technique compresses 3D vertex information - not 2D video as the headline implies. It will be useful in sending high-res geometry across low-bandwidth connections. Two applications:

    • Streaming 3D geometry over the Internet - fully buzzword compliant, but how useful is this *really* going to be?
    • Sending compressed geometry to 3D cards - this is the truly interesting application, IMHO - compressed geometry may be the solution to feeding ultra-fast OpenGL cards, where bus bandwidth may soon become a limiting factor in geometry throughput
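
    To make "geometry compression" concrete, here is a minimal, hypothetical sketch in Python (emphatically not the authors' wavelet scheme): quantize vertex coordinates onto a grid, then delta-encode them so most stored values become small integers that a lossless coder handles well. Names and numbers are made up for illustration.

    # Naive geometry compression sketch (NOT the wavelet scheme from the article):
    # snap vertices to a grid, then store only differences between neighbours.

    def quantize_vertices(vertices, step=0.001):
        """Snap float (x, y, z) tuples to an integer grid with spacing `step` (lossy)."""
        return [tuple(round(c / step) for c in v) for v in vertices]

    def delta_encode(quantized):
        """Keep the first vertex, then only the difference from the previous one."""
        deltas, prev = [], (0, 0, 0)
        for v in quantized:
            deltas.append(tuple(a - b for a, b in zip(v, prev)))
            prev = v
        return deltas

    vertices = [(0.000, 0.000, 0.000), (0.001, 0.002, 0.000), (0.003, 0.002, 0.001)]
    print(delta_encode(quantize_vertices(vertices)))
    # [(0, 0, 0), (1, 2, 0), (2, 0, 1)] -- small numbers dominate, ripe for entropy coding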
  • I wonder how the use of this compression algorithm would impact rendering speed (and possibly require a loss in quality). An uncompressed 3D environment transmits a wireframe representation of each object with associated texture information and lighting info, and merely requires the client to perform the necessary ray-tracing. I am not familiar with wavelets and was wondering what additional interpretation must be done on the client side and how that would impact overall rendering performance/quality.
  • the easiest way to transmit something faster is to make that something smaller. compression isn't going away.

    -c
  • this is the next logical step from what jpeg 2000 does (which is wavelet based). the only real difference is one more spatial dimension in the wavelet transforms--jpeg 2k obviously uses 2 dimensions. great application of a simple mathematical idea. researchers have been using this for years (digital spatial recognition/navigation anyone?) but up until the recent past it has been prohibitively expensive to crunch the numbers for this in anywhere near realtime. as our commodity hardware becomes more and more powerful, more and more of these great mathematical premises can be applied. bravo!

    --
  • That is OK if you are physically connected, but I've heard rumours about internet access via wireless technologies (read: cellular phones) becoming popular, where this kind of compression would be particularly useful, given the limited transmission speeds.
  • by David Leppik ( 158017 ) on Friday July 28, 2000 @09:57AM (#896824) Homepage
    They are comparing polygon mesh compression with video compression. Sounds like apples-to-oranges to me. Also sounds like it will have no effect on video compression, and it will have limited impact on rendering time.

    I say limited, because you still need to draw those polygons. However, one nice feature of wavelets (at least for images) is that you can easily extract just enough data for displaying at a particular resolution. If that property holds for polygon meshes, then you should be able to draw only as many polygons as are useful for your display resolution.
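
    A minimal sketch of that multiresolution property on a 1-D signal, using the simple Haar transform (an illustrative stand-in; the mesh scheme in the article is far more involved): one analysis pass yields a half-length "coarse" view plus detail coefficients, and a half-resolution display only ever needs the coarse part.

    def haar_analyze(signal):
        """One level of the Haar transform: pairwise averages and differences."""
        coarse = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        detail = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        return coarse, detail

    def haar_synthesize(coarse, detail):
        """Invert haar_analyze exactly."""
        out = []
        for c, d in zip(coarse, detail):
            out += [c + d, c - d]
        return out

    signal = [9, 7, 3, 5, 6, 10, 2, 6]
    coarse, detail = haar_analyze(signal)
    print(coarse)                                     # [8.0, 4.0, 8.0, 4.0] -- a half-resolution view
    print(haar_synthesize(coarse, detail) == signal)  # True: nothing is lost if you keep both parts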
  • Reduce your images to 2-bit pixmaps, black, red, and blue... throw in a pair of those funky multi-colour 3D glasses and you're in business. Brilliant 3D - Tiny Filez!

  • "New 'MP3' compression algorithm makes transferring digital audio to computers more feasible."

    ...but will it allow me to use skins, so I can make my player look pretty?

  • Yes, speed will increase, and bandwidth will also increase, but there's another factor. People. There continue to be more people on the Internet, whether with cable modems, xDSL, or dial-ups. As this continues to be a bigger part of our culture, more people each pulling down more bytes all the time creates problems faster than we can run fiber and install new routers.

    Compression will be essential, since it will be one of the only things keeping the backbone from choking. People can either download a 40-50MB version of 'Debbie Does Dallas 2000' or a 300MB version (assuming, of course, that people would actually use the web for porn)... that saves a lot of bandwidth. The other side of that is that people then download 5-10x the amount of content... either way, compression needs to be an integral part of the Internet in the future.

    --
  • Surely this compression is taking a set of data and then working out the best way of representing that data - does this mean that wavelet compression can also be used on normal text files (after all, all video input is represented in the same 1 and 0 form)?

    If that is the case, then there could be some even more interesting areas of use rather than just letting people sell their homes slightly differently. Network traffic could be lightened (if the algo is fast enough), storage requirements for data warehouses lessened, etc. All interesting stuff, and far more valuable than letting me see what my house will look like if I knock a wall down. :-)
    --
  • [] patents
  • Streaming 3D over the 'net -- maybe VRML that doesn't suck?
  • Actually, since this is for compressing 3D vertices, gaming would be a perfect application for this.
  • From what I understand of compression (and I'm no expert), it's easy to find supercompressive algorithms that work using various methods (wavelets, neural networks). The problem is that they're too computationally expensive. I'm sure someone can come up with an algorithm that can compress video even further... but will it even play at 1 frame/sec on a 1GHz system?
  • MPEGs are 2D + time, which gives you 3 dimensions.
    --
    "take the red pill and you stay in wonderland and I'll show you how deep the rabitt hole goes"
  • I have already developed an algorithm that is much better than this, using a technology called "bit stripping". It works on 3D worlds and movies.

    I have included a sample of the technology compressing "The Matrix" below:

    1

    As well as Quake 3 demo:

    0

    Note: Also decreases viewing time, increasing the ability of the user to consume more media.

  • ..but the studies and samples I've looked at for using wavelets to compress graphics were.. um.. well, they weren't anything special. At all. Bad? No. Revolutionary? Bahaha...
  • is often lost in the text world...

    --
  • Ah yes, I knew I was missing an important one :)

    --
  • ...the window is 480x320 but half of it is occupied by knobs and sliders and there is a thick TV-like frame around the viewing area that's 50 pixels on each side. In fact the viewing area is only a tad smaller than a first class postage stamp. BTW, there's no broadband access in the UK and we won't see it for a good few months yet, so the picture's not animated either. We're almost there Tony, e-revolution rocks, belch.
  • isn't it kinda bogus to compare this to MPEG4, since MPEGs are 2D and this is for compressing 3D images?

    The article states "their technique for geometry compression is 12 times more efficient than the method standardized in MPEG4."

    The critical phrase is: "THE method standardized in MPEG4." Therefore, I take it to mean a method for geometry compression was standardized in MPEG4. You do realize MPEG4 covers more than just 2D video, right?

    -thomas
  • MPEG4 seems to encompass a lot more than just 2-D video compression, such as scene composition, and more. See this link [cselt.it] for more than you ever wanted to know about the MPEG-4 standard.
  • this is gonna be great for holodeck stuff...
    think about what the army could do with this tech (this is related to that [slashdot.org]...)
  • Trying to design an algorithm that is as fast as the Fast Fourier Transform (FFT) is *hard*. The FFT is O(n log(n)). Until we can get an algorithm that you can decode in real time, well, you won't be able to listen to it in real time. This is why we see a lot of work on images: they don't pop and skip if you decode them too slowly. Here is a hint on a fast wavelet transform: look up the DWT of Mallat and Daubechies. But the math is way beyond me.
  • I've got a smaller 3D video compression algorithm - just promise not to tell anybody else, since I'm planning on patenting it - it runs from a shell, too!:

    #!/bin/sh
    cat $1 > /dev/null

    elegant, isn't it? The algorithm is small, and the compressed data is so tiny, even your OS can't find it!

    --
  • "I was under the impression that most wavelet compression algorithms are patented."

    That's like saying, "I was under the impression that most lossy compression algorithms are patented." Just because there are patented algorithms based on wavelet compression does not mean all algorithms that use wavelet compression techniques are patented.

    All squares are rectangles, but not all rectangles are squares.

    -thomas
  • I'm wondering how this type of thing (modified, of course) could be used for stereo pairs. I mean sure, storing geometric data is great. But I'd love a way to store compressed, full stereo pairs that was standardized. I dunno if wavelets would be the way to go (don't know much about 'em really) but while compressed stereograms and compressed geometry would be used for 2 totally different things, I think a good (GOOD) compression algorithm for stereo pairs couldn't hurt. It'd be just the thing to use in the new home theater. ;) Seems to me that comparing full stereo pairs to saving geometry is like, oh.. comparing MPEG video to vector animations. Yeah, it'll give you *huge* savings, and give you some darned cool (and useful) features. But we still use MPEG video... I dunno. Just blabbering. ;)
  • gotcha:
    [] Need specialized hardware...
    --
  • This isn't video compression at all. This is geometric data compression. Comparing it to MPEG4 is really misleading. These things are apples and oranges.

    This is like finding a better way for the Quake client and server to talk to each other, not a better way to stream The Matrix to your monitor.
  • wavelet compression is lossy. however, the more passes (which maps to higher approximation accuracy) the better your "image" will be. but since this tech relies on an approximation at every stage, this is a bad compression format to consider for normal files (executables, source code, etc.). it is great, however, for files that are processed by our senses, e.g. sound and video, because our senses are easily fooled. this isn't the compression, however, that you'd want to squeeze the entire human genome onto your hard drive with, because i doubt any of the "missing" or "wrong" bits would be insignificant.

    --
  • As well as Quake 3 demo:

    0


    I'm sorry... I just ran the Quake 3 demo through my Perl interpreter, and it's telling me it doesn't exist. What next?

    -thomas
  • You forget that compression research is done mostly by computer scientists and mathematicians. They have a completely different set of priorities than the software companies. Even if bandwidth increases to some ridiculous point where software companies stop implementing the best compression algorithms, the research will not stop.
  • I remember that NeXT had a digital video product called NeXTtime, with pluggable codecs -- their own codec was wavelet-based, I believe. That work could have been folded into Quicktime already, perhaps. Does anybody know?
  • High compression ratios are easy. High compression ratios that don't look like utter crap are difficult. This algorithm isn't even for movie data, though; it's for 3D scan points, more along the lines of a model.

    As one poster jokingly stated, the Matrix could be stored in a single bit, '1', albeit with some loss in signal quality. An algorithm won't become popular based only on its ability to reduce data, nor will it become popular based only on its fidelity (otherwise we'd be sharing 30 frame per second raw digital data).

  • Nope. 3d data is already smaller than 2d data. You have a set of textures (already present in a 2d movie, except that you don't tend to save space by knowing where/how they're tiled) and a set of meshes. Everything is included ONCE and then re-used later when it's needed without having to redownload if you're streaming.

    For instance, if you look at the content stuff that comes with a 3d package (probably not a good example) you have ~500k of files which are used to create animations which, even when compressed (albeit with shitty indeo compression) take up over a megabyte.

    Finally, if you took a look at the objects, motion files, and textures for Toy Story and then compared that to what a full-resolution MPEG 4 movie of the entire feature would be like, I think you'd come to the conclusion that the 3d data is more space-efficient than the 2d data. Of course, you have to walk the line between processing time needed to render on the remote end, and amount of data you're going to send.

    So really, I can't see what they've actually accomplished here, and it's definitely apples and oranges to compare anything for 3d data with a 2d video standard. Perhaps they are misusing the term three-dimensional.

    On a side note, a coworker says it would be possible to send vertex coloring data which you've prerendered, and then have the user's system do all the easy rendering tasks, overlaying the vertex coloring. This would let you have good dynamic lighting effects (Radiosity, anyone?) and still be able to keep the bandwidth low. If anyone does this after reading this post, you owe me two copies of the server and the content creation software :)

  • by steved ( 109399 ) on Friday July 28, 2000 @10:18AM (#896854) Homepage
    MPEG-4 Can be used for 3D content. The Web 3D consortium is currently working on the project [web3d.org]. I assume this is what they were comparing to.
  • > FFT is O(nlog(n)).

    And the wavelet transform has been O(n) since Mallat's filter bank algorithms. Speed is never a problem with wavelet processing.
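
    A toy illustration of why that is, using the Haar case of the lifting scheme (a simplifying assumption; real codecs use longer filters): each level only processes the coarse half left over by the previous level, so the total work is n + n/2 + n/4 + ..., which stays under 2n, i.e. O(n).

    def lifting_haar(signal):
        coeffs, approx, ops = [], list(signal), 0
        while len(approx) > 1:
            even, odd = approx[0::2], approx[1::2]
            detail = [o - e for e, o in zip(even, odd)]         # predict odd samples from even ones
            approx = [e + d / 2 for e, d in zip(even, detail)]  # update -> running pairwise averages
            ops += len(even) * 2                                # two operations per pair at this level
            coeffs.append(detail)
        return approx, coeffs, ops

    n = 1 << 16
    approx, coeffs, ops = lifting_haar(list(range(n)))
    print(ops, "operations for", n, "samples")   # about 2n: the cost is linear, not n log n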
  • When JPG2000 was announced here on slashdot [slashdot.org], there was a major discussion about whether or not wavelet compression is the best for images. Is there any difference in this scenario involving 3D movies?

    Mike Roberto
    - GAIM: MicroBerto
  • yeah, but it is a shame that you have limited yourself to 2 files. =)
  • > [] No standard, just a few people playing with
    > things, so they never introduced a product.

    Currently changing. JPEG 2000 is based on wavelets.

    > [] More obscure than 'typical' compression, more
    > effort and background is required to implement
    > it

    No. The maths and algorithms are quite simple in wavelet theory.

    The real reason people don't hear much about it is that people in general don't hear much about what is really going on in day-to-day technology. Wavelets are everywhere, in thousands of applications, for dedicated tasks. JPEG 2000 is one of the very first applications that can attract some attention from the media. This work might be another.
  • by be-fan ( 61476 ) on Friday July 28, 2000 @10:28AM (#896859)
    As anybody who has ever seen the stupendous quality of high-res wavelet images knows, this compression is pretty amazing. Although there are probably other, more legitimate I might add, uses for this, the first thing that comes to mind is porn videos. Seriously! The adult market is one of the only internet ventures making money, and this new compression just helps out the US economy some more. (Not to mention the fact that if wavelet compression allows streaming quality comparable to DVD, then you can cancel your subscription to the Spice channel ;)
  • Wavelet theory in mathematics is relatively (in computer terms) old, but it is nothing like a 'technology.' The theory of wavelets has been used in almost all image compression to date, most notably the JPEG and MPEG formats.

    What Schroeder and Sweldens have done is develop a method for generating wavelets which better compresses geometrical data. This is news, and you most certainly did not evaluate a product using this algorithm at your last job.

  • Actually, it's for 3D datasets, not video.

  • Stereo pairs could probably use a system similar to the MPEG motion compensation, where a frame is generated based on moving 16x16 pixel blocks around in another frame and storing the small difference.
    <O
    ( \
    XGNOME vs. KDE: the game! [8m.com]
  • I don't think that this technology would be used in the final stages of rendering a scene.
    The most interesting feature of this technology to me is the fact that it can be used to filter out the noise in 3D scanned objects. And it can also be used to simplify models - low-poly versions of distant objects.
    While this is not new, I think that most of those routines are based on NURBS (and other spline-based) technology. With this new algorithm it can be done for polygon-based objects. Or provide a better way to convert polygon objects to spline-based objects without losing too much detail.
    When modeling some highly-detailed objects, I've often wished for a good polygon-reducer - maybe this is it? (That would mean that you have to thank me - hey, I wished it.)
  • This is just for 3D geometry data. It is a *lossy* compression algorithm, which means that the uncompressed data is not identical to the original data, just very close. The algorithm requires certain assumptions about the data set--for example, it probably assumes that all the vertices in a certain region of space lie close to some surface that is readily expressed in terms of wavelets. (If that is not the case, then the data set you are working with probably does not resemble an object occurring in nature.) It is those assumptions which make such large compression ratios possible.
  • Surely this compression is taking a set of data and then working out the best way of representing that data - does this mean that wavelet compression can also be used on normal text files (after all, all video input is represented in the same 1 and 0 form)?

    Only for lossless compression. JPEG, for example, uses lossy compression - when you compress a bitmap to a JPEG, you lose some of the information, and you can't get that information back. Colors are blended together, edges are less defined, etc. It looks almost as good as the original, though. The problem is, for text and data, you must have lossless compression. Opharwize yuo migt winb up wiht txet zhat loucs leqe tjis!

    --

  • I can verify this later... I've got a NeXTcube with a Dimension board in it... as well as some turbo colours. The impressive thing about the NeXT codecs (which are all software, even with the Dimension) is that they pull 25fps on a 68040 box.

    It's funny to watch year-old computers struggle to pull that off consistently.
    ---
    Solaris/FreeBSD/Openstep/NeXTSTEP/Linux/ultrix/OSF/...
  • by Jerf ( 17166 ) on Friday July 28, 2000 @10:38AM (#896867) Journal
    I have contacted the RIAA and the MPAA, and have pointed them at this post, which contains so-called "bit stripped" versions of every movie and every song ever produced and that will ever be produced.

    If Napster's damage can be measured in the trillions in a lawsuit, just imagine what you've opened yourself up to.

    "Exadollars, and soon, petadollars. One thousand billion trillion dollars. How many lawsuits per second can your software handle?" with apologies to IBM [adcritic.com]

    please forgive duplication if it occurs, I'm having trouble getting through

  • It would make sense that as high bandwidth becomes more available at ever greater distances, the effective use of that bandwidth should become less and less important. Unfortunately, it doesn't seem to work that way.

    Instead, expectations seem to rise at a rate that is a multiple (>1) of the actual performance increases. Back in the day, people wanted to download, say, a clone of the Breakout arcade game (for DOS) quickly. Today, the same people want to download the Slackware Linux distro quickly. Is compression less important now?

    Furthermore, the increasingly wide availability of decent bandwidth at work, at home, or wherever you have your gd cellphone, again, pushes those expectations further.

    Another /. article today discusses getting /. wirelessly. Is there any doubt that soon we'll expect to watch a trailer for Star Wars on our cellphone/palm pilot before we order tickets on the same device? That ain't gonna happen without compression a little bit better than we have now.

  • Maybe you should look at some different samples. Wavelet image compression is impressive. Check out JPEG 2000 [jpeg.org] for what is bound to become the next big standard.
  • Run the image through a low-pass and a high-pass filter. Compress the high-pass information into tiny wavelets. Repeat the process on the low-pass signal a few times. Quantize and encode the images, and you have a bitstream. This works so well because the eye is believed to use wavelets to recognize features in images.
    <O
    ( \
    XGNOME vs. KDE: the game! [8m.com]
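
    A toy 2-D version of that recipe, assuming plain averaging/differencing as the low-pass/high-pass pair (a stand-in for the real filters): split the rows, then the columns, leaving a half-size smooth image plus three bands of mostly-small details. The smooth band is what gets fed back into the same routine "a few times" before quantizing and encoding.

    def split_1d(row):
        low = [(a + b) / 2 for a, b in zip(row[0::2], row[1::2])]    # crude low-pass
        high = [(a - b) / 2 for a, b in zip(row[0::2], row[1::2])]   # crude high-pass
        return low, high

    def decompose_2d(image):
        # Filter the rows first...
        lows, highs = zip(*(split_1d(r) for r in image))
        # ...then the columns of each half (transpose, split, transpose back).
        def columns(block):
            lo_cols, hi_cols = zip(*(split_1d(list(col)) for col in zip(*block)))
            return [list(r) for r in zip(*lo_cols)], [list(r) for r in zip(*hi_cols)]
        LL, LH = columns(lows)
        HL, HH = columns(highs)
        return LL, LH, HL, HH

    image = [[10, 10, 12, 12],
             [10, 10, 12, 12],
             [40, 40, 44, 44],
             [40, 40, 44, 44]]
    LL, LH, HL, HH = decompose_2d(image)
    print(LL)   # [[10.0, 12.0], [40.0, 44.0]] -- a half-size version of the image
    print(HH)   # [[0.0, 0.0], [0.0, 0.0]]     -- details are tiny or zero on smooth data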
  • by be-fan ( 61476 ) on Friday July 28, 2000 @10:47AM (#896871)
    I know this article specifically refers to using wavelets for compressing 3D, but it should be able to be used for video. FYI, wavelet compression is a method that uses fractals to compress an image. If you've seen the wavelet demos, you know just how much better than JPEG wavelet compression looks. Using the same process, it is also possible to compress a sequence of frames as in a video. This is demonstrated in some of the wavelet demos floating around the net. Right now they're in black and white, and are pretty small, but it is conceivable that they'll get better. A big problem with wavelet compression is the CPU cost. Even single images can take a second or two to decompress. However, that can probably be offset by hardware decompression mechanisms, and it doesn't seem that there will be a shortage of CPU power anytime soon. If you've ever seen how well wavelets let you compress images 50:1 with so little quality degradation, you'll understand what an impact this method could have on computer video.
  • I am very impressed with wavelet compression in general. The wavelet decomposition of an image isn't unique -- that gives the heuristics a little more "wiggle room" to choose the most optimized representation of the image.

    I just finished writing a proposal to NASA for some instruments on the Solar Probe [nasa.gov] spacecraft. That's a pretty telemetry-constrained mission. We tested a proprietary wavelet-compression algorithm at 50:1 on 14-bit images (yes, that's about a quarter-bit per pixel) and even at that level it's very hard to tell the difference between compressed and uncompressed images with the naked eye. (The algorithm seems to work by quantizing the sizes of features in the image).

    At that level of compression, a 30Hz stream of 6-bit-per-channel 640x480 images would only require just over 3Mbps of bandwidth -- and that's without taking any advantage of the relationship between frames. It's easy to believe that another factor of 50 could come out of a combination of more aggressive compression and either differential encoding or 3-D wavelets. We could end up with full-motion, full-rate video being squirted through 60kbps connections.
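
    A quick back-of-the-envelope check of those figures, assuming 3 colour channels and counting 1 Mbps as 10^6 bits per second (the assumptions are mine; the quoted numbers are from the comment above):

    width, height, channels, bits_per_channel, fps = 640, 480, 3, 6, 30
    raw_bps = width * height * channels * bits_per_channel * fps   # uncompressed stream
    at_50_to_1 = raw_bps / 50
    print(raw_bps / 1e6, "Mbps raw")          # ~165.9 Mbps
    print(at_50_to_1 / 1e6, "Mbps at 50:1")   # ~3.3 Mbps, i.e. "just over 3Mbps"
    print(at_50_to_1 / 50 / 1e3, "kbps with another factor of 50")   # ~66 kbps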

  • (this reply will suck; slash has been eating my replies)

    MPEG compresses motion by taking 16x16 pixel blocks from the previous frame and storing the (small) difference. Perhaps this same technique could be used to store the differences between the left and right eyes.


    <O
    ( \
    XGNOME vs. KDE: the game! [8m.com]
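
    A rough sketch of that block-matching idea applied to a stereo pair (hypothetical code, not MPEG's actual motion estimation): for each block of the right image, search along the same row of the left image for the best match, then keep just the offset and the (hopefully small) residual.

    def sad(left, right, lx, rx, y, size):
        """Sum of absolute differences between two size x size blocks."""
        return sum(abs(left[y + j][lx + i] - right[y + j][rx + i])
                   for j in range(size) for i in range(size))

    def match_block(left, right, rx, y, size=16, max_disparity=32):
        """Return (best horizontal offset, residual block) for one right-image block."""
        width = len(left[0])
        candidates = [d for d in range(max_disparity + 1) if width >= rx + d + size]
        best = min(candidates, key=lambda d: sad(left, right, rx + d, rx, y, size))
        residual = [[right[y + j][rx + i] - left[y + j][rx + best + i]
                     for i in range(size)] for j in range(size)]
        return best, residual

    # Tiny demo: the right image is the left image shifted 2 pixels, so the best
    # offset should be 2 and the residual all zeros.
    left = [[(x + 3 * y) % 7 for x in range(12)] for y in range(4)]
    right = [[row[x + 2] for x in range(10)] for row in left]
    offset, residual = match_block(left, right, rx=0, y=0, size=4, max_disparity=4)
    print(offset, all(v == 0 for row in residual for v in row))   # 2 True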
  • This comment explains to us all why this entire thread is worthless while educating us at the same time.
  • by coult ( 200316 ) on Friday July 28, 2000 @11:28AM (#896881)
    Okay, so all of the descriptions that people have given here for wavelet compression are wrong. I've got a Ph.D. in Applied Math and do research in wavelet compression of 3-D data (not geometry data, mind you, but 3-dimensional real data, like images, but in 3 dimensions instead of 2). The basic idea behind wavelet compression is the following:

    In most natural or real-world data (i.e. images, geometry data, etc.) the information at a given point in the data is very highly dependent on the data at nearby points. Thus, there is a certain amount of redundancy in the data, and this redundancy is spatially localized. The concept in transform coding is to apply some transformation (either linear or nonlinear; the wavelet transform and Fourier transforms are linear) to this data to reduce the statistical redundancy.

    Even after applying the transform, you haven't saved anything in terms of the space required to store the data; all you've done is change the basis used to represent the data. Now you take the transformed data and place it into a bunch of bins, each of which is identified with an integer. At this stage, called quantization, you are modifying the information present, because the best accuracy with which you can recover the data is given by the width of the bins. Finally, you take the sequence of integers and apply a lossless coding scheme to it to reduce the number of bits required to represent the stream of integers. The compression happens at this stage. Wavelets do a better job than the blocked discrete cosine transform (used in JPEG) at reducing the statistical redundancy of the input data; thus wavelet-based image compression compresses more efficiently than JPEG.

    What Schroeder and Sweldens have done is taken a very general, widely applicable method for constructing wavelet transforms (known as the lifting scheme, invented by Sweldens) and adapted it for representing mesh nodes and connectivity information, i.e. geometry (which incidentally could just as easily be higher dimensional data). Thus they have a wavelet transform for geometry. They achieve compression by using the EZW coding scheme, developed for coding wavelet coefficients of images and used in the JPEG2000 standard, and applying it to their geometry wavelets.

    It should be very nice for low-bitrate storage and transmission of geometry, as well as successive-refinement transmission (i.e. the 3-d data gets better and better looking as more bits arrive).
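
    A toy end-to-end illustration of those three stages (transform, quantize, losslessly code), using a Haar transform and a crude run-length code for the zeros; it only shows where the compression actually happens and is nothing like EZW or the geometry scheme described above.

    def haar(signal):
        """Full Haar decomposition: repeatedly average/difference the coarse part."""
        out, approx = [], list(signal)
        while len(approx) > 1:
            pairs = list(zip(approx[0::2], approx[1::2]))
            out = [(a - b) / 2 for a, b in pairs] + out   # detail coefficients (mostly small)
            approx = [(a + b) / 2 for a, b in pairs]
        return approx + out

    def quantize(coeffs, bin_width=1.0):
        """Lossy step: map each coefficient to the integer index of its bin."""
        return [round(c / bin_width) for c in coeffs]

    def rle_zeros(ints):
        """Lossless step: collapse runs of zeros into ('Z', run_length) tokens."""
        tokens, run = [], 0
        for v in ints:
            if v == 0:
                run += 1
            else:
                if run:
                    tokens.append(('Z', run))
                    run = 0
                tokens.append(v)
        if run:
            tokens.append(('Z', run))
        return tokens

    signal = [20, 20, 20, 20, 21, 21, 80, 80]   # smooth except for one jump
    q = quantize(haar(signal))
    print(q)              # [35, -15, 0, -30, 0, 0, 0, 0]: mostly zeros after transform + quantization
    print(rle_zeros(q))   # [35, -15, ('Z', 1), -30, ('Z', 4)]: the zero runs are where the saving is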

  • Nope. 3d data is already smaller than 2d data. You have a set of textures (already present in a 2d movie, except that you don't tend to save space by knowing where/how they're tiled) and a set of meshes. Everything is included ONCE and then re-used later when it's needed without having to redownload if you're streaming.

    I think what you're saying here is that *motion* 3d is smaller than 2d video. True enough, assuming the motion 3d is done in an intelligent fashion. However, a 3d geometry snapshot of a scene has more information than a 2d image of the scene. The 3d snapshot has to store 2d textures for *all* the surfaces, plus vertex information.

    So really, I can't see what they've actually accomplished here, and it's definitely apples and oranges to compare anything for 3d data with a 2d video standard. Perhaps they are misusing the term three-dimensional.

    As I understand things, they have come up with a new method to compress the vertex information in a 3d wireframe. I do not know whether their research has taken the next logical step of compressing a wireframe as it moves through time (an algorithm could be developed which exploits redundancy between frames).

    Also it is not "apples and oranges" to compare this to MPEG4, as MPEG4 does contain a standard for compressing 3d geometry information. It's not just a simple 2D video standard.
  • The wavelet decomposition for a given basis is in fact unique (for scalar wavelets, anyway). You are probably thinking of the wavelet-packet/best-basis algorithm, which is a generalization of the wavelet transform and chooses the basis which minimizes the information cost for a given signal.

    I agree that 60kbps full-motion high-quality video is probably possible. Using 3-D wavelets would build in some lag time (like 128 frames or so, depending on the basis) so it wouldn't ever be "live" (but heck, what's 4 seconds?)

  • According to the MPEG4 FAQ [cselt.it], the standard contains a Scene Description Specification in which the "...structure and scene description capabilities borrow several concepts from VRML".
  • But that's for 2D display of data. By making the display of data 3D, we handily avoid all those patents.

    (Moving out of the way to avoid the stampede of people rushing to patent the process of applying 2D algorithms to 3D)

  • "(known as the lifting scheme, invented by Sweldens)"

    Not the "lifting scheme", popularized by Microsoft when "lifting" their GUI from Apple as well as 99.99% percent of their other "innovative" technologies?
  • You're right. I am not sure whether wavelet "compression" is the right term at all. There is a wavelet transform or wavelet decomposition. After you've done it, you haven't done any compression at all. All you did (and that's the most important thing) is to separate "crude" details from "fine" details (in layman's terms).

    The benefit is that now you can quantize them, and use far more aggressive quantization on the "fine" details because our eyes are far less sensitive to them. Then comes the actual compression - and that's probably the part that has the most patents, and some very clever algorithms.

    I doubt the idea of the wavelet transform itself has any patents - it was developed by French mathematicians (I believe), not some corporate slave with a pen in one hand and the phone (to call the patent office) in the other...
  • by tilly ( 7530 ) on Friday July 28, 2000 @01:02PM (#896908)
    Sure, wavelets are O(n), FFT is O(n log(n)).

    But the FFT has a much better constant, and so is generally faster on real-world data sets.

    The real win with wavelets isn't speed, it is the match to the real world data. A sharp boundary in the FFT has to have a "long tail" in the coefficients, causing Fourier transforms to suffer from things like the Gibbs effect. Wavelets allow you to make a deliberate tradeoff between smoothness and sharp boundaries. So more information is in fewer coefficients.

    BTW a lot of the better wavelet algorithms (eg wavelet packets) are no longer O(n). Why not? Because they allow you to dynamically choose the best representation out of a family of representations. That extra freedom requires processing time...

    Cheers,
    Ben
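
    A small numerical check of that sharp-boundary point, assuming numpy is available (the printed counts are whatever the run produces, not figures from the article): count how many coefficients are needed to capture 99% of the energy of a step signal under the FFT versus an orthonormal Haar transform.

    import numpy as np

    def haar(signal):
        out, approx = [], np.asarray(signal, dtype=float)
        while len(approx) > 1:
            even, odd = approx[0::2], approx[1::2]
            out.append((even - odd) / np.sqrt(2))   # orthonormal Haar details (energy-preserving)
            approx = (even + odd) / np.sqrt(2)
        return np.concatenate([approx] + out[::-1])

    def coeffs_for_99_percent(coeffs):
        energy = np.sort(np.abs(coeffs) ** 2)[::-1]
        return int(np.searchsorted(np.cumsum(energy), 0.99 * energy.sum()) + 1)

    n = 1024
    step = np.r_[np.zeros(n // 2), np.ones(n // 2)]   # one sharp edge
    fft_coeffs = np.fft.fft(step) / np.sqrt(n)        # scaled so energy is preserved
    print("FFT :", coeffs_for_99_percent(fft_coeffs))
    print("Haar:", coeffs_for_99_percent(haar(step)))   # far fewer, thanks to locality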
  • by tilly ( 7530 ) on Friday July 28, 2000 @01:09PM (#896911)
    With wavelets, at a very basic level, there are too many options. Wavelet researchers don't talk about the wavelet transform; they have entire families of wavelet transform algorithms to argue over. Each is better in different circumstances.

    This makes standardization harder. There are a lot of tradeoffs. Do we go with the one that works better on smooth data? Or on boundaries? The one which is symmetric so that the errors it produces tend to be harder for the human ear to pick up? Or the one which is orthogonal, giving it a ton of nice mathematical properties? Shall we have a simple wavelet transform? Or a dynamic wavelet packet transform? Do we work from the most significant bit of data to the least? Do we try to order the data in some way? (The first allows for bandwidth to determine the compression level chosen, the second is key for streaming output.)

    The basic idea of a wavelet is very flexible. So you get a lot of choices, none of which is obviously better than the others. This makes it hard to decide which should be made a standard...

    Cheers,
    Ben
  • You can already do that type of stuff with Metastream [metastream.com], and it is based on wavelets, too. They hired a guy straight from MIT to do their compression algorithms - and it was worth it. I remember that a model of a safe that was 2 megabytes in DXF was compressed down to 15 KB, with hardly any loss in quality. Try it out...
  • A wavelet transform just turns data with local similarity of information (think about a picture for a second) into an alternate format where the vast majority of your information is squeezed into a small fraction of the terms.

    It is useless on things like text where from point to point things jump around. Feeding a sentence of English into a wavelet transform would be silly.

    Now what does this transform give you? Three things. First of all you now have your significant information in a small number of terms that can be easily analyzed. (Think speech recognition.) Secondly you now know that the majority of your data consists of numbers close to zero, which is something we know how to say efficiently. And thirdly we know what the least significant information is (all those little terms) and we can just chop it out for a lossy algorithm.

    So wavelets are useful for data processing (visual and auditory recognition, etc), lossless compression, and lossy compression of visual, audio, and other similar data. It is particularly valuable with data that has a mix of boundaries and smooth regions. (Fourier transforms are good on smooth regions only.)

    Cheers,
    Ben
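
    A quick sketch of the "chop out the little terms" idea, using a toy averaging/differencing transform on made-up data: drop every detail coefficient below a threshold and check that the reconstruction barely changes.

    def analyze(signal):
        coarse = [(a + b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        detail = [(a - b) / 2 for a, b in zip(signal[0::2], signal[1::2])]
        return coarse, detail

    def synthesize(coarse, detail):
        out = []
        for c, d in zip(coarse, detail):
            out += [c + d, c - d]
        return out

    signal = [10.0, 10.2, 10.1, 9.9, 30.0, 30.3, 30.1, 29.8]
    coarse, detail = analyze(signal)
    kept = [d if abs(d) >= 0.5 else 0 for d in detail]          # lossy: discard the tiny details
    approx = synthesize(coarse, kept)
    print(detail)                                               # all small numbers here
    print(max(abs(a - b) for a, b in zip(signal, approx)))      # reconstruction error stays below 0.5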
  • Compression works by coming up with a compact description of "predictable data". After a wavelet transform most of your data is predictable - it is in terms close to 0.

    This allows wavelets to be used for efficient lossless compression (compress the small terms using their predictability) as well as decent lossy compression (ah heck, throw away the small terms).

    Cheers,
    Ben
  • As so often, there is a great algorithm (set of algorithms, actually). The problem is -- make a standard out of it. Create a file format that supports the standard. Create a library that reads and writes those files. Create a browser plugin from that library.

    This usually takes several years, not to mention the problems with patents.
  • Wavelet compression has already been used to compress images. It does this by finding a fractal pattern or something like that. I've actually run the demo that compresses video using wavelets. It's a small animation of a girl brushing her hair (I think, I don't remember it exactly). Also, there is a site that contains compressed pictures of stuff like a glass of wine, a cow, etc.
  • For instance a traditional wavelet transform cannot touch Wim's lifting scheme. There is no way that any standard that was written before he came up with it could have taken it into account.

    I saw him give a talk on it a few years ago. One of the coolest visual effects that I have ever seen was a picture of a ballroom with a metal ball in the middle. He laid down a wavelet transform on the ball, and another on everything else. He strengthened the detail on the ball, and weakened it everywhere else. The room faded to a blur, while the ball was in some sort of super-focus. And the blur did not look like a smudge like you sometimes see, it looked exactly like things look when they are out of focus.

    As for the data ordering, let me give you the issue in a simple form. When all is said and done a picture takes a certain amount of data. But data is not created equal. Some pieces (eg the most significant bit of the number for the average color of the whole picture) say more about the picture as a whole than others (eg the least significant bit distinguishing one dot from another).

    Mathematically you can calculate the energy of each piece of information. Now wouldn't it be nice to send the information ordered from the most significant to the least significant bit? Then once the receiver has an acceptable picture they can just cut the transmission short.

    OTOH now stop and think about this in context of a song. The basic tune at the end of the song is going to come through before the first word is clear! Streaming media really needs information sent in an order that is time-sensitive. Sure, some information can be sent ahead, but ideally you want to be able to have a fairly small buffer of stuff sent ahead, and be receiving the details in order of execution.

    Cheers,
    Ben
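
    A sketch of the "most significant information first" ordering, assuming a handful of made-up transform coefficients: send them in order of decreasing magnitude and watch how quickly the transmitted energy approaches the total, compared with sending them in natural (time/position) order; the gap between the two is exactly the streaming tension described above.

    coeffs = [0.2, -15.25, 0.1, 35.25, -0.3, 0.05, -29.5, 0.0]   # typical shape: a few big, many tiny
    total = sum(c * c for c in coeffs)

    def energy_profile(order):
        """Fraction of the total energy covered after each coefficient is sent."""
        running, profile = 0.0, []
        for c in order:
            running += c * c
            profile.append(round(running / total, 3))
        return profile

    print(energy_profile(sorted(coeffs, key=abs, reverse=True)))   # jumps to ~1.0 after 3 coefficients
    print(energy_profile(coeffs))                                  # natural order crawls there much later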
  • The Internet standard for stereograms is right eye on the left, left eye on the right. Cross-eyed. That's how the 3d South Park images at www.sweeet.com are presented.
    <O
    ( \
    XGNOME vs. KDE: the game! [8m.com]
  • Correct, it's a way to compress geometry data for 3D models. This is not about video compression.

    MPEG4, though, has a whole collection of stuff in it besides video compression. There's a VRML-like 3D system. There's a 2D compositing and interaction system, sort of like Macromedia Flash. There's audio compression. There's a way to send MIDI data. There's an extension for Java content. There's even a scheme for encoding data for a text-to-speech converter with character lip sync. Most of these fancy features are unused, seldom implemented, and probably shouldn't have been there at all. But they're in the spec.

  • I posted this after I read the responses to my original post. I'm not out to "prove" anything. I've read a few articles on it a while ago, and somehow fractals got mixed in with the memory. It was a mistake. I'm sorry. It happens to everybody.

    As for my comments about Willamette, I did make some mistakes and had to revise my concept a little bit; however, my overall point still stands. In consumer space, Willamette will be a beast. I was wrong about some aspects of it (for example, I thought the clock doubling applied to the whole pipeline), but nobody has disproven my point. It seems that Intel is trying to take on NVIDIA in the geometry department. In doing so, they make a chip that performs geometry acceleration very well, but is a little weaker per-clock for other tasks. However, the high clock speeds outweigh this disadvantage, and help greatly for matrix operations where Willamette is really strong per clock (because the 20-stage pipeline really doesn't have a negative effect on something regular like matrix operations). Additionally, Intel not only has the motive (geometry accelerators make them less relevant in the all-important games market), but the concept makes sense. If Willamette is weaker per clock than the K7, but at a higher clock speed, its performance should match most of the time. However, the high clock speed really helps matrix operations, where Willamette doesn't have a per-clock disadvantage.

    Also, I said that AMD needs to do something to counteract Willamette. AMD doesn't HAVE anything to counteract Willamette. Not only is the next chip (Mustang) not aimed at all at the consumer market, but AMD's chip after that (K8, Sledgehammer) is aimed at competing in server space. Additionally, it appears (if you read the articles recently on ZDNet and CNET) that Intel may be lowering the market they're aiming at with Willamette. If that's the case, then AMD has nothing to counter it.

    There you go, a summary of the entire thread. Where's the "total bullshit?" Prove me wrong here somewhere, then maybe you've got a point.
  • I stand corrected. I read the article a long time ago, and somehow fractals got in there.
  • 1) SSE2 doesn't have 2 instructions per cycle. I said it is twice as fast as 3DNow! Since SSE2 is 128 bits wide (meaning 4 32-bit floats per instruction) and 3DNow! is 64 bits wide (2 32-bit floats per instruction), that's true. Look it up. It's not two instructions per cycle, but 4 operations per instruction.
    2) You've got me on the yield problems. However, Intel has rarely had yield problems, and if AMD can take it, I'm pretty sure Intel will make it.
    3) AMD cannot migrate down the Mustang since it is the same exact (K7) core with more cache. Source: Sharky Extreme (www.sharkyextreme.com) AMD/VIA roadmap, 6th page.
    4) Willamette actually will be targeted at mainstream systems. Check www.zdnet.com for several articles. Additionally, they are supporting PC133 SDRAM for these lower-end systems. Lastly, this article [maximumpc.com] and MaximumPC prove it.
    Now what were you saying about not being totally sure?
  • Okay, I reread the docs. You're right, Willamette only has one FP pipe and one load/store pipe. However, the single 128-bit wide pipe should be faster than the P3's double 64-bit wide pipes, and we already know that SSE is as fast or faster than 3DNow!. An interesting article covers this; you can find it here [examedia.nl].

    Read the FAQ carefully. The thing is simply an Athlon with .18 micron technology and 1MB of cache. The reduced core size and lower power requirements are simply a byproduct of the .18 micron process. Considering that the Thunderbird Athlon already runs a .18 micron process, it already has the reduced core size and lower power requirements. And instead of "1MB of on-chip, performance enhancing L2 cache memory," Thunderbird has "256KB of on-chip performance enhancing L2 cache memory." Thus, it IS simply an Athlon with more L2 cache.

    As for Willamette and the MaximumPC article, it is quite accurate. You can find similar articles on ZDNet, and yes, Intel HAS been quoted as saying it is aimed at the high end, but it HAS done focus shifts before. In fact, the article in the latest MaximumPC even mentions that quote, but says they may aim it at the lower to middle end to compete with Athlon. And since Willamette itself won't really come out in volume till very late this year, the SDRAM chipset isn't very far off. In fact, it is very reminiscent of the whole PII, where Intel aimed the PII at more mainstream applications quite quickly.
  • But my point is that AMD's new CPU is simply an Athlon with more cache. No new features, no rearchitecting of the pipeline. Don't you think they would put it in the FAQ if the actual chip were going to be changed? All along it has been known that the Athlon Ultra would be an Athlon with more cache, just like the Xeon chips are compared to the Pxx chips. The FAQ doesn't say anything about changes, just a new process. AMD has already done this process change. The Thunderbird Athlon is simply an Athlon with 256K of L2 cache, and based on a .18 micron process. Plus, it is already based on copper. Aside from the change in L2 size and speed (and any changes in circuit layout), there are no other changes. Same thing with Mustang. The Mustang will actually be less of a change than Athlon -> Thunderbird. It will be based on the same Thunderbird core (most 900MHz+ Athlons are based on this core) with 1MB of L2. Process changes HAVE been made without architectural changes. Take the PII Klamath to the PII Deschutes: the process was moved from .35 to .25 with no other changes. Athlon Ultra (Mustang) is the same way. Anyone without any preconceptions would read the FAQ as talking about an Athlon with more cache. Everything the FAQ talks about (except the 1MB of cache) is already implemented in the regular Thunderbird Athlon.
  • You don't seem to understand what Mustang is. I just looked at the AMD FAQ. It says that Mustang is an enhanced version of the Athlon based on a .18 micron process and with a larger cache. However, the Thunderbird Athlons are already .18 micron:

    "Q6: Does the new version of the AMD Athlon processor use a 0.18 micron manufacturing process?
    A6: Yes. All of the AMD Athlon processor wafer starts are now on 0.18 micron process technology. The die size of the new AMD Athlon processor is 120mm square."

    The other quote, from the same FAQ, is this:

    "A15: 'Mustang' - Enhanced version of AMD Athlon processor with reduced core size, lower power requirements, and up to 1MB of on-chip, performance-enhancing L2 cache memory. Manufactured on a 0.18 micron copper process technology. Multiple derivatives of the Mustang core are planned to address the requirements of the high-performance server/workstation, value/performance desktop and mobile markets."
    If you carefully read past the marketing speak, you'll notice this: both the current Athlon (Thunderbird) and the next Athlon Ultra (Mustang) are based on the same .18 micron copper process. This is where the "enhanced with reduced core size and power consumption" part comes in. The only difference in the case of the Mustang is that it has up to 1MB of full-speed L2 cache. I don't see why this is so hard to understand.

    My original point was that AMD doesn't have a processor that can compete with Willamette if Willamette's performance is as high as expected. I said AMD's real next-gen design is the Sledgehammer chip. You put forward Mustang as AMD's next chip. However, Mustang is simply an AMD Athlon Thunderbird with more cache. It is almost exactly like a PIII Xeon: same core, same copper process, same everything except cache. Sure, there may be changes in circuit layout to accommodate the cache, but no real performance benefits will come from that. It is doubtful that AMD's .18 micron process will scale to 1.5+ GHz given the limitations of the process. Even the .18 micron Thunderbirds are already producing huge amounts of heat.

    You can't argue that Mustang is a next-gen processor, so where's your ground? If Willamette is a big 3D powerhouse, AMD doesn't have a leg to stand on. If you actually go out and read the articles that are coming out, they say the same thing. ZDNet even said that AMD may have to go back into the value market if Intel can pull off the whole Willamette thing. (Get developers to recode, be able to get decent performance on non-3D apps, get good yields, etc.)

    PS> It isn't possible to change the process without changing the layout, but steppings almost never result in performance increases, just better yields. Case in point: a .35 micron PII overclocked to 333 gives the exact same performance as a 333 MHz .25 micron PII.

"Remember, extremism in the nondefense of moderation is not a virtue." -- Peter Neumann, about usenet

Working...