Things I Need to Know

Last updated: 5/25/95

OggSquish is not perfect, but the quest is to make it arbitrarily close. A lot of people who might be interested in OggSquish likely know quite a bit about audio compression that I don't. Comments are welcome; mail them to me at xiphmont@cs.titech.ac.jp. Sorry, I don't have mail access from DeskFish, and I don't want to have to install a mail server just to be able to use a mail-to form on the Web server :)

Arithmetic Encoding

Arithmetic encoding is patented by IBM, and they generally charge a fee for licensing. I'd like opinions on how much of the idea is covered by the patents. I have read the patents, but I don't really know how much is legally implied by the algorithm descriptions in common legal practice.

Arithmetic encoding would give both better compression ratios and substantially lower memory usage (OggSquish typically uses several hundred Huffman trees internally at any one time). In addition, it allows for much more flexible construction of probability spaces; OggSquish currently loses some ratio because it cannot construct a probability space quickly enough to fit the exact immediate circumstance (constructing a Huffman tree is time consuming), so it uses a pregenerated one that is a close match.
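To make the ratio argument concrete, here is a rough Python sketch (not OggSquish code) comparing the average length of a Huffman code against the entropy of the same distribution, which is the cost an ideal arithmetic coder approaches. The four probabilities are made up for illustration, and the function names are hypothetical.

```python
import heapq
import math

def huffman_code_lengths(probs):
    """Return the Huffman code length (in bits) for each symbol."""
    # Heap entries: (probability, unique tie breaker, symbols in this subtree).
    heap = [(p, i, [i]) for i, p in enumerate(probs)]
    heapq.heapify(heap)
    lengths = [0] * len(probs)
    counter = len(probs)
    while len(heap) > 1:
        p1, _, s1 = heapq.heappop(heap)
        p2, _, s2 = heapq.heappop(heap)
        # Merging two subtrees pushes every symbol in them one level deeper.
        for s in s1 + s2:
            lengths[s] += 1
        heapq.heappush(heap, (p1 + p2, counter, s1 + s2))
        counter += 1
    return lengths

def entropy_bits(probs):
    """Shannon entropy in bits per symbol."""
    return -sum(p * math.log2(p) for p in probs if p > 0)

# Assumed, heavily skewed distribution (illustrative only).
probs = [0.90, 0.05, 0.03, 0.02]
lengths = huffman_code_lengths(probs)
huffman_avg = sum(p * n for p, n in zip(probs, lengths))
ideal = entropy_bits(probs)
print(lengths, huffman_avg, ideal)
```

Huffman has to spend at least one whole bit on every symbol, so a very likely symbol costs far more than its information content. That gap is where the extra ratio from arithmetic encoding would come from on skewed distributions like this one.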

Psychoacoustics

OggSquish uses a first-pass filter on all input blocks to remove 'inconsequential' frequencies from the frequency domain. The fewer frequencies, the fewer bits needed to encode. The idea is to keep as few frequencies as possible while preserving as much quality as possible.

Currently, OggSquish uses a filter that preserves anywhere from 20% to 50% of the original output amplitudes from the DCT. Those of you who have seen the source know that this filter is a bit of a hack. It's fast, and works reasonably well, but there is much room for improvement.
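For readers who haven't seen the source, the general shape of such a filter can be sketched in Python as below. This is not the actual OggSquish filter; it assumes the simplest possible rule (keep the largest-magnitude coefficients, zero the rest), and the block contents, the 25% fraction, and the function names are all made up for illustration.

```python
import math

def dct2(block):
    """Naive O(n^2) unnormalized DCT-II of a list of samples."""
    n = len(block)
    return [sum(x * math.cos(math.pi * (i + 0.5) * k / n)
                for i, x in enumerate(block))
            for k in range(n)]

def keep_largest(coeffs, fraction):
    """Zero all but the largest-magnitude `fraction` of the coefficients."""
    keep = max(1, round(len(coeffs) * fraction))
    # Indices sorted by coefficient magnitude, largest first.
    order = sorted(range(len(coeffs)), key=lambda k: abs(coeffs[k]), reverse=True)
    kept = set(order[:keep])
    return [c if k in kept else 0.0 for k, c in enumerate(coeffs)]

# Hypothetical 16-sample block: one strong component plus one weak one.
block = [math.cos(math.pi * (i + 0.5) * 2 / 16)
         + 0.05 * math.cos(math.pi * (i + 0.5) * 7 / 16)
         for i in range(16)]
coeffs = dct2(block)
filtered = keep_largest(coeffs, 0.25)  # keep 4 of 16 coefficients
nonzero = sum(1 for c in filtered if c != 0.0)
print(nonzero, filtered[2], filtered[7])
```

A purely magnitude-based rule like this knows nothing about hearing: it will happily keep a loud component the ear would mask and drop a quiet one that is audible, which is exactly the kind of thing a real psychoacoustic model should do better.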

Anyone with extensive knowledge of psychoacoustic filters is welcome to comment or point me toward papers that I should read. I have read texts on psychoacoustics, hearing, the anatomy of the ear, etc., but none offered practical experience with implementing these ideas as computer algorithms. This experience is hard won; I'm happy to share what I know with people who can show me how much I still need to learn :)

DeskFish WWW Server / Address comments to: xiphmont@cs.titech.ac.jp