![]() |
![]() |
![]() |
![]() |
The Data Compression Book
2nd edition
by Mark Nelson and Jean-loup Gailly, M&T Books, New York, NY 1995
ISBN 1-55851-434-1
541 pages
List price in the US is $39.95
|
Update: I'm sorry to say this book is out of print. However, you can almost always find used copies on Amazon.com for very reasonable prices. Be sure to get the second edition!
"The best all-around book on the subject" - Andrew Schulman, Dr. Dobb's Journal "The book hits its target audience right between the eyes." Jeff Duntemann, PC Techniques "One of my favorite books on applied computer technology is The Data Compression Book" - Jeff Prosise, PC Magazine |
This authoritative guide details various data compression techniques used on personal and mid-sized computers. It explores different data compression methods, explaining the theory behind each and showing C programmers how to apply them to significantly increase the storage capacity of their system. Each technique is fully illustrated with complete, working programs written in portable C. These programs not only demonstrate how data compression works but can also be used to build your own data compression programs.
Topics include:
-
Fractal Compression
Shannon-Fano and Huffman coding.
Differences between modeling and coding.
Expanding and improving Huffman Coding with Adaptive Huffman techniques.
Arithmetic coding.
Implementing powerful statistical models.
Dictionary compression methods using LZ77 and LZ78.
Applying lossy compression techniques to computer graphics and digitized sound data.
The JPEG compression algorithm.
Developing a complete archiving program.
What's new in the second edition?
The second edition of this book was printed in November, 1995. The text and source code of the book was cleaned up somewhat to match up with current events. In addition, Jean-loup Gaillyadded a chapter on Fractal Image Compression, and performed some work on the rest of the text.
Want more information on data compression?
To keep up with the field of data compression, you should monitor the news groups: comp.compression. Before you post any questions or comments in this group, you probably ought to read the comp.compression FAQ.
Want permission to use the code in the book? Check out my liberal code use policy.
Errata
First Edition, First printing only! Page 310.
The top of this page starts with a declaration for find_child_node(). The two local variables are defined as:
-
unsigned int index;
-
int offset;
These need to be changed to:
-
unsigned int index;
-
unsigned int offset;
The code is correct on the disk, and is correct in the second printing, and the second edition.
First Edition, Page 121
Second Edition, Page 112
The listing for ahuff.c was truncated, resulting in the loss of a fairly big chunk of code. The listing on the diskette is complete, so most people won't miss it. Those who bought the first edition sans disk can get the updated copy here.
Second Edition only, Page 50
The listing for main-c.c did not include the terminating angle bracket ( '}' )character. The code is correct on the diskette.
Second Edition Ony, Page 332
A nasty transcription error rendered the formula shown in Figure 11.8 invalid. The bottom half of the formula shows a term that gets the cosine of ((2j+1)*j*π)/2N. This is incorrect, it should take the cosine of ((2j+1)*i*π)/2N. Note that the change means that we multiply π by i, not j.
The code and supporting documentation are correct as printed.




17 users commented in " The Data Compression Book "
Follow-up comment rss or Leave a Trackback[...] The Data Compression Book [...]
I need to hide 4 digit binary in 8 digit binary. Can it be done?.
@Alex:
Hi Alex,
I'm not sure I know what you mean by "Hide".
Hi Mr. Mark Nelson,
Finally had a chance to buy your book. Was used, but in very good state. Only wnated to know if you have the contents somewhere in this site, to download and study it, if possible...
Regards,
Sven Bleckwedel
Hi Mr Mark Nelson,
I forgot to mention that about the contents I refer to the diskette contents (archived in one .zip) that could be available in your site...
Regards,
Sven Bleckwedel
hi Mr. Mark Nelson
this about book "The Data Compression Book" errata in Chapter 12 "An Archiving Package". In carman.c source code
i have compile it in Borland C++ 5.5.1 for Win32
and i have got several error that make me confuse like :
Error E2129 carman2.c 228: Character constant must be one or two characters long in function main
- Maybe '' is String so "", are i'am right
Error E2451 carman2.c 1149: Undefined symbol 'OL' in function Insert
- OL is not define variable
Error in line 1045 which number_of_bytes-- > ? bigger from what
and at last i try to fix but failed with error in linking
Error: Unresolved external '_CalculatedBlockCRC32' referenced from C:\BORLAND\BC
C55\BIN\CARMAN.OBJ
i so appricate if you want give me the fix source code...
best regard
@evilkyro:
It looks like you must be typing in the source code by hand, right? If so I can help you out with a copy of the source in a zip file. Let me know if you are interested.
yes, sure think mark i have type it. give me copy of the source in zip file very helpfull to me. i so appreciate. thank you. by the way where the link that i can download it mark. and again thanx
best regard
and i want to ask where can i find all the source code of your book Mr. Mark. cause, i have search but can find it. and this book i borrow from library and i live in indonesia. so i hard to find this book. can you give me the link. so i can download it. i'm so appreciate if you give me all source in your book.
best regard
Thank Your for send me the souce. It will be very usefull to learn compression.
best regard
I have the 1st edition I read years ago and have to review it again because I have some data compression work to do. The book is a great help to my understanding.
However, Appendix A (Compression Statistics) gives some numbers without describing their units. There are some 4 digit numbers of the form NN.NN. I assume they are compression ratio percentages, but perhaps they are execution time in seconds! If they are percentages, they might be compressed size / original size or (original size - compressed size) / original size. Also, there are numbers listed between 3 digits and 5 digits long in other tables, are these bytes/second, total seconds, or something else? Maybe someone that understands could respond or an entry could be made under Errata.
@Jacques:
Note that the head of the first table says "Ratios" - the numbers you are seeing are compression ratios. The ratio used is described in the front of the book in the section "Keeping Score". A value of 100 indicates perfect compression, and 0 indicates no compression.
dear mr.Mark nelson
now i'm doing my final exam about lossless compression using context tree modeling. i have download several references, but until now i'm still confuse about context modeling and context tree algorithm. can you give me suggestion, a book or an article about this algorithm? thank you
@sophie:
Actually, I don't have anything in book about Context Trees or the Context Tree Algorithm - because I don't know anything about them. Sorry.
Post to comp.compression and ask for references, I'm sure the good people there will be happy to give you a pointer or two.
- Mark
I just wanna see how iG:Syntax Hiliter works:
Mark-
Love the book. It's been on my bookshelf since '92.
I was referring to it this morning, and thought I'd pass along some errata, in case you plan to come out with a revised edition sometime soon.
These are pertaining to the first edition.
Page 125, in your table, all 18 of your "greater than" signs should be "less than signs". Example,
SPACE 1/10 0.00 >= r > 0.10
(no value of r can be both less than 0 and greater than 0.1)
Page 234, second last paragraph ends with:
So this token is encoded as: 14, 4, ' .' - which should be
So this token is encoded as: 14, 4, ' '.
(I'm sure some English-expert editor (or program) incorrectly put the period inside the quote for you. If you want to satisfy the editor, perhaps put it inside of double quotes, like you did on page 235:
So this token is encoded as: "14, 4, ' '."
I found a few more minor things... not sure if you want them. Very helpful book, though! Thanks!
@J:
Thanks for the corrections - the first edition was done in the dark ages of publishing - believe it or not I had to send in hardcopy, so it had more than it's share of problems, particularly in the code.
- Mark
Leave A Reply
You can insert source code in your comment without fear of too much mangling by tagging it to use the iG:Syntax Hiliter plugin. A typical usage of this plugin will look like this:[c]
Note that tags are enclosed in square brackets, not angle brackets. Tags currently supported by this plugin are: as (ActionScript), asp, c, cpp, csharp, css, delphi, html, java, js, mysql, perl, python, ruby, smarty, sql, vb, vbnet, xml, code (Generic). If you post your comment and you aren't happy with the way it looks, I will do everything I can to edit it to your satisfaction.int main()
{
printf( "Hello, world!\n" );
return 1;
}
[/c]