Dr. Dobb's Journal, January 1997
Free software is like pizza - when it's good, it's very good. When it's bad, it can still be pretty good. This article takes a look at zlib, the free data compression library written by Mark Adler and Jean-loup Gailly. Like the very finest pizza, zlib is exceptional.
In this article I will first give a quick overview of what zlib is, and where it came from. I'll then discuss the existing interface, and look at the wrapper class I created for my own use. Finally, I'll talk about how I integrated zlib into a command line test program and a general purpose file compression OCX. After reading this article you should feel comfortable using zlib for many, if not all, of your data compression needs.
Where there's a need...
I recently received an email message from Scott Steketee, a developer of The Geometer's Sketchpad, an educational/visualization tool:
Do you know of a product which can be used in place of the Windows files compress.exe and expand.exe in such an installation? It's obvious even to me that the compression achieved by the standard Windows programs is not very good, and I imagine I could keep the current number of disks just by improving the ratio somewhat.
I frequently see messages similar to this in Internet newsgroups like comp.compression and comp.lang.c, or on various CompuServe forums. It seems that the requirement to efficiently compress or expand files is a fairly common one. Often, programmers find that their needs aren't being filled by the Windows API functions in LZEXPAND.DLL (no compression functions), and don't want to spend the money needed for third party solutions, such as Greenleaf's ArchiveLib.
Fortunately, I'm able to point Scott towards an excellent solution to his problem: zlib. zlib is a library of C routines that can be used to compress or expand files using the same deflate algorithm popularized by PKZIP 2.0. zlib is efficient, portable, and free.
A Brief History of zlib
The origins of zlib can be found in the history of Info-ZIP. Info-ZIP is a loosely organized group of programmers who give the following reason for their existence:
Info-ZIP's purpose is to provide free, portable, high-quality versions of the Zip and UnZip compressor-archiver utilities that are compatible with the DOS-based PKZIP by PKWARE, Inc.
These free versions of Zip and UnZip are world class programs, and are in wide use on platforms ranging from the orphaned Amiga through MS-DOS PCs up to high powered RISC workstations. But these programs are designed to be used as command line utilities, not as library routines. People have found that porting the Info-ZIP source into an application could be a grueling exercise.
Fortunately for all of us, two of the Info-ZIP gurus took it upon themselves to solve this problem. Mark Adler and Jean-loup Gailly single-handedly created zlib, a set of library routines that provide a safe, free, and non-patented implementation of the deflate compression algorithm.
One of the driving reasons behind zlib's creation was for use as the compressor for PNG format graphics. After Unisys belatedly began asserting their patent rights to LZW compression, programmers all over the world were thrown into a panic over the prospect of paying royalties on their GIF decoding programs. The PNG standard was created to provide an unencumbered format for graphics interchange. The zlib version of the deflate algorithm was embraced by PNG developers, not only because it was free, but also because it compressed better than the original LZW compressor used in GIF files.
zlib turns out to be good for more than graphics developers, however. The deflate algorithm makes an excellent general purpose compressor, and as such can be incorporated into all sorts of different software. For example, I use zlib as the compression engine in Greenleaf's ArchiveLib, a data compression library that works with ZIP archives. Its performance and compatibility mean I didn't have to reinvent the wheel, saving precious months of development time.
zlib's interface
As a library developer, I know that interfaces make or break a library. Performance issues are important, but if an awkward API makes it impossible to integrate a library into your program, you've got a problem.
zlib's interface is confined to just a few simple function calls. The entire state of a given compression or decompression session is encapsulated in a C structure of type z_stream, whose definition is shown in Figure 1.
Figure 1
The z_stream object definition
Using the library to compress or decompress a file or other data object consists of three main steps:
1. Creating a z_stream object.
2. Processing input and output, using the z_stream object to communicate with zlib.
3. Destroying the z_stream object.
An overview of the process is shown in Figure 2.

Figure 2
An overview of the process
Steps 1 and 3 of the compression process are done using conventional function calls. The zlib API, documented in header file zlib.h, prototypes the following functions for initialization and termination of the compression or decompression process:
- deflateInit()
- inflateInit()
- deflateEnd()
- inflateEnd()
Step 2 is done via repeated calls to either inflate() or deflate(), passing the z_stream object as a parameter. The entire state of the process is contained in that object, so there are no global flags or variables, which allows the library to be completely reentrant. Storing the state of the process in a single object also cuts down on the number of parameters that must be passed to the API functions.
When performing compression or decompression, zlib doesn't perform any I/O on its own. Instead, it reads data from an input buffer pointer that you supply in the z_stream object. You simply set up a pointer to the next block of input data in member next_in, and place the number of available bytes in the avail_in member. Likewise, zlib writes its output data to a memory buffer you set up in the next_out member. As it writes output bytes, zlib decrements the avail_out member until it drops to 0.
Given this interface, Step 2 of the compression process for an input file and an output file might look something like this:
Figure 3
The code to implement file compression
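To make the buffer handshake concrete, here is a minimal sketch of the refill-and-flush loop. It deliberately does not call zlib: `FakeStream` and `fake_deflate()` are hypothetical stand-ins that reuse the z_stream member names next_in, avail_in, next_out, and avail_out and simply copy bytes, and `pump()` is an illustrative driver that works on in-memory strings rather than files. Real code would call deflate() with a flush argument, check its return code, and finish the stream with Z_FINISH.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <string>

// Hypothetical stand-in for z_stream: only the four buffer members
// discussed in the text, with the same names and meaning.
struct FakeStream {
    const unsigned char* next_in = nullptr;  // next unread input byte
    unsigned avail_in = 0;                   // input bytes left at next_in
    unsigned char* next_out = nullptr;       // where the next output byte goes
    unsigned avail_out = 0;                  // free space left at next_out
};

// Hypothetical stand-in for deflate(): copies bytes unchanged, but consumes
// input and fills output using the same pointer/count bookkeeping.
void fake_deflate(FakeStream& s) {
    unsigned n = std::min(s.avail_in, s.avail_out);
    std::memcpy(s.next_out, s.next_in, n);
    s.next_in += n;
    s.avail_in -= n;
    s.next_out += n;
    s.avail_out -= n;
}

// Drives the stream over an in-memory "file". Small chunk sizes force both
// the input-refill path and the output-flush path to run.
std::string pump(const std::string& input, unsigned in_chunk, unsigned out_chunk) {
    assert(in_chunk > 0 && out_chunk > 0 && out_chunk <= 64);
    unsigned char out_buf[64];
    std::string output;
    std::size_t pos = 0;

    FakeStream s;
    s.next_out = out_buf;
    s.avail_out = out_chunk;

    for (;;) {
        if (s.avail_in == 0 && pos < input.size()) {
            // Input exhausted: point the stream at the next block.
            std::size_t n = std::min<std::size_t>(in_chunk, input.size() - pos);
            s.next_in = reinterpret_cast<const unsigned char*>(input.data()) + pos;
            s.avail_in = static_cast<unsigned>(n);
            pos += n;
        }
        if (s.avail_in == 0) break;  // nothing left to feed

        fake_deflate(s);

        if (s.avail_out == 0) {
            // Output buffer full: write it out and reset the pointers.
            output.append(reinterpret_cast<const char*>(out_buf), out_chunk);
            s.next_out = out_buf;
            s.avail_out = out_chunk;
        }
    }
    // Write out whatever remains in a partially filled output buffer.
    output.append(reinterpret_cast<const char*>(out_buf), out_chunk - s.avail_out);
    return output;
}
```

Because the stand-in only copies, `pump()` returns its input unchanged whatever chunk sizes are used; the point is the bookkeeping around avail_in and avail_out, which is the part that carries over to a real deflate() or inflate() loop.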
This method of handling I/O frees zlib from having to implement system dependent read and write code, and it ensures that you can use the library to compress any sort of input stream, not just files. It's simply a matter of replacing the wrapper code shown above with a version customized for your data stream.
Wrapping it up
zlib's versatility is one of its strengths, but I don't always need all that flexibility. For example, to perform the simple file compression task Scott asked about at the start of this article, it would be nice to just be able to call a single function to compress a file, and another function to decompress. To make this possible, I created a wrapper class called ZlibEngine.
ZlibEngine provides a simple API that automates the compression and decompression of files and uses virtual functions to let you customize your user interface to zlib. The class definition is shown in its entirety in Figure 4. There are two different groups of members that are important to you in ZlibEngine. The first is the set of functions providing the calling interface to the engine. The second is the set of functions and data members used to create a user interface that is active during the compression process.
Figure 4
The ZlibEngine wrapper class
The Calling API
There are three C++ functions that implement the API needed to perform simple compression and decompression. Before using the engine, you must call the constructor, the first function. Since ZlibEngine is derived from the z_stream object used as the interface to zlib, the constructor is in effect also creating a z_stream object that will be used to communicate with zlib. In addition, the constructor initializes some of the z_stream member variables that will be used in either compression or decompression.
The two remaining functions are nice and simple: compress() compresses a file using the deflate algorithm. An optional level parameter sets a compression factor between 9 (maximum compression) and 0 (no compression). decompress() decompresses a file, as you would expect. The compression level parameter isn't necessary when decompressing, due to the nature of the deflate algorithm. Both of these functions return an integer status code, defined in the zlib header file zlib.h. Z_OK is returned when everything works as expected. Note that I added an additional code, Z_USER_ABORT, used for an end user abort of the compression or decompression process.
The wrapper class makes it much easier to compress or decompress files using zlib. You only need to remember three things:
1. Include the header file for the wrapper class, zlibengn.h.
2. Construct a ZlibEngine object.
3. Call the member functions compress() or decompress() to do the actual work.
This means you can now perform compression with code this simple:

    #include <zlibengn.h>

    int foo()
    {
        ZlibEngine engine;
        return engine.compress( "INPUT.DAT", "INPUT.DA_");
    }

That's about as simple as you could ask for, isn't it?
The User Interface API
The calling API doesn't really make much of a case for creating the ZlibEngine class. Based on what you've seen so far, the compress() and decompress() functions don't really need to be members of a class. In theory, a global compress() function could just instantiate a z_stream object when called, without the caller even being aware of it.
The reason for creating this engine class is found in a completely different area: the user interface. It's really nice to be able to track the progress of your compression job while it's running. Conventional C libraries have to make do with callback functions or inflexible standardized routines in order to provide feedback, but C++ offers a better alternative through the use of virtual functions.
The ZlibEngine class has two virtual functions that are used to create a useful user interface: progress() is called periodically during the compression or decompression process, with a single integer argument that tells what percentage of the input file has been processed. status() is called with status messages during processing.
Both of these virtual functions have access to the ZlibEngine protected data element, m_AbortFlag. Setting this flag to a non-zero value will cause the compression or decompression routine to abort immediately. This easily takes care of another sticky user interface problem found when using library code.
Writing your own user interface then becomes a simple exercise. You simply derive a new class from ZlibEngine, and define your own versions of one or both of these virtual functions. Instantiate an object of your class instead of ZlibEngine, and your user interface can be as spiffy and responsive as you like!
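The pattern described here can be sketched in isolation. In the example below, `MiniEngine` is a hypothetical miniature, not the ZlibEngine source: its `run()` loop stands in for the compression loop, calls the virtual progress() after each block of work, and stops as soon as a derived class sets m_AbortFlag. `AbortAtHalf` is likewise invented for illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical miniature of the virtual-function UI pattern.
class MiniEngine {
public:
    virtual ~MiniEngine() {}

    // Pretends to process `total` units of input in blocks of `step` units.
    // Returns how many units were handled before finishing or aborting.
    std::size_t run(std::size_t total, std::size_t step) {
        assert(step > 0);
        m_AbortFlag = 0;
        std::size_t done = 0;
        while (done < total && !m_AbortFlag) {
            std::size_t remaining = total - done;
            done += (remaining < step) ? remaining : step;
            // Report the percentage of input processed so far.
            progress(static_cast<int>(done * 100 / total));
        }
        return done;
    }

protected:
    virtual void progress(int /*percent*/) {}  // default: no user interface
    int m_AbortFlag = 0;                       // non-zero requests an abort
};

// A derived class supplies the user interface by overriding progress().
class AbortAtHalf : public MiniEngine {
public:
    std::vector<int> seen;  // percentages reported so far

protected:
    void progress(int percent) override {
        seen.push_back(percent);
        if (percent >= 50) m_AbortFlag = 1;  // simulate the user cancelling
    }
};
```

The caller never passes a callback pointer or user-data argument; the derived object carries its own state (here the `seen` vector), which is the advantage over a C-style callback that the text points to.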
Command line compression
I wrote a simple command line test program to demonstrate the use of class ZlibEngine. zlibtest.cpp does a simple compress/decompress cycle of the input file specified on the command line. I implement a progress function that simply prints out the current percent towards completion as the file is processed:
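
    #include <conio.h>     // kbhit(), getch() (DOS/Windows compilers)
    #include <stdio.h>     // printf()
    #include <zlibengn.h>  // ZlibEngine

    class MyZlibEngine : public ZlibEngine {
    public :
        // Called periodically with the percentage of input processed.
        void progress( int percent )
        {
            // Print the percentage, then back up so the next update overwrites it.
            printf( "%3d%%\b\b\b\b", percent );
            // Any keypress aborts the operation.
            if ( kbhit() ) {
                getch();
                m_AbortFlag = 1;
            }
        }
    };
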
Since class ZlibEngine is so simple, the derived class doesn't even have to implement a constructor or destructor. The derived version of progress() is able to provide user feedback as well as an abort function with just a few lines of code. zlibtest.cpp is in the zip file linked to at the end of the article.
The OCX
To provide a slightly more complicated test of class ZlibEngine, I created a 32 bit OCX using Visual C++ 4.1. The interface to an OCX is defined in terms of methods, events, and properties. ZlibTool.ocx has the following interface:
| Kind | Members |
|---|---|
| Properties | InputFile, OutputFile, Level, Status |
| Methods | Compress(), Decompress(), Abort() |
| Events | Progress() |
(Note that I chose to pass status information from the OCX using a property, not an event.)
ZlibTool.ocx is a control derived from a standard Win32 progress bar. The progress bar gets updated automatically while compressing or decompressing, so you get some user interface functionality for free. Using it with Visual Basic 4.0 or Delphi 2.0 becomes a real breeze. After registering the OCX, you can drop a copy of it onto your form and use it with a minimal amount of coding.
Both the source code for the OCX and a sample Delphi 2.0 program are available from the links at the end of this article. A screen shot of the Delphi program in action is shown in Figure 5.

Figure 5
The Delphi 2.0 OCX test program
Reference material
The source code that accompanies this article can be downloaded from this Web page. It contains the following source code collections:
- The complete source for zlib
- The Visual C++ 4.1 project for the ZlibTool OCX
- The Delphi 2.0 project that exercises the OCX
- The console test program that exercises the ZlibEngine class
Each of the subdirectories contains a README.TXT file with documentation describing how to build and use the programs.
The source is split into two archives:
- zlibtool.zip: All source code and the OCX file.
- zlibdll.zip: The supporting MFC and VC++ DLLs. Many people will already have these files on their systems: MFC40.DLL, MSVCRT40.DLL, and OLEPRO32.DLL.
I haven't discussed the zlib code itself in this article. The best places to start gathering information about how to use zlib and the Info-ZIP products are their home pages. Both pages also have links to the most current versions of their source code:
Info-ZIP: http://www.info-zip.org
zlib: http://www.gzip.org/zlib
Once you download the zlib code, the quick start documentation is found in source file zlib.h. If you cook up any useful code that uses zlib, you might want to forward copies to Greg Roelofs for inclusion on the zlib home page. Greg maintains the zlib pages, and you can reach him via links found there.
Feel-good plug
zlib can do a lot more than just compress files. Its versatile interface can be used for streaming I/O, in-memory compression, and more. Since Jean-loup Gailly and Mark Adler were good enough to make this capable tool available to the public, it only makes sense that we take advantage of it. I know I have, and I encourage you to do the same.
40 users commented in " zlib – Looking the Gift Code in the Mouth "
Thanks for the article. You explained what I wanted from the headers, but they're not that concise.
I'm going off to implement :-)
Thanks.
My only question is whether zlib is able to decompress a gzip archive or file ? Thank you.
@eh936:
The zlib library does not have functions to decompress gzip files.
The zlib distribution contains complete programs that will decompress gzip files.
Actually, the zlib library has functions to compress and decompress gzip files. Also the zlib in-memory compression and decompression functions (deflate and inflate) can process gzip streams on request. Please read zlib.h for more information.
@madler:
Sorry for the misrepresentation. I believe I was reliving 1.1.4 or earlier, in which gzip files had to be managed using the contributed program minigzip.
I will make sure I get caught up.
- Mark
Apparently, there is a problem with this code.
It extracts only a part of a compressed file.
I have 6 MB of single file, suppose after compression if it becomes 2MB of file. i want zlib should break this 2MB compress file into small chunks, and while decompression it should use all this small chunks to make this 6MB of single file.
Can it be possible to do with zlib's existing function.
@eden:
No, zlib doesn't do anything like this.
- Mark
I try to register ZlibTool.ocx with Windows Vista and get Error 0x80040200. Any idea?
@eden:
You need to be running from an elevated command prompt. Run cmd.exe as admin and try again.
- Mark
Thank you! This solved my problem.
- Reinhold
The interface for inflating GZIP files is far from obvious.
Even following
http://www.zlib.net/zlib_how.html
there is not enough explanation
I changed
ret = inflateInit(&strm);
to
ret = inflateInit2(&strm, 15);
but there is no indication as to what value windowsBits should be for GZIP files.
A few lines later and inflate() returns Z_DATA_ERROR (-3)
and there is zero clue as to why this is.
It is frustrating that what examples there is, is geared to ZLIB's own format and no code examples for decoding GZIP format (or what you have to do to inflate GZIP files).
WINZIP 12.1 has no problem expanding the file so I must be doing something wrong.
I wish for complete example code to inflate GZIP files using ZLIB DLL if possible.
A further comment:
minigzip_d.exe that comes with ZLIB123-DLL.ZIP fails to unzip a GZIP file. Nothing occurs.
In contrast, GZIP.EXE succeeds. And on using -l -v to list details of the GZIP file, it reports that it is using the "deflat" method, the very method ZLIB claims to handle.
All very disappointing
@Stephen:
The only time I've used zlib with gzip format files, I have been successful with minigzip.c.
I would strongly suggest that you pose this as a question on comp.compression - you will almost certainly get a good answer, especially if you can point to a copy of the offending file.
- Mark
Hi Mark
I am just wondering about the use of your ZlibEngine wrapper. What is the use of a wrapper and does it mean that data that are compressed using different wrappers such as zlib wrappers, gzip wrappers, or no wrappers at all can only be decompressed using programs with the same wrapper? How can I decompress a file with gzip headers using zlib? I am still new to C programming and still confused with the idea of wrappers.
Thanks for answering.
~Benson~
@Benson:
The wrapper just provides a convenient way to access zlib, helping with some initialization functions and the like.
If you want to decompress a gzip file using zlib, I suggest you look at the program minigzip which accompanies the zlib distribution.
- Mark
How can I use your .dll from Visual Basic?
Can you provide me a VB sample.
Thanks.
@Ashok:
I recommend using the Info-ZIP DLLs. Here's an example:
http://www.vbaccelerator.com/home/Vb/Code/Libraries/Compression/Introduction_to_the_Info-ZIP_Libraries/article.asp
I used ZlibTool.ocx in Visual Basic - and when I ran the program under the VB IDE everything went fine.
But when I compile a *.exe of that program and run it I just get 'welcomed' by an error messagebox that says:
"System Error &H80004005. Unknown Error."
Well to shorten the story here is a quick workaround you have to add to each form that uses the ZlibTool:
Since the progressbar is not a standard Windows control (like an editbox, label or commandbutton) you need to call InitCommonControls() to set up the window classes before you can create instances of them, for example one or two CommonControl progressbars.
Of course it's better to fix that problem at its root. But then you'll need to compile the ZlibTool project, which may take some time if you haven't installed M$ Visual C++, and maybe some extra time to fix linker problems such as missing *.lib files or duplicated exports, plus changing 'exotic' DLL imports like 'MFC42D.DLL' that are not included in the DLLs that came with the Windows installation.
So here we go:
zlibtool\ZlibToolCtl.cpp
Okay, at the end you might ask why the error doesn't come up when running the program under the VB6 IDE?
Well, when it's running in the VB6 IDE, it runs in the same process space as the IDE, and since that IDE also makes use of CommonControls it has already called InitCommonControls(), so the bug does not show any effect.
So that's it.
@cw2k:
Thanks for the excellent support. That OCX is pretty old, it's pretty cool that you are able to get it to still run.
- Mark
Hi All,
Is it possible for me to use this ocx to perform a compression of a byte array in VB?
Thanks and Best Regards,
Boon Hui
@Zuffi:
Not without some changes. It compresses files right now.
@Stephen Howe
From the zlib FAQ:
Well that's nice, but how do I make a gzip file in memory?
You can request that deflate write the gzip format instead of the zlib format using deflateInit2(). You can also request that inflate decode the gzip format using inflateInit2(). Read zlib.h for more details.
http://zlib.net/zlib_faq.html#faq20
According to zlib.h:
windowBits can also be greater than 15 for optional gzip decoding. Add 32 to windowBits to enable zlib and gzip decoding with automatic header detection, or add 16 to decode only the gzip format (the zlib format will return a Z_DATA_ERROR). If a gzip stream is being decoded, strm->adler is a crc32 instead of an adler32.
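Going by the zlib.h passage quoted above, the windowBits value to hand to inflateInit2() can be worked out as below. This is only a small sketch: the `StreamFormat` enum and the `inflate_window_bits()` helper are hypothetical names, and 15 is assumed as the base window size (the value used in the inflateInit2() call earlier in this thread).

```cpp
#include <cassert>

// Hypothetical names for the three decoding modes the zlib.h text describes.
enum class StreamFormat { Zlib, Gzip, AutoDetect };

// Returns a windowBits value for inflateInit2(), assuming a base window
// size of 15 bits.
int inflate_window_bits(StreamFormat format) {
    const int base = 15;
    switch (format) {
        case StreamFormat::Zlib:       return base;       // zlib format only
        case StreamFormat::Gzip:       return base + 16;  // gzip format only
        case StreamFormat::AutoDetect: return base + 32;  // zlib or gzip, detected from the header
    }
    assert(false && "unknown StreamFormat");
    return base;
}
```

Under these assumptions, a gzip-only decoder would pass 31 and an auto-detecting decoder 47, whereas passing plain 15 asks for the zlib format and makes a gzip file fail with Z_DATA_ERROR, as the quoted text says.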
This is very usefull zip ocx like ever
Hi,
I have a file which is 5000000 bytes size and after gzip compression it is coming out to 5123 bytes. I tried to use zlib to uncompress the same file but it is not getting me the last 2880 bytes. Why is this ? I have followed the procedure as given in zlib manual. I am using the zlib version 1.2.3.
Need some guidance on using the inflate function.
@Srivatsan:
Suggest you first make sure you can unzip the file using minigzip, which comes with the zlib distribution. If that works, compare its code to yours and see where the problem lies.
- Mark
I have used your zlibocx2 dll (now only available at http://www.dogma.net/markn/ZlibOCX2.dll) for almost 10 years now in a VB6 project and have compressed and decompressed millions of files over that period without a problem. However now I have to port to C# 64bit and I have been trying some of the C# ports out there for zlib. They seem to work in general, but I am occasionally having problems with some of the files that were originally compressed with zlibocx2 that seem to be related to buffer problems on deflation. Could you make the source code for zlibocx2 available so that I could try and figure out what the issue is or do you have any pointers as to what it could be?
@Skreener:
There are links to all the source code at the bottom of this article.
Note that zlib has been through quite a few revisions since the code was first posted.
- Mark
Thanks for the reply Mark. The links seem to be for the ZlibTool.ocx but I can't see the code for ZlibOCX2.ocx. I think it was done in VC++5 back in 2001 to support VB5, or is the code essentially the same for both using the same ZlibEngine class?
Just to say thanks for what you have contributed to the programming world! I know it has helped me.
@Skeeter:
Unfortunately, what you see is what I have.
I think the best bet for you is to use the Info-Zip DLLs with VB. This wasn't possible back in 2001, but it works just fine now:
http://www.info-zip.org/
The Info-Zip DLLs are pretty well supported, and you will have an easier time keeping up with bug fixes and any future enhancements.
- Mark
A quick Thanks. I hope to work Zlib to death.
Now to learn something new...
I feel as dumb as a sack of hammers.
I am attempting to decompress a file compressed by Zlib.
It says "incorrect header check" I am rather frustrated so I may be missing the simple.
Do I have to add a header to the compressed? I assumed Compress() via zlib would be a complete file.
I get tired after a couple weeks of coding so I may be making a simple mistake but I ask anyway.
I have fetched the manual and have some web-pages to read. I may figure out what I am doing wrong before I read any answer but hey..
Other than that I am loving Zlib. It reports almost as much compression as the Gzip on my Linux Platform.
Ernst
Got it going. It was Me.
So it's now my front end.
I have to say I find it comforting to have data compression in the form of an in memory function. It was rather awkward to be testing results manually however observations like what does Gzip do better than Bzip and why is it Gzip can compress a Bzip archive sometimes, are not possible if we are not hands on too!
I have to agree with Mark in that it is Gift Code we should accept!
Thanks Mark for this thread.. 'Moving on Up..."
Hi,
I am writing a program using Visual Basic and I am wondering how to import the zlib lib and use it in my code to decompress an array of bytes where the compression ratio is 95%
Please need your reply as soon as possible.
Thank you,
Liliane
Everything I know is in the article!
- Mark
Thanks for your post. Of course, I needed the opposite, I had one I needed to decompress. So I reversed your algorithm and here is the result:
Hello,
Would be any problems to use or adapt it with VisualC++.net 2003? Looking forward to your answer.
M.P.
@Misulachis:
Sorry for the slow response, this comment got spam filtered, not sure why gmail did that.
This code should work fine with Visual C++ 2003.
Hello,
I came to know recently that our application is encrypting images in zlib format and stored into SQL server. I am not much into coding, but do you have any idea on how to decrypt these images from SQL server table and stored on hard drive?
Regards,
So data compressed with zlib isn't exactly encrypted, it is just compressed.
Decompressing the data can probably be done pretty easily using a python script, for example, but I don't think most SQL servers have a built-in function to handle zlib compression.
Step 1: what are the first two bytes of the data you think is compressed with zlib?
-Mark
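One way to look at those first two bytes, as a rough sketch: gzip data starts with the magic bytes 0x1f 0x8b (RFC 1952), while a zlib stream starts with a CMF/FLG pair whose method nibble is 8, whose window-size nibble is at most 7, and whose 16-bit big-endian value is a multiple of 31 (RFC 1950). `Container` and `sniff_container()` are hypothetical names, and a match on two bytes is only a hint, not proof that the data really is compressed.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical classification of a buffer by its first two bytes.
enum class Container { Zlib, Gzip, Unknown };

Container sniff_container(const unsigned char* data, std::size_t len) {
    assert(data != nullptr || len == 0);
    if (len < 2) return Container::Unknown;

    // gzip: fixed magic number 0x1f 0x8b.
    if (data[0] == 0x1f && data[1] == 0x8b) return Container::Gzip;

    // zlib: CMF low nibble is the method (8 = deflate), high nibble is the
    // window size code (at most 7), and CMF*256 + FLG must be divisible by 31.
    unsigned cmf = data[0];
    unsigned flg = data[1];
    bool deflate_method = (cmf & 0x0f) == 8;
    bool window_ok = (cmf >> 4) <= 7;
    bool check_ok = ((cmf << 8) | flg) % 31 == 0;
    if (deflate_method && window_ok && check_ok) return Container::Zlib;

    return Container::Unknown;
}
```

A common zlib header such as 0x78 0x9c passes the check, while the "PK" signature of a ZIP archive (0x50 0x4b) does not. An image column that reports Unknown here is probably stored in some other form than a bare zlib stream.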