Working with Zip Files in Java

Introduction

In this article I cover the basics of creating, interacting with, inspecting, and extracting zip archive files using Java (OpenJDK 11 to be specific). The sample code is a Gradle project hosted in this GitHub repo, so you can run it and experiment with it. Please exercise caution when changing code that deletes files.

As mentioned already, the code examples here are written in Java 11. They use the var keyword, which was introduced in Java 10, and functional programming features from Java 8, as well as Path.of and Files.writeString, which were added in Java 11. A minimum version of Java 11 is therefore required to run them as is.

Key Java Classes for Working with Zip Archives

I feel it's a good idea to start things off by identifying the prominent classes that are commonly used when dealing with zip archives in Java. The ones used in this article are ZipFile, ZipEntry, and ZipOutputStream from the java.util.zip package, plus Files and Path from the java.nio.file package.

Common File Paths for the Code Examples

For the example code I use two common directories for reading and writing data, both relative to the root of the Gradle project. Take a look at the repo linked in the introduction, or better yet, run the samples. Just keep these two Path variables in mind, as they are often used as the starting directory for inputs and outputs.

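import java.nio.file.Path;

public class App {

    // Output directory for the zip archives created by the examples
    static final Path zippedDir = Path.of("ZippedData");
    // Input directory holding the files and folders to be zipped
    static final Path inputDataDir = Path.of("InputData");

    // ... other stuff
}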

Inspecting the Contents of a Zip Archive

You can instantiate a ZipFile and pass it the path to an existing zip archive, which essentially opens it like any other file, then inspect the contents by iterating over the enumeration of ZipEntry instances it returns from entries(). Note that ZipFile implements the AutoCloseable interface, making it a great candidate for the try-with-resources construct shown below and throughout the examples here.

static void showZipContents() {
    try (var zf = new ZipFile("ZipToInspect.zip")) {

        System.out.printf("Inspecting contents of: %s%n%n", zf.getName());

        Enumeration<? extends ZipEntry> zipEntries = zf.entries();
        zipEntries.asIterator().forEachRemaining(entry -> {
            // getSize() is the uncompressed size in bytes, or -1 if it is not known
            System.out.printf(
                "Item: %s %nType: %s %nSize: %d%n%n",
                entry.getName(),
                entry.isDirectory() ? "directory" : "file",
                entry.getSize()
            );
        });
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Run the Gradle project with the following command:

$ ./gradlew run

This yields the following output for the App.showZipContents method:

> Task :run
Inspecting contents of: ZipToInspect.zip

Item: ZipToInspect/ 
Type: directory 
Size: 0

Item: ZipToInspect/greetings.txt 
Type: file 
Size: 160

Item: ZipToInspect/InnerFolder/ 
Type: directory 
Size: 0

Item: ZipToInspect/InnerFolder/About.txt 
Type: file 
Size: 39

Here you can see that this prints out all files and directories in the zip archive, even the files within directories.
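
As a variation on the enumeration loop above, ZipFile also exposes its entries as a stream through ZipFile.stream(). The following is a minimal sketch rather than code from the sample project: the class ZipListingSketch and its methods entryNames and createSampleZip are hypothetical names, and the sample archive is a throwaway temp file created only so the sketch can run on its own.

```java
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the article's Gradle project.
final class ZipListingSketch {

    // Returns the entry names in the order the archive's central directory lists them.
    static List<String> entryNames(Path zipPath) throws IOException {
        try (var zf = new ZipFile(zipPath.toFile())) {
            return zf.stream()
                     .map(ZipEntry::getName)
                     .collect(Collectors.toList());
        }
    }

    // Builds a tiny throwaway archive (one directory, one file) in the temp directory.
    static Path createSampleZip() throws IOException {
        Path zipPath = Files.createTempFile("listing-sketch", ".zip");
        try (var zos = new ZipOutputStream(Files.newOutputStream(zipPath))) {
            zos.putNextEntry(new ZipEntry("docs/"));
            zos.closeEntry();
            zos.putNextEntry(new ZipEntry("docs/hello.txt"));
            zos.write("hello".getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
        return zipPath;
    }

    public static void main(String[] args) throws IOException {
        Path zipPath = createSampleZip();
        entryNames(zipPath).forEach(System.out::println);
        Files.delete(zipPath);
    }
}
```

Collecting the names inside the try-with-resources block matters here, because the stream can no longer be read once the ZipFile has been closed.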

Extracting a Zip Archive

Extracting the contents of a zip archive onto disk requires little more than replicating the directory structure found inside the ZipFile, where directories are identified via ZipEntry.isDirectory, and then copying the file represented by each remaining ZipEntry onto disk.

static void unzipAZip() {
    var outputPath = Path.of("UnzippedContents").toAbsolutePath().normalize();

    try (var zf = new ZipFile("ZipToInspect.zip")) {

        // Delete if exists, then create a fresh empty directory to put the zip archive contents
        initialize(outputPath);

        Enumeration<? extends ZipEntry> zipEntries = zf.entries();
        zipEntries.asIterator().forEachRemaining(entry -> {
            try {
                var target = outputPath.resolve(entry.getName()).normalize();
                // Skip entries such as "../evil.txt" that would land outside outputPath
                if (!target.startsWith(outputPath)) {
                    System.err.println("Skipping unsafe entry: " + entry.getName());
                    return;
                }
                if (entry.isDirectory()) {
                    Files.createDirectories(target);
                } else {
                    // An archive may not list a directory before the files inside it
                    Files.createDirectories(target.getParent());
                    try (var in = zf.getInputStream(entry)) {
                        Files.copy(in, target);
                    }
                }
            } catch (IOException ei) {
                ei.printStackTrace();
            }
        });
    } catch (IOException e) {
        e.printStackTrace();
    }
}
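
The initialize helper called in the example above is not shown in this article. Below is a minimal sketch of what such a helper could look like. The implementation is an assumption based only on the comment in the example (delete the directory if it exists, then create a fresh empty one) and is not the version from the repo; InitializeSketch is a hypothetical class name. As with any code that deletes files, only point it at a directory you can afford to lose.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Comparator;
import java.util.List;
import java.util.stream.Collectors;

// Hypothetical stand-in for the initialize(Path) helper used in the article.
final class InitializeSketch {

    // Deletes dir and everything below it (if present), then recreates it empty.
    static void initialize(Path dir) throws IOException {
        if (Files.exists(dir)) {
            List<Path> toDelete;
            // Files.walk keeps directory handles open, so close the stream promptly.
            try (var paths = Files.walk(dir)) {
                // Reverse order puts children ahead of their parent directories.
                toDelete = paths.sorted(Comparator.reverseOrder())
                                .collect(Collectors.toList());
            }
            for (Path p : toDelete) {
                Files.delete(p);
            }
        }
        Files.createDirectories(dir);
    }

    public static void main(String[] args) throws IOException {
        Path dir = Files.createTempDirectory("initialize-sketch");
        Files.writeString(dir.resolve("stale.txt"), "old contents");
        initialize(dir);
        System.out.println("Recreated empty directory: " + dir);
        Files.delete(dir);
    }
}
```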

Writing Files Directly into a New Zip Archive

Writing a zip archive is really nothing more than writing a stream of data to some destination (a zip file in this case). Writing data, such as String data, to a zip archive differs only in that each piece of data must be written to the ZipOutputStream right after the ZipEntry it belongs to has been added with putNextEntry.

Again, ZipOutputStream implements the AutoCloseable interface, so it is best used with a try-with-resources statement. The only real catch is to remember to call closeEntry when you are done with each ZipEntry, which makes it clear that the entry should no longer receive data.

static void zipSomeStrings() {
    // entry(...) requires: import static java.util.Map.entry;
    Map<String, String> stringsToZip = Map.ofEntries(
        entry("file1", "This is the first file"),
        entry("file2", "This is the second file"),
        entry("file3", "This is the third file")
    );
    var zipPath = zippedDir.resolve("ZipOfStringData.zip");
    try (var zos = new ZipOutputStream(
                            new BufferedOutputStream(Files.newOutputStream(zipPath)))) {
        for (var entry : stringsToZip.entrySet()) {
            zos.putNextEntry(new ZipEntry(entry.getKey()));
            // Name the charset explicitly instead of relying on the platform default
            zos.write(entry.getValue().getBytes(StandardCharsets.UTF_8));
            zos.closeEntry();
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
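
One way to convince yourself that each string ended up in its own ZipEntry is to read the archive back with ZipInputStream. The sketch below does the round trip entirely in memory, so it touches no files. It is an illustration under stated assumptions, not code from the sample project: StringZipRoundTripSketch, zipStrings, and unzipStrings are hypothetical names, and UTF-8 is assumed as the text encoding on both sides.

```java
import java.io.ByteArrayInputStream;
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.zip.ZipEntry;
import java.util.zip.ZipInputStream;
import java.util.zip.ZipOutputStream;

// Hypothetical helper class, not part of the article's Gradle project.
final class StringZipRoundTripSketch {

    // Writes each map entry as one zip entry and returns the archive bytes.
    static byte[] zipStrings(Map<String, String> contents) throws IOException {
        var bytes = new ByteArrayOutputStream();
        try (var zos = new ZipOutputStream(bytes)) {
            for (var e : contents.entrySet()) {
                zos.putNextEntry(new ZipEntry(e.getKey()));
                zos.write(e.getValue().getBytes(StandardCharsets.UTF_8));
                zos.closeEntry();
            }
        }
        // The stream is closed at this point, so the archive is complete.
        return bytes.toByteArray();
    }

    // Reads the archive back into a map of entry name to decoded text.
    static Map<String, String> unzipStrings(byte[] archive) throws IOException {
        var result = new LinkedHashMap<String, String>();
        try (var zis = new ZipInputStream(new ByteArrayInputStream(archive))) {
            ZipEntry entry;
            while ((entry = zis.getNextEntry()) != null) {
                // readAllBytes stops at the end of the current entry
                result.put(entry.getName(),
                           new String(zis.readAllBytes(), StandardCharsets.UTF_8));
            }
        }
        return result;
    }

    public static void main(String[] args) throws IOException {
        var input = new LinkedHashMap<String, String>();
        input.put("file1", "This is the first file");
        input.put("file2", "This is the second file");
        unzipStrings(zipStrings(input))
            .forEach((name, text) -> System.out.println(name + ": " + text));
    }
}
```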

Zipping an Existing File into a New Zip Archive

If you've copied a file in Java before, then you are essentially already a pro at creating a zip archive from an existing file (or directory, for that matter). Again, the only real difference is that you need to take a little extra care to match each file to the appropriate ZipEntry instance.

In this example I create an input file, FileToZip.txt, write the text "Howdy There Java Friends!" to it, and then use Files.copy(Path, OutputStream) to copy its contents into the matching ZipEntry inside the ZippedFile.zip archive, which I create with a ZipOutputStream instance.

static void zipAFile() {
    var inputPath = inputDataDir.resolve("FileToZip.txt");
    var zipPath = zippedDir.resolve("ZippedFile.zip");
    
    try (var zos = new ZipOutputStream(
                            new BufferedOutputStream(Files.newOutputStream(zipPath)))) {
                            
        Files.writeString(inputPath, "Howdy There Java Friends!\n");

        zos.putNextEntry(new ZipEntry(inputPath.toString()));
        Files.copy(inputPath, zos);
        zos.closeEntry();
    } catch (IOException e) {
        e.printStackTrace();
    }
}

Zipping a Folder into a New Zip Archive

Zipping a non-empty directory becomes a little more involved, especially if you want to maintain empty directories within the parent directory. To keep an empty directory in a zip archive you need to create a ZipEntry whose name ends with a forward slash ("/"), which is how ZipEntry.isDirectory recognizes a directory, and then immediately close it. Use "/" rather than the file system directory separator, since the latter is a backslash on Windows.

In this example I create a directory named "foldertozip" containing the structure shown below, then zip it into a zip archive.

tree .
.
└── foldertozip
    ├── emptydir
    ├── file1.txt
    └── file2.txt

In the following code, notice that I use the Files.walk(Path) method to traverse the directory tree of "foldertozip" and look for empty directories ("emptydir" in this example). When I find one, I append a forward slash to the name given to its ZipEntry, then close the entry as soon as I have added it to the ZipOutputStream instance.

I also use a slightly different approach to writing the non-directory files into the ZipOutputStream than in the last example, purely for the sake of variety.

static void zipADirectoryWithFiles() {
    var foldertozip = inputDataDir.resolve("foldertozip");
    var dirFile1 = foldertozip.resolve("file1.txt");
    var dirFile2 = foldertozip.resolve("file2.txt");

    var zipPath = zippedDir.resolve("ZippedDirectory.zip");
    try (var zos = new ZipOutputStream(
                            new BufferedOutputStream(Files.newOutputStream(zipPath)))) {

        Files.createDirectory(foldertozip);
        Files.createDirectory(foldertozip.resolve("emptydir"));
        Files.writeString(dirFile1, "Does this Java get you rev'd up or what?");
        Files.writeString(dirFile2, "Java Java Java ... Buz Buz Buz!");

        // Files.walk holds open directory handles, so close the stream when done
        try (var paths = Files.walk(foldertozip)) {
            paths.forEach(path -> {
                try {
                    // Zip entry names always use "/", whatever the platform separator is
                    var entryName = inputDataDir.relativize(path).toString()
                            .replace(File.separatorChar, '/');
                    if (Files.isDirectory(path)) {
                        var files = path.toFile().listFiles();
                        if (files == null || files.length == 0) {
                            // A trailing "/" marks the entry as a directory
                            zos.putNextEntry(new ZipEntry(entryName + "/"));
                            zos.closeEntry();
                        }
                    } else {
                        zos.putNextEntry(new ZipEntry(entryName));
                        zos.write(Files.readAllBytes(path));
                        zos.closeEntry();
                    }
                } catch (IOException e) {
                    e.printStackTrace();
                }
            });
        }
    } catch (IOException e) {
        e.printStackTrace();
    }
}
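
The directory naming rule used in this section can be checked directly against ZipEntry.isDirectory, which only looks at whether the entry name ends with "/". The sketch below builds entry names from Path components joined with forward slashes; EntryNameSketch and entryName are hypothetical names introduced for illustration and are not part of the sample project.

```java
import java.nio.file.Path;
import java.util.ArrayList;
import java.util.List;
import java.util.zip.ZipEntry;

// Hypothetical helper class, not part of the article's Gradle project.
final class EntryNameSketch {

    // Builds a zip entry name for path relative to base: the name components are
    // joined with "/", and a trailing "/" is added for directory entries.
    static String entryName(Path base, Path path, boolean directory) {
        List<String> parts = new ArrayList<>();
        for (Path part : base.relativize(path)) {
            parts.add(part.toString());
        }
        String name = String.join("/", parts);
        return directory ? name + "/" : name;
    }

    public static void main(String[] args) {
        Path base = Path.of("InputData");
        Path emptyDir = base.resolve("foldertozip").resolve("emptydir");

        String dirName = entryName(base, emptyDir, true);
        System.out.println(dirName + " isDirectory: "
                + new ZipEntry(dirName).isDirectory());

        // A backslash suffix is not treated as a directory marker
        String wrongName = "emptydir\\";
        System.out.println(wrongName + " isDirectory: "
                + new ZipEntry(wrongName).isDirectory());
    }
}
```

Iterating over the Path components avoids any dependence on the platform separator, so the same entry names come out on Windows and on Unix-like systems.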

Conclusion

In this article I have discussed and demonstrated a modern approach to working with zip archives in Java using only the standard library and no third party libraries. You may also have noticed that I use a few more modern Java language features, such as functional programming constructs and the var keyword for type-inferred variables, along with the Path.of and Files.writeString methods added in Java 11, so please make sure you are using at least Java 11 when running these examples.

As always, thanks for reading and don't be shy about commenting or critiquing below.
