Compression on GPUs


Growing needs for efficient storage management and better utilization of network bandwidth through reduced data transfer have led the computing community to consider data compression as a solution. However, compression introduces extra overhead, and performance can suffer. The key factors in the decision to use compression are execution time and compression ratio. Because of the potential negative performance impact, compression is often neglected.
General-purpose computing on graphics processing units (GPUs) introduces new opportunities wherever parallelism is available. Our previous work targets GPU-based systems by exploiting the parallelism in compression algorithms. In this project we want to test the existing CUDA implementation of the Lempel-Ziv-Storer-Szymanski (LZSS) lossless data compression algorithm, along with an additional streaming feature.
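To make the algorithm concrete, the following is a minimal serial sketch of LZSS-style encoding, assuming a greedy longest-match search over a sliding window. The constants and the token layout (parallel arrays instead of packed flag bits) are illustrative simplifications, not the format used by our CUDA implementation.

```c
#define WINDOW   4096   /* sliding-window size searched for matches */
#define MIN_MATCH   3   /* shorter matches are emitted as literals */
#define MAX_MATCH  18   /* longest back-reference we encode */

/* Encode `in` (length n) into a token stream: a token is either a literal
 * byte (off == 0, the byte in out_lit[]) or a back-reference
 * (out_off[], out_len[]) into the already-encoded window.
 * Returns the number of tokens produced. */
static int lzss_encode(const unsigned char *in, int n,
                       int out_off[], int out_len[], unsigned char out_lit[])
{
    int i = 0, t = 0;
    while (i < n) {
        int best_len = 0, best_off = 0;
        int start = i > WINDOW ? i - WINDOW : 0;
        /* greedy search: longest match starting anywhere in the window */
        for (int j = start; j < i; j++) {
            int len = 0;
            while (len < MAX_MATCH && i + len < n && in[j + len] == in[i + len])
                len++;
            if (len > best_len) { best_len = len; best_off = i - j; }
        }
        if (best_len >= MIN_MATCH) {
            out_off[t] = best_off; out_len[t] = best_len; out_lit[t] = 0;
            i += best_len;          /* skip past the matched run */
        } else {
            out_off[t] = 0; out_len[t] = 0; out_lit[t] = in[i];
            i++;                    /* emit a single literal */
        }
        t++;
    }
    return t;
}
```

A real LZSS coder packs the literal/match flags into bit vectors; the GPU version additionally partitions the input into chunks so that many such encoders can run in parallel.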

Intellectual Merit

Our previous implementation of the LZSS algorithm on GPUs significantly improves the performance of the compression process compared to a CPU-based implementation, without any loss in compression ratio, which can help GPU-based clusters address bandwidth problems. Our system outperforms the serial CPU LZSS implementation by up to 18x, the parallel multithreaded version by up to 3x, and the BZIP2 program by up to 6x in terms of compression time, showing the promise of CUDA systems for lossless data compression. To give programmers an easy-to-use tool, our work also provides an API for in-memory compression, without the need for reading from and writing to files, in addition to the version involving I/O.
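To illustrate what a buffer-to-buffer interface looks like in contrast to the file-based path, here is a hypothetical sketch. The function name and signature are assumptions, not the actual CULZSS API, and the body is a pass-through stub standing in for the GPU compressor.

```c
#include <string.h>

/* Hypothetical in-memory compression entry point: reads src_len bytes from
 * src, writes the result into dst, and returns the number of bytes written,
 * or 0 if the destination buffer is too small. The memcpy body is only a
 * placeholder for the actual LZSS kernel launch and device transfers. */
size_t compress_buffer(const unsigned char *src, size_t src_len,
                       unsigned char *dst, size_t dst_cap)
{
    if (dst_cap < src_len)      /* a real codec would bound its output size */
        return 0;
    memcpy(dst, src, src_len);  /* placeholder for GPU compression */
    return src_len;
}
```

The point of this shape is that callers hand over buffers they already own, so compression can be dropped into an existing data path (for example, before a network send) without a round trip through the filesystem.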

Broader Impact

Making the best use of expensive resources is crucial in high performance computing. Resources such as memory, network bandwidth, and processing units are the key elements in achieving good performance, and their consumption needs to be carefully planned. Data compression helps utilize space-limited resources more efficiently. As nothing comes for free, there are also tradeoffs in the decision to use compression. One of the main issues is the increase in running time: additional computation results in longer total execution times. Our redesigned implementation for the CUDA framework aims to reduce the cost of compression time compared to CPU-based compression implementations.

Use of FutureGrid

I would like to test our CUDA LZSS compression implementation on the FutureGrid GPU nodes. I also want to test the new streaming functionality.

Scale Of Use

One or two VMs for the experiment. The timeline includes 2-3 weeks of development, followed by testing of the code.



The previous work has been published under the title "CULZSS: LZSS Lossless Data Compression on CUDA". This project will be a further improvement of that work.
Adnan Ozsoy
Indiana University Bloomington


