Evaluation of Basic Data Compression Algorithms in a Distributed Environment

Minhaj Ahmad Khan

http://dx.doi.org/10.6000/1927-5129.2012.08.02.18

Abstract: Data compression methods aim to reduce data size in order to make data transfer more efficient. Basic algorithms such as Huffman, Lempel-Ziv (LZ), Shannon-Fano (SF), and Run-Length Encoding (RLE) are widely used for this purpose, and most applications incorporate variants of these algorithms. This paper analyzes the execution time, compression ratio, and efficiency of these compression methods in a client-server distributed environment. The data from a client is distributed to multiple processors/servers, compressed by the servers at remote locations, and sent back to the client. Our experiments were carried out using the SimGrid framework. Our results show that the LZ algorithm attains better efficiency/scalability and compression ratio; however, it runs slower than the other algorithms.

Keywords: Distributed Computing, Compression, LZ, Shannon-Fano, RLE, Huffman, Scalability.