I have about 200 GB of log data generated daily, distributed among about 150 different log files. I have a script that moves the files to a temporary location and does a tar-bz2 on the temporary directory. I get good results: the 200 GB of logs are compressed to about 12-15 GB. The problem is that it takes forever to compress the files. The cron job runs at 2:30 AM daily and continues to run until 5:00-6:00 PM.

Is there a way to improve the speed of the compression and complete the job faster? Any ideas? Don't worry about other processes: the location where the compression happens is on a NAS, and I can mount the NAS on a dedicated VM and run the compression script from there.

The first step is to figure out what the bottleneck is: is it disk I/O, network I/O, or CPU? (See the diagnostic sketch at the end of this answer.)

If the bottleneck is disk I/O, there isn't much you can do. Make sure that the disks don't serve many parallel requests, as that can only decrease performance.

If the bottleneck is network I/O, run the compression process on the machine where the files are stored: running it on a machine with a beefier CPU only helps if the CPU is the bottleneck.

If the bottleneck is the CPU, the first thing to consider is using a faster compression algorithm. Bzip2 isn't necessarily a bad choice (its main weakness is decompression speed), but you could use gzip and sacrifice some size for compression speed, or try out other formats such as lzop or lzma. You might also tune the compression level: bzip2 defaults to -9 (maximum block size, so maximum compression, but also the longest compression time); set the environment variable BZIP2 to a value like -3 to try compression level 3 instead. Examples of both follow below.

This thread and this thread discuss common compression algorithms; in particular, the blog post cited by derobert gives some benchmarks which suggest that gzip -9 or bzip2 at a low level might be a good compromise compared to bzip2 -9. Another benchmark, which also includes lzma (the algorithm of 7zip, so you might use 7z instead of tar --lzma), suggests that lzma at a low level can reach the bzip2 compression ratio faster. Just about any choice other than bzip2 will improve decompression time.
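A quick way to check where the time goes is to watch CPU and disk utilization while the job runs. A minimal sketch, assuming a Linux VM with the sysstat tools installed; the log path /mnt/nas/logs is hypothetical:

```sh
# Time one run of the existing job: a "real" time much larger than
# "user" + "sys" means the process spends most of its life waiting
# on I/O rather than compressing.
time tar -cjf /tmp/logs.tar.bz2 /mnt/nas/logs

# In another terminal while the job runs: per-device I/O statistics.
# A saturated device (%util near 100) with an idle CPU points to I/O.
iostat -x 5

# Network throughput per interface, relevant if the NAS is mounted
# over NFS or CIFS. A single bzip2 pegging one core points to the CPU.
sar -n DEV 5
```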
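To lower the bzip2 level without touching the tar invocation itself, you can rely on bzip2 reading its default options from the environment, as the answer describes. A minimal sketch (the paths are hypothetical):

```sh
# bzip2 reads default options from the BZIP2 environment variable,
# and the bzip2 spawned by tar -j inherits the environment, so this
# compresses at level 3 (smaller block size: faster, slightly larger).
BZIP2=-3 tar -cjf /tmp/logs.tar.bz2 /mnt/nas/logs
```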
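Swapping in a faster compressor is mostly a one-flag change to tar. A sketch of the alternatives mentioned above, assuming a reasonably recent GNU tar and the compressors installed (file names are hypothetical):

```sh
# gzip: much faster than bzip2 at the cost of some compression ratio.
tar -czf /tmp/logs.tar.gz /mnt/nas/logs

# lzop: extremely fast, weaker compression; requires lzop installed.
tar --lzop -cf /tmp/logs.tar.lzo /mnt/nas/logs

# lzma at a low preset: can approach bzip2's ratio in less time.
# Piping also works on tar versions that lack a --lzma option.
tar -cf - /mnt/nas/logs | lzma -2 > /tmp/logs.tar.lzma
```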
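If you try 7z instead of tar --lzma, note that the 7z container does not store Unix ownership and permissions, so piping a tar stream into it keeps that metadata. A sketch with hypothetical names:

```sh
# -si reads the archive data from stdin; -mx=3 selects a low, fast
# LZMA level. The tar wrapper preserves owners and permissions.
tar -cf - /mnt/nas/logs | 7z a -si -mx=3 /tmp/logs.tar.7z
```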