Fast and Efficient Compression of Floating-Point Data
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/TVCG.2006.143
September-October 2006 (vol. 12 no. 5) pp. 1245-1250
   
Large-scale scientific simulation codes typically run on a cluster of CPUs that write/read time steps to/from a single file system. As data sets constantly grow in size, this increasingly leads to I/O bottlenecks. When the rate at which data is produced exceeds the available I/O bandwidth, the simulation stalls and the CPUs sit idle. Data compression can alleviate this problem by spending some CPU cycles to reduce the amount of data that must be transferred. Most compression schemes, however, are designed to operate offline and seek to maximize compression, not throughput. Furthermore, they often require quantizing floating-point values onto a uniform integer grid, which rules them out for applications where exact values must be retained. We propose a simple scheme for lossless, online compression of floating-point data that transparently integrates into the I/O of many applications. A plug-in scheme for data-dependent prediction makes our scheme applicable to a wide variety of data used in visualization, such as unstructured meshes, point sets, images, and voxel grids. We achieve state-of-the-art compression rates and speeds, the latter in part due to an improved entropy coder. We demonstrate that this significantly accelerates I/O throughput in real simulation runs. Unlike previous schemes, our method also adapts well to variable-precision floating-point and integer data.
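
The general idea of lossless predictive coding of floating-point data can be illustrated with a minimal sketch. It is not the coder described in the paper: the previous-value predictor, the 32-bit little-endian layout, and all function names are assumptions chosen for illustration, and the entropy-coding stage is omitted. Each float's bit pattern is mapped to an unsigned integer that sorts in the same order as the float, the predicted value is subtracted, and the residual is kept exactly so that decoding reproduces the input bit for bit.

```python
import struct

SIGN = 0x80000000  # sign bit of a 32-bit IEEE 754 float
MASK = 0xFFFFFFFF  # all 32 bits


def to_ordered(bits: int) -> int:
    """Map a float32 bit pattern to an unsigned int that sorts like the float."""
    # Negative floats: flip every bit. Non-negative floats: flip only the sign bit.
    return bits ^ MASK if bits & SIGN else bits ^ SIGN


def from_ordered(u: int) -> int:
    """Inverse of to_ordered."""
    return u ^ SIGN if u & SIGN else u ^ MASK


def encode(raw: bytes) -> list[int]:
    """Turn packed little-endian float32 data into signed prediction residuals."""
    words = struct.unpack(f"<{len(raw) // 4}I", raw)
    residuals, prev = [], 0
    for w in words:
        cur = to_ordered(w)
        residuals.append(cur - prev)  # assumed predictor: the previous value
        prev = cur
    return residuals


def decode(residuals: list[int]) -> bytes:
    """Rebuild the exact original bytes from the residuals."""
    words, prev = [], 0
    for r in residuals:
        prev += r
        words.append(from_ordered(prev))
    return struct.pack(f"<{len(words)}I", *words)


def residual_bit_lengths(residuals: list[int]) -> list[int]:
    """Significant bits per residual; small values indicate good prediction."""
    return [abs(r).bit_length() for r in residuals]
```

When neighboring values are close, the residuals need few significant bits, and it is these small numbers that an entropy coder (a range coder, per the index terms) would then compress. Because the transform works on bit patterns rather than rounded values, `decode(encode(raw)) == raw` holds for any input whose length is a multiple of four bytes.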

Index Terms:
High throughput, lossless compression, file compaction for I/O efficiency, fast entropy coding, range coder, predictive coding, large scale simulation and visualization.
Citation:
Peter Lindstrom, Martin Isenburg, "Fast and Efficient Compression of Floating-Point Data," IEEE Transactions on Visualization and Computer Graphics, vol. 12, no. 5, pp. 1245-1250, Sept. 2006, doi:10.1109/TVCG.2006.143