loading...
On the Complexity of Optimal Grammar-Based Compression
Snowbird, Utah March 28-March 30
DOI Bookmark: http://doi.ieeecomputersociety.org/10.1109/DCC.2006.59Data Compression Conference (DCC'06)
 This Article 
 
PDF
HTML
 
 Share 
   
 Bibliographic References 
   
 Add to: 
 
Digg
Furl
Spurl
Blink
Simpy
Google
Del.icio.us
Y!MyWeb
 
 Search 
   
Jan Arpe, Universitat zu Lubeck, Germany
Rudiger Reischuk, Universitat zu Lubeck, Germany
Given a string, the task of grammar-based compression is to find a small contextfree grammar that generates exactly that string. We investigate the relationship between grammar-based compression of strings over unbounded and bounded alphabets. Specifically, we show how to transform a grammar for a string over an unbounded alphabet into a grammar for a block coding of that string over a fixed bounded alphabet and vice versa. From these constructions, we obtain asymptotically tight relationships between the minimum grammar sizes for strings and their block codings. Furthermore, we exploit an improved bound of our construction for overlap-free block codings to show that a polynomial time algorithm for approximating the minimum grammar for binary strings within a factor of c yields a polynomial time algorithm for approximating the minimum grammar for strings over arbitrary alphabets within a factor of 24c + ∊ (for arbitrary ∊ > 0). Currently, the latter problem is known to be NP-hard to approximate within a factor of 8569/8568. Since there is some hope to prove a nonconstant lower bound, our results may provide a first step towards solving the long standing open question whether minimum grammar-based compression of binary strings is NP-complete.
Citation:
Jan Arpe, Rudiger Reischuk, "On the Complexity of Optimal Grammar-Based Compression," dcc, pp.173-182, Data Compression Conference (DCC'06), 2006
Usage of this product signifies your acceptance of the Terms of Use.