ABSTRACT
The concept of Mixed Raster Content (MRC) describes a compound document image as being composed of several layers, each containing a part of its visual information. Usually, three layers are sufficient for classifying the types of content present in such an image: a foreground layer, a background layer, and a mask layer. In this context, MRC-based compression schemes promise to be more efficient than classical ones, in which a single algorithm compresses the entire image. They are implicitly content-adaptive, since each layer can be compressed separately with a suitable algorithm (JPEG, JBIG, etc.).
KEYWORDS: MRC, Document Compression, Image Compression, Data Compression, Image Processing, OCR, Resampling Filters.
INTRODUCTION
Within the field of image processing, image compression is currently of high interest: transmission performance (for images sent over the Internet or by fax) and storage issues (for online libraries and online image databases) have become more prominent as the rate of information exchange across electronic media has increased.
A variety of compression algorithms and image formats exists, each designed with a particular purpose and image type in mind (De Queiroz et al., 1999). For example, JPEG and JPEG2000 are designed for natural image compression (Rabbani and Joshi, 2002), while JBIG2 favors images with recurring symbols (Haneda and Bouman, 2011), such as document images.
No single compression algorithm is good for all types of images (De Queiroz et al., 1999; De Queiroz, 2005), but the variety of compression algorithms can be exploited through a generalized framework that adapts the algorithm to the characteristics of the image, at least to some degree. De Queiroz et al. (1999) suggest that this could be accomplished by a compression algorithm based on the concept of Mixed Raster Content. This concept describes an image (comprising information in various forms: text, pictures, line art) as a composite of layers, each with different semantics and correspondingly different visual and signal characteristics.
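The layered view can be illustrated with a minimal sketch of how a three-layer MRC image is recomposed: a binary mask selects, pixel by pixel, between a foreground layer and a background layer. The function name compose_mrc and the list-of-rows pixel representation are assumptions made for illustration, and the sketch assumes that all three layers share the same resolution, which an actual MRC codec need not require.

```python
from typing import List, Tuple

Pixel = Tuple[int, int, int]   # (R, G, B), each in 0..255
Layer = List[List[Pixel]]      # rows of pixels
Mask = List[List[int]]         # rows of 0/1 selector values


def compose_mrc(foreground: Layer, background: Layer, mask: Mask) -> Layer:
    """Recompose an image: mask value 1 selects the foreground pixel,
    mask value 0 selects the background pixel (hypothetical helper)."""
    if not (len(foreground) == len(background) == len(mask)):
        raise ValueError("layers must have the same height")
    composed: Layer = []
    for fg_row, bg_row, mask_row in zip(foreground, background, mask):
        if not (len(fg_row) == len(bg_row) == len(mask_row)):
            raise ValueError("layers must have the same width")
        composed.append(
            [fg if m else bg for fg, bg, m in zip(fg_row, bg_row, mask_row)]
        )
    return composed


# Toy 2x2 example: black "text" colour over a white "paper" background.
BLACK: Pixel = (0, 0, 0)
WHITE: Pixel = (255, 255, 255)
fg = [[BLACK, BLACK], [BLACK, BLACK]]
bg = [[WHITE, WHITE], [WHITE, WHITE]]
mask = [[1, 0], [0, 1]]
image = compose_mrc(fg, bg, mask)
```

Because the mask alone carries the sharp text contours, the foreground and background layers can remain smooth, which is what allows each layer to be handed to a different, better-suited coder.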
MRC COMPRESSION BASICS
An MRC image can be decomposed into as many layers as one considers necessary, but three layers are usually considered sufficient for categorizing image information and compressing it accordingly (ITU-T Recommendation T.44, 2005), as illustrated in Fig. 1: the foreground layer (text color), the mask layer...