Artificial neural network models for image data compression
Abstract
In this dissertation, three artificial neural network models for dimensionality reduction are proposed, and a theoretical analysis of each model is carried out. To demonstrate their ability to remove linear and nonlinear redundancy from the processed data, the models are applied to digital image compression.
First, a neural network approach is proposed for adaptively computing the Karhunen-Loève transform (KLT), that is, the principal components (eigenvectors) of the covariance matrix of an input sequence. The optimal learning rate is obtained by minimizing an error function of the learning rate along the gradient-descent direction.
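As a minimal sketch of adaptive principal-component extraction by a neural network, the update below uses Sanger's Generalized Hebbian Algorithm (a standard choice; the dissertation's exact update rule and its line-search for the optimal learning rate are not reproduced here, and the fixed learning rate is a simplification):

```python
import numpy as np

def gha_step(W, x, lr):
    """One Generalized Hebbian Algorithm (Sanger) update.
    W: (k, d) weight rows approximating the top-k eigenvectors.
    x: (d,) zero-mean input sample."""
    y = W @ x                      # projections onto current components
    LT = np.tril(np.outer(y, y))   # lower-triangular part of y y^T
    # Hebbian term minus deflation by earlier components (Sanger's rule)
    W += lr * (np.outer(y, x) - LT @ W)
    return W

rng = np.random.default_rng(0)
# anisotropic 2-D data: variance 9 along axis 0, variance 1 along axis 1
X = rng.normal(size=(5000, 2)) * np.array([3.0, 1.0])
X -= X.mean(axis=0)

W = rng.normal(scale=0.1, size=(1, 2))  # extract the first component only
for x in X:
    W = gha_step(W, x, lr=0.001)

w = W[0] / np.linalg.norm(W[0])
# w should align with the dominant eigenvector, i.e., the first axis
```

With a single output unit this reduces to Oja's rule, whose fixed point is the unit-norm principal eigenvector of the input covariance.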
The above approach is then applied to encode gray-level images adaptively by computing a limited number of KLT coefficients. The effects of the input-sequence size and of the maximum number of coding coefficients on the bit rate, the compression ratio, and the signal-to-noise ratio are investigated, along with the model's ability to generalize to new images.
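The coding scheme can be illustrated directly: split the image into blocks, keep only the top-m KLT coefficients of each block, and measure the reconstruction signal-to-noise ratio. This sketch computes the KLT basis by batch eigendecomposition rather than by the adaptive network, and the block size and m are illustrative:

```python
import numpy as np

def klt_block_code(img, block=8, m=8):
    """Compress an image by keeping m KLT coefficients per block."""
    h, w = img.shape
    h, w = h - h % block, w - w % block
    # vectorize non-overlapping blocks into rows of length block*block
    patches = (img[:h, :w]
               .reshape(h // block, block, w // block, block)
               .transpose(0, 2, 1, 3)
               .reshape(-1, block * block)).astype(float)
    mean = patches.mean(axis=0)
    centered = patches - mean
    cov = centered.T @ centered / len(centered)
    vals, vecs = np.linalg.eigh(cov)      # eigenvalues in ascending order
    basis = vecs[:, ::-1][:, :m]          # top-m principal directions (KLT)
    coeffs = centered @ basis             # only m numbers kept per block
    recon = coeffs @ basis.T + mean
    err = np.mean((patches - recon) ** 2)
    snr = 10 * np.log10(np.mean(patches ** 2) / err)
    recon_img = (recon.reshape(h // block, w // block, block, block)
                 .transpose(0, 2, 1, 3).reshape(h, w))
    return recon_img, snr
```

Keeping 8 of 64 coefficients per 8x8 block gives an 8:1 reduction in coefficient count before any entropy coding.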
Second, a three-layer auto-association network with linear output neurons and sigmoidal hidden neurons is analyzed. The statistical relations among the input, hidden, and output vectors are used to derive, analytically, the conditions under which only the linear region of the sigmoids is used in the coding process. Treating the network as fully linear, a pruning algorithm is then proposed to determine the minimum number of hidden neurons needed to reconstruct the input data within a given error threshold.
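For a fully linear auto-association network, the best rank-k reconstruction error is governed by the trailing eigenvalues of the data covariance (Eckart-Young), so the minimum hidden-layer size for a given threshold can be read off the eigenvalue spectrum. This is a sketch of that idea, not the dissertation's pruning algorithm itself:

```python
import numpy as np

def min_hidden_units(X, err_threshold):
    """Smallest k such that a linear auto-associator with k hidden units
    reconstructs X with mean residual energy per sample <= err_threshold.
    The residual of the best rank-k map equals the sum of the
    covariance eigenvalues beyond the k largest."""
    Xc = X - X.mean(axis=0)
    vals = np.linalg.eigvalsh(Xc.T @ Xc / len(Xc))[::-1]  # descending
    for k in range(len(vals) + 1):
        if vals[k:].sum() <= err_threshold:
            return k
    return len(vals)
```

On data that is intrinsically low-rank, the function recovers that rank once the threshold is below the noise floor.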
Finally, a neural network model that performs clustering, or unsupervised segmentation, is proposed and compared with the winner-take-all (WTA) techniques commonly used in cluster analysis.
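The WTA baseline referred to here is simple competitive learning: for each sample, only the nearest prototype is updated. A minimal sketch, with illustrative learning-rate and epoch settings:

```python
import numpy as np

def wta_cluster(X, k, lr=0.05, epochs=20, seed=0):
    """Winner-take-all competitive learning: only the prototype closest
    to each sample moves toward it."""
    rng = np.random.default_rng(seed)
    protos = X[rng.choice(len(X), k, replace=False)].astype(float)
    for _ in range(epochs):
        for x in X[rng.permutation(len(X))]:
            j = np.argmin(((protos - x) ** 2).sum(axis=1))  # winner
            protos[j] += lr * (x - protos[j])                # move winner only
    return protos
```

A known weakness of plain WTA, which motivates alternatives, is that a prototype that never wins is never updated (the "dead unit" problem).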
A classified vector quantizer for images is then proposed. The algorithm employs a simple classification procedure that uses the variance of an image block as its only feature. A useful property of this classifier is that it orders the classes by their entropy content. Each class is then clustered under the mixture maximum-likelihood criterion, which shows that the number of clusters required to represent a class can be determined by assuming a particular input distribution. (Abstract shortened by UMI.)
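The variance-only classification step can be sketched as follows; the block size and variance thresholds are illustrative placeholders, not values from the dissertation:

```python
import numpy as np

def classify_blocks(img, block=4, thresholds=(25.0, 100.0)):
    """Assign each image block to a class using only its variance:
    class 0 = smooth (low variance), 1 = intermediate, 2 = edge/texture."""
    h, w = img.shape
    h, w = h - h % block, w - w % block
    # vectorize non-overlapping blocks into rows
    patches = (img[:h, :w]
               .reshape(h // block, block, w // block, block)
               .transpose(0, 2, 1, 3)
               .reshape(-1, block * block)).astype(float)
    var = patches.var(axis=1)
    return np.digitize(var, thresholds)   # class index per block
```

Each class would then be quantized with its own codebook, with codebook sizes chosen per class, consistent with the classifier ordering classes by entropy content.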