Abstract
In this paper we carry out an in-depth study of the average decoding error probability of the random parity-check matrix ensemble over the erasure channel under three decoding principles: unambiguous decoding, maximum likelihood decoding and list decoding. We obtain explicit formulas for the average decoding error probabilities of the random parity-check matrix ensemble under these three decoding principles and compute the corresponding error exponents. Moreover, for unambiguous decoding, we compute the variance of the decoding error probability of the random parity-check matrix ensemble and the error exponent of the variance, which implies a strong concentration result: roughly speaking, the ratio of the decoding error probability of a random linear code in the ensemble to the average decoding error probability of the ensemble converges to 1 with high probability as the code length goes to infinity.
Introduction
Background
In digital communication, it is common that messages transmitted through a public channel may be distorted by the channel noise. The theory of error-correcting codes is the study of mechanisms to cope with this problem. This is an important research area with many applications in modern life. For example, error-correcting codes are widely employed in cell phones to correct errors arising from fading noise during high frequency radio transmission. One of the major challenges in coding theory remains to construct new error-correcting codes with good properties and to study their decoding and encoding algorithms.
In a binary erasure channel (BEC), a binary symbol is either received correctly or totally erased, the latter with probability $\epsilon$. The concept of the BEC was first introduced by Elias in 1955 [3]. Together with the binary symmetric channel (BSC), it is frequently used in coding theory and information theory because these are among the simplest channel models, and many problems in communication theory can be reduced to problems over a BEC. Here we consider, more generally, a q-ary erasure channel in which a q-ary symbol is either received correctly or totally erased with probability $\epsilon$.
The problem of decoding linear codes over the erasure channel has received renewed attention in recent years due to its wide applications on the internet and in distributed storage systems, where random packet losses are analyzed [2, 11, 12]. Three important decoding principles, namely unambiguous decoding, maximum likelihood decoding and list decoding, were studied in recent years for linear codes over the erasure channel; the corresponding decoding error probabilities under these principles were also investigated (see [5, 9, 14, 17] and references therein).
In particular, in [14], improving upon previous results, the authors provided a detailed study of the decoding error probabilities of a general q-ary linear code over the erasure channel under the three decoding principles. Via the notion of r-incorrigible sets for linear codes, they showed that all these decoding error probabilities can be expressed explicitly in terms of the r-th support weight distributions of the linear code. As applications, they obtained explicit formulas for the decoding error probabilities of some of the most interesting linear codes, such as MDS codes, the binary Golay code, the simplex codes and the first-order Reed-Muller codes, for which the r-th support weight distributions are known. They also computed the average decoding error probabilities of a random code over the erasure channel and obtained the error exponent of a random [n, nR] code under unambiguous decoding. The error exponents of a random [n, nR] code under list and maximum likelihood decoding were obtained in [18].
Statement of the main results
In this paper we consider a different code ensemble, namely the random parity-check matrix ensemble $\mathcal{E}$: the set of all $m \times n$ matrices over $\mathbb{F}_q$ endowed with the uniform probability, each of which is associated with a parity-check code as follows: for each $H \in \mathcal{E}$, the corresponding parity-check code $C_H$ is given by
$$C_H := \{\mathbf{x} \in \mathbb{F}_q^n : H\mathbf{x}^{\top} = \mathbf{0}\}. \qquad (1)$$
Here boldface letters such as $\mathbf{x}$ denote row vectors. The ensemble $\mathcal{E}$ has been studied for a long time and many strong results have been obtained. For example, in the classical work of Gallager [7], an upper bound on the average number of codewords of a given weight in $C_H$ was obtained, from which information about the minimum-distance distribution of the ensemble can be derived (see [7, Theorems 2.1-2.2]); in [1] (see also [4]), union bounds on the block erasure probability of the ensemble under maximum likelihood decoding were obtained. More recently, the undetected error probability of the ensemble over the binary symmetric channel (i.e., $q = 2$) was studied by Wadayama [16], and some bounds on the error probability under maximum likelihood decoding over the q-ary erasure channel were obtained in [6, 10]. It is easy to see that $\mathcal{E}$ contains all linear codes of the random code ensemble considered in [14], but the two ensembles are different for two reasons: first, in the random code ensemble considered in [14], each code is counted exactly once, while in $\mathcal{E}$ each code is counted with some multiplicity, as different choices of the matrix H may give rise to the same code; second, some codes in $\mathcal{E}$ may have rates strictly larger than $1 - m/n$, as the rows of H may not be linearly independent.
It is conceivable that most of the codes in $\mathcal{E}$ have rate $1 - m/n$, and that the average behavior of codes in $\mathcal{E}$ should be similar to that of the random code ensemble considered in [14]. We will show that this is indeed the case. In fact, we will obtain stronger results by taking advantage of the fine algebraic structure of the ensemble $\mathcal{E}$, which is not readily available in the random code ensemble. Such structure was also exploited in [16].
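To make the ensemble concrete, here is a minimal brute-force sketch in Python (all parameter values and function names are our own illustrative choices, not notation from the paper). It samples a matrix H, enumerates the parity-check code of (1), and illustrates the two phenomena just mentioned: distinct matrices can define the same code, and a rank-deficient H yields a rate strictly larger than 1 - m/n.

```python
import itertools
import random

q, m, n = 2, 3, 6  # toy sizes so that brute-force enumeration is feasible

def sample_H(rng):
    """Draw H uniformly from the ensemble of m x n matrices over F_q."""
    return [[rng.randrange(q) for _ in range(n)] for _ in range(m)]

def code_of(H):
    """Enumerate C_H = {x in F_q^n : H x^T = 0} by brute force."""
    return frozenset(
        x for x in itertools.product(range(q), repeat=n)
        if all(sum(h * xi for h, xi in zip(row, x)) % q == 0 for row in H)
    )

rng = random.Random(1)
H = sample_H(rng)
C = code_of(H)
# |C_H| = q^(n - rank(H)) >= q^(n - m), with strict inequality iff rank(H) < m.
print(f"|C_H| = {len(C)}, q^(n-m) = {q ** (n - m)}")

# Row operations change H but not C_H, so codes appear with multiplicity:
H2 = [[(a + b) % q for a, b in zip(H[0], H[1])]] + H[1:]
print(code_of(H2) == C)  # True: H and H2 define the same code
```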
We first obtain explicit formulas for the average decoding error probabilities of the ensemble $\mathcal{E}$ over the erasure channel under the three decoding principles. This is comparable to [14, Theorem 2] for the random code ensemble. Such formulas are useful because they allow explicit evaluation of the average decoding error probabilities for any given m and n, hence giving meaningful guidance as to what to expect of a good code over the erasure channel.
Theorem 1
Let $\mathcal{E}$ be the random matrix ensemble described above. Denote by $\binom{n}{k}_q$ the Gaussian q-binomial coefficient, and define
(2) [See PDF for formula]
The average unsuccessful decoding probability of $\mathcal{E}$ under list decoding with list size $q^L$, where $L$ is a non-negative integer, is given by
(3) [See PDF for formula]
The average unsuccessful decoding probability of $\mathcal{E}$ under unambiguous decoding is given by
(4) [See PDF for formula]
The average decoding error probability of $\mathcal{E}$ under maximum likelihood decoding is given by
(5) [See PDF for formula]
Remark 1
By using an elementary identity, one can check that (4) is precisely the case $L = 0$ of (3). We nevertheless state the formula for unambiguous decoding separately in order to emphasize it, since we will study it further later.
Remark 2
In Table 1, we use Mathematica to compute numerical values of the average decoding error probabilities of Theorem 1 for a range of erasure probabilities $\epsilon$. It can be seen that when $\epsilon$ is much smaller than the rate of the code, which is $1 - m/n$, the decoding error probabilities are all very small; in particular, the averages under unambiguous and maximum likelihood decoding are of the same magnitude, and the average under list decoding decreases much further as the list size increases. The fact that these decoding error probabilities are small will be captured by the error exponents studied in Theorem 2.
Table 1. Decoding error probabilities for the ensemble $\mathcal{E}$ under the three decoding principles, for $\epsilon = 0.05, 0.10, \ldots, 0.50$. [See PDF for the table values]
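The values in Table 1 are computed from the exact formulas of Theorem 1; since those formulas are best read in the PDF, we sketch instead how comparable numbers can be reproduced from first principles. The sketch below (our own code, for q = 2) relies on two facts used later in the paper: an erasure set E defeats unambiguous decoding exactly when the columns of H indexed by E are linearly dependent, and for uniform H the submatrix H_E is itself a uniform m x |E| matrix. Under these assumptions, avg_P_ue should agree with formula (4) up to floating-point error, and the Monte Carlo average over sampled codes should agree with both.

```python
import itertools
import random
from math import comb

def gf2_rank(cols):
    """Rank over F_2 of columns given as m-bit integer masks."""
    basis = {}
    for v in cols:
        while v:
            lead = v.bit_length() - 1
            if lead in basis:
                v ^= basis[lead]
            else:
                basis[lead] = v
                break
    return len(basis)

def p_dependent(m, s, q=2):
    """P(rank < s) for a uniform m x s matrix over F_q (dependent columns)."""
    if s > m:
        return 1.0
    p_full = 1.0
    for i in range(s):
        p_full *= 1.0 - q ** (i - m)   # prod_{i<s} (q^m - q^i) / q^m
    return 1.0 - p_full

def avg_P_ue(m, n, eps, q=2):
    """Ensemble-average failure probability under unambiguous decoding."""
    return sum(comb(n, s) * eps ** s * (1 - eps) ** (n - s) * p_dependent(m, s, q)
               for s in range(n + 1))

def P_ue_of(cols, n, eps):
    """Exact failure probability of one sampled code, summing over all E."""
    return sum(eps ** s * (1 - eps) ** (n - s)
               for s in range(1, n + 1)
               for E in itertools.combinations(range(n), s)
               if gf2_rank([cols[i] for i in E]) < s)

rng = random.Random(0)
m, n, eps = 5, 10, 0.2
mc = [P_ue_of([rng.randrange(2 ** m) for _ in range(n)], n, eps)
      for _ in range(200)]
print(avg_P_ue(m, n, eps), sum(mc) / len(mc))  # the two should be close
```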
Remark 3
To compare Theorem 1 with the corresponding results for the random code ensemble, we take $k = n - m$ and rewrite the explicit formulas obtained in [14, 18] as follows (see [14, Theorem 2] and [18, Section 6]). Denote by $\mathcal{C}$ the random $[n, k]$ code ensemble.
The average unsuccessful decoding probability of $\mathcal{C}$ under list decoding with list size $q^L$, where $L$ is a non-negative integer, is given by
(6) [See PDF for formula]
The average unsuccessful decoding probability of $\mathcal{C}$ under unambiguous decoding is given by
(7) [See PDF for formula]
The average decoding error probability of $\mathcal{C}$ under maximum likelihood decoding is given by
(8) [See PDF for formula]
Table 2. Difference of the decoding error probabilities between $\mathcal{E}$ and $\mathcal{C}$, for $\epsilon = 0.05, 0.10, \ldots, 0.50$. [See PDF for the table values]
Next, letting $m = (1 - R)n$ for a fixed rate $R \in (0, 1)$, we compute the error exponents of the average decoding error probabilities of the ensemble sequence as $n \to \infty$ under these decoding principles.
Theorem 2
Let the rate $R \in (0, 1)$ be fixed and let $\epsilon \in (0, 1)$.
For any fixed integer $L \ge 0$, the error exponent of the average unsuccessful decoding probability of $\mathcal{E}$ under list decoding with list size $q^L$ is given by
(9) [See PDF for formula]
The error exponents of the average unsuccessful decoding probability of $\mathcal{E}$ under unambiguous decoding and of the average decoding error probability under maximum likelihood decoding are both given by
(10) [See PDF for formula]
A plot of this error exponent as a function of $\epsilon$ is given in Fig. 1.
[See PDF for image]
Fig. 1. The error exponent in Theorem 2, plotted as a function of $\epsilon$.
It turns out that the error exponents obtained here under these decoding principles are identical to those for the random code ensemble obtained in [14, Theorem 3] and [18, Theorems 1.3 and 1.4].
We also establish a strong concentration result for the unsuccessful decoding probability of a random code in the ensemble around its mean under unambiguous decoding.
Theorem 3
Let the rate $R \in (0, 1)$ be fixed and $m = (1 - R)n$. Then as H runs over the ensemble $\mathcal{E}$, we have
(11) [See PDF for formula]
under either of the following conditions:
(1) [See PDF for the precise condition on $\epsilon$ and R], or
(2) [See PDF for the precise condition on $\epsilon$ and R].
Here the notation WHP in (11) refers to "with high probability"; that is, for any $\delta > 0$, the stated event holds with probability at least $1 - \delta$ for all sufficiently large n [see PDF for the precise statement]. Noting that the error exponent of the average is positive in the relevant range (see Theorem 2), (11) shows that the unsuccessful decoding probability itself also tends to zero exponentially fast with high probability for the ensemble, under either Condition (1) or (2) of Theorem 3.
The paper is organized as follows. In Sect. 2, we introduce the three decoding principles and the Gaussian q-binomial coefficient in more detail. In Sect. 3, we provide three counting results regarding matrices of certain ranks over $\mathbb{F}_q$. In Sects. 4, 5 and 6, we give the proofs of Theorems 1, 2 and 3, respectively. The proof of Theorem 3 involves some technical calculus computations on the error exponent of the variance; in order to streamline the proofs, we defer some of the arguments to the Appendix (Sect. 7). Finally, we conclude the paper in Sect. 8.
Preliminaries
Three decoding principles, the average decoding error probability, and the error exponent
The three decoding principles have been well studied in the literature. For the reader's convenience, we explain these terms here, following the presentation of [14].
In a q-ary erasure channel, the channel input alphabet is a finite field $\mathbb{F}_q$ of order q, and during transmission each symbol is either received correctly with probability $1 - \epsilon$ or erased with probability $\epsilon$.
Let C be a code of length n over $\mathbb{F}_q$. Suppose a codeword $\mathbf{c} \in C$ is transmitted through the channel and the word $\mathbf{y}$ is received. Denote by E the set of indices i such that the i-th symbol was erased during the transmission. In this case, we say that the erasure set E occurs. The probability P(E) that this erasure set occurs is clearly $\epsilon^{|E|}(1 - \epsilon)^{n - |E|}$, where $|E|$ denotes the cardinality of the set E.
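As a quick illustration (a sketch with invented names, not the paper's notation), the q-ary erasure channel and the probability P(E) of a given erasure set can be simulated directly:

```python
import random

def transmit(word, eps, rng):
    """q-ary erasure channel: each symbol survives with probability 1 - eps
    and is replaced by None (an erasure) with probability eps."""
    return [None if rng.random() < eps else s for s in word]

rng = random.Random(0)
eps = 0.3
codeword = [1, 0, 1, 1, 0, 0, 1, 0]          # a word over F_2, n = 8
received = transmit(codeword, eps, rng)
E = {i for i, s in enumerate(received) if s is None}
# This particular erasure set occurs with probability eps^|E| (1-eps)^(n-|E|).
p_E = eps ** len(E) * (1 - eps) ** (len(codeword) - len(E))
print(received, E, p_E)
```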
Let $C(E, \mathbf{y})$ be the set of all codewords of C that match the received word $\mathbf{y}$, that is, $C(E, \mathbf{y}) = \{\mathbf{c} \in C : c_i = y_i \text{ for all } i \notin E\}$. The decoding problem is how to choose one or more codewords from the set $C(E, \mathbf{y})$ when the word $\mathbf{y}$ is received. We now consider three decoding principles:
In unambiguous decoding, the decoder outputs the unique codeword in $C(E, \mathbf{y})$ if $|C(E, \mathbf{y})| = 1$, and declares "failure" otherwise. The unsuccessful decoding probability of C under unambiguous decoding is defined to be the probability that the decoder declares "failure".
In list decoding with list size $\ell$, the decoder outputs all the codewords in $C(E, \mathbf{y})$ if $|C(E, \mathbf{y})| \le \ell$, and declares "failure" otherwise. The unsuccessful decoding probability of C under list decoding with list size $\ell$ is defined to be the probability that the decoder declares "failure".
In maximum likelihood decoding, the decoder chooses a codeword from $C(E, \mathbf{y})$ uniformly at random and outputs it. The decoding error probability of C under maximum likelihood decoding is defined to be the probability that the codeword output by the decoder is not the transmitted codeword. (These three rules are transcribed into code in the sketch below.)
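The following sketch implements the three rules (function and variable names are ours). It represents a code as an explicit list of codewords and a received word as a tuple with None marking erasures; candidates computes the set of codewords compatible with the received word.

```python
import random

def candidates(C, y):
    """All codewords agreeing with y on the non-erased positions."""
    return [c for c in C if all(yi is None or yi == ci for yi, ci in zip(y, c))]

def unambiguous_decode(C, y):
    cs = candidates(C, y)
    return cs[0] if len(cs) == 1 else "failure"

def list_decode(C, y, list_size):
    cs = candidates(C, y)
    return cs if len(cs) <= list_size else "failure"

def ml_decode(C, y, rng):
    # Over the erasure channel every compatible codeword is equally likely,
    # so maximum likelihood reduces to a uniform choice among the candidates.
    return rng.choice(candidates(C, y))

C = [(0, 0, 0), (0, 1, 1), (1, 0, 1), (1, 1, 0)]  # binary single-parity-check code
print(unambiguous_decode(C, (1, None, 0)))     # (1, 1, 0): erasure recoverable
print(unambiguous_decode(C, (None, None, 0)))  # 'failure': two candidates
print(list_decode(C, (None, None, 0), 2))      # a list of size 2 suffices
print(ml_decode(C, (None, None, 0), random.Random(0)))  # correct half the time
```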
Now assume that C is an $[n, k]$ linear code, that is, C is a k-dimensional subspace of $\mathbb{F}_q^n$. For any $E \subseteq [n]$, define $C(E) = \{\mathbf{c} \in C : c_i = 0 \text{ for all } i \notin E\}$. It is easy to see that if the set $C(E, \mathbf{y})$ is not empty, then its cardinality is the same as that of C(E), which is a vector space over $\mathbb{F}_q$. Denote the r-incorrigible set distribution of C and the incorrigible set distribution of C, which are defined respectively as follows:
(12) [See PDF for formula]
Here $\dim C(E)$ denotes the dimension of C(E) as a vector space over $\mathbb{F}_q$. We see that [see PDF for the relation between the two distributions]. We also define
(13) [See PDF for formula]
It is easy to see that [see PDF for the basic relations], and
(14) [See PDF for formula]
We also have the identity
(15) [See PDF for formula]
Recall from [14] that the three decoding error probabilities above can all be expressed in terms of the incorrigible set distributions as follows:
[See PDF for the formulas]
For $H \in \mathcal{E}$, we write these quantities for $C_H$, where $C_H$ is the parity-check code defined by (1). The average decoding error probabilities over the ensemble are then the expectations of these quantities, where the expectation is taken over the ensemble $\mathcal{E}$. Let R and $\epsilon$ be fixed constants. The error exponent of the average unsuccessful decoding probability of the family of ensembles under unambiguous decoding is defined as the limit of $-\frac{1}{n}\log_q$ of this average as $n \to \infty$, provided that the limit exists [8, 14, 15]. The error exponents for the other two decoding principles are defined similarly.
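Assuming the kernel description of a parity-check code (the set C(E) is the kernel of the column submatrix H_E, so that dim C(E) = |E| - rank(H_E); this is implicit in the proofs of Sect. 4), the incorrigible set distribution is computable by brute force for short codes. The following sketch (our own code, for q = 2) tabulates, for each size s, the number of erasure sets E with C(E) different from {0}, and evaluates the resulting unsuccessful decoding probability under unambiguous decoding; the binary [7, 4] Hamming code serves as a sanity check.

```python
import itertools

def gf2_rank(cols):
    """Rank over F_2 of columns given as integer bitmasks."""
    basis = {}
    for v in cols:
        while v:
            lead = v.bit_length() - 1
            if lead in basis:
                v ^= basis[lead]
            else:
                basis[lead] = v
                break
    return len(basis)

def incorrigible_counts(cols, n):
    """I[s] = number of erasure sets E of size s with dim C(E) >= 1,
    using dim C(E) = |E| - rank(H_E)."""
    I = [0] * (n + 1)
    for s in range(n + 1):
        for E in itertools.combinations(range(n), s):
            if gf2_rank([cols[i] for i in E]) < s:
                I[s] += 1
    return I

def P_ue(I, n, eps):
    """Unsuccessful decoding probability under unambiguous decoding."""
    return sum(I[s] * eps ** s * (1 - eps) ** (n - s) for s in range(n + 1))

# Parity-check matrix of the binary [7, 4] Hamming code: its columns are
# all seven nonzero vectors of F_2^3, encoded here as the masks 1..7.
cols = [1, 2, 3, 4, 5, 6, 7]
I = incorrigible_counts(cols, 7)
print(I)                 # [0, 0, 0, 7, 35, 21, 7, 1]
print(P_ue(I, 7, 0.05))  # small: most erasure patterns are corrigible
```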
Gaussian binomial coefficients
For integers $0 \le k \le n$, the Gaussian binomial coefficient is defined as
$$\binom{n}{k}_q = \prod_{i=0}^{k-1} \frac{q^{n-i} - 1}{q^{i+1} - 1}.$$
By convention, $\binom{n}{0}_q = 1$ for any $n \ge 0$, and $\binom{n}{k}_q = 0$ if $k < 0$ or $k > n$. The function defined in (2) can be written in terms of Gaussian binomial coefficients:
[See PDF for formula]
We extend this definition to the boundary cases in the natural way. Next, recall the well-known combinatorial interpretation of $\binom{n}{k}_q$:
Lemma 1
([13]) The number of k-dimensional subspaces of an n-dimensional vector space over $\mathbb{F}_q$ is $\binom{n}{k}_q$.
The Gaussian binomial coefficient satisfies the property
[See PDF for formula]
and the identity
(16) [See PDF for formula]
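For completeness, here is a small sketch (ours) that computes Gaussian binomial coefficients exactly and checks them numerically. The symmetry and q-Pascal recurrence asserted at the end are standard identities; we offer them only as plausible stand-ins for the property and the identity (16), which are not legible in this extraction.

```python
def gaussian_binomial(n, k, q):
    """[n choose k]_q = prod_{i=0}^{k-1} (q^(n-i) - 1) / (q^(i+1) - 1), exactly."""
    if k < 0 or k > n:
        return 0
    num = den = 1
    for i in range(k):
        num *= q ** (n - i) - 1
        den *= q ** (i + 1) - 1
    return num // den  # the quotient is always an integer

# Lemma 1: [4 choose 2]_2 = 35 is the number of 2-dim subspaces of F_2^4.
print(gaussian_binomial(4, 2, 2))  # 35

# Symmetry and the q-Pascal recurrence (standard identities):
n, k, q = 6, 3, 3
assert gaussian_binomial(n, k, q) == gaussian_binomial(n, n - k, q)
assert gaussian_binomial(n, k, q) == (
    gaussian_binomial(n - 1, k - 1, q)
    + q ** k * gaussian_binomial(n - 1, k, q)
)
```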
Three counting results for the ensemble $\mathcal{E}$
In this section we provide three counting results about matrices of certain rank in the ensemble $\mathcal{E}$. Such results may not be new, but since it is not easy to locate them in the literature, we prove them here. These results will be used repeatedly in the proofs later on.
For $H \in \mathcal{E}$, denote by $\operatorname{rank}(H)$ the rank of the matrix H over $\mathbb{F}_q$.
Lemma 2
Let H be a random matrix in the ensemble $\mathcal{E}$. Then for any integer j, we have
$$\Pr[\operatorname{rank}(H) = j] = \binom{m}{j}_q \cdot \frac{\prod_{i=0}^{j-1}(q^n - q^i)}{q^{mn}}. \qquad (17)$$
Proof
We may assume that j satisfies $0 \le j \le \min\{m, n\}$, because if j is not in this range, then both sides of Equation (17) are obviously zero.
Denote by $\operatorname{Hom}(\mathbb{F}_q^n, \mathbb{F}_q^m)$ the set of $\mathbb{F}_q$-linear transformations from $\mathbb{F}_q^n$ to $\mathbb{F}_q^m$. Writing vectors in $\mathbb{F}_q^n$ and $\mathbb{F}_q^m$ as row vectors, we see that the random matrix ensemble $\mathcal{E}$ can be identified with the set $\operatorname{Hom}(\mathbb{F}_q^n, \mathbb{F}_q^m)$ via the relation
$$H \longleftrightarrow T_H, \qquad T_H(\mathbf{x}) = \mathbf{x}H^{\top} \ \text{ for } \mathbf{x} \in \mathbb{F}_q^n. \qquad (18)$$
Since $\operatorname{rank}(H) = j$ if and only if the image of $T_H$ is a j-dimensional subspace of $\mathbb{F}_q^m$, we have
[See PDF for formula]
The inner sum counts the number of surjective linear transformations from $\mathbb{F}_q^n$ onto V, a j-dimensional subspace of $\mathbb{F}_q^m$. Since $\dim V = j$, this is also the number of surjective linear transformations from $\mathbb{F}_q^n$ onto $\mathbb{F}_q^j$, or, equivalently, the number of $n \times j$ matrices K over $\mathbb{F}_q$ whose columns are linearly independent. The number of such matrices K can be counted as follows: the first column of K can be any nonzero vector, giving $q^n - 1$ choices; given the first column, the second column can be any vector lying outside the space of scalar multiples of the first column, giving $q^n - q$ choices; inductively, given the first k columns, the (k+1)-th column must lie outside a k-dimensional subspace, so the number of choices for the (k+1)-th column is $q^n - q^k$. Thus we have
$$\#\{\text{such } K\} = \prod_{i=0}^{j-1}(q^n - q^i). \qquad (19)$$
Together with Lemma 1, which supplies the number $\binom{m}{j}_q$ of choices for the subspace V, we obtain
$$\Pr[\operatorname{rank}(H) = j] = \binom{m}{j}_q \cdot \frac{\prod_{i=0}^{j-1}(q^n - q^i)}{q^{mn}},$$
which is the desired result.
Lemma 3
Let H be a random matrix in the ensemble $\mathcal{E}$. Let $A \subseteq [n]$ be a subset with cardinality s. Denote by $H_A$ the submatrix formed by the columns of H indexed by A. Then for any integers j and r, we have
(20) [See PDF for formula]
Proof
We may assume that and , because if j or r does not satisfy this condition, then both sides of Equation (20) are zero.
Using the relations (18) and (19), we can expand the term as
(21) [See PDF for formula]
Here $V_A$ is the subspace formed by restricting V to the coordinates with indices in A. We may consider the natural projection given by
[See PDF for formula]
The kernel of this projection has the indicated dimension and form for some subspace. So we can further decompose the sum on the right-hand side of (21) as
(22) [See PDF for formula]
Now we compute the inner sum on the right-hand side of (22). Suppose we are given an ordered basis of the given subspace. We extend it to an ordered basis of some j-dimensional subspace V as follows: first, we need r other basis vectors to be linearly independent. At the same time, they have to be linearly independent of any nonzero vector in the kernel, due to the kernel condition. This requires the corresponding set of vectors to be linearly independent in the quotient. On the other hand, if this condition is satisfied, then the r vectors are also linearly independent of one another as well as of any nonzero vector in the kernel. Therefore, the problem reduces to counting the number of ordered linearly independent sets of r vectors in the quotient space; this number, and hence the total number of different ordered bases, is as given in the PDF. On the other hand, given a fixed j-dimensional subspace V with the required restriction, we count the number of ordered bases of V of the form stated in the previous paragraph as follows: we choose the next basis vector to be any vector in V but not in the given subspace; similarly, the following vector is any vector in V but not in the span of the previously chosen vectors. Continuing this argument, we see that the number of such ordered bases is as given in the PDF.
We conclude from the above arguments that
[See PDF for formula]
Putting this into the right-hand side of (22) and using Lemma 1 again, we get
[See PDF for formula]
The desired result is obtained immediately by plugging this into the right-hand side of (21). This completes the proof of Lemma 3.
Lemma 4
Let H be a random matrix in the ensemble $\mathcal{E}$. Let $A, B \subseteq [n]$ be subsets such that
[See PDF for the conditions]
Then
(23) [See PDF for formula]
Proof
It is clear that if a matrix has full column rank, then so does the submatrix formed by any subset of its columns. Hence we have
[See PDF for formula]
It is easy to see that the two events in question are conditionally independent, since the columns of $H_A$ and $H_B$ are independent as random vectors over $\mathbb{F}_q$. Hence we get
[See PDF for formula]
Here we have applied Lemmas 2 and 3 in the last equality.
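The rank distribution of Lemma 2 is easy to validate empirically. The sketch below (our code, for q = 2) compares the formula, in the form read off from the proof above, against Monte Carlo frequencies; the two columns of output should agree to within sampling error.

```python
import random

def gf2_rank(rows):
    """Rank over F_2 of rows given as n-bit integer masks."""
    basis = {}
    for v in rows:
        while v:
            lead = v.bit_length() - 1
            if lead in basis:
                v ^= basis[lead]
            else:
                basis[lead] = v
                break
    return len(basis)

def rank_prob(m, n, j, q=2):
    """P(rank H = j) = [m choose j]_q * prod_{i<j}(q^n - q^i) / q^(mn)."""
    if j < 0 or j > min(m, n):
        return 0.0
    gb = 1
    for i in range(j):
        gb = gb * (q ** (m - i) - 1) // (q ** (i + 1) - 1)  # exact at each step
    surj = 1
    for i in range(j):
        surj *= q ** n - q ** i
    return gb * surj / q ** (m * n)

m, n, trials = 3, 5, 100_000
rng = random.Random(0)
counts = [0] * (min(m, n) + 1)
for _ in range(trials):
    counts[gf2_rank([rng.randrange(2 ** n) for _ in range(m)])] += 1
for j in range(min(m, n) + 1):
    print(j, counts[j] / trials, rank_prob(m, n, j))
```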
Proof of Theorem 1
Let C be an $[n, k]$ linear code. Recall from Sect. 2 that the three decoding error probabilities of C can all be expressed explicitly as
(24) [See PDF for formula]
(25) [See PDF for formula]
(26) [See PDF for formula]
where the incorrigible set distribution and the r-incorrigible set distribution of C are as in Sect. 2, and the remaining quantity is defined in (13), also in Sect. 2. Now we can start the proof of Theorem 1. For $H \in \mathcal{E}$, we write these quantities for the code $C_H$. Taking expectations on both sides of Equations (24)-(26) over the ensemble $\mathcal{E}$, we obtain
(27) [See PDF for formula]
(28) [See PDF for formula]
(29) [See PDF for formula]
We now compute the average incorrigible set distribution. Noting the characterization of incorrigible sets from Sect. 2, we have
[See PDF for formula]
By the symmetry of the ensemble $\mathcal{E}$, the inner sum on the right-hand side depends only on the cardinality of E, so we may assume $E = \{1, \ldots, s\}$ to obtain
[See PDF for formula]
The right-hand side is exactly a rank probability over the ensemble $\mathcal{E}$. So from Lemma 2 we have
(30) [See PDF for formula]
Using this and (15), we also obtain
[See PDF for formula]
Inserting the above values into (28) and (29) respectively, we obtain explicit expressions which agree with (3) and (5) of Theorem 1. As for the remaining average, noting (14) and (30), we therefore have
(31) [See PDF for formula]
Inserting this value into (27), we also obtain the desired expression (4). This completes the proof of Theorem 1.
Proof of Theorem 2
First recall that the error exponents of the average decoding error probability of the ensemble over the erasure channel under the three decoding principles are defined by
(32) [See PDF for formula]
and
[See PDF for the other two limits]
provided that the three limits exist [8, 14, 15]. Unambiguous decoding corresponds to list decoding with $L = 0$, and it is also easy to see that
[See PDF for formula]
hence the two exponents coincide, provided the limit in (32) exists for $L = 0$. So we only need to prove Part 1) of Theorem 2 for the case of list decoding.
Write , and define
[See PDF for formula]
It is easy to see that [see PDF] for all integers i, j. In addition, [see PDF] if and only if [see PDF for the conditions].
We can rewrite (3) as
[See PDF for formula]
where i, j are integers satisfying the conditions
(33) [See PDF for formula]
The number of such integer pairs (i, j) is at most mn. Noting that [see PDF], we have
(34) [See PDF for formula]
Now we focus on the quantity [see PDF]. First, set [see PDF] so that [see PDF]. It is easy to verify that
(35) [See PDF for formula]
Hence we have
[See PDF for formula]
The infinite product converges absolutely to some positive real number M which depends only on q, and the corresponding bound holds for any m. This implies that
[See PDF for formula]
and thus
[See PDF for formula]
Therefore we have
[See PDF for formula]
We want to maximize this quantity over i, j satisfying (33). Since the relevant term is always non-positive, it is easy to see that for any fixed i, to maximize it, we should take
[See PDF for formula]
So we can simplify (34) as
[See PDF for formula]
Let [see PDF for the substitution]. Using the result (see [13] and [14, Lemma 3], for example)
$$\lim_{n \to \infty} \frac{1}{n}\log_q \binom{n}{\lfloor tn \rfloor} = H(t), \qquad (36)$$
where $H(t) = -t\log_q t - (1 - t)\log_q(1 - t)$ is the binary entropy function (in q-its), and taking the limit, we obtain
[See PDF for formula]
where
(37) [See PDF for formula]
Note that f(t) is continuous on the interval (0, 1). Differentiating (37) with respect to t, we get
[See PDF for formula]
It is easy to check from the right-hand side that the derivative vanishes at
[See PDF for formula]
and at these points the function f(t) has a local maximum. There are three cases to consider:
Case 1:
In this case we have , and f(t) is maximized at , so
[See PDF for formula]
Case 2:
In this case we have , and f(t) is maximized at , so
[See PDF for formula]
Case 3:
In this case we get , and f(t) is maximized at , so
[See PDF for formula]
Combining Cases 1-3 above gives Equation (9). This completes the proof of Theorem 2.
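The limit defining the error exponent can be observed numerically. The following sketch (ours) rebuilds the ensemble average from the rank characterization of Sect. 3 rather than from formula (4), which is not legible here, and prints -(1/n) log_q of the average unambiguous-decoding failure probability for growing n at a fixed rate R; the printed values should approach the exponent of Theorem 2.

```python
from math import exp, lgamma, log

def p_dependent(m, s, q=2):
    """P(columns of a uniform m x s matrix over F_q are dependent)."""
    if s > m:
        return 1.0
    p_full = 1.0
    for i in range(s):
        p_full *= 1.0 - q ** (i - m)
    return 1.0 - p_full

def log_avg_P_ue(m, n, eps, q=2):
    """log of the ensemble-average failure probability, via log-sum-exp
    to stay accurate when the probability is exponentially small."""
    def log_comb(n, s):
        return lgamma(n + 1) - lgamma(s + 1) - lgamma(n - s + 1)
    logs = []
    for s in range(1, n + 1):
        p = p_dependent(m, s, q)
        if p > 0.0:
            logs.append(log_comb(n, s) + s * log(eps)
                        + (n - s) * log(1 - eps) + log(p))
    mx = max(logs)
    return mx + log(sum(exp(L - mx) for L in logs))

R, eps, q = 0.5, 0.1, 2
for n in (50, 100, 200, 400, 800):
    m = round((1 - R) * n)
    print(n, -log_avg_P_ue(m, n, eps, q) / (n * log(q)))
```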
Proof of Theorem 3
The proof of Theorem 3 depends on the computation of the variance of the unsuccessful decoding probability under unambiguous decoding and of the error exponent of this variance.
The variance of unsuccessful decoding probability and its error exponent
Note from (24) that the variance of the unsuccessful decoding probability under unambiguous decoding can be expressed as
[See PDF for formula]
where the term is given by
[See PDF for formula]
We first obtain:
Lemma 5
For , we have
(38) [See PDF for formula]
Here the multinomial coefficient, for any non-negative integers a, b, c, d, n such that $a + b + c + d = n$, is given by $\binom{n}{a,\,b,\,c,\,d} = \frac{n!}{a!\,b!\,c!\,d!}$.
Proof of Lemma 5
Obviously if i or does not satisfy the relation , then both sides of (38) are zero. So we may assume that .
For any , from (12) we see that
[See PDF for formula]
So we can expand the term as
(39) [See PDF for formula]
Applying Lemmas 2 and 4 to the terms in the inner sums on the right-hand side of (39), we have
(40) [See PDF for formula]
Together with the combinatorial identity
(41) [See PDF for formula]
we obtain the desired result as described by Lemma 5.
From Lemma 5, the variance can be obtained easily; we summarize it below:
Theorem 4
[See PDF for the statement]
For , the error exponent of the variance is defined by
(42) [See PDF for formula]
if the limit exists. We obtain:
Theorem 5
where is given by
(43) [See PDF for formula]
A plot of this error exponent as a function of $\epsilon$ is given in Fig. 2.
[See PDF for image]
Fig. 2. The error exponent of the variance (Theorem 5).
We remark that the proof of Theorem 5 follows a similar argument to that of Theorem 2, though here the computation is much more complex, as it involves many more technical manipulations. In order to streamline the ideas of this section, we first assume Theorem 5 and defer its proof to the Appendix (Sect. 7). Theorem 3 can then be proved easily by using the standard Chebyshev inequality.
Proof of Theorem 3
We have, by the definition of the error exponents,
[See PDF for formula]
and
[See PDF for formula]
These imply
[See PDF for formula]
The right-hand side of the above tends to 0 as $n \to \infty$ if
(44) [See PDF for formula]
Thus under (44), by Chebyshev's inequality, we have, for any fixed ,
[See PDF for formula]
that is, (11) holds WHP as $n \to \infty$. To prove Theorem 3, it remains to verify that (44) holds under the assumptions of either (1) or (2) of Theorem 3.
Case 1.
Since , by a simple calculation we have
[See PDF for formula]
so (44) always holds in this case.
Case 2.
We can actually obtain a slightly more general result than (2) of Theorem 3 as follows:
Lemma 6
If (44) holds for where satisfies , then (44) also holds true for any .
Proof of Lemma 6
First, from Case 1 and the continuity of and , we know that (44) holds for . Hence such always exists.
Now if , then we have
[See PDF for formula]
Differentiating twice with respect to R, we have
[See PDF for formula]
which is negative for , and thus is convex for R within that interval.
Next, for , we note that
[See PDF for formula]
is a linear function of R with positive slope. Hence the function is convex for R within the whole interval, and Lemma 6 follows.
Now we let . Consider . Under this special value,
[See PDF for formula]
Differentiating with respect to , we have
[See PDF for formula]
It is easy to check that when , then we have
[See PDF for formula]
Therefore exactly one of the two roots lies in the desired range, and the function attains a local maximum at that point. Since , we conclude that is positive for , and therefore (44) holds for .
Now by Lemma 6, we see that Eq. (44) holds for when . This completes the proof of Theorem 3.
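Theorem 3 can also be illustrated empirically, at least in miniature (at such short lengths the predicted concentration is only beginning to set in). The sketch below (our own code) samples codes from the ensemble, computes each code's exact unsuccessful decoding probability under unambiguous decoding, and reports how the ratios to the sample mean spread.

```python
import itertools
import random
import statistics

def gf2_rank(cols):
    basis = {}
    for v in cols:
        while v:
            lead = v.bit_length() - 1
            if lead in basis:
                v ^= basis[lead]
            else:
                basis[lead] = v
                break
    return len(basis)

def P_ue_of(cols, n, eps):
    """Exact unambiguous-decoding failure probability of one code,
    summing eps^|E| (1-eps)^(n-|E|) over incorrigible erasure sets E."""
    return sum(eps ** s * (1 - eps) ** (n - s)
               for s in range(1, n + 1)
               for E in itertools.combinations(range(n), s)
               if gf2_rank([cols[i] for i in E]) < s)

rng = random.Random(0)
n, m, eps, trials = 12, 6, 0.1, 300
vals = [P_ue_of([rng.randrange(2 ** m) for _ in range(n)], n, eps)
        for _ in range(trials)]
mean = statistics.fmean(vals)
ratios = sorted(v / mean for v in vals)
print(f"mean = {mean:.3e}")
print(f"middle 90% of ratios: [{ratios[15]:.2f}, {ratios[-16]:.2f}]")
```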
Acknowledgements
The authors would like to acknowledge the editor and anonymous reviewers for their valuable comments and suggestions to improve our manuscript.
Author contributions
First draft: C.H. Chan and M. Xiong wrote the statements of the main theorems and lemmas as well as their proofs, and prepared Figures 1 and 2. M. Xiong wrote the background part of the introduction section. F.-W. Fu provided the main initiation and motivation for this work. All authors reviewed the manuscript. Revision: C.H. Chan responded to and made most of the revisions requested by Reviewer 1, except for Point 7. M. Xiong responded to Point 7 of Reviewer 1 by inserting two tables of numerical data on the error probabilities under the three decoding principles after the statement of the main theorem, and also responded to most of the comments from Reviewer 2. F.-W. Fu provided additional ideas for the responses to the comments from Reviewer 2. All authors reviewed the manuscript again to ensure that it is free of typos and that the contents (especially formulas) do not run beyond the margins.
Funding
Open access funding provided by Hong Kong University of Science and Technology.
Data availability
No datasets were generated or analysed during the current study.
Declarations
Conflict of interest
The authors declare that they have no conflict of interest.
Publisher's Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
References
1. Berlekamp, E.R.: The technology of error-correcting codes. Proc. IEEE 68 (1980)
2. Byers, J.W., Luby, M., Mitzenmacher, M., Rege, A.: A digital fountain approach to reliable distribution of bulk data. In: Proc. ACM SIGCOMM Conf. on Applications, Technologies, Architectures, and Protocols for Computer Communication, Vancouver, pp. 56-67 (1998)
3. Cover, T.M., Thomas, J.A.: Elements of Information Theory, 1st edn. Wiley-Interscience, New York (1991)
4. Di, C., Proietti, D., Telatar, I.E., Richardson, T.J., Urbanke, R.L.: Finite-length analysis of low-density parity-check codes on the binary erasure channel. IEEE Trans. Inform. Theory 48 (2002) (Special issue on Shannon theory: perspective, trends, and applications)
5. Didier, F.: A new upper bound on the block error probability after decoding over the erasure channel. IEEE Trans. Inform. Theory 52 (2006)
6. Fashandi, S., Gharan, S.O., Khandani, A.K.: Coding over an erasure channel with a large alphabet size. In: Proc. 2008 IEEE Int. Symp. Inform. Theory, pp. 1053-1057 (2008)
7. Gallager, R.G.: Low-Density Parity-Check Codes. MIT Press, Cambridge (1963); see also IRE Trans. Inform. Theory IT-8, pp. 21-28 (1962)
8. Gallager, R.G.: Information Theory and Reliable Communication. Wiley, New York (1968)
9. Lemes, L.C., Firer, M.: Generalized weights and bounds for error probability over erasure channels. In: Proc. Inf. Theory Appl. Workshop (ITA), p. 108, San Diego (2014)
10. Liva, G., Paolini, E., Chiani, M.: Bounds on the error probability of block codes over the q-ary erasure channel. IEEE Trans. Commun. 61 (2013)
11. Luby, M., Mitzenmacher, M., Shokrollahi, A., Spielman, D.A., Stemann, V.: Practical loss-resilient codes. In: Proc. 29th Annual ACM Symp. Theory of Computing, pp. 150-159 (1997)
12. Lun, D.S., Médard, M., Koetter, R., Effros, M.: On coding for reliable communication over packet networks. Phys. Commun. 1 (2008)
13. MacWilliams, F.J., Sloane, N.J.A.: The Theory of Error-Correcting Codes. North-Holland Mathematical Library, vol. 16, Amsterdam (1981)
14. Shen, L., Fu, F.-W.: The decoding error probability of linear codes over the erasure channel. IEEE Trans. Inform. Theory 65 (2019)
15. Viterbi, A.J., Omura, J.K.: Principles of Digital Communication and Coding. McGraw-Hill, New York (1979)
16. Wadayama, T.: On the undetected error probability of binary matrix ensembles. IEEE Trans. Inform. Theory 56 (2010)
17. Weber, J.H., Abdel-Ghaffar, K.A.S.: Results on parity-check matrices with optimal stopping and/or dead-end set enumerators. IEEE Trans. Inform. Theory 54 (2008)
18. Xiong, M.: Decoding error probability in the erasure channel and the r-th support weight distribution. Sci. Sin. Math. 51, 1-14 (2021). https://dx.doi.org/10.1360/SSM-2021-0019 (in Chinese)
© The Author(s) 2024. This work is published under the Creative Commons Attribution 4.0 License (http://creativecommons.org/licenses/by/4.0/).