We first study the error performances of the Vector Weak Rescaled Pure Greedy Algorithm for simultaneous approximation with respect to a dictionary
1. Introduction
Approximation by a sparse linear combination of elements from a fixed redundant family is widely used because it yields concise representations and increases computational efficiency. It has been applied widely to signal processing, image compression, machine learning and PDE solvers (see [1,2,3,4,5,6,7,8,9,10]). Among others, simultaneous sparse approximation has been utilized in signal vector processing and multi-task learning (see [11,12,13,14]). It is well known that greedy-type algorithms are powerful tools for generating such sparse approximations (see [15,16,17,18,19]). In particular, vector greedy algorithms are very efficient at approximating a given finite number of target elements simultaneously (see [20,21,22,23]). In this article, we propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation. We estimate the error of the VWRPGA and show that its convergence rate on the convex hull of the dictionary is optimal.
Let X be a real Banach space with norm $\|\cdot\|$. We say a set of elements $\mathcal{D} \subset X$ is a dictionary if $\|g\| = 1$ for each $g \in \mathcal{D}$ and $\overline{\operatorname{span}}\,\mathcal{D} = X$. We assume that every dictionary is symmetric, i.e., $-g \in \mathcal{D}$ whenever $g \in \mathcal{D}$.
If $f_m$ is the output of a greedy algorithm after m iterations, then the efficiency of the approximation can be measured by the decay of the error $\|f - f_m\|$ as $m \to \infty$. We are mainly concerned with whether this error tends to zero as $m \to \infty$ and, if it does, with its rate of convergence. To answer these questions, we need the following classes of elements.
For a general dictionary , we define the class of elements
and as the closure of . Let be the union of the classes over all . Denote . For , we define its norm as
We recall some related results in the Hilbert space setting, since Hilbert spaces enjoy favorable geometric properties and are the most common setting in applications. Let H be a real Hilbert space with an inner product $\langle\cdot,\cdot\rangle$ and the norm $\|\cdot\| = \langle\cdot,\cdot\rangle^{1/2}$.
The most natural greedy algorithm in a Hilbert space is the Pure Greedy Algorithm (PGA). This algorithm is also known as the Matching Pursuit in signal processing [24]. We recall its definition from [15].
-
PGA(H, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
Define the next approximant to be
and proceed to Step $m+1$.
The first upper bound on the rate of convergence of the PGA for $f \in \mathcal{A}_1(\mathcal{D})$ was obtained in [15] as follows:
$$\|f - f_m\| \le |f|_{\mathcal{A}_1(\mathcal{D})}\, m^{-1/6}, \quad m = 1, 2, \ldots.$$
Later, the above estimate of the PGA was improved in [25,26] to and , where s is the root of the equation
on the closed interval . It is known that .
Note that when is an orthonormal basis of H, it is not difficult to prove that for any , there holds
$$\|f - f_m\| \le |f|_{\mathcal{A}_1(\mathcal{D})}\, m^{-1/2} \qquad (1)$$
In addition, there exists an element (see [27]) such that
Thus, inequality (1) cannot be improved for ortho-normal bases. A natural question arises: does inequality (1) hold for any dictionary ? Unfortunately, the answer is negative.
In fact, Livshitz and Temlyakov [28] proved that there exists a dictionary , a positive constant C and an element such that
This lower bound on the convergence rate of the PGA indicates that this algorithm does not attain the rate for all .
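To make the PGA concrete, the following minimal numerical sketch runs it in $\mathbb{R}^d$ with a finite dictionary stored as the columns of a matrix. The function name `pga` and the toy dictionary are our own illustration and are not taken from the papers cited above.

```python
import numpy as np

def pga(f, D, m):
    """Pure Greedy Algorithm sketch: D is a (d, K) array whose columns are
    unit-norm dictionary elements; f is a target vector in R^d."""
    r = f.copy()                      # residual r_0 = f
    approx = np.zeros_like(f)         # approximant f_0 = 0
    for _ in range(m):
        c = D.T @ r                   # inner products <r, g> for every g in D
        if np.allclose(c, 0.0):       # residual orthogonal to the dictionary: stop
            break
        k = int(np.argmax(np.abs(c))) # greedy step: element with largest |<r, g>|
        approx += c[k] * D[:, k]      # add the one-term projection <r, g_k> g_k
        r = f - approx                # update the residual
    return approx

# toy usage: a redundant dictionary of 40 random unit vectors in R^10
rng = np.random.default_rng(0)
D = rng.standard_normal((10, 40))
D /= np.linalg.norm(D, axis=0)
f = D[:, :5] @ rng.standard_normal(5)          # target built from a few atoms
print(np.linalg.norm(f - pga(f, D, 20)))       # error after 20 PGA iterations
```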
In [15], the idea of best approximation was incorporated into the greedy algorithm, which led to the Orthogonal Greedy Algorithm (OGA). To construct an approximation, the OGA takes the orthogonal projection of f onto the subspace generated by all the chosen . We recall its definition from [15].
-
OGA(H, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
Define the next approximant to be
and proceed to Step , where is the orthogonal projection onto .
In [15], it is shown that for any $f \in \mathcal{A}_1(\mathcal{D})$, the output of the OGA(H, ) satisfies
$$\|f - f_m\| \le |f|_{\mathcal{A}_1(\mathcal{D})}\, m^{-1/2}, \quad m = 1, 2, \ldots.$$
Note that when is an orthonormal basis of H, the OGA(H, ) coincides with the PGA(H, ). So, the rate is sharp.
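The only change relative to the PGA sketch above is the projection step: after each selection, the approximant is recomputed as a least-squares (that is, orthogonal) projection onto the span of all selected atoms. Again, this is only an illustrative sketch in $\mathbb{R}^d$, with names of our own choosing.

```python
import numpy as np

def oga(f, D, m):
    """Orthogonal Greedy Algorithm sketch: after each greedy selection the
    approximant is the orthogonal projection of f onto the span of all
    previously chosen dictionary elements (columns of D)."""
    r = f.copy()
    chosen = []                            # indices of selected atoms
    approx = np.zeros_like(f)
    for _ in range(m):
        c = D.T @ r
        if np.allclose(c, 0.0):
            break
        chosen.append(int(np.argmax(np.abs(c))))
        G = D[:, chosen]                   # matrix of selected atoms
        coef, *_ = np.linalg.lstsq(G, f, rcond=None)   # least squares = projection
        approx = G @ coef
        r = f - approx
    return approx
```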
The Relaxed Greedy Algorithm (RGA) is also a modification of PGA. We recall its definition from [15].
-
RGA(H, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
For , define
For , define the next approximant to be
and proceed to Step .
It is shown in [15] that the RGA also achieves the rate on .
The Rescaled Pure Greedy Algorithm (RPGA) [17] is another modification of the PGA: at each iteration, the original output of the PGA is replaced by a rescaled version of it. It is defined as follows.
-
RPGA(H, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
With
define the next approximant to be and proceed to Step .
In [17], the convergence rate of the RPGA was obtained as follows:
It is worth noting that the supremum of the inner product might not be attainable. To remedy this problem, the original condition on the selection of is replaced by
where . This is often referred to as the “weak” condition. Studies of the weak versions of the above algorithms can be found in [15,25,29,30].
Meanwhile, building simultaneous approximations for a given vector of elements leads to the so-called vector-type greedy algorithms. Instead of running an algorithm separately for each element of a finite collection, a vector greedy algorithm obtains a simultaneous approximation of all the elements in a single run. Hence, the computational cost and the storage of information can be reduced greatly. The natural question is then how well this type of algorithm performs; namely, we need to measure its efficiency via its error bound. The Vector Weak Pure Greedy Algorithm (VWPGA, also referred to as the Vector Weak Greedy Algorithm (VWGA)) and the Vector Weak Orthogonal Greedy Algorithm (VWOGA) have been introduced and studied in [21,22,23].
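Before recalling the vector algorithms, the following sketch illustrates, in $\mathbb{R}^d$, the scalar weak RPGA step just described: an atom is accepted as soon as it realizes a fraction $t$ of the largest inner product, and the PGA-type update is then rescaled by a one-dimensional projection. The parameter names and the acceptance of the first admissible index are our own choices for illustration.

```python
import numpy as np

def wrpga(f, D, m, t=0.5):
    """Weak Rescaled Pure Greedy Algorithm sketch (Euclidean R^d).
    At each step an atom g is admissible if |<r, g>| >= t * max_g |<r, g>|
    (the "weak" condition); the new approximant is the one-dimensional
    rescaling s * (f_{m-1} + <r, g> g), with s given by projecting f
    onto the line spanned by the intermediate update."""
    r = f.copy()
    approx = np.zeros_like(f)
    for _ in range(m):
        c = D.T @ r
        if np.allclose(c, 0.0):
            break
        # any index satisfying the weak condition is admissible; take the first
        k = int(np.argmax(np.abs(c) >= t * np.max(np.abs(c))))
        candidate = approx + c[k] * D[:, k]        # PGA-type intermediate update
        denom = candidate @ candidate
        s = (f @ candidate) / denom if denom > 0 else 0.0
        approx = s * candidate                      # rescaled approximant
        r = f - approx
    return approx
```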
We recall the definitions of the VWPGA and VWOGA from [23] as follows. Let and be a given sequence.
-
VWPGA(H, ): is the target element.
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
Define the next approximant to be
and proceed to Step .
-
VWOGA(H, ): is the target element.
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , let be such that
Choose an element such that
Define the next approximant to be
and proceed to Step , where is the orthogonal projection onto := span{}.
We list the results on the convergence rate of the VWPGA and VWOGA in [23] as follows.
Let be a given real sequence. Then, for any , the output of the VWPGA satisfies
Let be a given real sequence. Then, for any , the output of the VWOGA satisfies
Improvements to the above estimates are made in [19,21,22]. The results indicate that the VWOGA achieves a better convergence rate on than that of the VWPGA.
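The following sketch shows one natural implementation of the VWOGA in $\mathbb{R}^d$. The definitions above leave the choice of the index $i_m$ to a rule we cannot reproduce verbatim here; in this sketch we take $i_m$ to be the index of the currently worst-approximated target, which is consistent with the complexity discussion later in the paper, but it should be read as an assumption.

```python
import numpy as np

def vwoga(F, D, m, t=0.8):
    """Vector Weak Orthogonal Greedy Algorithm sketch (Euclidean R^d).
    F is a (d, N) array of target elements.  At each step the target with
    the largest residual drives the weak selection of an atom; every target
    is then projected orthogonally onto the span of all atoms selected so
    far, i.e. N projections of growing dimension are computed per step."""
    A = np.zeros_like(F)
    chosen = []
    for _ in range(m):
        R = F - A
        i = int(np.argmax(np.linalg.norm(R, axis=0)))   # worst-approximated target
        c = D.T @ R[:, i]
        if np.allclose(c, 0.0):
            break
        vals = np.abs(c)
        k = int(np.argmax(vals >= t * vals.max()))       # weak greedy selection
        if k not in chosen:
            chosen.append(k)
        G = D[:, chosen]
        coef, *_ = np.linalg.lstsq(G, F, rcond=None)     # N projections at once
        A = G @ coef
    return A
```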
In [23], the authors gave a sufficient condition for the convergence of the VWPGA.
Assume that . Then, for any dictionary and any finite elements , the VWPGA satisfies
Motivated by these studies, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) and study its efficiency. The remainder of the paper is organized as follows. In Section 2, we deal with the case of Hilbert spaces. In Section 3, we deal with the case of Banach spaces. In Section 4, we draw the conclusions. Below, we provide more details.
In Section 2, we define the VWRPGA in Hilbert spaces and study its approximation properties. We first prove that
is a sufficient condition for the convergence of the VWRPGA for any and , . This convergence condition is weaker than that of the VWPGA. Then, we prove that the error bound of the VWRPGA on satisfies
When , we show that the convergence rate of the VWRPGA on is , which is sharp. This convergence rate is better than that of the VWPGA. In particular, this advantage is more obvious when N is large. The VWRPGA is also more efficient than the VWOGA from the viewpoint of computational complexity. This is because, for N target elements, the VWRPGA only needs to solve N one-dimensional optimization problems, while the VWOGA involves N m-dimensional optimization problems.
In Section 3, we define the VWRPGA for a class of uniformly smooth Banach spaces. A sufficient condition for the convergence of the VWRPGA is obtained in this case. This appears to be the first convergence result for vector greedy algorithms in the Banach space setting. Then, we derive the error bound of the VWRPGA. The results show that the convergence rate of the VWRPGA on is sharp. We compare the approximation properties of the VWRPGA with those of the Vector Weak Chebyshev Greedy Algorithm (VWCGA) and the Vector Weak Relaxed Greedy Algorithm (VWRGA). We show that the VWRPGA has better convergence properties than the VWRGA. Similarly, the computational complexity of the VWRPGA is essentially smaller than those of the VWCGA and VWRGA.
In Section 4, we draw the conclusions of our study. Our results show that the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate. We also discuss the possible applications of the VWRPGA in multi-task learning and signal vector processing.
2. The VWRPGA for Hilbert Spaces
In this section, we define the VWRPGA in Hilbert spaces and obtain its sufficient condition of convergence together with an estimate of its error bound. Based on these results, we compare the VWRPGA with the VWPGA and the VWOGA.
Firstly, we recall the definition of the Weak Rescaled Pure Greedy Algorithm (WRPGA) in Hilbert spaces from [17]. Let , be a given sequence. The WRPGA consists of the following stages:
-
WRPGA(H, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
With
define the next approximation to be and proceed to Step .
The error bound of the WRPGA has been obtained as follows.
(see Theorem 4.1 in [17]). If , then the output of the WRPGA satisfies the error estimate
Based on the WRPGA, we can define the VWRPGA. Let , be a given sequence. Now, we define the VWRPGA using the following steps:
-
VWRPGA(H, ): Given
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , let be such that
Choose an element such that
With
define the next approximation to be and proceed to Step .
We establish in this section two main results on the approximation properties of the VWRPGA (H, ). We first give a sufficient condition for the convergence of the VWRPGA for any dictionary and any .
Assume . Then, the VWRPGA converges for any dictionary and any
In the proof of Theorem 5, we will reduce the approximation of a general element to that of an element from . To this end, we recall from [31] the following lemmas on the approximation properties of .
Let X be a Banach space and be a dictionary. Then, for any and any , there exists such that
and
with some number .
For any and any dictionary , we have
Note that is the orthogonal projection of onto the one-dimensional space . Thus, it is the best approximation of from .
Let , be the residual of . By the definition of and the choice of , we have
(2)
The latter inequality implies that is a decreasing sequence. According to the Monotone Convergence Theorem, we know that exists, .
We prove that by contradiction. Assume , . Then, for any m, we have . By (2), we obtain that
Denote
By the inequality , we obtain that
(3)
Next, we derive a lower estimate for .
Set . In view of Lemma 1, we can find such that
and with some number .
Using Lemma 2, we have
(4)
Since is the orthogonal projection of onto , we have
Then,
(5)
Combining (4) and (5) with , we obtain
(6)
Combining (3) with (6), we can obtain that
The assumption implies that as .
It is obvious that for . Hence, we obtain a contradiction, which proves this theorem. □
It is known from Theorem 2.1 in [32] that is also a necessary condition for the convergence of the VWRPGA.
According to the Cauchy–Schwarz inequality, we know that
Hence,
On the other hand, taking , we notice that
Therefore, the convergence condition of the VWRPGA is weaker than that of the VWPGA.
The following theorem gives the error bound of the VWRPGA () for .
Let be a weakness sequence. If , then we have for the VWRPGA ()
We establish the approximation error of the VWRPGA based on the methods of [25]. The main idea of this proof is that the VWRPGA can be seen as a realization of the WRPGA with a particular weakness sequence.
Let . Under the assumption that and the fact that the sequence is decreasing, we have
Thus, we only need to prove the estimate below:
At step k, the VWRPGA chooses from in terms of only one remainder from . Thus, by the time the VWRPGA reaches step m, each has been used a different number of times to choose . Now, we record the usage of .
For every , denote } ( is defined in the definition of VWRPGA). Then, we have
Hence,
Using we can find such that
Next, let . We have
(7)
Now, we only consider element . For , we can obtain as an application of the WRPGA with the weakness sequence given by
Therefore, by Theorem 4, we obtain
Together with (7), we complete the proof of Theorem 6. □
We recall the theorem in [21] about the error estimate of the VWPGA.
Let be a decreasing sequence. Then, for any , the output of the VWPGA satisfies
We observe from Theorem 7 that, for a fixed m, the error of the VWPGA increases as the number of target elements increases. The exponent is close to zero when N is sufficiently large.
Taking in Theorem 7, we obtain the following theorem, which gives the convergence rate of the VWPGA.
Let , . Then, for any , the output of the VWPGA satisfies
Again, by taking in Theorem 6, we obtain the following theorem.
Let . If , then, for the VWRPGA (),
From Theorems 8 and 9, we see that the VWRPGA provides a significantly better convergence rate than the VWPGA. In particular, this advantage is more obvious when N is large.
It is known from Theorems 2 and 9 that the approximation property of the VWRPGA is almost the same as that of the VWOGA. However, the VWRPGA is simpler than the VWOGA from the viewpoint of computational complexity: for N target elements, one can see from the definitions of the algorithms that the VWOGA needs to solve N m-dimensional optimization problems, while the VWRPGA only needs to solve N one-dimensional optimization problems. This makes the VWRPGA easier to implement than the VWOGA in practical applications.
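To make this comparison concrete, here is a minimal numpy sketch of the VWRPGA in $\mathbb{R}^d$. As in the VWOGA sketch of Section 1, we take $i_m$ to be the index of the worst-approximated target; this reading of the selection rule, and all names below, are our own assumptions for illustration, not the paper's verbatim scheme.

```python
import numpy as np

def vwrpga(F, D, m, t=0.8):
    """Sketch of the VWRPGA (Hilbert case, Euclidean R^d).  F is a (d, N)
    array of target elements f^1,...,f^N.  At every step an atom is selected
    (weakly) from the residual of the currently worst-approximated target;
    each target is then updated by its own one-dimensional rescaling, so only
    N scalar projections are needed per step."""
    A = np.zeros_like(F)                           # approximants, initially 0
    for _ in range(m):
        R = F - A                                  # residuals r^i
        i = int(np.argmax(np.linalg.norm(R, axis=0)))
        c = D.T @ R[:, i]
        if np.allclose(c, 0.0):
            break
        vals = np.abs(c)
        k = int(np.argmax(vals >= t * vals.max())) # weak greedy selection
        g = D[:, k]
        for j in range(F.shape[1]):                # one 1-D problem per target
            cand = A[:, j] + (R[:, j] @ g) * g     # PGA-type intermediate update
            denom = cand @ cand
            s = (F[:, j] @ cand) / denom if denom > 0 else 0.0
            A[:, j] = s * cand                     # rescaled approximant
    return A

# usage: simultaneously approximate N = 3 targets with one run
rng = np.random.default_rng(0)
D = rng.standard_normal((10, 40)); D /= np.linalg.norm(D, axis=0)
F = D[:, :4] @ rng.standard_normal((4, 3))
print(np.linalg.norm(F - vwrpga(F, D, 25), axis=0))
```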
3. The VWRPGA for Banach Spaces
In this section, we consider the VWRPGA in the context of Banach spaces. We remark that there are two natural generalizations of the PGA to the case of a Banach space X: the X-greedy algorithm and the dual greedy algorithm. However, there are no general results on the convergence and error bounds of these two algorithms; cf. [29]. On the other hand, the WOGA, WRGA, WRPGA and VWOGA have been successfully generalized to the case of Banach spaces. We first recall from [31] the definition of the Weak Chebyshev Greedy Algorithm (WCGA), which is a natural generalization of the WOGA.
For any non-zero element , we denote by a norming functional for f:
The existence of such a functional is guaranteed by the Hahn–Banach theorem. Let , be a given sequence. The WCGA is defined as follows.
-
WCGA(X, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
Set
Define to be the best approximant to f from and proceed to Step .
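In $\ell_p$ ($1 < p < \infty$) the norming functional has an explicit form, which makes a small numerical sketch of the WCGA possible. The code below works in $\mathbb{R}^d$ equipped with the $\ell_p$ norm; the best approximation from the span of the selected atoms is computed by a generic optimizer, and the helper names and parameters are our own choices, not notation from [31].

```python
import numpy as np
from scipy.optimize import minimize

def norming_functional(r, p):
    """Norming functional of a nonzero r in l_p (1 < p < inf):
    F_r(x) = sum_i sign(r_i)|r_i|^{p-1} x_i / ||r||_p^{p-1},
    so that ||F_r|| = 1 and F_r(r) = ||r||_p."""
    w = np.sign(r) * np.abs(r) ** (p - 1)
    return w / np.linalg.norm(r, p) ** (p - 1)

def wcga_lp(f, D, m, p=4.0, t=0.8):
    """Weak Chebyshev Greedy Algorithm sketch in R^d with the l_p norm.
    At each step an atom is selected through the norming functional of the
    residual (weak condition with parameter t); the next approximant is the
    best l_p approximation to f from the span of all selected atoms,
    computed here by a generic numerical optimizer."""
    chosen, approx = [], np.zeros_like(f)
    for _ in range(m):
        r = f - approx
        if np.linalg.norm(r, p) < 1e-12:
            break
        Fr = norming_functional(r, p)
        vals = np.abs(Fr @ D)
        k = int(np.argmax(vals >= t * vals.max()))    # any admissible index works
        if k not in chosen:
            chosen.append(k)
        G = D[:, chosen]
        obj = lambda c: np.linalg.norm(f - G @ c, p)  # best approximation from span
        approx = G @ minimize(obj, np.zeros(len(chosen))).x
    return approx
```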
To estimate the error of the WCGA, we shall utilize some geometric properties of Banach spaces. For a Banach space X, we define $\rho(u)$, the modulus of smoothness of X, as
$$\rho(u) := \sup_{\|x\|=\|y\|=1} \left( \frac{\|x+uy\| + \|x-uy\|}{2} - 1 \right), \quad u > 0.$$
A uniformly smooth Banach space is one with the property
$$\lim_{u \to 0} \frac{\rho(u)}{u} = 0.$$
We shall only consider Banach spaces whose modulus of smoothness satisfies the inequality
$$\rho(u) \le \gamma u^q, \quad 1 < q \le 2,$$
where $\gamma$ is a constant independent of u.
A typical example of a uniformly smooth Banach space is the Lebesgue space $L_p$, $1 < p < \infty$. It is known from [33] that
$$\rho_{L_p}(u) \le \begin{cases} u^p/p, & 1 \le p \le 2,\\ (p-1)u^2/2, & 2 \le p < \infty. \end{cases} \qquad (8)$$
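Since the modulus of smoothness is defined by a supremum over pairs of unit vectors, it can be estimated numerically by random sampling. The following sketch does this for finite-dimensional $\ell_p$ spaces and compares the sample estimate with the classical power-type bounds recorded in (8); the dimension, sample size and names are our own choices for illustration.

```python
import numpy as np

def smoothness_estimate(p, u, d=5, trials=5000, seed=1):
    """Monte Carlo lower estimate of the modulus of smoothness
    rho(u) = sup_{||x||=||y||=1} ((||x+uy|| + ||x-uy||)/2 - 1)
    for the finite-dimensional l_p space R^d."""
    rng = np.random.default_rng(seed)
    best = 0.0
    for _ in range(trials):
        x, y = rng.standard_normal((2, d))
        x /= np.linalg.norm(x, p)
        y /= np.linalg.norm(y, p)
        val = 0.5 * (np.linalg.norm(x + u * y, p) + np.linalg.norm(x - u * y, p)) - 1.0
        best = max(best, val)
    return best

# power type q = p for 1 < p <= 2 (rho(u) <= u^p/p) and
# power type q = 2 for p >= 2 (rho(u) <= (p-1)u^2/2), as in (8)
for p in (1.5, 4.0):
    u = 0.1
    bound = (p - 1) * u**2 / 2 if p >= 2 else u**p / p
    print(p, smoothness_estimate(p, u), "<=", bound)
```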
Moreover, we obtain from [34] that for any X with ,
and for any X with , .
The following error bound of the WCGA on has been established in [31].
Let X be a Banach space with modulus of smoothness . If , then the output of the WCGA() satisfies the inequality
where the constant depends only on q and γ.
Taking , Theorem 10 implies the following corollary, which can be found in [31].
Let X be a Banach space with modulus of smoothness . Then, for any , the output of the WCGA(X, ) satisfies the inequality
To show that this convergence rate cannot be improved, we now take as an example.
Let be fixed. Combining Corollary 1 with inequality (8), we have for any and any
(9)
When is a wavelet basis of , it is known from [35] that there is a such that
Thus, inequality (9) cannot be improved.
Similarly, let be fixed. Combining Corollary 1 with inequality (8), we have for any and any
(10)
When is the trigonometric system of , it is known from [36] that there is a such that
Thus, inequality (10) cannot be improved.
Thus, the convergence rate in Corollary 1 serves as a benchmark for the performance of greedy algorithms in uniformly smooth Banach spaces.
Next, we recall the definition of the WRGA in the Banach space setting from [31]. Let , , be a given sequence. The WRGA is defined as follows.
-
WRGA(X, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
Find such that
Define and proceed to Step .
The following error bound of the WRGA on has been established in [31].
Let X be a Banach space with a modulus of smoothness . If , then the output of the WRGA () satisfies the inequality
where the constant depends only on q and γ.
Now, we turn to the vector greedy algorithms. Let , be a given sequence. The Vector Weak Chebyshev Greedy Algorithm (VWCGA) [22] is defined as follows.
-
VWCGA(X, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , let be such that
Choose an element such that
Set
Define to be the best approximant to from , and proceed to Step .
Let , be a given sequence. The Vector Weak Relaxed Greedy Algorithm (VWRGA) [22] is defined as follows.
-
VWRGA(X, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , let be such that
Choose an element such that
Find such that
Define and proceed to Step .
The error bounds of the VWCGA and VWRGA on have been established in [22].
Let X be a Banach space with a modulus of smoothness . Then, for a sequence and any , we have
Now, we start to define the VWRPGA (X, ). To accomplish this, we recall the definition of the WRPGA from [17]. Let X be a Banach space with a modulus of smoothness . Let , be a given sequence.
-
WRPGA(X, ):
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , choose an element such that
With
choose such that
Define the next approximant to be and proceed to Step .
The sufficient conditions for the convergence of the WRPGA in terms of the weakness sequence and the modulus of smoothness can be found in [17]. Moreover, the following theorem gives the error bound of the WRPGA on .
(see Theorem 6.1 in [17]). Let X be a Banach space with modulus of smoothness . If , then the output of the WRPGA() satisfies the inequality
where the constant depends only on q and γ.
Let X be a Banach space with modulus of smoothness . Let , be a given sequence. We define the VWRPGA (X, ) as follows.
-
VWRPGA(X, ): Given
•Step 0: Define .
•Step m:
- If , stop the algorithm and define for .
- If , let be such that
Choose an element such that
With
choose such that
Define the next approximant to be and proceed to Step .
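For a concrete illustration, the sketch below implements one plausible reading of the VWRPGA (X, ) in $\mathbb{R}^d$ with the $\ell_p$ norm: the index $i_m$ is taken to be that of the worst-approximated target, the atom is selected through the norming functional of that residual, and the two scalars of the rescaling step are obtained from one-dimensional line searches, mirroring the Hilbert-space case. The precise choice of these scalars in [17] and in the definition above may differ, so this is only a sketch under stated assumptions.

```python
import numpy as np
from scipy.optimize import minimize_scalar

def lp_norm(x, p):
    return np.linalg.norm(x, p)

def norming_functional(r, p):
    # norming functional of r in l_p: ||F_r|| = 1 and F_r(r) = ||r||_p
    w = np.sign(r) * np.abs(r) ** (p - 1)
    return w / lp_norm(r, p) ** (p - 1)

def vwrpga_lp(F, D, m, p=4.0, t=0.8):
    """Sketch of the VWRPGA in R^d with the l_p norm.  F is a (d, N) array of
    target elements.  The target with the largest residual drives the weak
    selection of an atom through its norming functional; every target is then
    updated by two one-dimensional line searches, in the spirit of the scalar
    (W)RPGA.  This is an illustration, not the algorithm of [17] verbatim."""
    A = np.zeros_like(F)                              # current approximants
    for _ in range(m):
        R = F - A                                     # residuals
        norms = np.array([lp_norm(R[:, j], p) for j in range(F.shape[1])])
        i = int(np.argmax(norms))                     # worst-approximated target
        if norms[i] < 1e-12:
            break
        Fr = norming_functional(R[:, i], p)
        vals = np.abs(Fr @ D)
        k = int(np.argmax(vals >= t * vals.max()))    # weak greedy selection
        g = D[:, k]
        for j in range(F.shape[1]):                   # two scalar problems per target
            lam = minimize_scalar(lambda l: lp_norm(F[:, j] - A[:, j] - l * g, p)).x
            cand = A[:, j] + lam * g
            s = minimize_scalar(lambda s_: lp_norm(F[:, j] - s_ * cand, p)).x
            A[:, j] = s * cand
    return A
```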
In this section, we obtain the convergence properties and error bound of the VWRPGA.
Firstly, we establish a theorem on the convergence of the VWRPGA. This appears to be the first result on the convergence of vector greedy algorithms in the Banach space setting.
Let X be a Banach space with modulus of smoothness . Assume
Then, for any and any dictionary the VWRPGA converges.
The idea of the proof of Theorem 14 is similar to that of Theorem 5. However, because of the more complicated geometry of Banach spaces, several arguments in the subsequent analysis must be modified, replaced, or generalized. Some useful results in the case of Hilbert spaces have been generalized to the case of Banach spaces, as shown in the following lemmas.
Let X be a Banach space with modulus of smoothness . For any two nonzero elements and any , we have
The proof of this lemma follows from the proof of Lemma 6.1 in [29] and the fact that the modulus of smoothness of X satisfies □
Let X be a Banach space with modulus of smoothness . Let be the output of the VWRPGA at Step for . If , then we have
Denote . By the definition of the VWRPGA, is the best approximant to from L for . Thus, the conclusion of the lemma follows from Lemma 2.1 in [31]. □
(see Lemma 2.2 in [31]). For any bounded linear functional F and any dictionary from a Banach space, we have
Now, we prove Theorem 14.
Let be the residual of . It is known from the definition of the VWRPGA that satisfies
We apply Lemma 3 to the latter inequality with and obtain
By the choice of , we have
(11)
Thus, it is easy to see that is a decreasing sequence. According to the Monotone Convergence Theorem, we know that exists for .
Next, we prove that by contradiction. Assume . Then, for any m, we have . By (11), we obtain that
Denote
(12)
By the inequality , we obtain
(13)
Then, we proceed with a lower estimate for .
By Lemma 1, we set and find such that
and with some number .
We obtain from Lemma 5 that
(14)
By Lemma 4, we obtain
(15)
Inequalities (14), (15) and result in
(16)
Combining (12) with (16), we obtain
(17)
Combining (17) with (13), we have
The assumption implies that as .
Thus, for . We obtain a contradiction, which proves this theorem. □
According to Theorem 3.1 in [32], we know that is also a necessary condition for the convergence of the VWRPGA.
Since the WRGA converges only for target elements from (see [31]), the VWRGA also converges only for target elements from . Thus, the convergence property of the VWRPGA is better than that of the VWRGA.
For , the sufficient condition for the convergence of the VWCGA follows from Theorem 12. Theorem 15 gives the convergence condition for any .
By using the same method, it is not difficult to prove the following theorem on the convergence of the VWCGA.
Let X be a Banach space with modulus of smoothness . Assume
Then, for any and any dictionary the VWCGA converges.
Next, we give the theorem about the error bound of the VWRPGA() on .
Let X be a Banach space with modulus of smoothness . Then, for a sequence and any , we have
It is known from (11) that the sequences are decreasing. Fix i. The inequality
follows from the assumption and the fact that is decreasing.
Thus, we only need to prove the following estimate:
We define the set just as we did in the proof of Theorem 6. It is obvious that
Thus, there exists such that
As in the proof of Theorem 6, for , the sequences are the outputs of the WRPGA with the weakness sequence . Therefore, using Theorem 13, we obtain
The proof of Theorem 16 is completed. □
We know from Theorems 12 and 16 that the error bound of the VWRPGA is almost the same as those of the VWCGA and VWRGA. Similarly, the computational complexity of the VWRPGA is essentially smaller than those of the VWCGA and VWRGA.
4. Conclusions
In this paper, we consider the use of vector greedy algorithms for simultaneous approximation. We first work in a Hilbert space H. We propose a new vector greedy algorithm—the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA)—for simultaneous approximation with respect to a dictionary in H. Then, we study the error performances of the VWRPGA. We show that the convergence rate of the VWRPGA on is optimal. The VWRPGA has a weaker convergence condition than the VWPGA. The convergence rate of the VWRPGA is better than that of the VWPGA. In particular, this advantage is more obvious when N is large. Moreover, the error performances of the VWRPGA are similar to those of the VWOGA. However, from the viewpoint of computational complexity, the VWRPGA is simpler than the VWOGA. For N target elements, from the definitions of the algorithms, one can see that the VWOGA needs to solve N m-dimensional optimization problems. However, the VWRPGA only needs to solve N one-dimensional optimization problems.
Then, we design the Vector Weak Rescaled Pure Greedy Algorithm (VWRPGA) in a uniformly smooth Banach space setting. We obtain the convergence properties and error bound of the VWRPGA in this case. We also show that the convergence condition of the VWCGA is the same as that of the VWRPGA. We show that when the Banach space is a Lebesgue space, the convergence rate of the VWRPGA on is sharp. As for the convergence properties, the VWRGA converges only for the target elements from , while the VWRPGA converges for any element. Therefore, the VWRPGA has better convergence properties than the VWRGA. The error bounds of the VWRPGA are similar to those of the VWCGA and VWRGA. From the viewpoint of computational complexity, the VWRPGA is simpler than the VWCGA and the VWRGA.
In conclusion, the VWRPGA is the simplest vector greedy algorithm for simultaneous approximation with the best convergence property and the optimal convergence rate.
The VWRPGA is more efficient than the WRPGA, since a single run of the VWRPGA greatly reduces the computational cost and the storage of information compared with N separate runs of the WRPGA. If and , then the VWRPGA degenerates into the RPGA. In [5], the authors applied the RPGA to kernel-based regression. They defined the Rescaled Pure Greedy Learning Algorithm (RPGLA) and studied its efficiency. They showed that the computational complexity of the RPGLA is less than those of the Orthogonal Greedy Learning Algorithm (OGLA) [37] and the Relaxed Greedy Learning Algorithm (RGLA) [38]. When the kernel is infinitely smooth, the learning rate can be arbitrarily close to the best rate under a mild assumption on the regression function. Since the VWRPGA is more efficient than the RPGA, the VWRPGA can be used to solve multi-task learning problems more efficiently. Moreover, it is natural to consider applications of the VWRPGA to vector signal processing. We will study these applications in future work.
Conceptualization, X.X., P.Y. and W.Z.; methodology, X.X. and P.Y.; formal analysis, J.G.; investigation, all authors; resources, all authors; data curation, all authors; writing—original draft preparation, P.Y. and J.G.; writing—review and editing, X.X. and P.Y.; visualization, all authors; supervision, X.X. and P.Y.; project administration, all authors; funding acquisition, all authors. All authors have read and agreed to the published version of the manuscript.
The authors declare no conflict of interest.
References
1. Barron, A.; Cohen, A.; Dahmen, W.; DeVore, R. Approximation and learning by greedy algorithms. Ann. Stat.; 2008; 36, pp. 64-94. [DOI: https://dx.doi.org/10.1214/009053607000000631]
2. Cohen, A.; Dahmen, W.; DeVore, R. Compressed sensing and best k-term approximation. J. Am. Math. Soc.; 2009; 22, pp. 211-231. [DOI: https://dx.doi.org/10.1090/S0894-0347-08-00610-3]
3. Yang, B.; Yang, C.; Huang, G. Efficient image fusion with approximate sparse representation. Int. J. Wavelets Multiresolut. Inf. Process.; 2016; 14, 1650024.
4. Zhang, W.H.; Ye, P.X.; Xing, S.; Xu, X. Optimality of the approximation and learning by the rescaled pure super greedy algorithms. Axioms; 2022; 11, 437. [DOI: https://dx.doi.org/10.3390/axioms11090437]
5. Zhang, W.H.; Ye, P.X.; Xing, S. Optimality of the rescaled pure greedy learning algorithms. Int. J. Wavelets Multiresolut. Inf. Process.; 2023; 21, 2250048. [DOI: https://dx.doi.org/10.1142/S0219691322500485]
6. Nguyen, H.; Petrova, G. Greedy strategies for convex optimization. Calcolo; 2017; 54, pp. 207-224. [DOI: https://dx.doi.org/10.1007/s10092-016-0183-2]
7. Huang, A.T.; Feng, R.Z.; Wang, A.D. The sufficient conditions for orthogonal matching pursuit to exactly reconstruct sparse polynomials. Mathematics; 2022; 10, 3703. [DOI: https://dx.doi.org/10.3390/math10193703]
8. Liu, Z.Y.; Xu, Q.Y. A multiscale RBF collocation method for the numerical solution of partial differential equations. Mathematics; 2019; 7, 964. [DOI: https://dx.doi.org/10.3390/math7100964]
9. Jin, D.F.; Yang, G.; Li, Z.H.; Liu, H.D. Sparse recovery algorithm for compressed sensing using smoothed l0 norm and randomized coordinate descent. Mathematics; 2019; 7, 834. [DOI: https://dx.doi.org/10.3390/math7090834]
10. Natsiou, A.A.; Gravvanis, G.A.; Filelis-Papadopoulos, C.K.; Giannoutakis, K.M. An aggregation-based algebraic multigrid method with deflation techniques and modified generic factored approximate sparse inverses. Mathematics; 2023; 11, 640. [DOI: https://dx.doi.org/10.3390/math11030640]
11. Argyriou, A.; Evgeniou, T.; Pontil, M. Convex multitask feature learning. Mach. Learn.; 2008; 73, pp. 243-272. [DOI: https://dx.doi.org/10.1007/s10994-007-5040-8]
12. Schmidt, E. Zur Theorie der linearen und nichtlinearen Integralgleichungen. I Math. Annalen.; 1906–1907; 63, pp. 433-476. [DOI: https://dx.doi.org/10.1007/BF01449770]
13. Tropp, J.A.; Gilbert, A.C.; Strauss, M.J. Algorithms for simultaneous sparse approximation. Part I: Greedy pursuit. Signal. Process.; 2006; 86, pp. 572-588. [DOI: https://dx.doi.org/10.1016/j.sigpro.2005.05.030]
14. Wirtz, D.; Haasdonk, B. A vectorial kernel orthogonal greedy algorithm. Proc. DWCAA; 2013; 6, pp. 83-100.
15. DeVore, R.A.; Temlyakov, V.N. Some remarks on greedy algorithms. Adv. Comput. Math.; 1996; 5, pp. 173-187. [DOI: https://dx.doi.org/10.1007/BF02124742]
16. Gao, Z.; Petrova, G. Rescaled pure greedy algorithm for convex optimization. Calcolo; 2019; 56, 15. [DOI: https://dx.doi.org/10.1007/s10092-019-0311-x]
17. Petrova, G. Rescaled pure greedy algorithm for Hilbert and Banach spaces. Appl. Comput. Harmon. Anal.; 2016; 41, pp. 852-866. [DOI: https://dx.doi.org/10.1016/j.acha.2015.10.008]
18. Jiang, B.; Ye, P.; Zhang, W. Unified error estimate for weak biorthogonal greedy algorithms. Int. J. Wavelets Multiresolut. Inform. Process.; 2022; 5, 2150001. [DOI: https://dx.doi.org/10.1142/S0219691322500102]
19. Dereventsov, A.V.; Temlyakov, V.N. A unified way of analyzing some greedy algorithms. J. Funct. Anal.; 2019; 12, pp. 1-30. [DOI: https://dx.doi.org/10.1016/j.jfa.2019.108286]
20. Temlyakov, V.N. A remark on simultaneous greedy approximation. East J. Approx.; 2004; 10, pp. 17-25.
21. Leviatan, D.; Temlyakov, V.N. Simultaneous approximation by greedy algorithms. Adv. Comput. Math.; 2006; 25, pp. 73-90. [DOI: https://dx.doi.org/10.1007/s10444-004-7613-4]
22. Leviatan, D.; Temlyakov, V.N. Simultaneous greedy approximation in Banach spaces. J. Complex.; 2005; 21, pp. 275-293. [DOI: https://dx.doi.org/10.1016/j.jco.2004.09.004]
23. Lutoborski, A.; Temlyakov, V.N. Vector greedy algorithms. J. Complex.; 2003; 19, pp. 458-473. [DOI: https://dx.doi.org/10.1016/S0885-064X(03)00026-8]
24. Mallat, S.; Zhang, Z. Matching pursuit with time-frequency dictionaries. IEEE Trans. Signal Pross.; 1993; 41, pp. 3397-3415. [DOI: https://dx.doi.org/10.1109/78.258082]
25. Konyagin, S.V.; Temlyakov, V.N. Rate of convergence of pure greedy algorithm. East. J. Approx.; 1996; 5, pp. 493-499.
26. Sil′nichenko, A.V. Rates of convergence of greedy algorithms. Mat. Zametki; 2004; 76, pp. 628-632.
27. Burusheva, L.; Temlyakov, V. Sparse approximation of individual functions. J. Approx. Theory; 2020; 259, 105471. [DOI: https://dx.doi.org/10.1016/j.jat.2020.105471]
28. Livshitz, D.; Temlyakov, V.N. Two lower estimates in greedy approximation. Constr. Approx.; 2003; 19, pp. 509-524. [DOI: https://dx.doi.org/10.1007/s00365-003-0533-6]
29. Temlyakov, V.N. Greedy Approximation; Cambridge University Press: Cambridge, UK, 2011.
30. Temlyakov, V.N. Weak greedy algorithms. Adv. Comput. Math.; 2000; 12, pp. 213-227. [DOI: https://dx.doi.org/10.1023/A:1018917218956]
31. Temlyakov, V.N. Greedy algorithms in Banach spaces. Adv. Comput. Math.; 2001; 14, pp. 277-292. [DOI: https://dx.doi.org/10.1023/A:1016657209416]
32. Jiang, B.; Ye, P.X. Efficiency of the weak rescaled pure greedy algorithm. Int. J. Wavelets Multiresolut. Inform. Process.; 2021; 4, 2150001. [DOI: https://dx.doi.org/10.1142/S0219691321500016]
33. Donahue, M.; Gurvits, L.; Darken, C.; Sontag, E. Rate of convex approximation in non-Hilbert spaces. Constr. Approx.; 1997; 13, pp. 187-220. [DOI: https://dx.doi.org/10.1007/BF02678464]
34. Lindenstrauss, J.; Tzafriri, L. Classical Banach Spaces I; Springer: Berlin/Heidelberg, Germany, 1977.
35. Temlyakov, V.N.; Yang, M.R.; Ye, P.X. Greedy approximation with regard to non-greedy bases. Adv. Comput. Math.; 2011; 34, pp. 319-337. [DOI: https://dx.doi.org/10.1007/s10444-010-9155-2]
36. Ye, P.X.; Wei, X.J. Efficiency of weak greedy algorithms for m-term approximations. Sci. China Math.; 2016; 59, pp. 697-714. [DOI: https://dx.doi.org/10.1007/s11425-015-5106-1]
37. Chen, H.; Zhou, Y.C.; Tang, Y.Y.; Li, L.Q.; Pan, Z.B. Convergence rate of the semi-supervised greedy algorithm. Neural Netw.; 2013; 44, pp. 44-50. [DOI: https://dx.doi.org/10.1016/j.neunet.2013.03.001]
38. Lin, S.B.; Rong, Y.H.; Sun, X.P.; Xu, Z.B. Learning capability of the relaxed greedy algorithms. IEEE Trans. Neural Netw. Learn. Syst.; 2013; 24, pp. 1598-1608.