In the field of video image processing, high definition is one of the main directions for future development. Faced with the curse of dimensionality caused by the increasingly large amount of ultra-high-definition video data, effective dimensionality reduction techniques have become increasingly important. Linear discriminant analysis (LDA) is a supervised dimensionality reduction technique that has been widely used in data preprocessing and video image processing tasks. However, traditional LDA methods are not suitable for the dimensionality reduction and processing of high-dimensional, small-sample data. To improve the accuracy and robustness of linear discriminant analysis, this paper proposes a new distributed sparse manifold constraint (DSC) optimized LDA method, called DSCLDA, which introduces joint sparse and manifold constraints to preserve both the local and global structure of the data.
1. Introduction
An ultra-high-definition video image processing system relies on the ability to detect information from multiple targets over long distances [1,2]. However, current chip computing power cannot support complex computational imaging methods, so the system cannot meet real-time requirements, and the processing performance does not yet meet the demand for high-dimensional information detection and perception. Therefore, it is necessary to process high-dimensional video images quickly and adaptively. Nowadays, complex image data are inherently redundant and non-Gaussian, leading to the unstable performance of traditional methods such as principal component analysis (PCA), linear discriminant analysis (LDA) [3,4], Fisher discriminant analysis (FDA) [5], orthogonal linear discriminant analysis (OLDA) [6], and uncorrelated linear discriminant analysis (ULDA) [7], which affects actual video processing performance. Therefore, it is urgent to explore how to utilize high-dimensional spatial data, establish new sparse discrimination models, and design effective optimization schemes to improve existing detection and classification strategies.
From the perspective of data analysis, the key to processing and analyzing high-dimensional data lies in dimensionality reduction and feature extraction, with a focus on sparsity [8,9]. As an emerging optimization branch, sparse constraints have attracted much attention due to their ability to break through traditional Shannon sampling and achieve efficient transmission. Nowadays, sparsity constraints are widely used in pattern recognition and image processing, and their applicability in many other fields has been recognized [10,11]. A sparsity constraint requires that the majority of elements be zero. For high-dimensional data, it is necessary to consider sparsity, such as through sparse linear discriminant analysis (SLDA) [12], sparse uncorrelated linear discriminant analysis (SULDA) [13], robust sparse linear discriminant analysis (RSLDA) [14], intra-class and inter-class kernel constraints (IIKCs) [15], hypergraph Laplacian-based semi-supervised discriminant analysis (HSDAFS) [16], adaptive and fuzzy locality discriminant analysis (AFLDA) [17], etc. Compared with traditional LDA methods, sparse discriminant analysis greatly improves the identification ability of the system. However, sparse discriminant analysis methods usually replace the $\ell_0$-norm with the $\ell_1$-norm to obtain convex optimization problems. In fact, the $\ell_0$-norm can select the most representative feature variables and can be optimized faster than $\ell_1$-norm-constrained formulations. Examples include a sparse signal recovery framework based on segmented threshold $\ell_0$ gradient approximation [18], image non-negative matrix factorization with alternating smoothed $\ell_0$-norm constraints [19], and sparse feature selection based on fast embedding spectral analysis [20]. The above methods also show that the $\ell_0$-norm does indeed offer better feature selection and faster optimization.
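The practical difference between the two norms can be illustrated by their proximal operators. The following minimal numpy sketch (the function names are ours, not from any cited method) contrasts soft thresholding, the proximal operator of the $\ell_1$-norm, which shrinks every coefficient, with hard thresholding, the projection for an $\ell_0$ constraint, which keeps the s largest entries unchanged:

```python
import numpy as np

def soft_threshold(x, lam):
    # Proximal operator of lam * ||x||_1: shrink all entries toward zero.
    return np.sign(x) * np.maximum(np.abs(x) - lam, 0.0)

def hard_threshold(x, s):
    # Euclidean projection onto {x : ||x||_0 <= s}: keep the s largest
    # entries in magnitude, zero out the rest (no shrinkage bias).
    out = np.zeros_like(x)
    keep = np.argsort(np.abs(x))[-s:]
    out[keep] = x[keep]
    return out

x = np.array([0.9, -0.2, 0.05, -1.4, 0.3])
print(soft_threshold(x, 0.25))  # every surviving entry is biased toward zero
print(hard_threshold(x, 2))     # exactly two entries survive, unshrunk
```

This bias-free selection is why $\ell_0$-type constraints tend to pick more representative features, at the cost of a non-convex problem.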
From the perspective of feature extraction, sparse analysis-based methods have significant data analysis capabilities but cannot reveal potential causal relationships between variables during the analysis process [21,22,23]. To address this problem, manifold learning can be introduced to learn local features of latent information in high-dimensional space. To characterize such data, a practicable solution is to map the linearly inseparable features in high-dimensional space to a low-dimensional nonlinear feature space, such as through robust sparse manifold discriminant analysis (RSMDA) [24], which captures both global and local geometric information through manifold learning. Zhang et al. [25] proposed a coupled discriminative manifold alignment (CDMA) method, which aligns the manifold structures of low-resolution (LR) and high-resolution (HR) images in a common feature subspace. To solve manifold learning models, many optimization schemes have been proposed, such as the projection algorithm [26], the exact penalty algorithm [27], the augmented Lagrangian algorithm [28], iterative hard thresholding [29], Newton hard-thresholding pursuit [30], etc. In addition, in the field of video image processing, high-dimensional pixel video problems essentially require minimization over Stiefel manifolds, as in partial least squares [31], principal component analysis [32], and canonical correlation analysis [33]. Manifold-constrained optimization even appears frequently in reinforcement learning [34] and federated learning [35].
The optimization methods mentioned above mostly focus on a single constraint. Currently, there is limited research on problems that consider both manifold constraints and sparse constraints. On the one hand, this is because both constraints are non-convex, non-smooth, and even NP-hard, making joint algorithm design difficult. On the other hand, joint constraints require the two constraints to share the same variable, making theoretical analysis more difficult. To overcome these difficulties, this paper proposes a new distributed sparse manifold-constrained optimization algorithm and explores effective numerical solutions. The proposed joint constraints are introduced into LDA, and the novel method is called distributed sparse manifold-constrained linear discriminant analysis (DSCLDA). The proposed method first divides the process monitoring data into multiple data nodes and performs distributed parallel operations simultaneously. Afterward, an $\ell_{2,0}$-norm sparse constraint is constructed to regulate local features and preserve the local structure of variables. In addition, by using manifold constraints on global variables, the proposed method can capture causal correlations and reduce the loss of data structure during projection. By using the manifold proximal gradient (ManPG) to combine local and global variables, sparse constraints and manifold constraints are incorporated into the calculation process during optimization, and explicit solutions for each variable are obtained. The contributions of the proposed method are as follows:
This paper proposes a novel distributed sparse manifold-constrained linear discriminant analysis (DSCLDA) method, which introduces sparse and manifold constraints to maintain the local and global structure.
We designed an effective solution scheme that combines local and global variables using the manifold proximal gradient (ManPG) to obtain explicit solutions for each subproblem.
We conducted a series of experiments on several public datasets to verify the effectiveness of the proposed method and discuss the convergence and feature distribution.
The rest of this paper is organized as follows. Section 2 introduces the notations and related works. Section 3 details the problem of the proposed method and the corresponding optimization algorithm. Section 4 evaluates and discusses the performance of the proposed method. Section 5 concludes this paper.
2. Notations and Preliminaries
2.1. Notations
For convenience, we define some symbols required for this section. For a matrix $X \in \mathbb{R}^{m \times n}$, $X_i$ denotes its $i$th row, and $X_{ij}$ denotes the element in the $i$th row and $j$th column. $\mathbf{0}$ is written as an all-zero matrix in $\mathbb{R}^{m \times n}$, and $I_m$ and $I_n$ represent the identity matrices with the dimensions $m \times m$ and $n \times n$, respectively. $X^\top$ represents the transpose of $X$, and $\mathrm{vec}(X)$ represents the vectorization of $X$. For a set $T$, $T^c$ is used to represent the complement of $T$. In addition, for matrices $A, B \in \mathbb{R}^{m \times n}$, the inner product is defined as $\langle A, B \rangle = \mathrm{tr}(A^\top B)$, where $\mathrm{tr}(\cdot)$ represents the trace of the matrix.
2.2. Preliminaries
LDA, as a supervised learning method, can use prior knowledge of the categories in the dimensionality reduction process, which unsupervised learning cannot. Compared with other methods, a feature of LDA is that it learns discriminative projections by maximizing the inter-class distance while minimizing the intra-class distance, thereby achieving a more effective dimensionality reduction ability. For training samples $\{x_j\}_{j=1}^{n}$ drawn from $k$ classes $C_1, \dots, C_k$, with class means $\mu_i$, global mean $\mu$, and $n_i$ samples in class $C_i$, the inter-class scatter matrix $S_b$ and intra-class scatter matrix $S_w$ can be defined as

$$S_b = \sum_{i=1}^{k} n_i (\mu_i - \mu)(\mu_i - \mu)^\top, \qquad (1)$$

$$S_w = \sum_{i=1}^{k} \sum_{x_j \in C_i} (x_j - \mu_i)(x_j - \mu_i)^\top. \qquad (2)$$
LDA attempts to find a suitable projection direction that minimizes intra-class dispersion and maximizes inter-class dispersion after projection. This search process can be expressed as follows:

$$\max_{W \in \mathbb{R}^{d \times m}} \frac{\mathrm{tr}(W^\top S_b W)}{\mathrm{tr}(W^\top S_w W)}. \qquad (3)$$
To avoid distortion of the projected data, problem (3) can also be extended in the following form:

$$\max_{W} \; \mathrm{tr}\big((W^\top S_w W)^{-1}(W^\top S_b W)\big) \quad \mathrm{s.t.} \quad W^\top W = I_m. \qquad (4)$$
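To make (1)-(4) concrete, the following sketch computes the two scatter matrices and solves the classical (non-sparse, non-distributed) LDA problem via the generalized eigendecomposition of the pair $(S_b, S_w)$. All names are illustrative, and a small ridge term is an assumption added to keep $S_w$ invertible for small samples:

```python
import numpy as np
from scipy.linalg import eigh

def lda_projection(X, y, m, ridge=1e-6):
    """Classical LDA: X is (n, d) data, y is (n,) integer labels.
    Returns the d x m projection W from the generalized eigenproblem
    S_b w = lambda * S_w w (top-m eigenvectors)."""
    n, d = X.shape
    mu = X.mean(axis=0)
    S_w = np.zeros((d, d))
    S_b = np.zeros((d, d))
    for c in np.unique(y):
        Xc = X[y == c]
        mu_c = Xc.mean(axis=0)
        S_w += (Xc - mu_c).T @ (Xc - mu_c)          # Eq. (2)
        diff = (mu_c - mu).reshape(-1, 1)
        S_b += Xc.shape[0] * (diff @ diff.T)        # Eq. (1)
    # Generalized symmetric eigenproblem; ridge guards against singular S_w.
    vals, vecs = eigh(S_b, S_w + ridge * np.eye(d))
    return vecs[:, ::-1][:, :m]                     # leading m directions

# Toy usage: project two Gaussian classes onto one discriminant direction.
rng = np.random.default_rng(0)
X = np.vstack([rng.normal(0, 1, (50, 5)), rng.normal(2, 1, (50, 5))])
y = np.repeat([0, 1], 50)
W = lda_projection(X, y, m=1)
print(W.shape)  # (5, 1)
```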
However, LDA still has some shortcomings. For example, for data with $k$ categories, LDA can reduce the dimensionality to at most $k-1$; it therefore cannot be used when more than $k-1$ dimensions must be retained. In addition, if the original sample size is too small, the dimensionality reduction results of LDA are prone to overfitting. Therefore, a common modification is to add sparse constraints to LDA, commonly known as SLDA. In common SLDA methods, the $\ell_1$-norm is applied in LDA to induce sparsity, which can remove redundant features from the data and improve the performance of video image processing. The formulation of SLDA, which introduces the $\ell_1$-norm, is expressed as follows:
$$\max_{W} \; \mathrm{tr}\big((W^\top S_w W)^{-1}(W^\top S_b W)\big) - \lambda \|W\|_1 \quad \mathrm{s.t.} \quad W^\top W = I_m. \qquad (5)$$
To effectively eliminate noise and outliers in SLDA and improve robustness in discriminant analysis, reference [14] proposed RSLDA, which is expressed in the form of
$$\min_{P, Q, E} \; \mathrm{tr}\big(Q^\top (S_w - \mu S_b) Q\big) + \lambda_1 \|Q\|_{2,1} + \lambda_2 \|E\|_1 \quad \mathrm{s.t.} \quad X = PQ^\top X + E, \; P^\top P = I, \qquad (6)$$

where $\|\cdot\|_{2,1}$ is the $\ell_{2,1}$-norm. By selecting different values of the parameters $\lambda_1$ and $\lambda_2$, RSLDA can select important features and effectively eliminate noise and outliers, thereby achieving excellent performance in the field of image classification.

Another method to improve the performance of SLDA is to incorporate manifold constraints into the optimization problem, such as in the RSMDA method from reference [24], which is represented as
(7)
Inspired by the above methods, this paper proposes an LDA variant that utilizes joint sparsity and manifold constraints. The specific optimization problem will be described in detail in Section 3.
3. Methodology
3.1. Optimization Problem
In this paper, for a matrix $X$, the proposed distributed sparse manifold constraint can be expressed as the following problem:

$$\min_{X \in \mathbb{R}^{d \times m}} \; \sum_{i=1}^{l} f_i(X) + g(X) \quad \mathrm{s.t.} \quad \|X\|_{2,0} \le s, \; X^\top X = I_m, \qquad (8)$$
where $l$ represents the total number of distributed representations of $X$. Distributed sparse manifold constraints can fully utilize the spatial information of the current extended variables, further improving the interpretability of variables from the process monitoring data. Therefore, combined with regular LDA, distributed sparse manifold-constrained linear discriminant analysis (DSCLDA) is proposed, which can fully utilize the local and global information of process monitoring observations and take into account both causal and structural relationships between variables.

In this model, each $f_i$ is a given locally Lipschitz continuous function, and $g$ is the given global function. $\|X\|_{2,0} \le s$ is introduced as the sparse constraint, and $X^\top X = I_m$ is used as the manifold constraint. Substituting problem (4) into the distributed sparse constraint yields
(9)
3.2. Optimization Algorithm
To obtain an effective algorithm, the distributed variables $X_i$ and the global variable $Y$ are introduced to transform problem (8) into

$$\min_{\{X_i\}, Y} \; \sum_{i=1}^{l} f_i(X_i) + g(Y) \quad \mathrm{s.t.} \quad X_i = Y, \; \|X_i\|_{2,0} \le s, \; i = 1, \dots, l, \; Y^\top Y = I_m, \qquad (10)$$
where $X_i$ represents the variables of the $i$th distribution. In problem (10), the sparse constraint only involves the local variables $X_i$, and the manifold constraint only involves the global variable $Y$. Therefore, further consideration can be given to the optimization problem of the following penalty function:

$$\min_{\{X_i\}, Y} \; \sum_{i=1}^{l} \Big( f_i(X_i) + \frac{\rho_i}{2} \|X_i - Y\|_F^2 \Big) + g(Y) \quad \mathrm{s.t.} \quad \|X_i\|_{2,0} \le s, \; Y^\top Y = I_m, \qquad (11)$$
in which $\rho_i$ is the penalty parameter corresponding to each branch.

3.2.1. Updating $X_i$
Problem (30) is NP-hard and has no explicit solution. Inspired by the Newton hard-thresholding pursuit method [30], the proposed optimization algorithm extends it to matrices. Denoting the objective function of the $X_i$-subproblem by $h$, the gradient of this function is represented as
(12)
The Hessian expression of problem (12) can be written as
(13)
If the stationarity condition is satisfied (where $\alpha$ is the step size parameter), then $X_i$ can be considered a stationary point of problem (29). Let $T$ represent the index set of the first $s$ rows of $X_i$ under the $\ell_{2,0}$-norm constraint; then, for any such $T$, the following nonlinear relationship is satisfied:
(14)
in which $(X_i)_T$ represents the submatrix of $X_i$ whose rows are indexed by $T$, the corresponding indicator set, and $(X_i)_{T^c}$ indicates the submatrix of $X_i$ with the complement $T^c$ as the indicator set. The gradient of $h$ restricted to $T$ can be expressed as

(15)
where $\nabla^2_T h$ represents the Hessian submatrix with the indicator set $T$. Define

(16)
where $D$ represents the descent direction. The minimum can be obtained using a sparse proximal gradient (SpaPG) step. The descent direction $D$ is obtained from

(17)
Then, the $(t+1)$th iterate $X_i^{t+1}$ can be represented as
(18)
in which the step size is $\beta^p$ with $\beta \in (0, 1)$, while $p$ is the smallest positive integer that satisfies the following line-search condition:

(19)
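As a concrete illustration of the $X_i$ update, the sketch below implements the two ingredients described above: selecting the support set $T$ as the $s$ rows of largest $\ell_2$-norm (row hard thresholding) and an Armijo-type backtracking step. The objective and gradient handles and all names are illustrative assumptions, not the paper's exact routines:

```python
import numpy as np

def row_hard_threshold(X, s):
    # Keep the s rows with the largest l2-norm (the support set T),
    # zero out the remaining rows: projection for ||X||_{2,0} <= s.
    norms = np.linalg.norm(X, axis=1)
    T = np.argsort(norms)[-s:]
    out = np.zeros_like(X)
    out[T] = X[T]
    return out

def armijo_step(h, grad, X, D, beta=0.5, sigma=1e-4, max_iter=30):
    # Find the smallest integer p with sufficient decrease along D, i.e.
    # h(X + beta**p * D) <= h(X) + sigma * beta**p * <grad(X), D>.
    gD = np.sum(grad(X) * D)
    for p in range(max_iter):
        t = beta ** p
        if h(X + t * D) <= h(X) + sigma * t * gD:
            return t
    return beta ** max_iter

# Toy usage: gradient step on h(X) = 0.5 * ||X - A||_F^2, then re-project.
A = np.random.default_rng(1).normal(size=(8, 3))
h = lambda X: 0.5 * np.linalg.norm(X - A) ** 2
grad = lambda X: X - A
X = np.zeros((8, 3))
D = -grad(X)                        # descent direction
t = armijo_step(h, grad, X, D)
X = row_hard_threshold(X + t * D, s=4)
print(np.count_nonzero(np.linalg.norm(X, axis=1)))  # 4 nonzero rows
```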
3.2.2. Updating Y
Let $\mathcal{M} = \{Y \in \mathbb{R}^{d \times m} : Y^\top Y = I_m\}$; then, the tangent space of the manifold $\mathcal{M}$ at $Y$ is expressed as $T_Y\mathcal{M} = \{V \in \mathbb{R}^{d \times m} : V^\top Y + Y^\top V = 0\}$. Assuming the objective function of the $Y$-subproblem is $g$, it has the following approximation function:
(20)
where $\tau > 0$ is a proximal parameter. To obtain the descent direction $D$, define

(21)
Based on the definition of the tangent space $T_Y\mathcal{M}$, Equation (21) can be represented as
(22)
Based on Equation (22), the Lagrange function can be obtained, which is written as
(23)
in which $\Lambda$ is the Lagrange multiplier. Then, the corresponding Karush–Kuhn–Tucker (KKT) system for the Lagrangian function above is represented as

(24)
By synthesizing Equation (24), the optimization problem for the descent direction $D$ can be obtained, which is written as
(25)
Equation (25) can be solved using the manifold proximal gradient (ManPG) algorithm, and the $(t+1)$th iterate of $Y$ can be represented as
(26)
in which the mapping $\mathrm{Retr}(\cdot)$ represents a retraction, which maps vectors in the tangent space back to the manifold, allowing the iterates to maintain orthogonality during the optimization. In Equation (26), the step size is $\beta^q$, and $q$ is the smallest positive integer that satisfies the following condition:

(27)
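A common concrete choice for the retraction in Equation (26) is the QR-based retraction on the Stiefel manifold, sketched below under the assumption that $\mathcal{M} = \{Y : Y^\top Y = I\}$; the projection of a Euclidean gradient onto the tangent space is also shown, since ManPG-type updates move along tangent directions before retracting. The names are illustrative:

```python
import numpy as np

def qr_retraction(Y, V):
    # Retract Y + V back onto the Stiefel manifold {Y : Y^T Y = I}
    # via the Q factor of a QR decomposition (signs fixed for uniqueness).
    Q, R = np.linalg.qr(Y + V)
    return Q * np.sign(np.diag(R))

def tangent_projection(Y, G):
    # Project an ambient matrix G onto the tangent space at Y:
    # T_Y = {V : Y^T V + V^T Y = 0}.
    YtG = Y.T @ G
    return G - Y @ (YtG + YtG.T) / 2

rng = np.random.default_rng(2)
Y, _ = np.linalg.qr(rng.normal(size=(6, 2)))   # a point on St(6, 2)
G = rng.normal(size=(6, 2))                    # e.g., a Euclidean gradient
D = tangent_projection(Y, -G)                  # tangent descent direction
Y_next = qr_retraction(Y, 0.1 * D)
print(np.allclose(Y_next.T @ Y_next, np.eye(2)))  # True: orthogonality kept
```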
3.3. Convergence Analysis
According to the updates of $X_i$ and $Y$, the optimization algorithm for Equation (11) can be expressed as Algorithm 1. In addition, according to the literature [36], if the iterate sequence satisfies
(28)
then it can be considered a stationary point of Equation (11). The experimental verification of the convergence analysis can be found in Section 4.5.

3.4. Complexity Analysis
To verify how distributed sparse constraints can enhance the performance of existing methods, this section compares the complexity and computational cost of the proposed method with that of the baseline LDA method. For Equation (9), given original data with dimensionality $d$, $n$ samples, and a target dimensionality $m$, the computational complexity of evaluating the objective function is $O(d^2 m)$. The sparse constraint, which checks the number of non-zero rows, has a complexity of $O(dm)$. The manifold constraint implies that $X$ has orthonormal columns; enforcing orthogonality through methods such as QR decomposition has a complexity of $O(dm^2)$. Therefore, the overall per-iteration complexity of the proposed method is $O(d^2 m)$. In contrast, the computational complexity of traditional LDA-based methods is primarily determined by calculating the within-class scatter matrix $S_w$, the between-class scatter matrix $S_b$, and solving the generalized eigenvalue problem. The complexity of computing $S_w$ is $O(nd^2)$, while that of computing $S_b$ is $O(kd^2)$, due to the calculation based on class means and the global mean. The complexity of solving the eigenvalues and eigenvectors of $S_w^{-1} S_b$ is $O(d^3)$, which is the most time-consuming part of LDA. The overall complexity of LDA is therefore $O(nd^2 + d^3)$. The proposed distributed sparse constraint method demonstrates superior computational efficiency over traditional LDA methods by reducing the dominant cost from $O(nd^2 + d^3)$ to $O(d^2 m)$ per iteration through the enforcement of sparsity and orthogonality constraints, thereby eliminating the most time-consuming generalized eigenvalue problem in LDA.
Algorithm 1 Optimization algorithm for (11)
Input: data X; parameters s, l, ρ_i, β. Initialize: distributed data blocks X_i; iteration counter t = 0. Output: data Y.
While not converged do
    Update each local variable X_i via Algorithm 2.
    Update the global variable Y via Algorithm 3.
    Check the stationarity condition (28).
End while
Algorithm 2 Optimization algorithm for (12)
Input: data X; parameters s, ρ_i, β. Initialize: X_i^0, t = 0. Output: X_i.
While not converged do
    Compute the gradient (12) and Hessian (13), and determine the support set T.
    Compute the descent direction D from (17).
    Determine the step size via the line search (19) and update X_i via (18).
End while
Algorithm 3 Optimization algorithm for (20)
Input: parameters τ, β. Initialize: Y^0, t = 0. Output: Y.
While not converged do
    Compute the descent direction D by solving (25).
    Determine the step size via the line search (27) and update Y via the retraction (26).
End while
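Putting Algorithms 1-3 together, the following high-level sketch shows the alternating structure described in Section 3.2: each node performs a sparse (row hard-thresholding) update of its local variable $X_i$, the global variable $Y$ takes a tangent step followed by a retraction, and the quadratic penalty couples the two. It reuses the helper functions sketched in Sections 3.2.1 and 3.2.2 and is a structural illustration under our own assumptions, not the authors' exact implementation:

```python
import numpy as np

def dsclda_loop(grads, Y0, s, rho, step=0.1, max_iter=100, tol=1e-6):
    """Alternating sketch for penalty problem (11).
    grads[i](X) returns the gradient of the local loss f_i at X;
    Y0 is an initial point with orthonormal columns."""
    l = len(grads)
    Y = Y0.copy()
    Xs = [Y.copy() for _ in range(l)]
    for _ in range(max_iter):
        Y_old = Y.copy()
        # Local updates: gradient step on f_i + (rho/2)||X_i - Y||_F^2,
        # followed by row hard-thresholding (Section 3.2.1).
        for i in range(l):
            g = grads[i](Xs[i]) + rho * (Xs[i] - Y)
            Xs[i] = row_hard_threshold(Xs[i] - step * g, s)
        # Global update: tangent step on (rho/2) * sum_i ||X_i - Y||_F^2,
        # then retraction to preserve orthogonality (Section 3.2.2).
        gY = rho * sum(Y - Xi for Xi in Xs)
        Y = qr_retraction(Y, tangent_projection(Y, -step * gY))
        if np.linalg.norm(Y - Y_old) < tol:   # simple stationarity proxy
            break
    return Xs, Y
```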
4. Simulation Studies
In the experiments, DSCLDA was compared with traditional LDA and seven LDA variants: AFLDA [17], ERSLDA [37], RSLDA+IIKC [15], RSMDA [24], RSLDA [14], SULDA [13], and SLDA [12]. The optimization problem and constraint of each method are shown in Table 1. The datasets used in the experiments in this paper are shown in Table 2, and examples from each dataset are shown in Figure 1. In this experiment, a self-built vehicle dataset, called the Car_image dataset, was introduced.
4.1. Experiment Settings
Because the datasets were divided into D parts as data nodes for distributed computing during the experiments, the prefix "D" was added to all method names to indicate the distributed setting, as in DERSLDA and DRSLDA. In the simulation verification, each method was executed 10 times, with different random samples selected from the same dataset for each run; the average classification accuracy was then calculated. To improve computational efficiency, all datasets were first converted into grayscale images. In addition, to improve computational efficiency and achieve better classification accuracy, this experiment used PCA to reduce the dimensionality of all image datasets, retaining 95% of the original data information. Furthermore, because the images in the Car_image dataset are large and of inconsistent sizes, they were resized to a unified resolution.
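The PCA preprocessing step described above can be reproduced with scikit-learn, where passing a float to n_components keeps just enough components to retain that fraction of the variance; the data here are placeholders for the flattened grayscale images:

```python
import numpy as np
from sklearn.decomposition import PCA

# X_train, X_test: flattened grayscale images, shape (n_samples, n_pixels).
rng = np.random.default_rng(0)
X_train = rng.normal(size=(200, 1024))   # placeholder data
X_test = rng.normal(size=(50, 1024))

pca = PCA(n_components=0.95)             # keep 95% of the variance
X_train_p = pca.fit_transform(X_train)
X_test_p = pca.transform(X_test)         # reuse the training-set basis
print(X_train_p.shape[1], "components retained")
```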
In selecting the experimental parameters, the two regularization parameters in Equation (11) were tuned through ten-fold cross-validation according to the content and size of each dataset. Prior to numerical validation, a strategy of fixing the value of one parameter while varying the other was employed to ascertain the corresponding accuracy for each configuration, serving as a basis for evaluation. The experimental results on the COIL20 image dataset are shown in Figure 2. Based on these results, suitable ranges of the two parameters can be determined to achieve better image processing performance. Specifically, for the COIL20 image dataset, the most suitable parameter combination was identified in this way, and a similar parameter selection procedure was applied to the other datasets under investigation. In addition, the stopping criterion in this experiment was that 100 iterations were reached or the overall objective function value fell below a preset threshold.
4.2. Experiment Based on Sample Size
This experiment used the k-nearest neighbors (KNN) classifier to analyze the classification accuracy of the dimensionality reduction results of the various methods. The KNN classifier is a supervised machine learning algorithm that assigns a new data point to the class most common among its k nearest neighbors in the feature space, based on a distance metric such as the Euclidean distance. In this experiment, four different sample sizes were randomly selected for each dataset as the training set, and the remaining samples were used as the testing set. The classification results under different sample sizes are shown in Table 3, where the highest results are highlighted in bold. The simple image datasets used in the experiment, namely the Mnist, Hand Gesture Recognition, and COIL20 datasets, have simple content, a monotonous background, and obvious features; therefore, each method could achieve good classification performance on these three datasets. The image features of the NEU surface defect, Car_image, and Caltech-101 datasets are relatively complex or have a high proportion of background, resulting in relatively low classification accuracy for each model on these datasets. However, the experimental results show that the DSCLDA model still improved the classification performance compared to the other methods on these datasets.
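For reference, the KNN evaluation protocol described above looks like the following with scikit-learn (Euclidean distance is the default metric); the digits data stand in for the reduced features produced by any of the compared projections:

```python
from sklearn.datasets import load_digits
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier

# Stand-in for a projected dataset: digit features after some reduction.
X, y = load_digits(return_X_y=True)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, train_size=100,
                                          stratify=y, random_state=0)

knn = KNeighborsClassifier(n_neighbors=5)  # Euclidean distance by default
knn.fit(X_tr, y_tr)
print("accuracy: %.4f" % knn.score(X_te, y_te))
```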
Compared with other methods, DSCLDA improved by at least on the Mnist dataset; improved by at least on the Hand Gesture Recognition dataset; improved by at least on the COIL20 image dataset; improved by at least on the NEU surface defect dataset; improved by at least on the Car_image dataset; and improved by at least on the Caltech-101 image dataset. The classification performance of DSCLDA was further improved on two difficult datasets, namely the NEU surface defect dataset and the Car_image dataset. The results can be explained by the fact that the DSCLDA model, which simultaneously extracts features from both global and local structures, can obtain more representative feature data when processing complex images or images with unclear features, thereby achieving better classification performance. The experimental results also demonstrate that DSCLDA divides process monitoring data into multiple data nodes and performs distributed parallel operations, which not only improves computational efficiency but also provides better adaptation to the processing needs of large-scale data.
Compared to other methods, the average classification performance of the proposed DSCLDA method improved by at least , which proves that the proposed method achieves satisfactory classification performance by introducing joint sparse and manifold constraints. In addition, compared with DRSLDA, DRSMDA, DRSLDA+IIKC, and DERSLDA, the proposed DSCLDA still had a significant improvement, indicating that the proposed method can demonstrate advantages when compared with some of the latest SLDA variants.
4.3. Experiment Based on the Number of Dimensions
In this experiment, a fixed number of training samples per class was selected on each of the six public image datasets, and the remaining samples were used as testing sets, with dimensions ranging from 5 to 200. The classification experiment results are shown in Figure 3, Figure 4, Figure 5, Figure 6, Figure 7 and Figure 8. The experimental results indicate that the proposed DSCLDA method achieved relatively better classification performance on the six publicly available datasets mentioned above. From the experimental results, it can be seen that the classification performance of DLDA and DSLDA is very sensitive to the choice of dimensionality. As the dimensionality increases, the classification performance of these two methods may even decrease. However, the proposed DSCLDA method can still maintain classification accuracy in the presence of dimensional changes. The experimental results demonstrate the flexibility of the proposed method in dimension selection. For the NEU surface defect and Caltech-101 image datasets, the classification performance of DSCLDA did not show significant improvement compared to that of the other methods, because the features of these two datasets are relatively complex and not clear enough, resulting in similar classification results for the above methods. However, on several other publicly available datasets, the proposed DSCLDA method still performed relatively better in terms of classification accuracy. DSCLDA regularizes local features by constructing $\ell_{2,0}$-norm sparse constraints, preserves the local structure of variables, and utilizes manifold constraints to capture causal correlations between global variables, reducing the loss of data structure during projection.
4.4. Experiments with Deep Learning Methods
Deep learning methods, such as Transformer-based feature extraction models, provide new perspectives and powerful tools for feature extraction and dimensionality reduction and can provide valuable benchmarks. These deep learning methods typically have better feature-learning capabilities and stronger robustness and can achieve excellent performance on large-scale datasets. Therefore, this paper also compares DSCLDA with deep learning-based dimensionality reduction techniques, namely R3D-CNN [43], I3D [44], and Transformer [45], to demonstrate its broader applicability in different scenarios. Through experiments on the Hand Gesture Recognition (HGR) and CIFAR-100 [46] datasets, we validated the advantages of DSCLDA in feature extraction and dimensionality reduction, as well as its competitiveness with deep learning methods. The results in Table 4 suggest that on the HGR dataset, which has certain limitations in terms of the number of samples, diversity, and representativeness, simpler and more traditional models such as DSCLDA may perform better because of their lower complexity. On the other hand, models supported by deep learning methods may be more suitable for handling large and complex datasets, as they can capture more subtle patterns and relationships.
4.5. Convergence Analysis
This section describes the experimental verification of the convergence analysis presented in Section 3.3. In the proposed DSCLDA method, the most computationally expensive step is the calculation of the projection matrix X, in which solving the matrix inverse dominates the cost and significantly affects the computational efficiency of DSCLDA. In this experiment, the computational efficiency of DSCLDA was reflected in the rate of decrease of the objective function value and the convergence speed of the classification accuracy. To visually demonstrate the convergence of the proposed DSCLDA method, Figure 9 shows the curves of the objective function value and the classification accuracy. As the number of iterations increased, the objective function value of the proposed DSCLDA method decreased rapidly and reached its minimum, and the classification accuracy reached its maximum and converged within 30 iterations. The experimental results validate the fast convergence of DSCLDA.
4.6. t-SNE Comparison
In addition, to further validate the principle and effectiveness of the proposed method, the t-SNE method was utilized to visualize the data distribution before and after projection. The experiment used the first five classes of the Mnist dataset and randomly selected 100 samples per class as the training set, with the remaining samples as the testing set. The corresponding classification accuracies were 85.85% (DRSLDA), 90.10% (DERSLDA), and 90.20% (DSCLDA). The experimental results are shown in Figure 10. It can be seen that, without projection, the inter-class and intra-class separation of the Mnist dataset is not significant. Projection with the DRSLDA method reduces the intra-class distance and increases the inter-class distance, but DRSLDA cannot fully separate all the data, and its distribution is not satisfactory. With the introduction of sparse constraints, the inter-class distance between different classes becomes larger, while the intra-class spacing becomes smaller. In the t-SNE distribution of the DERSLDA method, the intra-class spacing is relatively small, but the inter-class distance is not large enough, so data confusion is still possible during classification. In the t-SNE distribution of the proposed DSCLDA method, the inter-class distance is the largest and the intra-class spacing is the smallest, for example, between class 1 and class 2, which is more conducive to determining the data category during classification. The experimental results show that the proposed method has relatively better classification performance.
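The visualization in Figure 10 can be reproduced along these lines with scikit-learn's TSNE; the digits data are placeholders for the projected Mnist features of each compared method:

```python
import matplotlib.pyplot as plt
from sklearn.datasets import load_digits
from sklearn.manifold import TSNE

# Stand-in for projected Mnist features: the first five digit classes.
X, y = load_digits(return_X_y=True)
mask = y < 5
emb = TSNE(n_components=2, perplexity=30,
           random_state=0).fit_transform(X[mask])

plt.scatter(emb[:, 0], emb[:, 1], c=y[mask], cmap="tab10", s=8)
plt.title("t-SNE of the first five classes")
plt.show()
```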
5. Conclusions
In this paper, we constructed a novel distributed sparse manifold constraint (DSC) and a novel LDA variant, called DSCLDA. The proposed method trains discriminative projections by introducing manifold constraints and $\ell_{2,0}$-norm sparse constraints, which can obtain the most discriminative features for process monitoring. In addition, we designed a novel manifold proximal gradient algorithm to handle the proposed optimization model, while distributed parallel computing significantly improves computational efficiency. The advantages of the DSC and DSCLDA have been demonstrated through numerical experiments on several public datasets. Compared with other existing LDA methods, the proposed DSCLDA method improves the image classification accuracy and also has significant advantages in convergence and feature distribution.
However, the proposed method currently has limitations in terms of image processing efficiency and feature classification accuracy. In the future, we will attempt to combine the proposed method with deep learning techniques to improve both. Furthermore, deployment on hardware platforms may be constrained by computational complexity and insufficient flexibility, so further optimization will be needed to enhance the processing efficiency and applicability of the method before it is deployed on hardware.
Author Contributions: Methodology, M.F. and J.L.; software, M.F.; writing—original draft, M.F. and J.L.; writing—review and editing, Y.Z. and X.C. All authors have read and agreed to the published version of the manuscript.
All datasets used are available online with open access.
The authors declare no conflicts of interest.
Figure 1. Some image examples from the datasets used in the experiment. (a) Mnist, (b) Hand Gesture Recognition, (c) COIL20, (d) NEU surface defects, (e) Car_image, (f) Caltech-101.
Figure 2. Parameter cross-validation on the COIL20 image dataset. The two parameters are those in Equation (11). In this figure, green indicates high values and blue indicates low values.
Figure 3. Classification accuracy on the Mnist dataset. (a) Number of samples: 50; (b) number of samples: 100.
Figure 4. Classification accuracy on the Hand Gesture Recognition dataset. (a) Number of samples: 4; (b) number of samples: 6.
Figure 5. Classification accuracy on COIL20 image dataset. (a) Number of samples: 4; (b) number of samples: 6.
Figure 6. Classification accuracy on the NEU surface defect dataset. (a) Number of samples: 50; (b) number of samples: 100.
Figure 7. Classification accuracy on the Car_image dataset. (a) Number of samples: 10; (b) number of samples: 20.
Figure 8. Classification accuracy on the Caltech-101 image dataset. (a) Number of samples: 10; (b) number of samples: 20.
Figure 9. The relationship between the objective function value, classification accuracy, and the number of iterations. (a) Mnist, (b) Hand Gesture Recognition, (c) COIL20, (d) NEU surface defects, (e) Car_image, (f) Caltech-101.
Figure 10. The data distribution displayed using the t-SNE method. The images correspond to (a) the local data of the original Mnist dataset; (b) the distribution of the corresponding data after DRSLDA projection; (c) the distribution of the corresponding data after DERSLDA projection; (d) and the corresponding data distribution projected through DSCLDA.
Information on all comparison methods used in this experiment. The bold method is the proposed method.
| Method | Optimization Problem | Constraint |
|---|---|---|
| LDA | | |
| SLDA | | |
| SULDA | | |
| RSLDA | | |
| RSMDA | | |
| RSLDA+IIKC | | |
| ERSLDA | | |
| **DSCLDA** | | |
Information related to the datasets used in this experiment.

| Dataset | Image Types | Images | Color Type | Original Resolution |
|---|---|---|---|---|
| Mnist [38] | 10 | 60,000 | Gray | |
| Hand Gesture Recognition [39] | 10 | 20,000 | Gray | |
| Coil20 [40] | 20 | 1440 | Gray | |
| NEU surface defects [41] | 6 | 1200 | Gray | |
| Car_image | 10 | 200 | RGB | |
| Caltech-101 [42] | 101 | 9146 | RGB and gray | About |
The classification accuracy obtained on six datasets. The bold value represents the highest value of the column.
| Methods | Mnist | Hand Gesture Recognition | COIL20 | |||||||||
|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 10 | 50 | 100 | 200 | 4 | 5 | 6 | 7 | 3 | 6 | 9 | 12 | |
| DLDA | 75.20 | 85.59 | 84.02 | 83.78 | 75.95 | 80.38 | 87.81 | 89.35 | 65.29 | 74.97 | 81.58 | 80.40 |
| DSLDA | 80.42 | 85.64 | 84.85 | 84.46 | 81.50 | 83.69 | 90.69 | 92.23 | 70.84 | 78.28 | 84.46 | 84.02 |
| DSULDA | 87.78 | 87.34 | 86.37 | 93.38 | 84.76 | 88.62 | 90.11 | 91.65 | 74.10 | 83.21 | 83.88 | 86.37 |
| DRSLDA | 84.03 | 85.85 | 88.40 | 96.58 | 87.40 | 89.73 | 88.98 | 90.52 | 76.74 | 84.32 | 82.75 | 86.75 |
| DRSMDA | 83.62 | 86.56 | 90.77 | 97.51 | 87.62 | 85.49 | 90.85 | 92.39 | 76.96 | 80.08 | 84.62 | 87.30 |
| DRSLDA+IIKC | 73.27 | 83.85 | 85.92 | 96.94 | 90.30 | 90.86 | 92.89 | 94.43 | 79.64 | 85.45 | 86.66 | 86.94 |
| DERSLDA | 86.77 | 90.10 | 92.06 | 97.62 | 88.12 | 90.26 | 89.81 | 91.35 | 77.46 | 84.85 | 83.58 | 88.73 |
| DAFLDA | 85.91 | 89.11 | 90.84 | 96.34 | 87.93 | 88.28 | 88.76 | 90.67 | 76.24 | 82.17 | 83.36 | 85.23 |
| DSCLDA | 86.92 | 90.20 | 92.94 | 97.82 | 90.37 | 91.39 | 93.48 | 95.02 | 79.71 | 85.98 | 87.25 | 90.95 |
| Methods | NEU Surface Defects | Car_image | Caltech-101 | |||||||||
| 25 | 50 | 75 | 100 | 10 | 15 | 20 | 25 | 10 | 15 | 20 | 25 | |
| DLDA | 41.27 | 38.87 | 43.26 | 48.08 | 20.64 | 25.00 | 34.50 | 44.77 | 51.54 | 58.16 | 62.82 | 65.21 |
| DSLDA | 43.03 | 44.93 | 48.15 | 50.53 | 37.03 | 39.93 | 42.15 | 42.08 | 55.56 | 67.60 | 70.02 | 74.32 |
| DSULDA | 42.18 | 48.73 | 52.45 | 54.82 | 37.18 | 42.73 | 47.45 | 48.82 | 67.89 | 77.09 | 80.69 | 86.11 |
| DRSLDA | 42.73 | 46.20 | 50.89 | 56.50 | 36.73 | 40.20 | 44.89 | 50.50 | 69.22 | 83.60 | 83.70 | 87.04 |
| DRSMDA | 52.18 | 53.33 | 57.85 | 60.83 | 47.18 | 47.33 | 51.85 | 54.83 | 71.86 | 83.33 | 84.51 | 86.25 |
| DRSLDA+IIKC | 52.30 | 57.80 | 61.70 | 64.92 | 46.30 | 51.80 | 55.70 | 59.92 | 74.45 | 87.52 | 90.32 | 91.02 |
| DERSLDA | 47.52 | 54.53 | 55.85 | 62.92 | 42.52 | 49.53 | 50.85 | 56.92 | 73.20 | 85.10 | 85.20 | 88.25 |
| DAFLDA | 45.64 | 50.91 | 52.68 | 56.12 | 40.58 | 46.37 | 48.25 | 53.76 | 70.45 | 83.68 | 84.71 | 85.99 |
| DSCLDA | 54.55 | 57.80 | 62.22 | 65.58 | 50.32 | 53.57 | 57.99 | 61.35 | 74.79 | 88.28 | 90.98 | 91.47 |
The accuracy of DSCLDA, R3D-CNN, I3D, and Transformer on the HGR and CIFAR-100 datasets. The bold value represents the highest value of the column.
| HGR | | CIFAR-100 | |
|---|---|---|---|
| Method | Acc. (%) | Method | Acc. (%) |
| DSCLDA | 90.37 | DSCLDA | 63.45 |
| R3D-CNN | 83.80 | R3D-CNN | 90.62 |
| I3D | 85.70 | I3D | 94.82 |
| Transformer | 87.60 | Transformer | 95.03 |
References
1. Yu, W.; Zhu, Q.; Zheng, N.; Huang, J.; Zhou, M.; Zhao, F. Learning non-uniform-sampling for ultra-high-definition image enhancement. Proceedings of the 31st ACM International Conference on Multimedia; Ottawa, ON, Canada, 29 October–3 November 2023; pp. 1412-1421.
2. Yu, X.; Dai, P.; Li, W.; Ma, L.; Shen, J.; Li, J.; Qi, X. Towards efficient and scale-robust ultra-high-definition image demoiréing. Proceedings of the European Conference on Computer Vision; Tel Aviv, Israel, 23–27 October 2022; pp. 646-662.
3. McLachlan, G.J. Discriminant Analysis and Statistical Pattern Recognition; John Wiley & Sons: Hoboken, NJ, USA, 2005.
4. Ullah, S.; Ahmad, Z.; Kim, J.M. Fault Diagnosis of a Multistage Centrifugal Pump Using Explanatory Ratio Linear Discriminant Analysis. Sensors; 2024; 24, 1830. [DOI: https://dx.doi.org/10.3390/s24061830] [PubMed: https://www.ncbi.nlm.nih.gov/pubmed/38544093]
5. Mai, Q.; Zou, H. A note on the connection and equivalence of three sparse linear discriminant analysis methods. Technometrics; 2013; 55, pp. 243-246. [DOI: https://dx.doi.org/10.1080/00401706.2012.746208]
6. Ye, J.; Xiong, T. Null space versus orthogonal linear discriminant analysis. Proceedings of the 23rd International Conference on Machine Learning; Pittsburgh, PA, USA, 25–29 June 2006; pp. 1073-1080.
7. Ye, J.; Janardan, R.; Li, Q.; Park, H. Feature reduction via generalized uncorrelated linear discriminant analysis. IEEE Trans. Knowl. Data Eng.; 2006; 18, pp. 1312-1322.
8. Shi, Y.; Huang, W.; Ye, H.; Ruan, C.; Xing, N.; Geng, Y.; Dong, Y.; Peng, D. Partial least square discriminant analysis based on normalized two-stage vegetation indices for mapping damage from rice diseases using PlanetScope datasets. Sensors; 2018; 18, 1901. [DOI: https://dx.doi.org/10.3390/s18061901]
9. Bach, F. High-dimensional analysis of double descent for linear regression with random projections. SIAM J. Math. Data Sci.; 2024; 6, pp. 26-50. [DOI: https://dx.doi.org/10.1137/23M1558781]
10. Xu, H.L.; Chen, G.Y.; Cheng, S.Q.; Gan, M.; Chen, J. Variable projection algorithms with sparse constraint for separable nonlinear models. Control Theory Technol.; 2024; 22, pp. 135-146. [DOI: https://dx.doi.org/10.1007/s11768-023-00194-3]
11. Zhang, L.; Wei, Y.; Liu, J.; Wu, J.; An, D. A hyperspectral band selection method based on sparse band attention network for maize seed variety identification. Expert Syst. Appl.; 2024; 238, 122273. [DOI: https://dx.doi.org/10.1016/j.eswa.2023.122273]
12. Clemmensen, L.; Hastie, T.; Witten, D.; Ersbøll, B. Sparse discriminant analysis. Technometrics; 2011; 53, pp. 406-413. [DOI: https://dx.doi.org/10.1198/TECH.2011.08118]
13. Zhang, X.; Chu, D.; Tan, R.C. Sparse uncorrelated linear discriminant analysis for undersampled problems. IEEE Trans. Neural Netw. Learn. Syst.; 2015; 27, pp. 1469-1485. [DOI: https://dx.doi.org/10.1109/TNNLS.2015.2448637]
14. Wen, J.; Fang, X.; Cui, J.; Fei, L.; Yan, K.; Chen, Y.; Xu, Y. Robust sparse linear discriminant analysis. IEEE Trans. Circuits Syst. Video Technol.; 2018; 29, pp. 390-403. [DOI: https://dx.doi.org/10.1109/TCSVT.2018.2799214]
15. Li, S.; Zhang, H.; Ma, R.; Zhou, J.; Wen, J.; Zhang, B. Linear discriminant analysis with generalized kernel constraint for robust image classification. Pattern Recognit.; 2023; 136, 109196. [DOI: https://dx.doi.org/10.1016/j.patcog.2022.109196]
16. Sheikhpour, R.; Berahmand, K.; Mohammadi, M.; Khosravi, H. Sparse feature selection using hypergraph Laplacian-based semi-supervised discriminant analysis. Pattern Recognit.; 2025; 157, 110882. [DOI: https://dx.doi.org/10.1016/j.patcog.2024.110882]
17. Wang, J.; Yin, H.; Nie, F.; Li, X. Adaptive and fuzzy locality discriminant analysis for dimensionality reduction. Pattern Recognit.; 2024; 151, 110382. [DOI: https://dx.doi.org/10.1016/j.patcog.2024.110382]
18. Vivekanand, V.; Mishra, D. Framework for Segmented threshold L0 gradient approximation based network for sparse signal recovery. Neural Netw.; 2023; 162, pp. 425-442.
19. Chen, K.; Che, H.; Li, X.; Leung, M.F. Graph non-negative matrix factorization with alternative smoothed L0 regularizations. Neural Comput. Appl.; 2023; 35, pp. 9995-10009. [DOI: https://dx.doi.org/10.1007/s00521-022-07200-w]
20. Wang, J.; Wang, H.; Nie, F.; Li, X. Sparse feature selection via fast embedding spectral analysis. Pattern Recognit.; 2023; 139, 109472. [DOI: https://dx.doi.org/10.1016/j.patcog.2023.109472]
21. Chen, D.W.; Miao, R.; Yang, W.Q.; Liang, Y.; Chen, H.H.; Huang, L.; Deng, C.J.; Han, N. A feature extraction method based on differential entropy and linear discriminant analysis for emotion recognition. Sensors; 2019; 19, 1631. [DOI: https://dx.doi.org/10.3390/s19071631]
22. Zheng, W.; Lu, S.; Yang, Y.; Yin, Z.; Yin, L. Lightweight transformer image feature extraction network. PeerJ Comput. Sci.; 2024; 10, e1755. [DOI: https://dx.doi.org/10.7717/peerj-cs.1755]
23. Zhou, J.; Zhang, Q.; Zeng, S.; Zhang, B.; Fang, L. Latent linear discriminant analysis for feature extraction via isometric structural learning. Pattern Recognit.; 2024; 149, 110218. [DOI: https://dx.doi.org/10.1016/j.patcog.2023.110218]
24. Wang, J.; Liu, Z.; Zhang, K.; Wu, Q.; Zhang, M. Robust sparse manifold discriminant analysis. Multimed. Tools Appl.; 2022; 81, pp. 20781-20796. [DOI: https://dx.doi.org/10.1007/s11042-022-12708-3]
25. Zhang, K.; Zheng, D.; Li, J.; Gao, X.; Lu, J. Coupled discriminative manifold alignment for low-resolution face recognition. Pattern Recognit.; 2024; 147, 110049. [DOI: https://dx.doi.org/10.1016/j.patcog.2023.110049]
26. Chen, S.; Ma, S.; Man-Cho So, A.; Zhang, T. Proximal gradient method for nonsmooth optimization over the Stiefel manifold. SIAM J. Optim.; 2020; 30, pp. 210-239. [DOI: https://dx.doi.org/10.1137/18M122457X]
27. Xiao, N.; Liu, X.; Yuan, Y.x. Exact Penalty Function for L2,1 Norm Minimization over the Stiefel Manifold. SIAM J. Optim.; 2021; 31, pp. 3097-3126. [DOI: https://dx.doi.org/10.1137/20M1354313]
28. Wang, L.; Liu, X. Decentralized optimization over the Stiefel manifold by an approximate augmented Lagrangian function. IEEE Trans. Signal Process.; 2022; 70, pp. 3029-3041. [DOI: https://dx.doi.org/10.1109/TSP.2022.3182883]
29. Beck, A.; Eldar, Y.C. Sparsity constrained nonlinear optimization: Optimality conditions and algorithms. SIAM J. Optim.; 2013; 23, pp. 1480-1509. [DOI: https://dx.doi.org/10.1137/120869778]
30. Zhou, S.; Xiu, N.; Qi, H.D. Global and quadratic convergence of Newton hard-thresholding pursuit. J. Mach. Learn. Res.; 2021; 22, pp. 1-45.
31. Li, G.; Qin, S.J.; Zhou, D. Geometric properties of partial least squares for process monitoring. Automatica; 2010; 46, pp. 204-210. [DOI: https://dx.doi.org/10.1016/j.automatica.2009.10.030]
32. Liu, Y.; Zeng, J.; Xie, L.; Luo, S.; Su, H. Structured joint sparse principal component analysis for fault detection and isolation. IEEE Trans. Ind. Inform.; 2018; 15, pp. 2721-2731. [DOI: https://dx.doi.org/10.1109/TII.2018.2868364]
33. Chen, Z.; Ding, S.X.; Peng, T.; Yang, C.; Gui, W. Fault detection for non-Gaussian processes using generalized canonical correlation analysis and randomized algorithms. IEEE Trans. Ind. Electron.; 2017; 65, pp. 1559-1567. [DOI: https://dx.doi.org/10.1109/TIE.2017.2733501]
34. Li, H.; Liu, D.; Wang, D. Manifold regularized reinforcement learning. IEEE Trans. Neural Netw. Learn. Syst.; 2017; 29, pp. 932-943. [DOI: https://dx.doi.org/10.1109/TNNLS.2017.2650943]
35. Li, J.; Ma, S. Federated learning on Riemannian manifolds. arXiv; 2022; arXiv: 2206.05668
36. Rockafellar, R.T.; Wets, R.J.B. Variational Analysis; Springer Science & Business Media: Berlin, Germany, 2009; Volume 317.
37. Liu, J.; Feng, M.; Xiu, X.; Liu, W.; Zeng, X. Efficient and Robust Sparse Linear Discriminant Analysis for Data Classification. IEEE Trans. Emerg. Top. Comput. Intell.; 2024; 9, pp. 617-629. [DOI: https://dx.doi.org/10.1109/TETCI.2024.3403912]
38. Deng, L. The mnist database of handwritten digit images for machine learning research [best of the web]. IEEE Signal Process. Mag.; 2012; 29, pp. 141-142. [DOI: https://dx.doi.org/10.1109/MSP.2012.2211477]
39. Mantecón, T.; del Blanco, C.R.; Jaureguizar, F.; García, N. Hand gesture recognition using infrared imagery provided by leap motion controller. Proceedings of the Advanced Concepts for Intelligent Vision Systems: 17th International Conference, ACIVS 2016; Lecce, Italy, 24–27 October 2016; Proceedings 17 Springer: Berlin/Heidelberg, Germany, 2016; pp. 47-57.
40. Nene, S.A.; Nayar, S.K.; Murase, H. Columbia Object Image Library (Coil-20); Department of Computer Science, Columbia University: New York, NY, USA, 1996.
41. Bao, Y.; Song, K.; Liu, J.; Wang, Y.; Yan, Y.; Yu, H.; Li, X. Triplet-graph reasoning network for few-shot metal generic surface defect segmentation. IEEE Trans. Instrum. Meas.; 2021; 70, pp. 1-11. [DOI: https://dx.doi.org/10.1109/TIM.2021.3083561]
42. Kinnunen, T.; Kamarainen, J.K.; Lensu, L.; Lankinen, J.; Kälviäinen, H. Making visual object categorization more challenging: Randomized caltech-101 data set. Proceedings of the 2010 20th International Conference on Pattern Recognition; Istanbul, Turkey, 23–26 August 2010; pp. 476-479.
43. Molchanov, P.; Yang, X.; Gupta, S.; Kim, K.; Tyree, S.; Kautz, J. Online detection and classification of dynamic hand gestures with recurrent 3d convolutional neural network. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Las Vegas, NV, USA, 27–30 June 2016; pp. 4207-4215.
44. Carreira, J.; Zisserman, A. Quo vadis, action recognition? A new model and the kinetics dataset. Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition; Honolulu, HI, USA, 21–26 July 2017; pp. 6299-6308.
45. D’Eusanio, A.; Simoni, A.; Pini, S.; Borghi, G.; Vezzani, R.; Cucchiara, R. A transformer-based network for dynamic hand gesture recognition. Proceedings of the 2020 International Conference on 3D Vision (3DV); Fukuoka, Japan, 25–28 November 2020; pp. 623-632.
46. Zhang, Z.; Sabuncu, M. Generalized cross entropy loss for training deep neural networks with noisy labels. Adv. Neural Inf. Process. Syst.; 2018; 31, pp. 1-11.