1. Introduction
Target classification and recognition have improved dramatically with the development of deep learning technologies. Traditional deep learning methods rely heavily on large-scale labelled training datasets such as ImageNet [1]. However, they become infeasible in extreme cases where labelled samples of some classes are unavailable [2]. To address this problem, zero-shot learning (ZSL), which imitates the process of human recognition, has been proposed to link seen classes (available in training datasets) and unseen classes (not available in training datasets) using auxiliary information (e.g., attributes [3] and word vectors [4]). Conventional ZSL methods only consider the recognition of unseen classes and neglect that of seen classes, which leads to failure when both must be recognized simultaneously [5]. Generalized zero-shot learning (GZSL) [6] was subsequently proposed to address this limitation.
Most previous GZSL works are divided into mapping-based approaches [7, 8] and hybrid approaches. The former learns a visual-semantic projection model trained with labelled samples. However, such models are prone to overfitting due to the limited number of labelled samples and the domain shift between disjoint seen and unseen classes [9], so they often fail on unseen-class classification. The latter, including generating-based approaches [10] and synthesis-based ones, have been proposed to alleviate overfitting. Generating-based approaches (e.g., generative adversarial networks (GANs) [11] and variational auto-encoders (VAEs) [12]) generate pseudo-features for unseen classes from prior semantic knowledge. However, they suffer from mode collapse [13] because such hybrid models are challenging to train. Unlike them, synthesis-based approaches [14–16] synthesize pseudo-features for unseen classes by combining semantic information with seen-class features. However, they suffer from negative transfer [17] and low-quality class discriminability [18].
In this paper, we propose a novel two-stage method of distinguishable pseudo-feature synthesis (DPFS) for GZSL tasks, as shown in Figure 1. In stage 1, the embedding network and the preclassifier are jointly pretrained to extract distinguishable features for seen classes and simultaneously predict prototypes for unseen ones. This ensures that the features of seen classes are well preserved and effectively avoids overfitting. In stage 2, distinguishable pseudo-features of unseen classes are synthesized through the attribute projection module (APM) and the pseudo-feature synthesis module (PFSM). For each unseen class, APM builds a sparse representation based on attributes to output a base vector. It uses only attributes of the base classes (i.e., the most similar seen classes), thereby overcoming negative transfer. PFSM then creates feature representations and synthesizes pseudo-features using the base-class features, the base vectors, and the unseen-class attributes. Outliers among the synthesized pseudo-features are disposed of to obtain distinguishable pseudo-features and improve class discriminability. Finally, the distinguishable features are fed to the classifier to boost GZSL classification performance.
[figure(s) omitted; refer to PDF]
Our major contributions are summarized as follows:
(1) We propose a novel generalized zero-shot learning (GZSL) method of distinguishable pseudo-feature synthesis (DPFS). The proposed method further improves GZSL classification performance compared with other state-of-the-art methods.
(2) We pretrain our model with a well-designed distance prediction loss while predicting prototypes for unseen classes, thereby avoiding overfitting.
(3) We select only the attributes of similar seen classes when building attribute-based sparse representations for unseen classes, thereby effectively overcoming negative transfer.
(4) We screen the outliers of the synthesized pseudo-features and dispose of them to further improve class discriminability.
2. Related Works
Mapping-based approaches can be traced back to early ZSL tasks [2–4, 9]. They learn a mapping function between visual features and semantic features by supervised learning, so it is important to construct a feature-semantic loss function to train the mapping model [19]. However, early methods are prone to overfitting in GZSL tasks [7]. CPL [8] learned visual prototype representations for unseen classes to solve this problem. To obtain discriminative prototypes, DVBE [20] used second-order graphical statistics, DCC [21] learned the relationship between embedded features and visual features, and HSVA [22] used hierarchical two-step adaptive alignment of visual and semantic feature manifolds. However, the prototype representation is constrained and does not correspond to actual features [10] because of domain shift. Different from these works, we propose a distance prediction loss, which not only constructs a feature-attribute distance constraint for seen classes but also predicts unseen-class prototypes under the guidance of a preclassifier. It keeps seen-class features from disturbing the classification of unseen classes and thus avoids overfitting.
Generating-based approaches [23, 24], which utilize GANs and VAEs, have been widely applied to produce information about unseen classes and improve prototype representations for GZSL tasks. They generate pseudo-features for unseen classes conditioned on prior semantic knowledge and random noise. LDMS [25], Inf-FG [26], and FREE [27] improved the generating strategy in terms of discrimination loss, consistency descriptors, and feature refining, respectively. Besides, GCF [28] presented counterfactual-faithful generation to address the recognition-rate imbalance between seen and unseen classes. Although the strategies of generating-based methods could be added to our proposed method, the use of simplex semantic information and the training difficulty of GANs [16] would cause mode collapse.
Synthesis-based approaches [24, 29] integrate features and semantics of seen classes to enhance feature diversity. SPF [15] designed a synthesis rule to guide feature embedding. TCN [14] exploited class similarities to transfer knowledge from seen to unseen classes. To deal with the domain shift, LIUF [16] synthesized domain-invariant features by minimizing the maximum mean discrepancy distance of seen-class features; however, this can lead to negative transfer by mixing in irrelevant class information. Different from the methods above, we select only the similar seen classes, instead of all seen classes, to perform knowledge transfer, thereby avoiding the negative transfer caused by mixing in irrelevant information. We then apply distinguishable features extracted from the pretrained embedding network to the pseudo-feature synthesis. Besides, we use a preclassifier to dispose of the outliers among the synthesized components, thereby improving class discriminability. Unlike the method [24] that uses synthesized elements from other domains, we utilize only the similar seen classes from the same domain, avoiding the dependence on data from other domains.
3. Proposed Method
GZSL is more challenging than ZSL, which recognizes samples only from unseen classes, because GZSL needs to recognize samples from both seen and unseen classes. We therefore propose the DPFS method to further improve the theoretical basis of GZSL and boost classification performance. DPFS synthesizes distinguishable pseudo-features for unseen classes and then uses these pseudo-features, together with features of seen classes, to perform GZSL classification. In this section, we first give the notations and definitions of GZSL, then describe the proposed method, including base class selection, distinguishable feature extraction, attribute projection, and distinguishable pseudo-feature synthesis. Finally, we present the training algorithm.
3.1. Mathematical Formulation
In GZSL tasks, suppose we have
GZSL methods learn a function
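In the standard GZSL setting, the formulation can be summarized as follows; the symbols below are a generic restatement and may differ from the paper's own notation:
\mathcal{D}^{s} = \{(x_i, y_i) \mid x_i \in \mathcal{X},\; y_i \in \mathcal{Y}^{s}\}, \qquad \mathcal{Y}^{s} \cap \mathcal{Y}^{u} = \varnothing,
a_c \in \mathbb{R}^{d_a} \quad \text{(attribute vector of every class } c \in \mathcal{Y}^{s} \cup \mathcal{Y}^{u}\text{)},
f_{\mathrm{GZSL}} : \mathcal{X} \rightarrow \mathcal{Y}^{s} \cup \mathcal{Y}^{u}.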
3.2. Base Class Selection
For each unseen class, we only select the top
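A minimal sketch of this selection step is given below; it assumes cosine similarity between class attribute vectors, and the function and parameter names (e.g., k) are illustrative rather than the paper's exact implementation:

import numpy as np

def select_base_classes(attr_seen, attr_unseen, k=5):
    """Pick the k seen classes whose attribute vectors are most similar to the
    unseen-class attribute; cosine similarity is assumed here."""
    seen = attr_seen / np.linalg.norm(attr_seen, axis=1, keepdims=True)
    query = attr_unseen / np.linalg.norm(attr_unseen)
    sims = seen @ query                    # cosine similarity to every seen class
    return np.argsort(sims)[::-1][:k]      # indices of the top-k base classes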
3.3. Distinguishable Feature Extraction
In stage 1, we pretrain the embedding network and the preclassifier so that the embedding network extracts distinguishable features for seen classes and builds a relationship between classes and semantics, as shown in Figure 1. Attributes defined by cognitive scientists [30] are the most commonly used semantic knowledge; they are high-level descriptions of target objects specified by human beings [2]. We introduce a feature-attribute distance constraint by imitating meta-learning [31] and build prototype representations, as shown in Figure 2. The customary way to construct the meta-learning task is called
[figure(s) omitted; refer to PDF]
We randomly sample one unseen class and
Different from the meta-representation [33], which is constrained by minimizing the distance between intraclass features, we apply a feature-attribute distance constraint to structure a meta-representation that associates common characteristics across different attributes. Under this constraint, features in the latent space are pulled toward their prototypes so that similar features attract and dissimilar ones repel each other, and a prototype and an attribute from the same class lie close to each other. Therefore, the seen-class features in the latent space can be regarded as the distinguishable features extracted by the embedding network.
To keep the embedding network from overfitting, the prototypes are predicted by features of their base classes. A component from the base class is denoted as follows:
For each iteration, we build a prototype query set
We use the distance prediction loss to jointly pretrain the embedding network and the preclassifier. After pretraining, seen classes can already be classified and unseen classes can be predicted preliminarily, which prevents the trade-off between seen and unseen classes from failing. Besides, features of seen classes are extracted and then used for unseen-class pseudo-feature synthesis.
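A rough PyTorch sketch of such a joint pretraining loss is given below; the Euclidean distance form, the weighting coefficient alpha, and the way the prototypes are supplied are assumptions and do not reproduce the paper's equations (4)-(8) exactly:

import torch
import torch.nn.functional as F

def distance_prediction_loss(feat, labels, prototypes, preclassifier, alpha=1.0):
    """Joint pretraining loss sketch: a feature-attribute distance term pulling
    embedded features toward their class prototypes, plus a preclassification
    cross-entropy term (alpha and the distance form are assumptions)."""
    dists = torch.cdist(feat, prototypes)        # (batch, num_classes) distances to prototypes
    dist_loss = F.cross_entropy(-dists, labels)  # nearer prototype -> larger logit
    pre_loss = F.cross_entropy(preclassifier(feat), labels)
    return dist_loss + alpha * pre_loss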
3.4. Attribute Projection
Inspired by sparse coding, we make a sparse representation for each unseen class. Unlike the methods [14, 16] that use all seen classes, we select attributes only from the base classes to build attribute projections from seen to unseen classes. For unseen class
Then, we treat
[figure(s) omitted; refer to PDF]
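One way to realize this sparse representation is sketched below with scikit-learn's ElasticNet, suggested by the citation of the elastic net [34]; the regularization strengths and the non-negativity constraint are illustrative assumptions rather than the paper's exact solver:

import numpy as np
from sklearn.linear_model import ElasticNet

def base_vector(attr_unseen, attr_base):
    """Sparse representation of an unseen-class attribute over the attributes of
    its base classes (rows of attr_base); the returned coefficients act as the
    base vector.  Hyper-parameters below are illustrative."""
    model = ElasticNet(alpha=0.01, l1_ratio=0.5, positive=True, fit_intercept=False)
    model.fit(attr_base.T, attr_unseen)   # solve attr_unseen ≈ attr_base.T @ w sparsely
    return model.coef_                    # base vector of mixing weights over base classes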
3.5. Distinguishable Pseudo-Feature Synthesis
For unseen class
To dispose of the outliers, we screen them by the following equation:
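A simplified sketch of the synthesis and screening steps is shown below; building candidates as base-vector-weighted mixtures of base-class features and keeping only candidates whose preclassifier confidence for the target unseen class exceeds a threshold tau are stand-ins for equations (12) and (13), not the exact rules:

import torch
import torch.nn.functional as F

def synthesize_and_screen(base_feats, base_vec, unseen_label, preclassifier, tau=0.3):
    """base_feats: list of (n_i, d) feature tensors, one per base class.
    Candidates are weighted mixtures of base-class features; a candidate is kept
    only if the preclassifier assigns it to the target unseen class with
    probability above tau (both rules are simplified assumptions)."""
    w = torch.as_tensor(base_vec, dtype=base_feats[0].dtype)
    w = w / (w.sum() + 1e-8)
    n = min(f.shape[0] for f in base_feats)                    # candidates per unseen class
    mix = sum(wi * f[torch.randperm(f.shape[0])[:n]]           # mix shuffled samples of each base class
              for wi, f in zip(w, base_feats))
    probs = F.softmax(preclassifier(mix), dim=1)[:, unseen_label]
    return mix[probs > tau]                                    # distinguishable pseudo-features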
3.6. Train and Inference
We now describe the training of the DPFS model. Algorithm 1 shows the pseudo-code of the DPFS training algorithm. The algorithm mainly consists of two loop structures because DPFS is a two-stage method. First, lines 1 to 2 perform the attribute projection to obtain the base vector for each unseen class. Next, the first loop, from lines 3 to 9, performs the embedding-module pretraining to extract distinguishable features of seen classes. Then, the second loop, from lines 10 to 15, performs the classifier training for GZSL tasks. In each iteration of the classifier training, we randomly select a certain number of whole samples from the training samples and the synthesized pseudo-feature samples, where the number of the selected whole samples is
Algorithm 1: DPFS training algorithm
Input: training dataset
Initialize: set of the weight parameters of the embedding module and the preclassifier
(1) Build attribute projection matrices with
(2) Compute the base vectors with the matrices and
(3) for step = 0, …,
(4) Set
(5) Compute base class prototype
(6) Build prototype query set
(7) Compute
(8) Update
(9) end for
(10) for step = 0, …,
(11) Synthesize candidate pseudo-features for unseen classes by equation (12)
(12) Dispose of the outliers of candidate pseudo-features by equation (13)
(13) Select a certain number of samples
(14) Train the classifier with the selected samples to update
(15) end for
Output: Embedding network
4. Experimental Results
4.1. Datasets
The DPFS model is evaluated on four widely used datasets as benchmarks, i.e., Animals with Attributes 2 (AWA2 [6]), aPascal & Yahoo (aPY [36]), Caltech-UCSD Birds 200 (CUB [37]), and SUN Attribute (SUN [38]). AWA2 and aPY are coarse-grained datasets, and aPY includes a higher proportion of unseen classes than AWA2. CUB and SUN are fine-grained datasets; SUN in particular has more classes and fewer training samples per class than CUB. Table 1 summarizes the statistics of the four benchmarks.
Table 1
Statistics of the four benchmark datasets.
Dataset | Seen classes | Unseen classes | Total classes | Attributes | Training samples | Seen testing samples | Unseen testing samples | Total samples |
AWA2 | 40 | 10 | 50 | 85 | 23527 | 5882 | 7913 | 37322 |
aPY | 20 | 12 | 32 | 64 | 5932 | 1483 | 7924 | 15339 |
CUB | 150 | 50 | 200 | 312 | 7057 | 1764 | 2967 | 11788 |
SUN | 645 | 72 | 717 | 102 | 10320 | 2580 | 1440 | 14340 |
4.2. Implementation Details
We use ResNet-101 [39], a convolutional neural network pretrained on ImageNet [1], as the backbone; visual features are extracted from the output of its final average-pooling layer. Figure 4 shows the network structures of the DPFS model, including the embedding network, the preclassifier, and the classifier. The embedding network is composed of three fully connected (FC) layers, each followed by a ReLU activation function for nonlinearity. The preclassifier and the classifier share the same structure: two FC layers whose output dimension equals the total number of classes. The middle-layer dimension of the classifier is 512 for AWA2 and aPY and 1024 for CUB and SUN.
[figure(s) omitted; refer to PDF]
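The structures described above can be sketched in PyTorch as follows; the hidden widths of the embedding network and the activation between the classifier's two FC layers are not specified in the paper and are placeholders here:

import torch.nn as nn

class EmbeddingNet(nn.Module):
    """Three FC layers, each followed by ReLU; the 2048-d input matches
    ResNet-101 avg-pool features, while hid_dim and out_dim are placeholders."""
    def __init__(self, in_dim=2048, hid_dim=1024, out_dim=512):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(in_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, hid_dim), nn.ReLU(),
            nn.Linear(hid_dim, out_dim), nn.ReLU(),
        )

    def forward(self, x):
        return self.net(x)

class Classifier(nn.Module):
    """Two FC layers; mid_dim is 512 for AWA2/aPY and 1024 for CUB/SUN, and the
    output width equals the total number of classes (the ReLU in between is assumed)."""
    def __init__(self, in_dim=512, mid_dim=512, num_classes=50):
        super().__init__()
        self.net = nn.Sequential(nn.Linear(in_dim, mid_dim), nn.ReLU(),
                                 nn.Linear(mid_dim, num_classes))

    def forward(self, x):
        return self.net(x)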
Our model is implemented in PyTorch and runs on a GeForce RTX 2080 Ti GPU. It is trained with the adaptive moment estimation (Adam) [40] optimizer. During the embedding-module pretraining, the sample numbers of each class in both the support set and the query set,
The accuracies of average seen classes (
We evaluate the simultaneous classification accuracy of both seen and unseen classes by computing harmonic mean
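The harmonic mean follows the standard definition used in the GZSL literature [6]:
H = \frac{2 \times Acc_{S} \times Acc_{U}}{Acc_{S} + Acc_{U}},
where Acc_S and Acc_U denote the average per-class accuracies on the seen and unseen test classes, respectively.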
4.3. Hyper-Parameter Sensitivity
There are four hyper-parameters including the proportion of pseudo-feature samples
Proportion
[figure(s) omitted; refer to PDF]
Creditability threshold
[figure(s) omitted; refer to PDF]
Numbers of base classes and weighting coefficient,
[figure(s) omitted; refer to PDF]
The result reveals that
The result also reveals that when
4.4. Performance Results
Table 2 shows the GZSL classification performance of the proposed DPFS compared with existing state-of-the-art approaches. The existing approaches comprise mapping-based, generating-based, and synthesis-based approaches, which are marked with †, ⸶, and ⸷, respectively. The results show that DPFS gains the best performance on AWA2, aPY, and SUN and achieves the second-best performance on CUB. Compared with the mapping-based approaches, DPFS is superior to DCC by 5.5% on aPY and to DVBE by 4.8%, 2%, and 5.4% on AWA2, CUB, and SUN, respectively. Compared with the generating-based approaches, DPFS is superior to FREE by 4.7% on AWA2, LDMS by 4.9% on aPY, and GCF by 3.9% on SUN. Compared with the synthesis-based approaches, DPFS is superior to LIUF by 1.6%, 1.1%, 6%, and 3.2% on AWA2, aPY, CUB, and SUN, respectively. DPFS significantly improves
Table 2
Quantitative comparisons of average per-class GZSL classification accuracy (%).
Method | AWA2 (seen / unseen / H) | aPY (seen / unseen / H) | CUB (seen / unseen / H) | SUN (seen / unseen / H) |
† | LATEM [9] | 77.3 | 11.5 | 20.0 | 73.0 | 0.1 | 0.2 | 57.3 | 15.2 | 24.0 | 28.8 | 14.7 | 19.5 |
DEM [19] | 86.4 | 30.5 | 45.1 | 75.1 | 11.1 | 19.4 | 54.0 | 19.6 | 13.6 | 34.3 | 20.5 | 25.6 | |
CPL [8] | 83.1 | 51.0 | 63.2 | 73.2 | 19.6 | 30.9 | 58.6 | 28.0 | 37.9 | 32.4 | 21.9 | 26.1 | |
DVBE [20] | 70.8 | 63.6 | 67.0 | 58.3 | 32.6 | 41.8 | 60.2 | 53.2 | 56.5 | 37.2 | 45.0 | 40.7 | |
DCC [21] | 82.9 | 55.1 | 66.2 | 74.8 | 34.4 | 47.2 | 57.7 | 46.5 | 51.5 | 41.0 | 33.1 | 36.6 | |
HSVA [22] | 79.3 | 57.8 | 66.9 | — | — | — | 59.5 | 51.9 | 55.5 | 39.0 | 48.6 | 43.3 | |
⸶ | SEZEL [23] | 68.1 | 58.3 | 62.8 | — | — | — | 53.3 | 41.5 | 46.7 | 30.5 | 40.9 | 34.9 |
DUET [10] | 90.2 | 48.2 | 63.4 | 55.6 | 21.8 | 31.3 | 80.1 | 39.7 | 53.1 | — | — | — | |
Inf-FG [26] | 63.4 | 58.3 | 60.7 | — | — | — | 57.0 | 45.8 | 50.8 | 37.1 | 44.7 | 40.5 | |
LDMS [25] | 71.8 | 60.9 | 65.9 | 66.3 | 37.4 | 47.8 | 61.6 | 48.0 | 53.9 | 36.2 | 45.6 | 40.3 | |
FREE [27] | 75.4 | 60.4 | 67.1 | — | — | — | 59.9 | 55.7 | 57.7 | 37.7 | 44.8 | 40.9 |
GCF [28] | 75.1 | 60.4 | 67.0 | 56.8 | 37.1 | 44.9 | 59.7 | 61.0 | 60.3 | 37.8 | 47.9 | 42.2 | |
⸷ | SPF [15] | 60.9 | 52.4 | 56.3 | — | — | — | 63.4 | 30.2 | 40.9 | 59.0 | 32.2 | 41.6 |
TCN [14] | 65.8 | 61.2 | 63.4 | 64.0 | 24.1 | 35.1 | 52.0 | 52.6 | 52.3 | 37.3 | 31.2 | 34.0 | |
LIUF [16] | 83.5 | 60.6 | 70.2 | 79.1 | 38.2 | 51.6 | 54.0 | 51.2 | 52.5 | 40.4 | 45.7 | 42.9 | |
DPFS | 87.3 | 61.0 | 71.8 | 83.0 | 38.6 | 52.7 | 63.8 | 54.0 | 58.5 | 43.0 | 49.6 | 46.1 |
DPFS is superior to most mapping-based approaches in the aspects of
We also evaluate DPFS on the four benchmarks for conventional ZSL tasks, where only the synthesized pseudo-feature samples are fed into the classifier. Table 3 shows the ZSL classification performance. We observe that DPFS outperforms existing methods on AWA2, aPY, and SUN, which further verifies that the synthesized pseudo-features have distinguishable characteristics.
Table 3
Quantitative comparisons for the ZSL tasks.
Method | AWA2 | aPY | CUB | SUN |
LATEM [9] | 55.8 | 35.2 | 49.3 | 55.3 |
SJE [4] | 61.9 | 32.9 | 53.9 | 53.7 |
TVN [5] | 68.8 | 41.3 | 58.1 | 60.7 |
CPL [8] | 72.7 | 45.3 | 56.4 | 62.2 |
SEZSL [23] | 69.2 | — | 59.6 | 63.4 |
ZVG [12] | 69.3 | 37.4 | 54.8 | 59.4 |
HSVA [22] | — | — | 62.8 | 63.8 |
DUET [10] | 72.6 | 41.9 | 72.4 | — |
Inf-FG [26] | 68.3 | — | 58.0 | 61.1 |
LDMS [25] | 72.9 | 43.7 | 58.4 | 59.4 |
TCN [14] | 71.2 | 38.9 | 59.5 | 61.8 |
LIUF [16] | 72.4 | 59.3 | 43.7 | 63.3 |
DPFS | 73.9 | 61.4 | 68.0 | 66.8 |
We further demonstrate the advantage of DPFS over SPF and LIUF. We imitate SPF and LIUF by replacing our pseudo-feature synthesis strategy with their synthesis strategies, forming the reference methods D-SPF and D-LIUF, respectively. The embedding-module pretraining and classifier training stages of D-SPF and D-LIUF are the same as those of DPFS. Table 4 shows the comparison results among D-SPF, D-LIUF, and DPFS. DPFS gains a prominent advantage over D-SPF because the optimized attribute projection embeds and projects seen-class features into unseen-class features more accurately, which improves class discriminability. DPFS also has an apparent advantage over D-LIUF, especially on CUB and SUN, because DPFS eliminates the irrelevant classes and thus suppresses negative transfer. In addition, DPFS introduces the attribute weighting in equation (12) and the outlier disposal in equation (13) to decrease confusion between classes. Therefore, DPFS is superior to D-SPF and D-LIUF in classification.
Table 4
Quantitative comparisons among D-SPF, D-LIUF, and DPFS.
Method | AWA2 (seen / unseen / H) | aPY (seen / unseen / H) | CUB (seen / unseen / H) | SUN (seen / unseen / H) |
D-SPF | 81.3 | 55.4 | 65.9 | 64.5 | 32.9 | 43.5 | 70.8 | 44.0 | 54.3 | 45.3 | 42.4 | 43.8 |
D-LIUF | 86.1 | 60.4 | 71.0 | 82.5 | 36.2 | 50.3 | 58.7 | 51.8 | 55.0 | 42.4 | 46.4 | 44.3 |
DPFS | 87.3 | 61.0 | 71.8 | 83.0 | 38.6 | 52.7 | 63.8 | 54.0 | 58.5 | 43.0 | 49.6 | 46.1 |
4.5. Ablation Results
We conducted ablation experiments to illustrate the influence of different tactics in DPFS. The tactics comprise the embedding-module pretraining (mpt), the outlier disposal (odi) in equation (13), and the preclassification loss (pc) in equation (7). Table 5 shows the ablation results. Four ablated methods, PFS, DPFS-1, DPFS-2, and DPFS-3, are validated. PFS removes all the tactics. DPFS-1, which pretrains the model only with the feature-attribute distance loss in equation (4), adds the mpt tactic. DPFS-2 adds both the mpt and odi tactics. DPFS-3, which pretrains the model with the distance prediction loss in equation (8), adds both the mpt and pc tactics.
Table 5
Ablation results on DPFS.
Method | Mpt | Odi | Pc | AWA2 (H) | aPY (H) | CUB (H) | SUN (H) |
PFS | | | | 58.6 | 35.8 | 46.0 | 31.8 |
DPFS-1 | √ | | | 67.2 (+8.6) | 44.1 (+8.3) | 55.3 (+9.3) | 41.1 (+9.3) |
DPFS-2 | √ | √ | | 68.1 (+9.5) | 44.5 (+8.7) | 55.2 (+9.2) | 40.8 (+9.0) |
DPFS-3 | √ | | √ | 70.1 (+11.5) | 50.7 (+14.9) | 56.8 (+10.8) | 43.5 (+11.7) |
DPFS | √ | √ | √ | 71.8 (+13.2) | 52.7 (+16.9) | 58.5 (+12.5) | 46.1 (+14.3) |
Adding the mpt tactic is important for extracting common characteristics between seen and unseen classes because it improves the prototype representations and eliminates the domain shift. Therefore, DPFS-1 shows obvious progress compared with PFS: it is superior to PFS by 8.6% on AWA2, 8.3% on aPY, 9.3% on CUB, and 9.3% on SUN. On this foundation, DPFS-2 adopts the odi tactic to eliminate the outliers of candidate pseudo-features, which boosts the performance on part of the benchmarks; DPFS-2 is superior to DPFS-1 by 0.9% on AWA2 and 0.4% on aPY. DPFS-3 adopts the pc tactic to predict prototypes for unseen classes before the classifier training, thus improving the classification performance; DPFS-3 is superior to DPFS-1 by 2.9% on AWA2, 6.6% on aPY, 1.5% on CUB, and 2.4% on SUN. DPFS can cohere all the features of the same class and therefore avoid outlier interference. Thus, DPFS, adopting the three auxiliary tactics at the same time, makes the best progress in
We visualize features from the embedding module by t-SNE [41] to further show the effect of the tactics on the AWA2 benchmark for GZSL tasks. Figure 8 shows the visualization results. We find that DPFS improves the distinguishability of unseen classes while maintaining the distinguishability of seen classes, according to the comparison of Figures 8(a) and 8(c) with Figures 8(b) and 8(d). Considering that existing methods [18, 26] do not visualize all features of both seen and unseen classes, we visualize all the output features of testing samples from PFS and DPFS in Figures 8(e) and 8(f), respectively. The classes characterized by the output features from DPFS are clearly more separable than those from PFS. DPFS eliminates the confusion between classes and improves feature distinguishability, thus achieving better multiclass classification accuracy. Both seen and unseen classes satisfy the characteristics of intraclass compactness and interclass separability. Therefore, DPFS can effectively eliminate the domain shift.
[figure(s) omitted; refer to PDF]
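For reference, a minimal recipe for this kind of visualization (using scikit-learn's t-SNE; the perplexity and colour map are arbitrary choices, not the paper's settings) is:

from sklearn.manifold import TSNE
import matplotlib.pyplot as plt

def plot_embedding(features, labels):
    """Project embedding-module outputs to 2-D with t-SNE and colour points by
    class label, mirroring the kind of plot shown in Figure 8."""
    pts = TSNE(n_components=2, perplexity=30, init="pca").fit_transform(features)
    plt.scatter(pts[:, 0], pts[:, 1], c=labels, s=3, cmap="tab20")
    plt.show()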
5. Discussion
Based on the results above, our model was trained and evaluated on four benchmark datasets. Our method selected the optimal hyper-parameters for each benchmark and achieved the best GZSL classification performance compared with most existing methods. Especially on benchmarks with few training samples or with a higher proportion of unseen classes, DPFS gained superior performance because it uses the information of features and attributes appropriately and avoids mode collapse. Compared with existing synthesis-based models similar to DPFS, DPFS eliminates the introduction of irrelevant classes and suppresses negative transfer. It also synthesizes candidate pseudo-features and disposes of the outliers to improve class discriminability.
Furthermore, our model was also trained and evaluated for ZSL tasks and outperformed competing ZSL methods on most benchmarks. Besides, we conducted ablation experiments on DPFS and explained the performance gain of each tactic. With the embedding-module pretraining tactic, distinguishable features can be extracted and the GZSL performance can be improved. On this basis, adding the preclassification tactic predicts prototypes for unseen classes before the classifier training, thereby improving performance and avoiding overfitting. The outlier-disposal tactic further enhances the performance. These tactics are the foundation of the performance gains over competing GZSL methods. The visualization results demonstrated that DPFS preserves distinguishability for both seen and unseen classes.
6. Conclusion
This paper proposed a novel distinguishable pseudo-feature synthesis (DPFS) method for GZSL tasks. It includes the procedures of base class selection, distinguishable feature extraction, attribute projection, feature representation, and outlier disposal. These procedures realize the initialization, the connection, and the weight updating of the DPFS model, so that the model can synthesize distinguishable pseudo-features from the attributes of unseen classes and the features of similar seen classes. Experimental results showed that DPFS achieves better GZSL classification performance than existing methods, indicating that DPFS significantly improves class discriminability, restrains negative transfer, and effectively eliminates the domain shift and the confusion between classes. In the future, we will synthesize more distinguishable features for unseen classes by integrating more auxiliary information, such as statistical features and knowledge graphs, to extend our method to other applications.
Acknowledgments
This work is supported by the National Natural Science Foundation of China (Grant nos. 42276187 and 41876100) and the Fundamental Research Funds for the Central Universities (Grant no. 3072022FSC0401).
[1] O. Russakovsky, J. Deng, H. Su, J. Krause, S. Satheesh, S. Ma, Z. Huang, A. Karpathy, A. Khosla, M. Bernstein, A. C. Berg, L. Fei-Fei, "Imagenet large scale visual recognition challenge," International Journal of Computer Vision, vol. 115 no. 3, pp. 211-252, DOI: 10.1007/s11263-015-0816-y, 2015.
[2] C. H. Lampert, H. Nickisch, S. Harmeling, "Learning to detect unseen object classes by between-class attribute transfer," pp. 951-958, DOI: 10.1109/CVPR.2009.5206594, .
[3] C. H. Lampert, H. Nickisch, S. Harmeling, "Attribute-based classification for zero-shot visual object categorization," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol. 36 no. 3, pp. 453-465, DOI: 10.1109/tpami.2013.140, 2014.
[4] Z. Akata, S. Reed, D. Walter, H. Lee, B. Schiele, "Evaluation of output embeddings for fine-grained image classification," pp. 2927-2936, DOI: 10.1109/CVPR31182.2015, .
[5] H. Zhang, Y. Long, Y. Guan, L. Shao, "Triple verification network for generalized zero-shot learning," IEEE Transactions on Image Processing, vol. 28 no. 1, pp. 506-517, DOI: 10.1109/tip.2018.2869696, 2019.
[6] Y. Xian, B. Schiele, Z. Akata, "Zero-shot learning-the good, the bad and the ugly," pp. 4582-4591, DOI: 10.1109/CVPR35066.2017, .
[7] S. Liu, M. Long, J. Wang, M. I. Jordan, "Generalized zero-shot learning with deep calibration network," Proceedings of the Advances in Neural Information Processing Systems2018, pp. 2005-2015, .
[8] Z. Liu, X. Zhang, Z. Zhu, S. Zheng, Y. Zhao, J. Cheng, "Convolutional prototype learning for zero-shot recognition," Image and Vision Computing, vol. 98,DOI: 10.1016/j.imavis.2020.103924, 2020.
[9] Y. Xian, Z. Akata, G. Sharma, Q. Nguyen, M. Hein, B. Schiele, "Latent embeddings for zero-shot classification," pp. 69-77, DOI: 10.1109/CVPR33180.2016, .
[10] Z. Jia, Z. Zhang, L. Wang, C. Shan, T. Tan, "Deep unbiased embedding transfer for zero-shot learning," IEEE Transactions on Image Processing, vol. 29, pp. 1958-1971, DOI: 10.1109/tip.2019.2947780, 2020.
[11] K. Li, M. R. Min, Y. Fu, "Rethinking zero-shot learning: a conditional visual classification perspective," pp. 3583-3592, DOI: 10.1109/ICCV43118.2019, .
[12] R. Gao, X. Hou, J. Qin, J. Chen, L. Liu, F. Zhu, Z. Zhang, L. Shao, "Zero-VAE-GAN: generating unseen features for generalized and transductive zero-shot learning," IEEE Transactions on Image Processing, vol. 29, pp. 3665-3680, DOI: 10.1109/tip.2020.2964429, 2020.
[13] Z. Lin, A. Khetan, G. Fanti, S. Oh, "Pacgan: the power of two samples in generative adversarial networks," Advances in Neural Information Processing Systems, 2018.
[14] H. Jiang, R. Wang, S. Shan, X. Chen, "Transferable contrastive network for generalized zero-shot learning," pp. 9765-9774, DOI: 10.1109/ICCV43118.2019, .
[15] C. Li, X. Ye, H. Yang, Y. Han, X. Li, Y. Jia, "Generalized zero shot learning via synthesis pseudo features," IEEE Access, vol. 7, pp. 87827-87836, DOI: 10.1109/access.2019.2925093, 2019.
[16] X. Li, M. Fang, H. Li, J. Wu, "Learning domain invariant unseen features for generalized zero-shot classification," Knowledge-Based Systems, vol. 206,DOI: 10.1016/j.knosys.2020.106378, 2020.
[17] Z. Wang, Z. Dai, B. Póczos, J. Carbonell, "Characterizing and avoiding negative transfer," pp. 11293-11302, DOI: 10.1109/ICCV43118.2019, .
[18] Z. Ji, J. Wang, Y. Yu, Y. Pang, J. Han, "Class-specific synthesized dictionary model for zero-shot learning," Neurocomputing, vol. 329, pp. 339-347, DOI: 10.1016/j.neucom.2018.10.069, 2019.
[19] L. Zhang, T. Xiang, S. Gong, "Learning a deep embedding model for zero-shot learning," pp. 2021-2030, DOI: 10.1109/CVPR35066.2017, .
[20] S. Min, H. Yao, H. Xie, C. Wang, Z.-J. Zha, Y. Zhang, "Domain-aware visual bias eliminating for generalized zero-shot learning," pp. 12664-12673, DOI: 10.1109/CVPR42600.2020, .
[21] M. Hou, W. Xia, X. Zhang, Q. Gao, "Discriminative comparison classifier for generalized zero-shot learning," Neurocomputing, vol. 414, pp. 10-17, DOI: 10.1016/j.neucom.2020.07.030, 2020.
[22] S. Chen, G. Xie, Y. Liu, Q. Peng, B. Sun, H. Li, X. You, L. Shao, "Hsva: hierarchical semantic-visual adaptation for zero-shot learning," Advances in Neural Information Processing Systems, vol. 34, 2021.
[23] V. K. Verma, G. Arora, A. Mishra, P. Rai, "Generalized zero-shot learning via synthesized examples," pp. 4281-4289, DOI: 10.1109/CVPR40276.2018, .
[24] D. Mahapatra, B. Bozorgtabar, S. Kuanar, Z. Ge, Self-supervised Multimodal Generalized Zero Shot Learning for gleason Grading, Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health, pp. 46-56, 2021.
[25] X. Li, M. Fang, H. Li, J. Wu, "Learning discriminative and meaningful samples for generalized zero shot classification," Signal Processing: Image Communication, vol. 87,DOI: 10.1016/j.image.2020.115920, 2020.
[26] Z. Han, Z. Fu, G. Li, J. Yang, "Inference guided feature generation for generalized zero-shot learning," Neurocomputing, vol. 430, pp. 150-158, DOI: 10.1016/j.neucom.2020.10.080, 2021.
[27] S. Chen, W. Wang, B. Xia, Q. Peng, X. You, F. Zheng, L. Shao, "Free: feature refinement for generalized zero-shot learning," pp. 122-131, DOI: 10.1109/ICCVW54120.2021, .
[28] Z. Yue, T. Wang, Q. Sun, X.-S. Hua, H. Zhang, "Counterfactual zero-shot and open-set visual recognition," pp. 15404-15414, DOI: 10.1109/ICCVW54120.2021, .
[29] D. Mahapatra, S. Kuanar, B. Bozorgtabar, Z. Ge, Self-supervised Learning of Inter-label Geometric Relationships for gleason Grade Segmentation, Domain Adaptation and Representation Transfer, and Affordable Healthcare and AI for Resource Diverse Global Health, pp. 57-67, 2021.
[30] C. Kemp, J. B. Tenenbaum, T. L. Griffiths, T. Yamada, N. Ueda, "Learning systems of concepts with an infinite relational model," vol. 1, pp. 381-388, 2006.
[31] J. Vanschoren, Meta-learning, Automated Machine Learning, pp. 35-61, 2019.
[32] Q. Cai, Y. Pan, T. Yao, C. Yan, T. Mei, "Memory matching networks for one-shot image recognition," pp. 4080-4088, DOI: 10.1109/CVPR40276.2018, .
[33] J. Li, M. Jing, K. Lu, Z. Ding, L. Zhu, Z. Huang, "Leveraging the invariant side of generative zero-shot learning," pp. 7402-7411, DOI: 10.1109/CVPR41558.2019, .
[34] H. Zou, T. Hastie, "Regularization and variable selection via the elastic net," Journal of the Royal Statistical Society - Series B: Statistical Methodology, vol. 67 no. 2, pp. 301-320, DOI: 10.1111/j.1467-9868.2005.00503.x, 2005.
[35] S. Boyd, S. P. Boyd, L. Vandenberghe, Convex Optimization, 2004.
[36] A. Sheshadri, I. Endres, D. Hoiem, D. Forsyth, Describing Objects by Their Attributes, pp. 1778-1785, 2012.
[37] C. Wah, S. Branson, P. Welinder, P. Perona, S. Belongie, The Caltech-Ucsd Birds-200-2011 Dataset, 2011.
[38] G. Patterson, J. Hays, "Sun attribute database: discovering, annotating, and recognizing scene attributes," pp. 2751-2758, DOI: 10.1109/CVPR17489.2012, .
[39] K. He, X. Zhang, S. Ren, J. Sun, "Deep residual learning for image recognition," pp. 770-778, DOI: 10.1109/CVPR33180.2016, .
[40] I. Loshchilov, F. Hutter, "Fixing weight decay regularization in Adam," Proceedings of the ICLR 2018 Conference Blind Submission, .
[41] L. Van der Maaten, G. Hinton, "Visualizing non-metric similarities in multiple maps," Machine Learning, vol. 87 no. 1, pp. 33-55, DOI: 10.1007/s10994-011-5273-4, 2012.
Copyright © 2022 Yunpeng Jia et al. This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. https://creativecommons.org/licenses/by/4.0/
Abstract
Generalized zero-shot learning (GZSL) aims to simultaneously classify seen classes and unseen classes that are disjoint. Hybrid approaches based on pseudo-feature synthesis are currently the most popular among GZSL methods. However, they suffer from problems of negative transfer and low-quality class discriminability, causing poor classification accuracy. To address them, we propose a novel GZSL method of distinguishable pseudo-feature synthesis (DPFS). The DPFS model can provide high-quality distinguishable characteristics for both seen and unseen classes. Firstly, the model is pretrained with a distance prediction loss to avoid overfitting. Then, the model selects only attributes of similar seen classes and makes attribute-based sparse representations for unseen classes, thereby overcoming negative transfer. After the model synthesizes pseudo-features for unseen classes, it disposes of the pseudo-feature outliers to improve class discriminability. The pseudo-features are fed into a classifier of the model together with features of seen classes for GZSL classification. Experimental results on four benchmark datasets verify that the proposed DPFS achieves better GZSL classification performance than existing methods.
1 College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001, China
2 College of Intelligent Systems Science and Engineering, Harbin Engineering University, Harbin, Heilongjiang 150001, China; Faculty of Engineering, Kagawa University, Takamatsu, Kagawa 7608521, Japan