
Abstract

Transfer learning involves leveraging the knowledge gained while solving one problem and applying it to a different but related problem, thus facilitating the adaptation of learned patterns and representations. This approach is particularly beneficial when labeled data is scarce or training resources are limited. Over the past decade, transfer learning has emerged as a critical technique in the field of machine learning, revolutionizing how models are trained and deployed across various domains. Approaches such as fine-tuning pretrained models, representation transfer, and domain adaptation have enabled models to leverage knowledge learned from large-scale datasets and transfer it to new, related tasks with limited labeled data. However, the interpretability of the transferred knowledge remains a challenge in transfer learning. While pretrained models often achieve impressive performance gains, understanding how and why these models make specific predictions is often non-trivial.

This thesis seeks to further our understanding of transfer learning by investigating the knowledge transferred between source and target domains. Previous research on interpretable transfer learning has focused on empirical evaluations of network architectures that lead to better transfer, as opposed to understanding what knowledge enables positive versus negative transfer. Furthermore, transfer learning has predominantly functioned as a tool for enhancing performance in target domains, overlooking the potential harm of propagating undesirable knowledge encoded in source models to downstream tasks. To this end, we address three research questions surrounding interpretable transfer learning: Can we interpret what, where, and how knowledge is transferred from a source domain to a target domain? Can we mitigate the transfer of undesirable knowledge to downstream tasks? Can we automatically identify and transfer common concepts or attributes that are helpful to the target task?

For the first research question, we designed and implemented Auto-Transfer (AT), a framework that automatically learns to route source representations to appropriate target representations and combine them in meaningful ways to produce accurate target models. We demonstrated accuracy improvements of upwards of 5% over state-of-the-art knowledge transfer methods on several benchmark datasets. We qualitatively assessed our transfer scheme by visualizing the essential features of individual examples with visual explanation methods. We also observed that our improvement over other methods is larger for smaller target datasets, making Auto-Transfer an effective tool for small-data applications that stand to benefit from transfer learning.
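The sketch below illustrates the general routing idea in PyTorch: learnable weights decide how much each frozen source-layer representation contributes to a given target layer, and the weighted mix is gated into the target features. The layer widths, linear projections, softmax routing, and additive gating are illustrative assumptions for this sketch, not the exact Auto-Transfer architecture.

# Minimal sketch of routing frozen source-layer features into a target network.
# The projections, softmax routing, and additive gating are illustrative
# assumptions, not the exact Auto-Transfer design.
import torch
import torch.nn as nn
import torch.nn.functional as F

class RoutedTransferBlock(nn.Module):
    """Mixes one target layer's features with a learned combination of source features."""
    def __init__(self, source_dims, target_dim):
        super().__init__()
        # One projection per source layer so all source features match the target width.
        self.projections = nn.ModuleList([nn.Linear(d, target_dim) for d in source_dims])
        # Routing logits: how much each source layer contributes to this target layer.
        self.routing_logits = nn.Parameter(torch.zeros(len(source_dims)))
        self.gate = nn.Parameter(torch.tensor(0.0))  # how much transferred signal to add

    def forward(self, target_feat, source_feats):
        weights = F.softmax(self.routing_logits, dim=0)            # (num_source_layers,)
        projected = torch.stack([p(s) for p, s in zip(self.projections, source_feats)])
        transferred = (weights[:, None, None] * projected).sum(0)  # weighted source mix
        return target_feat + torch.sigmoid(self.gate) * transferred

# Usage: two frozen source layers (widths 128 and 256), one target layer of width 64.
block = RoutedTransferBlock(source_dims=[128, 256], target_dim=64)
target_feat = torch.randn(8, 64)
source_feats = [torch.randn(8, 128), torch.randn(8, 256)]
out = block(target_feat, source_feats)
print(out.shape)  # torch.Size([8, 64])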

For the second research question, we proposed a novel approach for suppressing the transfer of user-determined semantic concepts (e.g., color, glasses) from intermediate source representations to target tasks without retraining the source model, which can otherwise be expensive or even infeasible. Notably, this setting is harder than operating on the input data, because a given intermediate source representation is already biased toward the source task, which further entangles the concepts to be suppressed. We evaluated our approach both qualitatively and quantitatively in the visual domain and demonstrated that it successfully suppresses user-determined concepts without altering other concepts.
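As a rough illustration of concept suppression in frozen features, the sketch below estimates a linear direction for a user-chosen concept from examples with and without it, then projects that direction out of intermediate features before they are passed downstream. The linear-direction assumption and the projection step are ours for illustration; the thesis's actual method may differ.

# Illustrative sketch: suppress one concept in frozen intermediate features by
# projecting out an estimated linear concept direction. The linearity assumption
# is illustrative, not necessarily the thesis's method.
import torch

def concept_direction(feats_with, feats_without):
    """Unit vector pointing from 'concept absent' to 'concept present' in feature space."""
    direction = feats_with.mean(0) - feats_without.mean(0)
    return direction / direction.norm()

def suppress_concept(features, direction):
    """Remove the component of each feature vector along the concept direction."""
    coeff = features @ direction              # (batch,) projection coefficients
    return features - coeff[:, None] * direction

# Usage with random stand-in features of width 512.
feats_with = torch.randn(100, 512) + 0.5      # pretend these contain the concept
feats_without = torch.randn(100, 512)
d = concept_direction(feats_with, feats_without)
cleaned = suppress_concept(torch.randn(8, 512), d)
print((cleaned @ d).abs().max())              # close to 0: concept component removed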

Lastly, we explored the automatic identification of beneficial concepts for the target task, using examples from the biomedical domain. We introduced Conceptual Counterfactual Explanation (CoCoX), a method that integrates conceptual and counterfactual explanations to pinpoint the most relevant medical concepts for a black-box chest X-ray classifier. Furthermore, we enhanced the joint embedding space of biomedical foundation models with textual concepts, achieving performance improvements of over 5% across various downstream tasks from diverse biomedical domains.
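The following sketch shows one generic way a conceptual-counterfactual style analysis can be set up: optimize a sparse set of per-concept weights so that shifting a frozen classifier's feature along known concept directions moves the prediction toward a desired class; the concepts with the largest weights are then read off as the most relevant. The concept bank, classifier, loss, and sparsity penalty are all illustrative assumptions, not the CoCoX method as presented in the thesis.

# Hedged sketch of a conceptual-counterfactual style analysis: find a small,
# sparse shift along known concept directions that pushes a frozen classifier's
# feature toward a desired prediction. All components here are stand-ins.
import torch
import torch.nn.functional as F

def conceptual_counterfactual(feature, concept_bank, classifier, target_class,
                              steps=200, lr=0.1, l1=0.01):
    """Return per-concept weights whose weighted sum of concept directions,
    added to `feature`, moves the classifier toward `target_class`."""
    weights = torch.zeros(concept_bank.shape[0], requires_grad=True)
    optimizer = torch.optim.Adam([weights], lr=lr)
    for _ in range(steps):
        shifted = feature + weights @ concept_bank          # edited representation
        loss = F.cross_entropy(classifier(shifted).unsqueeze(0),
                               torch.tensor([target_class]))
        loss = loss + l1 * weights.abs().sum()              # prefer few concepts
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()
    return weights.detach()

# Usage with stand-in tensors: 10 concepts, 512-dim features, 2 classes.
concept_bank = F.normalize(torch.randn(10, 512), dim=1)
classifier = torch.nn.Linear(512, 2)
feature = torch.randn(512)
w = conceptual_counterfactual(feature, concept_bank, classifier, target_class=1)
print(w.topk(3))  # concepts whose addition most helps reach the target class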

Overall, through this thesis, we developed methods to support the interpretability of knowledge transferred between source and target domains, mitigate the transfer of undesirable knowledge, and improve performance on resource-constrained tasks. As the field of transfer learning continues to evolve, achieving a balance between performance and interpretability remains a crucial area of focus for advancing the robustness and reliability of machine learning models across diverse real-world application domains.

Details

Title: Interpretable Transfer Learning: Understanding and Controlling Knowledge Transfer
Author:
Number of pages: 137
Publication year: 2025
Degree date: 2025
School code: 0185
Source: DAI-B 86/12(E), Dissertation Abstracts International
ISBN: 9798280713826
Committee member: McGuinness, Deborah L.; Stewart, Charles V.; Sims, Christopher R.
University/institution: Rensselaer Polytechnic Institute
Department: Computer Science
University location: United States -- New York
Degree: Ph.D.
Source type: Dissertation or Thesis
Language: English
Document type: Dissertation/Thesis
Dissertation/thesis number: 31842284
ProQuest document ID: 3216342224
Document URL: https://www.proquest.com/dissertations-theses/interpretable-transfer-learning-understanding/docview/3216342224/se-2?accountid=208611
Copyright: Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database: ProQuest One Academic