
Abstract

This dissertation presents a novel information-theoretic framework, ROADMAP (Representation, Organization, and Analysis for Data Modeling and Annotative Predictions), for advancing information representation, knowledge organization, and predictive analytics in machine learning. Motivated by emerging limitations in classical entropy-based approaches, this research explores how new formulations in information theory can improve accuracy, robustness, interpretability, and annotation efficiency in AI systems. Central to this investigation is DLITE (Discounted Least Information-Theoretic Entropy), a new entropy metric grounded in Least Information Theory (LIT), a mathematically rigorous formulation designed to quantify bounded entropy change, normalize scale effects, and satisfy essential information-metric properties. DLITE thereby offers both theoretical clarity and practical utility as an alternative entropy-based quantification model.

DLITE was formulated by Ke (2020, 2022a) to address key theoretical and practical gaps in traditional information theory models such as Shannon Entropy and KL Divergence. Unlike popular information-theoretic measures, which lack metric distance properties, DLITE introduces a bounded, symmetric measure of entropy difference, discounting redundant or scale-sensitive information through a formal entropy discount. By satisfying properties such as nonnegativity, symmetry, and identity of indiscernibles, DLITE provides a scale-invariant, semantically meaningful framework that is theoretically rigorous and computationally robust.
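The abstract does not reproduce DLITE's formula, but the underlying least-information quantity from LIT (Ke, 2020, 2022a) is commonly written as L(P, Q) = Σᵢ |∫ from pᵢ to qᵢ of (−ln x) dx| = Σᵢ |g(qᵢ) − g(pᵢ)|, with g(x) = x(1 − ln x). A minimal sketch, assuming that form and omitting DLITE's entropy-discount term, checks the metric-style properties named above:

```python
import math

def g(x: float) -> float:
    """Antiderivative of -ln x: integral of -ln t dt = x * (1 - ln x); g(0) = 0."""
    return 0.0 if x == 0.0 else x * (1.0 - math.log(x))

def least_information(p, q):
    """Total least information needed to change distribution P into Q (LIT).
    Note: this is the undiscounted quantity; DLITE subtracts an entropy discount
    not reproduced in this abstract."""
    return sum(abs(g(qi) - g(pi)) for pi, qi in zip(p, q))

P = [0.5, 0.5]
Q = [0.9, 0.1]

# Nonnegativity, symmetry, and identity of indiscernibles:
assert least_information(P, Q) >= 0.0
assert math.isclose(least_information(P, Q), least_information(Q, P))
assert least_information(P, P) == 0.0
```

Because g is strictly increasing on (0, 1] (g′(x) = −ln x > 0 there), L(P, Q) = 0 only when P = Q, which is what gives the measure its identity-of-indiscernibles behavior.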

Central to this study is the conviction that a structured framework is essential to effectively address complex problems in AI. The ROADMAP framework, a three-tier model, ensures that each stage of information processing is logically aligned, systematically implemented, and analytically evaluated, providing a scaffolded strategy for integrating theory with methodology and enabling deeper insights and reproducible experimentation. This study operationalizes DLITE within each tier. In Tier 1, DLITE informs feature weighting via TF-iDL, an entropy-discounted alternative to TF-IDF that better reflects semantic salience. In Tier 2, the DLITE Impurity Measure (DIM) is integrated into decision tree models, yielding more stable and informative node splits for hierarchical knowledge classification. In Tier 3, DLITE Loss is deployed as a loss function in transformer-based deep learning models, offering strong recall, fast convergence, and resilience to class imbalance, as evidenced in comparative experiments against Cross-Entropy and KL Divergence on CoNLL-2003, Basic NER, and the Broad Twitter Corpus.
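The DIM formula itself is not given in this abstract, so the sketch below only illustrates the Tier 2 integration point: greedy split selection in a decision tree, where an impurity function is a pluggable argument. Gini impurity stands in where DIM would be substituted; `gini`, `best_split`, and the sample data are illustrative names, not from the dissertation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a label list; a DLITE-based measure (DIM) would slot in here."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys, impurity=gini):
    """Pick the threshold on one feature minimizing weighted child impurity."""
    best = (None, float("inf"))
    n = len(ys)
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = len(left) / n * impurity(left) + len(right) / n * impurity(right)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 4.0]
ys = ["a", "a", "b", "b"]
threshold, score = best_split(xs, ys)
assert threshold == 2.0 and score == 0.0  # perfect split at x <= 2.0
```

Swapping `impurity=gini` for a DIM implementation is the only change the split routine would need, which is what makes an impurity measure a natural integration point for an alternative entropy model.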

Beyond empirical performance, DLITE supports selective annotation strategies and enhances explainability through interpretable attention and entropy landscapes. It offers a new way of thinking about how learning systems evaluate and optimize information, especially under conditions of noise, ambiguity, and limited data. This dissertation demonstrates that DLITE is not just a derivative of existing information theory but a substantive contribution to it: one that redefines how entropy is modeled for learning, ranking, and reasoning in machine intelligence.

But this research is also personal. It is not only a pursuit of technical advancement but a tribute to intellectual lineage. Like the storied advisor-advisee relationships that shaped entire disciplines—Frege and Wittgenstein, Thomson and Rutherford, Hilbert and Von Neumann—this dissertation emerges from a legacy of mentorship. The core theories at its heart, LIT and DLITE, were conceptualized by my advisor, Dr. Weimao Ke. My work, then, is not only a scholarly inquiry, but a continuation—a living expression of the ideas entrusted to me.

This dissertation exemplifies how new knowledge can emerge from lineage, shaped by the values of academic stewardship and inspired by the responsibility to carry forward a vision. In the tradition of advisor-advisee collaborations that have moved science and scholarship forward, this work aims not just to validate a theory, but to extend its reach. DLITE is more than a mathematical model; it is an invitation to rethink how we quantify, interpret, and apply information. In that spirit, this dissertation contributes to the evolving landscape of explainable, human-aligned AI, and to the scholarly lineage from which it was born.

Details

Business indexing term
1010268
Title
A Novel Information-Theoretic Framework for Information Representation, Knowledge Organization, and Predictive Analytics
Author
Number of pages
251
Publication year
2025
Degree date
2025
School code
0065
Source
DAI-A 87/1(E), Dissertation Abstracts International
ISBN
9798286449569
Advisor
Ke, Weimao
Committee member
Lin, Xia; Kelly, Mat; Park, Jung-Ran; Seki, Kazuhiro
University/institution
Drexel University
Department
Information Science [Ph.D.] (College of Computing and Informatics)
University location
United States -- Pennsylvania
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32117548
ProQuest document ID
3225729729
Document URL
https://www.proquest.com/dissertations-theses/novel-information-theoretic-framework/docview/3225729729/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic