
Abstract

This dissertation presents a novel information-theoretic framework, ROADMAP (Representation, Organization, and Analysis for Data Modeling and Annotative Predictions), for advancing information representation, knowledge organization, and predictive analytics in machine learning. Motivated by emerging limitations in classical entropy-based approaches, this research explores how new formulations in information theory can improve accuracy, robustness, interpretability, and annotation efficiency in AI systems. Central to this investigation is DLITE (Discounted Least Information-Theoretic Entropy), a new entropy metric grounded in Least Information Theory (LIT), a mathematically rigorous formulation designed to quantify bounded entropy change, normalize scale effects, and satisfy essential information-metric properties. DLITE thereby offers both theoretical clarity and practical utility as an alternative entropy-based quantification model.

DLITE was formulated by Ke (2020, 2022a) to address key theoretical and practical gaps in traditional information theory models such as Shannon Entropy and KL Divergence. Unlike popular information-theoretic measures, which lack metric distance properties, DLITE introduces a bounded, symmetric measure of entropy difference, discounting redundant or scale-sensitive information through a formal entropy discount. By satisfying properties such as nonnegativity, symmetry, and identity of indiscernibles, DLITE provides a scale-invariant, semantically meaningful framework that is theoretically rigorous and computationally robust.
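The abstract does not reproduce DLITE's formula, but the underlying least-information quantity from LIT (Ke, 2020, 2022a) is commonly written as L(P, Q) = Σᵢ |∫ from pᵢ to qᵢ of (−ln x) dx| = Σᵢ |g(qᵢ) − g(pᵢ)|, with g(x) = x(1 − ln x). A minimal sketch, assuming that form and omitting DLITE's entropy-discount term, checks the metric-style properties named above:

```python
import math

def g(x: float) -> float:
    """Antiderivative of -ln x: integral of -ln t dt = x * (1 - ln x); g(0) = 0."""
    return 0.0 if x == 0.0 else x * (1.0 - math.log(x))

def least_information(p, q):
    """Total least information needed to change distribution P into Q (LIT).
    Note: this is the undiscounted quantity; DLITE subtracts an entropy discount
    not reproduced in this abstract."""
    return sum(abs(g(qi) - g(pi)) for pi, qi in zip(p, q))

P = [0.5, 0.5]
Q = [0.9, 0.1]

# Nonnegativity, symmetry, and identity of indiscernibles:
assert least_information(P, Q) >= 0.0
assert math.isclose(least_information(P, Q), least_information(Q, P))
assert least_information(P, P) == 0.0
```

Because g is strictly increasing on (0, 1] (g′(x) = −ln x > 0 there), L(P, Q) = 0 only when P = Q, which is what gives the measure its identity-of-indiscernibles behavior.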

Central to this study is the conviction that a structured framework is essential to effectively address complex problems in AI. The ROADMAP framework, a three-tier model, ensures that each stage of information processing is logically aligned, systematically implemented, and analytically evaluated, providing a scaffolded strategy for integrating theory with methodology and enabling deeper insights and reproducible experimentation. This study operationalizes DLITE within each tier. In Tier 1, DLITE informs feature weighting via TF-iDL, an entropy-discounted alternative to TF-IDF that better reflects semantic salience. In Tier 2, the DLITE Impurity Measure (DIM) is integrated into decision tree models, yielding more stable and informative node splits for hierarchical knowledge classification. In Tier 3, DLITE Loss is deployed as a loss function in transformer-based deep learning models, offering strong recall, fast convergence, and resilience to class imbalance, as evidenced in comparative experiments against Cross-Entropy and KL Divergence on CoNLL-2003, Basic NER, and the Broad Twitter Corpus.
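The DIM formula itself is not given in this abstract, so the sketch below only illustrates the Tier 2 integration point: greedy split selection in a decision tree, where an impurity function is a pluggable argument. Gini impurity stands in where DIM would be substituted; `gini`, `best_split`, and the sample data are illustrative names, not from the dissertation.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a label list; a DLITE-based measure (DIM) would slot in here."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(xs, ys, impurity=gini):
    """Pick the threshold on one feature minimizing weighted child impurity."""
    best = (None, float("inf"))
    n = len(ys)
    for t in sorted(set(xs)):
        left = [y for x, y in zip(xs, ys) if x <= t]
        right = [y for x, y in zip(xs, ys) if x > t]
        score = len(left) / n * impurity(left) + len(right) / n * impurity(right)
        if score < best[1]:
            best = (t, score)
    return best

xs = [1.0, 2.0, 3.0, 4.0]
ys = ["a", "a", "b", "b"]
threshold, score = best_split(xs, ys)
assert threshold == 2.0 and score == 0.0  # perfect split at x <= 2.0
```

Swapping `impurity=gini` for a DIM implementation is the only change the split routine would need, which is what makes an impurity measure a natural integration point for an alternative entropy model.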

Beyond empirical performance, DLITE supports selective annotation strategies and enhances explainability through interpretable attention and entropy landscapes. It offers a new way of thinking about how learning systems evaluate and optimize information, especially under conditions of noise, ambiguity, and limited data. This dissertation demonstrates that DLITE is not just a derivative of existing information theory but a substantive contribution to it: one that redefines how entropy is modeled for learning, ranking, and reasoning in machine intelligence.

But this research is also personal. It is not only a pursuit of technical advancement but a tribute to intellectual lineage. Like the storied advisor-advisee relationships that shaped entire disciplines—Frege and Wittgenstein, Thomson and Rutherford, Hilbert and Von Neumann—this dissertation emerges from a legacy of mentorship. The core theories at its heart, LIT and DLITE, were conceptualized by my advisor, Dr. Weimao Ke. My work, then, is not only a scholarly inquiry, but a continuation—a living expression of the ideas entrusted to me.

This dissertation exemplifies how new knowledge can emerge from lineage, shaped by the values of academic stewardship and inspired by the responsibility to carry forward a vision. In the tradition of advisor-advisee collaborations that have moved science and scholarship forward, this work aims not just to validate a theory, but to extend its reach. DLITE is more than a mathematical model; it is an invitation to rethink how we quantify, interpret, and apply information. In that spirit, this dissertation contributes to the evolving landscape of explainable, human-aligned AI, and to the scholarly lineage from which it was born.

Details

Business indexing term
1010268
Title
A Novel Information-Theoretic Framework for Information Representation, Knowledge Organization, and Predictive Analytics
Author
Number of pages
251
Publication year
2025
Degree date
2025
School code
0065
Source
DAI-A 87/1(E), Dissertation Abstracts International
ISBN
9798286449569
Advisor
Ke, Weimao
Committee member
Lin, Xia; Kelly, Mat; Park, Jung-Ran; Seki, Kazuhiro
University/institution
Drexel University
Department
Information Science [Ph.D.] (College of Computing and Informatics)
University location
United States -- Pennsylvania
Degree
Ph.D.
Source type
Dissertation or Thesis
Language
English
Document type
Dissertation/Thesis
Dissertation/thesis number
32117548
ProQuest document ID
3225729729
Document URL
https://www.proquest.com/dissertations-theses/novel-information-theoretic-framework/docview/3225729729/se-2?accountid=208611
Copyright
Database copyright ProQuest LLC; ProQuest does not claim copyright in the individual underlying works.
Database
ProQuest One Academic