Allogeneic bone marrow and umbilical cord stem cell transplants often offer the best hope for curing patients with blood cancers and other blood diseases. We establish the historical context beginning with World War II and the first use of nuclear weapons, when the world recognized the dangers of radiologic events and the public health threats that follow them. These threats included blood cancers on an unimaginable scale. At that time, little was known about survivability or available treatments. Researchers at the University of Chicago, associated with the Manhattan Project, were among the first to identify the preservation of hematopoiesis as a primary therapy for otherwise deadly exposures. We then introduce hematopoietic stem cell (HSC) transplantation and its two distinct types: autologous and allogeneic.
Next, we discuss why matching patients to unrelated donors requires flexible and timely searches as the matching criteria evolve. Matching systems should scale to accommodate both the diversity in patient and donor typing resolution and the increasing number of donors. We describe the context of this work and the challenges of building timely, scalable, and adaptable platform solutions. We then introduce HapLogic, the patient/donor matching method currently used at the National Marrow Donor Program. HapLogic is a leading platform for this work, offering near-interactive donor searches on behalf of transplant patients worldwide.
In the second chapter of the thesis, we propose a novel approach to matching patients and donors for stem cell transplants. GraphMatch (GM) is a scalable graph database solution for storing and searching variable-resolution HLA genotype markers. For our test set, we expanded the World Marrow Donor Association (WMDA) validation set, based on version 2.16 of the IPD-IMGT/HLA Database, to create a synthetic production dataset comprising 1 million patients and 10 million donors. Single-patient identity search times range from 218.5 milliseconds per patient with 2 million donors to 1201.4 milliseconds per patient with 10 million donors. Search time remained linear in the number of edges, even at production scale.
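To make the idea of a graph-based identity search concrete, the following is a minimal in-memory sketch, not the GraphMatch implementation or its schema. It assumes a simple bipartite layout in which each donor is linked to one node per locus genotype, and it treats an identity search as finding donors linked to every one of the patient's nodes. The donor identifiers, typings, and function names (`genotype_nodes`, `build_graph`, `identity_search`) are hypothetical and chosen only for illustration.

```python
from collections import defaultdict

# Hypothetical toy data: each donor is typed at two loci, two alleles per locus.
DONORS = {
    "D1": {"A": ("A*01:01", "A*02:01"), "B": ("B*07:02", "B*08:01")},
    "D2": {"A": ("A*02:01", "A*01:01"), "B": ("B*08:01", "B*07:02")},
    "D3": {"A": ("A*01:01", "A*03:01"), "B": ("B*07:02", "B*08:01")},
}


def genotype_nodes(typing):
    """Return one node key per locus; allele order within a locus is ignored."""
    return {(locus, tuple(sorted(alleles))) for locus, alleles in typing.items()}


def build_graph(donors):
    """Index edges from each locus-genotype node to the donors that carry it."""
    graph = defaultdict(set)
    for donor_id, typing in donors.items():
        for node in genotype_nodes(typing):
            graph[node].add(donor_id)
    return graph


def identity_search(graph, patient_typing):
    """Return donors connected to every locus-genotype node of the patient."""
    nodes = genotype_nodes(patient_typing)
    candidate_sets = [graph.get(node, set()) for node in nodes]
    if not candidate_sets:
        return set()
    return set.intersection(*candidate_sets)


graph = build_graph(DONORS)
patient = {"A": ("A*02:01", "A*01:01"), "B": ("B*07:02", "B*08:01")}
matches = identity_search(graph, patient)
print(sorted(matches))  # ['D1', 'D2']
```

In this sketch the work per search grows with the number of donor edges attached to the patient's nodes, which is one way to picture why search time would track edge count. A production system would also need to handle variable-resolution typings and mismatch-tolerant searches, which this toy example does not attempt.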
In the third chapter, we anticipate practical extensions to the GraphMatch platform that would allow horizontally scalable performance and a flexible schema able to accommodate additional search criteria. GraphMatch can also simulate additional matching algorithms, such as GRIMM and Hap-E. Ultimately, GM demonstrates the usefulness of graph databases as a flexible platform for scalable matching solutions.