Application of Bioinformatics and Data Mining in Cancer Prediction

Abstract: Computer-aided cancer prediction and risk assessment have become very useful tools and are starting to be taken seriously by the medical community. Advanced bioinformatics and data mining techniques are used extensively to help predict the chance that an individual patient will develop cancer, as well as cancer rates in the general population. These techniques rely heavily on analyzing and comparing genetic and medical datasets, together with environmental and other factors. We have developed an expert system called the cancer predictor calculator (CPC), which predicts two specific cancer risks for women. Specifically, the CPC estimates the risk of breast and ovarian cancer by examining a number of user-provided genetic and non-genetic factors. The expert system was validated by comparing its predictions with the patients' prior medical information (and the subsequent outcomes) contained in actual health history databases and other well-known sources (e.g., see [14-17]).

Keywords: Cancer prediction calculator, breast cancer, ovarian cancer, bioinformatics, data mining, health IT

1 Introduction

Many existing information and/or knowledge-based technologies have aided humans in predicting abnormal genes and other factors that lead to diseases. Cancer is widely considered to be the leading fatal genetic disease.

Contrary to popular opinion, the retention and compilation of immense amounts of biological data have made their analysis a difficult and complex undertaking. Even with the emergence of bioinformatics and data mining, which combine biology, computer science, information technology, statistics, and mathematics, efficient knowledge extraction is becoming increasingly difficult. One of the primary purposes of bioinformatics is to clarify the biological processes that depend on hereditary resources. Data mining can detect hidden, useful patterns among the objects in a dataset and use them as predictors. Consequently, combining bioinformatics and data mining can improve the usefulness of both and the overall outcomes.

We have integrated both to construct the CPC, which combines huge biological datasets with patient-provided information to derive its predictions. This software predicts the patient's cancer risk as a percentage and classifies this risk into four categories (no risk, low, medium, or high). The importance and applicability of this software come from a potential early warning, which could lead to a discovery...
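The four-way classification described above can be illustrated with a minimal sketch. The text does not state the cut-off values the CPC uses, so the thresholds below (`LOW_MAX`, `MEDIUM_MAX`) and the names `RiskCategory` and `classify_risk` are hypothetical, chosen only to show how a predicted risk percentage might be mapped to one of the four categories.

```python
from enum import Enum


class RiskCategory(Enum):
    """The four categories named in the text."""
    NO_RISK = "no risk"
    LOW = "low"
    MEDIUM = "medium"
    HIGH = "high"


# Hypothetical cut-offs (percent). These are illustrative assumptions,
# not values taken from the CPC.
LOW_MAX = 10.0
MEDIUM_MAX = 25.0


def classify_risk(risk_percent: float) -> RiskCategory:
    """Map a predicted risk percentage (0-100) to a risk category."""
    if not 0.0 <= risk_percent <= 100.0:
        raise ValueError("risk_percent must be between 0 and 100")
    if risk_percent == 0.0:
        return RiskCategory.NO_RISK
    if risk_percent < LOW_MAX:
        return RiskCategory.LOW
    if risk_percent < MEDIUM_MAX:
        return RiskCategory.MEDIUM
    return RiskCategory.HIGH


if __name__ == "__main__":
    for value in (0.0, 4.5, 18.0, 60.0):
        print(f"{value:5.1f}% -> {classify_risk(value).value}")
```

Keeping the cut-offs as named constants makes it easy to substitute clinically validated thresholds without changing the classification logic.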
