Abstract - Over the past decades, data mining has proved to be a successful approach for extracting hidden knowledge from huge collections of structured digital data stored in databases. From its inception, data mining was done primarily on numerical data sets. Nowadays, large multimedia data sets such as audio, speech, text, web, image, video and combinations of several types are becoming increasingly available. In the present work, an attempt has been made to develop software in which slide images of a patient's DNA and of disease-affected DNA can be processed and compared. Pixel values of these images stored in the database can also be compared for result analysis.
Keywords - Multimedia data mining, DNA data
I. INTRODUCTION
With the recent advances in electronic imaging, video devices, storage, networking and computer power, the amount of multimedia data has grown enormously, and data mining has become a popular way of discovering new knowledge from such large data sets. Multimedia data refers to data such as text, numeric, image, video, audio, graphical, temporal, relational and categorical data. Multimedia data mining refers to pattern discovery, rule extraction and knowledge acquisition from multimedia databases [1]. Data mining techniques are the result of a long process of research and product development. This evolution began when business data was first stored on computers, continued with improvements in data access, and more recently generated technologies that allow users to navigate through their data in real time. In recent years, there has been an explosion in the rate of acquisition of biomedical data, along with advances in molecular genetics technologies such as DNA microarrays [2-3]. The main types of data analysis needed for biomedical applications include:
* Gene selection - a process of attribute selection that finds the genes most strongly related to a particular class [4-7].
* Classification - classifying diseases or predicting outcomes based on gene expression patterns, and perhaps even identifying the best treatment for a given genetic signature [8-10].
* Clustering - finding new biological classes or refining existing ones [11-12].
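As a rough illustration of the first two analysis types listed above, the following Python sketch ranks genes by how well they separate two classes and then classifies a new sample using only the selected genes. It is a minimal sketch under stated assumptions, not the method of this paper or of the cited works: the expression values and labels are made up, the function names (`gene_scores`, `select_genes`, `nearest_centroid`) are hypothetical, and the scoring rule (difference of class means divided by the sum of class standard deviations) and the nearest-centroid classifier are simply common, easy-to-read choices.

```python
from statistics import mean, pstdev

# Hypothetical toy data: rows are samples, columns are genes (expression levels).
# All values and labels are invented for illustration only.
SAMPLES = [
    [5.0, 1.0, 3.0],
    [5.2, 0.9, 2.0],
    [4.8, 1.1, 4.0],
    [1.0, 1.0, 3.1],
    [1.2, 0.8, 2.2],
    [0.9, 1.2, 3.9],
]
LABELS = ["disease", "disease", "disease", "healthy", "healthy", "healthy"]


def gene_scores(samples, labels, positive):
    """Score each gene by |mean_a - mean_b| / (std_a + std_b) (assumed rule)."""
    scores = []
    for g in range(len(samples[0])):
        a = [s[g] for s, y in zip(samples, labels) if y == positive]
        b = [s[g] for s, y in zip(samples, labels) if y != positive]
        spread = pstdev(a) + pstdev(b)
        # The small constant avoids division by zero for constant genes.
        scores.append(abs(mean(a) - mean(b)) / (spread + 1e-9))
    return scores


def select_genes(samples, labels, positive, k):
    """Gene selection: return the indices of the k highest-scoring genes."""
    scores = gene_scores(samples, labels, positive)
    ranked = sorted(range(len(scores)), key=lambda g: scores[g], reverse=True)
    return ranked[:k]


def nearest_centroid(samples, labels, genes, query):
    """Classification: assign the class whose mean profile is closest to query."""
    centroids = {}
    for y in set(labels):
        rows = [s for s, lab in zip(samples, labels) if lab == y]
        centroids[y] = [mean(r[g] for r in rows) for g in genes]

    def squared_distance(centroid):
        return sum((query[g] - centroid[i]) ** 2 for i, g in enumerate(genes))

    return min(centroids, key=lambda y: squared_distance(centroids[y]))


if __name__ == "__main__":
    chosen = select_genes(SAMPLES, LABELS, "disease", 1)
    print("selected gene indices:", chosen)
    print("predicted class:", nearest_centroid(SAMPLES, LABELS, chosen, [4.9, 1.0, 3.0]))
```

In this toy data only the first gene differs clearly between the two classes, so it is the one selected, and the new sample is classified from that gene alone. Real microarray data sets contain thousands of genes and typically call for more careful statistics and validation.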
It is widely believed that thousands of genes and their products in a given living organism function in a complicated and orchestrated way that creates the mystery of life. However, traditional methods in molecular biology generally work on a "one gene in one experiment"...