Abstract

Clustering protein association networks is crucial for understanding protein relationships and cellular functions. This research uses a Mixed Integer Linear Programming (MILP) approach to cluster proteins in the FprAl flavodiiron protein network, which contains 61 proteins and 230 connections. The first stage applies a MILP model that minimizes the maximum diameter within the clusters, considering only the topological characteristics of the network. A refined second model then maximizes the functional similarity within each cluster. Functional similarity is measured with a Jaccard similarity matrix built from the molecular function aspect of Gene Ontology (GO) terms, which brings biological relevance into the clustering process. By integrating topological and functional criteria, the second MILP model produces clusters that capture both connectivity and biological context. Validation through gene sequence alignment supports the functional relevance of the resulting clusters and reveals biologically significant groupings. The findings suggest that incorporating functional similarity into the clustering improves the biological interpretability of gene groups and shows potential for refined prediction of gene function. Future directions include incorporating the other GO aspects, biological process and cellular component, as well as more advanced sequence similarity metrics, to further improve clustering precision.
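The Jaccard similarity matrix mentioned in the abstract can be illustrated with a minimal Python sketch. It assumes each protein is annotated with a set of GO molecular function terms and scores a pair of proteins as the size of the intersection of their term sets divided by the size of the union. The function names, the protein names, and the term labels below are hypothetical placeholders, not data or code from the paper, and treating two empty annotation sets as similarity 0 is an assumption made here.

```python
from itertools import combinations


def jaccard(a, b):
    """Return |a & b| / |a | b|, or 0.0 when both sets are empty (an assumption)."""
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def similarity_matrix(annotations):
    """Build a symmetric Jaccard matrix from a {protein: set_of_terms} mapping.

    Returns the sorted protein names and a list-of-lists matrix in that order.
    """
    names = sorted(annotations)
    index = {name: i for i, name in enumerate(names)}
    size = len(names)
    matrix = [[0.0] * size for _ in range(size)]

    # A protein with at least one term is fully similar to itself.
    for name in names:
        i = index[name]
        matrix[i][i] = jaccard(annotations[name], annotations[name])

    # Fill both triangles from each unordered pair.
    for p, q in combinations(names, 2):
        score = jaccard(annotations[p], annotations[q])
        matrix[index[p]][index[q]] = score
        matrix[index[q]][index[p]] = score

    return names, matrix


# Hypothetical annotations: placeholder labels stand in for GO term IDs.
example = {
    "protA": {"MF_a", "MF_b", "MF_c"},
    "protB": {"MF_b", "MF_c", "MF_d"},
    "protC": {"MF_e"},
}

names, matrix = similarity_matrix(example)
for name, row in zip(names, matrix):
    print(name, row)
```

In this toy input, protA and protB share two of four distinct terms, so their similarity is 0.5, while protC shares no terms with either and scores 0. A matrix of this kind could then serve as the pairwise coefficients of a similarity-maximizing objective, although the exact way the paper's second MILP model uses it is not stated in the abstract.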

Copyright Institute of Industrial and Systems Engineers (IISE) 2025