Introduction
Moraxella (Branhamella) catarrhalis, previously known as Neisseria catarrhalis or Micrococcus catarrhalis, is a gram-negative, aerobic diplococcus that is found predominantly as a commensal of the upper respiratory tract. M. catarrhalis has emerged as a notorious bacterial pathogen over the past 20 to 30 years [1]. It causes acute otitis media in infants and exacerbations of chronic bronchitis in adults. It is typically found in infections together with other serious pathogens such as Streptococcus pneumoniae or Haemophilus influenzae, being encountered in up to 50% of cultures [2]. M. catarrhalis is a prime cause of various active infections in hosts with weakened immune systems, including pneumonia, endocarditis, septicemia, and meningitis [1]. Furthermore, hospital outbreaks of M. catarrhalis-related respiratory disease have been characterized, establishing it as a nosocomial pathogen. For decades, M. catarrhalis was thought to be a harmless commensal, and little is known about its pathogenic features and virulence factors, although research in this field has expanded in recent years [3].
Classical antibiotic treatment alleviates the clinical burden. However, unrestricted antibiotic use is a major factor in the rapid emergence of antibiotic-resistant bacteria, which has reduced the number of viable antimicrobial options [4]. In the recent past, antimicrobial resistance has risen dramatically. Acute Chronic Obstructive Pulmonary Disease (COPD) and other respiratory diseases caused by M. catarrhalis are notoriously difficult to treat because of rising minimum inhibitory concentrations (MICs) and antimicrobial drug resistance [5].
Genome and proteome analysis aids in the identification of potential drug targets for the treatment of highly pathogenic diseases. Comparative and subtractive genomic analysis is arguably a demanding task because it involves high-dimensional data. The arrival of the post-genomic era and the availability of pathogen whole-genome sequences have opened multiple avenues for methodologies such as comparative subtractive genomics to design new drugs and identify vaccine targets. The cost-effectiveness of these approaches has opened new pathways for finding potential drug candidates, accelerated the process of drug discovery, expanded the number of treatment options, and reduced the failure rate in the later stages of clinical trials [6]. Computational approaches have enabled the identification of potent therapeutic targets against such pathogens [7]. The method has already been used to successfully prioritize and predict therapeutic targets for Clostridium botulinum [8], Mycoplasma pneumoniae [9], Rickettsia [10], Neisseria gonorrhoeae [11], Salmonella typhi [6], and Shigella dysenteriae [12].
In the present study, the genomic data of Moraxella catarrhalis strain BBH18 were investigated to find unique therapeutic targets and therapeutic candidates. The study includes a comparative and subtractive genomics analysis, Protein-Protein Interaction (PPI) network analysis, and an assessment of the essentiality and druggability of target proteins and of the ADMET properties of candidate compounds. Certain limitations of previous studies on M. catarrhalis, such as the consideration of hub nodes and conserved drug targets, are addressed in this study. Future studies may involve the development of antibacterial lead compounds against the shortlisted potential drug targets.
Materials and methods
A subtractive genomics approach was employed to prioritize drug targets against M. catarrhalis, a pathogen of clinical and biological importance. The BBH18 strain was chosen to identify a potent drug target and drug candidate. It was selected because little in silico drug target identification work had been reported for this specific strain and because it was the only reference strain available for M. catarrhalis. Several databases and tools, as illustrated in the flow chart in Fig 1, were used for the determination of therapeutic targets.
[Figure omitted. See PDF.]
Retrieval of proteomes of pathogen and host
The whole proteomes of Moraxella catarrhalis BBH18 and of the human host were both obtained from the Universal Protein Resource (UniProt) database [13]. Additionally, the Database of Essential Genes (DEG) [14, 15] was used to screen the essentiality of the drug targets, and the DrugBank database Version 5.1.8 [16] was used to investigate the druggability of the proposed targets. Moreover, the Virulence Factor Database (VFDB) [17] was used to curate information about the virulence factors of M. catarrhalis, whereas ARG-ANNOT (Antibiotic Resistance Gene-ANNOTation) AA V6 July 2019 [18] was used for the detection of existing and putative new Antibiotic Resistance (AR) genes in pathogen genomes.
Subtractive genomics approach
Subtractive genomics is a widely used approach that subtracts the sequences shared between the host and pathogen proteomes and metabolic pathways, leaving a set of proteins that are required by the microorganism but do not exist in the corresponding host. Subtractive genomics thus plays an active role in identifying unique and potent drug targets that are essential for the pathogen to survive and that can be targeted without altering the metabolic pathways of the host [19].
Removal of paralogous protein sequences
The complete proteome of Moraxella catarrhalis BBH18 was clustered at a 60% sequence identity threshold using Cluster Database at High Identity with Tolerance (CD-HIT) [20, 21]. Proteins sharing more than 60% sequence identity were considered paralogous to each other. The paralogous (duplicate) sequences were removed, and only the non-paralogous sequences were kept for downstream analysis.
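The redundancy-removal idea can be illustrated with a small sketch. This is not CD-HIT itself: the identity measure below is a crude stand-in built on Python's `difflib`, the sequences and identifiers are invented, and the function names are hypothetical. It only shows the greedy logic of keeping one representative per group of sequences above a 60% identity threshold.

```python
from difflib import SequenceMatcher


def identity(a: str, b: str) -> float:
    """Crude identity proxy (assumption): matched characters / shorter length."""
    matcher = SequenceMatcher(None, a, b, autojunk=False)
    matched = sum(block.size for block in matcher.get_matching_blocks())
    return matched / min(len(a), len(b))


def remove_redundant(proteins: dict, threshold: float = 0.60) -> dict:
    """Greedy clustering: visit sequences longest first and keep a sequence
    only if it is not more than `threshold` identical to a kept representative."""
    kept = {}
    ordered = sorted(proteins.items(), key=lambda kv: len(kv[1]), reverse=True)
    for protein_id, sequence in ordered:
        if all(identity(sequence, rep) <= threshold for rep in kept.values()):
            kept[protein_id] = sequence
    return kept


# Invented toy sequences; P2 is a near-duplicate of P1.
toy_proteome = {
    "P1": "MKTAYIAKQRQISFVKSHFSRQ",
    "P2": "MKTAYIAKQRQISFVKSHFSRA",
    "P3": "GSSHHHHHHWWWWPLLLLCCCC",
}
non_paralogous = remove_redundant(toy_proteome)
print(sorted(non_paralogous))
```

In practice the clustering would be done by CD-HIT on the full FASTA file; the sketch only makes the "keep one representative per cluster" step concrete.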
Identification of non-homologous protein
Next, the set of proteins retained after the removal of paralogs was subjected to BLASTp [22] against the Homo sapiens proteome with an expectation value (E-value) cut-off of 10⁻⁵ [21]. BLASTp sorted the results into 'Hits Found' (sequences homologous between the pathogen and the host) and 'No-Hits Found' (non-homologous sequences). The non-homologous sequences, which show no resemblance to the human host, were selected for further analysis.
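A minimal sketch of this filtering step is given below, assuming the BLASTp results were saved in the BLAST+ tabular format (`-outfmt 6`), in which the E-value is the 11th tab-separated column. The protein identifiers and hit lines are invented for illustration, and the function name is hypothetical.

```python
import csv
import io

E_VALUE_CUTOFF = 1e-5  # cut-off stated in the text


def split_by_host_homology(query_ids, blast_tabular, cutoff=E_VALUE_CUTOFF):
    """Return (homologous, non_homologous) query IDs.

    `blast_tabular` is text in BLAST+ tabular format (-outfmt 6):
    qseqid, sseqid, pident, length, mismatch, gapopen,
    qstart, qend, sstart, send, evalue, bitscore.
    """
    homologous = set()
    for row in csv.reader(io.StringIO(blast_tabular), delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank and comment lines
        query_id, evalue = row[0], float(row[10])
        if evalue <= cutoff:
            homologous.add(query_id)
    non_homologous = [q for q in query_ids if q not in homologous]
    return sorted(homologous), non_homologous


# Invented example: protA has a significant human hit, protB only a weak one,
# and protC has no hit at all.
toy_blast = "\n".join([
    "\t".join(["protA", "humanX", "45.0", "200", "110", "3",
               "1", "200", "5", "204", "2e-30", "120"]),
    "\t".join(["protB", "humanY", "28.0", "60", "43", "2",
               "10", "69", "3", "62", "0.5", "25"]),
])
queries = ["protA", "protB", "protC"]
homologous, non_homologous = split_by_host_homology(queries, toy_blast)
print(homologous, non_homologous)
```

Proteins with no row in the table and proteins whose best hit is weaker than the cut-off both end up in the non-homologous set, which corresponds to the 'No-Hits Found' group.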
Identification of essential non-homologous genes
The essential proteins of an organism are those that play a significant role in cellular metabolism [23]. Hence, BLASTp of the non-homologous M. catarrhalis proteins was performed against DEG. Strictly essential proteins of Moraxella catarrhalis were selected using a threshold E-value of 10⁻¹⁰⁰. To screen for essential genes, a minimum cut-off score of 100 was also set [21]. This resulted in a data set of proteins that were both non-homologous to the host and essential to M. catarrhalis.
Druggability of essential proteins
Next, the non-homologous essential proteins were evaluated using BLASTp against Food and Drug Administration (FDA)-approved therapeutic target proteins obtained from DrugBank. The evaluation was performed with an E-value cut-off of 10⁻⁵ to assess the drug-target-like character of the identified essential proteins and to prioritize novel and unique therapeutic targets [21].
Identification of essential virulent proteins
Furthermore, the selected proteins were subjected to BLASTp against the Virulence Factor Database with an E-value cut-off of 10⁻⁵ to identify the proteins with the highest virulence potential [24].
Resistance proteins analysis
For the subsequent structure-based studies, only proteins associated with high antibiotic resistance were required. BLASTp analysis of the essential virulent proteins was performed against ARG-ANNOT with an E-value cut-off of 10⁻⁵ [24]. The resulting data set consisted of non-homologous essential proteins associated with high antibiotic resistance and was selected for the structure-based studies.
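The successive filters described in this section can be summarized as set operations: one subtraction (proteins with human homologs are removed) followed by a series of intersections (only proteins with hits in DEG, DrugBank, VFDB, and ARG-ANNOT are retained). The sketch below is an illustration under that reading; the protein identifiers, the hit sets, and the function name are hypothetical.

```python
def subtractive_funnel(proteome_ids, hits):
    """Apply the filters in order and record the survivors of each stage.

    `hits` maps a database name to the set of pathogen protein IDs that had
    a significant BLASTp hit against that database.
    """
    survivors = set(proteome_ids) - hits["human"]  # subtract host homologs
    stages = {"non_homologous": survivors}
    for name in ("DEG", "DrugBank", "VFDB", "ARG-ANNOT"):
        survivors = survivors & hits[name]  # keep only proteins with a hit
        stages[name] = survivors
    return stages


# Invented toy data for six non-paralogous proteins.
toy_hits = {
    "human": {"p1"},
    "DEG": {"p2", "p3", "p4", "p5"},
    "DrugBank": {"p1", "p2", "p3", "p4"},
    "VFDB": {"p2", "p3"},
    "ARG-ANNOT": {"p2"},
}
stages = subtractive_funnel(["p1", "p2", "p3", "p4", "p5", "p6"], toy_hits)
for name, ids in stages.items():
    print(name, sorted(ids))
```

Because each stage only narrows the previous one, the order of the intersections does not change the final set; it only changes which intermediate counts are reported.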
Identification of subcellular localizations
Every protein performs its function at a specific location. These locations are crucial because proteins are distributed to specific regions of the cell after they are synthesized, and failure of a protein to reach its proper location may cause a variety of disorders. Therefore, PSORTb version 3.0.2 [25] and CELLO2GO [26] were used to determine the subcellular localization of all essential, drug-like, non-homologous proteins. The underlying principle of Subcellular Localization (SCL) prediction is to run a BLAST search of the required non-homologous proteins against proteins with a known subcellular location. These tools classified the proteins by cellular location: cytoplasm, cytoplasmic membrane, inner membrane, outer membrane, periplasm, extracellular region, and undetermined [6, 24].
Structure prediction and homology modelling
The proteins shortlisted by the subtractive genomics approach were searched for existing structures in the Protein Data Bank (PDB). BLASTp was used to find a suitable template for protein structure modelling. Where no 3-dimensional structure was available, the protein structure was modelled using SWISS-MODEL [27]. In the absence of an experimentally determined crystal structure, homology modelling is the most precise and efficient method for constructing 3D protein structures. It works by comparing the target sequence with the sequences of proteins of known structure in the Protein Data Bank [24].
Validation of protein structure
The modelled structures were validated with several tools, each based on a different principle, before the docking experiments were performed on the shortlisted proteins. PROCHECK [28] was used to evaluate the stereochemical quality of a protein structure by analyzing its residue-by-residue geometry as well as its overall structural geometry. ERRAT, an empirical atom-based analysis tool [29], was also used, and PSIPRED was used to predict the secondary structure (β-sheets, α-helices, and random coils) of the shortlisted proteins as a sequence-based validation [30]. In addition, the ProSA web server [31] was employed to validate the modelled protein structure against the available structures in the PDB on the basis of the Z-score [32].
Ligand and active site prediction
Once the structure had been modelled, the active site at which a ligand could bind had to be identified. Because the modelled structure contained no ligand in its active site, the ligand was predicted from the template protein of the modelled structure, which had been obtained through the BLAST search. This active site was then used for docking ligands against the respective protein.
Molecular docking studies
Molecular docking is a computational technique used to predict the non-covalent binding of a macromolecule, i.e. a protein (receptor), and a ligand, starting from their unbound structures, here acquired from homology modelling [33]. The most effective ligand in molecular docking is the one with the lowest docking score for its target protein. Docking was performed with AutoDock v4.2 [34] using standard docking parameters, with the Lamarckian GA set to 10 runs and a maximum of 27,000 generations. In the docking experiment, the modelled protein functioned as the target and the identified compound acted as the ligand [35].
Redocking and virtual screening for the identification of novel drug candidates
The docking parameters were first validated by re-docking the ADP co-crystal ligand found within the binding site of histidine kinase. The conventional docking protocol of AutoDock was used. The ligand was docked with the Lamarckian GA set to 250 runs, a maximum of 27,000 generations, and 2,500,000 evaluations [36]. The re-docking was performed to assess whether the docking program could reproduce the crystal conformation of the bound ligand [36].
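Re-docking is usually judged by the root-mean-square deviation (RMSD) between the docked pose and the crystal pose of the same ligand. The sketch below computes the RMSD over matched atoms without any superposition, which is appropriate when both poses are expressed in the same receptor frame. The coordinates are invented, and the 2.0 Å acceptance threshold is a commonly used convention assumed here, not a value stated in the text.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD between two equally ordered lists of (x, y, z) atom positions."""
    if not coords_a or len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must be non-empty and equal in length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


# Invented coordinates: the "docked" pose is the "crystal" pose shifted by 1 Å in z.
crystal_pose = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0), (1.5, 1.5, 0.0)]
docked_pose = [(x, y, z + 1.0) for x, y, z in crystal_pose]

value = rmsd(crystal_pose, docked_pose)
ACCEPTANCE_THRESHOLD = 2.0  # Å, assumed convention
print(f"RMSD = {value:.3f} Å, acceptable = {value <= ACCEPTANCE_THRESHOLD}")
```

The atom order must be identical in both poses; symmetric ligands may need a symmetry-aware matching that this sketch does not attempt.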
Subsequently, virtual screening of 2000 compounds from the ZINC database [37] was performed against histidine kinase, the selected drug target protein, to identify novel drug candidates using AutoDock Vina [38]. These compounds were selected because their molecular weights ranged from 150 to 350 Da (Dalton), in line with Lipinski's rule of 5, according to which the molecular weight should be below 500 Da, and because they were readily available from the in-house (institute) library. The grid was set to 72 points on the X-axis, 112 on the Y-axis, and 104 on the Z-axis, and the grid center was set to 55.3, 30.378, and 26.716, respectively [38]. AutoDock Vina PDBQT Split 1.1.1 [39] was used to split the prepared PDBQT library into the required files. Virtual screening was carried out using the default parameters applied in the docking study.
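The compound pre-selection described above can be sketched as two simple checks: a molecular weight window of 150 to 350 Da and a count of Lipinski rule-of-five violations (molecular weight above 500 Da, logP above 5, more than 5 hydrogen-bond donors, more than 10 hydrogen-bond acceptors). The compounds and their property values below are invented for illustration, and the class and function names are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class Compound:
    name: str
    mol_weight: float  # Da
    logp: float
    h_donors: int
    h_acceptors: int


def lipinski_violations(c: Compound) -> int:
    """Count violations of Lipinski's rule of five."""
    return sum([
        c.mol_weight > 500,
        c.logp > 5,
        c.h_donors > 5,
        c.h_acceptors > 10,
    ])


def in_weight_window(c: Compound, low: float = 150, high: float = 350) -> bool:
    """Molecular weight window used to pre-select the screening library."""
    return low <= c.mol_weight <= high


# Invented compounds, not taken from the ZINC library.
library = [
    Compound("cmpd_small", 120.0, 1.0, 1, 2),
    Compound("cmpd_ok", 310.4, 2.7, 2, 5),
    Compound("cmpd_greasy", 340.0, 5.6, 1, 4),
    Compound("cmpd_heavy", 612.8, 6.1, 6, 12),
]
selected = [c.name for c in library
            if in_weight_window(c) and lipinski_violations(c) == 0]
print(selected)
```

Whether one violation is tolerated is a design choice; the sketch uses the strict criterion of zero violations.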
Post docking analysis
The Molecular Operating Environment (MOE 2019.01) [40] was used to assess the docked ligand-protein interactions and to depict the ligand's H-bonds and hydrophobic interactions with the docked protein within a range of 5 Å. The MMFF94s force field was used for energy minimization.
Physiochemical property profiling and toxicity predictions
The physicochemical (ADME) properties of the ZINC compound library were examined to determine the characteristics and parameters that may influence bioactivity. The physicochemical analysis covers the estimation of compound drug-likeness (e.g., Lipinski's rule, lead-likeness), molecular weight, the interaction of the compound with the biological environment (e.g., cell permeability, skin permeability, intestinal permeability), biopharmaceutical properties (e.g., pKa value, solubility), interaction with plasma proteins, and drug bioavailability. The pkCSM [41] and SwissADME [42] tools were used to analyze the Absorption, Distribution, Metabolism, and Excretion (ADME) properties as well as a number of factors related to the pharmacological action of the drug [43].
Prediction of protein-protein interaction of identified drug target
Histidine kinase, the identified drug target protein, which was found to be essential through the Database of Essential Genes (DEG) and predicted to be cytoplasmic, was evaluated for interactions with other proteins. STRING Version 11.5 (Search Tool for the Retrieval of Interacting Genes/Proteins) [44] is a database of protein interactions that includes both verified and predicted interactions. Interactions can be both direct (physical) and indirect (functional). STRING statistically integrates interaction data from these sources for many species and transfers information between organisms where applicable. The database currently comprises 5,214,234 proteins from 1133 species [45]. It was used to determine whether the identified drug target can act as a hub protein and to validate its functional interactions [46]. Hub proteins in the PPI network are identified using node degrees and clustering coefficients. The medium confidence value of 0.40 (the default setting) was set as the minimum required interaction score for the PPIs.
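The two network measures named above can be illustrated on a toy graph. The sketch filters scored interactions at the 0.40 confidence threshold and then computes the node degree and the local clustering coefficient, i.e. the fraction of pairs of neighbors of a node that are themselves connected. The protein labels and scores are invented, and the function names are hypothetical; this is not how STRING computes its statistics internally.

```python
from itertools import combinations


def build_graph(scored_edges, min_score=0.40):
    """Undirected adjacency sets from (protein_a, protein_b, score) triples."""
    graph = {}
    for a, b, score in scored_edges:
        if score >= min_score:
            graph.setdefault(a, set()).add(b)
            graph.setdefault(b, set()).add(a)
    return graph


def local_clustering(graph, node):
    """Fraction of neighbor pairs of `node` that are directly connected."""
    neighbours = graph[node]
    k = len(neighbours)
    if k < 2:
        return 0.0
    links = sum(1 for u, v in combinations(neighbours, 2) if v in graph[u])
    return 2 * links / (k * (k - 1))


# Invented interactions; the B-C edge falls below the confidence threshold.
toy_edges = [
    ("HK", "A", 0.90),
    ("HK", "B", 0.80),
    ("HK", "C", 0.70),
    ("A", "B", 0.60),
    ("B", "C", 0.30),
]
graph = build_graph(toy_edges)
degree = {node: len(neigh) for node, neigh in graph.items()}
hub = max(degree, key=degree.get)
print(hub, degree[hub], round(local_clustering(graph, hub), 3))
```

A node with a high degree is a hub candidate; its clustering coefficient then indicates how tightly its interaction partners are connected to one another.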
Results
Subtractive genomics approach
The current study applies an efficient subtractive genomics approach, as shown in Fig 1. Fig 1 depicts the complete series of steps as well as the tools and databases used for the identification of potent drug targets against Moraxella catarrhalis BBH18. The in silico evaluation showed that the complete proteome of Moraxella catarrhalis BBH18 comprises a total of 1881 proteins. The step-wise filtering of the proteins during the current study is shown in Table 1.
[Figure omitted. See PDF.]
Removal of paralogous protein sequences
The CD-HIT tool identified a total of two paralogous proteins among the 1881 proteins. The remaining 1879 proteins were non-paralogous.
Non-homologous proteins identification
These proteins were then subjected to BLASTp analysis against the human proteome to select proteins that are non-homologous to the human proteome. On sorting the BLASTp results, a total of 519 proteins showed similarity to human proteins; these proteins were excluded from the downstream analysis because targeting them may cause cross-reactivity and undesired toxicity in humans. Therefore, a total of 1360 non-homologous proteins were selected for further analysis.
Identification of essential non-homologous genes
Moreover, a BLASTp search was performed against DEG, which comprises a collection of essential genes found in a wide variety of pathogenic and non-pathogenic organisms (both prokaryotes and eukaryotes). A total of 91 proteins were identified as essential for the viability of M. catarrhalis and could be proposed as potent drug targets.
Druggability of essential proteins
These 91 proteins were further subjected to BLASTp analysis against the DrugBank database. Only proteins with considerable sequence similarity to FDA-approved therapeutic targets were chosen, and the rest were omitted from the dataset. The BLASTp alignment search resulted in 38 druggable proteins of M. catarrhalis.
Identification of essential virulent and antibiotic resistance protein
These 38 proteins were then evaluated using BLASTp analysis against the Virulence Factor Database. Only 14 of them were classified as essential virulent proteins, i.e. proteins with high virulence potential in M. catarrhalis. Of these 14 shortlisted proteins, only four were identified as antibiotic resistance proteins against the ARG-ANNOT database.
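The step-wise counts reported in the preceding subsections can be checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text (1881, 1879, 1360, 91, 38, 14, and 4) and reports how many proteins each step removed and what share of the proteome remains; the step labels and function name are shorthand introduced here.

```python
# Protein counts after each filtering step, as reported in the Results text.
funnel = [
    ("complete proteome", 1881),
    ("non-paralogous (CD-HIT)", 1879),
    ("non-homologous to human (BLASTp)", 1360),
    ("essential (DEG)", 91),
    ("druggable (DrugBank)", 38),
    ("virulent (VFDB)", 14),
    ("antibiotic resistance (ARG-ANNOT)", 4),
]


def retention(steps):
    """Return (label, count, removed_at_this_step, percent_of_proteome_retained)."""
    total = steps[0][1]
    previous = total
    rows = []
    for label, count in steps:
        rows.append((label, count, previous - count, round(100 * count / total, 1)))
        previous = count
    return rows


rows = retention(funnel)
for label, count, removed, percent in rows:
    print(f"{label:36s} {count:5d}  removed {removed:4d}  {percent:5.1f}% retained")
```

The removed counts of 2 at the CD-HIT step and 519 at the human BLASTp step agree with the numbers given in the text.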
Subcellular localization prediction
Within the subtractive genomics approach, PSORTb was employed to find the subcellular location of the non-homologous essential proteins. Among the 38 essential drug-like proteins, PSORTb predicted 79% to be cytoplasmic, 18% to be cytoplasmic membrane proteins, and only 3% to be outer membrane proteins. According to the CELLO2GO results, 65.3% of the proteins were cytoplasmic, 24.5% were inner membrane proteins, 4.1% were periplasmic proteins, 4.1% were outer membrane proteins, and 2% were extracellular proteins, as shown in Table 2. The distribution of all essential proteins in M. catarrhalis is depicted in Fig 2A and 2B.
[Figure omitted. See PDF.]
Subcellular localization: (A) PSORTb results showing the subcellular distribution of 38 essential proteins identified in M. catarrhalis. (B) CELLO2GO results showing the subcellular distribution of 38 essential proteins identified in M. catarrhalis.
[Figure omitted. See PDF.]
Novel drug targets prediction
In this study, 38 potential drug targets were shortlisted, as shown in Table 3. Because they are non-homologous and non-paralogous, these 38 proteins may be considered promising therapeutic targets. Among these 38 potential drug targets, four were classified as antibiotic resistance proteins. Of these, one protein, sensor histidine kinase (D5VAF6), was shortlisted as an essential, non-homologous, druggable target against M. catarrhalis with high virulence and antibiotic resistance, and it was therefore taken forward to structure-based studies. Fig 3 shows the comprehensive outcome of the current study.
[Figure omitted. See PDF.]
[Figure omitted. See PDF.]
Significance of selected protein
Sensor histidine kinase is an ATP-binding signal transduction protein found in the two-component system of M. catarrhalis [47]. Sensor histidine kinases detect changes in the environment surrounding the pathogen (such as stress or the presence of a drug) and transmit signals into the cell that dynamically adjust its internal mechanisms, preparing the bacterium to respond to these changes. Changes in these sensor kinases have been linked to resistance to many antibacterial drugs, such as cefotaxime [48].
Homology modeling of shortlisted drug target
The 3-dimensional structure of histidine kinase (one of the four shortlisted proteins) was not available in the Protein Data Bank (PDB). As a result, the protein's FASTA sequence from the NCBI database, with the accession number D5VAF6 as specified in the database, was used for homology modelling. 4CTI, 4BIU, and 4BIZ were the PDB structures identified as possible templates, with percent identities of 34.67%, 27%, and 26.69%, respectively. Ultimately, the structure 4BIZ, with 26.69% sequence identity and 54% query coverage, was selected as the template because of its similarity and the availability of a ligand, and the structure was successfully modelled, as shown in Fig 4.
[Figure omitted. See PDF.]
(A) Modelled structure of the protein (drug target): structure modelled with SWISS-MODEL for sensor histidine kinase using 4BIZ as the template. (B) Template protein and modelled protein: superimposition of the template protein (sienna) and the modelled histidine kinase (medium purple). (C) Sequence alignment of the modelled protein (histidine kinase) and the template protein (4BIZ), generated with Clustal Omega, showing sequence similarity.
Modelled structure’s validation
Several tools were used to verify the modelled protein structure:
I) Confirmation of proteins through PSIPRED
PSIPRED was used for the secondary structure validation of the protein. It validated the structure through the prediction of helix and beta sheet formation, as shown in S1 Fig.
II) PROCHECK validation of proteins
PROCHECK was used to generate a Ramachandran plot for the modelled protein structure. The Ramachandran plot showed about 91.0% of the residues in the most favorable regions, 16 residues (about 6.8%) in the additionally allowed regions, 4 residues (about 1.7%) in the generously allowed regions, and one residue in the disallowed region, as shown in S2 Fig.
III) ERRAT validation of proteins
The ERRAT tool was used to validate the statistics of non-bonded interactions between atoms in the structure. It yielded an overall quality factor of about 89.147%, as shown in S3 Fig.
IV) ProSA-web validation of proteins
The ProSA web server was used to assess the quality of the 3-D structure of the modelled protein in terms of its Z-score. The resulting Z-score was -7.33. The Z-score is displayed relative to the Z-scores of protein chains determined by NMR spectroscopy (dark blue) and X-ray crystallography (light blue), plotted against the length of the protein chain, as shown in S4A Fig. The energy plot illustrates the local model quality by depicting the knowledge-based energies as a function of amino acid sequence position, as shown in S4B Fig.
Prediction of active site
The active site of a protein can be predicted with a variety of bioinformatics tools, including molecular docking assays and structure-based drug discovery. The active site was predicted on the basis of the following amino acids and their associated distances: Arginine (R) at position 423 (1.330 Å), Leucine (L) at position 461 (1.550 Å), Glutamic acid (E) at position 407 (1.546 Å), Arginine (R) at position 428 (1.339 Å), and Isoleucine (I) at position 417 (1.548 Å), as shown in Fig 5A.
[Figure omitted. See PDF.]
A. Predicted Active Site for histidine kinase: i.e. depicted by R423 (1.330 Å), L461 (1.550 Å), E407 (1.546 Å), R428 (1.339 Å) and I417 (1.548 Å) residues. B. Pre-Docked and Post Docked Protein: the superimposed docked complex of pre docked ADP (in dark khaki) over post docked (in aqua) highlighting the accuracy of docking study in terms of RMSD indicated as 1.660 Å.
Protein-ligand interactions study through docking
The protein-ligand interactions were analyzed through AutoDock Vina.
i) Ligand identification
The ligand co-crystallized within the binding cavity of the template protein (i.e., ADP) was retrieved and is shown in S5 Fig. This compound belongs to the purine ribonucleoside 3',5'-bisphosphate class of organic compounds. These are purine ribonucleotides with one phosphate group attached to each of the 3' and 5' hydroxyl groups of the ribose moiety [16]. The IUPAC name of ADP is {[(2R,3S,4R,5R)-5-(6-amino-9H-purin-9-yl)-4-hydroxy-3-(phosphonooxy)oxolan-2-yl]methoxy}phosphonic acid. The ligand was identified from the template protein PDB ID: 4BIZ (from Escherichia coli) for sensor histidine kinase.
ii) Molecular docking with AutoDock
AutoDock 4.2 was used for the docking study of histidine kinase. The ADP ligand was docked with the Lamarckian GA set to 10 runs, a maximum of 2,500,000 energy evaluations, and a maximum of 27,000 generations. AutoDock revealed the binding of the ligand in the active site of the protein in various orientations and conformations. Each conformation has a distinct binding energy, ranging from negative to positive. The lowest binding energy of -6.3 kcal/mol was used to rank the top conformation of ADP, since the lowest binding energy corresponds to spontaneous binding of the ligand to the active site and to a lower-energy, more stable complex.
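The link between a lower binding energy and a more stable complex can be made concrete with the standard thermodynamic relation ΔG = RT ln(Ki), which gives an estimated inhibition constant Ki = exp(ΔG / RT). The sketch below applies it to the two binding energies reported for ADP in this study. The temperature of 298.15 K is an assumption, and the resulting values are rough estimates derived from docking scores, not measured constants.

```python
import math

GAS_CONSTANT = 1.987204e-3  # kcal / (mol * K)


def estimated_ki(delta_g_kcal_per_mol, temperature_k=298.15):
    """Estimated inhibition constant (mol/L) from a binding free energy."""
    return math.exp(delta_g_kcal_per_mol / (GAS_CONSTANT * temperature_k))


# Binding energies reported in the text for ADP (kcal/mol).
for label, energy in [("docking", -6.3), ("re-docking", -5.32)]:
    ki_micromolar = estimated_ki(energy) * 1e6
    print(f"{label:10s} dG = {energy:6.2f} kcal/mol  ->  Ki ~ {ki_micromolar:.0f} uM")
```

Because the relation is exponential, a difference of about 1.4 kcal/mol in binding energy corresponds to roughly a tenfold difference in the estimated Ki at room temperature.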
Redocking and virtual screening for identification of novel drug candidates
The redocking validation of co-crystallized ligand yielded a binding energy prediction of -5.32 kcal/mol. Fig 5B depicts the top-ranked docked conformation of ADP with histidine kinase protein. The RMSD of the top-ranked docked ADP conformation against the co-crystal structure was 1.660 Å indicating that the docking parameters could be implemented for virtual screening.
The modelled structure was docked with the ZINC library. The compounds were screened using the same parameters that had been used for the AutoDock validation (redocking). As seen in Fig 6A, the screening led to the identification of 789 highly ranked compounds with binding energies ranging from -5.5 to -6.5 kcal/mol. In addition, 1424 molecules exhibited favorable interactions with histidine kinase, with binding energies between -5.5 and -9.3 kcal/mol, as illustrated in Fig 6B. These compounds may serve as leads in the future. Furthermore, Fig 6C shows the three potent drug candidates identified for histidine kinase among the 2000 compounds from ZINC. These candidates are ZINC09185674, ZINC03839141, and ZINC00631248, with binding energies of -6.4, -6.2, and -6.2 kcal/mol, respectively.
[Figure omitted. See PDF.]
(A) Virtual screening of 2000 compounds, (B) identified lead-like compounds, and (C) proposed lead compounds in the current study.
Post docking analysis (i.e. interaction analysis of selected compounds with histidine kinase)
The post-docking interaction analysis of the shortlisted compounds was conducted to understand their mechanism of binding and pharmacological activity against histidine kinase. The rank order of the compounds by docking score is as follows:
ZINC09185674 > ZINC03839141 and ZINC00631248, with binding energies of -6.4, -6.2, and -6.2 kcal/mol, respectively.
The docking analysis of ZINC09185674 revealed a considerable binding energy of -6.4 kcal/mol. ZINC09185674 was found to form hydrophobic interactions within the binding cavity of histidine kinase. It mediates H-bonds as a hydrogen acceptor to Arg356 and Lys355 via the hydroxyl group and one H-bond as a hydrogen acceptor to Gln331 via the pyrrole ring's oxygen.
The binding score of ZINC03839141 was -6.2 kcal/mol. The hydrophobic interactions within the binding cavity are depicted. As a hydrogen acceptor, it forms two hydrogen bonds with Tyr339. ZINC03839141 also interacted ionically with Asp332.
With a docking score of -6.2 kcal/mol, ZINC00631248 demonstrated strong hydrophobic interactions. Importantly, the reference compound (ADP) showed an ionic interaction with Arg428. It serves as a hydrogen donor to Tyr368 and as a hydrogen acceptor from Glu407 in two H-bonds through the hydroxyl group, and it also shows a pi-pi interaction. The redocked compounds within the modelled protein (2D and 3D interactions) are depicted in Fig 7. Table 4 shows the docking scores and the types of bonds predicted by the MOE tool for the identified compounds.
[Figure omitted. See PDF.]
Redocked compound incorporating the modelled protein: (A) For ZINC09185674, (B) ZINC03839141, (C) ZINC00631248 and (D) Reference Protein.
[Figure omitted. See PDF.]
Physiochemical property profiling and toxicity predictions
The pharmacokinetic parameters of the three chosen compounds were calculated using the online pkCSM tool, covering Blood-Brain Barrier crossing capability, drug-likeness, toxicological analyses, and ADME characteristics. Lipinski's rule of five was employed in the drug-likeness characterization.
The SwissADME tool was employed to predict the drug-likeness of the compounds. Two of the three selected candidates showed zero violations of Lipinski's Rule of Five, whereas one compound showed only one violation and still had acceptable drug-like properties. The results of the ADME analysis of the three shortlisted compounds, including Water Solubility, Blood-Brain Barrier (BBB) Permeability, Human Intestinal Absorption (HIA), Skin Permeability, CaCo2 permeability, and Lipinski violations, are shown in Table 5.
[Figure omitted. See PDF.]
The results of the toxicity analysis, i.e. Max Tolerated Dose (Human), Minnow toxicity, Skin Sensitization, Hepatotoxicity, Ames test, Oral Rat Acute Toxicity (LD50), and T. pyriformis toxicity, are shown in Table 6. The table also includes the radar plot of each compound.
[Figure omitted. See PDF.]
Prediction of protein-protein interaction of identified drug target protein
The predicted interactions can be used to filter and analyze functional genomic data and to annotate structural, functional, and evolutionary information on proteins.
The histidine kinase identifier (NCBI ID: D5VAF6) was submitted to the STRING database to find its interactions with other proteins in Moraxella catarrhalis. MCR_0156 represents the histidine kinase, and its interaction partners were MCR_0386 (Two-component system sensor histidine kinase) with a score of 0.721, MCR_0387 (Two-component system sensor histidine kinase) with a score of 0.811, MCR_0405 (Tetratricopeptide repeat family protein) with a score of 0.746, MCR_1062 (LuxR family transcriptional regulator) with a score of 0.998, bioF (8-amino-7-oxononanoate synthase) with a score of 0.844, csrA (Translational regulator CsrA) with a score of 0.770, ompR (Two-component system response regulator) with a score of 0.946, phoB (Two-component system phosphate regulon response regulator PhoB) with a score of 0.896, phoR (Two-component system phosphate regulon sensor histidine kinase PhoR) with a score of 0.894, and rumA (23S rRNA (uracil(1939)-C(5))-methyltransferase RlmD) with a score of 0.760. The results showed that the network of the histidine kinase (MCR_0156) protein has 78 nodes and 482 edges, compared with 309 expected edges, and an average node degree of 12.4. The PPI enrichment p-value is < 1.0e-16, with an average local clustering coefficient of 0.657 (Fig 8). These proteins are engaged in a variety of critical functions. Targeting the histidine kinase protein may therefore compromise the function of the other interacting proteins. As a result, this protein might be used as a therapeutic target.
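The reported network statistics can be cross-checked with simple arithmetic. In an undirected network every edge contributes to the degree of two nodes, so the average node degree is 2E/N. The sketch below uses only the node and edge counts stated in the text; the ratio of observed to expected edges is added here as an informal illustration of the enrichment and is not a statistic reported by the study.

```python
# Network statistics reported in the text for the MCR_0156 network.
nodes = 78
edges = 482
expected_edges = 309

# Each undirected edge adds one to the degree of two nodes.
average_degree = 2 * edges / nodes

# Informal illustration only: how many more edges than expected by chance.
observed_to_expected = edges / expected_edges

print(f"average node degree   = {average_degree:.1f}")
print(f"observed / expected   = {observed_to_expected:.2f}")
```

The computed average degree rounds to the reported value of 12.4, which confirms that the node and edge counts in the text are mutually consistent.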
[Figure omitted. See PDF.]
Discussion
Computational methods and approaches have recently gained considerable attention for the identification and development of potent drug targets [49]. Yet high-throughput experimental sequencing data for the majority of infectious bacteria are currently unavailable, and efforts to define and identify essential drug targets have so far relied solely on bioinformatics predictions [50]. Because of the evident rise in drug resistance among pathogens, in silico subtractive genomic analysis has been widely used for strain-specific drug target identification [51].
In the current research, a subtractive genomics approach was employed to identify and predict potent drug targets. It is one of the most widely used approaches for identifying novel drug targets against various deadly pathogens. The focus of the current research was one of the most clinically relevant species, i.e., the Moraxella catarrhalis BBH18 strain. The approach identified a total of 91 non-homologous proteins, 38 essential druggable proteins, and 14 virulent proteins of M. catarrhalis, of which four were reported as potent drug targets: the efflux pump membrane transporter (D5VAP4) and the histidine kinases D5VAF5, D5VAF6, and D5V9S5. Targeting these proteins with effective drug candidates and vaccines may result in the removal and destruction of the pathogen from the host. Finally, one enzymatic protein, the sensor histidine kinase (D5VAF6) involved in the two-component system, was selected as a potential drug target against M. catarrhalis. It plays a key role in the bacterium's development and survival [52].
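The subtractive logic summarized above amounts to a chain of filters, each applied to the survivors of the previous one. The following minimal sketch shows that idea on a toy proteome. The `Protein` record, its boolean flags, the stage names, and the `TOY_*` identifiers are all hypothetical. In a real pipeline, each flag would come from a separate tool or database search rather than being set by hand, and the order of stages here is an assumption for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Protein:
    """Toy record; every flag is a stand-in for the result of an external search."""
    protein_id: str
    is_paralog: bool
    human_homolog: bool
    essential: bool
    druggable: bool
    virulent: bool


# Ordered filter stages: (label, predicate that returns True to KEEP the protein).
STAGES = [
    ("non-paralogous", lambda p: not p.is_paralog),
    ("non-homologous to host", lambda p: not p.human_homolog),
    ("essential", lambda p: p.essential),
    ("druggable", lambda p: p.druggable),
    ("virulent", lambda p: p.virulent),
]


def subtract(proteome):
    """Apply each stage in order; return the survivors and the count left after each stage."""
    remaining = list(proteome)
    counts = {}
    for label, keep in STAGES:
        remaining = [p for p in remaining if keep(p)]
        counts[label] = len(remaining)
    return remaining, counts


# Hypothetical toy proteome: one protein fails at each stage, one passes all.
toy_proteome = [
    Protein("TOY_A", True, False, True, True, True),
    Protein("TOY_B", False, True, True, True, True),
    Protein("TOY_C", False, False, False, True, True),
    Protein("TOY_D", False, False, True, False, True),
    Protein("TOY_E", False, False, True, True, True),
    Protein("TOY_F", False, False, True, True, False),
]

survivors, stage_counts = subtract(toy_proteome)

if __name__ == "__main__":
    for label, n in stage_counts.items():
        print(f"{label}: {n} remaining")
    print([p.protein_id for p in survivors])
```

Recording the count after every stage mirrors how the study reports shrinking candidate sets (91 non-homologous, 38 essential druggable, 14 virulent proteins), and makes it easy to see which filter removes the most candidates.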
Histidine kinase has been reported in different studies as a potential drug target against various pathogens, such as Mycobacterium tuberculosis [53], Salmonella enterica [54], Streptococcus species [55], and Bacillus subtilis and Staphylococcus aureus [53]. However, it had not yet been documented as a drug target for M. catarrhalis BBH18, and this study uniquely reports it as such. Histidine kinase plays a major role in the two-component system of M. catarrhalis. Bacteria typically use two-component signal transduction systems to translate external and cellular signals into cellular responses. Because of the relevance of this protein's function, it might be used as a potential therapeutic target in the future [56].
Furthermore, a ZINC library (>2000 compounds) was screened against the selected drug target to identify potential inhibitors. Following the screening process, 789 compounds with binding energies between -5.5 and -6.5 kcal/mol were identified as promising candidates, and 1424 molecules showed preferential interactions with histidine kinase (binding energies of -5.5 to -9.3 kcal/mol). ADMET profiling was then performed on the docked compounds to predict highly potent drug-like molecules. Subsequently, only three compounds were shortlisted as novel drug candidates: ZINC09185674, ZINC03839141, and ZINC00631248, with binding energies of -6.4, -6.2, and -6.2 kcal/mol, respectively. To verify the docking analysis, molecular re-docking was performed for the reference compound (ADP) using the same parameters. The re-docking yielded an RMSD of 1.660 Å, validating the applied screening protocol.
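Two of the calculations mentioned above are simple enough to sketch: ranking compounds by binding energy and computing the RMSD used to judge a re-docked pose. The code below is a hedged illustration, not the authors' script. The function names and toy coordinates are hypothetical. The RMSD is computed in place, without superposition, over atoms assumed to be listed in the same order. The 2.0 Å acceptance threshold is a commonly used convention and an assumption here, since the text does not state which cutoff was applied.

```python
import math

# Binding energies (kcal/mol) of the shortlisted compounds, as reported in the text.
binding_energy = {
    "ZINC09185674": -6.4,
    "ZINC03839141": -6.2,
    "ZINC00631248": -6.2,
}


def rank_by_energy(energies: dict) -> list:
    """Most negative (most favourable) binding energy first."""
    return sorted(energies, key=lambda name: energies[name])


def rmsd(coords_a, coords_b) -> float:
    """Root-mean-square deviation between two equally ordered lists of (x, y, z) atoms."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("coordinate lists must be non-empty and of equal length")
    squared = sum(
        (xa - xb) ** 2 + (ya - yb) ** 2 + (za - zb) ** 2
        for (xa, ya, za), (xb, yb, zb) in zip(coords_a, coords_b)
    )
    return math.sqrt(squared / len(coords_a))


def pose_reproduced(rmsd_value: float, threshold: float = 2.0) -> bool:
    """Assumed convention: a re-docked pose within 2.0 Å of the reference counts as reproduced."""
    return rmsd_value <= threshold


# Hypothetical toy poses: every atom of the re-docked pose is shifted 1 Å along x.
reference_pose = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
redocked_pose = [(1.0, 0.0, 0.0), (2.0, 0.0, 0.0), (1.0, 1.0, 0.0)]

if __name__ == "__main__":
    print(rank_by_energy(binding_energy)[0])
    print(rmsd(reference_pose, redocked_pose))
    print(pose_reproduced(1.660))
```

Under the assumed 2.0 Å threshold, the reported re-docking RMSD of 1.660 Å would count as a reproduced pose.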
In this research, bioinformatics-based subtractive genomics analysis was used to identify prospective therapeutic targets and candidates. Genome and proteome pipelines were examined to prioritize effective antimicrobial agents that may be useful in halting the progression of the severe disease 'Campylobacteriosis', followed by experimental verification. This might aid in the treatment of periodontal or other C. concisus-related disorders, as well as the reversal of C. concisus-induced intestinal microbial imbalance infections. The method employed has the potential to serve as a general method for target identification, and hence may be used in drug development.
The preceding research identified different essential proteins that could be used as potential drug targets and candidates. In this context, cytoplasmic proteins are typically used as drug targets, whereas membrane-associated proteins can be employed for the formulation of peptide vaccines [57]. In the future, other computational methodologies, together with this approach and experimental validation, can be used to develop potential therapeutic strategies not only against M. catarrhalis but also against other pathogens.
Conclusion
The analysis of the genomes and proteomes of many pathogens has revolutionized the identification of therapeutic targets. In this research, a subtractive genomic approach was employed to determine non-homologous, essential, druggable proteins of M. catarrhalis. These potential drug targets may aid in developing novel antibiotics directed against M. catarrhalis. Because the revealed targets are not shared with the host (Homo sapiens in this case), drugs directed against them are less likely to cause allergic responses or other harmful consequences. Novel drug candidates that target the function of these proteins may be capable of damaging the pathogen and eliminating the infection from the host. The findings encompass the essential and potent drug targets identified in M. catarrhalis, which could help future researchers to develop effective drug agents and vaccines against strain-specific M. catarrhalis BBH18.
Supporting information
S1 Fig. Secondary structure validation through PSIPRED predicts the positions for helices and beta sheets for histidine kinase.
https://doi.org/10.1371/journal.pone.0273252.s001
(TIF)
S2 Fig. Ramachandran plot generated through PROCHECK shows 91% of residues in the favored region for histidine kinase.
https://doi.org/10.1371/journal.pone.0273252.s002
(TIF)
S3 Fig. ERRAT analysis of the statistics of non-bonded interactions between atoms in the structure shows a quality factor of about 89.147%.
https://doi.org/10.1371/journal.pone.0273252.s003
(TIF)
S4 Fig. The ProSA web server evaluates the quality of 3-D protein structures in terms of a Z-score, which is -7.33 for the modelled protein.
https://doi.org/10.1371/journal.pone.0273252.s004
(TIF)
S5 Fig. Identified ligand: adenosine diphosphate (ADP), IUPAC name {[(2R,3S,4R,5R)-5-(6-amino-9H-purin-9-yl)-4-hydroxy-3-(phosphonooxy)oxolan-2-yl]methoxy}phosphonic acid, identified as the ligand for the histidine kinase drug target.
https://doi.org/10.1371/journal.pone.0273252.s005
(TIF)
S1 Graphical abstract.
https://doi.org/10.1371/journal.pone.0273252.s006
(TIF)
Citation: Ashraf B, Atiq N, Khan K, Wadood A, Uddin R (2022) Subtractive genomics profiling for potential drug targets identification against Moraxella catarrhalis. PLoS ONE 17(8): e0273252. https://doi.org/10.1371/journal.pone.0273252
About the Authors:
Bilal Ashraf
Roles: Conceptualization, Data curation, Writing – original draft, Writing – review & editing
¶‡ BA and NA contributed equally and share first authorship.
Affiliations Baqai Institute of Information Technology, Baqai Medical University Karachi, Karachi, Pakistan, Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, Pakistan
Nimrah Atiq
Roles: Conceptualization, Validation, Writing – original draft, Writing – review & editing
¶‡ BA and NA contributed equally and share first authorship.
Affiliations Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, Pakistan, Hamdard Institute of Engineering Sciences and Technology, Hamdard University, Karachi, Pakistan
Kanwal Khan
Roles: Conceptualization, Data curation, Formal analysis
Affiliation: Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, Pakistan
Abdul Wadood
Roles: Conceptualization, Data curation, Project administration, Supervision
Affiliation: Department of Biochemistry, Abdul Wali Khan University, Mardan, Pakistan
Reaz Uddin
Roles: Conceptualization, Data curation, Supervision
E-mail: [email protected]
Affiliation: Dr. Panjwani Center for Molecular Medicine and Drug Research, International Center for Chemical and Biological Sciences, University of Karachi, Karachi, Pakistan
https://orcid.org/0000-0002-9928-4482
1. Verduin C.M., et al., Moraxella catarrhalis: from emerging to established pathogen. Clin. Microbiol. Rev., 2002. 15(1): p. 125–144.
2. Tan T.T., et al., Haemophilus influenzae survival during complement-mediated attacks is promoted by Moraxella catarrhalis outer membrane vesicles. J. Infect. Dis., 2007. 195(11): p. 1661–70.
3. Su Y.-C., Singh B., and Riesbeck K., Moraxella catarrhalis: from interactions with the host immune system to vaccine development. Future Microbiol., 2012. 7(9): p. 1073–1100.
4. Soltan M.A., et al., In Silico Prediction of a Multitope Vaccine against Moraxella catarrhalis: Reverse Vaccinology and Immunoinformatics. 2021. 9(6): p. 669.
5. Hakenbeck R., et al., Molecular mechanisms of β-lactam resistance in Streptococcus pneumoniae. Future Microbiol., 2012. 7(3): p. 395–410.
6. Jalal K., et al., Pan-Genome Reverse Vaccinology Approach for the Design of Multi-Epitope Vaccine Construct against Escherichia albertii. International Journal of Molecular Sciences, 2021. 22(23): p. 12814.
7. Fair R. and Tor Y., Antibiotics and bacterial resistance in the 21st century. Perspect Med. Chem., 2014. 6: p. PMC-S14459.
8. Sudha R., et al., Identification of potential drug targets and vaccine candidates in Clostridium botulinum using subtractive genomics approach. Bioinformation, 2019. 15(1): p. 18–25.
9. Vilela Rodrigues T.C., et al., Reverse vaccinology and subtractive genomics reveal new therapeutic targets against Mycoplasma pneumoniae: a causative agent of pneumonia. R. Soc. Open. Sci., 2019. 6(7): p. 190907.
10. Tanwer P., et al., Identification of potential therapeutic targets in Neisseria gonorrhoeae by an in-silico approach. J. Theor. Biol., 2020. 490: p. 110172.
11. Maurya P.K., Singh S., and Mani A., Comparative genomic analysis of Rickettsia rickettsii for identification of drug and vaccine targets: tolC as a proposed candidate for case study. Acta. Trop., 2018. 182: p. 100–110.
12. Jalal K., et al., Identification of vaccine and drug targets in Shigella dysenteriae sd197 using reverse vaccinology approach. Sci. Rep., 2022. 12(1): p. 251.
13. Consortium T.U., UniProt: the universal protein knowledgebase in 2021. Nucleic Acids Res., 2020. 49(D1): p. D480–D489.
14. Luo H., et al., DEG 10, an update of the database of essential genes that includes both protein-coding genes and noncoding genomic elements. Nucleic Acids Res., 2014. 42(D1): p. D574–D580.
15. Zhang R., Ou H.Y., and Zhang C.T., DEG: a database of essential genes. Nucleic Acids Res., 2004. 32(suppl_1): p. D271–D272.
16. Wishart D.S., et al., DrugBank 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res., 2018. 46(D1): p. D1074–D1082.
17. Chen L., et al., VFDB 2016: hierarchical and refined dataset for big data analysis—10 years on. Nucleic Acids Res., 2015. 44(D1): p. D694–D697.
18. Gupta S.K., et al., ARG-ANNOT, a new bioinformatic tool to discover antibiotic resistance genes in bacterial genomes. Antimicrob. Agents Chemother., 2014. 58(1): p. 212–220.
19. Fenoll A., et al., Temporal trends of invasive Streptococcus pneumoniae serotypes and antimicrobial resistance patterns in Spain from 1979 to 2007. J. Clin. Microbiol., 2009. 47(4): p. 1012–1020.
20. Fu L., et al., CD-HIT: accelerated for clustering the next-generation sequencing data. Bioinformatics, 2012. 28(23): p. 3150–3152.
21. Fatoba A.J., Okpeku M., and Adeleke M.A., Subtractive Genomics Approach for Identification of Novel Therapeutic Drug Targets in Mycoplasma genitalium. Pathogens, 2021. 10(8): p. 921.
22. Johnson M., et al., NCBI BLAST: a better web interface. Nucleic Acids Res., 2008. 36(suppl_2): p. W5–W9.
23. Deng J., et al., Investigating the predictability of essential genes across distantly related organisms using an integrative approach. Nucleic Acids Res., 2011. 39(3): p. 795–807.
24. Kaur H., Kalia M., and Taneja N., Identification of novel non-homologous drug targets against Acinetobacter baumannii using subtractive genomics and comparative metabolic pathway analysis. Microb. Pathog., 2021. 152: p. 104608.
25. Yu N.Y., et al., PSORTb 3.0: improved protein subcellular localization prediction with refined localization subcategories and predictive capabilities for all prokaryotes. Bioinformatics, 2010. 26(13): p. 1608–1615.
26. Yu C.-S., et al., CELLO2GO: a web server for protein subCELlular LOcalization prediction with functional gene ontology annotation. PloS one, 2014. 9(6): p. e99368.
27. Waterhouse A., et al., SWISS-MODEL: homology modelling of protein structures and complexes. Nucleic Acids Res., 2018. 46(W1): p. W296–W303.
28. Laskowski R., et al., PROCHECK—a program to check the stereochemical quality of protein structures. J. App. Cryst., 1993. 26.
29. Dym O., Eisenberg D., and Yeates T., ERRAT. Int. J. Biol., 2012. 21: p. 678–679.
30. Buchan D.W. and Jones D.T., The PSIPRED protein analysis workbench: 20 years on. Nucleic Acids Res., 2019. 47(W1): p. W402–W407.
31. Wiederstein M. and Sippl M., ProSA-web: interactive web service for the recognition of errors in three-dimensional structures of proteins. Nucleic Acids Res., 2007. 35(suppl_2): p. W407–W410.
32. Satyanarayana S.D., et al., In silico structural homology modeling of nif A protein of rhizobial strains in selective legume plants. J. Genet. Eng. Biotechnol., 2018. 16(2): p. 731–737.
33. Trott O. and Olson A.J., AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading. J. Comput. Chem., 2010. 31(2): p. 455–461.
34. Tanchuk V.Y., et al., A new, improved hybrid scoring function for molecular docking and scoring based on AutoDock and AutoDock Vina. Chem. Biol. Drug Des., 2016. 87(4): p. 618–625.
35. Khan K., et al., Comparative Metabolic Pathways Analysis and Subtractive Genomics Profiling to Prioritize Potential Drug Targets Against Streptococcus pneumoniae. Front. Microbiol., 2021. 12.
36. Uddin R. and Azam S.S., Identification of glucosyl-3-phosphoglycerate phosphatase as a novel drug target against resistant strain of Mycobacterium tuberculosis (XDR1219) by using comparative metabolic pathway approach. Comput. Biol. Chem., 2019. 79: p. 91–102.
37. Irwin J.J. and Shoichet B.K., ZINC− a free database of commercially available compounds for virtual screening. J. Chem. Inf. Model., 2005. 45(1): p. 177–182.
38. Gaillard T., Evaluation of AutoDock and AutoDock Vina on the CASF-2013 benchmark. J. Chem. Inf. Model., 2018. 58(8): p. 1697–1706.
39. Huey R., Morris G.M., and Forli S., Using AutoDock 4 and AutoDock vina with AutoDockTools: a tutorial. The Scripps Research Institute Molecular Graphics Laboratory, 2012. 10550: p. 92037.
40. Vilar S., Cozza G., and Moro S., Medicinal chemistry and the molecular operating environment (MOE): application of QSAR and molecular docking to drug discovery. Curr. Top. Med. Chem., 2008. 8(18): p. 1555–1572.
41. Pires D.E., Blundell T.L., and Ascher D.B., pkCSM: predicting small-molecule pharmacokinetic and toxicity properties using graph-based signatures. J. Med. Chem., 2015. 58(9): p. 4066–4072.
42. Daina A., Michielin O., and Zoete V., SwissADME: a free web tool to evaluate pharmacokinetics, drug-likeness and medicinal chemistry friendliness of small molecules. Sci. Rep., 2017. 7(1): p. 1–13.
43. Jalal K., et al., In Silico Study to Identify New Monoamine Oxidase Type A (MAO-A) Selective Inhibitors from Natural Source by Virtual Screening and Molecular Dynamics Simulation. J. Mol. Struct., 2021: p. 132244.
44. Szklarczyk D., et al., STRING v10: protein–protein interaction networks, integrated over the tree of life. Nucleic Acids Res., 2015. 43(D1): p. D447–D452.
45. Hosen M.I., et al., Application of a subtractive genomics approach for in silico identification and characterization of novel drug targets in Mycobacterium tuberculosis F11. Interdiscip. Sci., 2014. 6(1): p. 48–56.
46. Szklarczyk D., et al., The STRING database in 2021: customizable protein–protein networks, and functional characterization of user-uploaded gene/measurement sets. Nucleic Acids Res., 2021. 49(D1): p. D605–D612.
47. Singh A.P., et al., Graphene oxide/ferrofluid/cement composites for electromagnetic interference shielding application. Nanotechnology, 2011. 22(46): p. 465701.
48. Möglich A., Signal transduction in photoreceptor histidine kinases. Protein Sci., 2019. 28(11): p. 1923–1946.
49. Boeckmann B., et al., The SWISS-PROT protein knowledgebase and its supplement TrEMBL in 2003. Nucleic Acids Res., 2003. 31(1): p. 365–370.
50. Uddin R. and Rafi S., Structural and functional characterization of a unique hypothetical protein (WP_003901628. 1) of Mycobacterium tuberculosis: a computational approach. Med. Chem. Res., 2017. 26(5): p. 1029–1041.
51. Uddin R., et al., Proteome-wide subtractive approach to prioritize a hypothetical protein of XDR-Mycobacterium tuberculosis as potential drug target. Chem. Biol. Drug Des., 2019. 41(11): p. 1281–1292.
52. Gupta S.R., et al., Comparative Proteome Analysis of Mycobacterium Tuberculosis Strains-H37Ra, H37Rv, CCDC5180, and CAS/NITR204: A Step Forward to Identify Novel Drug Targets. Lett. Drug Des. Discov., 2020. 17(11): p. 1422–1431.
53. Watanabe T., et al., Isolation and characterization of inhibitors of the essential histidine kinase, YycG in Bacillus subtilis and Staphylococcus aureus. J. Antibiot., 2003. 56(12): p. 1045–1052.
54. Hossain T., et al., Application of the subtractive genomics and molecular docking analysis for the identification of novel putative drug targets against Salmonella enterica subsp. enterica serovar Poona. Biomed Res. Int., 2017. 2017.
55. Georrge J.J. and Umrania V.V., Subtractive Genomics Approach to Identify Putative Drug Targets and Identification of Drug-like Molecules for Beta Subunit of DNA Polymerase III in Streptococcus Species. Applied Biochemistry and Biotechnology, 2012. 167(5): p. 1377–1395.
56. Rosales‐Hurtado M., et al., Synthesis of histidine kinase inhibitors and their biological properties. Med. Res. Rev., 2020. 40(4): p. 1440–1495.
57. Masomian M., et al., Development of next generation Streptococcus pneumoniae vaccines conferring broad protection. Vaccines, 2020. 8(1): p. 132.
© 2022 Ashraf et al. This is an open access article distributed under the terms of the Creative Commons Attribution License: http://creativecommons.org/licenses/by/4.0/ (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original author and source are credited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Moraxella catarrhalis (M. catarrhalis) is a gram-negative bacterium responsible for major respiratory tract and middle ear infections in infants and adults. The recent emergence of antibiotic-resistant M. catarrhalis makes the prioritization of an effective drug target urgent. Fortunately, the failure of new drugs and the host toxicity associated with traditional drug development approaches can be avoided by using an in silico subtractive genomics approach. In the current study, an advanced in silico genome subtraction approach was applied to identify potential pathogen-specific drug targets against M. catarrhalis. We applied a series of subtraction steps to the whole genome of the pathogen, screening in turn for paralogous proteins, proteins with extensive homology to human proteins, and essential, drug-like, non-virulent, and resistance-associated proteins. Only 38 potent drug targets were identified in this study. Eventually, one protein, histidine kinase (UniProt ID: D5VAF6), was identified as a potential new drug target and forwarded to structure-based studies. Furthermore, virtual screening of 2000 compounds from the ZINC database was performed against the histidine kinase, which resulted in the shortlisting of three compounds as potential therapeutic candidates based on their binding energies and the properties predicted by ADMET analysis. The identified protein provides a platform for the discovery of a lead drug candidate that may inhibit it and may help to eradicate the otitis media caused by drug-resistant M. catarrhalis. Moreover, the current study helped in creating a pipeline for drug target identification that may assist wet-lab research in the future.