This is an open access article distributed under the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
1. Introduction
Transmembrane protease serine 2 (TMPRSS2) is an androgen-regulated gene located on human chromosome 21q22.3; it spans approximately 43.59 kb and contains 14 exons [1]. TMPRSS2 is expressed in many tissues, including the prostate, bile duct, breast, kidney, colon, pancreas, ovary, stomach, salivary gland, and lung [1]. The full-length TMPRSS2 cDNA encodes a protein of 492 amino acids, with a type II transmembrane domain, a low-density lipoprotein receptor class A domain (LDLRA, aa 113-148), a scavenger receptor cysteine-rich domain (SRCR, aa 149-242), and a serine protease domain (aa 255-492) [2].
To date, the precise physiological roles of transmembrane protease serine 2 remain unclear, but it has been implicated in many biological processes such as digestion, fertility, blood coagulation, tissue remodeling, inflammatory responses, tumor cell invasion, and apoptosis [2]. TMPRSS2 also plays an essential role in prostate tumorigenesis via the proteolytic activation of the protease-activated receptor 2 (PAR-2) [3, 4]. A study of prostate cancer (PCa) by Magi-Galluzzi et al. revealed that the TMPRSS2-ERG fusion was significantly correlated with ethnicity and geography (50% of Caucasian, 31.3% of African-American, and 15.9% of Japanese patients) [5]. Another study by Kong et al. explored the association between the TMPRSS2-ERG gene fusion and clinicopathological characteristics and reported no significant correlation between the fusion and clinical parameters [6].
Recently, it has been shown that SARS-CoV-2 engages angiotensin-converting enzyme 2 (ACE2) as the entry receptor and uses TMPRSS2 for S protein priming [7]. Overall, SARS-CoV-2 is characterized by four types of structural proteins, i.e., spike (S), envelope (E), membrane (M), and nucleocapsid (N), and by accessory proteins such as ORF3a, ORF7a, ORF8, ORF9, and ORF10 [8, 9].
The S protein is composed of an extracellular N-terminal S1 subunit, which is essential for binding the receptor, and a C-terminal S2 subunit, which is used for membrane fusion. The envelope E protein is composed of a hydrophilic amino terminus (7-12 aa), a hydrophobic transmembrane domain, and a long C-terminal domain that are essential for viral assembly and maturation. The M protein is composed of a hydrophilic C-terminus and an amphipathic N-terminus, which are needed for viral assembly. The N protein consists of an N-terminal RNA-binding domain (NTD) and a C-terminal dimerization domain (CTD) separated by a serine-rich linker region; these are essential for viral entry and assembly [8, 9]. As TMPRSS2 is expressed in bronchial and lung cells, it can facilitate entry of SARS-CoV-2 into host cells by cleaving the ACE2 receptor at arginine 697-716 positions [2]. The TMPRSS2 protein is also responsible for the proteolytic cleavage of the viral spike protein (S) [10]. Several studies have identified the three residues of the TMPRSS2 catalytic triad, namely, His 296, Asp 345, and Ser 441, which play a crucial role in the formation of the molecular complex between TMPRSS2 and the viral spike protein S and, consequently, in SARS-CoV-2 infection [10].
Recent studies show the existence of unique variants in TMPRSS2 (p.Val160Met, p.Gly181Arg, p.Arg240Cys, p.Pro335Leu, p.Gly432Ala, and p.Asp435Tyr) that can alter the efficiency of TMPRSS2 and might influence susceptibility to SARS-CoV-2 [11].
Taking all these considerations into account, this article aims to elucidate the plausible effect of TMPRSS2 missense variants on the structure, stability, and function of TMPRSS2 using different publicly available bioinformatics algorithms. Using a wide array of pathogenicity tools, such as SIFT, PolyPhen2.0, PROVEAN, SNAP2, and PMut, provides consistent results. In addition, stability, conservation, and flexibility analyses using bioinformatics tools, namely, I-Mutant Suite, MUpro, iStable, STRUM, CUPSAT, ConSurf, ModPred, and FlexPred, help in understanding the effect of each mutation on the TMPRSS2 protein [12–15].
2. Materials and Methods
2.1. Datasets
The amino acid sequence encoded by the TMPRSS2 gene was obtained in FASTA format from the UniProt database (UniProt ID: O15393) (https://www.uniprot.org). All the variants of the TMPRSS2 gene were collected from the Ensembl Genome Browser (https://www.ensembl.org/Homo_sapiens/Gene/Variation_Gene/Table?db=core;g=ENSG00000184012;r=21:41464305-41531116). A total of 392 missense variants were mapped in the human TMPRSS2 gene, but we limited our study to those SNPs that may explain genetic susceptibility to COVID-19; therefore, six variants remained.
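Because the six variants are written both in three-letter protein notation (e.g., p.Val160Met) and in one-letter shorthand (e.g., V160M) throughout this article, a small conversion helper can keep the two notations consistent. The following is a minimal sketch only; the function name `to_short_form` and the tolerant whitespace handling are illustrative assumptions and are not part of any tool used in this study.

```python
import re

# Standard three-letter to one-letter amino acid codes.
THREE_TO_ONE = {
    "Ala": "A", "Arg": "R", "Asn": "N", "Asp": "D", "Cys": "C",
    "Gln": "Q", "Glu": "E", "Gly": "G", "His": "H", "Ile": "I",
    "Leu": "L", "Lys": "K", "Met": "M", "Phe": "F", "Pro": "P",
    "Ser": "S", "Thr": "T", "Trp": "W", "Tyr": "Y", "Val": "V",
}

# Accepts "p.Val160Met" and tolerates stray spaces such as "p. Val 160Met".
_PATTERN = re.compile(r"^p\.\s*([A-Z][a-z]{2})\s*(\d+)\s*([A-Z][a-z]{2})$")


def to_short_form(variant: str) -> str:
    """Convert a three-letter missense notation to one-letter shorthand."""
    match = _PATTERN.match(variant.strip())
    if match is None:
        raise ValueError(f"not a simple missense notation: {variant!r}")
    ref, pos, alt = match.groups()
    if ref not in THREE_TO_ONE or alt not in THREE_TO_ONE:
        raise ValueError(f"unknown amino acid code in {variant!r}")
    return f"{THREE_TO_ONE[ref]}{int(pos)}{THREE_TO_ONE[alt]}"


if __name__ == "__main__":
    for v in ["p.Val160Met", "p.Gly181Arg", "p.Arg240Cys",
              "p.Pro335Leu", "p.Gly432Ala"]:
        print(v, "->", to_short_form(v))
```

The helper only handles single-residue substitutions; other variant classes (deletions, frameshifts) are deliberately rejected.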
2.2. Functional Analysis of Human TMPRSS2 Missense Variants
SIFT (Sorting Intolerant from Tolerant) is a sequence homology-based algorithm that predicts whether an amino acid substitution has a tolerable or intolerable effect on protein function; it is available at https://sift.bii.a-star.edu.sg/ [16]. A substitution is predicted to be “deleterious” if the prediction score ranges from 0 to 0.05 and “tolerated” if the prediction score is greater than 0.05 [17]. PolyPhen2.0 (Polymorphism Phenotyping v2) is a web server that uses physical and comparative considerations to estimate the effect of an amino acid substitution on protein function and structure; it is available at https://genetics.bwh.harvard.edu/pph2/ [18]. PROVEAN (Protein Variation Effect Analyzer) is an algorithm that predicts the possible impact of an amino acid substitution based on an alignment score approach; it is available at http://provean.jcvi.org/ [50]. SNAP2 (Screening for Non-Acceptable Polymorphisms 2) is a bioinformatics tool that uses annotations from the protein mutant database (PMD) to predict the changes in protein function due to nsSNPs; it is available at https://rostlab.org/services/snap/ [19]. PMut (http://mmb.irbbarcelona.org/PMut) is a tool based on a neural network classification method, which uses both sequence conservation and physicochemical properties to predict disease-associated mutations [20]. MutPred2 (http://mutpred.mutdb.org/) is a machine learning approach that predicts the molecular cause of disease-related amino acid changes. MutPred2 considers functional, structural, and evolutionary properties, including secondary structures, posttranslational modification (PTM), and metal binding [21].
2.3. Structural Analysis of Human TMPRSS2 Missense Variants
2.3.1. Protein Stability
I-Mutant Suite is a web server based on a support vector machine, developed to predict the stability change of a mutated protein from its sequence or, when available, its structure. I-Mutant predicts if a given mutation increases (
2.3.2. Identification of Conserved Residues and Sequence Motifs
Clustal Omega, a bioinformatics program, was used to align multiple homologous proteins or DNA/RNA sequences. It uses both the older clustalX and clustalW for multiple sequence alignment. Clustal Omega is available at https://www.ebi.ac.uk/Tools/msa/clustalo/ or can be used from the command line [28]. Jalview is a freely available system (https://www.jalview.org), which was used for visualization, editing, figure generation, and analysis of molecular sequences, alignment, and structures, provided by the European Bioinformatics Institute (EBI) and the University of Dundee [29].
2.3.3. Evolutionary Phylogenetic Analysis of TMPRSS2
ConSurf (https://consurf.tau.ac.il/) is an in silico tool that uses an empirical Bayesian method to estimate the degree of evolutionary conservation of each position in a macromolecule (protein or nucleic acid). The conservation grades range from 1 to 9, where a score of 1–4 is variable, 5–6 is intermediate, and 7–9 is conserved [30].
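The grade bins described above can be expressed as a simple lookup. This is a sketch of the binning rule as stated in this section only, not of ConSurf itself; the function name `consurf_category` is a hypothetical helper, and grades outside 1–9 are rejected because the text assigns no category to them.

```python
def consurf_category(grade: int) -> str:
    """Map a conservation grade to the category used in this article.

    1-4 -> variable, 5-6 -> intermediate, 7-9 -> conserved.
    """
    if not 1 <= grade <= 9:
        raise ValueError(f"grade must be between 1 and 9, got {grade}")
    if grade <= 4:
        return "variable"
    if grade <= 6:
        return "intermediate"
    return "conserved"


if __name__ == "__main__":
    # Grades as reported for the six studied positions in Table 2.
    grades = {"V160": 6, "G181": 9, "R240": 5, "P335": 3, "G432": 9, "D435": 9}
    for residue, grade in grades.items():
        print(residue, grade, consurf_category(grade))
```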
2.3.4. Prediction of Posttranslational Modification
ModPred (http://www.modpred.org/), a web server, was developed to predict posttranslational modification sites (PTMs) such as acetylation, methylation, N-linked glycosylation, N-terminal acetylation, phosphorylation, SUMOylation, and ubiquitination. As a PTM predictor, ModPred estimates the overall propensity of a particular amino acid to be changed [31].
2.3.5. Protein Flexibility
FlexPred (http://flexpred.rit.albany.edu/), a bioinformatics program, uses sequence-derived information and solvent accessibility to evaluate residue positions involved in conformational switches. FlexPred classifies amino acid residues as rigid or flexible [32]. PredyFlexy (https://www.dsimb.inserm.fr/dsimb_tools/predyflexy/) is an online tool that was used to predict protein flexibility. PredyFlexy adopts the X-ray
2.3.6. Secondary Structure
PredictProtein (https://predictprotein.org/) is an automatic server that uses FASTA amino acid sequence as input and predicts protein structure such as secondary structure, solvent accessibility, disulfide bonds, transmembrane helices, strands, coiled-coil regions, and disordered regions, and function [35].
2.4. Modeling
Swiss-Model (https://swissmodel.expasy.org/), an automated server, was used for predicting the three-dimensional structure of proteins. Using a FASTA amino acid sequence as input, the Swiss-Model server searches for templates and builds models. It gives the best models with sequence identity higher than 30% [36]. ModRefiner (https://zhanglab.ccmb.med.umich.edu/ModRefiner/), an online server, was used for high-resolution protein structure refinement. ModRefiner works in two separate phases: first, it starts from the C-alpha trace and builds the main chain hydrogen-bonding networks; second, the side chains are added onto the backbone conformation, guided by a composite of physics- and knowledge-based force fields [37]. PROCHECK (https://servicesn.mbi.ucla.edu/PROCHECK/) is a web-based tool for assessing the quality of protein structures. Its outputs contain a large number of plots, including the Ramachandran plot [38]. Verify3D, a freely available online server (https://servicesn.mbi.ucla.edu/Verify3D/), was used to assess the quality of protein models with three-dimensional profiles. A file in PDB format was provided as input to generate a profile window plot [39]. TM-align (https://zhanglab.ccmb.med.umich.edu/TM-align/), an online tool, was employed to predict the best alignment between two structures using both a TM-score rotation matrix and dynamic programming. A
2.5. Ligand Binding Site Prediction
COACH is a metaserver approach for prediction of protein-ligand binding sites. The server employs other comparative methods, like TM-Site and S-Site, FINDSITE, COFACTOR, and ConCavity, which are available at https://zhanglab.ccmb.med.umich.edu/COACH/ [41]. RaptorX binding site (http://raptorx.uchicago.edu/BindingSite/), a tool, was used for the prediction of ligand binding regions by submitting the FASTA format as input [42].
2.6. Protein Display
Protter (https://wlab.ethz.ch/protter/start/), an open-source graphics program, was developed to visualize sequence feature annotations together with experimental proteomic data [43].
2.7. Dynamic Cross-Correlation Matrix Analysis Using Bio3d Package by RStudio Software and DynOmics Server
We determined the Dynamic Cross-Correlation Maps (DCCM) of native and mutant TMPRSS2 using the Bio3d package in RStudio [9]. Then, we used the DynOmics ENM server to determine the correlation between observed and predicted fluctuations of native and mutant TMPRSS2. DynOmics ENM, an online server, was used for computing the dynamics of the biomolecular system described by a PDB file. DynOmics ENM uses two elastic network models (ENMs): the Gaussian Network Model (GNM) and the Anisotropic Network Model (ANM) [44]. Bio3d is an R package for the comparative analysis of biomolecular structure, sequence, and dynamics. Bio3d integrates multiple comparative methods such as principal component analysis (PCA), ensemble difference distance matrix (eDDM) analysis, network analysis, and normal mode analysis (NMA) [45].
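For readers unfamiliar with cross-correlation maps, the quantity plotted in a DCCM is commonly defined as C_ij = <Δr_i · Δr_j> / sqrt(<Δr_i²> <Δr_j²>), where Δr_i is the displacement of residue i from its mean position and the averages run over an ensemble of conformations. The sketch below illustrates this general definition on a toy ensemble in plain Python. It is not the Bio3d implementation used in this study, and the function name `dccm` and the input layout are assumptions made for illustration.

```python
import math


def dccm(frames):
    """Return the N x N cross-correlation matrix for an ensemble.

    `frames` is a list of conformations; each conformation is a list of
    (x, y, z) tuples, one per residue (e.g., C-alpha positions).
    Every residue must move in at least one frame, otherwise the
    normalisation divides by zero.
    """
    n_frames = len(frames)
    n_res = len(frames[0])
    # Mean position of each residue over the ensemble.
    mean = [
        tuple(sum(f[i][k] for f in frames) / n_frames for k in range(3))
        for i in range(n_res)
    ]
    # Displacement of each residue from its mean, for every frame.
    disp = [
        [tuple(f[i][k] - mean[i][k] for k in range(3)) for i in range(n_res)]
        for f in frames
    ]

    def cov(i, j):
        # Ensemble average of the dot product of two displacement vectors.
        total = sum(sum(d[i][k] * d[j][k] for k in range(3)) for d in disp)
        return total / n_frames

    diag = [cov(i, i) for i in range(n_res)]
    return [
        [cov(i, j) / math.sqrt(diag[i] * diag[j]) for j in range(n_res)]
        for i in range(n_res)
    ]


if __name__ == "__main__":
    # Toy ensemble: residues 0 and 1 move together, residue 2 moves oppositely.
    toy = [
        [(0, 0, 0), (1, 0, 0), (2, 0, 0)],
        [(1, 0, 0), (2, 0, 0), (1, 0, 0)],
        [(-1, 0, 0), (0, 0, 0), (3, 0, 0)],
    ]
    for row in dccm(toy):
        print([round(value, 2) for value in row])
```

Values close to +1 indicate correlated motion and values close to -1 indicate anticorrelated motion, which corresponds to the red and blue regions of a DCCM plot.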
3. Results
All the reported missense variants of the TMPRSS2 gene were retrieved from the Ensembl Genome Browser (https://www.ensembl.org/Homo_sapiens/Gene/Variation_Gene/Table?db=core;g=ENSG00000184012;r=21:41464305-41531116). In this paper, we selected only six missense variants (rs12329760, rs781089181, rs762108701, rs1185182900, rs570454392, and rs867186402) to investigate the potential genetic susceptibility to COVID-19. To that end, we used a multitier approach combining different algorithms: functional analysis of human TMPRSS2 missense variants using SIFT, PolyPhen2.0, PROVEAN, SNAP2, PMut, and MutPred; stability analysis of mutant proteins using I-Mutant Suite, MUpro, CUPSAT, iStable, and STRUM; assessment of whether the missense variants involve conserved and exposed residues of the TMPRSS2 protein using Clustal Omega and ConSurf; analysis of the effect of missense variants on protein flexibility and secondary structure using FlexPred, PredyFlexy, RaptorX property, and PredictProtein; analysis and comparison of the tertiary structures of mutant and native proteins using Swiss-Model, ModRefiner, PROCHECK, Verify3D, and TM-align; and finally ligand binding site prediction using the COACH and RaptorX binding site servers.
3.1. Functional Analysis of Human TMPRSS2 Missense Variants
Among the six missense variants tested, five were predicted to be damaging by SIFT (prediction scores ranged from 0 to 0.02) (Table 1). According to PolyPhen2.0, all the variants were identified as probably damaging (prediction score close to 1). PROVEAN predicted five of the SNPs to be deleterious (G181R, R240C, P335L, G432A, and D435Y), and SNAP2 predicted all of the submitted SNPs to affect protein function. With PMut, five of the submitted mutations were found to be disease-related (V160M, G181R, P335L, G432A, and D435Y). As presented in Table 2, MutPred analysis revealed that G181R was significantly associated with gain of a helix, loss of disulfide at C185, and the gain of ADP-ribosylation at G181 with
Table 1
Missense variants identified to be deleterious or damaging using different algorithms.
SNP ID | Amino acid change | SIFT score | SIFT prediction | PolyPhen2.0 score | PolyPhen2.0 prediction | PROVEAN score | PROVEAN prediction | SNAP2 score | SNAP2 prediction | PMut prediction |
rs12329760 | V160M | 0.01 | D | 0.997 | P. D | -1.891 | N | 95 | E | Dis |
rs781089181 | G181R | 0.06 | T | 1.000 | P. D | -6.057 | Del | 45 | E | Dis |
rs762108701 | R240C | 0.01 | D | 1.000 | P. D | -5.224 | Del | 63 | E | N |
rs1185182900 | P335L | 0.02 | D | 0.985 | P. D | -7.515 | Del | 39 | E | Dis |
rs570454392 | G432A | 0.00 | D | 1.000 | P. D | -5.631 | Del | 63 | E | Dis |
rs867186402 | D435Y | 0.00 | D | 1.000 | P. D | -7.975 | Del | 74 | E | Dis |
Legend: D: damaging; T: tolerated, P. D: probably damaging; Del: deleterious; N: neutral; E: effect; Dis: disease.
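To make the agreement between tools in Table 1 easier to see, the categorical predictions can be tallied per variant. The snippet below is a minimal sketch that transcribes the prediction columns of Table 1 and counts how many of the five tools flag each variant as harmful. The simple vote count is an illustrative summary, not a scoring scheme used by any of the tools, and the names `TABLE1` and `harmful_votes` are hypothetical.

```python
# Categorical calls transcribed from Table 1
# (D: damaging, T: tolerated, P. D: probably damaging,
#  Del: deleterious, N: neutral, E: effect, Dis: disease).
TABLE1 = {
    "V160M": {"SIFT": "D", "PolyPhen2.0": "P. D", "PROVEAN": "N", "SNAP2": "E", "PMut": "Dis"},
    "G181R": {"SIFT": "T", "PolyPhen2.0": "P. D", "PROVEAN": "Del", "SNAP2": "E", "PMut": "Dis"},
    "R240C": {"SIFT": "D", "PolyPhen2.0": "P. D", "PROVEAN": "Del", "SNAP2": "E", "PMut": "N"},
    "P335L": {"SIFT": "D", "PolyPhen2.0": "P. D", "PROVEAN": "Del", "SNAP2": "E", "PMut": "Dis"},
    "G432A": {"SIFT": "D", "PolyPhen2.0": "P. D", "PROVEAN": "Del", "SNAP2": "E", "PMut": "Dis"},
    "D435Y": {"SIFT": "D", "PolyPhen2.0": "P. D", "PROVEAN": "Del", "SNAP2": "E", "PMut": "Dis"},
}

# Labels that indicate a predicted harmful effect.
HARMFUL = {"D", "P. D", "Del", "E", "Dis"}


def harmful_votes(calls: dict) -> int:
    """Count how many tools predict a harmful effect for one variant."""
    return sum(1 for label in calls.values() if label in HARMFUL)


if __name__ == "__main__":
    for variant, calls in TABLE1.items():
        print(f"{variant}: {harmful_votes(calls)} of {len(calls)} tools")
```

Under this tally, every variant is flagged by at least four of the five tools, and P335L, G432A, and D435Y are flagged by all five.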
Table 2
Prediction of effect of missense variants on phylogenetic conservation, phenotypic analysis, and posttranslational modification sites in human TMPRSS2 protein.
SNP ID | Variant | Posttranslational modifications (PTMs) by ModPred | Phylogenetic conservation | Predicted effect by MutPred |
rs12329760 | V160M | - | 6, B | - |
rs781089181 | G181R | - | 9, B | Loss of loop; altered transmembrane protein; gain of helix; loss of disulfide linkage at C185; gain of ADP-ribosylation at G181 |
rs762108701 | R240C | Proteolytic cleavage; ADP-ribosylation | 5, E | - |
rs1185182900 | P335L | Proteolytic cleavage | 3, E | - |
rs570454392 | G432A | Proteolytic cleavage | 9, E, F | Loss of relative solvent accessibility; loss of loop; altered transmembrane protein; altered metal binding; gain of disulfide linkage at C437; gain of catalytic site at D435; gain of pyrrolidone carboxylic acid at Q431 |
rs867186402 | D435Y | Proteolytic cleavage | 9, E, F | Altered transmembrane protein; altered ordered interface; altered metal binding; loss of relative solvent accessibility; loss of catalytic site at G439; gain of disulfide linkage at C437; gain of pyrrolidone carboxylic acid at Q43; gain of sulfation at D435 |
3.2. Structural Analysis of Human TMPRSS2 Missense Variants
3.2.1. Protein Stability
I-Mutant Suite, MUpro, CUPSAT, iStable, and STRUM were used to predict the change in protein stability of TMPRSS2. Out of six nsSNPs submitted for stability testing, four variants (V160M, G181R, R240C, and G432A) were predicted to decrease the stability of the TMPRSS2 protein by all three of I-Mutant Suite, MUpro, and iStable, while five out of six missense variants were predicted to destabilize the TMPRSS2 protein by the STRUM server. CUPSAT identified five variants (V160M, G181R, P335L, G432A, and D435Y) out of six as destabilizing the TMPRSS2 protein. Only one variant, P335L, exhibited unfavorable changes in torsion angles with an influence on TMPRSS2 protein stability (Tables 3 and 4).
Table 3
Effects of mutation on protein stability by I-Mutant, MUpro, iStable, and STRUM.
SNP ID | Amino acid variant | I-Mutant | MUpro | iStable | STRUM |
rs12329760 | V160M | Decrease | Decrease | Decrease | Destabilizing |
rs781089181 | G181R | Decrease | Decrease | Decrease | Destabilizing |
rs762108701 | R240C | Decrease | Decrease | Decrease | Destabilizing |
rs1185182900 | P335L | Decrease | Increase | Increase | Destabilizing |
rs570454392 | G432A | Decrease | Decrease | Decrease | Stabilizing |
rs867186402 | D435Y | Increase | Decrease | Increase | Destabilizing |
Table 4
Missense variant analysis by CUPSAT tool.
SNP ID | Amino acid variant | Stability | Torsion | Predicted |
rs12329760 | V160M | Destabilizing | Favorable | -3.39 |
rs781089181 | G181R | Destabilizing | Favorable | -0.57 |
rs762108701 | R240C | Stabilizing | Favorable | 0.75 |
rs1185182900 | P335L | Destabilizing | Unfavorable | -2.39 |
rs570454392 | G432A | Destabilizing | Favorable | -6.86 |
rs867186402 | D435Y | Destabilizing | Favorable | -1.68 |
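The stability calls in Tables 3 and 4 can be combined into a per-variant tally across the five predictors. The following sketch transcribes the categorical calls from both tables and counts how many tools predict a destabilizing effect. The tally is an illustrative summary rather than a method used by any of the servers, and the names `CALLS` and `destabilizing_votes` are hypothetical.

```python
# Calls transcribed from Tables 3 and 4, in the order:
# I-Mutant, MUpro, iStable, STRUM, CUPSAT.
CALLS = {
    "V160M": ["Decrease", "Decrease", "Decrease", "Destabilizing", "Destabilizing"],
    "G181R": ["Decrease", "Decrease", "Decrease", "Destabilizing", "Destabilizing"],
    "R240C": ["Decrease", "Decrease", "Decrease", "Destabilizing", "Stabilizing"],
    "P335L": ["Decrease", "Increase", "Increase", "Destabilizing", "Destabilizing"],
    "G432A": ["Decrease", "Decrease", "Decrease", "Stabilizing", "Destabilizing"],
    "D435Y": ["Increase", "Decrease", "Increase", "Destabilizing", "Destabilizing"],
}

# Labels that indicate reduced stability.
DESTABILIZING = {"Decrease", "Destabilizing"}


def destabilizing_votes(calls: list) -> int:
    """Count how many predictors report reduced stability."""
    return sum(1 for label in calls if label in DESTABILIZING)


if __name__ == "__main__":
    for variant, calls in CALLS.items():
        print(f"{variant}: {destabilizing_votes(calls)} of {len(calls)} tools")
```

With these inputs, V160M and G181R are called destabilizing by all five tools, and each of the remaining variants by at least three.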
3.2.2. Conservation Analysis of TMPRSS2 Gene
The amino acid sequence of the TMPS2_Human transmembrane protease serine 2 protein was searched with BLAST against UniProtKB/Swiss-Prot in the NCBI databases, and the 100 sequences producing significant alignments were downloaded as Hit Table (CSV) files. Therefore, all sequences share more than 70% identity and an
3.2.3. Evolutionary Phylogenetic Analysis of TMPRSS2
The evolutionary conservation of amino acids in the TMPRSS2 protein was checked using the ConSurf server. As presented in Figures 1 and 2 and Table 2, ConSurf analysis showed that residues G181 (buried), G432, and D435 (exposed and functional) are highly conserved, with a conservation index of 9, and identified less conserved amino acid residues V160 (buried) and R240 (exposed), with a conservation index of 5-6. P335 was observed to have a conservation score of 3 (variable and exposed).
[figure omitted; refer to PDF][figure omitted; refer to PDF]
3.2.4. Protein Flexibility
FlexPred program was used to predict fluctuations and evaluate which amino acid residues are located in flexible or rigid regions of the TMPRSS2 protein. It was identified that five residues valine, glycine, arginine, proline, and aspartic acid at positions 160, 181, 240, 335, and 435, respectively, were rigid, while the glycine at position 432 was predicted flexible (Table 5).
Table 5
Prediction of TMPRSS2 flexibility using FlexPred server.
Position | Residue | S_LBL (rigid (R) or flexible (F) label) | S_PRB (probability of flexible (F) label) |
160 | VAL | R | 0.4874 |
181 | GLY | R | 0.6059 |
240 | ARG | R | 0.5174 |
335 | PRO | R | 0.5862 |
432 | GLY | F | 0.7747 |
435 | ASP | R | 0.6696 |
For identifying the levels of residue dynamics, we used the PredyFlexy program based on
Table 6
Flexibility analysis by PredyFlexy.
RMSF | Confidence index (CI) | ||
V160M | 0.687 | 0.574 | 8 |
G181R | 0.725 | 0.824 | 7 |
R240C | -0.359 | -0.375 | 10 |
P335L | 0.854 | 0.601 | 11 |
G432A | 0.788 | 0.690 | 2 |
D435Y | -0.105 | 0.195 | 6 |
To determine the secondary structure, disordered regions, and solvent accessibility of the TMPRSS2 protein, the RaptorX property server was used. As shown in Figure 3(c), 88 (17%) positions were predicted as disordered by RaptorX property, and eight secondary structure types were identified in the TMPRSS2 protein: α helix, 3-helix, 5-helix (π helix), extended strand in β ladder, isolated β bridge, hydrogen-bonded turn, bend, and coil. In terms of solvent accessibility, 27% of TMPRSS2 residues were intermediate, 46% were exposed, and 25% were buried (Figure 3(c)).
[figures omitted; refer to PDF]
3.2.5. Secondary Structure
To validate the solvent accessibility and protein secondary structure, we applied the PredictProtein tool. The main features predicted for the TMPRSS2 protein were helices and buried, exposed, and disordered regions. Three types of protein secondary structure were identified in the TMPRSS2 protein: helix 2.64% (H; includes α-, π-, and 3_10-helix), β-strand 23.37% (E; extended strand in the β-sheet conformation of at least two residues in length), and loop (L) 73.98%. Figures 3(a) and 3(b) display the PredictProtein analysis of the TMPRSS2 protein (46.14% buried residues and 53.86% exposed residues).
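Percentages such as those above are obtained by counting the per-residue state labels (H, E, L) in the predicted secondary structure string. The sketch below shows this calculation; the function name `ss_composition` is hypothetical, and the residue counts in the usage example (13 H, 115 E, and 364 L over 492 residues) are back-calculated assumptions that happen to reproduce the reported percentages, not values stated in the PredictProtein output.

```python
from collections import Counter


def ss_composition(ss: str) -> dict:
    """Return the percentage of each secondary structure state in `ss`.

    `ss` is a per-residue string of state labels, e.g. "HHHEELLL".
    """
    if not ss:
        raise ValueError("empty secondary structure string")
    counts = Counter(ss)
    return {state: 100.0 * n / len(ss) for state, n in counts.items()}


if __name__ == "__main__":
    # Assumed counts for a 492-residue protein (illustrative only).
    example = "H" * 13 + "E" * 115 + "L" * 364
    for state, pct in sorted(ss_composition(example).items()):
        print(f"{state}: {pct:.2f}%")
```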
3.2.6. Modeling
The full three-dimensional structure of the human TMPRSS2 protein was not available in the Research Collaboratory for Structural Bioinformatics Protein Data Bank (RCSB-PDB, http://rcsb.org). Therefore, the TMPRSS2 structure modeled by the SwissModeller group, with a resolution of 1.95 Å and a sequence identity of 98.69%, was used for further analysis. The selected wild-type and mutant structures were refined using ModRefiner and validated using PROCHECK and Verify3D (Table 7). The Ramachandran plot of the native protein identified 258 residues (86.9%) in favored regions, 38 residues (12.8%) in allowed regions (additional and generously allowed regions), and one residue (0.3%) in disallowed regions (Figure 3(d)). Furthermore, Verify3D analysis of the native and mutant proteins revealed that 95.38% (native) of the residues had an average 3D-1D
Table 7
TMPRSS2 structure validation using Verify3D and PROCHECK servers.
Structure | Verify3D (% of amino acids) | PROCHECK favored region | PROCHECK allowed region | PROCHECK disallowed region |
Native | 93.02 | 86.9% (258) | 12.8% (38) | 0.3% (1) |
V160M | 97.09 | 88.9% (264) | 10.1% (30) | 1.0% (3) |
G181R | 93.02 | 89.3% (266) | 10% (30) | 0.7% (2) |
R240C | 96.80 | 89.9% (267) | 9.4% (28) | 0.7% (2) |
P335L | 95.64 | 90.6% (270) | 8.7% (26) | 0.7% (2) |
G432A | 99.42 | 88.6% (264) | 10.7% (31) | 0.7% (2) |
D435Y | 94.19 | 87.2% (259) | 11.8% (35) | 1.0% (3) |
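The PROCHECK percentages in Table 7 follow directly from the residue counts in each Ramachandran region. As a worked check, the sketch below recomputes the percentages for the native model from the counts reported in the text (258 favored, 38 allowed, 1 disallowed). The function name `region_percentages` is hypothetical, and rounding to one decimal place is assumed to match the table.

```python
def region_percentages(counts: dict) -> dict:
    """Convert residue counts per Ramachandran region into percentages."""
    total = sum(counts.values())
    if total == 0:
        raise ValueError("no residues counted")
    # One decimal place, as reported in Table 7.
    return {region: round(100.0 * n / total, 1) for region, n in counts.items()}


if __name__ == "__main__":
    native = {"favored": 258, "allowed": 38, "disallowed": 1}
    print(sum(native.values()), "residues evaluated")
    print(region_percentages(native))
```

The same function can be applied to the counts of any mutant model to check a row of the table.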
In addition, structural similarities between the wild-type and mutant structures were assessed using the TM-align tool, based on the TM-score, which measures the topological similarity of two proteins, and the RMSD (Root Mean Square Deviation), which measures the distance between the backbones of the superimposed protein structures. The RMSD values for all missense variants were significant (
Table 8
Structure alignment comparing mutant models and native TMPRSS2 proteins.
Position | Variant | Align (TM-align) | RMSD (TM-align) | TM-score (TM-align) |
160 | V160M | 7meq1.A | 0.51 | 0.99435 |
181 | G181R | 7meq1.A | 0.51 | 0.99431 |
240 | R240C | 7meq1.A | 0.58 | 0.99304 |
335 | P335L | 7meq1.A | 0.53 | 0.99403 |
432 | G432A | 7meq1.A | 0.44 | 0.99590 |
435 | D435Y | 7meq1.A | 0.48 | 0.99510 |
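To clarify what the two metrics in Table 8 measure, the sketch below computes the RMSD and a TM-score for two sets of already superimposed, residue-paired coordinates. It assumes the commonly cited TM-score definition, with d0 = 1.24 (L - 15)^(1/3) - 1.8 for a target of length L. Unlike TM-align, it does not search for the optimal superposition or alignment, so it is an illustration of the formulas only; the function names are hypothetical.

```python
import math


def pair_distances(coords_a, coords_b):
    """Distances between paired atoms of two superimposed structures."""
    if len(coords_a) != len(coords_b):
        raise ValueError("coordinate lists must have the same length")
    return [math.dist(a, b) for a, b in zip(coords_a, coords_b)]


def rmsd(distances):
    """Root mean square deviation over the paired distances."""
    return math.sqrt(sum(d * d for d in distances) / len(distances))


def tm_score(distances, target_length):
    """TM-score for a fixed superposition, normalised by target length."""
    d0 = 1.24 * (target_length - 15) ** (1.0 / 3.0) - 1.8
    if d0 <= 0:
        raise ValueError("target too short for this d0 formula")
    return sum(1.0 / (1.0 + (d / d0) ** 2) for d in distances) / target_length


if __name__ == "__main__":
    # Toy example: structure B is structure A shifted by 1 unit along z.
    a = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0)]
    b = [(0.0, 0.0, 1.0), (1.0, 0.0, 1.0)]
    print("RMSD:", rmsd(pair_distances(a, b)))
    # Identical structures of 100 residues give a TM-score of 1.
    print("TM-score:", tm_score([0.0] * 100, 100))
```

A TM-score close to 1 together with a small RMSD, as in Table 8, indicates that the mutant models retain essentially the same fold as the native model.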
3.2.7. Ligand Binding Site Prediction
To identify ligand binding sites in the TMPRSS2 protein, we used RaptorX binding and COACH servers. According to the RaptorX binding tool, the largest pocket multiplicity was 55 (pocket
Table 9
Ligand binding site prediction of the TMPRSS2 protein by RaptorX binding.
Pocket | Multiplicity | Ligand | Binding residues |
1 | 55 | QGG, SO4, BEN, TFA, CH2 | H296, D435, S436, C437, Q438, G439, S441, T459, S460, W461, G462, S463, G464 |
2 | 19 | SO4 | Y416, S463, G464 |
3 | 19 | SO4 | D338, N343, N344 |
4 | 14 | SO4 | P335, W483, Q487 |
5 | 11 | SO4, PG4 | Q276, N277, L302 |
The bold values show the residues included in the current study.
According to the COACH server, D435 was predicted as a binding residue. The detailed results of COACH are shown in Table 10.
Table 10
Ligand binding site prediction of the TMPRSS2 gene.
(a)
COACH
Score | Cluster size | Name of ligand | Residue number |
0.82 | 1440 | T87 | 296, 340, 341, 342, 435, 436, 437, 438, 441, 459, 460, 462, 463, 464, 465, 472. |
0.05 | 123 | PEPTIDE | 275, 280, 281, 296, 297, 300, 301, 308, 435, 436, 437, 438, 439, 441, 459, 460, 461, 462, 463, 464, 472. |
0.03 | 93 | PEPTIDE | 260, 261, 263, 264, 265, 266, 268, 269, 358, 359, 362, 363, 364, 365, 377, 378, 380, 399, 401, 429, 447, 448, 451, 452, 453. |
0.02 | 77 | PEPTIDE | 274, 278, 311, 317, 318, 319, 320, 322, 325. |
0.02 | 55 | PEPTIDE | 265, 266, 267, 268, 269, 357, 359, 362, 363, 364, 365, 380, 399, 452, 453. |
0.02 | 52 | PEPTIDE | 274, 277, 279, 280, 296, 309, 317, 318, 319, 320, 325, 327, 340, 393, 435, 436, 438, 439, 440, 441, 460, 461, 462, 464, 472. |
0.01 | 27 | PEPTIDE | 274, 278, 279, 317, 318, 319. |
0.01 | 35 | CA | 314, 316, 317, 318, 319, 320, 323. |
0.01 | 21 | PEPTIDE | 265, 266, 267, 268, 269, 288, 355, 356, 357, 359, 361, 362, 363, 364, 365, 380, 453. |
0.00 | 4 | SO4 | 367, 368, 369, 454. |
(b)
TM-Site
Score | Cluster size | Name of ligand | Residue number |
0.50 | 113 | III, 0G6, 0GJ | 296, 342, 435, 436, 437, 438, 439, 441, 459, 460, 461, 462, 463, 464, 465, 472. |
0.24 | 23 | III, C3A, SO4 | 275, 280, 281, 296, 297, 300, 301, 308, 435, 436, 437, 438, 439, 441, 459, 460, 461, 462, 463, 464, 472. |
0.19 | 29 | III | 263, 264, 265, 266, 268, 269, 358, 359, 360, 362, 363, 364, 365, 376, 377, 378, 380, 401, 429, 447, 448, 450, 451, 452, 453. |
0.19 | 22 | III, GSH, BR | 265, 266, 268, 269, 357, 359, 362, 363, 364, 365, 380, 399, 452, 453. |
0.16 | 7 | III, ZN, IOD | 274, 277, 278, 279, 280, 296, 301, 309, 311, 317, 318, 319, 320, 325, 340, 341, 435, 436, 438, 439, 441, 460, 461, 462, 464, 472. |
(c)
S-Site
Score | Cluster size | Name of ligand | Residue number |
0.38 | 752 | III, BEN, UUU | 280, 281, 296, 297, 300, 341, 342, 418, 435, 436, 437, 438, 439, 440, 441, 459, 460, 461, 462, 463, 464, 472, 473, 474. |
0.14 | 80 | III, UUU, GSH | 260, 263, 264, 265, 266, 267, 268, 269, 288, 355, 356, 357, 358, 359, 360, 361, 362, 363, 364, 365, 372, 376, 377, 378, 379, 380, 401, 429, 447, 448, 451, 452, 453. |
0.13 | 98 | III, CA, EDO | 274, 276, 277, 278, 279, 309, 311, 314, 316, 317, 318, 319, 320, 323, 324, 325, 327. |
0.11 | 27 | NA, CA, ZN | 413, 416, 429, 430, 431, 433, 463, 466, 467, 468, 469, 470, 471, 473. |
0.10 | 13 | BGC, SO4, CA | 372, 373, 375, 404, 405, 406, 407, 408, 409, 410, 421, 422, 423, 424, 425, 426, 456, 476. |
(d)
COFACTOR
Score | Name of ligand | Residue number |
0.51 | PEPTIDE | 296, 337, 340, 342, 389, 419, 435, 436, 437, 438, 439, 441, 460, 461, 462, 463, 464, 465, 472. |
0.45 | T76 | 296, 436, 441, 459, 460, 461, 462, 464, 465, 472, 473. |
0.42 | BM2 | 296, 341, 342, 435, 438, 441, 460, 461, 462, 463, 464. |
0.26 | PEPTIDE | 296, 435, 436, 437, 461, 462, 464, 472. |
0.25 | PEPTIDE | 296, 389, 390, 441, 460, 462, 464. |
(e)
FINDSITE
Score | Cluster size | Name of ligand | Residue number |
0.70 | 320 | Site 1 | 296, 342, 418, 435, 436, 437, 438, 441, 459, 461, 462, 463, 464, 465, 472, 474. |
0.10 | 46 | Site 2 | 272, 274, 276, 277, 279, 309, 311, 317, 318, 319, 320, 324, 325, 327, 393. |
0.04 | 16 | Site 3 | 265, 266, 267, 268, 269, 285, 288, 355, 375, 359, 362, 363, 365, 380, 452, 453. |
0.03 | 14 | Site 4 | 299, 302 |
0.01 | 4 | Site 5 | 338, 339 |
(f)
ConCavity
Score | Name of ligand | Residue number |
0.45 | Cavity 1 | 280, 296, 297, 341, 342, 345, 381, 402, 416, 419, 420, 427, 428, 429, 434, 435, 436, 437, 438, 439, 440, 441, 445, 458, 459, 460, 461, 462, 464, 465, 467, 470, 471, 472, 473, 474. |
0.30 | Cavity 2 | 267, 270, 271, 272, 279, 282, 317, 383, 384, 397, 439, 440. |
0.21 | Cavity 3 | 268, 269, 270, 271, 285, 288, 289, 291, 310, 312, 313, 327, 328, 349, 351, 355, 360, 361, 362. |
The bold values show the residues included in the current study.
3.2.8. Protein Display
The topology prediction was displayed using the Protter server; the figure illustrates a long cytoplasmic N-terminus and suggests that the TMPRSS2 protein is located mostly on the extracellular side of the cell membrane. The five amino acids shown in orange represent the predicted variants, such as V160, S254, E331, K451, and D491 (Figure 5).
[figure omitted; refer to PDF]
3.2.9. Dynamic Cross-Correlation Matrix Analysis Using Bio3d Package by RStudio Software and DynOmics Server
DCCM analysis was performed to understand the correlated motions between residues. Compared with the wild type, the V160M, G181R, R240C, P335L, G432A, and D435Y variants decreased the degree of positive (red) and negative (blue) correlations observed in native TMPRSS2, although no significant correlation in the movement of residues was observed in the Dynamic Cross-Correlation Matrix analysis (Figure 6, Table 11).
[figures omitted; refer to PDF]
Table 11
Correlation between observed and predicted fluctuations of TMPRSS2 native and mutants.
TMPRSS2 structures | Native | V160M | G181R | R240C | P335L | G432A | D435Y |
Correlation between observed and predicted fluctuations | 0.8 | 0.69 | 0.70 | 0.69 | 0.72 | 0.69 | 0.69 |
4. Discussion
The transmembrane serine protease 2 (TMPRSS2) plays a crucial role in human cell entry of a diverse range of viruses including SARS-CoV-2 [2]. Strikingly, a recent investigation by Hou et al. found six deleterious variants such as p. Val 160Met, p. Gly181Arg, p. Arg240Cys, p. Pro335Leu, p. Gly432Ala, and p. Arg435Tyr in the TMPRSS2 gene, which are demonstrated as somatic mutations in different cancer databases and also suggest explanations for genetic susceptibility to COVID-19 [11]. This analysis reported that TMPRSS2 variants were probably associated with susceptibility to SARS-CoV-2 [11]. So, in this report, we look for these six missense variants (V160M, G181R, R240C, P335L, G432A, and D435Y) which previously might be important risk factors associated with COVID-19 susceptibility. The current study might also be helpful to understand the effect of those variants on TMPRSS2 structure, function, and stability. A series of in silico prediction analyses were used for the functional and structural annotations of human TMPRSS2 missense variants like SIFT, PolyPhen2.0, PROVEAN, SNAP2, PMut, MutPred2, I-Mutant Suite, MUpro, iStable, CUPSAT, and STRUM, respectively, which were utilized to find out the most deleterious variants of TMPRSS2 and to evaluate their effects on TMPRSS2 function, structure, and stability.
From our functional analysis of human TMPRSS2 missense variants, SIFT predicted five of total variants are deleterious; these five variations were predicted deleterious by PROVEAN (except for V160M), SNAP2, and PMut (except for R240C). Protein stability is essential for understanding the relationship between protein structure and function [46]. A total of six variants tested were identified decreasing the stability of TMPRSS2 by all algorithms for V160M and G181R and by at least three tools for the rest (R240C, P335L, G432A, and D435Y), by analyzing all missense variants through different servers. The six nsSNPs are potentially damaging. ConSurf analysis results showed that variants at positions G181R, G432A, and D435Y were in the highly conserved region and confirmed by MutPred2 to have crucial alterations on the TMPRSS2 protein. The prediction of posttranslational modification sites (PTMs) is one of the important characteristics for understanding different biological processes such as the cell signalling state, localization, and interactions. It can also be essential for the study of diseases or for development of drugs [47]. Therefore, the R240, P335, G432, and D435 residues identified PTMs for proteolytic cleavage and ADP-ribosylation. Flexibility is one of the most essential criteria related to protein functions. Herein, we used FlexPred and PredyFlexy to determine conformational changes and to comprehend dynamic system of TMPRSS2. Variants R240C and D435Y were predicted to be in a relatively rigid region, while G432A was defined as a flexible area. We have also investigated the secondary structure of native and mutants by identifying disordered regions in TMPRSS2 using PredictProtein and RaptorX property. Compared to the native structure of TMPRSS2 protein, 5 disordered regions were formed due to V160M and P335L variants, since this can change the function of TMPRSS2 because disordered regions are dynamically flexible. 
Prediction of the three-dimensional structures of TMPRSS2 models is necessary for the validation of structural changes. Therefore, the three-dimensional structures of native and mutant TMPRSS2 were generated with SWISS-MODEL using 7MEQ as a template and refined using ModRefiner. The quality of the SWISS-MODEL models was checked using PROCHECK and Verify3D. Ramachandran plot analysis showed that all models of TMPRSS2 (wild-type and mutants) were of good quality and could be used for further study; a quantitative assessment was then performed with the TM-align tool, comparing native and mutant proteins by calculating RMSD values and TM-scores. All RMSD values were significant (
To date, various in silico analyses have used different bioinformatics tools to identify and predict host TMPRSS2 gene polymorphisms relevant to SARS-CoV-2. Consistent with our results, Wulandari et al. showed that the TMPRSS2 p.Val160Met polymorphism was associated with SARS-CoV-2 infectivity [48]. A recent investigation by Asselta et al. reported the existence of several TMPRSS2 polymorphisms, namely, rs2070788, rs9974589, and rs7364083, and found a significant association between these SNPs and SARS-CoV-2 infectivity [49]. Another study by Irham et al. (2020) demonstrated that some variants of TMPRSS2, namely, rs2070788, rs383510, rs464397, and rs469390, might affect the expression of TMPRSS2 in many tissues and consequently were probably associated with SARS-CoV-2 infectivity [51].
Overall, this in silico analysis gives an interesting insight into the role of TMPRSS2 variants in susceptibility to SARS-CoV-2 infection. Future work by the wider research community will be needed to confirm the selected mutations (V160M, G181R, R240C, P335L, G432A, and D435Y) as candidate variants. Further in silico analysis should be combined with laboratory experiments to substantiate these results.
5. Conclusion
Overall, we conclude that rs12329760 (V160M), rs781089181 (G181R), rs762108701 (R240C), rs1185182900 (P335L), rs570454392 (G432A), and rs867186402 (D435Y) are the most significant variants. All six nsSNPs were predicted to alter protein function and stability. Most of them are located at highly conserved positions (V160M, G181R, G432A, and D435Y) and at posttranslational modification (PTM) sites (R240C, P335L, G432A, and D435Y). D435 was identified as a ligand-binding site, so its substitution may interfere with the binding interactions of the TMPRSS2 protein. In this in silico analysis, we tested, for the first time, the effect of these missense variants on TMPRSS2 structure, stability, and function using various bioinformatics algorithms; these variants may play an important role in SARS-CoV-2 infection.
Authors’ Contributions
Lahcen Wakrim and Anass Kettani contributed equally to this work.
Acknowledgments
The authors are thankful to the Pasteur Institute of Morocco for providing encouragement and facilities.
[1] T. M. Antalis, T. H. Bugge, Q. Wu, "Membrane-anchored serine proteases in health and disease," Progress in Molecular Biology and Translational Science, vol. 99, DOI: 10.1016/B978-0-12-385504-6.00001-4, 2011.
[2] M. Thunders, B. Delahunt, "Gene of the month: TMPRSS2 (transmembrane serine protease 2)," Journal of Clinical Pathology, vol. 73 no. 12, pp. 773-776, DOI: 10.1136/jclinpath-2020-206987, 2020.
[3] G. Ploussard, G. Plennevaux, Y. Allory, L. Salomon, S. Azoulay, D. Vordos, A. Hoznek, C. C. Abbou, A. de la Taille, "High-grade prostatic intraepithelial neoplasia and atypical small acinar proliferation on initial 21-core extended biopsy scheme: incidence and implications for patient care and surveillance," World Journal of Urology, vol. 27 no. 5, pp. 587-592, DOI: 10.1007/s00345-009-0413-1, 2009.
[4] J. A. Squire, P. C. Park, M. Yoshimoto, J. Alami, J. L. Williams, A. Evans, A. M. Joshua, "Prostate cancer as a model system for genetic diversity in tumors," Advances in Cancer Research, vol. 112, pp. 183-216, DOI: 10.1016/B978-0-12-387688-1.00007-7, 2011.
[5] C. Magi-Galluzzi, T. Tsusuki, P. Elson, K. Simmerman, C. LaFargue, R. Esgueva, E. Klein, M. A. Rubin, M. Zhou, "TMPRSS2-ERG gene fusion prevalence and class are significantly different in prostate cancer of Caucasian, African-American and Japanese patients," The Prostate, vol. 71 no. 5, pp. 489-497, DOI: 10.1002/pros.21265, 2011.
[6] D. P. Kong, R. Chen, C. L. Zhang, W. Zhang, G. A. Xiao, F. B. Wang, N. Ta, X. Gao, Y. H. Sun, "Prevalence and clinical application of TMPRSS2-ERG fusion in Asian prostate cancer patients: a large-sample study in Chinese people and a systematic review," Asian Journal of Andrology, vol. 22 no. 2, pp. 200-207, DOI: 10.4103/aja.aja_45_19, 2020.
[7] M. Hoffmann, H. Kleine-Weber, S. Schroeder, N. Krüger, T. Herrler, S. Erichsen, T. S. Schiergens, G. Herrler, N. H. Wu, A. Nitsche, M. A. Müller, C. Drosten, S. Pöhlmann, "SARS-CoV-2 cell entry depends on ACE2 and TMPRSS2 and is blocked by a clinically proven protease inhibitor," Cell, vol. 181 no. 2, pp. 271-280.e8, DOI: 10.1016/j.cell.2020.02.052, 2020.
[8] U. Kumar, N. M. Priya, S. R. Nithya, P. Kannan, N. Jain, D. T. Kumar, R. Magesh, S. Younes, H. Zayed, C. G. P. Doss, "A review of novel coronavirus disease (COVID-19): based on genomic structure, phylogeny, current shreds of evidence, candidate vaccines, and drug repurposing," 3 Biotech, vol. 11 no. 4, DOI: 10.1007/s13205-021-02749-0, 2021.
[9] T. Kumar D, N. Shaikh, U. Kumar S, G. P. Doss C, H. Zayed, "Structure-based virtual screening to identify novel potential compound as an alternative to remdesivir to overcome the RdRp protein mutations in SARS-CoV-2," Frontiers in Molecular Biosciences, vol. 8, article 645216, DOI: 10.3389/fmolb.2021.645216, 2021.
[10] M. Hussain, N. Jabeen, A. Amanullah, A. Ashraf Baig, B. Aziz, S. Shabbir, F. Raza, N. Uddin, "Molecular docking between human TMPRSS2 and SARS-CoV-2 spike protein: conformation and intermolecular interactions," AIMS Microbiology, vol. 6 no. 3, pp. 350-360, DOI: 10.3934/microbiol.2020021, 2020.
[11] Y. Hou, J. Zhao, W. Martin, A. Kallianpur, M. K. Chung, L. Jehi, N. Sharifi, S. Erzurum, C. Eng, F. Cheng, "New insights into genetic susceptibility of COVID-19: an ACE2 and TMPRSS2 polymorphism analysis," BMC Medicine, vol. 18 no. 1, DOI: 10.1186/s12916-020-01673-z, 2020.
[12] U. Kumar, S. Sankar, D. T. Kumar, S. Younes, N. Younes, R. Siva, C. G. P. Doss, H. Zayed, "Molecular dynamics, residue network analysis, and cross-correlation matrix to characterize the deleterious missense mutations in GALE causing galactosemia III," Cell Biochemistry and Biophysics, vol. 79 no. 2, pp. 201-219, DOI: 10.1007/s12013-020-00960-z, 2021.
[13] S. Sankar, S. Younes, M. N. Ahmad, S. S. Okashah, B. Kamaraj, A. M. Al-Subaie, H. Zayed, "Deciphering the role of filamin B calponin-homology domain in causing the Larsen syndrome, boomerang dysplasia, and atelosteogenesis type I spectrum disorders via a computational approach," Molecules, vol. 25 no. 23, DOI: 10.3390/molecules25235543, 2020.
[14] S. Udhaya Kumar, S. Sankar, S. Younes, D. Thirumal Kumar, M. N. Ahmad, S. S. Okashah, B. Kamaraj, A. M. Al-Subaie, C. George Priya Doss, H. Zayed, "Mutational landscape of K-Ras substitutions at 12th position-a systematic molecular dynamics approach," Journal of Biomolecular Structure & Dynamics, vol. 9, DOI: 10.1080/07391102.2020.1830177, 2020.
[15] S. Udhaya Kumar, D. Thirumal Kumar, P. D. Mandal, S. Sankar, R. Haldar, B. Kamaraj, C. E. J. Walter, R. Siva, C. George Priya Doss, H. Zayed, "Comprehensive in silico screening and molecular dynamics studies of missense mutations in Sjogren-Larsson syndrome associated with the ALDH3A2 gene," Advances in Protein Chemistry and Structural Biology, vol. 120, pp. 349-377, DOI: 10.1016/bs.apcsb.2019.11.004, 2020.
[16] P. C. Ng, S. Henikoff, "SIFT: predicting amino acid changes that affect protein function," Nucleic Acids Research, vol. 31 no. 13, pp. 3812-3814, DOI: 10.1093/nar/gkg509, 2003.
[17] N. L. Sim, P. Kumar, J. Hu, S. Henikoff, G. Schneider, P. C. Ng, "SIFT web server: predicting effects of amino acid substitutions on proteins," Nucleic Acids Research, vol. 40 no. W1, pp. W452-W457, DOI: 10.1093/nar/gks539, 2012.
[18] I. Adzhubei, D. M. Jordan, S. R. Sunyaev, "Predicting functional effect of human missense mutations using PolyPhen-2," Current Protocols in Human Genetics, vol. 76 no. 1, DOI: 10.1002/0471142905.hg0720s76, 2013.
[19] Y. Bromberg, G. Yachdav, B. Rost, "SNAP predicts effect of mutations on protein function," Bioinformatics, vol. 24 no. 20, pp. 2397-2398, DOI: 10.1093/bioinformatics/btn435, 2008.
[20] V. López-Ferrando, A. Gazzo, X. de la Cruz, M. Orozco, J. L. Gelpí, "PMut: a web-based tool for the annotation of pathological variants on proteins, 2017 update," Nucleic Acids Research, vol. 45 no. W1, pp. W222-W228, DOI: 10.1093/nar/gkx313, 2017.
[21] V. Pejaver, J. Urresti, J. Lugo-Martinez, K. A. Pagel, G. N. Lin, H.-J. Nam, M. Mort, D. N. Cooper, J. Sebat, L. M. Iakoucheva, S. D. Mooney, P. Radivojac, "MutPred2: inferring the molecular and phenotypic impact of amino acid variants," bioRxiv, DOI: 10.1101/134981, 2020.
[22] E. Capriotti, P. Fariselli, R. Casadio, "I-Mutant2.0: predicting stability changes upon mutation from the protein sequence or structure," Nucleic Acids Research, vol. 33 no. Web Server, pp. W306-W310, DOI: 10.1093/nar/gki375, 2005.
[23] J. Cheng, A. Randall, P. Baldi, "Prediction of protein stability changes for single-site mutations using support vector machines," Proteins, vol. 62 no. 4, pp. 1125-1132, DOI: 10.1002/prot.20810, 2006.
[24] M. A. Beg, S. Shivangi, C. Thakur, L. S. Meena, "Structural Prediction and Mutational Analysis of Rv3906c Gene of Mycobacterium tuberculosis H37Rv to Determine Its Essentiality in Survival," Advances in Bioinformatics, vol. 2018, DOI: 10.1155/2018/6152014, 2018.
[25] C. W. Chen, M. H. Lin, C. C. Liao, H. P. Chang, Y. W. Chu, "IStable 2.0: predicting protein thermal stability changes by integrating various characteristic modules," Computational and Structural Biotechnology Journal, vol. 18, pp. 622-630, DOI: 10.1016/j.csbj.2020.02.021, 2020.
[26] L. Quan, Q. Lv, Y. Zhang, "STRUM: structure-based prediction of protein stability changes upon single-point mutation," Bioinformatics, vol. 32 no. 19, pp. 2936-2946, DOI: 10.1093/bioinformatics/btw361, 2016.
[27] V. Parthiban, M. M. Gromiha, D. Schomburg, "CUPSAT: prediction of protein stability upon point mutations," Nucleic Acids Research, vol. 34 no. Web Server, pp. W239-W242, DOI: 10.1093/nar/gkl190, 2006.
[28] F. Sievers, D. G. Higgins, "Clustal Omega, accurate alignment of very large numbers of sequences," Methods in Molecular Biology, vol. 1079, pp. 105-116, DOI: 10.1007/978-1-62703-646-7_6, 2014.
[29] A. M. Waterhouse, J. B. Procter, D. M. A. Martin, M. Clamp, G. J. Barton, "Jalview Version 2—a multiple sequence alignment editor and analysis workbench," Bioinformatics, vol. 25 no. 9, pp. 1189-1191, DOI: 10.1093/bioinformatics/btp033, 2009.
[30] H. Ashkenazy, S. Abadi, E. Martz, O. Chay, I. Mayrose, T. Pupko, N. Ben-Tal, "ConSurf 2016: an improved methodology to estimate and visualize evolutionary conservation in macromolecules," Nucleic Acids Research, vol. 44 no. W1, pp. W344-W350, DOI: 10.1093/nar/gkw408, 2016.
[31] V. Pejaver, W. L. Hsu, F. Xin, A. K. Dunker, V. N. Uversky, P. Radivojac, "The structural and functional signatures of proteins that undergo multiple events of post-translational modification," Protein Science, vol. 23 no. 8, pp. 1077-1093, DOI: 10.1002/pro.2494, 2014.
[32] I. B. Kuznetsov, M. McDuffie, "FlexPred: a web-server for predicting residue positions involved in conformational switches in proteins," Bioinformation, vol. 3 no. 3, pp. 134-136, DOI: 10.6026/97320630003134, 2008.
[33] A. G. de Brevern, A. Bornot, P. Craveur, C. Etchebest, J. C. Gelly, "PredyFlexy: flexibility and local structure prediction from sequence," Nucleic Acids Research, vol. 40 no. W1, pp. W317-W322, DOI: 10.1093/nar/gks482, 2012.
[34] S. Wang, W. Li, S. Liu, J. Xu, "RaptorX-property: a web server for protein structure property prediction," Nucleic Acids Research, vol. 44 no. W1, pp. W430-W435, 2016.
[35] G. Yachdav, E. Kloppmann, L. Kajan, M. Hecht, T. Goldberg, T. Hamp, P. Hönigschmid, A. Schafferhans, M. Roos, M. Bernhofer, L. Richter, H. Ashkenazy, M. Punta, A. Schlessinger, Y. Bromberg, R. Schneider, G. Vriend, C. Sander, N. Ben-Tal, B. Rost, "PredictProtein—an open resource for online prediction of protein structural and functional features," Nucleic Acids Research, vol. 42 no. W1, pp. W337-W343, DOI: 10.1093/nar/gku366, 2014.
[36] M. Biasini, S. Bienert, A. Waterhouse, K. Arnold, G. Studer, T. Schmidt, F. Kiefer, T. G. Cassarino, M. Bertoni, L. Bordoli, T. Schwede, "SWISS-MODEL: modelling protein tertiary and quaternary structure using evolutionary information," Nucleic Acids Research, vol. 42 no. W1, pp. W252-W258, DOI: 10.1093/nar/gku340, 2014.
[37] D. Xu, Y. Zhang, "Improving the physical realism and structural accuracy of protein models by a two-step atomic-level energy minimization," Biophysical Journal, vol. 101 no. 10, pp. 2525-2534, DOI: 10.1016/j.bpj.2011.10.024, 2011.
[38] R. A. Laskowski, M. W. MacArthur, D. S. Moss, J. M. Thornton, "PROCHECK: a program to check the stereochemical quality of protein structures," Journal of Applied Crystallography, vol. 26 no. 2, pp. 283-291, DOI: 10.1107/S0021889892009944, 1993.
[39] D. Eisenberg, R. Lüthy, J. U. Bowie, "[20] VERIFY3D: Assessment of protein models with three-dimensional profiles," Methods in Enzymology, vol. 277, pp. 396-404, DOI: 10.1016/S0076-6879(97)77022-8, 1997.
[40] Y. Zhang, J. Skolnick, "TM-align: a protein structure alignment algorithm based on the TM-score," Nucleic Acids Research, vol. 33 no. 7, pp. 2302-2309, DOI: 10.1093/nar/gki524, 2005.
[41] J. Yang, A. Roy, Y. Zhang, "Protein-ligand binding site recognition using complementary binding-specific substructure comparison and sequence profile alignment," Bioinformatics, vol. 29 no. 20, pp. 2588-2595, DOI: 10.1093/bioinformatics/btt447, 2013.
[42] M. Källberg, H. Wang, S. Wang, J. Peng, Z. Wang, H. Lu, J. Xu, "Template-based protein structure modeling using the RaptorX web server," Nature Protocols, vol. 7 no. 8, pp. 1511-1522, DOI: 10.1038/nprot.2012.085, 2012.
[43] U. Omasits, C. H. Ahrens, S. Müller, B. Wollscheid, "Protter: interactive protein feature visualization and integration with experimental proteomic data," Bioinformatics, vol. 30 no. 6, pp. 884-886, DOI: 10.1093/bioinformatics/btt607, 2014.
[44] H. Li, Y. Y. Chang, J. Y. Lee, I. Bahar, L. W. Yang, "DynOmics: dynamics of structural proteome and beyond," Nucleic Acids Research, vol. 45 no. W1, pp. W374-W380, DOI: 10.1093/nar/gkx385, 2017.
[45] B. J. Grant, A. P. C. Rodrigues, K. M. ElSawy, J. A. McCammon, L. S. D. Caves, "Bio3d: an R package for the comparative analysis of protein structures," Bioinformatics, vol. 22 no. 21, pp. 2695-2696, DOI: 10.1093/bioinformatics/btl461, 2006.
[46] L. Montanucci, E. Capriotti, Y. Frank, N. Ben-Tal, P. Fariselli, "DDGun: an untrained method for the prediction of protein stability changes upon single and multiple point variations," BMC Bioinformatics, vol. 20 no. S14, DOI: 10.1186/s12859-019-2923-1, 2019.
[47] M. M. Hasan, M. S. Khatun, "Opinion Prediction of protein post-translational modification sites: an overview," Annals of Proteomics and Bioinformatics, vol. 2 no. 1, pp. 049-057, DOI: 10.29328/journal.apb.1001005, 2017.
[48] L. Wulandari, B. Hamidah, C. Pakpahan, N. S. Damayanti, N. D. Kurniati, C. O. Adiatmaja, M. R. Wigianita, Soedarsono, D. Husada, D. Tinduh, C. R. S. Prakoeswa, A. Endaryanto, N. N. T. Puspaningsih, Y. Mori, M. I. Lusida, K. Shimizu, D. Oceandy, "Initial study on TMPRSS2 p.Val160Met genetic variant in COVID-19 patients," Human Genomics, vol. 15 no. 1, DOI: 10.1186/s40246-021-00330-7, 2021.
[49] R. Asselta, E. M. Paraboschi, A. Mantovani, S. Duga, "ACE2 and TMPRSS2 variants and expression as candidates to sex and country differences in COVID-19 severity in Italy," Aging, vol. 12 no. 11, pp. 10087-10098, DOI: 10.18632/aging.103415, 2020.
[50] Y. Choi, A. P. Chan, "PROVEAN web server: a tool to predict the functional effect of amino acid substitutions and indels," Bioinformatics, vol. 31 no. 16, pp. 2745-2747, DOI: 10.1093/bioinformatics/btv195, 2015.
[51] L. M. Irham, W.-H. Chou, M. J. Calkins, W. Adikusuma, S. L. Hsieh, W. C. Chang, "Genetic variants that influence SARS-CoV-2 receptor TMPRSS2 expression among population cohorts from multiple continents," Biochemical and Biophysical Research Communications, vol. 529 no. 2, pp. 263-269, DOI: 10.1016/j.bbrc.2020.05.179, 2020.
Copyright © 2021 Asmae Saih et al. This is an open access article distributed under the Creative Commons Attribution License (the “License”), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License. https://creativecommons.org/licenses/by/4.0/
Abstract
The human transmembrane protease serine 2 (TMPRSS2) protein plays an important role in prostate cancer progression. It also facilitates viral entry into target cells by proteolytically cleaving and activating the S protein of the severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2). In the current study, we used available tools, namely SIFT, PolyPhen2.0, PROVEAN, SNAP2, PMut, MutPred2, I-Mutant Suite, MUpro, iStable, ConSurf, ModPred, SwissModel, PROCHECK, Verify3D, and TM-align, to identify the most deleterious variants and to explore their possible effects on TMPRSS2 stability, structure, and function. The six missense variants tested were predicted to have deleterious effects on the protein by SIFT, PolyPhen2.0, PROVEAN, SNAP2, and PMut. Additionally, the V160M, G181R, R240C, P335L, G432A, and D435Y variants showed a decrease in stability according to at least 2 servers; G181R, G432A, and D435Y are highly conserved and were identified as posttranslational modification (PTM) sites for proteolytic cleavage and ADP-ribosylation using the ConSurf and ModPred servers. The 3D structures of native and mutant TMPRSS2 were generated with SWISS-MODEL using 7MEQ as a template, refined by ModRefiner, and validated using the Ramachandran plot. Hence, this paper may help to understand the association between the missense variants rs12329760, rs781089181, rs762108701, rs1185182900, rs570454392, and rs867186402 and susceptibility to SARS-CoV-2.
1 Virology Unit, Immunovirology Laboratory, Institut Pasteur du Maroc, 20360 Casablanca, Morocco; Laboratory of Biology and Health, URAC 34, Faculty of Sciences Ben M’Sik Hassan II University of Casablanca, Morocco
2 Environmental Health Laboratory, Institut Pasteur du Maroc, 20360 Casablanca, Morocco
3 Immunology and Biodiversity Laboratory, Faculty of Sciences Ain Chock, Hassan II University of Casablanca, Morocco
4 Laboratory of Biology and Health, URAC 34, Faculty of Sciences Ben M’Sik Hassan II University of Casablanca, Morocco
5 Virology Unit, Immunovirology Laboratory, Institut Pasteur du Maroc, 20360 Casablanca, Morocco