COVID-19 Docking Server: An interactive server for docking small molecules, peptides and antibodies against potential targets of COVID-19
Ren Kong1, ?, Guangbo Yang1, Rui Xue1, Ming Liu2, Feng Wang3, Jianping Hu4, Xiaoqiang Guo4, Shan Chang1,*
https://arxiv.xilesou.top/abs/2003.00163
1 Institute of Bioinformatics and Medical Engineering, School of Electrical and Information Engineering, Jiangsu University of Technology, Changzhou 213001, China
2 Beijing New BioConcepts Biotech Co., Ltd., Beijing 101111, China
3 School of Information Science & Engineering, Changzhou University, Changzhou 213164, China
4 Key Laboratory of Medicinal and Edible Plants Resources Development of Sichuan Education Department, College of Pharmacy and Biological Engineering, Sichuan Industrial Institute of Antibiotics, Chengdu University, Chengdu 610106, China
ABSTRACT
Summary: The coronavirus disease 2019 (COVID-19), caused by a new type of coronavirus, emerged in China and has led to thousands of deaths globally since December 2019. Although many groups have engaged in studying the newly emerged virus and searching for treatments for COVID-19, understanding the COVID-19 target-ligand interactions remains a key challenge. Herein, we introduce COVID-19 Docking Server, a web server that predicts the binding modes between COVID-19 targets and ligands, including small molecules, peptides and antibodies. Structures of proteins involved in the virus life cycle were collected or constructed based on coronavirus homologs and prepared for docking. The platform provides a free and interactive tool for predicting COVID-19 target-ligand interactions and for subsequent drug discovery for COVID-19.
Availability and implementation: The COVID-19 Docking Server and tutorials are freely available at http://ncov.schanglab.org.cn
1. INTRODUCTION
According to the situation report from the World Health Organization (WHO), more than 75,000 cases of coronavirus disease 2019 (COVID-19) had been reported across 25 countries as of February 21, 2020. After the first COVID-19 case was reported in Wuhan city, Hubei province of China, in December 2019, concentrated outbreaks of pneumonia occurred in China, especially in Wuhan. The pathogen causing the disease was soon identified as a novel coronavirus, which belongs to the genus Betacoronavirus and is closely related to severe acute respiratory syndrome coronavirus (SARS-CoV), with 89.1% nucleotide similarity in the viral genome1, 2. Later, it was named severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) by the International Committee on the Taxonomy of Viruses. Tremendous efforts have been made to study the newly emerged virus and find potent drugs for clinical use. Wang et al. from the Wuhan Institute of Virology screened some of the FDA-approved anti-viral or anti-infection drugs and found that remdesivir and chloroquine could effectively inhibit the virus in a cell-based assay, with EC50 values of 0.77 and 1.13 μM, respectively3. Several clinical trials are ongoing for the treatment of COVID-19. However, no drug or vaccine has yet been approved.
In a very short time, the structures of functional proteins essential for SARS-CoV-2 were solved by groups in China, including the main protease and the spike protein bound to angiotensin-converting enzyme 2 (ACE2). Although the structures of other proteins remain unknown, the high amino acid identity between SARS-CoV-2 and SARS-CoV (77.2%) makes it possible to build homology models of SARS-CoV-2 proteins from the protein structures of SARS-CoV with high confidence. Thus, we built and collected the protein structures of potential SARS-CoV-2 targets according to those of SARS-CoV. A web server, COVID-19 Docking Server, was constructed to help users evaluate the binding modes and binding affinities between these targets and small molecules, peptides and antibodies. By launching the server, we would like to provide a free and easy-to-use tool for people who are interested in drug discovery against COVID-19.
2. METHODS AND MATERIALS
2.1 Protocol of COVID-19 Docking Server
The flow chart of the COVID-19 Docking Server is shown in Figure 1. For small molecule docking, AutoDock Vina4 is used as the docking engine. Two computational types are provided: the default type, “Docking” (D), docks a single small molecule to the target, and the other option, “Screening” (S), docks a number of small molecules to the target. The protein structures of the COVID-19 targets were extracted and prepared with MGLTools 1.5.6. The user only needs to upload the small molecules in smi, mol2 or sdf format. Open Babel is used for format conversion and 3D coordinate generation for the uploaded files5. Then ligand_prepare.py from MGLTools 1.5.6 is used to convert the ligand files into pdbqt format with Gasteiger charges added. The box center for docking was defined according to the active sites or binding sites of the SARS-CoV homologs. All parameters are left at their defaults except the exhaustiveness value, which is set to 12 to achieve higher accuracy.
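The small molecule steps described above can be sketched as command lines for the Open Babel and AutoDock Vina executables. This is only an illustrative sketch, not the server's actual code: the file names, the box center and the box size are hypothetical placeholders, and the helper function names are our own. Only the exhaustiveness value of 12 is taken from the text; every other Vina option is left at its default.

```python
from typing import List, Sequence


def obabel_to_mol2_cmd(src: str, dst: str) -> List[str]:
    """Open Babel command that converts an uploaded file (format inferred
    from its extension) and generates 3D coordinates with --gen3d."""
    return ["obabel", src, "-O", dst, "--gen3d"]


def vina_cmd(receptor: str, ligand: str, out: str,
             center: Sequence[float], size: Sequence[float],
             exhaustiveness: int = 12) -> List[str]:
    """AutoDock Vina command for one ligand. The box center and size are
    inputs here; on the server they are predefined per target."""
    cmd = ["vina", "--receptor", receptor, "--ligand", ligand, "--out", out]
    for axis, value in zip("xyz", center):
        cmd += [f"--center_{axis}", str(value)]
    for axis, value in zip("xyz", size):
        cmd += [f"--size_{axis}", str(value)]
    # Exhaustiveness is the only non-default parameter named in the text.
    cmd += ["--exhaustiveness", str(exhaustiveness)]
    return cmd


if __name__ == "__main__":
    # Hypothetical file names and box values, for illustration only.
    print(" ".join(obabel_to_mol2_cmd("ligand.smi", "ligand.mol2")))
    print(" ".join(vina_cmd("target.pdbqt", "ligand.pdbqt", "out.pdbqt",
                            center=(10.0, 12.5, -3.0),
                            size=(20.0, 20.0, 20.0))))
```

The functions only build argument lists, so they can be inspected or passed to subprocess.run without a shell. The pdbqt conversion step with MGLTools is omitted because its command-line options are not given in the text.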
For peptide and antibody docking, CoDockPP6 is used as the docking engine. Two options are provided: the default (D) option is global docking, and the other is site-specific (S) docking with constraints defined by the user. The CoDockPP program takes full advantage of the sampling efficiency of the FFT-based method to perform the global search. An angle interval of 15° is used for rotational sampling, and a spacing of 1.2 Å is adopted for the fast Fourier transform (FFT) translational search. CoDockPP uses a knowledge-based scoring function to evaluate the candidate poses after the FFT search. Finally, the top binding modes are clustered with a ligand root mean square deviation (L_RMSD) cutoff of 3.0 Å in global docking and 2.0 Å in site-specific docking.
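The final clustering step can be illustrated with a simple greedy scheme. The text does not specify which clustering algorithm CoDockPP uses, so the sketch below is an assumption: poses are taken in score order, and a pose is kept as a new cluster representative only if its L_RMSD to every representative kept so far exceeds the cutoff. The function name and the precomputed pairwise L_RMSD matrix are hypothetical inputs.

```python
from typing import List, Sequence


def cluster_ranked_poses(rmsd: Sequence[Sequence[float]],
                         cutoff: float) -> List[int]:
    """Greedy clustering of poses that are already sorted best-first.

    rmsd[i][j] is the L_RMSD (in Angstrom) between pose i and pose j.
    Returns the indices of the poses kept as cluster representatives.
    """
    representatives: List[int] = []
    for i in range(len(rmsd)):
        # Keep pose i only if it is far from every representative so far.
        if all(rmsd[i][j] > cutoff for j in representatives):
            representatives.append(i)
    return representatives


if __name__ == "__main__":
    # Three hypothetical poses: 0 and 1 are close, 2 is far from both.
    matrix = [[0.0, 1.0, 10.0],
              [1.0, 0.0, 9.5],
              [10.0, 9.5, 0.0]]
    print(cluster_ranked_poses(matrix, cutoff=3.0))  # global docking cutoff
    print(cluster_ranked_poses(matrix, cutoff=2.0))  # site-specific cutoff
```

With this scheme, a smaller cutoff (2.0 Å for site-specific docking) keeps more closely related poses as separate binding modes than the 3.0 Å cutoff used for global docking.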
The COVID-19 Docking Server web interface is written in PHP and HTML. JSmol (http://jmol.sourceforge.net/) is used for molecular visualization on the result pages.
Figure 1. The flow chart of the COVID-19 Docking Server. For small molecule docking, the default (D) computational type is “Docking”, which docks a single molecule with AutoDock Vina. The other type is “Screening” (S), which docks a number of molecules with AutoDock Vina. For peptide and antibody docking, the default (D) option is global docking with CoDockPP. The other type is site-specific (S) docking with CoDockPP using user-defined constraint information. Finally, the top 10 binding modes (docking) or molecules (screening) are viewed in JSmol.
2.2 COVID-19 Targets in the server
The structures of functional or structural proteins of SARS-CoV-2 were collected or built from their coronavirus homologs by using the homology modeling module of Maestro 10 (www.schrodinger.com). All the targets were prepared and are available for peptide or antibody docking on the website. For small molecule docking, the docking box was carefully defined for every specific target. Only the targets with an enzyme active site, substrate binding site or inhibitor binding site were prepared and set up for small molecule docking. Detailed information about the targets is given below.
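The text does not state how each docking box was computed numerically. As a rough illustration of one common approach, and not necessarily the one used on the server, the sketch below derives a box center and edge lengths from the coordinates of atoms that line a binding site (for example, a co-crystallized ligand in the homolog structure). The function name and the 5.0 Å padding are assumptions.

```python
from typing import Sequence, Tuple

Point = Tuple[float, float, float]


def box_from_site_atoms(coords: Sequence[Point],
                        padding: float = 5.0) -> Tuple[Point, Point]:
    """Return (center, size) of an axis-aligned box enclosing the atoms.

    center is the midpoint of the bounding box; size is its extent along
    x, y and z plus `padding` Angstrom on each side (assumed value).
    """
    if not coords:
        raise ValueError("at least one atom coordinate is required")
    xs, ys, zs = zip(*coords)
    center = tuple((min(v) + max(v)) / 2 for v in (xs, ys, zs))
    size = tuple((max(v) - min(v)) + 2 * padding for v in (xs, ys, zs))
    return center, size


if __name__ == "__main__":
    # Two hypothetical binding-site atoms.
    center, size = box_from_site_atoms([(0.0, 0.0, 0.0), (2.0, 4.0, 6.0)])
    print(center, size)
```

The resulting center and size correspond to the center_x/y/z and size_x/y/z values that a docking program such as AutoDock Vina expects.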
2.2.1 Nonstructural proteins
Main protease (Mpro): It is also known as chymotrypsin-like protease (3CLpro). Mpro cleaves most of the sites in the polyproteins, and the products are nonstructural proteins (nsps) which assemble into the replicase-transcriptase complex (RTC). The structure was downloaded from the Protein Data Bank (PDB) under the code 6LU7. The substrate binding site was defined as the docking box for small molecule docking.
Papain-like protease (PLpro): PLpro cleaves the nsp1/2, nsp2/3 and nsp3/4 boundaries. It works with Mpro to cleave the polyproteins into nsps. The structure of PLpro was built based on 4OW0, the PLpro structure of SARS-CoV 7. The substrate binding site was defined as the docking box for small molecule docking.
Nonstructural protein 13 (nsp13, helicase): The helicase catalyzes the unwinding of duplex oligonucleotides into single strands in an NTP-dependent manner. It is also an ideal target to develop anti-viral drugs due to its sequence conservation in all CoV species. The structure of helicase was built based on 6JYT, the helicase structure of SARS-CoV 8. Two sites were defined for small molecule docking: the ADP binding site (ADP site), and the nucleic acids binding site (NCB site).
Nonstructural protein 12 (nsp12, RNA-dependent RNA polymerase, RdRp): Nsp12 is the polymerase, which binds to its essential cofactors, nsp7 and nsp8. It is important in replication and transcription of the viral genome. The structure of RdRp was built based on 6NUR, the RdRp structure of SARS-CoV9. Two structures were prepared for small molecule docking: one is constructed with RNA taken from a homologous protein (3H5Y), while the other contains no RNA.
Nonstructural protein 14 (nsp14, N-terminal exoribonuclease and C-terminal guanine-N7 methyl transferase): Nsp14 of coronaviruses (CoV) is important for viral replication and transcription. The N-terminal exoribonuclease (ExoN) domain plays a proofreading role for prevention of lethal mutagenesis, and the C-terminal domain functions as a guanine-N7 methyl transferase (N7-MTase) for mRNA capping. The structure of nsp14 was built based on 5C8S, the nsp14 structure of SARS-CoV10. Two sites were prepared for small molecule docking. One is defined as the active site of the ExoN, and the other one is defined as the active site of N7-MTase.
Nonstructural protein 15 (nsp15, Uridylate-specific endoribonuclease): Nsp15 forms a hexameric endoribonuclease that preferentially cleaves 3' of uridines, and is therefore also named Uridylate-specific endoribonuclease. It is one of the RNA-processing enzymes encoded by the coronavirus. The structure of nsp15 was built based on 2RHB, the nsp15 structure of SARS-CoV11. The docking box was defined to include the enzyme active site of one of the chains from the hexamer.
Nonstructural protein 16 (nsp16, 2'-O-methyltransferase): Nsp16 is an S-adenosylmethionine (SAM) dependent nucleoside-2'-O-methyltransferase. It is only active when bound to nsp10. The structure of nsp16 was built based on 2XYR, the nsp16 structure of SARS-CoV12. The SAM binding site was defined as the docking box for small molecule docking.
Nonstructural protein 10 (nsp10): It is an essential cofactor and forms complexes with nsp14 and nsp16. The structure of nsp10 was built based on 2XYR, the nsp10 structure of SARS-CoV12. It was prepared only for protein docking on the web server.
Figure 2. The cartoon forms of nonstructural proteins. (A) Main protease. (B) Papain-like protease. (C) Helicase. (D) RNA-dependent RNA polymerase. (E) Nsp14. (F) Nsp15. (G) Nsp16. (H) Nsp10.
2.2.2 Structural proteins
Spike protein (S protein): The surface spike glycoprotein consists of three S1-S2 heterodimers. The receptor binding domain (RBD) is located on the head of S1 and binds the cellular receptor angiotensin-converting enzyme 2 (ACE2), initiating the membrane fusion of the virus and the host cell. Recently, the structure of the spike RBD of SARS-CoV-2 in complex with human ACE2 was released by Wang and Zhang’s group at Tsinghua University. The structure was downloaded and prepared for protein docking. We also built the monomer and trimer structures of the full-length spike protein based on 6ACG and prepared them for protein docking13.
S2 of S protein: This is the post-fusion state of the S2 segment of the spike protein, which acts as the viral fusion protein to mediate the membrane fusion of virus and cells. A typical HR1/HR2 6-helix complex is formed in the post-fusion state of SARS-CoV-2, similar to the fusion step of HIV-1. It is a potential target for entry inhibitor development. The 6-helix post-fusion conformation of S2 was built based on 1WYY14. A 5-helix structure was prepared by deleting one of the helices from the 6-helix structure and is used as the receptor for protein or peptide docking.
Envelope small membrane protein (E protein): It forms a pentamer and functions as an ion channel, also named the E channel. The structure of the E channel was built based on 5X2915. Both the monomer and the pentamer were prepared for protein docking. The E channel structure was also prepared for small molecule docking, with the center of the ion channel defined as the box center.
Membrane protein (M protein): The M protein is involved in most of the protein-protein interactions required for the assembly of coronaviruses, and it has also been identified as a protective antigen in humoral responses16. The structure of the M protein was built by using the I-TASSER server and prepared for protein docking17-19.
Nucleocapsid protein (N protein): The N protein plays multiple roles in the virus replication cycle and forms a ribonucleoprotein complex with the viral RNA through its N-terminal domain (N-NTD). It buds the viral genomes into the membrane of the endoplasmic reticulum-Golgi intermediate compartment (ERGIC), which contains the viral structural proteins, to finally form the mature virions16. The full-length structure of the N protein was built by using the I-TASSER server and prepared for protein docking17-19. The ribonucleotide-binding site (NCB site) of the N protein was built based on 4KYJ and also prepared for small molecule docking20.
Angiotensin-converting enzyme 2 (ACE2): The structure of human ACE2 was extracted from the complex structure of SARS-CoV-2 spike RBD and human ACE2 released by Wang and Zhang’s group in Tsinghua University.
Figure 3. The cartoon forms of structural proteins. (A) Trimer of S protein. (B) S2 of S protein. (C) E protein. (D) M protein. (E) N protein. (F) ACE2.
3. RESULTS AND DISCUSSION
3.1 Usage of small molecule docking
For each small molecule docking job, the user chooses one of the COVID-19 targets. The small molecules should be uploaded in strict smi, mol2 or sdf format. The user needs to choose the computational mode before uploading the small molecule files: if “Docking” is selected, only one small molecule should be uploaded, and the top 10 binding modes will be displayed on the result page. If “Screening” is selected, 10-20 molecules should be uploaded, and the top 10 molecules ranked by the scoring function will be displayed on the result page. If 1-10 molecules need to be evaluated, the user has to upload the molecules one by one and choose the “Docking” mode. After submission, an email containing a direct link to the docking results is sent to the user’s email address. After the job completes, the user can enter the job ID to access the result page. By default, the server displays only the top 10 models. The selected COVID-19 protein target is renamed Input_R.pdb. The top 10 models are named TopL1.pdb to TopL10.pdb, and the score file is named Score.dat. The models are visualized in 3D by JSmol, and the user can view and download the docking results from the result page.
Here, we take the Papain-like protease and its inhibitor as an example of small molecule docking7. The “Docking” mode was selected, and the inhibitor from PDB 4OW0 was uploaded as the ligand in mol2 format. For the top-ranked binding modes, the predicted binding energies of Top 1 and Top 2 were -11.5 kcal/mol and -10.7 kcal/mol, and the L_RMSDs were 0.23 Å and 3.60 Å, respectively. As shown in Figure 4A, the best binding mode of the inhibitor is very similar to the complex structure.
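For readers who wish to reproduce a comparison of this kind, the sketch below computes an L_RMSD between a predicted ligand pose and a reference ligand from PDB-format text. It is a minimal illustration under several assumptions that the text does not confirm: both records list the same atoms in the same order, both are already in the receptor's coordinate frame (so no superposition is performed), and all ATOM/HETATM records are used. The function names are our own.

```python
import math
from typing import List, Tuple

Point = Tuple[float, float, float]


def atom_coords(pdb_text: str) -> List[Point]:
    """Read x, y, z from ATOM/HETATM records (PDB columns 31-54)."""
    coords: List[Point] = []
    for line in pdb_text.splitlines():
        if line.startswith(("ATOM", "HETATM")):
            coords.append((float(line[30:38]),
                           float(line[38:46]),
                           float(line[46:54])))
    return coords


def l_rmsd(pose_pdb: str, reference_pdb: str) -> float:
    """Root mean square deviation over matched ligand atoms, in Angstrom."""
    pose, ref = atom_coords(pose_pdb), atom_coords(reference_pdb)
    if not pose or len(pose) != len(ref):
        raise ValueError("pose and reference must have the same atoms")
    squared = sum((a - b) ** 2
                  for p, r in zip(pose, ref)
                  for a, b in zip(p, r))
    return math.sqrt(squared / len(pose))
```

A typical call would pass the text of one downloaded model (for example TopL1.pdb, restricted to the ligand records) together with the ligand records of the reference complex.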
3.2 Usage of peptide and antibody docking
As in small molecule docking, the user needs to choose the COVID-19 protein target and upload the ligand protein (peptide or antibody) in strict pdb format. The protein docking server can perform global docking and site-specific docking to predict the binding mode between the COVID-19 protein target and the peptide (or antibody). The user can enter one residue in either the receptor box or the ligand box, or one residue in the receptor box and one in the ligand box simultaneously. If only one residue on the ligand or the receptor is defined, the conformations with that residue on the interface of the complex are retained. When the user defines one residue on the receptor and another on the ligand simultaneously, the user needs to choose the constraint type: ambiguous constraints or multiple constraints. When “ambiguous constraints” is selected, the conformations with at least one of the selected residues on the interface are retained. When “multiple constraints” is selected, only the conformations with both residues on the interface are retained.
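The retention rules above can be summarized as a small decision function. This is a sketch of the logic as described in the text, not CoDockPP's implementation: the function name is hypothetical, and whether a residue counts as being "on the interface" is taken as a precomputed input because the criterion is not defined here.

```python
from typing import Optional


def keep_conformation(receptor_on_interface: Optional[bool],
                      ligand_on_interface: Optional[bool],
                      constraint_type: str = "ambiguous") -> bool:
    """Decide whether a docked conformation is retained.

    Each flag is None when the user did not define a residue on that
    partner, otherwise True/False for "the residue is on the interface".
    """
    flags = [f for f in (receptor_on_interface, ligand_on_interface)
             if f is not None]
    if not flags:
        return True          # no constraint given: plain global docking
    if len(flags) == 1:
        return flags[0]      # single residue must be on the interface
    if constraint_type == "ambiguous":
        return any(flags)    # at least one selected residue on interface
    if constraint_type == "multiple":
        return all(flags)    # both selected residues on the interface
    raise ValueError("constraint_type must be 'ambiguous' or 'multiple'")


if __name__ == "__main__":
    print(keep_conformation(True, False, "ambiguous"))
    print(keep_conformation(True, False, "multiple"))
```

The two constraint types differ only when exactly one of the two selected residues lies on the interface: the ambiguous type keeps such a conformation, while the multiple type discards it.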
We use the post-fusion state of the S2 segment of the spike protein as an example of peptide docking. The receptor is the 5-helix structure extracted from the 6-helix post-fusion state of S2. One helix was extracted from the S2 structure and used as the ligand peptide. The pdb file of the ligand peptide was uploaded and the default global docking mode was chosen. In our example, the predicted binding energies of the Top 1 and Top 2 binding modes were -532.3 kcal/mol and -374.3 kcal/mol, and the L_RMSDs were 0.77 Å and 2.62 Å, respectively. As shown in Figure 4B, the best binding mode is very similar to the 6-helix structure of the spike protein.
Figure 4. COVID-19 Docking Server outputs. (A) An example of small molecule docking. The Papain-like protease docked with its inhibitor. The complex structures of Papain-like protease with its inhibitor are colored gray. The predicted structures of Top 1 and 2 are colored pink and orange. (B) An example of peptide docking. The post-fusion state of S2 segment of spike protein docked with a helical peptide. The 6-helices structure of S2 is colored gray. The predicted structures of Top 1 and 2 are colored pink and orange.
4. CONCLUSION
The structures of potential targets of COVID-19 were collected or built based on homologous coronavirus structures. An online interactive server, COVID-19 Docking Server, was constructed to predict the binding modes between the targets and small molecules, peptides or antibodies, using AutoDock Vina and CoDockPP as docking engines. The server provides a user-friendly interface and binding mode visualization for the results, which makes it a useful tool for COVID-19 drug discovery.
REFERENCES
1. Wu, F., Zhao, S., Yu, B., Chen, Y. M., Wang, W., Song, Z. G., Hu, Y., Tao, Z. W., Tian, J. H., Pei, Y. Y., Yuan, M. L., Zhang, Y. L., Dai, F. H., Liu, Y., Wang, Q. M., Zheng, J. J., Xu, L., Holmes, E. C., and Zhang, Y. Z. (2020) A new coronavirus associated with human respiratory disease in China, Nature.
2. Jiang, S., Xia, S., Ying, T., and Lu, L. (2020) A novel coronavirus (2019-nCoV) causing pneumonia-associated respiratory syndrome, Cellular & molecular immunology.
3. Wang, M., Cao, R., Zhang, L., Yang, X., Liu, J., Xu, M., Shi, Z., Hu, Z., Zhong, W., and Xiao, G. (2020) Remdesivir and chloroquine effectively inhibit the recently emerged novel coronavirus (2019-nCoV) in vitro, Cell research.
4. Trott, O., and Olson, A. J. (2010) AutoDock Vina: improving the speed and accuracy of docking with a new scoring function, efficient optimization, and multithreading, Journal of computational chemistry 31, 455-461.
5. O'Boyle, N. M., Banck, M., James, C. A., Morley, C., Vandermeersch, T., and Hutchison, G. R. (2011) Open Babel: An open chemical toolbox, Journal of cheminformatics 3, 33.
6. Kong, R., Wang, F., Zhang, J., Wang, F., and Chang, S. (2019) CoDockPP: A Multistage Approach for Global and Site-Specific Protein–Protein Docking, Journal of Chemical Information and Modeling 59, 3556-3564.
7. Baez-Santos, Y. M., Barraza, S. J., Wilson, M. W., Agius, M. P., Mielech, A. M., Davis, N. M., Baker, S. C., Larsen, S. D., and Mesecar, A. D. (2014) X-ray structural and biological evaluation of a series of potent and highly selective inhibitors of human coronavirus papain-like proteases, Journal of medicinal chemistry 57, 2393-2412.
8. Jia, Z., Yan, L., Ren, Z., Wu, L., Wang, J., Guo, J., Zheng, L., Ming, Z., Zhang, L., Lou, Z., and Rao, Z. (2019) Delicate structural coordination of the Severe Acute Respiratory Syndrome coronavirus Nsp13 upon ATP hydrolysis, Nucleic acids research 47, 6538-6550.
9. Kirchdoerfer, R. N., and Ward, A. B. (2019) Structure of the SARS-CoV nsp12 polymerase bound to nsp7 and nsp8 co-factors, Nature communications 10, 2342.
10. Ma, Y., Wu, L., Shaw, N., Gao, Y., Wang, J., Sun, Y., Lou, Z., Yan, L., Zhang, R., and Rao, Z. (2015) Structural basis and functional analysis of the SARS coronavirus nsp14-nsp10 complex, Proceedings of the National Academy of Sciences of the United States of America 112, 9436-9441.
11. Bhardwaj, K., Palaninathan, S., Alcantara, J. M., Yi, L. L., Guarino, L., Sacchettini, J. C., and Kao, C. C. (2008) Structural and functional analyses of the severe acute respiratory syndrome coronavirus endoribonuclease Nsp15, The Journal of biological chemistry 283, 3655-3664.
12. Decroly, E., Debarnot, C., Ferron, F., Bouvet, M., Coutard, B., Imbert, I., Gluais, L., Papageorgiou, N., Sharff, A., Bricogne, G., Ortiz-Lombardia, M., Lescar, J., and Canard, B. (2011) Crystal structure and functional analysis of the SARS-coronavirus RNA cap 2'-O-methyltransferase nsp10/nsp16 complex, PLoS pathogens 7, e1002059.
13. Song, W., Gui, M., Wang, X., and Xiang, Y. (2018) Cryo-EM structure of the SARS coronavirus spike glycoprotein in complex with its host cell receptor ACE2, PLoS pathogens 14, e1007236.
14. Duquerroy, S., Vigouroux, A., Rottier, P. J., Rey, F. A., and Bosch, B. J. (2005) Central ions and lateral asparagine/glutamine zippers stabilize the post-fusion hairpin conformation of the SARS coronavirus spike glycoprotein, Virology 335, 276-285.
15. Surya, W., Li, Y., and Torres, J. (2018) Structural model of the SARS coronavirus E channel in LMPG micelles, Biochimica et biophysica acta. Biomembranes 1860, 1309-1317.
16. Fehr, A. R., and Perlman, S. (2015) Coronaviruses: an overview of their replication and pathogenesis, Methods in molecular biology 1282, 1-23.
17. Roy, A., Kucukural, A., and Zhang, Y. (2010) I-TASSER: a unified platform for automated protein structure and function prediction, Nature protocols 5, 725-738.
18. Yang, J., Yan, R., Roy, A., Xu, D., Poisson, J., and Zhang, Y. (2015) The I-TASSER Suite: protein structure and function prediction, Nature methods 12, 7-8.
19. Yang, J., and Zhang, Y. (2015) I-TASSER server: new development for protein structure and function predictions, Nucleic acids research 43, W174-181.
20. Lin, S. Y., Liu, C. L., Chang, Y. M., Zhao, J., Perlman, S., and Hou, M. H. (2014) Structural basis for the identification of the N-terminal domain of coronavirus nucleocapsid protein as an antiviral target, Journal of medicinal chemistry 57, 2247-2257.