Frequently asked questions

 

What is WHISCY?

WHISCY is a program to predict protein-protein interfaces. WHISCY was primarily designed for protein-protein docking using HADDOCK. It can also be used to guide mutagenesis or to support experimental studies.

 

What do I need for a WHISCY prediction?

You need to supply the structure of your protein and a multiple sequence alignment of your protein sequence with homologous sequences.

 

How does WHISCY make predictions?

WHISCY is primarily based on conservation, but it also takes into account structural information. The alignment is used to calculate a prediction score for each surface residue of your protein. Then, the interface propensities of the amino acids are used to adjust the scores. WHISCY uses the structure to define the surface and to smooth the prediction scores over the surface.
Check here for a short overview of how WHISCY works. Check the paper for a more detailed description. When using the server, you can manually disable interface propensities and/or surface smoothing.
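
The three stages described above (conservation score, propensity adjustment, surface smoothing) can be pictured with a toy Python sketch. This is not WHISCY's actual scoring scheme, which is described in the paper. The function name, the additive propensity offset, the plain neighbour average and the 10 Å radius are all assumptions chosen only to show how the stages chain together and how each one can be switched off.

```python
import math

def toy_interface_scores(conservation, propensity, coords,
                         use_propensity=True, use_smoothing=True,
                         radius=10.0):
    """Toy three-stage scoring; every formula here is an illustrative assumption.

    conservation: {residue_id: conservation-based score} for surface residues
    propensity:   {residue_id: offset for that residue's amino acid type}
    coords:       {residue_id: (x, y, z)} position of each surface residue
    """
    scores = dict(conservation)

    if use_propensity:
        # Assumed adjustment: add a per-residue offset to the conservation score.
        scores = {r: s + propensity.get(r, 0.0) for r, s in scores.items()}

    if use_smoothing:
        # Assumed smoothing: average each score with those of all residues
        # within `radius` (the residue itself is included, distance 0).
        smoothed = {}
        for r in scores:
            near = [scores[q] for q in scores
                    if math.dist(coords[r], coords[q]) <= radius]
            smoothed[r] = sum(near) / len(near)
        scores = smoothed

    return scores
```

Calling the function with use_propensity=False or use_smoothing=False mirrors the two options that can be disabled on the server.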

 

How can I supply my structure?

The structure needs to be in PDB format. You can supply a PDB code, in which case the structure is fetched from the PDB before prediction. Alternatively, you can upload your own PDB file.



What is meant by a protein chain?

Many PDB files consist of multiple protein molecules (chains). Normally, you want to predict on only one chain. WHISCY extracts the chain that you specify from the PDB file before prediction. In a PDB file, the chain identifier for each atom appears directly after the residue name, in column 22:

ATOM    125  N   ASN A  17      -1.652  28.627  57.495  1.00 24.69           N
ATOM    126  CA  ASN A  17      -1.282  28.264  56.130  1.00 27.68           C
ATOM    127  C   ASN A  17       0.046  28.805  55.634  1.00 27.56           C
ATOM    128  O   ASN A  17       0.507  28.378  54.582  1.00 28.39           O

where A is the chain identifier.


If your protein structure contains no chain identifiers, or if you want to predict on all chains, specify <None> as chain. In that case, all chain identifiers will be set to A before prediction.
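
To check which chain a file contains before submitting it, the chain column can be read directly. The following Python sketch is not WHISCY's own code: the function name is hypothetical, and it handles only ATOM records. It keeps the atoms of one chain, or, when no chain is given, keeps all atoms and relabels them as chain A, as described above.

```python
def extract_chain(pdb_lines, chain=None):
    """Return ATOM records for one chain, or all relabelled to 'A' if chain is None.

    Hypothetical helper for illustration; only ATOM records are considered.
    """
    selected = []
    for line in pdb_lines:
        if not line.startswith("ATOM"):
            continue
        line = line.rstrip("\n")
        if chain is None:
            # Column 22 (string index 21) holds the chain identifier.
            line = line[:21].ljust(21) + "A" + line[22:]
            selected.append(line)
        elif len(line) > 21 and line[21] == chain:
            selected.append(line)
    return selected


# First line taken from the example above; the chain B line is made up.
example = [
    "ATOM    125  N   ASN A  17      -1.652  28.627  57.495  1.00 24.69           N",
    "ATOM    900  N   GLY B   5      10.000  11.000  12.000  1.00 20.00           N",
]
chain_a_only = extract_chain(example, chain="A")
all_as_chain_a = extract_chain(example)
```

Note that this only works on fixed-column PDB lines. If the whitespace in a file has been collapsed, column 22 no longer holds the chain identifier.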

 

How can I supply an alignment?

If your structure (or a close homolog) is deposited in the PDB, you can use the alignment from the HSSP database. The HSSP database contains an alignment for nearly every protein in the PDB. To use the HSSP alignment, specify the corresponding PDB code in the HSSP_id field.


Alternatively, you can supply your own alignment. WHISCY recognizes a number of alignment formats used by major alignment programs: .aln (default output of CLUSTAL), fasta (default output of MUSCLE), PHYLIP and MSF.
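
If you are unsure which of these formats your file is in, the first few lines usually tell. The Python sketch below is a rough heuristic, not the reader WHISCY uses. The function name and the returned labels are hypothetical, and unusual files may be misclassified.

```python
def guess_alignment_format(text):
    """Guess the alignment format from typical header lines (heuristic only)."""
    lines = [ln for ln in text.splitlines() if ln.strip()]
    if not lines:
        return None
    first = lines[0].strip()

    # CLUSTAL .aln files start with a line beginning with "CLUSTAL".
    if first.startswith("CLUSTAL"):
        return "clustal"
    # FASTA records start with ">" followed by the sequence name.
    if first.startswith(">"):
        return "fasta"
    # PHYLIP files start with two integers: number of sequences, alignment length.
    parts = first.split()
    if len(parts) == 2 and parts[0].isdigit() and parts[1].isdigit():
        return "phylip"
    # MSF files have a header containing "MSF:" that ends with a "//" line.
    if any("MSF:" in ln for ln in lines) and any(ln.strip() == "//" for ln in lines):
        return "msf"
    return None
```

A return value of None means the heuristic could not decide, which does not necessarily mean the file is invalid.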

If you want to generate an alignment but you are new to bioinformatics, a good place to start would be to run BLAST and then align your results using CLUSTAL.


Always specify the format of the alignment you supply. If WHISCY fails to read your alignment, please contact me by email.

 

I automatically generated my alignment. Should I correct it by hand before I can use WHISCY?

WHISCY is robust to poor alignment quality, so manual correction is not essential. However, you MUST always make sure that the top sequence in the alignment is a real biological sequence. This can be the sequence of your protein structure, but only if that protein is not engineered.

 

What are interface propensities and surface smoothing?

These are features that WHISCY uses to improve the predictions. You can manually disable these features if desired.
Check here for a short overview of how WHISCY works. Check the paper for a more detailed description.

 

How can I view the predictions?

See the Results section for details.

 

How reliable are the predictions?

You should not trust WHISCY predictions blindly, but always try to relate them to information that you already have. WHISCY works very well for proteins with a single conserved protein binding site. It should not be used for non-conserved interactions, such as antibody-antigen complexes. If you know that your protein has multiple binding sites, you should verify whether a predicted site could be the one you are interested in. In addition, conserved binding sites for small molecules (such as calcium or ATP) are sometimes erroneously predicted as protein-protein interfaces.

 

How should I use WHISCY predictions in HADDOCK?

See the Results section for details.

 

What is WHISCYMATE?

WHISCYMATE combines WHISCY and ProMate into an even more accurate predictor. If the WHISCYMATE option is enabled, WHISCY remotely calls the ProMate server for your protein and the combined results are presented. For details of how WHISCYMATE works, see the paper.

 

When can I use WHISCY?

WHISCY is free to use for non-commercial purposes. If you use WHISCY predictions in a scientific context, please cite the following work:

De Vries SJ, van Dijk ADJ, and Bonvin AMJJ
WHISCY: What information does surface conservation yield? Application to data-driven docking.
Proteins: Structure, Function, and Bioinformatics 2006; 63(3): 479-489.


If you want to submit many proteins simultaneously (more than 100), please contact me by email.