PRISM's rationale is that if the two complementary sides of a template interface are similar to the surfaces of two target proteins, then these two proteins can interact with each other using this template interface architecture.
PRISM uses two types of datasets for prediction: a template dataset and a target dataset.
The Template Dataset represents the entirety of structurally available protein-protein interactions and serves as a template to predict other potentially interacting protein pairs. It is a subset of a nonredundant dataset of known protein-protein interfaces.
In our definition, a two-chain protein interface is composed of interacting residues and nearby residues. The interacting residues are picked first. If the distance between any two atoms of two residues, one from each chain, is less than the sum of their van der Waals radii plus 0.5 Angstrom, both residues are registered as interacting residues. When fewer than 10 interacting residues are assigned (an arbitrary but reasonable number that reflects the minimum requirement for contact), the interface is considered a result of crystal packing forces and is not considered further. To reveal the types of architectures at the interface, residues whose alpha carbon atom is within 6.0 Angstrom of an alpha carbon atom of a previously assigned interacting residue are also included and are named nearby residues. The 6.0 Angstrom cutoff, selected by trial and error, is very close to the smallest distance that includes residues not involved in the interface but essential for showing the scaffold of the interface. Below, a graphical representation of an interface is given in which the interacting residues are colored magenta and the nearby residues are colored cyan.
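The interface definition above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the code PRISM runs: the `Atom` record, the function names, and the van der Waals radii (Bondi values for C, N, O and S) are assumptions, since the text does not list the radii it uses.

```python
import math
from collections import namedtuple

# Hypothetical atom record: chain id, residue number, atom name, element, (x, y, z).
Atom = namedtuple("Atom", "chain resnum name element xyz")

# Assumed van der Waals radii in Angstrom (Bondi values); not taken from the source.
VDW_RADII = {"C": 1.70, "N": 1.55, "O": 1.52, "S": 1.80}
CONTACT_MARGIN = 0.5          # Angstrom added to the sum of the radii
NEARBY_CA_CUTOFF = 6.0        # Angstrom between alpha carbons
MIN_INTERFACE_RESIDUES = 10   # below this, treat as crystal packing


def interacting_residues(atoms_a, atoms_b):
    """Residues of both chains with any atom pair closer than the contact cutoff."""
    found = set()
    for a in atoms_a:
        for b in atoms_b:
            cutoff = VDW_RADII[a.element] + VDW_RADII[b.element] + CONTACT_MARGIN
            if math.dist(a.xyz, b.xyz) < cutoff:
                found.add((a.chain, a.resnum))
                found.add((b.chain, b.resnum))
    return found


def nearby_residues(all_atoms, interacting):
    """Non-interacting residues whose CA lies within 6.0 Angstrom of an interacting CA."""
    ca = {(a.chain, a.resnum): a.xyz for a in all_atoms if a.name == "CA"}
    core = [ca[res] for res in interacting if res in ca]
    nearby = set()
    for res, pos in ca.items():
        if res in interacting:
            continue
        if any(math.dist(pos, c) <= NEARBY_CA_CUTOFF for c in core):
            nearby.add(res)
    return nearby


def is_crystal_packing(interacting):
    """True when the interface has fewer than 10 interacting residues."""
    return len(interacting) < MIN_INTERFACE_RESIDUES
```

The all-pairs loop is quadratic in the number of atoms; it is kept that way for readability, and a spatial grid or k-d tree would be the usual choice for full-size structures.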
The contribution of residues in the binding region of proteins is not uniform; rather, binding regions contain critical residues, called hot spots. Hot spots are primary targets of therapeutic agents, because a molecule designed to bind to hot spots may disrupt a protein-protein interaction. Experimentally, hot spots are determined by alanine scanning mutagenesis: if the contribution of the mutated residue to the binding free energy is more than 2.0 kcal/mol, the residue is labeled a hot spot. Given their important role in binding, we use hot spots in the prediction of protein-protein interactions, which provides evolutionary insight into the predictions.
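As a small illustration of the experimental criterion above, the following sketch labels residues as hot spots from alanine scanning values. The function name, the residue key format, and the idea of passing the measurements as a dictionary are assumptions made for the example; it does not describe how PRISM itself obtains hot spots.

```python
HOT_SPOT_THRESHOLD = 2.0  # kcal/mol, the cutoff quoted in the text


def label_hot_spots(ddg_by_residue):
    """Return residues whose binding free energy contribution exceeds 2.0 kcal/mol.

    ddg_by_residue maps a residue key (hypothetical format, e.g. "A:45") to the
    change in binding free energy measured after mutation to alanine.
    """
    return sorted(
        residue
        for residue, ddg in ddg_by_residue.items()
        if ddg > HOT_SPOT_THRESHOLD
    )
```

Because the text says "more than 2.0 kcal/mol", a residue measured at exactly 2.0 kcal/mol is not labeled a hot spot in this sketch.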
The target dataset contains the proteins among which potential interactions are to be predicted. To determine whether they interact directly, their surfaces are structurally compared to the template interfaces.
PRISM is composed of four consecutive steps:
To find every possible binary interaction between pairs of structures in the target dataset, we need a method to measure the similarity between the partners of the template interfaces and the surfaces of the target proteins. To do this, we extract the surfaces of the target proteins in the first phase and, in the second phase, perform successive structural alignments between these surfaces and the partner chains of the interfaces in the template interface dataset, in an all-against-all manner. For structural alignment, we use MultiProt. This enables us to measure the structural similarity of a target structure to a template binding site. If the surfaces of two target proteins (A and B) contain regions similar to the complementary partner chains of a template interface, we say that A and B may interact through these similar regions. Further, we check for the presence of hot spots on the target structure. The number of matched hot spots indicates evolutionary similarity, whereas the structural match ratio indicates structural similarity. In the third phase, the two chains whose surface regions are similar to the two sides of the template interface are transformed onto this template to form a complex structure, and the solution is assessed. This assessment is rigid; a more detailed refinement is made in the next phase. The last phase involves flexible refinement of the rigid docking solutions from MultiProt to resolve steric clashes, especially of side chains, and ranking of the putative complexes by global energy. This is done with FiberDock, which calculates energies and ranks the predicted protein complexes accordingly. In this way, geometric complementarity is combined with docking procedures, which makes the method more physically grounded.
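The pairing logic of the second phase can be sketched as follows. The sketch assumes the structural alignments have already been run and summarized as one record per (target, template side) match; the record layout, the side names "left" and "right", and both threshold values are hypothetical and are not PRISM's actual parameters.

```python
from collections import defaultdict
from itertools import product


def candidate_pairs(matches, min_match_ratio=0.5, min_hot_spots=1):
    """Pair targets that match complementary sides of the same template interface.

    matches: iterable of (target, template, side, match_ratio, hot_spot_matches),
    where side is "left" or "right". The thresholds are illustrative assumptions.
    """
    by_template = defaultdict(lambda: {"left": [], "right": []})
    for target, template, side, ratio, hot_spots in matches:
        # Keep only matches that pass both the structural and the hot spot filter.
        if ratio >= min_match_ratio and hot_spots >= min_hot_spots:
            by_template[template][side].append(target)

    pairs = []
    for template, sides in by_template.items():
        # Every left-side match can pair with every right-side match.
        for a, b in product(sides["left"], sides["right"]):
            pairs.append((a, b, template))
    return pairs
```

Each returned triple names two targets and the template interface through which they may interact; in the pipeline described above, such candidates would then go on to the transformation and flexible refinement phases.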
The PRISM web server runs the PRISM prediction algorithm. Users can access PRISM functionality through three pages:
Users can perform two tasks: the first task (Two Proteins) predicts interactions between two proteins, and the second task (Network) predicts interactions in a network of proteins. To predict interactions between two proteins, users need to supply two PDB (Protein Data Bank) ids. Optionally, users can specify the chains used in the predictions. For example, to predict interactions of only the 'A' chain of 2ghu, 2ghuA should be entered as target1. Similarly, users need to provide a PDB id for target2 (e.g. 1cew). Users can then submit the prediction request. PRISM will use the default template set to execute the PRISM algorithm. The request is placed in a job queue and executed on a computer dedicated to the PRISM server. Users are given a link to follow the progress of their job. Additionally, users can provide an e-mail address in the optional email field to be notified when their jobs are submitted and when they are completed.
The network prediction is used to predict interactions in a network of proteins. In the pair list box, the edges of the network are listed as a sequence of target1, target2 pairs, where each target is a PDB id with or without chains as explained above. The number of edges is limited to 10 pairs due to the heavy computational load.
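For users who prepare their input with a script, the target and pair-list conventions described above can be checked before submission. The sketch below is a local helper, not part of the server: the function names are made up, and the exact separators accepted by the pair list box are an assumption (one pair per line, split on commas or whitespace).

```python
import re

# A four-character PDB id (digit followed by three alphanumerics), then optional chain ids.
TARGET_RE = re.compile(r"^([0-9][A-Za-z0-9]{3})([A-Za-z0-9]*)$")
MAX_PAIRS = 10  # edge limit stated for the Network task


def parse_target(text):
    """Split a target such as '2ghuA' into ('2ghu', ['A'])."""
    match = TARGET_RE.match(text.strip())
    if match is None:
        raise ValueError(f"not a PDB id with optional chains: {text!r}")
    pdb_id, chains = match.groups()
    return pdb_id.lower(), list(chains)


def parse_pair_list(text):
    """Parse one 'target1, target2' pair per line (assumed layout)."""
    pairs = []
    for line in text.splitlines():
        fields = [f for f in re.split(r"[,\s]+", line.strip()) if f]
        if not fields:
            continue  # skip blank lines
        if len(fields) != 2:
            raise ValueError(f"expected two targets per line: {line!r}")
        pairs.append((parse_target(fields[0]), parse_target(fields[1])))
    if len(pairs) > MAX_PAIRS:
        raise ValueError(f"at most {MAX_PAIRS} pairs are allowed, got {len(pairs)}")
    return pairs


print(parse_target("2ghuA"))  # ('2ghu', ['A'])
print(parse_target("1cew"))   # ('1cew', [])
```

An empty chain list means that no chain was specified for that target.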
In the examples above, targets are protein structures from the PDB. If users want to use their own structures, they can upload them using the Upload Target buttons. In that case, the prediction results will be available to the user for download, but they will not be stored in the PRISM database. Similarly, PRISM uses the default template set unless told otherwise. If users want to use their own templates, they need to provide a PDB id with two chains (e.g. 1stfEI). PRISM will extract the interface and use it for the predictions. These results will not be stored in the PRISM database either.
A typical two-protein prediction requires approximately 1 hour of computation time. However, if the results or some intermediate results are already in the database from previous requests, the response can be given in a much shorter time.
The Predictions page is used to browse the predictions found and stored in the database so far. For each prediction, the PDB and chain ids of the targets, the interface used in the prediction, and the energy score are listed. Additionally, a visualization of the complex structure can be accessed through the view link. The visualization window has a download button as well. In the downloaded prediction result PDB file, the temperature factors are replaced with interface and non-interface residue information. A value of “1” in the temperature factor column marks a non-interface residue of target1, “3” marks an interface residue of target1, “2” marks an interface residue of target2, and “4” marks a non-interface residue of target2. This result PDB file can easily be visualized with other visualization tools using temperature factor coloring methods. In addition to the PDB file, the server presents the interaction contacts of the interface residues through the “Contacts of Interface Residues” link in the view window. The prediction results can be searched by target or interface ids.
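The temperature factor encoding described above can be decoded with a short script. The sketch assumes that the downloaded file follows the standard fixed-column PDB layout (chain id in column 22, residue number in columns 23-26, temperature factor in columns 61-66); the function and dictionary names are made up for the example.

```python
# Meaning of the values written to the temperature factor column, as listed in the text.
BFACTOR_LABELS = {
    1: ("target1", "non-interface"),
    3: ("target1", "interface"),
    2: ("target2", "interface"),
    4: ("target2", "non-interface"),
}


def label_residues(pdb_lines):
    """Map (chain id, residue number) to (target, role) using the temperature factor."""
    labels = {}
    for line in pdb_lines:
        if not line.startswith(("ATOM", "HETATM")):
            continue
        chain_id = line[21]
        res_seq = int(line[22:26])
        code = int(round(float(line[60:66])))  # e.g. "  3.00" becomes 3
        if code in BFACTOR_LABELS:
            labels[(chain_id, res_seq)] = BFACTOR_LABELS[code]
    return labels
```

A typical use is to open the downloaded file, pass the file object to `label_residues`, and then select the residues labeled "interface" for each target.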
The default template set can be browsed and searched on this page. The details of each interface can be accessed by clicking on the interface id. There are 22604 template interfaces in the default set. The default set is explained in detail at Piface.
Designed & Developed by Alper Başpınar.