Our science
Developing a new drug is a long, complicated, and expensive process. According to Boston Consulting Group (A Revolution in R&D, November 2001, p. 12), it takes 12 years and costs about $1,000,000,000 to develop a new drug. One of the most important stages in the modern approach to drug development is the search for a chemical compound, called a "ligand", that binds strongly to a specific target protein. There are many computational approaches to this problem (Computer-Aided Drug Design, CADD); for example, methods based on the protein structure (Structure-Based Drug Design, SBDD). Several drugs developed with active use of CADD have already reached the market, e.g. Relenza, ritonavir, and indinavir.
Many SBDD programs are now available. The best known are AutoDock, Dock, eHits, FlexX, Fred, Glide, GOLD, ICM, Slide, Surflex, and Q-pharm. Specialists and independent users have tested these programs successfully. The programs perform at roughly the same level across different tasks. For example, they predict the most probable position of a ligand in a protein binding site with an accuracy of ~50% in independent tests and ~75% in tests run by the software developers. In virtual screening, 50% of known active ligands can be found in the top 1-10% of a library of random ligands ranked by the screening score. In real experiments, only one ligand among 1,000-10,000 random chemical substances is active. This means that, compared with the real selectivity (10^-3 to 10^-4), the selectivity of SBDD approximations is still only 10^-1 to 10^-2, which falls short of the real selectivity by roughly two orders of magnitude.
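The screening figures quoted above can be made concrete with a small calculation. The sketch below is only an illustration: the function names, the toy library of 1000 compounds, and the convention that a higher score means a better predicted binder are all assumptions, not part of any of the programs listed. It ranks a library by score and reports what share of the known actives falls into the top fraction, and how much better that is than random picking.

```python
def recovered_fraction(scores, is_active, top_fraction):
    """Share of known actives that land in the top `top_fraction` of the ranking."""
    if not 0 < top_fraction <= 1:
        raise ValueError("top_fraction must be in (0, 1]")
    total_actives = sum(1 for a in is_active if a)
    if total_actives == 0:
        raise ValueError("the library contains no known actives")
    # Rank the library from best to worst predicted score
    # (assumption of this sketch: higher score = better predicted binder).
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0], reverse=True)
    n_top = max(1, round(len(ranked) * top_fraction))
    found = sum(1 for _, active in ranked[:n_top] if active)
    return found / total_actives


def enrichment_factor(scores, is_active, top_fraction):
    """How many times better than random picking the ranking is at `top_fraction`."""
    return recovered_fraction(scores, is_active, top_fraction) / top_fraction


# Toy library: 1000 compounds, 10 of them active (illustrative numbers, not real data).
n_compounds = 1000
scores = [float(n_compounds - i) for i in range(n_compounds)]  # compound 0 scores best
active_positions = {0, 10, 20, 30, 40, 500, 600, 700, 800, 900}
is_active = [i in active_positions for i in range(n_compounds)]

recovered = recovered_fraction(scores, is_active, 0.05)
ef = enrichment_factor(scores, is_active, 0.05)
print(f"actives recovered in top 5%: {recovered:.0%}")
print(f"enrichment factor at 5%: {ef:.1f}")
```

In this toy case half of the actives sit in the top 5% of the ranking, which matches the "50% of actives in the top 1-10%" situation described above.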
Our principles:
We follow three general principles when developing new approaches to SBDD problems:
- simplicity;
- use of the maximum amount of experimental data; and
- correct use of experimental data.
Simplicity:
"Everything should be as simple as possible, but not simpler." Contrary to the general tendency, we do NOT complicate anything without necessity. Why would we, if the same result can be obtained with simple and clear methods? This is why we do not use neural networks, genetic algorithms, or many other "fancy", complex, and sophisticated approaches. All our approaches are based on simple and clear physical models. Because our methods are simple and clear, it is easy to check them step by step and to understand clearly where any result comes from.
Use of maximum experimental data:
Quantum mechanics provides a complete description of molecular interactions. For small molecules in the gas phase, "ab initio" methods based on quantum mechanics can yield physical values more accurately than experimental measurement. At present, however, it is impossible to describe the interaction between a protein and a ligand in aqueous solution using "ab initio" quantum-mechanical methods. All other methods rely on experimental data to a greater or lesser extent. All our approaches to describing the protein-ligand interaction use the maximum of available experimental data. The more experimental data you use, the more accurate the prediction you can obtain. The more experiments, the more experience.
Correct use of experimental data:
You can drive a nail with a computer, but a hammer is more convenient. It is possible to develop a scoring function using information on one protein as training data and then use it for docking to another protein. But a scoring function developed specifically for docking to particular proteins is a better choice. All our empirical methods were trained on the tasks and objects to which they are applied. If a scoring function is used for docking, it should be trained for docking. If a scoring function is used to separate active ligands from inactive ones, it should be trained on active and inactive ligands together.
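As a rough illustration of training a scoring function for the task it will be used on, the sketch below fits a simple linear score to separate active from inactive ligands, using both classes at once. Everything in it is an assumption made for the example: the two descriptors, the toy values, the function names, and the choice of logistic regression fitted by gradient steps. It is not the method described on this page, only one simple way to show the idea.

```python
import math


def linear_score(weights, bias, descriptors):
    """Weighted sum of descriptors; higher means 'more likely active'."""
    return bias + sum(w * x for w, x in zip(weights, descriptors))


def train_scoring_function(samples, labels, epochs=500, learning_rate=0.1):
    """Fit weights by logistic regression with plain stochastic gradient steps.

    samples: list of descriptor tuples; labels: 1 for active, 0 for inactive.
    """
    weights = [0.0] * len(samples[0])
    bias = 0.0
    for _ in range(epochs):
        for x, y in zip(samples, labels):
            predicted = 1.0 / (1.0 + math.exp(-linear_score(weights, bias, x)))
            error = y - predicted
            weights = [w + learning_rate * error * xi for w, xi in zip(weights, x)]
            bias += learning_rate * error
    return weights, bias


# Hypothetical, normalised descriptors per ligand pose:
# (hydrogen-bond term, hydrophobic-contact term). Toy values, not real data.
actives = [(0.9, 0.8), (0.8, 0.7), (0.7, 0.9)]
inactives = [(0.2, 0.3), (0.1, 0.4), (0.3, 0.1)]

samples = actives + inactives
labels = [1] * len(actives) + [0] * len(inactives)
weights, bias = train_scoring_function(samples, labels)

active_scores = [linear_score(weights, bias, x) for x in actives]
inactive_scores = [linear_score(weights, bias, x) for x in inactives]
print("lowest active score:", min(active_scores))
print("highest inactive score:", max(inactive_scores))
```

A scoring function meant for docking would be fitted in the same spirit but on a different training set, for example correct versus incorrect poses, rather than reusing the weights obtained here.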
A key problem of SBDD is predicting the interaction between a ligand and a protein. We use scoring functions to solve this problem, but we do NOT use the same scoring function for all SBDD problems. Each task needs an individual approach, and each problem needs its own scoring function. We propose methods for developing focused scoring functions. In our methods we have tried to make correct use of the maximum amount of experimental data, following the principle: make it simple, but not simpler than necessary.