Our Internal Focus: The GHKL Superfamily of Proteins
Current early drug discovery is driven by high-throughput screening of
compound libraries, aided by computational analysis of a target and/or
lead compounds. Biopredict, Inc. has developed a suite of computational
methods to address these problems (see Technology) that:
- analyzes and internalizes available information in protein families
  or superfamilies;
- uses this information to design a focused virtual screen of
  purchasable compounds for testing;
- docks and performs energy minimization of small molecules in protein
  active sites and builds molecular fragments into larger molecules;
- analyzes hits resulting from these focused screens; and
- generates follow-on hit-to-lead focused screens and combinatorial
  chemistry lead-optimization libraries.
Our approach makes use not only of available information on a specific
target of interest, but also of information on related targets within
the target protein family.
We have honed our current approach through collaborations with biotech
and pharmaceutical companies to identify lead compounds, to increase
their potency, to increase specificity in target families, and to design
family-focused combinatorial libraries. Targets have included nuclear
receptors, protein kinases and phosphatases, nucleotide synthases, and
serine and other proteases, as well as other, more novel targets.
An advantage of our family-based, multi-protein-target approach is
that, at the first stage of hit elucidation and structural hypothesis
verification, we screen for ligands that bind to a conserved structural
motif common to the target family of proteins. We use a complementary
approach that detects activities against multiple members of the
family, helping to confirm our structural binding mode hypotheses.
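As a rough illustration of what "a conserved structural motif common to
a family" can mean computationally, the sketch below scans pre-aligned
active-site residue strings from several family members and reports the
positions where the same residue occurs in at least a chosen fraction
of targets. It is a minimal sketch, not a description of our production
methods: the function name, the target labels, the example residue
strings, and the min_fraction threshold are all hypothetical, and a
real analysis would work from structural alignments rather than
sequence strings alone.

```python
from collections import Counter


def conserved_positions(aligned_sites, min_fraction=1.0):
    """Return (index, residue) pairs conserved across aligned active sites.

    aligned_sites maps a target label to its aligned active-site string
    (one-letter residue codes, '-' for gaps). All strings must have the
    same length. A position is reported when its most common residue is
    not a gap and occurs in at least min_fraction of the targets.
    """
    if not aligned_sites:
        return []
    lengths = {len(site) for site in aligned_sites.values()}
    if len(lengths) != 1:
        raise ValueError("aligned active-site strings must have equal length")

    n_targets = len(aligned_sites)
    conserved = []
    for index, column in enumerate(zip(*aligned_sites.values())):
        residue, count = Counter(column).most_common(1)[0]
        if residue != "-" and count / n_targets >= min_fraction:
            conserved.append((index, residue))
    return conserved


# Hypothetical aligned active-site strings for three family members.
example_sites = {
    "target_a": "DGAGK",
    "target_b": "DGSGK",
    "target_c": "DG-GR",
}
strict_motif = conserved_positions(example_sites)
relaxed_motif = conserved_positions(example_sites, min_fraction=0.6)
```

Lowering min_fraction trades motif strictness for coverage, which is
one simple way to decide which contacts a family-wide screening
hypothesis should require.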
Advantages of our approach over other structure-based virtual screening methods
Many scoring functions used to dock and evaluate docked compounds are
still imperfect and will probably remain so for quite some time. As a
result,
the highest scoring compound is not always active, and the highest
scoring conformation of an active compound is not always predicted
correctly. Our information-driven approach circumvents this problem by
augmenting energy scores using hypothesis-driven methods to
select compounds that exhibit similar binding modes to those already
established either against the specific target or against homologous
targets. When used to select compounds, these methods assume that
compounds with binding modes similar to those of known actives will
themselves be active. In our experience, applying these methods yields
a roughly ten-fold enrichment in identified actives for a screen
compared with selecting compounds by docking scores alone. Despite this
enrichment, we generally include up to ~1000 compounds, selected using
multiple hypotheses, in our first screen to increase the number of hits
obtained. A further improvement is the inclusion of a multiply iterated
screen. Once hits
are found, testing purchasable compounds related to identified hits is an
attractive, efficient, and inexpensive way to expand the number of hits
considered within a given lead series before committing to chemical
synthesis. A final and perhaps defining advantage of our methods is
that they work with multiple related targets at the same time. Careful
analysis of active site similarities for multiple targets enables the
identification of common focused libraries for testing. If active site
similarities are great enough, the same methods can lead to
multiple-target inhibitors. A compound that hits more than one target
in a sensitive pathway has a higher likelihood of avoiding
drug-resistant mutations when targeting infectious disease organisms (we
are pursuing this approach in kinases with a client company).
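The idea of augmenting docking scores with binding-mode hypotheses can
be sketched in a few lines. The example below is an illustration under
stated assumptions, not our actual implementation: it represents each
docked pose as a set of residue-level contacts, compares that set with
the contacts of known actives using the Tanimoto coefficient, keeps
only poses that resemble an established binding mode, and then ranks
the survivors by docking score. The class and function names, the
contact labels, the 0.5 similarity cutoff, and the convention that more
negative scores are better are all hypothetical choices made for the
sketch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DockedPose:
    compound_id: str
    docking_score: float   # assumed convention: more negative is better
    contacts: frozenset    # residue-level contacts, e.g. "RES_A:hbond"


def tanimoto(a, b):
    """Tanimoto coefficient between two contact sets."""
    union = a | b
    return len(a & b) / len(union) if union else 0.0


def binding_mode_similarity(pose, reference_modes):
    """Best similarity between a pose and any known binding mode."""
    return max((tanimoto(pose.contacts, ref) for ref in reference_modes),
               default=0.0)


def select_by_hypothesis(poses, reference_modes, min_similarity=0.5,
                         top_n=1000):
    """Keep poses matching a known binding mode, ranked by docking score."""
    kept = [p for p in poses
            if binding_mode_similarity(p, reference_modes) >= min_similarity]
    kept.sort(key=lambda p: p.docking_score)
    return kept[:top_n]


def enrichment_factor(selected_ids, active_ids, library_size):
    """Hit rate in the selection divided by hit rate in the whole library."""
    selected, actives = set(selected_ids), set(active_ids)
    if not selected or not actives or library_size <= 0:
        raise ValueError("need a non-empty selection, actives, and library")
    hit_rate_selected = len(selected & actives) / len(selected)
    hit_rate_library = len(actives) / library_size
    return hit_rate_selected / hit_rate_library


# Hypothetical data: one known binding mode and three docked compounds.
reference_modes = [
    frozenset({"RES_A:hbond", "RES_B:hbond", "RES_C:hydrophobic"}),
]
poses = [
    DockedPose("cmpd_1", -9.0, frozenset({"RES_X:hbond"})),
    DockedPose("cmpd_2", -7.5, frozenset({"RES_A:hbond", "RES_B:hbond",
                                          "RES_C:hydrophobic"})),
    DockedPose("cmpd_3", -8.0, frozenset({"RES_A:hbond", "RES_B:hbond"})),
]
selected = select_by_hypothesis(poses, reference_modes)
selected_ids = [p.compound_id for p in selected]
```

In this toy data the best-scoring compound (cmpd_1) is dropped because
its contacts do not resemble the known binding mode, which is the
situation the hypothesis-driven filter is meant to handle. The
enrichment_factor helper shows one common way to quantify an enrichment
figure once assay results for a selection are available.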
The GHKL superfamily includes such diverse protein families as DNA
topoisomerase II, the molecular chaperone Hsp90, the DNA mismatch
repair enzyme MutL, and histidine kinases. The superfamily is rich in
established and proposed drug targets, for both cancer and bacterial
infectious disease. We are currently applying a structure-based,
computationally driven approach to identify novel compounds and to
design combinatorial chemistry libraries that are active against
multiple members of the superfamily. Our goal is to generate
high-affinity, specific inhibitors for several GHKL targets, with
supporting broad-spectrum libraries active against multiple GHKL
targets to establish a superfamily-based SAR.