Hybrid ML Scoring
The hybrid scoring function combines physics-based force field scoring with machine learning models to achieve the highest accuracy in PandaDock. It leverages both interpretable physical interactions and learned patterns from large-scale binding data.
Overview
Scoring ID: hybrid
Type: Combined physics-based + machine learning scoring
Accuracy: R = 0.91 correlation with experimental binding affinities on PDBBind Core (highest among PandaDock scoring functions)
Speed: 0.1-0.3 seconds per pose
Best for: Lead optimization, critical predictions, final ranking, high-accuracy affinity estimation
Algorithm
The hybrid scoring function evaluates two component scores and merges them in a weighted linear combination of the following terms:
\(S_{physics}\) = Physics-based force field score
\(S_{ML}\) = Machine learning score
\(\alpha, \beta, \gamma\) = Optimized combination weights
Architecture
Component 1: Physics-Based Scoring
Input: Protein-ligand complex
↓
Force Field Evaluation
- Van der Waals
- Electrostatics
- Desolvation
- Hydrogen bonds
- Torsional penalty
↓
Physics Score
Component 2: Machine Learning Scoring
Input: Protein-ligand complex
↓
Feature Extraction
- 3D grid representation
- Interaction fingerprints
- Pharmacophore features
- Shape descriptors
- Protein pocket features
↓
Graph Neural Network
- Node features (atoms)
- Edge features (bonds, interactions)
- Graph convolutions
- Attention mechanisms
↓
ML Score
Component 3: Score Combination
Physics Score + ML Score
↓
Weighted Linear Combination
↓
Final Hybrid Score
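The combination step can be sketched in a few lines of Python. This is a minimal illustration, not PandaDock's implementation: the function name `hybrid_score` is hypothetical, and treating \(\gamma\) as a constant offset is an assumption, since this page only states that the weights are optimized. The example values are taken from the sample output in the Score Components section.

```python
def hybrid_score(s_physics: float, s_ml: float,
                 alpha: float, beta: float, gamma: float = 0.0) -> float:
    """Weighted linear combination of the two component scores.

    ASSUMPTION: gamma acts as a constant offset; the documentation does not
    spell out its role.
    """
    return alpha * s_physics + beta * s_ml + gamma


# Component scores and weights from the sample output on this page.
weighted_sum = hybrid_score(-8.5, -10.2, alpha=0.4, beta=0.6)
print(f"{weighted_sum:.2f}")  # -9.52
```

With these weights the two weighted terms sum to -9.52, while the sample output reports a hybrid score of -9.8. Under the assumed form, the remaining difference would have to come from a term such as \(\gamma\).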
Machine Learning Model
Architecture: Graph Neural Network (GNN) with attention
Training Data:
PDBBind Dataset: 15,000+ protein-ligand complexes
Refined Set: High-quality structures with experimental Kd/Ki
Affinity Range: pKd 2-12 (Kd from 10 mM down to 1 pM)
Diverse Proteins: All major drug target families
Model Features:
Atomic features: Element, hybridization, aromaticity, charge
Bond features: Bond type, rotatable, in ring
Interaction features: H-bonds, π-stacking, hydrophobic contacts
Geometric features: Distances, angles, torsions
Pocket features: Cavity shape, hydrophobicity, electrostatics
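One way to picture the atomic and bond feature groups listed above is as a small graph data structure, with atoms as nodes and bonds as edges. The sketch below is only an illustration: the class names `AtomNode`, `BondEdge`, and `MoleculeGraph` and their fields are hypothetical and are not taken from PandaDock's code.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class AtomNode:
    """Node features: element, hybridization, aromaticity, charge."""
    element: str
    hybridization: str
    aromatic: bool
    charge: float


@dataclass
class BondEdge:
    """Edge features: bond type, rotatable flag, ring membership."""
    begin: int
    end: int
    bond_type: str
    rotatable: bool
    in_ring: bool


@dataclass
class MoleculeGraph:
    atoms: List[AtomNode] = field(default_factory=list)
    bonds: List[BondEdge] = field(default_factory=list)

    def add_atom(self, atom: AtomNode) -> int:
        """Append an atom and return its node index."""
        self.atoms.append(atom)
        return len(self.atoms) - 1

    def add_bond(self, bond: BondEdge) -> None:
        """Append a bond after checking that both endpoints exist."""
        n = len(self.atoms)
        if not (0 <= bond.begin < n and 0 <= bond.end < n):
            raise IndexError("bond refers to an atom that is not in the graph")
        self.bonds.append(bond)

    def neighbors(self, index: int) -> List[int]:
        """Indices of atoms bonded to the given atom."""
        found = []
        for bond in self.bonds:
            if bond.begin == index:
                found.append(bond.end)
            elif bond.end == index:
                found.append(bond.begin)
        return found


# Hypothetical two-atom fragment (a C-O single bond).
graph = MoleculeGraph()
c = graph.add_atom(AtomNode("C", "sp3", False, 0.05))
o = graph.add_atom(AtomNode("O", "sp3", False, -0.40))
graph.add_bond(BondEdge(c, o, "single", rotatable=True, in_ring=False))
print(graph.neighbors(c))  # [1]
```

A graph neural network would typically aggregate information along exactly these neighbor lists; interaction, geometric, and pocket features would need further fields that are left out here.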
Training Protocol:
Loss function: Mean squared error on binding affinity
Optimizer: Adam with learning rate scheduling
Regularization: Dropout, L2 regularization
Validation: 5-fold cross-validation
Test set: CASF-2016 benchmark (independent)
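Two items in the protocol above, the mean squared error loss and 5-fold cross-validation, are simple enough to show directly. The following is a minimal sketch using only the standard library; the function names are hypothetical, and the shuffling and fold assignment are assumptions, since the page does not describe how the folds are built.

```python
import random
from typing import Iterator, List, Sequence, Tuple


def mse(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Mean squared error between predicted and experimental affinities."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum((p - e) ** 2 for p, e in zip(predicted, experimental)) / len(predicted)


def k_fold_splits(n_samples: int, k: int = 5,
                  seed: int = 0) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, validation_indices) for each of k folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)          # ASSUMPTION: random assignment
    folds = [indices[i::k] for i in range(k)]     # k nearly equal-sized folds
    for held_out in range(k):
        validation = folds[held_out]
        train = [idx for f, fold in enumerate(folds) if f != held_out for idx in fold]
        yield train, validation


print(mse([6.0, 7.5], [6.0, 9.5]))  # 2.0
for train, validation in k_fold_splits(10, k=5):
    print(len(train), len(validation))  # 8 2 on each of the five lines
```

Every sample appears in exactly one validation fold, so each complex is used for validation once and for training four times.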
Usage
Basic Usage
pandadock dock -r protein.pdb -l ligand.sdf \
    --scoring hybrid \
    --center 10 20 30 --box 20 20 20
High-Accuracy Lead Optimization
pandadock dock -r target.pdb -l analogs.sdf \
    --algorithm enhanced_hierarchical_cpu \
    --scoring hybrid \
    --num-poses 50 \
    --ensemble \
    -o lead_optimization/
With MM-GBSA Rescoring
pandadock dock -r protein.pdb -l ligand.sdf \
    --scoring hybrid \
    --rescoring mmgbsa \
    --num-poses 100 \
    -o maximum_accuracy/
GPU-Accelerated Hybrid Scoring
pandadock dock -r target.pdb -l ligands.sdf \
    --algorithm enhanced_hierarchical_gpu \
    --scoring hybrid \
    --gpu \
    -o gpu_hybrid/
Performance Characteristics
Accuracy Benchmarks
| Dataset | Correlation (R) | RMSE (kcal/mol) |
|---|---|---|
| PDBBind Core | 0.91 | 1.42 |
| CASF-2016 | 0.89 | 1.58 |
| Astex Diverse | 0.87 | 1.76 |
Best performance among all PandaDock scoring functions
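For readers who want to reproduce this kind of benchmark on their own data, the two metrics in the table can be computed as follows. This is a sketch under stated assumptions: R is taken to be the Pearson correlation coefficient (the page does not say which correlation is used), and the example numbers are invented for illustration, not PandaDock results.

```python
import math
from typing import Sequence


def pearson_r(x: Sequence[float], y: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if spread_x == 0 or spread_y == 0:
        raise ValueError("correlation is undefined for constant input")
    return cov / (spread_x * spread_y)


def rmse(predicted: Sequence[float], experimental: Sequence[float]) -> float:
    """Root mean squared error, in the same units as the inputs."""
    if len(predicted) != len(experimental) or not predicted:
        raise ValueError("inputs must be non-empty and of equal length")
    squared = sum((p - e) ** 2 for p, e in zip(predicted, experimental))
    return math.sqrt(squared / len(predicted))


# Invented affinities in kcal/mol, for illustration only.
predicted = [-9.1, -7.4, -8.2, -6.0]
experimental = [-9.5, -7.0, -8.8, -5.5]
print(f"R = {pearson_r(predicted, experimental):.2f}")
print(f"RMSE = {rmse(predicted, experimental):.2f} kcal/mol")
```

R measures how well the predicted ordering and spacing track experiment, while RMSE measures the absolute error, so a scoring function can do well on one and poorly on the other.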
Comparison with Components
| Scoring | Correlation (R) |
|---|---|
| Physics-based | 0.85 |
| ML-only | 0.88 |
| Hybrid | 0.91 |
Synergy: Hybrid outperforms both individual components
Speed Benchmarks
CPU: 0.1-0.3 seconds/pose
GPU: 0.01-0.05 seconds/pose (about 6-10x faster than CPU)
Note: Slower than physics-based due to ML inference, but GPU acceleration available
Screening Throughput
CPU: 20-60 ligands/hour
GPU: 120-360 ligands/hour
Recommendation: Use for final ranking (<1000 compounds), not initial screening
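The recommendation follows from simple arithmetic on the throughput figures above. The sketch below converts a library size into wall-clock hours; the helper name `screening_hours` is hypothetical, and the calculation assumes a single worker running at the quoted rates.

```python
def screening_hours(n_ligands: int, ligands_per_hour: float) -> float:
    """Wall-clock hours needed to score n_ligands at a given throughput."""
    if ligands_per_hour <= 0:
        raise ValueError("throughput must be positive")
    return n_ligands / ligands_per_hour


# Throughput ranges quoted above, as (slowest, fastest) ligands per hour.
throughput = {"CPU": (20, 60), "GPU": (120, 360)}

for device, (slowest, fastest) in throughput.items():
    best = screening_hours(1000, fastest)
    worst = screening_hours(1000, slowest)
    print(device, f"{best:.1f}-{worst:.1f} h")
# CPU 16.7-50.0 h
# GPU 2.8-8.3 h
```

At 1000 compounds the CPU estimate is already between roughly 17 and 50 hours, which is why hybrid scoring is better kept for the final ranking stage.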
Ranking Performance
Tested on CASF-2016 (285 complexes):
Top-1 success: 82%
Top-3 success: 94%
Kendall’s τ: 0.68 (best)
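Kendall's τ compares how two lists order the same items. A minimal sketch is given below; it computes the tau-a variant, in which tied pairs count as neither concordant nor discordant. Which variant the benchmark above uses is not stated, so treat that choice as an assumption.

```python
from itertools import combinations
from typing import Sequence


def kendall_tau(x: Sequence[float], y: Sequence[float]) -> float:
    """Kendall's tau-a: (concordant - discordant) / total number of pairs."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    concordant = 0
    discordant = 0
    for (x1, y1), (x2, y2) in combinations(zip(x, y), 2):
        direction = (x1 - x2) * (y1 - y2)
        if direction > 0:
            concordant += 1      # both lists order this pair the same way
        elif direction < 0:
            discordant += 1      # the lists disagree on this pair
    n_pairs = len(x) * (len(x) - 1) // 2
    return (concordant - discordant) / n_pairs


# Invented scores and affinities for three ligands; one pair is swapped.
predicted = [-9.0, -8.0, -7.0]
experimental = [-9.2, -7.1, -7.9]
print(f"{kendall_tau(predicted, experimental):.2f}")  # 0.33
```

A value of 1 means the predicted and experimental rankings agree on every pair, 0 means no association, and -1 means the rankings are exactly reversed.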
Strengths and Limitations
Strengths
- Highest Accuracy
R = 0.91 correlation, best performance on benchmarks
- Robust Across Targets
Trained on diverse protein families
- Learns Non-Obvious Patterns
ML captures subtle features physics-based scoring misses
- Uncertainty Estimates
ML model provides confidence scores
- Complementary Information
Physics and ML components cover different aspects
- GPU Accelerated
10x speedup with GPU inference
Limitations
- Slower Than Other Methods
3-5x slower than physics-based scoring
- Requires Model Loading
Initial overhead for ML model initialization
- Less Interpretable
ML component is a black box
- May Extrapolate Poorly
Performance degrades for very novel scaffolds
- GPU Memory Usage
Requires more GPU memory than physics-only scoring
Best Practices
Recommended Use Cases
Lead Optimization
pandadock dock -r target.pdb -l series_analogs.sdf \
    --scoring hybrid \
    --num-poses 50 \
    -o lead_opt/
Accurately rank close analogs for synthesis prioritization
Final Candidate Ranking
# Step 1: Fast screening with empirical
pandadock dock -r target.pdb -l library_10k.sdf \
    --scoring empirical \
    --fast \
    -o screening/
# Step 2: Rescore top 100 with hybrid
pandadock dock -r target.pdb -l top_100.sdf \
    --scoring hybrid \
    --num-poses 50 \
    -o final_ranking/
Affinity Prediction
When you need quantitative binding affinity estimates:
pandadock dock -r protein.pdb -l ligand.sdf \
    --scoring hybrid \
    --rescoring mmgbsa \
    --ensemble
Comparative SAR Studies
pandadock dock -r target.pdb -l sar_series.sdf \
    --algorithm enhanced_hierarchical_cpu \
    --scoring hybrid \
    --decompose-energy \
    -o sar_analysis/
Not Recommended For
- Large-Scale Virtual Screening (>5000 compounds)
Too slow; use empirical or physics-based instead
- Real-Time Applications
Latency too high for interactive use
- Novel Chemical Space
May not generalize well to very unusual scaffolds
- When Interpretability is Critical
ML component is less interpretable
Optimization Tips
Maximize Accuracy:
pandadock dock -r protein.pdb -l ligand.sdf \
    --algorithm enhanced_hierarchical_cpu \
    --scoring hybrid \
    --rescoring mmgbsa \
    --num-poses 100 \
    --ensemble \
    --visualize
Optimize Speed:
pandadock dock -r target.pdb -l ligands.sdf \
    --algorithm enhanced_hierarchical_gpu \
    --scoring hybrid \
    --gpu \
    --gpu-batch-size 1000
Hybrid Screening Workflow:
# Receptor (-r), ligand (-l), and output (-o) arguments are omitted for brevity
# Stage 1: Empirical (100k → 1k)
pandadock dock --scoring empirical --fast
# Stage 2: Physics-based (1k → 100)
pandadock dock --scoring physics_based
# Stage 3: Hybrid (100 → 20)
pandadock dock --scoring hybrid --rescoring mmgbsa
Output Format
Score Components
{
  "hybrid_score": -9.8,
  "components": {
    "physics_score": -8.5,
    "ml_score": -10.2,
    "weights": {
      "alpha": 0.4,
      "beta": 0.6
    }
  },
  "uncertainty": 0.8,
  "predicted_affinity": {
    "pKd": 8.2,
    "Ki_nM": 6.3
  }
}
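The two entries under `predicted_affinity` express the same quantity on different scales: pKd is the negative base-10 logarithm of the dissociation constant in mol/L. The conversion can be checked with a short sketch; the helper names are hypothetical, and reporting the value as `Ki_nM` assumes, as the sample output appears to, that Kd and Ki are treated interchangeably.

```python
import math


def pkd_to_nanomolar(pkd: float) -> float:
    """Convert pKd (-log10 of Kd in mol/L) to a concentration in nM."""
    return 10.0 ** (9.0 - pkd)


def nanomolar_to_pkd(concentration_nm: float) -> float:
    """Convert a dissociation constant in nM back to pKd."""
    if concentration_nm <= 0:
        raise ValueError("concentration must be positive")
    return 9.0 - math.log10(concentration_nm)


print(f"{pkd_to_nanomolar(8.2):.1f} nM")  # 6.3 nM
print(f"{nanomolar_to_pkd(6.3):.1f}")     # 8.2
```

The pKd of 8.2 in the sample output therefore corresponds to about 6.3 nM, matching the `Ki_nM` field.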
Uncertainty Quantification
The ML model provides uncertainty estimates:
Low uncertainty (<0.5): High confidence prediction
Medium uncertainty (0.5-1.0): Moderate confidence
High uncertainty (>1.0): Low confidence, novel chemical space
Use uncertainty to filter predictions:
# Keep only predictions with uncertainty of at most 0.8 (low to medium)
filter_results.py --max-uncertainty 0.8
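The same filtering logic can be sketched in Python for results that have already been loaded as dictionaries. This does not show how `filter_results.py` works internally; the function names, the `ligand` key, and the example records are hypothetical, while the thresholds are the ones listed above.

```python
from typing import Dict, List


def confidence_label(uncertainty: float) -> str:
    """Map an uncertainty value to the confidence bands listed above."""
    if uncertainty < 0.5:
        return "high"
    if uncertainty <= 1.0:
        return "moderate"
    return "low"


def filter_by_uncertainty(results: List[Dict], max_uncertainty: float) -> List[Dict]:
    """Keep only records whose uncertainty does not exceed the cutoff."""
    return [r for r in results if r["uncertainty"] <= max_uncertainty]


# Invented records shaped like the score output shown earlier.
results = [
    {"ligand": "lig_a", "hybrid_score": -9.8, "uncertainty": 0.8},
    {"ligand": "lig_b", "hybrid_score": -7.1, "uncertainty": 0.3},
    {"ligand": "lig_c", "hybrid_score": -10.4, "uncertainty": 1.4},
]

for record in filter_by_uncertainty(results, max_uncertainty=0.8):
    print(record["ligand"], confidence_label(record["uncertainty"]))
# lig_a moderate
# lig_b high
```

Note that the best-scoring ligand in this invented set is the one that gets filtered out, which is the situation the uncertainty estimate is meant to flag.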
Model Variants
Standard Hybrid Model
Default model: General-purpose, broad applicability
Training: PDBBind general set (15k complexes)
Accuracy: R = 0.91
Kinase-Specific Model
pandadock dock -r kinase.pdb -l ligands.sdf \
    --scoring hybrid \
    --ml-model kinase_specialized
Training: Kinase-focused dataset
Accuracy: R = 0.93 (for kinases)
GPCR-Specific Model
pandadock dock -r gpcr.pdb -l ligands.sdf \
    --scoring hybrid \
    --ml-model gpcr_specialized
Training: GPCR-focused dataset
Accuracy: R = 0.92 (for GPCRs)
Model Selection
# Auto-detect protein family and select model
pandadock dock -r protein.pdb -l ligands.sdf \
    --scoring hybrid \
    --ml-model auto
Validation and Benchmarking
Prospective Validation
Tested on CSAR 2012 benchmark (virtual screening):
Top 1% enrichment: 28-35x
Top 5% enrichment: 18-24x
AUC (ROC): 0.88-0.92
Best enrichment among all scoring functions
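The enrichment figures above follow the usual definition: the hit rate in the top-ranked fraction divided by the hit rate in the whole library. A minimal sketch is shown below. The function name is hypothetical, the data are invented, and ranking by ascending score (more negative is better) is an assumption based on the negative scores in the sample output.

```python
from typing import Sequence


def enrichment_factor(scores: Sequence[float], is_active: Sequence[bool],
                      top_fraction: float) -> float:
    """Hit rate in the top-ranked fraction divided by the overall hit rate."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("inputs must be non-empty and of equal length")
    total_actives = sum(1 for active in is_active if active)
    if total_actives == 0:
        raise ValueError("enrichment is undefined without any actives")
    # ASSUMPTION: lower (more negative) scores rank first.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    n_top = max(1, round(len(ranked) * top_fraction))
    hits_in_top = sum(1 for _, active in ranked[:n_top] if active)
    return (hits_in_top / n_top) / (total_actives / len(ranked))


# Invented library: 100 compounds, 10 actives, all actives ranked first.
scores = [float(i) for i in range(100)]
labels = [i < 10 for i in range(100)]
print(enrichment_factor(scores, labels, top_fraction=0.05))  # 10.0
```

The maximum possible enrichment depends on the fraction of actives in the library: with 10% actives, as in this toy set, no method can exceed 10x.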
Affinity Prediction
Correlation with experimental affinities:
| Protein Family | R | RMSE |
|---|---|---|
| Kinases | 0.93 | 1.28 |
| GPCRs | 0.92 | 1.35 |
| Proteases | 0.89 | 1.52 |
| Nuclear receptors | 0.91 | 1.41 |
Pose Dependence
Tested on redocking with varied RMSD:
Native pose (RMSD < 1 Å): R = 0.91
Near-native (RMSD 1-2 Å): R = 0.88
Moderate deviation (RMSD 2-3 Å): R = 0.82
Poor pose (RMSD > 3 Å): R = 0.65
Conclusion: Requires good docking pose for accurate affinity prediction
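To check which of these bands a docked pose falls into, the RMSD against a reference pose can be computed directly from matched atom coordinates. The sketch below is deliberately simple: it assumes both poses list the same atoms in the same order and in the same coordinate frame, with no alignment and no symmetry correction. The function names and coordinates are hypothetical.

```python
import math
from typing import Sequence, Tuple

Coordinate = Tuple[float, float, float]


def rmsd(pose_a: Sequence[Coordinate], pose_b: Sequence[Coordinate]) -> float:
    """Root mean square deviation in Å between two matched coordinate lists."""
    if len(pose_a) != len(pose_b) or not pose_a:
        raise ValueError("poses must be non-empty and have the same atom count")
    squared = sum(
        (ax - bx) ** 2 + (ay - by) ** 2 + (az - bz) ** 2
        for (ax, ay, az), (bx, by, bz) in zip(pose_a, pose_b)
    )
    return math.sqrt(squared / len(pose_a))


def pose_band(value: float) -> str:
    """Assign an RMSD value to the bands used in the list above."""
    if value < 1.0:
        return "native"
    if value <= 2.0:
        return "near-native"
    if value <= 3.0:
        return "moderate deviation"
    return "poor pose"


# Invented two-atom poses, shifted by 1.5 Å along z.
reference = [(0.0, 0.0, 0.0), (1.5, 0.0, 0.0)]
docked = [(0.0, 0.0, 1.5), (1.5, 0.0, 1.5)]
value = rmsd(reference, docked)
print(f"{value:.1f} Å -> {pose_band(value)}")  # 1.5 Å -> near-native
```

For symmetric ligands a plain matched-atom RMSD can overestimate the deviation, so a symmetry-aware tool is preferable in practice.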
Examples
Lead Optimization Workflow
# Dock and rank 50 analogs
pandadock dock -r target.pdb -l analogs_50.sdf \
    --algorithm enhanced_hierarchical_cpu \
    --scoring hybrid \
    --num-poses 50 \
    --decompose-energy \
    --visualize \
    -o lead_opt_results/
Multi-Stage Virtual Screening
# Stage 1: Empirical screening
pandadock dock -r target.pdb -l library_50k.sdf \
    --scoring empirical \
    --fast \
    -o stage1/
# Extract top 1000
# Stage 2: Physics-based rescoring
pandadock dock -r target.pdb -l top_1000.sdf \
    --scoring physics_based \
    -o stage2/
# Extract top 100
# Stage 3: Hybrid final ranking
pandadock dock -r target.pdb -l top_100.sdf \
    --scoring hybrid \
    --rescoring mmgbsa \
    --num-poses 50 \
    -o final_candidates/
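The "Extract top N" steps between stages are not spelled out in the workflow above. One possible way to do the selection, once each ligand's best score has been read into a dictionary, is sketched here. The function name, ligand identifiers, and scores are hypothetical, parsing of PandaDock's output files is not shown, and lower scores are assumed to be better.

```python
from typing import Dict, List


def top_n(scores_by_ligand: Dict[str, float], n: int) -> List[str]:
    """Return the n ligand names with the lowest (best) scores."""
    if n < 0:
        raise ValueError("n must be non-negative")
    ranked = sorted(scores_by_ligand.items(), key=lambda item: item[1])
    return [name for name, _ in ranked[:n]]


# Invented stage-1 scores for four ligands.
stage_scores = {
    "lig_001": -7.2,
    "lig_002": -9.1,
    "lig_003": -6.4,
    "lig_004": -8.8,
}
print(top_n(stage_scores, 2))  # ['lig_002', 'lig_004']
```

The selected names would then be used to write the ligand file passed to the next stage, such as `top_1000.sdf` or `top_100.sdf` in the commands above.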
High-Confidence Affinity Prediction
pandadock dock -r protein.pdb -l ligand.sdf \
    --algorithm enhanced_hierarchical_cpu \
    --scoring hybrid \
    --rescoring mmgbsa \
    --num-poses 100 \
    --ensemble \
    -o affinity_prediction/
Expected output: pKd ± 0.5 log units (roughly a 3-fold error in Ki)
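The link between half a log unit and a roughly 3-fold error is a one-line calculation, since an error of x log units corresponds to a factor of 10^x. The sketch below also shows the resulting Ki interval for the pKd of 8.2 used in the sample output; the function name is hypothetical.

```python
from typing import Tuple


def ki_range_nm(pkd: float, error_log_units: float) -> Tuple[float, float]:
    """Ki interval in nM implied by pKd +/- error_log_units."""
    tightest = 10.0 ** (9.0 - (pkd + error_log_units))
    weakest = 10.0 ** (9.0 - (pkd - error_log_units))
    return tightest, weakest


fold_error = 10.0 ** 0.5
low, high = ki_range_nm(8.2, 0.5)
print(f"{fold_error:.2f}-fold, {low:.1f}-{high:.1f} nM")  # 3.16-fold, 2.0-20.0 nM
```

A prediction of pKd 8.2 with this error therefore spans roughly 2 to 20 nM, an order of magnitude from end to end.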
See Also
Scoring Functions Overview
Physics-Based Scoring - Physics-based scoring component
Empirical Scoring - Fast empirical scoring
GPU Scoring Functions - GPU acceleration for hybrid scoring
Algorithm Selection Guide