Empirical Scoring

The empirical scoring function is designed for high-throughput virtual screening. It uses statistical potentials derived from protein-ligand structure databases to evaluate binding poses quickly, trading some accuracy for speed.

Overview

Scoring ID: empirical

Type: Knowledge-based statistical scoring

Accuracy: R = 0.72 correlation with experimental binding affinities

Speed: 0.001-0.005 seconds per pose (10-50x faster than physics-based)

Best for: Virtual screening, large library docking, initial filtering, rapid pose evaluation

Algorithm

The empirical scoring function uses statistical potentials derived from known protein-ligand complexes:

\[S_{total} = S_{contact} + S_{lipophilic} + S_{hbond} + S_{metal} + S_{flexibility}\]

Scoring Components

  1. Contact Score

    \[S_{contact} = \sum_{i,j} w_{ij} \cdot f(d_{ij})\]
    • Atom-type pair potentials

    • Distance-dependent statistical preferences

    • Derived from observed contact frequencies in PDB

  2. Lipophilic Score

    • Hydrophobic-hydrophobic contact rewards

    • Surface complementarity bonus

    • Burial of hydrophobic surface area

  3. Hydrogen Bond Score

    • Geometry-independent H-bond detection

    • Fixed weight per hydrogen bond

    • Faster than physics-based H-bond evaluation

  4. Metal Coordination Score

    • Bonus for coordinating metal ions

    • Simple distance-based detection

    • Fixed weights per metal type

  5. Flexibility Penalty

    • Penalty for rotatable bonds

    • Accounts for conformational entropy loss

    • Simpler than torsional energy calculation
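The five terms listed above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not PandaDock's implementation: the pair weights, the Gaussian form of `f(d)`, the per-hydrogen-bond weight, the per-rotor penalty, and the contact cutoff are all hypothetical placeholder values.

```python
import math

# Hypothetical parameters for illustration only; real values would be fitted.
PAIR_WEIGHTS = {("C", "C"): -0.30, ("C", "N"): -0.10, ("N", "O"): -0.50}
HBOND_WEIGHT = -1.0    # assumed fixed reward per hydrogen bond
ROTOR_PENALTY = 0.3    # assumed penalty per rotatable bond
CONTACT_CUTOFF = 8.0   # assumed pair cutoff in angstroms


def distance_term(d, d0=4.0, width=1.5):
    """Assumed f(d): a Gaussian-shaped preference centred on d0."""
    return math.exp(-((d - d0) ** 2) / (2.0 * width ** 2))


def contact_score(ligand_atoms, protein_atoms):
    """S_contact: sum over ligand-protein atom pairs of w_ij * f(d_ij)."""
    total = 0.0
    for lig_type, lig_xyz in ligand_atoms:
        for prot_type, prot_xyz in protein_atoms:
            d = math.dist(lig_xyz, prot_xyz)
            if d > CONTACT_CUTOFF:
                continue
            # Pair weights are symmetric, so look them up by sorted type pair.
            key = tuple(sorted((lig_type, prot_type)))
            total += PAIR_WEIGHTS.get(key, 0.0) * distance_term(d)
    return total


def total_score(contact, lipophilic, n_hbonds, metal, n_rotatable):
    """S_total as the plain sum of the five components."""
    hbond = HBOND_WEIGHT * n_hbonds
    flexibility = ROTOR_PENALTY * n_rotatable
    return contact + lipophilic + hbond + metal + flexibility
```

Atoms are passed as `(atom_type, (x, y, z))` tuples. Because the hydrogen bond and flexibility terms are simple counts times a fixed weight, the cost per pose is dominated by the pair loop in `contact_score`.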

Training Data

Empirical parameters were optimized on:

  • PDBBind General Set: 10,000+ protein-ligand complexes

  • Refined Set: High-quality structures with experimental affinities

  • Diverse Set: Covering all protein families

  • Validation: CASF-2016, Astex Diverse Set

Usage

Basic Usage

pandadock dock -r protein.pdb -l ligand.sdf \
               --scoring empirical \
               --center 10 20 30 --box 20 20 20

Virtual Screening

pandadock dock -r target.pdb -l library_10k.sdf \
               --algorithm monte_carlo_cpu \
               --scoring empirical \
               --fast \
               --num-poses 3 \
               -o screening_results/

With GPU Acceleration

pandadock dock -r target.pdb -l library.sdf \
               --algorithm cuda_monte_carlo \
               --scoring empirical \
               --gpu \
               --gpu-batch-size 2000 \
               -o ultra_fast_screening/

Expected throughput: 5000-7200 ligands/hour

Performance Characteristics

Accuracy Benchmarks

Dataset        Correlation (R)  RMSE (kcal/mol)
PDBBind Core   0.72             2.35
CASF-2016      0.68             2.58
Astex Diverse  0.70             2.42

Note: Accuracy is lower than with physics-based scoring, but evaluation is 10-50x faster.

Speed Benchmarks

  • Small ligand (<20 atoms): 0.001-0.002 seconds/pose

  • Medium ligand (20-40 atoms): 0.002-0.003 seconds/pose

  • Large ligand (>40 atoms): 0.003-0.005 seconds/pose

Screening throughput:

  • CPU (monte_carlo_cpu): 200-400 ligands/hour

  • GPU (cuda_monte_carlo): 3600-7200 ligands/hour
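The throughput ranges above translate directly into wall-clock estimates for a library of a given size. The helper below is a small sketch of that arithmetic; the function name `screening_hours` is made up for illustration and is not part of PandaDock.

```python
def screening_hours(n_ligands, rate_low, rate_high):
    """Return (best_case, worst_case) runtime in hours for a throughput range."""
    if rate_low <= 0 or rate_high < rate_low:
        raise ValueError("expected 0 < rate_low <= rate_high")
    return n_ligands / rate_high, n_ligands / rate_low


# Throughput ranges quoted above, in ligands/hour, for a 10,000-compound library.
cpu_best, cpu_worst = screening_hours(10_000, 200, 400)
gpu_best, gpu_worst = screening_hours(10_000, 3600, 7200)
print(f"CPU: {cpu_best:.0f}-{cpu_worst:.0f} h")   # CPU: 25-50 h
print(f"GPU: {gpu_best:.1f}-{gpu_worst:.1f} h")   # GPU: 1.4-2.8 h
```

The same call with 100,000 ligands at the GPU rates gives roughly 14 to 28 hours.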

Pose Prediction Accuracy

  • RMSD < 2 Å: 80-85% (with monte_carlo algorithm)

  • RMSD < 2 Å: 88-92% (with enhanced_hierarchical algorithm)

  • Top pose RMSD < 2 Å: 65-75%

Pose prediction accuracy is lower than with physics-based scoring, but sufficient for filtering.
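For reference, the RMSD < 2 Å success criterion used above can be computed as follows. This is a minimal sketch that assumes both poses list the same atoms in the same order, are already in the receptor frame (no superposition), and need no symmetry correction; benchmark tooling usually handles those cases more carefully.

```python
import math


def rmsd(coords_a, coords_b):
    """RMSD in angstroms between two poses with identical atom ordering."""
    if len(coords_a) != len(coords_b) or not coords_a:
        raise ValueError("poses must have the same non-zero number of atoms")
    squared = sum(math.dist(a, b) ** 2 for a, b in zip(coords_a, coords_b))
    return math.sqrt(squared / len(coords_a))


def success_rate(rmsd_values, threshold=2.0):
    """Fraction of complexes whose pose RMSD is below the threshold."""
    if not rmsd_values:
        raise ValueError("need at least one RMSD value")
    return sum(r < threshold for r in rmsd_values) / len(rmsd_values)
```

A success rate of 80-85% then means that `success_rate` over the benchmark complexes returns a value between 0.80 and 0.85.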

Strengths and Limitations

Strengths

  • Ultra-Fast Evaluation: 10-50x faster than physics-based scoring

  • Good Pose Recognition: Can distinguish near-native from incorrect poses

  • Robust: Works across diverse protein families

  • Simple: Few parameters, easy to use

  • Parallelizes Well: Excellent GPU acceleration

Limitations

  • Lower Accuracy: R = 0.72 vs 0.85 for physics-based

  • Coarse Granularity: Less sensitive to subtle differences

  • No Energy Decomposition: Can’t analyze individual interaction contributions

  • Training Set Bias: May perform poorly on novel binding modes

  • No Solvation Model: Doesn’t explicitly account for desolvation

Best Practices

Optimization Tips

Maximize Throughput:

pandadock dock -r target.pdb -l library.sdf \
               --algorithm cuda_monte_carlo \
               --scoring empirical \
               --gpu \
               --gpu-batch-size 2000 \
               --fast \
               --num-poses 1

Target: 7000+ ligands/hour

Balance Speed and Accuracy:

pandadock dock -r target.pdb -l library.sdf \
               --algorithm monte_carlo_cpu \
               --scoring empirical \
               --num-poses 5 \
               --cpuworkers 16

Two-Stage Screening:

# Stage 1: Empirical screening (fast)
pandadock dock -r target.pdb -l library_100k.sdf \
               --scoring empirical \
               --fast \
               -o stage1/

# Extract top 1000 by score

# Stage 2: Rescore with physics-based
pandadock dock -r target.pdb -l top_1000.sdf \
               --scoring physics_based \
               --rescoring mmgbsa \
               -o stage2/

Output Format

Scoring Output

{
  "binding_score": -6.8,
  "components": {
    "contact_score": -8.5,
    "lipophilic_score": -2.3,
    "hbond_score": -3.2,
    "metal_score": 0.0,
    "flexibility_penalty": 1.8
  }
}

Note: Empirical scores are nominally unitless, but they are calibrated to approximate binding free energies in kcal/mol.

Ranking Output

Rank  Ligand_ID      Score    RMSD
1     compound_1523  -8.5     1.2
2     compound_0942  -8.2     0.8
3     compound_2341  -7.9     1.5
...
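A ranking table like the one above is easy to post-process, for example to pick the compounds that go into a second screening stage. The sketch below assumes the table is whitespace-delimited with the columns shown (Rank, Ligand_ID, Score, RMSD) and that more negative scores are better; `top_ligands` is a hypothetical helper, not a PandaDock function.

```python
def top_ligands(ranking_text, n):
    """Return the n best Ligand_IDs from a whitespace-delimited ranking table."""
    rows = []
    for line in ranking_text.splitlines():
        fields = line.split()
        # Skip the header, blank lines, and anything that is not a data row.
        if len(fields) < 3 or not fields[0].isdigit():
            continue
        rows.append((float(fields[2]), fields[1]))
    rows.sort()  # most negative (best) score first
    return [ligand_id for _, ligand_id in rows[:n]]


example = """Rank  Ligand_ID      Score    RMSD
1     compound_1523  -8.5     1.2
2     compound_0942  -8.2     0.8
3     compound_2341  -7.9     1.5
"""
print(top_ligands(example, 2))  # ['compound_1523', 'compound_0942']
```

Sorting by score rather than trusting the Rank column keeps the helper usable on tables that have been concatenated from several runs.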

Comparison with Other Scoring Functions

vs Physics-Based

Aspects compared (Empirical vs Physics-Based): Speed, Accuracy, Interpretability, Throughput

Choose empirical when: Speed is paramount, screening large libraries

Choose physics-based when: Accuracy matters, need energy decomposition

vs Hybrid Scoring

Aspects compared (Empirical vs Hybrid): Speed, Accuracy, Setup

Choose empirical when: Ultra-fast initial screening

Choose hybrid when: Final ranking and lead optimization

Examples

Ultra-Fast Virtual Screening

# Screen 100,000 compound library
pandadock dock -r kinase.pdb -l library_100k.sdf \
               --algorithm cuda_monte_carlo \
               --scoring empirical \
               --gpu \
               --fast \
               --num-poses 1 \
               -o empirical_screening/

Expected runtime: 14-28 hours (GPU). Output: top-scoring compounds.

Fragment Library Screening

pandadock dock -r protein.pdb -l fragments_5k.sdf \
               --algorithm monte_carlo_cpu \
               --scoring empirical \
               --fast \
               --num-poses 3 \
               --cpuworkers 8 \
               -o fragment_hits/

Two-Stage High-Throughput Screening

# Stage 1: Rapid empirical filter (10,000 → 500)
pandadock dock -r target.pdb -l library_10k.sdf \
               --scoring empirical \
               --fast \
               --num-poses 1 \
               -o stage1_empirical/

# Extract top 500 compounds by empirical score

# Stage 2: Detailed physics-based rescoring (500 → 50)
pandadock dock -r target.pdb -l top_500.sdf \
               --scoring physics_based \
               --num-poses 20 \
               -o stage2_physics/

# Extract top 50 for experimental validation

Expected Workflow Results

Input: 10,000 compound library

Stage 1 (empirical):

  • Time: 25-50 hours (CPU) or 1.5-3 hours (GPU)

  • Output: Ranked list, select top 500

Stage 2 (physics-based):

  • Time: 1-2 hours (top 500)

  • Output: Refined ranking, select top 50

Stage 3 (experimental):

  • Test top 50 compounds

  • Expected hit rate: 10-30% (5-15 active compounds)

Validation Studies

Enrichment Performance

Tested on DUD-E (Database of Useful Decoys: Enhanced):

  • Top 1% enrichment: 12-18x

  • Top 5% enrichment: 8-12x

  • AUC (ROC): 0.72-0.78

Conclusion: Good enrichment for initial filtering, not optimal for final ranking
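The enrichment factors quoted above follow the usual definition: the fraction of actives found in the top-ranked slice divided by the fraction of actives in the whole library. A minimal sketch, assuming lower scores rank first and that each compound carries a boolean active/decoy label (the function name is illustrative only):

```python
def enrichment_factor(scores, is_active, top_fraction):
    """EF = (active fraction in the top slice) / (active fraction overall)."""
    if len(scores) != len(is_active) or not scores:
        raise ValueError("scores and labels must be non-empty and equal length")
    n_total = len(scores)
    n_actives = sum(is_active)
    if n_actives == 0:
        raise ValueError("need at least one active compound")
    n_top = max(1, round(n_total * top_fraction))
    # Lower (more negative) scores rank first.
    ranked = sorted(zip(scores, is_active), key=lambda pair: pair[0])
    hits = sum(active for _, active in ranked[:n_top])
    return (hits / n_top) / (n_actives / n_total)
```

An enrichment of 12-18x at 1% therefore means the top 1% of the ranked list contains 12 to 18 times more actives than a random 1% sample would.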

Pose Reproduction

Tested on Astex Diverse Set (85 complexes):

  • Success rate (RMSD < 2 Å): 80-85%

  • Top pose success: 65-75%

Conclusion: Adequate pose recognition for screening

See Also