pandadock gnn - GNN Commands Reference
The pandadock gnn command group provides access to the SE(3)-equivariant
Graph Neural Network scoring function.
Synopsis
pandadock gnn COMMAND [OPTIONS]
Commands
download-model - Download pre-trained model (~82 MB)
train - Train GNN model on protein-ligand dataset
predict - Predict binding affinity for a complex
benchmark - Benchmark model performance on test set
compare - Compare GNN against baseline scoring methods
rescore - Universal rescorer for poses from ANY docking tool
pandadock gnn download-model
Download the official pre-trained PandaDock-GNN model from GitHub releases.
The model was trained on the combined ULVSH + PDBbind dataset (200 epochs) and achieves:
PDBbind Pearson R: 0.88
ULVSH Test Pearson R: 0.82
ULVSH Activity AUC: 0.94
Options:
-o, --output PATH    Output directory for the model. Default: models/
-v, --version TEXT   Model version to download. Default: latest
-f, --force          Overwrite existing model file
Example:
# Download to default location
pandadock gnn download-model
# Download to custom directory
pandadock gnn download-model -o /path/to/models/
# Force re-download
pandadock gnn download-model --force
Output:
The model is saved as pandadock_gnn_v3.pt in the output directory.
After downloading, use the model with:
pandadock gnn predict -m models/pandadock_gnn_v3.pt -p protein.mol2 -l ligand.mol2
pandadock gnn rescore -m models/pandadock_gnn_v3.pt -r protein.pdb -p poses.sdf
pandadock hybrid -r protein.pdb -l ligand.sdf -m models/pandadock_gnn_v3.pt --center X Y Z --box X Y Z
pandadock gnn train
Train the PandaDock-GNN model on a protein-ligand dataset.
Required Options:
-d, --dataset PATH        Path to ULVSH dataset directory
-o, --output PATH         Output directory for checkpoints and logs
Optional Options:
--epochs N                Number of training epochs. Default: 100
--batch-size N            Batch size. Default: 32
--lr FLOAT                Learning rate. Default: 1e-4
--hidden-dim N            Hidden dimension. Default: 256
--num-layers N            Number of EGNN layers. Default: 6
--dropout FLOAT           Dropout rate. Default: 0.1
--split [random|target]   Data split strategy. Default: random
--patience N              Early stopping patience. Default: 20
--gpu / --cpu             Use GPU if available. Default: --gpu
--seed N                  Random seed for reproducibility. Default: 42
Example:
pandadock gnn train -d ULVSH/ -o models/ --epochs 100
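The --split option selects between a random and a target split, but this reference does not define the two. A common reading, assumed here, is that a random split shuffles individual complexes, while a target split keeps every complex of a given protein target in the same partition, so the test targets are never seen during training. The sketch below illustrates that assumed behaviour on hypothetical records; the function names and the record layout are illustrative and are not PandaDock's implementation.

```python
import random
from collections import defaultdict


def random_split(records, test_fraction=0.2, seed=42):
    """Shuffle individual complexes and cut off a test fraction."""
    rng = random.Random(seed)
    shuffled = list(records)
    rng.shuffle(shuffled)
    n_test = max(1, int(len(shuffled) * test_fraction))
    return shuffled[n_test:], shuffled[:n_test]


def target_split(records, test_fraction=0.2, seed=42):
    """Assign whole protein targets to either train or test (assumed meaning)."""
    by_target = defaultdict(list)
    for rec in records:
        by_target[rec["target"]].append(rec)
    targets = sorted(by_target)
    rng = random.Random(seed)
    rng.shuffle(targets)
    n_test = max(1, int(len(targets) * test_fraction))
    test_targets = set(targets[:n_test])
    train = [r for r in records if r["target"] not in test_targets]
    test = [r for r in records if r["target"] in test_targets]
    return train, test


# Hypothetical records: 5 targets with 4 ligands each (not real ULVSH entries).
records = [
    {"target": f"T{t}", "ligand": f"L{t}_{i}"}
    for t in range(5)
    for i in range(4)
]
train, test = target_split(records)
rand_train, rand_test = random_split(records)
```

With a target split, a good test score indicates generalisation to unseen proteins, which is usually a harder and more realistic setting than a random split.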
pandadock gnn predict
Predict binding affinity for a protein-ligand complex.
Required Options:
-m, --model PATH     Path to trained model checkpoint
-p, --protein PATH   Protein file (MOL2 or PDB)
-l, --ligand PATH    Ligand file (MOL2 or SDF)
Optional Options:
-s, --site PATH      Optional binding site MOL2 file
-o, --output PATH    Output JSON file for results
Example:
pandadock gnn predict -m model.pt -p protein.mol2 -l ligand.mol2
Output:
Prediction Results:
pEC50: 6.234
Energy: -8.52 kcal/mol
Activity probability: 0.87
Predicted active: True
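The output reports both a pEC50 and an energy in kcal/mol. This reference does not state how the energy is derived. One standard thermodynamic convention, shown here purely as an illustration and not as PandaDock's documented method, converts a pK-style affinity to a free energy via ΔG = -RT ln(10) · pK. The function name and the temperature of 298.15 K are assumptions.

```python
import math

R_KCAL = 1.987204e-3  # gas constant in kcal/(mol*K)


def pk_to_delta_g(pk, temperature=298.15):
    """Convert a pK-style affinity to a free energy in kcal/mol.

    Assumes dG = -R * T * ln(10) * pK; strictly valid for equilibrium
    constants (pKd, pKi), only approximate for pEC50.
    """
    return -R_KCAL * temperature * math.log(10) * pk


delta_g = pk_to_delta_g(6.234)
print(f"{delta_g:.2f} kcal/mol")
```

For a pK of 6.234 this gives roughly -8.50 kcal/mol, close to but not identical to the -8.52 kcal/mol in the sample output, so the tool may use a slightly different constant or a different mapping.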
pandadock gnn benchmark
Benchmark GNN model performance on a test set.
Required Options:
-m, --model PATH           Path to trained model checkpoint
-d, --dataset PATH         Path to ULVSH dataset directory
-o, --output PATH          Output directory for results
Optional Options:
--split [train|val|test]   Dataset split to evaluate. Default: test
Example:
pandadock gnn benchmark -m model.pt -d ULVSH/ -o results/
Output:
Generates metrics.json with Pearson R, Spearman rho, RMSE, and MAE.
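For readers who want to sanity-check the reported numbers, the four metrics can be recomputed from paired experimental and predicted affinities. The sketch below uses only the standard library and hypothetical values; the function names and the dictionary keys are assumptions, not the actual layout of metrics.json.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def ranks(values):
    """1-based ranks; tied values share the average of their positions."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    result = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1
        for k in range(i, j + 1):
            result[order[k]] = avg
        i = j + 1
    return result


def spearman_rho(x, y):
    """Spearman correlation: Pearson correlation of the ranks."""
    return pearson_r(ranks(x), ranks(y))


def rmse(x, y):
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)) / len(x))


def mae(x, y):
    return sum(abs(a - b) for a, b in zip(x, y)) / len(x)


# Hypothetical experimental and predicted affinities (pK units).
experimental = [5.0, 6.0, 7.0, 8.0]
predicted = [5.5, 5.8, 7.4, 7.9]
metrics = {
    "pearson_r": pearson_r(experimental, predicted),
    "spearman_rho": spearman_rho(experimental, predicted),
    "rmse": rmse(experimental, predicted),
    "mae": mae(experimental, predicted),
}
```

Pearson R and Spearman rho measure how well predictions track the experimental ordering, while RMSE and MAE measure absolute error, so a model can rank well and still be poorly calibrated.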
pandadock gnn compare
Compare GNN performance against all baseline scoring methods from the ULVSH dataset.
Required Options:
-m, --model PATH               Path to trained model checkpoint
-d, --dataset PATH             Path to ULVSH dataset directory
-o, --output PATH              Output directory for comparison results
Optional Options:
--split [train|val|test|all]   Dataset split to evaluate. Default: test
Example:
pandadock gnn compare -m model.pt -d ULVSH/ -o comparison/
Output:
Generates:
comparison_results.csv - Metrics for all methods
comparison_results.json - JSON format
comparison_plot.png - Bar chart visualization
Example output:
COMPARISON RESULTS (sorted by Pearson R)
======================================================================
Method               Type              N      Pearson R
------------------------------------------------------------
>>> PandaDock-GNN    ML Scoring        942    0.6705 <<<
VM2                  ULVSH Baseline    942    0.1452
PM6                  ULVSH Baseline    939    0.0809
Hyde                 ULVSH Baseline    942    0.0178
...
PandaDock-GNN Rank: 1/9
*** PandaDock-GNN achieves BEST performance! ***
pandadock gnn rescore
Universal GNN rescorer for poses from any docking tool.
It rescores docked poses produced by any docking software (AutoDock Vina, Glide, GOLD, pandadock-flex, pandadock-metal, and others) with the SE(3)-equivariant GNN scoring function. The only requirements are a receptor file and a multi-conformer SDF file of poses.
Required Options:
-m, --model PATH       Path to trained model checkpoint
-r, --receptor PATH    Receptor file (PDB or MOL2)
-p, --poses PATH       Poses file (multi-conformer SDF from any docking tool)
Optional Options:
-o, --output PATH      Output CSV file with ranked poses. Default: rescored_poses.csv
--output-sdf PATH      Output SDF file with poses ranked by GNN score and GNN properties added
--site-radius FLOAT    Radius around ligand centroid to extract binding site (Angstrom). Default: 10.0
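The --site-radius option describes a geometric selection: take the ligand centroid and keep the protein atoms that fall within the given radius. A minimal sketch of that idea follows, using hypothetical coordinates. Whether PandaDock selects individual atoms or whole residues is not stated in this reference, so the per-atom selection and the function names are assumptions.

```python
import math


def centroid(coords):
    """Arithmetic mean of a list of (x, y, z) coordinates."""
    n = len(coords)
    return tuple(sum(c[i] for c in coords) / n for i in range(3))


def atoms_within_radius(protein_atoms, ligand_coords, radius=10.0):
    """Keep protein atoms whose distance to the ligand centroid is <= radius."""
    center = centroid(ligand_coords)
    return [
        atom for atom in protein_atoms
        if math.dist(atom["xyz"], center) <= radius
    ]


# Hypothetical ligand and protein coordinates in Angstrom.
ligand = [(10.0, 20.0, 30.0), (12.0, 20.0, 30.0)]  # centroid at (11, 20, 30)
protein = [
    {"name": "CA", "xyz": (11.0, 25.0, 30.0)},  # 5 Angstrom from the centroid
    {"name": "CB", "xyz": (11.0, 20.0, 45.0)},  # 15 Angstrom from the centroid
]
site = atoms_within_radius(protein, ligand, radius=10.0)
```

A larger radius gives the model more protein context at the cost of a bigger graph; a radius that is too small can cut off residues that contact an elongated ligand.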
Examples:
# Rescore poses from pandadock-flex
pandadock gnn rescore -m model.pt -r protein.pdb -p flex_poses.sdf
# Rescore AutoDock Vina output
pandadock gnn rescore -m model.pt -r receptor.pdb -p vina_out.sdf -o ranked.csv
# Get ranked SDF with GNN scores as properties
pandadock gnn rescore -m model.pt -r protein.pdb -p poses.sdf --output-sdf ranked.sdf
# Rescore Glide output
pandadock gnn rescore -m model.pt -r protein.pdb -p glide_poses.sdf -o glide_rescored.csv
Output CSV Format:
pose_name,pose_index,gnn_pKd,gnn_energy,activity_prob,predicted_active,gnn_rank
pose_3,3,7.234,-9.88,0.92,True,1
pose_1,1,6.891,-9.41,0.88,True,2
pose_5,5,6.543,-8.93,0.81,True,3
...
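Because the output is plain CSV, it is easy to post-process. The sketch below parses the columns shown above with the standard csv module, converts them to typed values, and picks the top-ranked pose. The sample rows are copied from this page; the function name load_ranked_poses is illustrative.

```python
import csv
import io

# Sample rows copied from the documented CSV format.
sample = """pose_name,pose_index,gnn_pKd,gnn_energy,activity_prob,predicted_active,gnn_rank
pose_3,3,7.234,-9.88,0.92,True,1
pose_1,1,6.891,-9.41,0.88,True,2
pose_5,5,6.543,-8.93,0.81,True,3
"""


def load_ranked_poses(handle):
    """Read rescoring results and return typed rows sorted by GNN rank."""
    rows = []
    for row in csv.DictReader(handle):
        rows.append({
            "pose_name": row["pose_name"],
            "pose_index": int(row["pose_index"]),
            "gnn_pKd": float(row["gnn_pKd"]),
            "gnn_energy": float(row["gnn_energy"]),
            "activity_prob": float(row["activity_prob"]),
            "predicted_active": row["predicted_active"] == "True",
            "gnn_rank": int(row["gnn_rank"]),
        })
    return sorted(rows, key=lambda r: r["gnn_rank"])


poses = load_ranked_poses(io.StringIO(sample))
best = poses[0]
print(best["pose_name"], best["gnn_pKd"])
```

To read a real results file, pass an open file handle instead of the in-memory sample, for example open("rescored_poses.csv", newline="").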
Output SDF Properties:
When --output-sdf is specified, each molecule in the output SDF will have:
GNN_pKd - Predicted pKd/pKi value
GNN_Energy - Predicted binding energy (kcal/mol)
GNN_Activity - Activity probability (0-1)
GNN_Rank - Rank based on GNN score (1 = best)
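These properties are stored as standard SDF data items, which appear after the "M  END" line of each record as a "> <NAME>" header followed by the value and a blank line. The sketch below reads them with a small standard-library parser; a cheminformatics toolkit would normally be the better choice. The sample record is hypothetical (an empty molecule block with values borrowed from the CSV example), and the function name is illustrative.

```python
import re

# Hypothetical single-record SDF with an empty molecule block.
sample_sdf = """pose_3
  hypothetical

  0  0  0  0  0  0  0  0  0  0999 V2000
M  END
>  <GNN_pKd>
7.234

>  <GNN_Energy>
-9.88

>  <GNN_Activity>
0.92

>  <GNN_Rank>
1

$$$$
"""


def read_sdf_properties(text):
    """Return one {property: value} dict per SDF record."""
    records, props = [], {}
    current, in_data = None, False
    for line in text.splitlines():
        if line.startswith("$$$$"):          # end of record
            records.append(props)
            props, current, in_data = {}, None, False
        elif line.startswith("M  END"):      # data items follow the molblock
            in_data = True
        elif in_data and line.startswith(">"):
            match = re.search(r"<([^>]+)>", line)
            current = match.group(1) if match else None
            if current is not None:
                props[current] = ""
        elif in_data and current is not None:
            if line.strip() == "":           # blank line closes the data item
                current = None
            elif props[current]:
                props[current] += "\n" + line
            else:
                props[current] = line
    return records


records = read_sdf_properties(sample_sdf)
print(records[0]["GNN_pKd"], records[0]["GNN_Rank"])
```

Values are returned as strings, so convert them with float() or int() before sorting or filtering.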
Workflow Example:
Combine with any docking tool:
# Step 1: Run flexible docking with pandadock-flex
pandadock-flex -r protein.pdb -l ligand.sdf --center 10 20 30 -o flex_output/
# Step 2: Rescore poses with GNN
pandadock gnn rescore -m model.pt -r protein.pdb -p flex_output/poses.sdf \
    -o flex_rescored.csv --output-sdf flex_rescored.sdf
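# Or with AutoDock Vina (Vina writes PDBQT, so convert the poses to SDF first, here with Open Babel)
vina --receptor receptor.pdbqt --ligand ligand.pdbqt --out vina_poses.pdbqt
obabel vina_poses.pdbqt -O vina_poses.sdf
pandadock gnn rescore -m model.pt -r receptor.pdb -p vina_poses.sdf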
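When poses from several docking tools need rescoring, the same command can be driven from a script. The sketch below builds the argument list from the options documented above and only executes it when run=True. The helper names and the "_rescored.csv" naming scheme are assumptions made for illustration.

```python
import shlex
import subprocess
from pathlib import Path


def rescore_command(model, receptor, poses, output_csv=None,
                    output_sdf=None, site_radius=None):
    """Build the argv list for `pandadock gnn rescore` from documented options."""
    cmd = ["pandadock", "gnn", "rescore",
           "-m", str(model), "-r", str(receptor), "-p", str(poses)]
    if output_csv is not None:
        cmd += ["-o", str(output_csv)]
    if output_sdf is not None:
        cmd += ["--output-sdf", str(output_sdf)]
    if site_radius is not None:
        cmd += ["--site-radius", str(site_radius)]
    return cmd


def rescore_all(model, receptor, pose_files, run=False):
    """Build (and optionally run) one rescoring command per pose file."""
    commands = []
    for poses in pose_files:
        poses = Path(poses)
        # Hypothetical naming scheme: poses.sdf -> poses_rescored.csv
        out_csv = poses.with_name(poses.stem + "_rescored.csv")
        cmd = rescore_command(model, receptor, poses, output_csv=out_csv)
        commands.append(cmd)
        if run:
            subprocess.run(cmd, check=True)  # raises if pandadock fails
    return commands


commands = rescore_all("model.pt", "protein.pdb",
                       ["flex_poses.sdf", "vina_out.sdf"])
for cmd in commands:
    print(shlex.join(cmd))
```

Passing the arguments as a list rather than a single shell string avoids quoting problems with paths that contain spaces.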
See Also
PandaDock-GNN Overview - GNN architecture documentation
GNN Training - Training guide
GNN Prediction - Prediction guide
Hybrid Docking - Hybrid docking workflow