Real World Applications of
Proteochemometric Modeling
The Design of Enzyme Inhibitors and
Ligands of G-Protein Coupled Receptors
Contents
 Our current approach to Proteochemometric Modeling
 Part I: PCM applied to non-nucleoside reverse
  transcriptase inhibitors and HIV mutants
 Part II: PCM applied to small molecules and the
  Adenosine receptors
 Conclusions
What is PCM?
  Proteochemometric modeling needs both a ligand
   descriptor and a target descriptor
  Descriptors need to be compatible with each other and
   need to be compatible with machine learning
   technique...
[Figure: diagram; only the label "Bio-Informatics" is recoverable]
GJP van Westen, JK Wegner et al. MedChemComm (2011),16-30, 10.1039/C0MD00165A
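As a rough illustration of the compatibility requirement, the sketch below folds a ligand descriptor and a target descriptor into fixed-length bit vectors and concatenates them into one input row for a learner. The function names, the bit lengths, and the modulo folding are assumptions made for this example and are not taken from the original work.

```python
def fold_features(features, n_bits):
    """Fold arbitrary integer feature IDs into a fixed-length 0/1 vector."""
    bits = [0] * n_bits
    for feature in features:
        bits[feature % n_bits] = 1
    return bits


def pcm_row(ligand_features, target_features, n_ligand_bits=16, n_target_bits=8):
    """Concatenate the folded ligand and target descriptors into one row."""
    ligand_bits = fold_features(ligand_features, n_ligand_bits)
    target_bits = fold_features(target_features, n_target_bits)
    return ligand_bits + target_bits


# Hypothetical feature IDs for one compound and one protein.
row = pcm_row([3, 19, 40], [1, 9])
print(len(row), sum(row))
```

Because both halves share the same 0/1 encoding, one row per compound-target pair can be handed to any learner that accepts numeric vectors.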
Ligand Descriptors
 Scitegic Circular Fingerprints
   Circular, substructure based
    fingerprints
   Maximal diameter of 3 bonds from
    central atom
   Each substructure is converted to a
    molecular feature
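A toy sketch of the circular fingerprint idea: every atom starts with an identifier for its element, and in each round the identifier is re-hashed together with those of its neighbours, so the described environment grows by one bond per round. This is not the Scitegic implementation; the CRC32 hash, the graph encoding, and the function name are assumptions for illustration only.

```python
import zlib


def circular_features(atoms, bonds, max_radius=3):
    """Return the set of hashed atom environments up to max_radius bonds."""
    neighbours = {i: [] for i in range(len(atoms))}
    for a, b in bonds:
        neighbours[a].append(b)
        neighbours[b].append(a)
    # Radius 0: each atom is identified by its element symbol only.
    ids = [zlib.crc32(symbol.encode("ascii")) for symbol in atoms]
    features = set(ids)
    for _ in range(max_radius):
        new_ids = []
        for i in range(len(atoms)):
            # Combine the atom's identifier with its sorted neighbour identifiers.
            env = [ids[i]] + sorted(ids[j] for j in neighbours[i])
            new_ids.append(zlib.crc32(repr(env).encode("ascii")))
        ids = new_ids
        features.update(ids)
    return features


# Heavy-atom graphs of ethanol (C-C-O) and propane (C-C-C) as toy molecules.
ethanol = circular_features(["C", "C", "O"], [(0, 1), (1, 2)])
propane = circular_features(["C", "C", "C"], [(0, 1), (1, 2)])
shared = ethanol & propane
print(len(ethanol), len(propane), len(shared))
```

The two molecules share the plain carbon feature, while environments that contain the oxygen yield features specific to ethanol.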
Target Descriptors
 Select binding site residues from full protein
  sequence
 Each unique hashed feature represents one
  amino acid type (comparable with circular
  fingerprints)
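The target descriptor can be sketched in the same spirit: residues are picked from the full sequence by position, and each amino acid type is hashed to an integer feature. The sequences, the positions, and the use of CRC32 below are hypothetical and only illustrate the principle.

```python
import zlib


def binding_site_residues(sequence, positions):
    """Pick the residues at the given 1-based positions from a full sequence."""
    return [sequence[p - 1] for p in positions]


def hashed_target_descriptor(sequence, positions):
    """Return one hashed integer feature per selected binding-site residue."""
    residues = binding_site_residues(sequence, positions)
    # CRC32 is a stand-in hash (assumption); identical amino acid types
    # always map to the same feature value.
    return [zlib.crc32(res.encode("ascii")) for res in residues]


# Toy data (hypothetical): a short "wild type" and a single point mutant.
wild_type = "MKVLAYKPGD"
mutant = "MKVLAYNPGD"  # K7N
site = [2, 4, 7, 9]    # hypothetical binding-site positions

wt_desc = hashed_target_descriptor(wild_type, site)
mut_desc = hashed_target_descriptor(mutant, site)
n_changed = sum(a != b for a, b in zip(wt_desc, mut_desc))
print(n_changed)
```

A single point mutation inside the selected site changes exactly one feature, which is what lets a model attribute activity differences to individual mutations.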
Machine Learning
 Using R-statistics as integrated with Pipeline Pilot
   Version 2.11.1 (64-bits)
 Sampled several machine learning techniques
   SVM
     Final method of choice
   PLS
   Random Forest
Real World Applications of PCM
 Part I: PCM of NNRTIs (analog series) on 14 mutants
   Output variable: pEC50
   Data set provided by Tibotec
   Prospectively validated
 Part II: PCM of small molecules on the Adenosine
  receptors
   Output variable: pKi
   ChEMBL_04 / StarLite
   Both human and rat data combined
   Prospectively validated
Part I: PCM applied to NNRTIs
Which inhibitor(s) show(s) the best activity spectrum and
can proceed in drug development?
 451 HIV Reverse Transcriptase (RT) inhibitors
 14 HIV RT sequences
   Between zero and 13 point mutations (at NNRTI
    binding site)
   Large differences in compound activity on
    different sequences

Sequence   Mean pEC50   StdDev pEC50   n
1 (wt)     8.3          0.6            451
2          6.9          0.7            259
3          7.6          0.6            444
4          7.5          0.7            443
5          7.4          0.8            429
6          6.0          0.6            316
7          6.5          0.6            99
8          6.9          0.7            147
9          8.3          0.6            222
10         7.9          0.7            252
11         7.5          0.7            257
12         8.0          0.6            242
13         7.4          0.8            244
14         8.2          0.8            220
Binding Site
 Selected binding site based on point mutations present
  in the different strains
 24 residues were selected
Used model to predict missing values
[Figure: compound-by-mutant activity matrix (rows: Compounds, columns: Mutants); panels: Original Dataset, Completed with model]
Prospective Validation
 Compounds have been experimentally validated
   Predictions where pEC50 differs by two standard
    deviations from the compound average
    (69 compound outliers)
   Predictions where pEC50 differs by two standard
    deviations from the sequence average
    (61 sequence outliers)
 Assay validation
[Figure: Completed with model]
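The outlier selection can be sketched as a two-standard-deviation filter per compound (the same function works per sequence). The data are invented, and whether the sample or the population standard deviation was used in the original work is not stated; the sample standard deviation is assumed here.

```python
from statistics import mean, stdev


def two_sd_outliers(values_by_key, n_sd=2.0):
    """Return (key, index) pairs lying more than n_sd sample SDs from the key's mean."""
    flagged = []
    for key, values in values_by_key.items():
        if len(values) < 2:
            continue  # a standard deviation needs at least two values
        mu, sd = mean(values), stdev(values)
        for idx, value in enumerate(values):
            if sd > 0 and abs(value - mu) > n_sd * sd:
                flagged.append((key, idx))
    return flagged


# Hypothetical predicted pEC50 values, one list per compound.
predictions = {
    "cmpd_A": [7.0] * 9 + [9.0],
    "cmpd_B": [6.0, 6.5, 7.0, 6.2, 6.8],
}
flagged = two_sd_outliers(predictions)
print(flagged)
```

Only the single 9.0 prediction for the hypothetical `cmpd_A` is flagged; such predictions are the interesting ones to test experimentally because a per-compound average could not have produced them.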
Prospective Validation
 Model:
   R0² = 0.69
   RMSE = 0.62 log units
 Assay Validation
   R0² = 0.88
   RMSE = 0.50 log units
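For reference, a minimal sketch of the two metrics. The slides do not define R0²; the code assumes the common reading as the coefficient of determination for a regression of observed on predicted values forced through the origin. The example values are invented.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error between observed and predicted values."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r0_squared(observed, predicted):
    """R0^2 for observed regressed on predicted through the origin (assumed definition)."""
    # Slope of the least-squares line through the origin.
    k = sum(o * p for o, p in zip(observed, predicted)) / sum(p * p for p in predicted)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


# Hypothetical pEC50 values with a constant over-prediction of 0.5 log units.
observed = [6.0, 7.0, 8.0, 9.0]
predicted = [6.5, 7.5, 8.5, 9.5]
print(rmse(observed, predicted), r0_squared(observed, predicted))
```

The two numbers answer different questions: RMSE reports the typical error in log units, while R0² reports how much of the observed variance the predictions explain.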
The Applicability Domain Concept Still
Holds in Target Space
 Prediction error shows a direct correlation with
  average sequence similarity to the training set
[Figure: R0² (left axis) and RMSE (right axis), both on a scale from -0.2 to 1, plotted against Average Sequence Similarity with Training Set (0.5 to 1)]
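One simple way to obtain the x-axis quantity is sketched below: the mean fraction of identical residues between a query binding site and every training binding site. Using plain identity as the similarity measure is an assumption, and the four-residue sites are invented.

```python
def sequence_identity(a, b):
    """Fraction of aligned positions carrying the same residue."""
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(x == y for x, y in zip(a, b)) / len(a)


def average_similarity_to_training(query, training):
    """Mean identity of one query sequence to all training sequences."""
    return sum(sequence_identity(query, t) for t in training) / len(training)


# Hypothetical aligned binding-site sequences.
training_sites = ["KVYL", "KVYF", "NVYL"]
similarity = average_similarity_to_training("KVYL", training_sites)
print(round(similarity, 3))
```

A query with low average similarity lies further from the training data in target space, where larger prediction errors are to be expected.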
Does PCM outperform scaling and QSAR?
 PCM outperforms QSAR models trained with identical
  descriptors on the same set
 When considering outliers, PCM outperforms scaling
 PCM can be applied to previously unseen mutants

Validation Experiment   Assay   PCM    pEC50 scaling   QSAR   10-NN (both)   10-NN (target)   10-NN (cmpd)
R0² (Full plot)         0.88    0.69   0.69            0.31   0.41           0.21             0.28
R0² (Outliers)          0.88    0.61   0.59            0.36   0.34           0.32             0.18
RMSE (Full plot)        0.50    0.62   0.57            0.96   0.90           1.29             1.16
RMSE (Outliers)         0.50    0.52   0.58            1.06   0.72           1.39             1.29
Model Interpretation (Sequences)
 Effect of mutation presence on compound pEC50
 High impact mutations are K101P, V179I and V179F
Model Interpretation (Compounds)
 Effect of substructure presence on compound pEC50
Model Interpretation (Compounds)
 Example of positively correlated substructure and
  negatively correlated substructure
Conclusions

 PCM can guide inhibitor design by predicting bioactivity
  profiles, as applied here to NNRTIs

 We have shown prospectively that the performance of
  PCM approaches assay reproducibility (RMSE 0.62 vs
  0.50)

 Interpretation allows selection between preferred
  chemical substructures and substructures to be avoided
Part II: PCM applied to the Adenosine
Receptors
 Model based on public data (ChEMBL_04)
 Included:
   Human receptor data
   (Historic) Rat receptor data
 Defined a single binding site (including extracellular loops, ELs)
   Based on crystal structure 3EML and translated selected
    residues through MSA to other receptors
 Looking for novel A2A receptor ligands taking SAR
  information from other adenosine receptor subtypes
  into account
Selected Binding Site
Adenosine Receptor Data Set
 Little overlap between species
 Validation set consists of 4556 decoys and 43 known
  actives

Receptor   Human   Rat    Overlap   Range (pKi)   External Validation   Decoy
A1         1635    2216   147       4.5 - 9.7     130                   1139
A2A        1526    2051   215       4.5 - 10.5    57                    1139
A2B        780     803    79        4.5 - 9.7     11                    1139
A3         1661    327    82        4.5 - 10.0    255                   1139
In-silico validation
 External validation on in-house compound collection
   Lower quality data set leads to less predictive model
   Inclusion of Rat data improves model (RMSE 0.82
    vs 0.87)
 Our final model is able to separate actives from
  decoys
   33 of the 43 known actives were in the top 50
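Counting known actives among the top-ranked compounds can be sketched as follows. The compound identifiers and predicted values are invented, and ranking by predicted pKi in descending order is an assumption about how the list was ordered.

```python
def actives_in_top(scores, active_ids, top_n=50):
    """Count known actives among the top_n compounds ranked by predicted score."""
    ranked = sorted(scores, key=scores.get, reverse=True)
    return sum(1 for compound_id in ranked[:top_n] if compound_id in active_ids)


# Hypothetical predicted pKi values for three actives (a*) and three decoys (d*).
scores = {"a1": 8.2, "d1": 5.1, "a2": 7.9, "d2": 8.0, "d3": 4.0, "a3": 5.5}
actives = {"a1", "a2", "a3"}
hits = actives_in_top(scores, actives, top_n=3)
print(hits)
```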
Prospective Validation
 Scanned ChemDiv supplier database (> 790,000 compounds)
 Selected 55 compounds with focus on diverse chemistry
   Compounds were tested in vitro
Conclusions

 We have found novel compounds active (in the
  nanomolar range) on the A2A receptor
   Hit rate ~11 %

 PCM models benefit from addition of similar targets from
  other species (RMSE improves from 0.87 to 0.82)

 PCM models can make robust predictions, even when
  trained on data from different labs
Further discussion
 Poster # 47 A. Hendriks, G.J.P. van Westen et al.
   Proteochemometric Modeling as a Tool to Predict Clinical
    Response to Antiretroviral Therapy Based on the Dominant
    Patient HIV Genotype

 Poster # 51 E.B. Lenselink, G.J.P. van Westen et al.
   A Global Class A GPCR Proteochemometric Model: A
    Prospective Validation

 Poster # 54 R.F. Swier, G.J.P. van Westen et al.
   3D-neighbourhood Protein Descriptors for Proteochemometric
    Modeling
Acknowledgements


   Prof. Ad IJzerman        Prof. Herman van Vlijmen
   Andreas Bender           Joerg Wegner
   Olaf van den Hoven       Anik Peeters
   Rianne van der Pijl      Peggy Geluykens
   Thea Mulder              Leen Kwanten
   Henk de Vries            Inge Vereycken
   Alwin Hendriks
   Bart Lenselink
   Remco Swier
9th ICCS Noordwijkerhout
Leave One Sequence Out
 By leaving out one sequence in training and validating a
  trained model on that sequence, model performance on
  novel mutants is emulated
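The splitting scheme can be sketched as a generator that holds out all data points of one sequence at a time. The record layout (dictionaries with `sequence`, `compound`, and `pEC50` keys) and the values are hypothetical.

```python
def leave_one_sequence_out(records):
    """Yield (held_out_sequence, train, test) once for every sequence in the data."""
    sequences = sorted({r["sequence"] for r in records})
    for held_out in sequences:
        train = [r for r in records if r["sequence"] != held_out]
        test = [r for r in records if r["sequence"] == held_out]
        yield held_out, train, test


# Hypothetical data points: one compound-sequence pair per record.
records = [
    {"sequence": 1, "compound": 79, "pEC50": 8.1},
    {"sequence": 1, "compound": 84, "pEC50": 6.0},
    {"sequence": 2, "compound": 79, "pEC50": 7.4},
    {"sequence": 3, "compound": 84, "pEC50": 5.2},
]
splits = list(leave_one_sequence_out(records))
for held_out, train, test in splits:
    print(held_out, len(train), len(test))
```

A model is trained on each `train` set and scored on the matching `test` set, so every sequence is treated once as if it were a novel mutant.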
Best performing compounds

Sequence   Compound with highest pEC50   Activity (pEC50)   Full Model (pEC50)   Difference (Activity and Model)
All        326                           8.39 (± 0.61)      8.53 (± 0.73)        0.14
1          365                           9.16               9.55                 0.39
2          221                           8.19               8.38                 0.19
3          79                            8.71               8.81                 0.10
4          321                           8.83               8.79                 0.04
5          321                           9.12               8.73                 0.39
6          221                           8.01               7.93                 0.08
7          364                           untested           7.50                 n/a
8          221                           untested           8.42                 n/a
9          365                           untested           9.43                 n/a
10         326                           untested           9.23                 n/a
11         151                           9.05               8.86                 0.19
12         321                           untested           9.29                 n/a
13         100                           9.06               8.87                 0.19
14         79                            9.51               9.62                 0.11
                                                            Average              0.18
Worst performing compounds

Sequence   Compound with lowest pEC50   Activity (pEC50)   Full Model (pEC50)   Difference (Activity and Model)
All        109                          5.85 (± 0.54)      5.82 (± 0.66)        0.03
1          248                          6.09               6.01                 0.08
2          109                          untested           4.87                 n/a
3          422                          untested           5.78                 n/a
4          84                           5.84               5.67                 0.17
5          84                           5.65               5.54                 0.11
6          109                          4.60               4.06                 0.54
7          439                          5.01               5.20                 0.19
8          84                           4.74               5.20                 0.46
9          248                          untested           5.96                 n/a
10         181                          5.82               6.01                 0.19
11         181                          5.42               5.61                 0.19
12         109                          5.90               6.09                 0.19
13         181                          5.11               5.29                 0.18
14         181                          5.62               5.81                 0.19
                                                           Average              0.21

9th ICCS Noordwijkerhout

  • 1. Real World Applications of Proteochemometric Modeling The Design of Enzyme Inhibitors and Ligands of G-Protein Coupled Receptors
  • 2. Contents: Our current approach to Proteochemometric Modeling; Part I: PCM applied to non-nucleoside reverse transcriptase inhibitors and HIV mutants; Part II: PCM applied to small molecules and the Adenosine receptors; Conclusions
  • 3. What is PCM? Proteochemometric modeling needs both a ligand descriptor and a target descriptor. Descriptors need to be compatible with each other and need to be compatible with machine learning technique... [Figure label: Bio-Informatics] Reference: GJP van Westen, JK Wegner et al. MedChemComm (2011), 16-30, 10.1039/C0MD00165A
  • 6. Ligand Descriptors: Scitegic circular fingerprints. Circular, substructure-based fingerprints; maximal diameter of 3 bonds from the central atom; each substructure is converted to a molecular feature.
  • 7. Target Descriptors: Select binding site residues from the full protein sequence. Each unique hashed feature represents one amino acid type (comparable with circular fingerprints).
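The hashed target descriptor on slide 7 can be illustrated with a minimal sketch. The slides do not give the hashing scheme, so everything specific here is an assumption: the feature key combines residue position and amino acid type, SHA-1 stands in for whatever hash was actually used, the 2**20 feature space is arbitrary, and both sequences are hypothetical.

```python
import hashlib


def residue_features(binding_site, n_bits=2**20):
    """Hash each (position, amino acid) pair of a binding site to a feature index.

    Assumption: position is part of the key, so the same amino acid at two
    different binding-site positions yields two different features.
    """
    features = set()
    for position, amino_acid in enumerate(binding_site):
        key = f"{position}:{amino_acid}".encode("ascii")
        digest = hashlib.sha1(key).digest()
        features.add(int.from_bytes(digest[:4], "big") % n_bits)
    return features


# Hypothetical 10-residue binding sites; the mutant differs at two positions.
wild_type = "LKKVYGPKYW"
mutant = "LKNVYGPKCW"

wt_bits = residue_features(wild_type)
mut_bits = residue_features(mutant)
shared = wt_bits & mut_bits  # features for the positions that are unchanged
```

Because the result is a set of integer features, it can be concatenated with a ligand fingerprint of the same kind, which is the compatibility requirement stated on slide 3.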
  • 8. Machine Learning: Using R-statistics as integrated with Pipeline Pilot, Version 2.11.1 (64-bits). Sampled several machine learning techniques: SVM (final method of choice), PLS, Random Forest.
  • 9. Real World Applications of PCM. Part I: PCM of NNRTIs (analog series) on 14 mutants; output variable: pEC50; data set provided by Tibotec; prospectively validated. Part II: PCM of small molecules on the Adenosine receptors; output variable: pKi; ChEMBL_04 / StarLite; both human and rat data combined; prospectively validated.
  • 10. Part I: PCM applied to NNRTIs. Which inhibitor(s) show(s) the best activity spectrum and can proceed in drug development? 451 HIV Reverse Transcriptase (RT) inhibitors; 14 HIV RT sequences; between zero and 13 point mutations (at NNRTI binding site); large differences in compound activity on different sequences.

      Sequence   Mean pEC50   StdDev pEC50   n
      1 (wt)     8.3          0.6            451
      2          6.9          0.7            259
      3          7.6          0.6            444
      4          7.5          0.7            443
      5          7.4          0.8            429
      6          6.0          0.6            316
      7          6.5          0.6            99
      8          6.9          0.7            147
      9          8.3          0.6            222
      10         7.9          0.7            252
      11         7.5          0.7            257
      12         8.0          0.6            242
      13         7.4          0.8            244
      14         8.2          0.8            220
  • 11. Binding Site: The binding site was selected based on the point mutations present in the different strains; 24 residues were selected.
  • 12. Used model to predict missing values. [Figure: compounds versus mutants matrix; panels: Original Dataset, Completed with model]
  • 13. Prospective Validation: Compounds have been experimentally validated: predictions where pEC50 differs two sd from the compound average (69 compound outliers); predictions where pEC50 differs two sd from the sequence average (61 sequence outliers); assay validation. [Figure label: Completed with model]
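The two-sd selection rule on slide 13 could be sketched as follows. This is an illustration only: the slides do not say how the averages and standard deviations were computed, so the sketch assumes they come from the measured pEC50 values of each compound (or, equivalently, of each sequence), and all identifiers and numbers are made up.

```python
from statistics import mean, stdev


def two_sd_outliers(measured_by_group, predictions):
    """Return the predictions lying more than two sd from their group's mean.

    measured_by_group: dict mapping a group id (compound or sequence) to its
                       measured pEC50 values (assumed source of mean and sd).
    predictions:       list of (group_id, predicted_pEC50) tuples.
    """
    flagged = []
    for group_id, predicted in predictions:
        measured = measured_by_group[group_id]
        if len(measured) < 2:
            continue  # a sample sd needs at least two measurements
        mu, sd = mean(measured), stdev(measured)
        if abs(predicted - mu) > 2 * sd:
            flagged.append((group_id, predicted))
    return flagged


# Hypothetical measurements and model predictions.
measured = {"cmpd_A": [7.0, 7.2, 6.8, 7.1], "cmpd_B": [8.0, 8.5, 7.5]}
predicted = [("cmpd_A", 5.9), ("cmpd_A", 7.05), ("cmpd_B", 8.2)]
outliers = two_sd_outliers(measured, predicted)
```

Grouping by compound gives the "compound outliers"; passing a dictionary keyed by sequence instead gives the "sequence outliers".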
  • 14. Prospective Validation. Model: R0² = 0.69, RMSE = 0.62 log units. Assay validation: R0² = 0.88, RMSE = 0.50 log units.
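For readers unfamiliar with the two metrics on slide 14, a minimal sketch follows. The slides do not define the through-origin R² (R0²); the version below is one common definition (observed regressed on predicted with the intercept forced to zero) and should be read as an assumption, not as the authors' exact formula.

```python
from math import sqrt


def rmse(observed, predicted):
    """Root mean squared error, in the units of the activity (log units here)."""
    n = len(observed)
    return sqrt(sum((o - p) ** 2 for o, p in zip(observed, predicted)) / n)


def r0_squared(observed, predicted):
    """R^2 of observed versus predicted for a fit forced through the origin.

    Assumed definition: slope k = sum(o*p) / sum(p*p), then
    R0^2 = 1 - sum((o - k*p)^2) / sum((o - mean(o))^2).
    """
    k = sum(o * p for o, p in zip(observed, predicted)) / sum(p * p for p in predicted)
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - k * p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1 - ss_res / ss_tot


# Hypothetical pEC50 values.
obs = [6.0, 7.0, 8.0]
pred = [6.2, 6.9, 8.1]
error = rmse(obs, pred)
fit = r0_squared(obs, pred)
```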
  • 15. The Applicability Domain Concept Still Holds in Target Space: Prediction error similarity shows a direct correlation with average sequence similarity to training set. [Figure: R0² and RMSE versus Average Sequence Similarity with Training Set, x-axis 0.5 to 1]
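The x-axis quantity of the applicability-domain plot, average sequence similarity with the training set, might be computed along these lines. The similarity measure is not specified in the slides; this sketch assumes the fraction of identical residues over aligned binding-site positions, and the four-residue sequences are hypothetical.

```python
def identity(seq_a, seq_b):
    """Fraction of positions with identical residues in two aligned sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    return sum(a == b for a, b in zip(seq_a, seq_b)) / len(seq_a)


def average_similarity(query, training_set):
    """Mean identity between a query sequence and every training sequence."""
    return sum(identity(query, t) for t in training_set) / len(training_set)


# Hypothetical aligned binding-site sequences.
training = ["ACDE", "ACDF", "ACGF"]
similarity = average_similarity("ACDE", training)  # identities: 1.0, 0.75, 0.5
```

A mutant with a low value sits far from the training sequences, which is where the slide reports larger prediction errors.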
  • 18. Does PCM outperform scaling and QSAR? PCM outperforms QSAR models trained with identical descriptors on the same set. When considering outliers, PCM outperforms scaling. PCM can be applied to previously unseen mutants.

      Metric             Assay validation experiment   PCM    pEC50 scaling   10-NN (both)   10-NN (target)   10-NN (cmpd)   QSAR
      R0² (Full plot)    0.88                          0.69   0.69            0.31           0.41             0.21           0.28
      R0² (Outliers)     0.88                          0.61   0.59            0.36           0.34             0.32           0.18
      RMSE (Full plot)   0.50                          0.62   0.57            0.96           0.90             1.29           1.16
      RMSE (Outliers)    0.50                          0.52   0.58            1.06           0.72             1.39           1.29
  • 19. Model Interpretation (Sequences): Effect of mutation presence on compound pEC50. High-impact mutations are K101P, V179I and V179F.
  • 20. Model Interpretation (Compounds) Effect of substructure presence on compound pEC50
  • 21. Model Interpretation (Compounds) Example of positively correlated substructure and negatively correlated substructure
  • 22. Conclusions: PCM can guide inhibitor design by predicting bioactivity profiles, as applied here to NNRTIs. We have shown prospectively that the performance of PCM approaches assay reproducibility (RMSE 0.62 vs 0.50). Interpretation allows selection between preferred chemical substructures and substructures to be avoided.
  • 23. Part II: PCM applied to the Adenosine Receptors. Model based on public data (ChEMBL_04). Included: Human receptor data (Historic) Rat receptor data. Defined a single binding site (including ELs), based on crystal structure 3EML, and translated selected residues through MSA to other receptors. Looking for novel A2A receptor ligands, taking SAR information from other adenosine receptor subtypes into account.
  • 25. Adenosine Receptor Data Set: Little overlap between species. Validation set consists of 4556 decoys and 43 known actives.

      Receptor   Human   Rat    Overlap   Range (pKi)   Decoy   External Validation
      A1         1635    2216   147       4.5 - 9.7     130     1139
      A2A        1526    2051   215       4.5 - 10.5    57      1139
      A2B        780     803    79        4.5 - 9.7     11      1139
      A3         1661    327    82        4.5 - 10.0    255     1139
  • 26. In-silico validation: External validation on in-house compound collection. Lower quality data set leads to less predictive model. Inclusion of rat data improves model (RMSE 0.82 vs 0.87). Our final model is able to separate actives from decoys: 33 of the 43 known actives were in the top 50.
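The "actives in the top 50" figure on slide 26 is a ranking check: sort the validation compounds by predicted affinity and count the known actives among the first k. A minimal sketch, with hypothetical predictions and labels:

```python
def actives_in_top_k(scored, k):
    """Count known actives among the k compounds with the highest predicted pKi.

    scored: list of (predicted_pKi, is_active) tuples.
    """
    ranked = sorted(scored, key=lambda item: item[0], reverse=True)
    return sum(1 for _, is_active in ranked[:k] if is_active)


# Hypothetical validation set: three actives hidden among decoys.
validation = [(8.1, True), (7.9, False), (7.5, True), (6.0, False), (5.5, True)]
hits = actives_in_top_k(validation, k=3)
```

With the numbers reported on the slide, 33 of 43 corresponds to roughly 77% of the known actives being ranked within the top 50.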
  • 27. Prospective Validation: Scanned ChemDiv supplier database (> 790,000 cmpds). Selected 55 compounds with focus on diverse chemistry. Compounds were tested in vitro.
  • 28. Conclusions: We have found novel compounds active (in the nanomolar range) on the A2A receptor; hit rate ~11 %. PCM models benefit from addition of similar targets from other species (RMSE improves from 0.87 to 0.82). PCM models can make robust predictions, even when trained on data from different labs.
  • 29. Further discussion. Poster # 47: A. Hendriks, G.J.P. van Westen et al., Proteochemometric Modeling as a Tool to Predict Clinical Response to Antiretroviral Therapy Based on the Dominant Patient HIV Genotype. Poster # 51: E.B. Lenselink, G.J.P. van Westen et al., A Global Class A GPCR Proteochemometric Model: A Prospective Validation. Poster # 54: R.F. Swier, G.J.P. van Westen et al., 3D-neighbourhood Protein Descriptors for Proteochemometric Modeling.
  • 30. Acknowledgements: Prof. Ad IJzerman, Prof. Herman van Vlijmen, Andreas Bender, Joerg Wegner, Olaf van den Hoven, Anik Peeters, Rianne van der Pijl, Peggy Geluykens, Thea Mulder, Leen Kwanten, Henk de Vries, Inge Vereycken, Alwin Hendriks, Bart Lenselink, Remco Swier
  • 33. Leave One Sequence Out: By leaving out one sequence in training and validating the trained model on that sequence, model performance on novel mutants is emulated.
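The leave-one-sequence-out scheme on slide 33 amounts to a grouped split in which all data points of one sequence are held out together. A minimal sketch, assuming records of the form (sequence id, compound id, pEC50); the record layout and values are hypothetical.

```python
def leave_one_sequence_out(records):
    """Yield (held_out_sequence, train, test) once per distinct sequence.

    records: list of (sequence_id, compound_id, pEC50) tuples.
    """
    sequence_ids = sorted({record[0] for record in records})
    for held_out in sequence_ids:
        train = [r for r in records if r[0] != held_out]
        test = [r for r in records if r[0] == held_out]
        yield held_out, train, test


# Hypothetical data: three sequences, two compounds.
data = [
    (1, "cmpd_A", 8.3), (1, "cmpd_B", 7.9),
    (2, "cmpd_A", 6.9), (2, "cmpd_B", 6.5),
    (3, "cmpd_A", 7.6),
]
splits = list(leave_one_sequence_out(data))
```

Each training set contains no measurement on the held-out sequence, so the test error approximates performance on a mutant the model has never seen.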
  • 34. Best performing compounds

      Sequence   Compound with highest pEC50   Activity (pEC50)   Full Model (pEC50)   Difference (Activity and Model)
      All        326                           8.39 (± 0.61)      8.53 (± 0.73)        0.14
      1          365                           9.16               9.55                 0.39
      2          221                           8.19               8.38                 0.19
      3          79                            8.71               8.81                 0.10
      4          321                           8.83               8.79                 0.04
      5          321                           9.12               8.73                 0.39
      6          221                           8.01               7.93                 0.08
      7          364                           untested           7.50                 n/a
      8          221                           untested           8.42                 n/a
      9          365                           untested           9.43                 n/a
      10         326                           untested           9.23                 n/a
      11         151                           9.05               8.86                 0.19
      12         321                           untested           9.29                 n/a
      13         100                           9.06               8.87                 0.19
      14         79                            9.51               9.62                 0.11
      Average                                                                          0.18

  • 35. Worst performing compounds

      Sequence   Compound with lowest pEC50   Activity (pEC50)   Full Model (pEC50)   Difference (Activity and Model)
      All        109                          5.85 (± 0.54)      5.82 (± 0.66)        0.03
      1          248                          6.09               6.01                 0.08
      2          109                          untested           4.87                 n/a
      3          422                          untested           5.78                 n/a
      4          84                           5.84               5.67                 0.17
      5          84                           5.65               5.54                 0.11
      6          109                          4.60               4.06                 0.54
      7          439                          5.01               5.20                 0.19
      8          84                           4.74               5.20                 0.46
      9          248                          untested           5.96                 n/a
      10         181                          5.82               6.01                 0.19
      11         181                          5.42               5.61                 0.19
      12         109                          5.90               6.09                 0.19
      13         181                          5.11               5.29                 0.18
      14         181                          5.62               5.81                 0.19
      Average                                                                         0.21