
Optimization of Transition State
Structures using Evolutionary Algorithms
Dr. Lukasz Miroslaw
lukasz.miroslaw@uzh.ch

Organic Chemistry Institute
Grid Computing Competence Center
University of Zurich, Switzerland

Kim Baldridge’s group meeting, 27.09.2012


Table of contents

Introduction:
-  Objectives and motivations

-  Evolutionary Algorithms: main concept and key operators.

Model:
-  Definition

-  Results

Open Questions

Alternative Approaches

Introduction

Objective

Detection of transition state (TS) structures with effective methods.

Why?

Transition state structures are energy maxima along the minimum energy
path connecting the reactants and products of a chemical reaction.

  Difficult to detect experimentally, simulations are a must!

  … but simulations are computationally expensive.

A genetic algorithm (GA) by S. D. Bungay, R. A. Poirier and R. J. Charron
is one of the proposed methods to find TS.


Literature Review

Current approaches mentioned in the paper do not guarantee
convergence to TS structures:
- BFGS method, TS-BFGS, OC (Optimally Condensed)
are forced to keep the Hessian matrix positive
definite.
- The methods require a good initial guess (chemical
intuition).

GAs have been employed in energy minimization of molecular
clusters since 1995 (Mestres, Scuseria).


Evolutionary Algorithm

Individuals are the feasible
solutions to our problem.
They form a population that
'evolves' in time and adapts
to the environment.

The fitness function is a
measure of the adaptation.

Diversity is crucial. Extrema
and saddle points are found
more often than by
gradient searches.

Operators that drive the
evolution:
Selection, Reproduction (Recombination), Mutation.
Baldridge Group, group meeting

Hard vs. Soft Selection

Hard selection: the best
individuals always win.

Pros: local minima are
located easily.

Cons: crossing saddles is
almost impossible.

Soft selection: probability of
selection depends on the
fitness.

Pros: better saddle crossing.

Cons: Parameter-dependent
method.
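
The two selection schemes can be sketched as follows. This is a minimal illustration, not the presenter's implementation: hard selection is shown as truncation selection and soft selection as fitness-proportional (roulette-wheel) selection, and `toy_fitness` is a hypothetical 1-D fitness with a single peak at x = 3.

```python
import random


def hard_selection(population, fitness, k):
    """Truncation selection: the k fittest individuals always win."""
    return sorted(population, key=fitness, reverse=True)[:k]


def soft_selection(population, fitness, k, rng):
    """Fitness-proportional selection: weaker individuals can still be picked."""
    weights = [fitness(x) for x in population]  # assumes non-negative fitness
    return rng.choices(population, weights=weights, k=k)


def toy_fitness(x):
    # Hypothetical fitness (higher is better) with one peak at x = 3.
    return 1.0 / (1.0 + (x - 3.0) ** 2)


rng = random.Random(0)
population = [0.0, 1.0, 2.0, 3.0, 4.0, 5.0]
hard = hard_selection(population, toy_fitness, 2)
soft = soft_selection(population, toy_fitness, 2, rng)
```

Hard selection is deterministic for a given population, while soft selection depends on the random generator and on how the fitness values are scaled, which is one source of the parameter dependence listed under Cons.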


Cross-over

Recombination:

Mating process: two
parents create offspring.

The offspring consists of
the genetic material from
both parents.

Weaker offspring tend to
die out in time.

Goal: variation allows the
offspring to search out
different available niches and
find better fitness values,
ergo better solutions.
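
A minimal sketch of recombination on binary chromosomes, assuming one-point crossover (the slides do not say which crossover variant is used):

```python
import random


def one_point_crossover(parent_a, parent_b, rng):
    """Swap the tails of two equal-length bit strings at a random cut point."""
    if len(parent_a) != len(parent_b):
        raise ValueError("parents must have equal length")
    cut = rng.randrange(1, len(parent_a))  # cut strictly inside the string
    child_1 = parent_a[:cut] + parent_b[cut:]
    child_2 = parent_b[:cut] + parent_a[cut:]
    return child_1, child_2


rng = random.Random(1)
child_1, child_2 = one_point_crossover("00000000", "11111111", rng)
```

Each child carries a prefix from one parent and a suffix from the other, so both parents contribute genetic material.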


Mutation

Mutation occurs in
nature. Although it
occurs very infrequently,
many believe it is a
main driving force of
evolution. Mutation
often produces
a weaker individual.
Occasionally it
produces a
stronger one.
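
For binary chromosomes, mutation is commonly implemented as an independent bit flip with a small probability. The sketch below assumes that variant; the rate of 0.05 is an arbitrary illustrative value, not one taken from the talk.

```python
import random


def bit_flip_mutation(chromosome, rate, rng):
    """Flip each bit of a bit string independently with probability `rate`."""
    flipped = []
    for bit in chromosome:
        if rng.random() < rate:
            flipped.append("1" if bit == "0" else "0")
        else:
            flipped.append(bit)
    return "".join(flipped)


rng = random.Random(2)
original = "100000111101011000000"
mutant = bit_flip_mutation(original, 0.05, rng)
```

A low rate keeps most offspring close to their parents, matching the idea that mutation is rare but occasionally produces a stronger individual.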


Fitness Function

Points on the PES represent chemical
structures; coordinates are
represented by bond angles, bond
lengths, dihedral angles, etc.
Minima represent reactants,
products and intermediates in the
reaction.
First-order saddle points represent
transition state structures.
Higher-order saddle points
are of no chemical interest.

Goal: points with zero gradient
and one negative eigenvalue in
the Hessian matrix.
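
The goal criterion can be checked numerically. The sketch below is an illustration on a hypothetical two-dimensional surface (`toy_surface`, a double well with minima at (±1, 0) and a saddle at (0, 0)), using finite differences and the closed-form eigenvalues of a symmetric 2x2 matrix. A real PES has many more dimensions and would need a general eigenvalue solver.

```python
import math


def gradient(f, x, y, h=1e-5):
    """Central-difference gradient of f at (x, y)."""
    gx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    gy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return gx, gy


def hessian_eigenvalues(f, x, y, h=1e-4):
    """Eigenvalues of the finite-difference Hessian (symmetric 2x2 case)."""
    fxx = (f(x + h, y) - 2 * f(x, y) + f(x - h, y)) / h**2
    fyy = (f(x, y + h) - 2 * f(x, y) + f(x, y - h)) / h**2
    fxy = (f(x + h, y + h) - f(x + h, y - h)
           - f(x - h, y + h) + f(x - h, y - h)) / (4 * h**2)
    mean = (fxx + fyy) / 2
    radius = math.hypot((fxx - fyy) / 2, fxy)
    return mean - radius, mean + radius


def classify(f, x, y, tol=1e-4):
    """Label a point as minimum, first-order saddle, or other."""
    gx, gy = gradient(f, x, y)
    if math.hypot(gx, gy) > tol:
        return "not stationary"
    negative = sum(1 for ev in hessian_eigenvalues(f, x, y) if ev < -tol)
    if negative == 0:
        return "minimum"
    if negative == 1:
        return "first-order saddle"
    return "higher-order saddle or maximum"


def toy_surface(x, y):
    # Hypothetical double-well surface, not a real PES.
    return (x**2 - 1) ** 2 + y**2
```

On this surface, `classify(toy_surface, 0.0, 0.0)` reports a first-order saddle (the TS analogue), while the points (±1, 0) are reported as minima.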


Example: water

Z-MATRIX coordinates

H
O 1 OH
H 2 OH 1 OHO

OH = 1.08
OHO = 107.5

encoded as
1.08 * 1 000 000 = 1 080 000 = (100000111101011000000)_2
107.5 * 1 000 = 107 500 = (11010001111101100)_2
concatenated as (10000011110101100000011010001111101100)_2
Mutation, recombination and selection are applied to generate new
offspring until convergence.
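
The encoding can be reproduced with a few lines of Python. Note that the second bit string on the slide decodes to 107 500, so a scale factor of 1 000 is assumed here for the angle, while the bond length uses 1 000 000. The helper names `encode` and `decode` are illustrative.

```python
def encode(value, scale):
    """Scale a coordinate to an integer and return its binary string."""
    return format(round(value * scale), "b")


def decode(bits, scale):
    """Invert `encode` for a single coordinate."""
    return int(bits, 2) / scale


oh_bits = encode(1.08, 1_000_000)   # bond length OH
oho_bits = encode(107.5, 1_000)     # bond angle OHO (scale is an assumption)
chromosome = oh_bits + oho_bits     # concatenated chromosome
```

Decoding the concatenated chromosome requires knowing where each coordinate starts, here 21 bits for the bond length followed by 17 bits for the angle.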

Questions to the Authors

Results are promising, TS are in the same range (data not
shown) BUT:
Multiplicative or interval encoding of variables does not
preserve the accuracy. Example:

x = 0.23420111234, xnew = x * 10^acc = 234201.11234 (acc = 6)
xnew = 234201 → 111001001011011001
But 111001001011011001 → 234201, i.e. x = 0.234201: the remaining digits are lost!
The authors do not specify:
•  what "small perturbations" means,
•  whether the Gray coding was performed before or after concatenation.
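
Both points can be illustrated in a short sketch. The first part reproduces the accuracy loss of the multiplicative encoding with acc = 6. The second part shows the standard reflected binary Gray code, in which consecutive integers differ in exactly one bit. Whether the paper's authors used this exact Gray variant is an assumption.

```python
def to_gray(n):
    """Reflected binary Gray code of a non-negative integer."""
    return n ^ (n >> 1)


def from_gray(g):
    """Inverse of `to_gray`."""
    n = 0
    while g:
        n ^= g
        g >>= 1
    return n


x = 0.23420111234
acc = 6
x_int = int(x * 10**acc)             # fractional part is truncated
bits = format(x_int, "b")            # plain binary encoding
x_back = int(bits, 2) / 10**acc      # decoded value
loss = x - x_back                    # digits beyond `acc` are gone
gray_bits = format(to_gray(x_int), "b")
```

Applying Gray coding to each variable before concatenation keeps the one-bit-per-step property per coordinate. Applying it after concatenation treats the whole chromosome as a single integer, so the two orders generally give different bit strings.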

Sphere-based Reaction Path
Following

1. Optimize the geometry to get the ground state of the
analyzed system.
2. Calculate the vibrational modes VM in GAMESS.
dim(PES) = dim(VM)
3. Generate K spheres with different radii R_i, i = 1..K.
x^2 + y^2 + z^2 = R_i^2 (3D example)
4. Generate M sampling points uniformly distributed on
each sphere.
5. For each sphere, compute the energies at the sampling
points and find the local minima.
6. Connect the local minima on each sphere R_i to obtain a
reaction path.
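
Steps 3 to 6 can be sketched in three dimensions as below. Several simplifications are assumed: points are drawn at random (normalized Gaussian vectors, which are uniform in distribution) instead of forming a deterministic uniform mesh, only the single lowest-energy sample per sphere is kept rather than all local minima, and `toy_energy` is a hypothetical surface with a valley along +x standing in for energies computed by a quantum chemistry code.

```python
import math
import random


def sample_sphere(radius, m, dim, rng):
    """Draw m random points on a sphere of the given radius."""
    points = []
    for _ in range(m):
        v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
        norm = math.sqrt(sum(c * c for c in v))
        points.append([radius * c / norm for c in v])
    return points


def toy_energy(p):
    # Hypothetical energy with a valley along the +x axis.
    x, y, z = p
    return -x + 2.0 * (y * y + z * z)


def lowest_point_per_sphere(radii, m, energy, rng):
    """For each radius, keep the sampled point with the lowest energy."""
    path = []
    for r in radii:
        points = sample_sphere(r, m, 3, rng)
        path.append(min(points, key=energy))
    return path


rng = random.Random(4)
radii = [0.5, 1.0, 1.5]
path = lowest_point_per_sphere(radii, 500, toy_energy, rng)
```

Connecting the entries of `path` in order of increasing radius gives a piecewise-linear approximation of the path leaving the origin along the valley.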

Sphere-based Reaction Path
Following

Problems:

  How to set R and M?
  What is the best direction for generating new sampling
  points on sphere R_{i+1}?
  A uniform distribution is generated with Delaunay
  triangulation.
  High complexity, many parameters of unknown nature.

Idea: EA to locate local but meaningful
optima on each sphere.


Sphere based approach cont.

Let us define N individuals X_i = (R, M, E) and evolve them
using mutation and soft selection. E describes the
uniformity, e.g. the distance between the points.

For each sphere, calculate the energies of all sampling points
and build their histogram (distribution). The fitness
function promotes 'better' histograms.
Note: in the distributions, keep the information about the
position of all the sampling points.

Pick the histogram bin with the lowest energies and evolve the
system in the directions defined by the sampling points in
that bin.
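
A minimal sketch of the histogram step, assuming equal-width energy bins. Each bin stores the sampling points themselves, so the positions are kept as the note requires. How a 'better' histogram is scored is not defined on the slide and is left out here.

```python
def energy_histogram(points, energies, n_bins):
    """Bin sampling points by energy; each bin keeps the points it contains."""
    lo, hi = min(energies), max(energies)
    width = (hi - lo) / n_bins
    if width == 0.0:
        width = 1.0  # all energies equal: everything falls into bin 0
    bins = [[] for _ in range(n_bins)]
    for point, energy in zip(points, energies):
        index = min(int((energy - lo) / width), n_bins - 1)
        bins[index].append(point)
    return bins


def lowest_energy_directions(points, energies, n_bins):
    """Points in the lowest-energy bin, used as directions for the next step."""
    return energy_histogram(points, energies, n_bins)[0]


points = [(1, 0, 0), (0, 1, 0), (0, 0, 1), (-1, 0, 0)]
energies = [-1.0, 0.0, 0.1, 1.0]
bins = energy_histogram(points, energies, 4)
directions = lowest_energy_directions(points, energies, 4)
```

Because the first bin starts at the minimum sampled energy, it is never empty and always contains at least one direction.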

EA for reaction path following

Objective: given a system A, find better
and preferably stable ground states
(B, C, ...).

During the evolution the population
should move from A to B (and/or from A
to C) and cross the saddle.

Hypothesis: the reaction path is very
close to the saddle crossing path
(P).

[Figure: labels A, B and path P]



1. Optimize the geometry of the
system A.
2. Initialize the population
X_i^t = {x_1, x_2, x_3, x_4, ..., x_n} in
the vicinity of the ground state.

x_i is a conformer defined by
Z-matrix or Cartesian
coordinates:
x_i = {a_1, a_2, a_3, …, a_P}, x_i ∈ R^P

Fitness F(x_i) is the ENERGY,
F(x_i) ∈ R (a point on the PES)


Evolve until the higher peak (B) has been reached.

The population must cross the saddle. The path P obtained during
saddle crossing should be close to the reaction path. Analyze
only the vicinity of P in more detail, using
GRADIENT and HESSIAN calculations.
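
The idea can be sketched with a real-valued EA that uses only soft selection and Gaussian mutation. Everything specific here is an assumption made for illustration: `toy_fitness` is a hypothetical two-peak landscape (lower peak A at (-1, 0), higher peak B at (1, 0)) in place of a fitness derived from computed energies, and the population size, mutation width and number of generations are arbitrary. The centroid of the population in each generation is recorded as the path P.

```python
import math
import random


def toy_fitness(x):
    # Hypothetical landscape: lower peak A at (-1, 0), higher peak B at (1, 0).
    a = math.exp(-4.0 * ((x[0] + 1.0) ** 2 + x[1] ** 2))
    b = 2.0 * math.exp(-4.0 * ((x[0] - 1.0) ** 2 + x[1] ** 2))
    return a + b


def evolve(fitness, start, n, sigma, generations, rng):
    """Evolve n real-valued individuals; return final population and path P."""
    # Initialize the population in the vicinity of the starting state A.
    population = [[c + rng.gauss(0.0, sigma) for c in start] for _ in range(n)]
    path = []
    for _ in range(generations):
        weights = [fitness(x) for x in population]
        # Soft selection: parents are drawn proportionally to fitness.
        parents = rng.choices(population, weights=weights, k=n)
        # Mutation: each coordinate gets a small Gaussian perturbation.
        population = [[c + rng.gauss(0.0, sigma) for c in p] for p in parents]
        centroid = [sum(x[d] for x in population) / n for d in range(len(start))]
        path.append(centroid)
    return population, path


rng = random.Random(3)
population, path = evolve(toy_fitness, [-1.0, 0.0], n=10, sigma=0.2,
                          generations=200, rng=rng)
```

Whether and when the population crosses from A to B depends on the random seed and on the parameters, so a single run is not evidence either way. The recorded centroids are the candidate points around which more detailed gradient and Hessian analysis would be done.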


Does it Make Sense ?

Pros:
- Saddle crossing is in the nature of EA.
- Only energies are used to drive the evolution.
- Our multi-dimensional models show that small
  populations give very good results.
- Chromosomes are real values (not binary).

Cons:
- What if P is far from the reaction path?
- How to constrain mutations?
- How to validate the paths P?

Problems, Questions

How to translate Cartesian → internal coordinates?
How to generate conformers in a meaningful way?

Bond breaking, bond formation, bond lengths = covalent
atomic radii?

Is there a publication that shows the actual PES, even for
small molecules? Is the PES continuous?


