## Monday, April 30, 2012

### Computing properties with molecular dynamics

Within statistical mechanics a molecular property $X$ is computed by $$\left<{X}\right>=\sum^{states}_i X_i p_i$$where $X_i$ is the value of $X$ for energy state $i$ and $p_i$ is the probability of being in energy state $i$ with energy $E_i$:$$p_i=\frac{e^{-E_i/kT}}{\sum_i e^{-E_i/kT}}$$Within molecular dynamics (MD) the corresponding property is computed by$$\left<{X}\right>=\frac{1}{M}\sum^M_i X(t_i)$$where $M$ is the number of time-steps and $X(t_i)$ is the value of $X$ at time $t_i$.
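The two averages above can be compared in a few lines of Python. This is only a minimal sketch: the function names, the default value of `k`, and the energy shift used for numerical stability are illustrative assumptions, not part of the lecture notes.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K (pass k=1.0 for reduced units)


def boltzmann_average(values, energies, temperature, k=K_B):
    """Ensemble average <X> = sum_i X_i p_i with p_i proportional to exp(-E_i/kT)."""
    # Shifting all energies by the minimum leaves p_i unchanged
    # but avoids overflow/underflow in the exponentials.
    e_min = min(energies)
    weights = [math.exp(-(e - e_min) / (k * temperature)) for e in energies]
    partition = sum(weights)
    return sum(x * w for x, w in zip(values, weights)) / partition


def time_average(samples):
    """MD estimate <X> = (1/M) sum_i X(t_i) over the M stored time steps."""
    return sum(samples) / len(samples)
```

For a sufficiently long, well-equilibrated trajectory the two functions should give the same answer for the same property; the MD average simply lets the dynamics visit the states with the right frequency instead of weighting them explicitly.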

For example, the average energy (also called the internal energy $U$) is given by$$\left<{E}\right>=\frac{1}{M}\sum^M_i E(t_i) \text{ where }E(t_i)=\sum^N_k \frac{1}{2}mv^2_k(t_i)+\sum^N_{k,l}V(r_{kl}(t_i))$$where $N$ is the number of particles.  Similarly for the temperature $T$, which follows from the equipartition of the kinetic energy over $3N-3$ degrees of freedom:$$T(t_i)=\frac{2}{k(3N-3)}\sum^N_k \frac{1}{2}mv^2_k(t_i)$$The two most common MD simulations are constant $E$ and constant $T$ simulations: if the step-size is sufficiently small then energy is conserved and $E$ will be constant but $T$ will fluctuate.  Alternatively, one can ensure that $T$ is constant (but $E$ fluctuates) by scaling the velocities every so often:$$v_k=\lambda v_k \text{ where }\lambda=\sqrt{\frac{T}{T(t_i)}}$$and $T$ is the desired temperature.  The heat capacity at constant volume ($C_V$) and the pressure ($P$) are computed from, respectively: $$C_V(t_i)=\frac{(E(t_i)-\left<E\right>)^2}{kT^2}$$ and $$P(t_i)=\frac{NkT}{V}-\frac{1}{3V}\sum_{k,l}r_{kl}(t_i)F_{kl}(t_i)$$ where $F_{kl}$ is the force between particle $k$ and $l$.
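The temperature, velocity scaling and heat capacity can be sketched as follows. The code assumes the equipartition relation $\sum_k \frac{1}{2}mv_k^2=\frac{1}{2}(3N-3)kT$ for the instantaneous temperature, reduced units with `k = 1.0` by default, and velocities stored as `(vx, vy, vz)` tuples; all function names are made up for illustration.

```python
import math


def kinetic_energy(masses, velocities):
    """Sum of (1/2) m v^2 over all particles."""
    return sum(0.5 * m * (vx * vx + vy * vy + vz * vz)
               for m, (vx, vy, vz) in zip(masses, velocities))


def temperature(masses, velocities, k=1.0):
    """Instantaneous temperature from equipartition over 3N-3 degrees of freedom."""
    n = len(masses)
    return 2.0 * kinetic_energy(masses, velocities) / (k * (3 * n - 3))


def rescale_velocities(masses, velocities, target_temperature, k=1.0):
    """Scale every velocity by lambda = sqrt(T_target / T_current)."""
    lam = math.sqrt(target_temperature / temperature(masses, velocities, k))
    return [(lam * vx, lam * vy, lam * vz) for vx, vy, vz in velocities]


def heat_capacity_from_fluctuations(energies, temp, k=1.0):
    """C_V as the trajectory average of (E(t_i) - <E>)^2 / (k T^2)."""
    mean_e = sum(energies) / len(energies)
    variance = sum((e - mean_e) ** 2 for e in energies) / len(energies)
    return variance / (k * temp ** 2)
```

Because the kinetic energy is quadratic in the velocities, a single call to `rescale_velocities` brings the instantaneous temperature exactly to the target value; in a simulation it would be called every so often rather than at every step.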

In principle, the free energy can also be calculated as an average:$$A\propto kT\ln\left(\frac{1}{M}\sum_i^Me^{+E(t_i)/kT}\right)$$but this is not practically feasible because this expression is dominated by high energies, which are rarely sampled.  Instead, free energy differences are computed directly from the probabilities.  For example, to compute the free energy difference for the binding equilibrium of $X$ and $Y$:$$X\bullet Y\leftrightharpoons X+Y$$one computes the amount of time $X$ and $Y$ are bound and unbound and computes $$\Delta G=-RT\ln\left(\frac{\text{time unbound}}{\text{time bound}}\right)$$
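As a rough illustration of the last formula, the sketch below counts bound and unbound frames in a trajectory. It assumes equally spaced time steps, so that counting frames is the same as measuring time, and that some (hypothetical) criterion has already labelled each frame as bound or not; the function name and the choice of kJ/mol are also assumptions.

```python
import math

R = 8.314462618e-3  # gas constant in kJ/(mol K)


def free_energy_from_bound_flags(bound_flags, temperature):
    """Delta G = -RT ln(time unbound / time bound) for X.Y <=> X + Y.

    bound_flags holds one boolean per stored time step (True = bound).
    """
    n_bound = sum(1 for is_bound in bound_flags if is_bound)
    n_unbound = len(bound_flags) - n_bound
    if n_bound == 0 or n_unbound == 0:
        # The logarithm is undefined if one of the states was never sampled.
        raise ValueError("both bound and unbound states must be sampled")
    return -R * temperature * math.log(n_unbound / n_bound)
```

A complex that is bound three quarters of the time at 300 K gives a positive $\Delta G$ of a few kJ/mol for the dissociation as written. The estimate is only meaningful if the simulation is long enough to see many binding and unbinding events.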

## Saturday, April 28, 2012

### Brief introduction to molecular dynamics (in Danish)

These videos represent two lectures from my course Molecular Statistics, which give a brief introduction to Lennard-Jones potentials and molecular dynamics.  The lectures are in Danish.  The lecture notes can be found here.

a. Modeling gases and liquids
b. Newton's equation of motion underlies molecular dynamics-part 1
c. The Lennard-Jones Potential
d. Newton's equation of motion underlies molecular dynamics-part 2
e. Initialization
f. Computing the force and taking the step
g. The step size (The video contains a small mistake which is corrected here)
h. Equilibration and constant E vs constant T simulations

Technical details
The first video was made with ScreenFlow and Molecular Workbench
The remaining videos were made with an iPad, Apple headphones, the Explain Everything app, and a Bamboo Stylus.

## Thursday, April 19, 2012

### Student Helper for Protein Design

The protein design department at Novozymes in Bagsværd is looking for 1-2 student helpers (8 hours per week) starting 1st of October 2012.
The successful candidates will assist the Protein Design staff in developing computational tools for protein engineering and data analysis. Candidates should therefore have experience programming in scripting languages such as Python, Perl or R, and have knowledge of Linux as a working environment.

The department performs theoretical analyses of protein structures and biophysical experimental data, and knowledge of or an interest in this area will therefore be an asset. However, we will consider candidates at all levels and are particularly interested in candidates who are at the beginning of their university studies and thus will be able to work with us for a number of years.

Apply to Jens Erik Nielsen (JEIN@novozymes.com) with a CV including grades and a personal statement describing why you are a good candidate for the job.

## Wednesday, April 11, 2012

### Reviews of second PLoS ONE submission

The review of the PLoS ONE paper we submitted March 15 came back April 5th and can be found below.  Very fast!  (Our first PLoS ONE submission, which we submitted February 23, is still under review.  The editor tells us he had a hard time finding people to review it).

In Denmark April 5, 6, and 9 are holidays so this is my first real chance to think about the reviews and here are my gut reactions (I'll post a link to our formal response here later).

1. Yes, given the fact that this appears to be the first paper on high-throughput computational estimation of barrier heights for enzymatic reactions, the paper is by definition proof-of-principle.

We do state on page 13 that "The construction and optimization of mutant structures as well as the generation of energy profiles is automated so that the barrier height of one mutant can be estimated in ca. 24 hours on around 10 processors once the WT reaction path has been constructed."   However, given how central this is to the paper, we probably should expand on this a bit.

2. Well, as we mention, this is the subject of a paper we are currently writing and many of the experiments are not done yet.  However, it's hard to see how the development and testing of a rather complex and novel computational strategy and the description and structural rationalization of activity data for dozens of mutants can be squeezed into the same paper.

3. This is discussed at the end of page 8, where we note that this probably has to do with the lowest minimum of the enzyme-substrate complex not being found.  However, since the binding mode is very similar in all mutants, this is likely a systematic error that does not affect the relative barriers significantly.  We should add a discussion of this in the manuscript.

4. General remark: true, but I don't see how to include all these effects in a high-throughput screening method, with the exception of using a continuum solvation model (we should re-evaluate some of the computed barriers using COSMO single points along the reaction path).  However, binding affinity changes are generally not considered in current state-of-the-art computational studies of enzyme catalysis and good results are often obtained without considering protein dynamics.  The fact that we are interested in relative barriers should make these issues even less important.  But, true, no reason not to point this out in the manuscript.   We do state on page 2:
"In order to make the method computationally feasible, relatively approximate treatments of the wave function, structural model, dynamics and reaction path are used. Given this and the automated setup of calculations, some inaccurate results will be unavoidable. However, the intend of the method is similar to experimental high through-put screens of enzyme activity where, for example, negative results may result from issues unrelated to the intrinsic activity of the enzyme such as imperfections in the activity assay, low expression yield, protein aggregation, etc. Just like its experimental counterpart our technique is intended to identify potentially interesting mutants for further study."
With regard to the structural model being arbitrary: we did test three different structural models (Figure 4) so I can't agree with that.

5. There is nothing in the method used to compute the barriers (i.e. no atom-specific force field parameters or things like that) that is specific to this enzyme or even enzymes in general.

-------------
PONE-D-12-07445
A Computational Methodology to Screen Activities of Enzyme Variants
PLoS ONE

Dear Dr Jensen,

Thank you for submitting your manuscript to PLoS ONE. After careful consideration, we feel that it has merit, but is not suitable for publication as it currently stands. Therefore, my decision is "Major Revision."

We invite you to submit a revised version of the manuscript that addresses the points below:

1. The manuscript seems to be more a proof of principle than a real validation of efficiency and performance. Moreover, it is not completely clear if the protocol is fully automatic or if it requires lots of manual intervention.

2. There is no experimental verification of the predictions.

3. The calculated barriers are significantly too low, suggesting that the methods used are not adequate for reliable predictions.

4. There is no consideration of effects of binding affinity changes, nor effects of protein dynamics, nor solvation. The choice of molecular model appears somewhat arbitrary.

5. It is not apparent that these methods will be generally applicable to other enzymes. More evidence and testing is required.

We encourage you to submit your revision within sixty days of the date of this decision.

When your files are ready, please submit your revision by logging on to http://pone.edmgr.com/ and following the Submissions Needing Revision link. Do not submit a revised manuscript as a new submission.

Please also include a rebuttal letter that responds to each point brought up by the academic editor and reviewer(s). This letter should be uploaded as a Response to Reviewers file.

In addition, please provide a marked-up copy of the changes made from the previous article file as a Manuscript with Tracked Changes file. This can be done using 'track changes' in programs such as MS Word and/or highlighting any changes in the new document.

If you choose not to submit a revision, please notify us.

Yours sincerely,

xxx (name removed upon request)
PLoS ONE

Reviewer #1: The manuscript by Hediger et al. presents a computational protocol for the efficient calculation of the effect of enzyme mutations on the height of the barrier of the enzyme-catalyzed reaction.  The approach is based on semiempirical methods and is applied to amide hydrolysis as catalyzed by the lipase B of C. antarctica. The topic is interesting.  The main conclusions seem to be supported by the results presented (but see below).  The manuscript is clear.

Major remark:
The authors mention at the end of the Conclusions section that the application of their protocol to "190 different mutants" will be published elsewhere. As such, the present manuscript seems to be more a proof of principle than a real validation of efficiency and performance. Moreover, it is not completely clear if the protocol is fully automatic or if it requires lots of manual intervention.

Minor remark:
pagination and publication year are missing in some references (e.g., 18 and 24).

Reviewer #2: This short paper describes tests of computational approaches for predicting barrier heights for amidase activity in a lipase. The work is preliminary. There is no experimental verification of the predictions. The calculated barriers are significantly too low, suggesting that the methods used are not adequate for reliable predictions. There is no consideration of effects of binding affinity changes, nor effects of protein dynamics, nor solvation. The choice of molecular model appears somewhat arbitrary. It is not apparent that these methods will be generally applicable to other enzymes. More evidence and testing is required.

## Monday, April 9, 2012

### arXiv is for research and journals are for CVs

This post started as a comment to a recent post over at A Chemical Education, where this quote is from.
"Furthermore, if we only had green OA repositories there would be another loss that I’ve never considered before: the commentaries, reviews, editorials and research highlights that complement the original research articles."
Not necessarily.  I have recently started Computational Chemistry Highlights which is an overlay journal.  CCH highlights (reviews) important papers published in the last 1-2 years in the area of computational chemistry.  "Published" usually means in a peer reviewed journal, but can also be an arXiv preprint.  CCH is free of charge and is not affiliated with any publisher.

It is worth remembering that most of these "commentaries, reviews, editorials and research highlights" are written by scientists free of charge to the publisher, and that publishing in non-open access journals results in loss of copyright, places your work behind a paywall, and funds publishers who don't always have your best interests, or those of the scientific community for that matter, in mind.

On a related topic, Google recently launched Google Scholar Metrics, which ranked arXiv in the top 5 in terms of highly cited journals.  One implication of this is that when it comes to using results for their own research, it is not too important to scientists whether it has been peer reviewed.  If a paper is in your own area of expertise, then you can judge whether the content is trustworthy, i.e. you perform your own review.  It is when the paper is significantly outside your area of expertise that you rely not just on peer review but also the perceived impact of the journal when you judge the merits of the work.  An important example of the latter is when your colleagues or funding agencies judge your CV.

Through on-line discussions I have learned of an interesting situation in mathematics.  Formal peer reviews of proofs can take a very long time, so nearly everyone relies on arXiv for their day-to-day research.  It seems that the peer reviewed publications themselves serve mainly to assure your colleagues at large that you are doing good work.  As a result the perceived impact of journals and how to "get into" the best ones is as critical as in chemistry, even though they are not really read!

So depositing your next paper in arXiv is a good idea: interested people will find the paper and make use of it if they think it is important, even though it is not peer reviewed.  However, right now it is also a good idea to submit the paper to a peer reviewed journal because you will need to impress your colleagues in the future.  Here one should seriously consider publishing in gold open access journals such as PLoS ONE for reasons I have discussed here.   Perhaps in time overlay journals such as CCH will be seen as equally important when it comes to judging impact.

## Monday, April 2, 2012

### arXiv among top-five journals

If you go to Google Scholar you'll see a link to the new Google Scholar Metrics and the top 100 journals ranked by their five-year h-index.

arXiv.org is number 5 with an h5-index of 256, meaning that 256 of the articles it published in the last five years have each been cited at least 256 times.
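For readers unfamiliar with the h-index, here is a small sketch of how it can be computed from a list of citation counts. The function name and the example numbers are made up for illustration; for an h5-index the list would hold the citation counts of the articles published in the last five years.

```python
def h_index(citation_counts):
    """Largest h such that h articles each have at least h citations."""
    ranked = sorted(citation_counts, reverse=True)
    h = 0
    for rank, citations in enumerate(ranked, start=1):
        if citations >= rank:
            h = rank  # the top `rank` articles all have at least `rank` citations
        else:
            break
    return h


# Five articles cited 10, 8, 5, 4 and 3 times give an h-index of 4.
print(h_index([10, 8, 5, 4, 3]))
```

A single very highly cited article does little for the index: it is the number of well cited articles that counts.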