Posts by Collection

hobbies

Lincoln Addresses (excerpts)

we here highly resolve that these dead shall not have died in vain Read more

Podcasts/YouTube

I listen to some podcasts and YouTube channels Read more

Tennyson (Ulysses, excerpt)

that which we are, we are Read more

people

Zoe Nordquist

I am Zoe. Check out my art and music! Read more

publications

Paper Title Number 2

This paper is about the number 2. The number 3 is left for future work. Read more

Recommended citation: Your Name, You. (2010). "Paper Title Number 2." Journal 1. 1(2).
Download Paper | Download Slides

Paper Title Number 3

This paper is about the number 3. The number 4 is left for future work. Read more

Recommended citation: Your Name, You. (2015). "Paper Title Number 3." Journal 1. 1(3).
Download Paper | Download Slides

Paper Title Number 4

This paper is about fixing template issue #693. Read more

Recommended citation: Your Name, You. (2024). "Paper Title Number 3." GitHub Journal of Bugs. 1(3).
Download Paper

Paper Title Number 5, with math \(E=mc^2\)

This paper is about a famous math equation, \(E=mc^2\) Read more

Recommended citation: Your Name, You. (2024). "Paper Title Number 3." GitHub Journal of Bugs. 1(3).
Download Paper

research

Biophysics and Machine Learning

It is often useful to leverage computational biophysical data and machine learning to fill gaps in experiments. I’ve used this approach to study the chaperone Hsp70 and the BK channel, and to predict druggable sites.

Read more

Hydrophobic Dewetting in protein ion channels and model nanopores

Hydrophobic dewetting forms a vapor barrier and closes the BK channel; how do mutations or drug-like molecules (de-)stabilize this delicate liquid-vapor interplay?

Read more

Computer-aided drug discovery

Using simulations of proteins and small molecules to identify new drug candidates

Read more

Molecular dynamics simulations

Using Newton’s laws to simulate molecular motion, like these wiggling waters:

Read more

Computational design of cancer drugs with SILCS

PROTACs are powerful ways to expand the druggable proteome, critical in cancer and other challenging diseases, but their design is often slow and expensive. Hence, CADD is useful!

Read more

talks

Talk 1 on Relevant Topic in Your Field

This is a description of your talk, which is a markdown file that can be all markdown-ified like any other post. Yay markdown! Read more

Conference Proceeding talk 3 on Relevant Topic in Your Field

This is a description of your conference proceedings talk, note the different field in type. You can put anything in this field. Read more

teaching

Undergraduate Research Mentoring

I have mentored a few undergraduate students at UMass Amherst (2019-2021). I hope to continue this in my own teaching career, through small projects, independent studies, and research projects integrated into courses.
Read more

Scientific Storytelling

Effective scientific storytelling is similar to telling a fairy tale.

Read more

Natural Sciences First-Year Seminar

The goal of the FYS is to welcome first-year CNS students and prepare them for college and for a major in the sciences. Read more

Guest lectures

  1. Molecular mechanics and additive protein force fields; UMass Chem 586 Statistical Mechanics
    • Feb 2020, Instructor: Dr. Jianhan Chen
  2. (Guest moderator) Student-led discussion on AlphaFold2 Nature paper; Amherst College Biophysics Seminar
    • Mar 2023, Instructor: Dr. Ashley Carter
  3. Drug design and CADD in Cancer Biology; UMB Grad. Program in Life Science 665 Cancer Biology: From Basic Research to the Clinic
    • Sep 2023, Instructor: Dr. Rena Lapidus.
Read more

wiki

00readme

The name of this file follows a convention in the field to put critical overview and introductory information into a file called “README” or “00readme.” The leading zeros help to keep the file sorted at the top if sorting files alphabetically. Read more
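As a small illustration of why the leading zeros help, here is a minimal Python sketch. The file names are hypothetical, and the sketch assumes a plain character-by-character sort; the ordering you see from a tool like ls can also depend on your locale settings.

```python
# Hypothetical directory listing; these file names are made up for illustration.
names = ["run_md.sh", "analysis.py", "00readme", "data.csv"]

# Python compares strings character by character, and digits sort before
# letters, so the leading zeros push the readme to the front of the list.
ordered = sorted(names)
print(ordered[0])
```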

Text editors

Text editors are where you will spend the majority of your time, writing scripts for bash/slurm to run simulations, editing simulation input parameter files, writing analysis scripts in python, etc. You will want to get comfortable using one which helps you to be productive! For quick, one-off tasks, the GNOME Text Editor gedit works fine. In the long run, you will want something with more advanced features and integration into bash. NOTE: Don’t use LibreOffice or MS Word! Read more

vi - VIsual editor

My shell-based visual editor of choice is vi. It is a very powerful and ubiquitous tool, but involves an initial learning curve. The content of this entire website was written/edited inside of vi. Note that you can save your preferred configuration in the user config file ~/.vimrc. Read more

GNUplot

It is often highly convenient to make a quick plot of a function or data right from the command line. The gnuplot utility provides this. If desired, one can make high-quality plots as well. However, I recommend using python for plots made regularly or for serious presentations or publications, and using gnuplot for on-the-spot analysis. Read more

Bash

Bash, or the Bourne-again shell, is one of the most popular and therefore well-documented shell environments, making it an ideal choice for beginners and experts alike. It is the default on many Linux distributions, and was the default on macOS until 2019 (Catalina), when the default became zsh (the Z shell). Note that the Windows shell Command Prompt (cmd.exe) is not Unix-based. Read more

Bash commands

The Unix programming paradigm is built around small programs that read and write plain text in simple, predictable ways. By chaining tools together with redirection and pipes, complex workflows become compact and expressive. This approach is powerful for quickly processing large text files like logs, or large text-based datasets, all right at the command line. Read more

Bash scripting

The Unix programming paradigm is built around small programs that read and write plain text in simple, predictable ways. By chaining tools together with redirection and pipes, complex workflows become compact and expressive. This approach is powerful for quickly processing large text files like logs, or large text-based datasets, all right at the command line. Read more

Looping

The Unix programming paradigm also supports programming techniques like looping, which allow for more complex scripts that extend beyond a single line. Read more

More

While there are always more commands one can learn, I’ve tried to give a list of some very commonly used ones which are not quite as urgent to learn for beginners, but essential for proficiency. Let me know if more should be added. Read more

The Slurm Scheduler

On shared computing resources, which can range from a small network to a high-performance computing cluster (HPCC), a scheduler is the critical tool which lets many users submit jobs simultaneously and decides which jobs get assigned to which resources. Schedulers are powerful and complex tools, but with a few key ideas you can get started. Read more

Unix filesystem

Bash, or the Bourne-again shell, is a very popular and well-documented command-line shell environment for working in the Unix filesystem, which consists of directories (like folders) and files. There are many good overviews of bash, the shell environment I prefer. The internet is full of great tutorials on Unix and bash commands and scripting. Read more

Wiki

Welcome to my wiki for introducing new people to research in computational chemistry and molecular mechanics. I recommend learning a bit in each topic as you go, rather than trying to master topics one at a time. Read more

CHARMM-GUI

There are few tools which have democratized molecular mechanics simulations more than CHARMM-GUI. It is an extremely widely used tool for setting up simulations for a wide variety of systems (biological and materials), including fairly non-trivial systems and enhanced sampling setups. Input scripts including force fields are automatically generated. It is a powerful first step in setting up many simulations, especially for beginners. Take time to play with settings and appreciate some of the (admittedly few) limitations. Read more

Engines

The MD engine is the software that calculates the forces and propagates the atoms, consisting of tools such as force integrators, thermostats, barostats, etc. Modern software is increasingly complicated, as there must be a separate version compiled to run partially or entirely on a GPU. There are many major MD engines, of which I have primarily used CHARMM, OpenMM, and GROMACS. Other major engines include Amber, NAMD, LAMMPS, etc. Historically, most or all were also developed alongside a force field with the same or a related name. Although perhaps not 100% true in all cases, you can largely run most force fields on most engines today with the correct setup. Read more
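To make "calculates the forces and propagates the atoms" concrete, here is a minimal sketch of one common integrator, velocity Verlet, applied to a single particle on a harmonic spring. This is a toy illustration, not how any particular engine is implemented; the function names, the unit mass and spring constant, and the time step are all assumptions chosen for the example.

```python
import math

def harmonic_force(x, k=1.0):
    """Force from the toy potential U(x) = 0.5 * k * x**2, i.e. F = -dU/dx."""
    return -k * x

def velocity_verlet(x, v, dt, n_steps, mass=1.0, k=1.0):
    """Propagate one particle for n_steps and return the final (x, v)."""
    f = harmonic_force(x, k)
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update
        x = x + dt * v_half                # full-step position update
        f = harmonic_force(x, k)           # recompute force at the new position
        v = v_half + 0.5 * dt * f / mass   # second half-step velocity update
    return x, v

def total_energy(x, v, mass=1.0, k=1.0):
    """Kinetic plus potential energy of the toy system."""
    return 0.5 * mass * v**2 + 0.5 * k * x**2

# Start at x = 1 with zero velocity and integrate for about one period.
x0, v0 = 1.0, 0.0
dt = 0.01
n_steps = round(2.0 * math.pi / dt)  # the period is 2*pi when mass = k = 1
x1, v1 = velocity_verlet(x0, v0, dt=dt, n_steps=n_steps)
energy_drift = total_energy(x1, v1) - total_energy(x0, v0)
print(x1, v1, energy_drift)
```

After roughly one period the particle should return close to its starting position with very little change in total energy, which is a useful sanity check on any integrator and time step.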

Force Fields

The MD force field is the term for the equation used to obtain the forces acting on particles in molecular mechanics simulations. Often, the fundamental equation written down in papers is a potential energy equation (below), where the forces are obtained from the negative first derivative with respect to the particle positions. In this page I only list all-atom additive force fields, although many other MM force fields exist (polarizable, coarse-grained, etc). A critical component is the so-called “general” force field, which refers to a set of parameters and/or method for generating parameters for arbitrary small molecules such as drugs. Read more
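For orientation, a generic form of an all-atom additive potential energy function, and the force derived from it, can be sketched as follows. This is a schematic under stated assumptions rather than the exact expression of any one force field: the symbols are illustrative, and individual force fields use different conventions and add further terms (for example improper dihedrals or cross terms).

```latex
U(\mathbf{r}) = \sum_{\text{bonds}} k_b (b - b_0)^2
  + \sum_{\text{angles}} k_\theta (\theta - \theta_0)^2
  + \sum_{\text{dihedrals}} k_\phi \left[ 1 + \cos(n\phi - \delta) \right]
  + \sum_{i<j} \left\{ \varepsilon_{ij} \left[ \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{12}
      - 2 \left( \frac{R_{\min,ij}}{r_{ij}} \right)^{6} \right]
      + \frac{q_i q_j}{4 \pi \varepsilon_0 r_{ij}} \right\}

\mathbf{F}_i = -\nabla_{\mathbf{r}_i} U(\mathbf{r})
```

The first three sums are the bonded terms (bonds, angles, dihedrals), and the last sum runs over nonbonded atom pairs with a Lennard-Jones term and a Coulomb term. "Additive" here means that each atom carries a fixed partial charge, so the electrostatic energy is a simple sum over pairs.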

Molecular Mechanics

Quantum Mechanics and the Origins of Computational Chemistry

Read more

Molecular Dynamics simulations

This is a huge topic area… best explored by reading and discussing seminal and current research publications, although there are textbooks available which cover key topics. Read more

Reading list

I tend to think of the field of molecular mechanics as having two core subtopics: the MM calculations, and the force fields. The MM calculations are performed by sophisticated software “engines” that are responsible for calculating the forces acting on many atoms and propagating their positions forward in time. The force calculation is based on a simple equation (“force field”) that describes how the force acting on a given atom is a function of its position with respect to other nearby atoms. Read more

Proteins

Proteins are remarkable molecules, critical for life in many interconnected ways. They perform many critical functions inside of living cells. A particularly important example of a critical function is that of an enzyme, or a catalyst. A classic enzyme is a kinase, which will be discussed in detail on its own page. Read more

Kinases

Kinases are enzymes that catalyze the transfer of a phosphate group onto a substrate protein. Phosphorylation changes the substrate’s activity, localization, stability, and/or interactions. Protein kinases form one of the largest enzyme families in eukaryotes and are central regulators of nearly all cellular processes. The enzymatic activity depends on the kinase, but often they recognize specific sequences of amino acids and modify one or more of Tyr, Thr, and Ser. Although different detailed mechanisms exist, in general kinases stabilize a high-energy intermediate, such as through the presence of positively charged residues nearby. Read more

Protein sequence

Proteins are made of sequences of amino acids which condense into complex polymers, forming peptide bonds between the amine and carboxylic acid groups of separate amino acids, and shedding a water. The leftover group is called an amino acid residue, sometimes informally just called a “residue.” Note that residues can be grouped into various loose categories based on the chemical group (“sidechain”) which comes off the core “backbone” (N, C\(\alpha\), C=O). The degree of hydrophilicity of these sidechain groups is critical. In the salt water environment of the cell, entropic and enthalpic interactions drive the overall chain to fold such that hydrophobic groups get buried and hydrophilic groups get exposed. By convention, we think of the sequence as being oriented from the amine-terminal (N-term) to the carboxyl-terminal (C-term) end, since this is also how the protein is synthesized by the ribosome. Read more

Protein structures

Proteins fold spontaneously into complex 3-dimensional structures in salt water at room and body temperatures. They are not only critical for life (with misfolding a critical cause of disease), but also stunningly beautiful to behold! You can spend hours on the Worldwide Protein Data Bank (wwPDB), looking at various structures. Play around with these below! You can rotate and enlarge the structures. Notice the various secondary structures such as \(\alpha\) helices (purple) and \(\beta\) sheets (yellow), which are formed by hydrogen bonding between residues’ amide and carbonyl groups. Read more

Python

Python was a paradigm-shifting programming language when it was built in the 1990s, and remains widely in use today by serious programmers and beginners alike. Time spent learning the basics is well spent, as the essential skills will translate to other languages. There is almost no programming task which can’t be done well in Python. Read more

Vectorization

Efficient computation via vectorization in the numpy library

Note: This is more of an advanced topic, something to be aware of if you are running slow code, but not totally necessary. In a previous challenge, you may have created a program to estimate $\pi$ via a simple Monte Carlo simulation. A simple loop in python is not efficient if one wants to evaluate millions or billions of operations! Vectorization as implemented in numpy allows more efficient computations to be carried out using very efficient linear algebra implementations written in programming languages like C or Fortran. Here is an example code for estimating $\pi$ using numpy (installed with mamba install numpy if needed): Read more
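One possible vectorized version is sketched below. It is not necessarily the same code as on the full page; the function name estimate_pi, the sample count, and the fixed random seed are assumptions made for illustration.

```python
import numpy as np

def estimate_pi(n_samples, seed=0):
    """Monte Carlo estimate of pi using array operations instead of a loop."""
    rng = np.random.default_rng(seed)
    # Draw all (x, y) points in the unit square in a single call.
    xy = rng.random((n_samples, 2))
    # Boolean array: True where the point lies inside the quarter circle.
    inside = (xy**2).sum(axis=1) <= 1.0
    # The fraction of points inside approximates pi / 4.
    return 4.0 * inside.mean()

pi_estimate = estimate_pi(1_000_000)
print(pi_estimate)
```

The key point is that there is no explicit Python loop over samples: the random draws, the squaring, the comparison, and the average each run over the whole array at once inside numpy.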

Plotting data in Python

Plotting with matplotlib

To run this code, you will need to install matplotlib, such as by running mamba install matplotlib. Here is an example script which makes a scatter plot of a data set. The data file is expected to be space-delimited and be organized so that the x-value is the first column, and then all subsequent columns are interpreted as different y-values. You could create such an input text file using awk to parse a more complex file, for example. Here is an example data file:
X1 Y1 Y2 Y3
1 1 1 1
2 2 4 8
3 3 9 27
4 4 16 64
5 5 25 125
Read more
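As a rough sketch of how a file in this layout could be read and plotted, consider the following. This is not the script from the full page; the function names read_columns and scatter_plot and the output file name scatter.png are hypothetical, and the example rows are the ones shown above.

```python
def read_columns(lines):
    """Parse space-delimited rows into (header, x, ys).

    The first row is the header, the first column is x, and every
    remaining column is treated as a separate set of y-values.
    """
    rows = [line.split() for line in lines if line.strip()]
    header, data = rows[0], rows[1:]
    columns = list(zip(*[[float(value) for value in row] for row in data]))
    x = list(columns[0])
    ys = [list(column) for column in columns[1:]]
    return header, x, ys

def scatter_plot(header, x, ys, out_path="scatter.png"):
    """Scatter-plot each y column against x and save the figure to a file."""
    import matplotlib
    matplotlib.use("Agg")  # render to a file without needing a display
    import matplotlib.pyplot as plt

    fig, ax = plt.subplots()
    for label, y in zip(header[1:], ys):
        ax.scatter(x, y, label=label)
    ax.set_xlabel(header[0])
    ax.legend()
    fig.savefig(out_path)
    plt.close(fig)

example = """X1 Y1 Y2 Y3
1 1 1 1
2 2 4 8
3 3 9 27
4 4 16 64
5 5 25 125
""".splitlines()

header, x, ys = read_columns(example)
print(header, x, ys)
```

Calling scatter_plot(header, x, ys) afterwards writes the figure to scatter.png. To read from a real file instead of the inline example, pass the open file object to read_columns, since iterating over a file yields its lines.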

Getting started in python

Learning Python

I recommend learning python at a website such as learn-python.org, where you can slowly build up your experience with basic elements of programming. You will find common elements with bash programming, because certain foundational elements of programming such as conditional statements and looping are present in any language. Time spent learning python in this way will translate to nearly any other language. Read more

VMD

Visual Molecular Dynamics: a very powerful tool, essential for success! You can download the software, or check out the documentation. For quick reference, there are two common ways to load files in VMD. One can load a single system, possibly with multiple files and frames: vmd topology.psf trajectory.dcd Read more

Materials

You can define custom colors and materials inside the after idle block of .vmdrc. Check out the VMD docs e.g. materials and colors. Huge thanks to my friend Jian Huang for pointing this out here! Read more

Save settings and visualization states

You can save your preferred settings for default VMD states in the file .vmdrc. VMD comes with a GUI editor for this file in the Main window under Extensions > VMD Preferences, which is generally the best way to customize the settings. Play around! Read more