Definitions
Mean or Expected Value: $\mu_x = \langle x \rangle = \frac{1}{N}\sum_{i=1}^{N} x_i$
Sample Variance: $S_x^2 = \frac{1}{N-1}\sum_{i=1}^{N} \left(x_i - \langle x \rangle\right)^2$
Correlated Sample Variance (Covariance): $S_{uv}^2 = \frac{1}{N-1}\sum_{i=1}^{N} \left(u_i - \langle u \rangle\right)\left(v_i - \langle v \rangle\right)$
Standard Deviation: $\sigma_x = S_x \;\; (N \to \infty)$
Degrees of Freedom: $\mathrm{DOF} = N(\text{measurements}) - N(\text{distribution parameters})$
"Uncertainty of Mean": $\sigma/\sqrt{N}$
"Variance of Mean": $\sigma^2/N$
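As a concrete illustration, the definitions above can be computed directly in Python. The sample values below are made up for illustration; NumPy's built-in `np.var(x, ddof=1)` and `np.cov(u, v)` give the same results as the explicit sums.

```python
import numpy as np

# Hypothetical repeated measurements of one quantity (illustrative values).
x = np.array([4.9, 5.1, 5.0, 4.8, 5.2, 5.0])
N = len(x)

mean = x.sum() / N                          # mu_x = <x>
s2 = ((x - mean) ** 2).sum() / (N - 1)      # sample variance S_x^2
s = np.sqrt(s2)                             # sample standard deviation S_x
sigma_mean = s / np.sqrt(N)                 # "uncertainty of the mean"

# Covariance of two correlated samples u, v (illustrative values).
u = np.array([1.0, 2.0, 3.0, 4.0])
v = np.array([2.1, 3.9, 6.2, 7.8])
s2_uv = ((u - u.mean()) * (v - v.mean())).sum() / (len(u) - 1)
```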
Sample Distributions
Sample distributions of measurement values are used to infer the parent distribution of a physical quantity.
Poisson: Similar to the binomial distribution, except that the probability of any single outcome is small ($p \to 0$; see the Poisson Limit Theorem) and the number of possible outcomes is large. A discrete distribution used to model random, rare events such as photon counts in a PMT. The probability of observing $x$ events when the expected value is $\lambda$ is $P(x;\lambda) = \frac{\lambda^x}{x!}e^{-\lambda}$, with $\mu = \lambda$ and $\sigma^2 = \lambda$. If the distribution is presented as an un-normalized histogram, each event contributes a count of 1 to its bin ($x$ events are observed or they are not), so the standard deviation of each bin is $\sigma_\text{bin} = \sqrt{n_\text{bin}}$.
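A quick numerical sketch of these properties; the rate and sample size below are arbitrary choices, not values from any particular experiment:

```python
import numpy as np

rng = np.random.default_rng(0)

# Simulated photon counts per interval, with expected value lam.
lam = 5.0
counts = rng.poisson(lam, size=10_000)

# For a Poisson parent distribution, mean and variance are both lambda.
print(counts.mean(), counts.var(ddof=1))   # both close to 5

# Un-normalized histogram: the statistical uncertainty in each bin
# is the square root of the bin count.
n_bin, edges = np.histogram(counts, bins=np.arange(0, 16) - 0.5)
sigma_bin = np.sqrt(n_bin)
```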
Gaussian: Continuous distribution of measurement values, commonly used to describe physical quantities subject to broadening by thermal (versus collisional) processes, such as Doppler shifts. The probability of observing the measurement value $x$ (not the number of observations) of a quantity with a presumed Gaussian parent distribution with mean $\mu$ and standard deviation $\sigma$ is $P(x;\mu,\sigma) = \frac{1}{\sigma\sqrt{2\pi}}e^{-\frac{1}{2}\left(\frac{x-\mu}{\sigma}\right)^2}$, with $\mathrm{FWHM} = 2\sqrt{2\ln 2}\,\sigma \approx 2.355\,\sigma$. If the sample size is small, the mean and standard deviation of the parent distribution are necessarily poorly determined, and a Student's t-distribution should be used. Typically, we replace the leading $\sigma$-dependent coefficient with a new fitting parameter, for a total of three parameters, since Gaussian distributions fit to raw data are always scaled by some number and are not normalized.
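The three-parameter fit described above might be sketched with `scipy.optimize.curve_fit` as follows; the data are synthetic and the parameter values are arbitrary:

```python
import numpy as np
from scipy.optimize import curve_fit

# Three-parameter Gaussian: amplitude A replaces the 1/(sigma*sqrt(2*pi))
# normalization, because raw data are scaled rather than normalized.
def gaussian(x, A, mu, sigma):
    return A * np.exp(-0.5 * ((x - mu) / sigma) ** 2)

# Synthetic noisy data drawn from a known Gaussian (illustrative values).
rng = np.random.default_rng(1)
x = np.linspace(-5, 5, 200)
y = gaussian(x, 10.0, 0.5, 1.2) + rng.normal(0, 0.2, x.size)

popt, pcov = curve_fit(gaussian, x, y, p0=[8.0, 0.0, 1.0])
perr = np.sqrt(np.diag(pcov))                  # 1-sigma parameter uncertainties
fwhm = 2 * np.sqrt(2 * np.log(2)) * popt[2]    # FWHM ~ 2.355 * sigma
```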
Lorentzian: Continuous distribution of measurement values, used to describe physical quantities subject to broadening by a random (Poisson) reset process such as collisions in a gas. Also known as the Cauchy distribution. The probability of observing the measurement value $x$ of a quantity whose parent distribution has peak location $\mu$ and FWHM of $2\gamma$ is $P(x;\mu,\gamma) = \frac{1}{\pi\gamma}\,\frac{\gamma^2}{(x-\mu)^2 + \gamma^2}$. Again, we typically use a three-parameter fit, as for the Gaussian, since we are fitting a physical measurement rather than a normalized probability distribution. Note also that the standard deviation is not defined for the Lorentzian, so we use the FWHM instead.
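A minimal sketch of the corresponding three-parameter Lorentzian model, where the amplitude `A` is the extra scale parameter and the example values are arbitrary:

```python
import numpy as np

# Three-parameter Lorentzian: amplitude A replaces the 1/(pi*gamma)
# normalization; gamma is the half-width, so FWHM = 2*gamma.
def lorentzian(x, A, mu, gamma):
    return A * gamma**2 / ((x - mu) ** 2 + gamma**2)

# At x = mu +/- gamma the profile falls to half its peak value A,
# which is exactly the FWHM = 2*gamma statement above.
x = np.array([0.0, 1.0, -1.0])
print(lorentzian(x, A=2.0, mu=0.0, gamma=1.0))   # [2.0, 1.0, 1.0]
```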
Voigt: The line shape resulting from the convolution of a Gaussian and a Lorentzian is the Voigt function, which has no closed-form expression. There is a simple approximate relationship between the FWHMs of the three profiles, $\frac{f_V}{f_G} \approx 0.5346\left(\frac{f_L}{f_G}\right) + \sqrt{1 + 0.2166\left(\frac{f_L}{f_G}\right)^2}$, which can be used to estimate the ratio of the Lorentzian and Gaussian FWHMs if a measurement technique provides a knob to increase and decrease the collisional-broadening process relative to a roughly constant thermal-broadening process (which may come from the measurement method itself).
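The FWHM relation can be sketched as a small helper function, written here in the equivalent un-ratioed form $f_V \approx 0.5346\,f_L + \sqrt{0.2166\,f_L^2 + f_G^2}$; the two limiting cases provide a quick sanity check:

```python
import numpy as np

# Approximate Voigt FWHM f_V from the Lorentzian (f_L) and
# Gaussian (f_G) component FWHMs.
def voigt_fwhm(f_L, f_G):
    return 0.5346 * f_L + np.sqrt(0.2166 * f_L**2 + f_G**2)

# Limiting cases: a pure Gaussian or pure Lorentzian recovers its own width.
print(voigt_fwhm(0.0, 1.0))   # 1.0 exactly (pure Gaussian)
print(voigt_fwhm(1.0, 0.0))   # 0.5346 + sqrt(0.2166), about 1.000 (pure Lorentzian)
```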
Fitting Data
F Test: A future post will address this test and ANOVA in great detail.
Maximum-Likelihood: A future post will address this in great detail.
Orthogonal Distance Regression: ODR is a very useful fitting method for data with known uncertainties in the independent as well as the dependent variables. Usually, we would like the uncertainty in the independent variables to be small enough to ignore, but it never hurts to be thorough. In principle, ODR provides a more accurate estimate of the uncertainty of the model fitting parameters without the trouble of programming a Monte Carlo simulation that varies the independent variables. There is a very nice package for ODR in SciPy. My experience has been that SciPy ODR needs an initial guess nearly on top of the minimized solution, so running SciPy curve_fit or similar to provide an initial fit is almost always necessary.
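A minimal sketch of this workflow with `scipy.odr`, using synthetic linear data with made-up uncertainties, and `curve_fit` seeding the ODR initial guess as described above:

```python
import numpy as np
from scipy import odr
from scipy.optimize import curve_fit

# Synthetic linear data with uncertainty in both x and y (illustrative values).
rng = np.random.default_rng(3)
x_true = np.linspace(0, 10, 25)
x = x_true + rng.normal(0, 0.2, x_true.size)
y = 2.0 * x_true + 1.0 + rng.normal(0, 0.5, x_true.size)

# Ordinary least squares first, to seed ODR near the minimum.
p0, _ = curve_fit(lambda x, m, b: m * x + b, x, y)

# ODR: the model function takes the parameter vector beta first, then x.
model = odr.Model(lambda beta, x: beta[0] * x + beta[1])
data = odr.RealData(x, y, sx=0.2, sy=0.5)
out = odr.ODR(data, model, beta0=p0).run()

print(out.beta)      # fitted [slope, intercept]
print(out.sd_beta)   # their standard errors
```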
References
Bevington, P.; Robinson, D. K. Data Reduction and Error Analysis for the Physical Sciences, 3rd ed.; McGraw Hill: Boston, 2003.
Casella, G.; Berger, R. L. Statistical Inference, 2nd ed.; Duxbury: Pacific Grove, CA, 2001.