Single-molecule FRET


Single-molecule fluorescence resonance energy transfer (smFRET) is a biophysical technique used to measure distances at the 1-10 nanometer scale in single molecules, typically biomolecules. It is an application of FRET wherein a pair of donor and acceptor fluorophores is excited and detected at the single-molecule level. In contrast to "ensemble FRET", which provides the FRET signal of a large number of molecules, single-molecule FRET is able to resolve the FRET signal of each individual molecule. The variation of the smFRET signal over time reveals kinetic information that an ensemble measurement cannot provide, especially when the system is at equilibrium. Heterogeneity among different molecules can also be observed.

Methodology

Single-molecule FRET measurements are typically performed on fluorescence microscopes, using either surface-immobilized or freely diffusing molecules. Single FRET pairs are illuminated with intense light sources, typically lasers, in order to generate enough fluorescence signal for single-molecule detection. Wide-field microscopy is typically combined with total internal reflection fluorescence (TIRF) microscopy. This selectively excites FRET pairs at the surface of the measurement chamber and rejects noise from the bulk of the sample. Alternatively, confocal microscopy minimizes background by focusing the fluorescence light onto a pinhole to reject out-of-focus light. The confocal volume has a diameter of around 220 nm, so it must be scanned across the sample if an image is needed. With confocal excitation, it is possible to measure much deeper into the sample than with TIRF. The fluorescence signal is detected using either ultra-sensitive CCD or scientific CMOS cameras for wide-field microscopy, or single-photon avalanche diodes (SPADs) for confocal microscopy. Once the single-molecule intensities versus time are available, the FRET efficiency can be computed for each FRET pair as a function of time. This makes it possible to follow kinetic events on the single-molecule scale and to build FRET histograms showing the distribution of states in each molecule. However, data from many FRET pairs must be recorded and combined in order to obtain general information about a sample or a dynamic structure.

Surface-Immobilized

In surface-immobilized experiments, biomolecules labeled with fluorescent tags are bound to the surface of the coverglass and images of fluorescence are acquired. Data collection with cameras will produce movies of the specimen which must be processed to derive the single molecule intensities with time.
An advantage of surface-immobilized experiments is that many molecules can be observed in parallel for an extended period of time, until photobleaching occurs. This makes it convenient to study transitions taking place on slow time scales. A disadvantage is the additional biochemical modification needed to link molecules to the surface, together with the perturbations that the surface can potentially exert on the molecular activity. In addition, the maximum time resolution of the single-molecule intensities is limited by the camera acquisition time.

Freely-Diffusing

SmFRET can also be used to study the conformations of molecules freely diffusing in a liquid sample. In freely diffusing smFRET experiments, the same biomolecules are free to diffuse in solution while being excited within a small excitation volume. Bursts of photons due to a single molecule crossing the excitation spot are acquired with SPAD detectors. The confocal spot is usually fixed at a given position, so no scanning takes place and no image is acquired. Instead, the fluorescence photons emitted by individual molecules crossing the excitation volume are recorded and accumulated in order to build a distribution of the different populations present in the sample. Depending on the complexity of this distribution, acquisition times vary from about 5 minutes to several hours.
A distinctive advantage of setups employing SPAD detectors is that they are not limited by a "frame rate" or a fixed integration time, as cameras are. Unlike cameras, SPADs produce a pulse every time a photon is detected, and additional electronics are needed to timestamp each pulse with 10-50 ns resolution. The high time resolution of confocal single-molecule FRET measurements makes it possible to detect dynamics on time scales as short as 10 μs. However, detecting "slow" transitions on time scales longer than the diffusion time is more difficult than in surface-immobilized experiments and generally requires much longer acquisitions.
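Because every detected photon carries its own timestamp, the same recording can be binned afterwards at any chosen time resolution. The following Python sketch illustrates this under simple assumptions: timestamps are given in seconds as a NumPy array, and the function name `bin_timestamps` and the example values are hypothetical rather than taken from any acquisition package.

```python
import numpy as np

def bin_timestamps(timestamps_s, bin_width_s, duration_s):
    """Count photons per time bin from photon arrival times (in seconds)."""
    n_bins = int(round(duration_s / bin_width_s))
    edges = np.linspace(0.0, n_bins * bin_width_s, n_bins + 1)
    counts, _ = np.histogram(timestamps_s, bins=edges)
    return counts

# Hypothetical photon arrival times (seconds) from one detector
timestamps = np.array([0.0001, 0.0004, 0.0012, 0.0013, 0.0019, 0.0031])

# The same photon stream binned at 1 ms and at 0.5 ms
coarse = bin_timestamps(timestamps, bin_width_s=1e-3, duration_s=4e-3)
fine = bin_timestamps(timestamps, bin_width_s=0.5e-3, duration_s=4e-3)
print(coarse)  # [2 3 0 1]
print(fine)    # [2 0 2 1 0 0 1 0]
```

Smaller bins give finer time resolution at the cost of fewer photons per bin, which is the trade-off that ultimately limits the accessible time scale.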
Normally, the fluorescence emission of the donor and acceptor fluorophores is detected by two independent detectors, and the FRET signal is computed from the ratio of intensities in the two channels. Some setup configurations further split each spectral channel into two orthogonal polarizations and are able to measure FRET and fluorescence anisotropy simultaneously. In other configurations, 3 or 4 spectral channels are acquired simultaneously in order to measure multiple FRET pairs.
Either continuous-wave (CW) or pulsed lasers can be used as the excitation source. When pulsed lasers are used, suitable acquisition hardware can measure the photon arrival time with respect to the last laser pulse with picosecond resolution, in the so-called time-correlated single photon counting (TCSPC) acquisition. In this configuration each photon is characterized by a macro-time (the coarse timestamp of the photon) and a micro-time (its delay with respect to the last laser pulse). The latter can be used to extract lifetime information and obtain the FRET signal.
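As an illustration of how lifetime information leads to a FRET value, the sketch below uses the standard relation E = 1 - τ_DA/τ_D, where τ_D and τ_DA are the donor lifetimes without and with the acceptor. It is a simplified example, not a TCSPC analysis pipeline: the lifetime is estimated as the mean micro-time, which assumes a mono-exponential decay, negligible background, and an instrument response much shorter than the lifetime. The function names and the simulated lifetimes (4 ns and 1 ns) are hypothetical.

```python
import numpy as np

def mean_lifetime_ns(micro_times_ns):
    """Crude lifetime estimate: mean photon delay after the laser pulse.

    Assumes a mono-exponential decay, no background, and a negligible
    instrument response function.
    """
    return float(np.mean(micro_times_ns))

def fret_from_lifetimes(tau_da_ns, tau_d_ns):
    """FRET efficiency from donor lifetimes: E = 1 - tau_DA / tau_D."""
    return 1.0 - tau_da_ns / tau_d_ns

# Simulated micro-times (hypothetical lifetimes: 4 ns donor-only, 1 ns with acceptor)
rng = np.random.default_rng(0)
donor_only = rng.exponential(scale=4.0, size=20000)
donor_with_acceptor = rng.exponential(scale=1.0, size=20000)

E = fret_from_lifetimes(mean_lifetime_ns(donor_with_acceptor),
                        mean_lifetime_ns(donor_only))
print(f"lifetime-based FRET efficiency: {E:.2f}")
```

With these simulated lifetimes the estimate should come out close to 0.75; real data would normally require fitting the decay with the instrument response taken into account.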

Data Analysis of Surface Immobilised smFRET

Typical smFRET data of a two-dye system are time trajectories of the fluorescence emission intensities of the donor dye and the acceptor dye, called the two channels. Two main methods are used to obtain the emission of the two dyes: (1) accumulative measurement, in which a detector such as a PMT, APD, EMCCD, or CMOS camera integrates the signal over a fixed exposure time for each frame; and (2) single-photon arrival-time sequences measured with PMT or APD detectors. In both cases, optical filters separate the emissions of the two dyes so that they can be measured in two channels. For example, a setup using two halves of a charge-coupled device (CCD) camera is explained in the literature.
For a two-dye system, the emission signals are then used to calculate the FRET efficiency between the dyes over time. The FRET efficiency is the number of photons emitted from the acceptor dye divided by the sum of the emissions of the donor and the acceptor dye. Usually, only the donor dye is excited, but more accurate information can be obtained if the donor and acceptor dyes are excited alternately. In a single-excitation scheme, the emission of each dye is the number of photons collected in its channel divided by the photon collection efficiency of that channel, which is a function of the collection efficiency, the filter and optical efficiency, and the camera efficiency in the corresponding wavelength band. These efficiencies can be calibrated for a given instrumental setup.
$$\mathrm{FRET} = \frac{I_A/\eta_A}{I_A/\eta_A + I_D/\eta_D}$$
where FRET is the FRET efficiency of the two-dye system over a period of time, $I_A$ and $I_D$ are the measured photon counts of the acceptor and donor channels respectively over the same period, and $\eta_A$ and $\eta_D$ are the photon collection efficiencies of the two channels. Thus, $I/\eta$ gives the actual number of photons emitted from the dye. If the photon collection efficiencies of the two channels are similar and the actual FRET distance is not of interest, one can set both $\eta = 1$. Note that this FRET value has not been corrected for the different quantum yields of the two dyes.
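The calculation described above (acceptor emission divided by total emission, after each channel has been corrected by its collection efficiency) can be sketched in a few lines of Python. The function name `fret_efficiency`, the parameter names, and the example counts are illustrative assumptions; no correction for quantum yield, spectral crosstalk, or direct acceptor excitation is included.

```python
import numpy as np

def fret_efficiency(i_acceptor, i_donor, eta_a=1.0, eta_d=1.0):
    """Apparent FRET efficiency per frame from two-channel photon counts.

    eta_a and eta_d are the photon collection efficiencies of the acceptor
    and donor channels (set both to 1 if they are similar).
    """
    a = np.asarray(i_acceptor, dtype=float) / eta_a
    d = np.asarray(i_donor, dtype=float) / eta_d
    total = a + d
    # Frames without any detected photons are left undefined (NaN)
    return np.divide(a, total, out=np.full_like(total, np.nan), where=total > 0)

# Hypothetical photon counts per frame in the acceptor and donor channels
acceptor_counts = [80, 20, 0]
donor_counts = [20, 80, 0]
E_trace = fret_efficiency(acceptor_counts, donor_counts)
print(E_trace)  # first two frames: 0.8 and 0.2; third frame undefined
```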
For accumulative-emission smFRET data, the time trajectories mainly contain the following information: state transitions, noise, camera blurring, and photoblinking and photobleaching of the dyes. The state transitions are the information a typical measurement is after. The remaining signals interfere with the data analysis and therefore have to be addressed.

Noise

The noise in the dye emission signal typically contains camera readout noise, shot noise, white noise, and real-sample noise, each of which follows a different distribution because of its different source. The real-sample noise comes from thermal disturbance of the system, which results in broadening of the FRET distance, an uneven distribution of dye orientations, variations in dye emission, fast blinking, and kinetics faster than the integration time. Slow variations of the dye emission are likely to be mistaken for false-positive states and should be avoided experimentally by choosing different dyes. The other noise components come from the excitation path and the detection path, especially the camera. In the end, the noise of the raw emission data is a combination of noise components with Poisson and Gaussian distributions. The noise in each channel can sometimes be simplified to the sum of an intensity-dependent Gaussian noise and a Poisson background noise. The latter dominates when the channel is in the OFF or low-intensity state. The noise of the two channels is then combined through the non-linear FRET equation given above when the FRET values are calculated. As a result, the noise on smFRET trajectories is very complicated: it is asymmetric above and below the mean FRET value, and its magnitude changes towards the two ends of the FRET range because of the changing uncertainties of the A and D channels. Most noise can be reduced by binning the data, at the cost of losing time resolution. Example MATLAB code for simulating smFRET time trajectories with and without noise is available on GitHub.
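The way channel noise propagates into the FRET value can be illustrated with a short simulation. The sketch below is written in Python and is not the MATLAB code referred to above; it assumes a deliberately simple noise model (Poisson shot noise on signal plus background, and Gaussian readout noise), and all parameter values and the function name `simulate_noisy_fret` are hypothetical.

```python
import numpy as np

def simulate_noisy_fret(true_fret, total_counts=200.0, background=5.0,
                        readout_sigma=2.0, seed=0):
    """Add Poisson and Gaussian noise to a noise-free FRET trajectory."""
    rng = np.random.default_rng(seed)
    true_fret = np.asarray(true_fret, dtype=float)
    # Noise-free mean photon counts per frame in each channel
    mean_a = total_counts * true_fret
    mean_d = total_counts * (1.0 - true_fret)
    # Poisson shot noise on signal + background, then Gaussian readout noise
    acc = rng.poisson(mean_a + background) + rng.normal(0.0, readout_sigma, true_fret.shape)
    don = rng.poisson(mean_d + background) + rng.normal(0.0, readout_sigma, true_fret.shape)
    # Subtract the (assumed known) mean background before taking the ratio
    acc -= background
    don -= background
    return acc / (acc + don)

true_trace = np.full(5000, 0.8)          # single state at FRET = 0.8
noisy_trace = simulate_noisy_fret(true_trace)
print(noisy_trace.mean(), noisy_trace.std())
```

Repeating the simulation with a different `true_fret` or a lower `total_counts` shows how the width of the apparent FRET distribution depends on both the FRET value and the photon budget.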

Camera blurring

Camera blurring arises from the discrete nature of the measurements. The emission signal does not match the real transition signal exactly because one or both are stochastic. When a state transition happens within a single readout interval, the recorded signal is the average of the two states within that measuring window, which affects the accuracy of state identification and, eventually, the rate constant analysis. This is less of a problem when the measuring frequency is much faster than the transition rate, but becomes a real problem when the two approach each other. In order to reflect the camera blurring effect in simulated smFRET trajectories, one must simulate the data at a higher time resolution than the data collection time and then bin the data into the final trajectories. Example MATLAB code that simulates smFRET time trajectories at a faster time resolution and then bins them to the measuring time resolution, in order to reproduce the camera blurring effect, is available on GitHub.
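The simulate-then-bin procedure can be sketched as follows in Python (this is an illustration, not the MATLAB code referred to above). For brevity the binning is applied directly to a noise-free FRET trace, which is equivalent to binning intensities only when the total intensity is constant; with real data one would bin the donor and acceptor intensities and then compute FRET. The function name `bin_trajectory` and the example trace are hypothetical.

```python
import numpy as np

def bin_trajectory(fine_trace, factor):
    """Average consecutive groups of `factor` fine time steps into one frame."""
    fine_trace = np.asarray(fine_trace, dtype=float)
    n_frames = len(fine_trace) // factor
    return fine_trace[:n_frames * factor].reshape(n_frames, factor).mean(axis=1)

# Trace simulated 10x faster than the camera frame time:
# the molecule switches from FRET 0.2 to FRET 0.8 in the middle of frame 2
fine = np.array([0.2] * 15 + [0.8] * 15)
frames = bin_trajectory(fine, factor=10)
print(frames)  # [0.2 0.5 0.8]
```

The middle frame reports 0.5, a value that corresponds to neither real state; this is the blurred data point that complicates state identification.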

Photoblinking and photobleaching

The time trajectories also contain photoblinking and photobleaching information from the two dyes. This information has to be removed, which creates gaps in the time trajectory where the FRET information is lost. For a typical dye system chosen for the measurement, with relatively long photoblinking intervals and photobleaching lifetimes, these events can be removed, so they are less of a problem for data analysis. However, they become a serious problem if a dye with short blinking intervals or a long dark-state lifetime is used.

State identification algorithms

Several methods have been developed to analyze the data, such as thresholding methods, hidden Markov model (HMM) methods, and transition-point identification methods. Thresholding methods simply set a threshold between two adjacent states on the smFRET trajectories. FRET values above the threshold are assigned to one state and values below it to the other. This approach works for data with a very high signal-to-noise ratio and therefore clearly distinguishable FRET states. It is intuitive and has a long history of applications. Example source code can be found in the software postFRET. HMMs are based on algorithms that statistically calculate the probability of each state assignment, i.e. they add penalties to less probable assignments. Typical open-source software packages that can be found online include HaMMy, vbFRET, ebFRET, SMART, SMACKS, and MASH-FRET. Transition-point analysis, or change-point analysis (CPA), uses statistical analysis to identify when a transition happens along the time trajectory. For example, CPA based on Fisher information theory and the Student's t-test method identifies state transitions and minimizes the description length by selecting the optimum number of states, i.e. by balancing the penalty of noise against the total number of states.
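A thresholding assignment for a two-state trajectory can be sketched in Python as below. This is a generic illustration and not the postFRET implementation; the function names, the threshold of 0.5, and the example trace are assumptions.

```python
import numpy as np

def threshold_states(fret_trace, threshold):
    """Assign state 1 to frames above the threshold and state 0 otherwise."""
    return (np.asarray(fret_trace) > threshold).astype(int)

def transition_frames(states):
    """Indices of the first frame after each state change."""
    return np.flatnonzero(np.diff(states)) + 1

# Hypothetical two-state FRET trajectory with a high signal-to-noise ratio
trace = [0.21, 0.18, 0.25, 0.79, 0.83, 0.77, 0.22, 0.20]
states = threshold_states(trace, threshold=0.5)
print(states)                     # [0 0 0 1 1 1 0 0]
print(transition_frames(states))  # [3 6]
```

With noisier data, single frames crossing the threshold would be counted as spurious transitions, which is why this approach is limited to well-separated states.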
A list of software packages can be found on the KinSoftChallenge website.

Rate analysis

Once the states are identified, they can be used to calculate the Förster resonance energy transfer distances and the transition rate constants between the states. For a parallel reaction matrix among the states, the rate constant of each transition cannot be extracted from the average lifetime of that transition, because the lifetime is fixed at the inverse of the sum of the rate constants. The lifetimes of the transitions from one state to all other states are all the "same" for a single molecule. However, the rate constants can be calculated from the probability functions, i.e. from the number of each transition divided by the total time of the state it transfers from.
$$k_{if} = \frac{N_{if}}{t_i}, \qquad t_i = \sum_{f' \neq i}^{n} \sum_{j=1}^{N_{if'}} t_{if',j}$$
where $i$ is the initial state, $f$ is the final state of the transition, $N_{if}$ is the number of such transitions in the time trajectories, $n$ is the total number of states, $k_{if}$ is the rate constant, $t_{if',j}$ is the time spent in state $i$ before the $j$-th transition to state $f'$, and $t_i$ is therefore the total time spent in state $i$. For example, suppose one measures 130 seconds of smFRET time trajectories. The total time the molecule stays in state one is t1 = 100 s, in state two t2 = 20 s, and in state three t3 = 10 s. During the 100 s the molecule spends in state one, it transfers to state two 70 times and to state three 30 times at the end of its dwell times. The rate constant from state 1 to state 2 is thus k12 = 70/100 = 0.7 s−1, and the rate constant from state 1 to state 3 is k13 = 30/100 = 0.3 s−1. Typically the dwell times preceding the 70 transitions, and those preceding the 30 transitions, are exponentially distributed. The average dwell times of the two distributions of state one, i.e. the two lifetimes τ12 and τ13, are both the same, 1/(k12 + k13) = 1 s. The number of transitions N can be the total number of state transitions, or the fitted amplitude of the exponential decay function of the accumulated histogram of the state dwell times. The latter only works for well-behaved dwell-time distributions. A system at equilibrium in which each state transfers in parallel to all others through first-order reactions is a special case used here for easier understanding. When the system is not at equilibrium, the equation still holds, but care should be taken in applying it.
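The counting rule described above (number of transitions from a state divided by the total time spent in that state) can be applied directly to an idealized state trajectory. The Python sketch below assumes states labeled 0, 1, 2, ... and a fixed frame time; the function name `rate_constants` and the short example trajectory are hypothetical.

```python
import numpy as np

def rate_constants(states, frame_time_s, n_states):
    """Estimate k[i, f] = N[i, f] / t[i] from an idealized state trajectory."""
    states = np.asarray(states)
    counts = np.zeros((n_states, n_states))
    for i, f in zip(states[:-1], states[1:]):
        if i != f:
            counts[i, f] += 1          # one more i -> f transition
    # Total time spent in each state
    total_time = np.bincount(states, minlength=n_states) * frame_time_s
    with np.errstate(divide="ignore", invalid="ignore"):
        k = counts / total_time[:, None]
    return k, counts, total_time

# Hypothetical three-state trajectory sampled once per second
trajectory = [0, 0, 0, 0, 1, 1, 0, 0, 0, 0, 2, 0, 0, 1, 1, 1]
k, counts, total_time = rate_constants(trajectory, frame_time_s=1.0, n_states=3)
print(total_time)        # [10.  5.  1.]
print(k[0, 1], k[0, 2])  # 0.2 0.1

# Worked example from the text: 70 and 30 transitions out of 100 s in state 1
k12, k13 = 70 / 100.0, 30 / 100.0
lifetime_state_1 = 1.0 / (k12 + k13)   # 1 s for both transition types
```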
The interpretation of the above equation is based simply on the assumption that every molecule is the same, i.e. the ergodic hypothesis. The existence of each state is represented by its total time, which plays the role of its "concentration". Thus, the rate of transition to any other state is the number of transitions normalized by this concentration. Numerically, the time concentration can be converted to a number concentration to mimic an ensemble measurement. Because each single molecule is the same as all other molecules, we can assume that the time trajectory is a random combination of many molecules, each of which occupies only a very short period of time, say Δt. Thus, the "concentration" of molecules in state n is cn = tn / Δt. Among all these "molecules", if Nnf transfer to state f during the measuring time Δt, the rate of transfer is by definition r = Nnf / Δt = knf cn. Thus knf = Nnf / tn.
One can see that the single-molecule measurement of a rate constant depends only on the ergodic hypothesis, which can be assessed if a statistically sufficient number of single molecules is measured and the expected well-behaved distributions of dwell times are observed. Heterogeneity among single molecules can also be observed if each molecule has a distinguishable set of states or set of rate constants.

Reduce the uncertainty level

The uncertainty level of the rate analysis can be estimated from multiple experimental trials, bootstrapping analysis, and fitting error analysis. State misassignment is a common error source during the data analysis, which originates from state broadening, noise, and camera blurring. Data binning, moving average, and wavelet transform can help reduce the effect of state broadening and noise but will enhance the camera blurring effect.
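A bootstrapping estimate of the uncertainty of a rate constant can be sketched as follows. The example assumes that each molecule contributes a number of transitions and a total dwell time in the initial state, and that molecules are resampled with replacement; the function name `bootstrap_rate` and the per-molecule numbers are hypothetical.

```python
import numpy as np

def bootstrap_rate(n_transitions, dwell_time_s, n_boot=2000, seed=0):
    """Rate constant k = sum(N) / sum(t) with a bootstrap standard error."""
    rng = np.random.default_rng(seed)
    n_transitions = np.asarray(n_transitions, dtype=float)
    dwell_time_s = np.asarray(dwell_time_s, dtype=float)
    n_mol = len(n_transitions)
    # Resample molecules with replacement, n_boot times
    idx = rng.integers(0, n_mol, size=(n_boot, n_mol))
    k_boot = n_transitions[idx].sum(axis=1) / dwell_time_s[idx].sum(axis=1)
    k_hat = n_transitions.sum() / dwell_time_s.sum()
    return k_hat, k_boot.std(ddof=1)

# Hypothetical data: transitions and total dwell time (s) for eight molecules
n_per_molecule = [7, 5, 9, 6, 8, 4, 7, 6]
t_per_molecule = [10, 8, 12, 9, 11, 7, 10, 8]
k_hat, k_err = bootstrap_rate(n_per_molecule, t_per_molecule)
print(f"k = {k_hat:.3f} +/- {k_err:.3f} s^-1")
```

Resampling whole molecules, rather than individual dwell times, keeps molecule-to-molecule heterogeneity in the error estimate; this is a design choice of the sketch, not a requirement of the method.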
The camera blurring effect can be reduced through a faster sampling frequency, which relies on the development of more sensitive cameras, through special data analysis, or through both. Traditionally in HMMs, the data points immediately before and after a transition are treated specially in order to reduce the rate at which they are wrongly assigned to states lying between the two states of the transition. However, this method has a limitation: when the transition frequency approaches the sampling frequency, too many data points are blurred for it to work.
A two-step data analysis method has been reported to increase the analysis accuracy for such data. The idea is to simulate a trajectory with the Monte Carlo method and compare it with the experimental data. Under the right conditions, the simulation and the experimental data will contain the same degree of blurring and noise. The simulated trajectory is then a better answer than the raw experimental data because its ground truth is "known". Open-source code for this method is available in postFRET and MASH-FRET. The method can also partially correct for the effect of non-Gaussian noise, which makes it difficult to identify the states accurately with statistical methods.
Current smFRET data analysis still requires great care and special training, which calls for deep-learning algorithms to reduce the manual labor involved.

Advantages of smFRET

SmFRET allows for a more precise analysis of heterogeneous populations and has a few advantages when compared to ensemble FRET.
One benefit of studying distances in single molecules is that heterogeneous populations can be studied more accurately, with values specific to each molecule rather than an average computed over an ensemble. This allows specific homogeneous populations within a heterogeneous population to be studied. For example, if two homogeneous populations within a heterogeneous population have different FRET values, an ensemble FRET analysis will produce a weighted average FRET value that represents the population as a whole. The obtained FRET value therefore carries no information about the two distinct populations. In contrast, smFRET is able to differentiate between the two populations and allows each of them to be analyzed.
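The difference between an ensemble average and a per-molecule distribution can be made concrete with a small simulation. The Python sketch below assumes two equally populated subpopulations with mean FRET values of 0.3 and 0.8 and a width of 0.05; all numbers are hypothetical, and the plain mean is only a simplified stand-in for an ensemble readout.

```python
import numpy as np

rng = np.random.default_rng(1)

# Hypothetical per-molecule FRET values for two subpopulations
low_fret = rng.normal(0.3, 0.05, 500)
high_fret = rng.normal(0.8, 0.05, 500)
per_molecule = np.concatenate([low_fret, high_fret])

# An ensemble-like measurement reports a single averaged value
ensemble_value = per_molecule.mean()

# The single-molecule histogram keeps the two populations separate
counts, edges = np.histogram(per_molecule, bins=np.linspace(0.0, 1.0, 11))

# Fraction of molecules whose FRET is actually close to the ensemble value
near_average = np.mean(np.abs(per_molecule - ensemble_value) < 0.05)
print(f"ensemble value: {ensemble_value:.2f}, fraction near it: {near_average:.3f}")
print(counts)
```

The averaged value lies between the two populations, where almost no individual molecule is found, while the histogram shows two separate peaks.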
SmFRET also provides temporal resolution of the dynamics of an individual molecule, which cannot be achieved through ensemble FRET measurements. Ensemble FRET can detect well-populated transition states that accumulate in a population, but it cannot characterize intermediates that are short-lived and do not accumulate. This limit is addressed by smFRET, which offers a direct way to observe the intermediates of single molecules regardless of accumulation. SmFRET can therefore capture transient subpopulations in a heterogeneous environment. Kinetic information in a system at equilibrium is lost at the ensemble level because none of the concentrations of the reactants and products change over time. At the single-molecule level, however, the conversion between reactants and products can happen at a measurably high rate and is balanced stochastically over time. Thus, tracing the time trajectory of a particular molecule enables direct measurement of the rate constant of each transition step, including those involving intermediates that are hidden at the ensemble level because of their low concentrations. This allows smFRET to be used to study the folding dynamics of RNA. Similar to protein folding, RNA folding goes through multiple interactions, folding pathways, and intermediates before reaching the native state.
SmFRET has also been shown to make better use of a three-color system than ensemble FRET. Using two acceptor fluorophores rather than one, FRET can monitor multiple sites for correlated movements and spatial changes in a complex molecule. This has been shown in research on the Holliday junction, where smFRET with a three-color system offered insights into the synchronized movements of the junction's three helical sites and the near non-existence of its parallel states. Ensemble FRET can use a three-color system as well. However, any advantages are outweighed by the requirements of the three-color system, which include a clear separation of the fluorophore signals. For a clear distinction of signals, the spectral overlap between the fluorophores must be small, but that also weakens the FRET strength. SmFRET mitigates this overlap limitation by using band-pass filters and dichroic mirrors, which further separate the signals of the two acceptors and allow bleed-through effects to be corrected.

Applications

A major application of smFRET is the analysis of the subtle biochemical details that facilitate protein folding. In recent years, multiple techniques have been developed to investigate the single-molecule interactions involved in protein folding and unfolding. Force-probe techniques, using atomic force microscopy and laser tweezers, have provided information on protein stability. SmFRET allows researchers to investigate molecular interactions using fluorescence. Förster resonance energy transfer was first applied to single molecules by Ha et al. and applied to protein folding in work by Hochstrasser, Weiss, et al. The main benefit of smFRET for analyzing molecular interactions is the ability to probe single-molecule interactions directly, without having to average over ensembles of data. In protein folding analysis, ensemble experiments involve taking measurements of many proteins that are in various states of transition between their folded and unfolded states. When the data are averaged, the protein structure that can be inferred provides only a rudimentary structural model of protein folding. A true understanding of protein folding, however, requires deciphering the sequence of structural events along the folding pathways between the folded and unfolded states. It is to this particular branch of research that smFRET is highly applicable.
FRET studies calculate FRET efficiencies from time-resolved observation of protein folding events. These FRET efficiencies can then be used to infer the distances between the labeled sites as a function of time. As the protein transitions between the folded and unfolded states, the corresponding distances can indicate the sequence of molecular interactions that lead to protein folding.
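Converting a FRET efficiency into a distance uses the Förster relation E = 1 / (1 + (r/R0)^6), which can be inverted to r = R0 (1/E - 1)^(1/6). The Python sketch below assumes a Förster radius R0 of 5.0 nm, a hypothetical value chosen for illustration; R0 depends on the dye pair and on the assumed dye orientation, so absolute distances obtained this way are approximate.

```python
def distance_from_fret(efficiency, r0_nm):
    """Donor-acceptor distance (nm) from FRET efficiency and Forster radius."""
    if not 0.0 < efficiency < 1.0:
        raise ValueError("efficiency must be strictly between 0 and 1")
    return r0_nm * (1.0 / efficiency - 1.0) ** (1.0 / 6.0)

def fret_from_distance(distance_nm, r0_nm):
    """FRET efficiency from donor-acceptor distance and Forster radius."""
    return 1.0 / (1.0 + (distance_nm / r0_nm) ** 6)

R0_NM = 5.0  # hypothetical Forster radius for illustration
for E in (0.2, 0.5, 0.8):
    print(f"E = {E:.1f} -> r = {distance_from_fret(E, R0_NM):.2f} nm")
```

At E = 0.5 the distance equals R0, and the steep sixth-power dependence means that the efficiency is most sensitive to distance changes close to R0.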
Another application of smFRET is the study of DNA and RNA folding dynamics. Typically, two different locations on a nucleic acid are labeled with the donor and acceptor dyes. The distance between the two locations changes over time because of the folding and unfolding of the nucleic acid, plus the random diffusion of the two points, both within each measuring window and between different windows. Because of the complexity of the folding/unfolding trajectory, it is extremely difficult to measure the process at the ensemble level, and smFRET has therefore become a key technique in this field. In addition to the general challenges of smFRET data analysis, one challenge is to label multiple positions of interest, and another is to derive the overall folding pathways from two-point dynamics.
Single-molecule FRET can also be applied to study the conformational changes of relevant motifs in certain channels. For example, tetrameric KirBac potassium channels were labeled with donor and acceptor fluorophores at particular sites in order to understand the structural dynamics within the lipid membrane, allowing the researchers to generalize the observed dynamics to similar motifs in other eukaryotic Kir channels or even cation channels in general. The use of smFRET in this experiment allows visualization of conformational changes that cannot be seen if macroscopic measurements are simply averaged, because averaging yields an ensemble analysis rather than an analysis of individual molecules and the conformational changes within them.
The structural dynamics of the KirBac channel were thoroughly analyzed in both the open and closed states, which depend on the presence of the ligand PIP2. Part of the smFRET results demonstrated the structural rigidity of the extracellular region. The selectivity filter and the outer loop of the selectivity filter region were labeled with fluorophores, and conformational coupling was observed. The individual smFRET trajectories consistently showed a FRET efficiency of around 0.8 with no fluctuations, regardless of the state of the channel.
Recently, single-molecule FRET has been applied to quantitatively detect target DNA and to distinguish single-nucleotide polymorphisms. Unlike ensemble FRET, single-molecule FRET allows real-time monitoring of target binding events. Additionally, the low background and high signal-to-noise ratio observed with the single-molecule FRET technique lead to ultra-high sensitivity. Different types of signal amplification steps are now incorporated in order to push down the detection limit.

Limitations

Although smFRET readily provides approximate distance estimates, one of its limitations is the difficulty of obtaining the exact distance involved in the energy transfer. An accurate distance estimate is challenging because the fluorescence of the donor and acceptor fluorophores, as well as the energy transfer itself, depends on the environment and on the orientation of the dyes, which can vary with the flexibility of the sites where the fluorophores are attached. This issue is less relevant, however, when the distance between the two fluorophores does not need to be determined with absolute precision.
Extracting kinetic information from a complicated biological system with transition times of a few milliseconds or less remains challenging. The current time resolution of such measurements is typically at the millisecond level, with a few reports at the microsecond level. There is a theoretical limit imposed by the dye photophysics. The lifetime of the excited state of a typical organic dye molecule is about 1 nanosecond. In order to obtain statistical confidence in the FRET values, tens to hundreds of photons are required, which puts the best possible time resolution on the order of 1 microsecond. Reaching this limit requires very strong light, which often causes photodamage to the organic molecules. Another limitation is the photobleaching lifetime of the dye, which is a function of the light intensity and of the oxidation/reduction stress of the environment. The photobleaching lifetime of a typical organic dye under typical experimental conditions is a few seconds, or a few minutes with the help of oxygen scavenger solutions. Kinetic events longer than a few minutes are therefore difficult to probe with organic dyes. Other probes with longer lifetimes, such as quantum dots, polymer dots, and all-inorganic dyes, have to be used instead.