Introduction

Large-scale interaction maps suggest a complex interplay of proteins within a myriad of functional assemblies1,2. A critical step in assigning functions to these assemblies is to determine their structure3. This goal is challenging, as many of these assemblies exist in low-copy numbers within cells, are frequently heterogeneous and may interact only transiently. Consequently, structural information for many protein complexes is not readily accessible by using the classical tools of structural biology (e.g., X-ray crystallography, nuclear magnetic resonance spectroscopy). New approaches are being developed that integrate data from a number of lower-resolution experimental methods; by combining distance and interaction restraints from these methods with homology modeling, architectural or even atomic models are being generated4. These restraints can be derived from a variety of experimental measurements including MS of intact complexes, chemical cross-linking, fluorescence resonance energy transfer, small angle X-ray scattering, and analytical ultracentrifugation5,6. One very recent addition to this series of biophysical tools is ion mobility separation coupled to mass spectrometry (IM-MS). IM is an established technique for studying shape and conformation in small molecules and individual proteins in the gas phase7,8,9,10 but has only recently been applied to intact protein complexes11,12. When IM is coupled with MS, mass and consequently subunit composition can be determined simultaneously with the overall topology of protein complexes10,12,13.

IM-MS analysis is performed by first ionizing the protein complex of interest. In our experiments, nano-electrospray ionization is used, typically requiring careful preparation procedures for most protein complexes. These procedures, as well as general practical aspects of sample preparation, are detailed in a protocol by Hernández and Robinson14. Although they are not discussed in detail here, knowledge of the materials and protocol steps described in that work is critical to the success of the protocol described below.

After ionization, ions are injected into a region containing neutral gas at a controlled pressure (e.g., 0.5 mBar of nitrogen gas). Under the influence of a relatively weak electric field, injected ions undergo IM separation7,8,9,15,16,17. Large ions experience more collisions with neutrals and thus take more time (drift time or tD) to traverse the chamber than smaller ions. Under these conditions, ions will migrate through the neutrals and separate according to their ion-neutral collision cross-section (Ω). Additionally, ions having higher charge will experience greater separation field strengths and traverse the chamber more quickly. Consequently, drift time is often described as proportional to the collision cross-section-to-charge ratio (Ω/z). After separation, ions are sampled by a mass spectrometer and analyzed according to their mass-to-charge (m/z) ratio. We have based the specific aspects of our protocol on our experience with the Synapt (Quadrupole-Ion Mobility-Time-of-Flight) HDMS instrument (Waters) as it is the only commercial IM-MS instrument currently available. However, we also describe several practical steps that are necessary for recording IM data of large protein complexes as well as methods for modeling and extracting quaternary structure information that apply generally to any IM-MS instrument or data set. We begin by posing a series of questions commonly encountered in this field of research.

Will separation of the most likely potential structures of a protein complex be possible?

Many protein complexes fall into structural classes that can be described with purely geometric modifiers (e.g., ring or spherical). For these cases, it is often helpful to evaluate the ability of IM to separate different protein structural archetypes before attempting more detailed data analysis (discussed below). For example, Figure 1a shows modeling data for four different protein complex topologies as a function of subunit mass and number. To generate these data, we used open source software for calculating the collision cross-section of trial structures (MOBCAL)18,19, and altered the code so that spherical representations of whole subunits could be used in place of all-atom models (see protocol). It is worth noting that the collision cross-sectional dependence of similar structural trends has been discussed in the atomic cluster literature previously20,21,22. The plot shows trend lines describing the cross-sectional increase predicted as subunits are added to four different structural types (linear, ring, stacked double-ring and close-packed) composed of small (8 kDa) subunits. Although closely similar for small subunit numbers, the four trends diverge substantially at large subunit numbers, indicating that the structural families attributed to complexes composed of a larger number of subunits are more easily distinguished by IM separation.

Figure 1: Is separation of likely potential structures possible?

(a) Simulated trends in collision cross-section for four different topologies as a function of the number of subunits. As the number of subunits increases in an assembly, structural families diverge, making ion mobility separation more facile at a fixed value for resolution. (b) Plot of subunit mass versus number of subunits comprising a protein assembly. The trends shown represent those subunit mass/number of subunit combinations where ring and collapsed structures are resolved (at half height, R = t/Δt). The numbers indicated correspond to the IM resolution necessary to resolve ring and collapsed structures along the corresponding trend. Above these lines the structures are also resolved at the indicated value of R.

The separation power of an IM device is often characterized by resolution (R). Here, we define resolution as the centroid of the drift time distribution divided by the width of the distribution at half height (t/Δt). The higher the value of R achieved, the smaller the Ω differences that can be distinguished by IM separation. In addition to the topology and number of subunits, the mass of individual subunits will also influence the resolution necessary to separate different topologies. More massive subunits will, on average, increase the overall dimensions of the final quaternary structure adopted by the complex. Figure 1b plots subunit mass against subunit number and indicates several trend-lines that refer to R values that are sufficient to separate (at half height) a ring structure from a compact collapsed assembly at a given subunit number/mass combination. For example, at an R value of 10, only those structures containing a large number of small subunits are separable, and the chief effect of increasing IM resolution is the ability to separate ring/collapsed structures comprised of fewer subunits of similar size. Note that the influence of subunit mass on the resolution required to separate protein complex topologies is more significant at lower values of R.
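
The resolution definition above (centroid of the drift time distribution divided by its width at half height) can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function names are hypothetical, the arrival time distribution is assumed to contain a single peak, and the trace is assumed to fall below half of its maximum on both sides of that peak.

```python
def _interpolate_crossing(t0, i0, t1, i1, level):
    """Time at which a straight line through two samples reaches `level`."""
    return t0 + (level - i0) * (t1 - t0) / (i1 - i0)


def im_resolution(times, intensities):
    """Return (centroid, fwhm, resolution) for a single-peak distribution.

    Resolution is computed as centroid / full width at half maximum.
    Assumes the trace drops below half maximum on both sides of the apex.
    """
    # Intensity-weighted centroid of the drift time distribution.
    total = sum(intensities)
    centroid = sum(t * i for t, i in zip(times, intensities)) / total

    # Locate the apex and the half-maximum level.
    apex = max(range(len(intensities)), key=lambda k: intensities[k])
    half = intensities[apex] / 2.0

    # Walk outwards from the apex to the first samples below half maximum.
    left = apex
    while left > 0 and intensities[left] >= half:
        left -= 1
    right = apex
    while right < len(intensities) - 1 and intensities[right] >= half:
        right += 1

    # Linear interpolation gives the two half-maximum crossing times.
    t_left = _interpolate_crossing(times[left], intensities[left],
                                   times[left + 1], intensities[left + 1], half)
    t_right = _interpolate_crossing(times[right - 1], intensities[right - 1],
                                    times[right], intensities[right], half)
    fwhm = t_right - t_left
    return centroid, fwhm, centroid / fwhm
```

Overlapping or multi-modal distributions would need peak fitting before a width is extracted; the sketch deliberately does not attempt that.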

What are the general features of IM-MS data for protein complexes?

IM-MS data are inherently three-dimensional, consisting of mass, drift time and intensity (relative abundance) data for all the ions observed23,24,25,26. A typical data set is illustrated in Figure 2, which shows a mass spectrum (a), an IM-MS contour plot (b) and a total integrated IM spectrum (c) for a series of ions corresponding to the small heat shock protein from wheat, TaHSP16.9. Signals corresponding to monomers, dimers and dodecamers are observed; however, the dodecamer is the dominant species in both the spectrum and in solution27,28,29. Often, the interplay between IM and MS data is vital for the complete analysis and utility of both dimensions of information. For example, MS data are used to define the charge and composition of the ions observed, without which it is often difficult to both normalize and interpret IM data correctly (see protocol).

Figure 2: Multidimensional IM-MS data representation and consequences of using too much acceleration voltage before IM separation.

(a) Mass spectrum compiled from all ions observed. (b) Plot of drift time versus m/z for the small heat shock protein complex (sHSP) TaHSP16.9 (aqueous solution, 200 mM ammonium acetate buffer, 2 μM protein complex), illustrating the multidimensional nature of the data produced by ion mobility–mass spectrometry. Ions within the spectrum can be assigned to charge states of the TaHSP16.9 dodecamer (202.8 kDa) ranging from 28+ to 33+ (data acquired on a Waters Synapt HDMS quadrupole–ion mobility–time-of-flight instrument; see Table 2 for typical instrument settings). (c) Ion mobility arrival time distribution for all ions observed. An exponential intensity gradient was used to generate the contour plot in b. (d) A series of three mass spectra acquired at increasing values of activating voltage (in the cone and ion trap regions before the ion mobility separation device) for the TaHSP16.9 dodecamer (202.8 kDa). The blue shaded region highlights the peak corresponding to the 31+ charge state. The dashed line indicates the centroid position of that charge state at low activation voltage (lower spectrum). As activation voltage is increased, the peaks comprising the mass spectrum decrease both in width and mass and, in the process, more closely relate to the sequence mass of the protein complex. The mass increase recorded for the ions (mass in excess of the expected sequence) was highest for low activation conditions (0.74%, lower spectrum) and lowest for high activation conditions (0.1%, upper spectrum). (e) Ion mobility arrival time distributions for the 30+ charge state of TaHSP16.9 from each of the activation voltages are shown on the right. Low activation voltages are required to observe compact states of the 12mer, and the ion increases in size dramatically at higher voltages, indicative of unfolding. As such, optimal conditions for both mass and mobility measurements for protein assemblies are often incompatible.

Generally, plots of drift time versus m/z, as shown in Figure 2b, reveal trends in the data that can be indicative of either the charge state or molecular class of the ions observed23,24,25,26. For most protein complex ions, we have observed that the signals for a charge state series corresponding to a monodisperse protein assembly display a good correlation (R2 > 0.99) to a linear relationship between drift time and m/z allowing for polydisperse samples to be identified readily30. In addition, plotting data in a format similar to Figure 2b often highlights the presence of post-mobility cell fragmentation31,32. If protein complexes are separated by ion mobility as intact assemblies and are activated before mass measurement, any fragmentation products generated will appear at the same drift time as the original parent ion. If analyzing samples containing more than one protein complex, or stoichiometry of a single protein, post-mobility cell fragmentation can lead to erroneous assignment of fragment ions as alternative protein topologies13. Therefore, three-dimensional representations of IM-MS data are of critical importance for the initial stages of interpretation.
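
As a rough illustration of the linearity check described above, the following Python sketch fits drift time against m/z for one charge state series and reports the coefficient of determination. The function names and the way the 0.99 threshold is applied are assumptions made for this example, not part of the published procedure.

```python
def linear_fit(x, y):
    """Ordinary least-squares line through (x, y); returns (slope, intercept, r2)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((xi - mean_x) ** 2 for xi in x)
    sxy = sum((xi - mean_x) * (yi - mean_y) for xi, yi in zip(x, y))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    # Coefficient of determination from residual and total sums of squares.
    ss_res = sum((yi - (slope * xi + intercept)) ** 2 for xi, yi in zip(x, y))
    ss_tot = sum((yi - mean_y) ** 2 for yi in y)
    return slope, intercept, 1.0 - ss_res / ss_tot


def looks_monodisperse(mz_values, drift_times, r2_threshold=0.99):
    """True if drift time versus m/z is linear above the chosen R2 threshold."""
    _, _, r2 = linear_fit(mz_values, drift_times)
    return r2 > r2_threshold
```

A series that fails this check is only flagged for closer inspection; the fit alone does not identify which species are present.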

How are instrument conditions balanced for optimal IM separation and mass measurement?

Previous reports have detailed the conditions necessary to observe intact protein assemblies by MS14,33,34,35,36,37. These conditions often include increased pressures in the source region of the instrument and carefully reducing the amount of 'collisional heating' (also termed, 'activation') experienced by protein complex ions to avoid dissociation38. However, as discussed previously, some activation is usually employed to desolvate the protein complex ions and achieve optimum mass accuracy39,40. Figure 2d shows a series of mass spectra for TaHSP16.9 acquired using increasing accelerating voltage into the storage ion guide located before the IM separator. The mass accuracy achieved for the protein complex, as indicated in the figure, increases under more activating conditions as expected (see protocol). The corresponding IM data recorded for the 30+ charge state of TaHSP16.9 are shown in Figure 2e. As activation energy is increased beyond a threshold value, the drift time observed for the intact protein complex increases. We attribute this increase in drift time to the generation of multiple unfolded states of the protein complex as the internal energy of the protein complex is altered by collisional activation13. These unfolded states are unlikely to correspond to any accessible solution-state structures, as gas-phase activation occurs in an environment where solvation and Coulombic forces have vastly different magnitudes compared to solution41. Therefore, to obtain drift time data consistent with solution-phase structures, careful control of the voltages used to accelerate ions before IM separation is required. In addition, to achieve both high mass accuracy and IM data consistent with solution-phase structure, data must be acquired over a range of acceleration voltages rather than a single optimized set of parameters (see protocol).

How is drift time data converted to collision cross-section?

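With careful measurement of pressure (P, in torr) and temperature (T, in Kelvin), the following equation can be used to convert drift times (tD, in seconds and corrected for time spent outside the drift cell) to collision cross-section values (Ω output in m2) using a standard drift tube separator (constant electric field):

$$\Omega = \frac{(18\pi)^{1/2}}{16}\,\frac{ze}{(k_b T)^{1/2}}\left[\frac{1}{m_I}+\frac{1}{m_N}\right]^{1/2}\frac{t_D E}{L}\,\frac{760}{P}\,\frac{T}{273.2}\,\frac{1}{N}$$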

where kb is the Boltzmann constant, z is the ion charge, e (C) is the elementary charge, mI is the mass of the ion, mN is the mass of the neutral gas (both in kg), E is the electric field strength (V/m), L is the length of the drift region (m) and N is the neutral gas number density (m−3). Several outstanding reviews have described detailed procedures for making accurate cross-section measurements using standard drift tubes8,9,17. In cases where gas purity, pressure and temperature cannot be measured accurately, calibrating the drift time measurements using ions of known collision cross-sections is preferable42. This method of calibration, rather than absolute measurement, is also preferred for travelling wave-type IM separators that use time-varying electric fields within the drift region to propel the ions toward the detector43,44. During the course of a measurement, the instrument operator can vary both the magnitude and velocity of voltage 'waves' to optimize IM separation. Over the course of analyzing many ions with known collision cross-sections, it was found that Ω was proportional to tD^X (drift time raised to the power X), where X is an empirically determined parameter that depends upon many variables, including the height and velocity of the voltage 'waves' used to propel ions through the IM separation region45.
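
For a constant-field drift tube, the conversion from drift time to collision cross-section can be sketched as follows. The code assumes the standard low-field (Mason-Schamp) form, written with the symbols defined above and with N taken as the gas number density at standard temperature and pressure because the 760/P and T/273.2 factors normalize the measurement to those conditions. The function name, argument names and constant names are hypothetical.

```python
import math

K_B = 1.380649e-23           # Boltzmann constant, J/K
E_CHARGE = 1.602176634e-19   # elementary charge, C
DA_TO_KG = 1.66053906660e-27 # kilograms per dalton
N_STP = 2.686780111e25       # number density at 273.15 K and 1 atm, m^-3


def ccs_drift_tube(t_d, z, m_ion_da, m_gas_da, e_field, length,
                   p_torr, temp_k, n=N_STP):
    """Collision cross-section in m^2 from a corrected drift time.

    t_d      drift time in seconds (time outside the drift cell removed)
    z        ion charge state
    m_ion_da ion mass in Da; m_gas_da neutral gas mass in Da
    e_field  field strength in V/m; length drift region length in m
    p_torr   pressure in torr; temp_k temperature in K
    n        gas number density at STP in m^-3 (assumption, see lead-in)
    """
    m_i = m_ion_da * DA_TO_KG
    m_n = m_gas_da * DA_TO_KG
    prefactor = math.sqrt(18.0 * math.pi) / 16.0
    charge_term = z * E_CHARGE / math.sqrt(K_B * temp_k)
    mass_term = math.sqrt(1.0 / m_i + 1.0 / m_n)
    # Inverse reduced mobility: drift time, field and length, scaled to STP.
    inverse_k0 = (t_d * e_field / length) * (760.0 / p_torr) * (temp_k / 273.2)
    return prefactor * charge_term * mass_term * inverse_k0 / n


def m2_to_square_angstrom(area_m2):
    """Convert an area from m^2 to square angstroms."""
    return area_m2 * 1e20
```

The sketch applies only to constant-field drift tubes; travelling wave drift times require the calibration approach described in the text.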

In our laboratory, we use the procedure described below (see protocol) to calibrate travelling wave drift times to collision cross-sections45,46. The output of this procedure is illustrated in Figure 3a, where calibration curves are shown at three different wave-heights (magnitude of the voltage 'wave'). The slope of the resulting calibration curve and the exponential factor X both depend upon wave 'height', as shown in Figure 3a (see protocol step 15 for detailed description of corrected drift time). Typically, we run experiments over a range of wave heights to rule out the influence of electric field (i.e., dipole alignment) on the separation. Currently, there are few potential calibrant ions that have reported collision cross-sections in excess of 3,500 Å2. Therefore, we usually calibrate the data with a mixture of protein ions (including equine myoglobin, equine cytochrome c and human ubiquitin) that define the relationship between collision cross-section and drift time for smaller ions and extrapolate this relationship to larger values. The dependence of X on the wave height or wave velocity is an active area of research. The protocol given here provides the best empirical fit to the data as well as collision cross-section measurements that are in close agreement with literature values. In general, to perform higher-confidence extrapolations to large collision cross-section values, a greater number of points is preferred to define the calibration relationship.
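
The empirical power-law relationship between cross-section and drift time can be fitted by linear regression in log-log space, as in the minimal Python sketch below. It is not the full calibration procedure: the function names are hypothetical, the inputs are assumed to be corrected drift times and literature cross-sections acquired at a single fixed wave height, and any normalization applied to the literature values before fitting is left to the caller.

```python
import math


def fit_power_law(drift_times, cross_sections):
    """Fit cross_section = A * drift_time**X; returns (A, X, r2).

    The fit is an ordinary least-squares line through
    (ln drift_time, ln cross_section); X is the slope and ln A the intercept.
    """
    lx = [math.log(t) for t in drift_times]
    ly = [math.log(o) for o in cross_sections]
    n = len(lx)
    mean_x = sum(lx) / n
    mean_y = sum(ly) / n
    sxx = sum((x - mean_x) ** 2 for x in lx)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(lx, ly))
    exponent = sxy / sxx
    intercept = mean_y - exponent * mean_x
    ss_res = sum((y - (exponent * x + intercept)) ** 2 for x, y in zip(lx, ly))
    ss_tot = sum((y - mean_y) ** 2 for y in ly)
    return math.exp(intercept), exponent, 1.0 - ss_res / ss_tot


def calibrated_cross_section(drift_time, a, x):
    """Apply a fitted calibration to a drift time recorded under the same conditions."""
    return a * drift_time ** x
```

Because the fit is extrapolated well beyond the calibrant range for large complexes, a separate fit per wave height and a check of the correlation coefficient are sensible before any value is reported.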

Figure 3: Plots used for calibrating IM drift times to collision cross-sections.

(a) Calibration curves, combining data from bradykinin (human), ubiquitin (bovine), cytochrome c (equine) and myoglobin (equine), displayed as linear plots of collision cross-section and corrected drift time (units of milliseconds raised to the power of X). Plots are shown for three magnitudes of the wave that propels the ions through the travelling wave drift cell used in the device (7 V: blue; 7.5 V: violet; 8 V: green). The average correlation coefficient for all three fits is displayed as well as the exponential factor (X) determined as the best fit for each wave height. The units of the dependent axis are determined in Step 15 of the protocol. (b) Plot of literature values for collision cross-section from references 45 and 46 against molecular mass. The plot can be used as a means of validating a drift time calibration to larger values of collision cross-section by predicting the value for a roughly spherical ion of the appropriate molecular weight. The best-fit relationship is displayed on the graph and indicates an average density for proteins of 0.48 Da Å−3 in the gas phase. Blue, violet, gray and green bars indicate the range of cross-sections recorded for all the charge states observed in the mass spectrum of TRAP 11mer, TRAP 12mer, TaHSP16.9 and TTR, respectively.

An extensive series of charge-reduced and low-charge state collision cross-sections extracted from the literature for a large data set of proteins, including the three discussed above, is shown in Figure 3b47,48. The fit shown is based on the relationship between the projected area (collision cross-section) and density-dependent mass of spherical-type protein complexes having a nearly constant packing density (ρ = 0.48 Da Å−3). We typically use the relationship shown in Figure 3b as a first approximation of the near-spherical collision cross-section of collapsed protein complexes and also to validate calibration curves for large molecular weight species (see protocol for details of validation). Although generally useful, it is important to note that the relationship shown in Figure 3b is used only as a rough guide to the collision cross-section of collapsed protein structures and supplements the molecular modeling approaches described below. Protein complexes measured by our group are plotted together with the literature data shown in Figure 3b to indicate the agreement of the calibrated data with previous results and trends. Our data are plotted as boxes to indicate the overall spread of collision cross-sections recorded from all charge states generated by nanoelectrospray ionization12.

A wide range of densities have been used in the literature to describe the packing efficiencies of gas-phase proteins and protein complexes38,49,50 and the density used here is toward the lower extreme of this range. As such, it is not surprising that some complexes exhibit collision cross-sections that appear substantially more compact than predicted by the general relationship shown in Figure 3b. For example, the collision cross-sections recorded for the higher charge states of TaHSP16.9 extend as low as 7,200 Å2, 85% of the collision cross-section predicted by the relationship shown in Figure 3b. At the other extreme, larger than expected (>130% of the predicted value) collision cross-sections for the tryptophan–RNA-binding attenuation protein (TRAP) 11mer have been interpreted as evidence for a ring-type topology for the gas-phase protein assembly12.

How is molecular modeling employed to analyze data?

A critical component of IM-MS data analysis relies upon attempting to fit computational models to the collision cross-section and mass information gathered from experiment. For small ions, a large number of computational tools can be employed to determine the structure or ensemble of structures that exist at a given internal temperature. For example, the gas-phase structures of small peptides are typically determined by using a simulated-annealing approach to exhaustively search the configuration space of the molecule, combined with energy calculations to determine the lowest-energy configuration. Energy determinations range in precision from force field–based relative energetics, for large systems, to density functional theory or ab initio approaches for smaller systems. Many varieties on this basic theme have appeared in the literature7,8,9,17,42,51,52,53,54,55,56,57,58,59.

Owing to their size, the analysis of multiprotein complexes requires a paradigm shift from the methods described above. The computational approaches currently used in our laboratory to assign structure to gas-phase multiprotein complex ions are summarized in Figure 4. Although only three tracks are shown, a continuum of structural models and approaches should be considered for modeling a protein assembly, beginning with sphere-type coarse grain models and ending with atomistic models. For our measurements, performed at an IM resolution of 10, atomistic models tend to be less useful than those that have some level of coarse-graining, as these models are more commensurate with the information content of the measurement and allow more thorough topology searching. Atomic-level structures for subunits, subcomplexes or intact assemblies are, however, useful as templates for generating coarse grain structures (see protocol) and provide an opportunity for more focused folding and docking calculations in the context of the intact complex. This approach has proved useful for assessing the extent of unfolding experienced by activated protein complexes in the gas phase (similar to Fig. 2d,e)13.

Figure 4: Decision tree describing how model structures for protein complexes can be generated and compared with collision cross-section measurements.

After measuring the minimal information required to begin the modeling process (shown top), at least three tracks are available for modeling information. The choice between these tracks depends on the amount of high-resolution structural information already available for the protein complex, computational resources available and the information required from the simulation as output (i.e., focusing on the folded state of a monomer within the complex rather than the complex as a whole). A continuum of tracks actually exists between the three shown here. For example, higher resolution coarse-graining can be achieved than the spheroid representations shown in the track highlighted in blue.

What are the limitations of IM-MS for protein complex analysis?

The calibration procedure described below imposes several limitations on the IM-MS technique for protein complexes. Although we have been able to achieve accurate results for complexes approaching 1 MDa through careful calibration and replicate measurements, the ability of the calibration protocol to produce high-precision collision cross-section measurements for very large protein complexes (>500 kDa) is limited by the current pool of calibrant ions available (see Fig. 3 and Step 12) and their associated precision (see Step 20). Calibrant ions with measured collision cross-sections larger than 3,500 Å2 would allow more precise calibrated measurements and pave the way for future IM-MS studies of large protein complexes.

Some of the most important limitations of IM-MS technology are derived from the ionization event used to generate protein complex ions. Currently, using nano-electrospray, it is extremely difficult to generate ions that correspond to hydrophobic membrane-bound protein assemblies. There have been some recent reports, however, that provide some indication that careful control of solubilizing molecules in solution and IM-MS instrument parameters may allow membrane protein complexes to be studied more widely in the future60. Clear and assignable MS information is critical for the interpretation of IM data and these and other limitations are already discussed in a previous protocol14. Given interpretable MS data, the structural information provided by IM-MS is limited by the IM resolution (R) achieved for the complex of interest. Currently, it is difficult to find examples of IM resolution >10–15 for large protein complexes13,50; however, much higher values have been reported for smaller molecules61,62. Increasing the maximum IM resolution achievable for proteins and protein complexes is an active area of research. We envision that the information provided by IM-MS could be integrated as restraints, along with other low-resolution structural information, to provide higher-fidelity structure representations of protein assemblies, including those comprised of more than one type of subunit4,5.

Materials

Reagents

  • Equine myoglobin (from horse heart) (Sigma, cat. no. M1882; 1 g, pdb: 1wla)

  • Equine cytochrome c (from horse heart) (Sigma, cat. no. C-2506; 1 g, pdb: 1hrc)

  • Bovine ubiquitin (from red blood cells) (Sigma, cat. no. U6253; 25 mg, pdb: 1ubq (Human))

  • Tetrameric transthyretin (from human plasma) (Sigma, cat. no. P1742; 1 mg, pdb: 1dvq)

  • Ar (pureshield)

  • N2 (>99.9%, vol/vol)

  • Alternative gases: Xe (>99.9%), SF6 (>99.9%) (Boc Gases)

Equipment

  • IM-mass spectrometer (e.g., Synapt HDMS, Q-IM-o-ToF with modified 32,000 m/z quadrupole option, Waters)

  • MOBCAL software, Jarrold Group, http://www.indiana.edu/~nano/Software.html (free)

  • Modified MOBCAL software, Robinson Group (free, contact by e-mail)

  • MAESTRO software, http://www.schrodinger.com/ (free to academic users)

  • PYMOL software, http://PYMOL.sourceforge.net/ (open-source access, but need to do own compilation, maintenance and support)

  • HEX software, http://www.csd.abdn.ac.uk/hex/ (free to academic or government-funded laboratories)

  • Clustering software: hierarchical clustering using stored dissimilarities, Robinson Group (free, contact by e-mail)

  • Topology search algorithm: for generating plausible structures that are intermediate between two archetypal protein topologies, Robinson Group (free, contact by e-mail)

  • The following alternative software packages can also be used in various sections of the protocol:

  • Sigma software (alternative code for cross-section calculations), Bowers Group UC Santa Barbara, http://bowers.chem.ucsb.edu/theory_analysis/cross-sections/sigma.shtml (free, contact by e-mail)

  • VMD, http://www.ks.uiuc.edu/Research/vmd/ (free with registration)

Procedure

Optimization of IM separation for large protein complexes

  1.

    As a test complex, we recommend using transthyretin (TTR, tetramer, 56 kDa). A solution of this protein complex should be prepared at a concentration of 5–10 μM in aqueous ammonium acetate solution for subsequent steps of this protocol. Refer to the associated Nature Protocols article14 to generate appropriate protein solution and spray conditions for observing intact protein complexes by MS. Note that activation to increase mass resolution and accuracy can be detrimental to IM data, as noted above.

  2.

    Choose a neutral gas for use in the IM separation and activation (Trap/Transfer) areas of the instrument. More massive gases will provide longer drift times (if used in the IM region of the instrument) and larger activation energies (if used in the Trap/Transfer region of the instrument). In our experiments, we use N2 for IM separation and Ar for the Trap/Transfer region. Benefits of using a heavier gas (e.g., Xe or SF6) in the Trap/Transfer region have been reported63, but these gases are substantially more expensive.

  3.

    Start from the standard settings given in Table 1 for the focusing/IM separation voltages. These settings were derived on our instrument by optimization using the TTR sample discussed in Step 1 and should provide you with adequate IM resolution to begin optimization.

    Table 1 Standard instrument conditions for the Synapt HDMS (Q-IM-o-ToF) for analyzing large protein complexes

  4.

    Ion mobility–mass spectrometry for TTR has been published previously13, and the settings described in Table 1 should provide a spectrum similar to those data. The IM resolution of the features observed should be roughly as shown in Figure 2 (R ≈ 10, t/Δt). If multiple features for a given peak are observed with anomalously narrow peak widths (R > 30, t/Δt), refer to the TROUBLESHOOTING table. See Figure 5 and related discussion in the 'Anticipated results' section for a pictorial representation of common problems.

    Figure 5: Optimizing ion mobility separations.

    (a) Normal spectrum acquired for a concentrated CsI solution (100 mg ml−1 in water) with clusters covering a mass range from 500 to 10,000 m/z acquired using a static wave height of 7 V. (b) The same sample and conditions as in panel a; however, in this case, the transfer wave velocity and ToF pusher frequency are nearly synchronous, resulting in artificial 'beats' or 'waves' in the resulting ion mobility separation. (c) The same instrumental conditions as in panel a; however, wave height used for the IM separation is halved. In this case, the ions observed during the first portion of the separation shown are the same as those observed in the tailing edge of the ion mobility data shown in panel a.

    Troubleshooting

  5.

    From the initial values used to generate a stable signal for the protein complex, lower the following voltages: cone voltage, extractor cone, bias voltage and trap collision voltage, and monitor the centroid of the drift time distribution for the ions of interest. Activation of the complex should be minimized to avoid unfolding of the protein complex. The above voltages should be decreased (in 10 V steps) until the drift time observed for the ions of interest does not change by more than 3% between steps.

    Critical Step

    This step is crucial to avoid measurement of ions having undergone gas-phase unfolding.
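
The stopping rule in this step (lower the voltages in 10 V steps until the drift time changes by no more than 3% between steps) can be expressed as a small helper for logged data. The sketch below is an illustration only: the function name is hypothetical, the inputs are assumed to be lists recorded in the order the voltages were stepped down, and the relative change is measured against the previous step.

```python
def first_stable_voltage(voltages, drift_centroids, tolerance=0.03):
    """Return the first voltage whose drift time centroid differs from the
    previous step by no more than `tolerance` (fractional), or None.

    voltages        settings in the order acquired (decreasing activation)
    drift_centroids drift time centroid recorded at each setting
    """
    for k in range(1, len(voltages)):
        previous = drift_centroids[k - 1]
        current = drift_centroids[k]
        # Fractional change relative to the preceding voltage step.
        if abs(current - previous) / previous <= tolerance:
            return voltages[k]
    return None
```

A return value of None indicates that the drift time was still changing at the lowest voltage tested, so further reduction (or a check of signal stability) would be needed.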

  6.

    Remove the wave height ramping function described in Table 1 and attempt to separate the complex at a fixed wave height and wave velocity value.

    Critical Step

    This step is important for calibration (see 'Analyzing IM-MS Data with Molecular Modeling', below), as the current calibration procedure for the instrument requires input data acquired at fixed wave heights.

  7.

    From the minimum value of the voltages determined in Step 5, increase the Trap collision voltage in a stepwise manner, recording data at 10 V intervals. Continue until the signal-to-noise ratio for the intact protein complex is lower than 3:1, due to the gas-phase dissociation of the intact complex38. This step, along with Step 5, ensures that both the optimum IM conditions (at low Trap collision voltage) and MS resolution/mass accuracy (at high Trap collision voltage) are achieved (see Fig. 2). Note, at this point, higher mass accuracy and desolvation can often be achieved without losing the IM resolution achieved in Steps 4–6 by activating ions in the Transfer region of the instrument. As, by this method, activation takes place after IM and before MS analysis, IM resolution will remain unaffected.

Calibration of travelling-wave IM drift times

  8.

    Prepare calibrant solutions by diluting equine cytochrome c, equine myoglobin and bovine ubiquitin in 49:49:2 methanol/water/acetic acid at a concentration of 10 μM.

  9.

    Record IM-MS data for an unknown protein complex over a range of different wave heights to separate the ions. This step is designed to indicate the presence of any field-strength-dependent effects on the drift of the ions under investigation.

  3. 10

    Use precisely the same instrument conditions (including pressures) for all elements downstream of the trapping ion guide to acquire data for the three calibrant proteins, the solutions for which were generated in Step 8.

    Critical Step

    Altering voltages that affect the recorded drift time (i.e., any element of the IM separation stage or post-IM ion transfer stage) between calibration runs and measurements of unknowns can cause significant errors in calibrated measurements.

  4. 11

    Correct calibrant drift times (acquired using a single wave-height value) for mass-dependent flight time, calculated by the equation

    $$t'_D = t_D - \frac{C\sqrt{m/z}}{1000}$$

    where $t'_D$ is the corrected drift time in ms, $t_D$ is the experimental drift time in ms, $m/z$ is the mass-to-charge ratio of the observed ion and $C$ is a constant. The constant C can be found within the control software of the Synapt Q-IM-o-ToF instrument and is designated as the 'EDC (Enhanced Duty Cycle) delay coefficient'. C varies slightly from instrument to instrument, typically between 1.4 and 1.6. Note that the C values from the EDC setup are only valid if the exit lens of the transfer T-wave guide and the transfer ion optics are set as they were when the EDC calibration was set up. To ensure this, once the system has been tuned for best mobility performance, select the Trapping tab from the tune page and select 'Use EDC'. Next, insert the Transfer DC Exit voltage value from the TriWave DC tab view into the 'Extract Height (V)' setting. Put the m/z value of a calibrant ion species into the EDC mass box. Now open the 'Acquisition settings' window from the 'System' menu on the tune page. While monitoring the intensity of the selected m/z ion on the tune page, adjust the 'EDC Delay coefficient' (the C value) to maximize the signal intensity. Once this has been achieved, the coefficient can be used in the correction term to get a reasonable approximation for the m/z-dependent flight time from the transfer T-wave guide to the pusher. The time required for ions to transit the Transfer region is also included in all recorded drift times. This time can be calculated by dividing the length of the transfer T-wave (10 cm) by the wave velocity (indicated in the software), and it should be subtracted from the total drift time recorded.

  5. 12

    Take calibrant collision cross-sections (Ω, see Table 2) and correct them for both ion charge state and reduced mass (μ) to generate Ω′, where $\Omega' = \Omega / [\text{charge} \times (1/\mu)^{1/2}]$.

    Table 2 Commonly used calibrant ions and their collision cross-sections47.
  6. 13

    Create a plot of $\ln t'_D$ against $\ln \Omega'$, where $t'_D$ is the corrected drift time from Step 11.

  7. 14

    Fit the plot to a linear relationship of the form $\ln \Omega' = X \times \ln t'_D + \ln A$, where A is a fit-determined constant and X is referred to as the 'exponential factor' in this protocol. The correlation coefficient of the fit achieved in this step should be high (R² > 0.98).

  8. 15

    Re-plot Ω versus a new corrected drift time ($t''_D$), where $t''_D = (t'_D)^X \times \text{charge} \times (1/\mu)^{1/2}$. As in Step 14, the correlation coefficient of the calibration plot should be high (R² > 0.98). An anomalously low correlation coefficient could indicate an error in a previous step in the calibration protocol (Steps 8–15; see Fig. 3a for examples).

    Critical Step

    This approach yields a linear calibration plot relating literature cross-section to T-wave drift time, which is straightforward to extrapolate for measurements of large protein complex ions.

  9. 16

    Use the plot generated in Step 15 to calibrate drift time data for unknowns.

  10. 17

    Repeat Steps 11–16 for every wave-height value used to separate protein complex ions of interest.

  11. 18

    Validate this calibration curve against the trend shown in Figure 3b. If the cross-sectional values obtained are significantly smaller (<65%) or significantly larger (>180%) than those predicted by the plot shown in Figure 3b, refer to the TROUBLESHOOTING table.

    Critical Step

    These upper and lower bounds are based both on our experience with a wide range of protein complexes and on their average packing densities in the gas phase, and they provide a crucial check of whether the calibration procedure has been correctly applied. It is important to note that the upper and lower bounds indicated may not hold for very high molecular weight species (>1 MDa).

    Troubleshooting

  12. 19

    Attempt a collision cross-section measurement of TTR. Using Steps 8–17, we achieve a measurement of 2,900 ± 213 Å² for the 14–16+ charge states (ref. 13).

  13. 20

    Estimate the error of the collision cross-section measurement. The total error of the calibrated collision cross-sections generated using Steps 8–17 above should be estimated and reported along with the average measurement. We assume that the total error of the calibrated measurement is a sum of the standard deviation of three or more replicate measurements (reproducibility, ER), the average error of the calibration curve generated in Step 15 (ECal) and the error carried by the protein standards used to calibrate the drift times of the unknowns (ES, assumed to be 1%)64. The term ECal estimates the error associated with converting between measurements in N2 and calibrant IM measurements in He. Provided that the relative cross-sectional differences between ions are the same in the two gases, the error in this step should be minimal (<2.5%)65. In our experience, the total error (ET) is typically between ±5% and ±8% for protein complexes up to 500 kDa.

    Critical Step

    It is important to include error when comparing the measurement against calculated collision cross-section values for model structures.
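The arithmetic of Steps 11–16, 18 and 20 can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the instrument software or the authors' own scripts: the flight-time correction is assumed to have the form C√(m/z)/1000 in ms, the reduced mass is assumed to be that of the ion with nitrogen (28 Da), the wave velocity is assumed to be entered in m/s, and all function names and example numbers are hypothetical. The example data are synthetic, generated from a known power law, so they are not real calibrant values.

```python
import math

N2_MASS = 28.0  # Da; assumption: nitrogen is the drift gas


def reduced_mass(ion_mass, gas_mass=N2_MASS):
    """Reduced mass (Da) of the ion / drift-gas pair."""
    return ion_mass * gas_mass / (ion_mass + gas_mass)


def transfer_transit_ms(wave_velocity_m_per_s, length_m=0.10):
    """Transit time (ms) through the 10 cm Transfer T-wave."""
    return 1000.0 * length_m / wave_velocity_m_per_s


def corrected_drift_time(t_d, mz, c, transfer_ms=0.0):
    """Step 11: subtract the m/z-dependent flight time (assumed form
    C*sqrt(m/z)/1000, in ms) and, optionally, the Transfer transit time."""
    return t_d - c * math.sqrt(mz) / 1000.0 - transfer_ms


def corrected_ccs(omega, charge, mu):
    """Step 12: Omega' = Omega / (charge * sqrt(1/mu))."""
    return omega / (charge * math.sqrt(1.0 / mu))


def fit_power_law(t_primes, omega_primes):
    """Steps 13-14: least-squares fit of ln(Omega') = X*ln(t') + ln(A).
    Returns (X, A, R squared)."""
    xs = [math.log(t) for t in t_primes]
    ys = [math.log(o) for o in omega_primes]
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, math.exp(intercept), sxy ** 2 / (sxx * syy)


def calibrated_ccs(t_prime, charge, mu, x, a):
    """Steps 15-16: Omega = A * t'', with t'' = t'^X * charge * sqrt(1/mu)."""
    return a * (t_prime ** x) * charge * math.sqrt(1.0 / mu)


def within_expected_bounds(measured, predicted, low=0.65, high=1.80):
    """Step 18: flag values outside 65%-180% of the predicted trend value."""
    return low * predicted <= measured <= high * predicted


def total_error_percent(e_r, e_cal, e_s=1.0):
    """Step 20: total error as a simple sum of the three terms (percent)."""
    return e_r + e_cal + e_s


# Synthetic calibration points (NOT real calibrant data).
synthetic_t = [2.0, 4.0, 6.0, 8.0]
synthetic_omega_prime = [300.0 * t ** 0.6 for t in synthetic_t]
x_fit, a_fit, r_squared = fit_power_law(synthetic_t, synthetic_omega_prime)
```

Because the synthetic points lie exactly on a power law, the fit returns the exponent and prefactor used to generate them, with a correlation coefficient of essentially 1. With real calibrant data, the R² > 0.98 criterion from Steps 14 and 15 would be checked against `r_squared`, and the whole procedure would be repeated for each wave height as described in Step 17.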

Analyzing IM-MS data with molecular modeling

  1. 21

    Generate a modified version of MOBCAL18,19. Currently, we use two different versions of the MOBCAL code. The first is used for determining the collision cross-section of all-atom representations of a complex and requires few alterations to the code. Increase the 'len' variable so that the program can accept coordinate files that contain large numbers of atoms (len should be greater than or equal to the number of atoms in the coordinate file). We also alter the number of iterations used by the program to calculate cross-sections based on a single geometry file, as discussed in the accompanying instructions (mobcal.txt). The second version of the code is designed to determine the collision cross-section of coarse-grained (or hybridized) models of a complex. To generate this modified version of the code, use the following procedure: (i) Open mobcal.f (source code) in a standard text editing program. (ii) On line 583 of the program, the first of two sections that define atom masses and sizes begins. Replace the atom radii and masses with the radius and mass of the subunits comprising the complex of interest. The same 'atom' radii and masses need to be modified in the second section, which begins on line 2,601 of the program code. (iii) Remove the trajectory method calculation from the code by placing a 'c' in front of line 338, which turns that line into a comment. Typically, we report the projection approximation value as the estimated collision cross-section of the model. We assume that scattering phenomena are accounted for adequately by appropriate selection of subunit radii.

    Subunit radii can be estimated in several ways. In the past, we have had success in estimating the radius either (a) from a coordinate file corresponding to a high-resolution structure (e.g., a crystal structure) of a monomeric unit of the complex, in which case it is recommended to perform a trajectory method calculation with MOBCAL to generate an average radius for the subunit, or (b) from the size of a compact monomer based on the average density relationship shown in Figure 3b, if no subunit structure is available.

    Critical Step

    Without this program, it will be difficult to quantitatively compare measured values to model structures.

  2. 22

    Generate a coordinate file that corresponds to potential structure(s) of the protein complex under investigation. This coordinate file can be of any file format initially, but for compatibility with MOBCAL, it is recommended that the information other than Cartesian coordinates (e.g., secondary structure notation or subunit designations commonly found in .pdb files) be kept to a minimum. In some cases, we generate simplified coordinate systems by arranging carbon atoms at defined distances using packages such as MAESTRO or PYMOL. This approach often streamlines the process, as tools that define the symmetries of the coordinate systems generated are already built into the software (preferred to generating a coordinate system manually or with a simplified algorithm).

  3. 23

    Modify the coordinate files generated in Step 22 for compatibility with MOBCAL. There is an example input file packaged with MOBCAL for comparison. Your input file must match this format exactly (e.g., spaces between number columns).

    Critical Step

    Many MOBCAL errors can be traced to incorrect formatting of the input file.

  4. 24

    Run MOBCAL to calculate the collision cross-section of the input model. Refer to TROUBLESHOOTING table for common errors encountered.

    Troubleshooting

  5. 25

    Evaluate the agreement of the model with experimental data.

  6. 26

    Generate additional models for comparison against experimental data. This can be accomplished in a variety of ways. Different possible architectures of a protein assembly can be assembled based on biochemical data or literature sources. In some cases, this step involves docking several proteins of known solution structure together in different orientations. Manual docking can be accomplished using the Hex program, VMD, or other packages. Ideally, full docking calculations would be used to assess the stability of the final structure; however, this may not be possible in all instances (e.g., when high resolution data are not available for all subunits). IM-MS data can be used to test consistency against one of several hypotheses for a given protein complex structure. Molecular dynamics can also be used to generate potential candidate structures. We have developed a software approach that generates a large number of potential quaternary structures that are intermediate to two likely candidate structures (see ref. 12 for a complete description of the program, available on request). Typically, the output of this program is used in conjunction with clustering algorithms to generate a family of candidate structures that best fit the data.
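As an illustration of Steps 22 and 25, the sketch below generates coordinates for a simple coarse-grained ring model (one sphere center per subunit) and then filters candidate models by whether their calculated cross-sections fall within the experimental error. It is a hedged example only: the function names, the neighbor spacing and the cross-section values are hypothetical, the output is a plain list of Cartesian coordinates rather than a MOBCAL input file, and the calculated cross-sections are assumed to come from a separate MOBCAL run.

```python
import math


def ring_coordinates(n_subunits, neighbor_distance):
    """Centers of n spheres on a planar ring (z = 0), spaced so that
    adjacent centers are `neighbor_distance` apart (same units out as in)."""
    ring_radius = neighbor_distance / (2.0 * math.sin(math.pi / n_subunits))
    coords = []
    for k in range(n_subunits):
        angle = 2.0 * math.pi * k / n_subunits
        coords.append((ring_radius * math.cos(angle),
                       ring_radius * math.sin(angle),
                       0.0))
    return coords


def consistent_models(measured_ccs, relative_error, model_ccs):
    """Return the names of models whose calculated cross-section lies within
    measured_ccs * (1 +/- relative_error)."""
    lower = measured_ccs * (1.0 - relative_error)
    upper = measured_ccs * (1.0 + relative_error)
    return [name for name, ccs in model_ccs.items() if lower <= ccs <= upper]


# Hypothetical 11-subunit ring with 25 A between neighboring subunit centers.
ring_11 = ring_coordinates(11, 25.0)

# Hypothetical measured and calculated cross-sections (A^2), 8% total error.
matches = consistent_models(5000.0, 0.08, {"ring": 5200.0, "collapsed": 4300.0})
```

With these made-up numbers the acceptance window is 4,600–5,400 Å², so only the 'ring' model is retained. The same filter can be applied to a larger family of candidate structures, using the total error estimated in Step 20 as `relative_error`.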

Troubleshooting

Troubleshooting advice can be found in Table 3.

Table 3 Troubleshooting table.

Anticipated results

While performing IM-MS experiments, it is possible to generate IM separation artifacts. Two such artifacts, unique to the Synapt T-wave IM separation device, are illustrated in Figure 5. Figure 5a shows an optimized trace for all drift times recorded for a solution of CsI using a wave height of 7 V to propel the ions through the IM separation stage. In Figure 5b, the Transfer wave velocity has been altered such that it is nearly synchronous with the frequency of the ToF extraction region. This results in a spectrum containing multiple artificial 'beats' or 'waves', as are visible in Figure 5b. As discussed above, this spectrum can be corrected by adjusting the frequency of the Transfer T-wave such that it is approximately one-sixth of the frequency of the ToF extraction region. Another IM artifact, which can be observed if the instrument voltage settings are not carefully controlled, is shown in Figure 5c. If the wave height used to generate the IM data shown in Figure 5a is lowered by a factor of 2, the drift time plot changes its appearance dramatically (Fig. 5c). In this drift time plot, some ions appear at relatively short drift times that were not apparent in Figure 5a. This is because the wave height is too low to effect efficient separation over the IM separation time allowed for this experiment (50 ms), and some ions (those on the trailing edge of the distribution in Figure 5a) require more than one allotment of IM separation time to travel to the orthogonal extraction region of the ToF in Figure 5c. The apparent 'beats' or 'waves' in the data shown in Figure 5c are the result of poor signal-to-noise ratio rather than any additional artifact of unoptimized IM separation.
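The second artifact can be pictured with a very simple model: if an ion needs longer than the allotted IM separation time to reach the ToF, it is recorded in a later separation cycle and appears at a short apparent drift time. The Python sketch below treats this as a plain modulo operation on the 50 ms window. That is a simplifying assumption made for illustration, not a description of the instrument's acquisition logic, and the function name and example drift time are hypothetical.

```python
def apparent_drift_time(true_drift_ms, window_ms=50.0):
    """Return (apparent drift time in ms, number of extra separation cycles)
    for an ion whose true transit time may exceed the separation window."""
    extra_cycles, apparent = divmod(true_drift_ms, window_ms)
    return apparent, int(extra_cycles)


# Hypothetical ion that needs 62 ms to traverse the IM stage.
apparent, cycles = apparent_drift_time(62.0)
```

In this simplified picture, an ion needing 62 ms would be recorded at an apparent drift time of 12 ms in the following cycle, which is how slow ions can appear at the start of the drift time plot when the wave height is too low.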

We have observed that IM-MS is particularly powerful in the analysis of protein complexes for both (i) removing chemical noise from structural information and (ii) performing structural studies on protein complexes. With regard to point (i), proteins are often prepared in buffers containing salts and other stabilizing agents. These molecules generate chemical noise in MS experiments that is difficult to remove without extensive liquid or solid phase separations. Figure 6a–c shows an example where IM-MS is used to remove unwanted background ions from the mass spectrum of a sample of human TTR. The mass spectrum shown in Figure 6b illustrates what mass spectrometry alone would reveal about the sample. Although the mass spectrum contains a large amount of ion signal, there are no well-defined peaks. Figure 6a shows the same data, but as a contour plot that reveals both the IM and MS dimensions (as described in Fig. 2). By analyzing this three-dimensional data set, it is apparent that the unstructured signal observed in the mass spectrum exhibits broad trends in the plot of tD versus m/z shown in Figure 6a. The chemical noise, observed at higher drift times, can be identified as low-charge-state salt clusters, whereas the signals for the more highly charged TTR ions are observed at shorter drift times than salt clusters of similar m/z. The TTR ion signal can be selected from the three-dimensional data set and plotted separately, as shown in Figure 6c. The data are now suitable for mass measurement and assignment. Figure 6a–c demonstrates the power of IM-MS for assessing protein complexes in a wider range of buffer components than are currently accessible by MS alone.

Figure 6: Expected separation and structural results from IM-MS of protein complexes.

(a) Plot of drift time versus m/z for human tetrameric transthyretin (MW = 56 kDa), prepared in an ammonium acetate buffer contaminated with sodium chloride and left unpurified after recovery from a cellular matrix (recombinant expression). Trends in the data are evident, including several bands at high drift time and 3–4 well-defined signals at low drift time (violet shaded area). (b) Total integrated mass spectrum for all ions. No well-defined peaks are observed for tetrameric transthyretin. (c) A mass spectrum compiled from just those ions that fall within the shaded area from panel a. The data shown here are sufficient for a high-confidence mass measurement and assignment of the signals as originating from tetrameric transthyretin (mass measured = 55.85 ± 0.1 kDa). (d) Mass spectrum of the apo tryptophan RNA-binding attenuation protein (TRAP) complex in both the 11- and 12-membered state. The charge state series for the 11mer includes the 19–23+ ions, whereas the 12mer distribution extends from 21+ to 25+. Although some overlap between the two species is evident in the spectrum (most notably for the higher charge state species), mass spectrometry data and ion mobility data can be recorded for several separated charge states of the two protein assemblies. (e) Ion mobility drift time distributions converted to a collision cross-section axis for selected peaks from panel d. Data for the 11mer are shown on top (blue) and for the 12mer on the bottom (violet). Dashed lines represent the model structures shown on the figure: collapsed 11mer (green), ring 11mer (blue), collapsed 12mer (yellow) and ring 12mer (purple). Like the 11mer, the 12mer forms a stable ring structure that is observed in the gas phase, although higher charge states of the complex form collapsed structures.

Figure 6d,e highlights the power of IM-MS data for performing structural studies on protein complexes (point (ii) above). These data, on TRAP, show evidence of both 11- and 12-membered protein assemblies, as observed previously66. IM-MS allows structural assignment of both the 11-mer and 12-mer even though both complexes coexist in solution. The precise set of solution conditions that favor the formation of dodecameric TRAP is currently unknown; however, we have observed dodecamer ions principally from samples analyzed after extended periods at room temperature (20 °C). Our data suggest that the 12-member TRAP complex is also ring-like (similar to the undecamer) at low charge states (21+). In addition, higher charge states appear to have progressively collapsed structures. Interestingly, this collapse occurs at similar charge-per-subunit values for the two complexes (between 1.72 and 1.81 for the undecamer and between 1.75 and 1.83 for the dodecamer)12. Overall, the data presented in Figure 6d,e illustrate one of the primary advantages of an IM-MS approach to protein structure determination relative to classical spectroscopic methods. Classical tools of structural biology, for example, nuclear magnetic resonance and X-ray crystallography, require homogeneous protein complexes containing mono-disperse protein populations in high concentrations (millimolar in some cases). The IM-MS technique described in this protocol has neither of these limitations and is applicable to heterogeneous, dynamic protein complexes.
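The charge-per-subunit values quoted above follow from dividing the ion charge by the number of subunits. The short Python sketch below reproduces that arithmetic. Pairing the quoted ranges with the 19+/20+ ions of the undecamer and the 21+/22+ ions of the dodecamer is an inference from the numbers, not something stated explicitly in the text, and the function name is hypothetical.

```python
def charge_per_subunit(charge, n_subunits):
    """Average number of charges carried per subunit of the complex."""
    return charge / n_subunits


# Assumed pairing of charge states with the quoted ranges (an inference).
undecamer = [charge_per_subunit(z, 11) for z in (19, 20)]
dodecamer = [charge_per_subunit(z, 12) for z in (21, 22)]
```

Under that assumption, the undecamer values are 19/11 ≈ 1.727 and 20/11 ≈ 1.818, and the dodecamer values are 21/12 = 1.75 and 22/12 ≈ 1.833, which match the quoted ranges when truncated to two decimal places.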