2. Journal of Proteome Research Article
http://isoglyp.utep.edu], but the validity of predicted sites must be questioned when experimental confirmation is lacking.

Glycoproteomic techniques aimed at mapping O-glycosylation sites have recently been introduced as powerful tools for structural characterization of native glycoproteins.3,12,13,29 Darula and Medzihradszky used the jacalin lectin, recognizing GalNAcα1-O- of the core 1 structure, to purify tryptic O-glycopeptides and identified 23 O-glycosylation sites from bovine serum glycoproteins.13,30 Recently, they expanded their list to include 125 O-glycosylation sites by using prefractionation steps at both the glycoprotein and glycopeptide levels.31 Steentoft et al. used zinc-finger nuclease-induced knockout of the core 1 Gal T1 chaperone cosmc, inhibiting further elongation of GalNAc-O- precursor substrates, for studies of O-glycosylation in Simple cell cultures.3 Additionally, they employed Vicia villosa agglutinin (VVA) lectin chromatography to enrich GalNAc-modified tryptic glycopeptides and identified more than 350 O-glycosylation sites.3 Another glycoproteomics approach is the use of TiO2 solid phases for the enrichment of sialylated glycopeptides in combination with peptide N-glycosidase F (PNGase F) treatment to release formerly N-glycosylated peptides from the solid phase.32 The usability of this methodology for the purification of O-glycopeptides has yet to be demonstrated. We initially developed a protocol for sialic acid capture and release of both N- and O-glycoproteins/glycopeptides from clinical samples12 using hydrazide chemistry.33 Mild periodate oxidation was used to introduce an aldehyde on sialic acid terminated glycoproteins, which were then covalently captured onto hydrazide beads and trypsin digested, and finally, tryptic glycopeptides were released by formic acid hydrolysis of the acid-labile sialic acid glycosidic bond. Using liquid chromatography coupled to tandem mass spectrometry (LC−MS/MS) for the glycopeptide analyses, we identified desialylated glycans of 36 N- and 44 O-glycosylation sites on human cerebrospinal fluid (CSF) glycoproteins.12 We also used this method to identify desialylated glycans of 58 N- and 63 O-glycosylation sites from human urine samples.34 The N-glycan structures were essentially all of the complex type, and the O-glycans were mainly of the core 1 type. For these CSF and urine samples, the presence of abundant N-glycopeptides was prominent in the ion chromatograms and reduced the likelihood of fragmenting less abundant coeluting O-glycopeptides.

To specifically study the site-specific O-glycosylation of CSF proteins, we have now included a pretreatment step using PNGase F to selectively remove N-glycans from native glycoproteins and thus facilitate the selective MS analysis of O-glycopeptides. We have now also developed an automated protocol to search for the HexHexNAc-O-substituted peptides using the Mascot search engine. For the assignment of specific Ser/Thr/Tyr glycosylation site(s) for peptides containing multiple hydroxylated amino acids, we used electron-capture dissociation (ECD) and electron-transfer dissociation (ETD) to allow for selective peptide backbone fragmentation of O-glycopeptides.

■ MATERIALS AND METHODS
The CSF samples (10 mL, n = 8) were taken on the suspicion of infection but were, upon analysis, found to have normal white blood cell count and blood brain barrier function. The samples were collected by lumbar puncture and were centrifuged at 1800g for 10 min within 30 min after sample collection, aliquoted (1 mL fractions), and stored at −80 °C pending analysis. The aliquots of the CSF samples were deidentified, that is, all patient information was removed, before use in this study. The use of deidentified clinical samples for method development is in agreement with Swedish law, and the study was permitted by the head of the Clinical Chemistry laboratory, Sahlgrenska University Hospital (Dnr 797-550/12).

PNGase F Pretreatment and Sialic Acid Capture-and-Release Protocols
Aliquots of CSF samples (1 mL) were dialyzed against water using membranes with a 12−14 kDa molecular-weight cutoff (MWCO) (Spectrum Lab) (n = 2) or desalted on Sephadex PD-10 columns (GE Healthcare) (n = 6). The samples were lyophilized, dissolved in 50 μL of water, and subjected to PNGase F treatment according to the manufacturer’s protocol (New England Biolabs). The samples were denatured at 50 °C for 10 min in the glycoprotein denaturing buffer. Temperatures higher than 60 °C should be avoided because of the risk of irreversible sample denaturation. G7 buffer, NP40, and PNGase F were added, and the samples were incubated at 37 °C for 16 h. The samples were then desalted against water using 10 kDa MWCO microdialysis (Pierce). Finally, the samples (100−200 μL) were subjected to sialic acid capture and release for the enrichment of desialylated glycopeptides, as described elsewhere.12

Liquid Chromatography−Mass Spectrometry
Mass spectrometric analysis was performed essentially as described in ref 12. In short, samples were dissolved in 20 μL of 0.1% formic acid and separated by nano-liquid chromatography on a 150 × 0.075 mm C18 reverse-phase column (Zorbax; Agilent Technologies) in 50 min for elution of narrow chromatographic peaks and 120 min for broader peaks, with a gradient from 0 to 50% acetonitrile in 0.1% formic acid at a flow rate of 200−300 nL/min. The eluting peptides were allowed through a nano-ESI source to a hybrid linear quadrupole ion trap/FT ion cyclotron resonance (ICR) mass spectrometer equipped with a 7 T magnet (LTQ-FT; Thermo Fisher Scientific). All spectra were acquired in positive-ion mode, and the mass spectrometer was operated in the data-dependent mode to automatically switch between MS1, MS2, and MS3 acquisition. The FTICR precursor scan was acquired at an isotopic resolution of 50000, and the most intense ion was isolated and fragmented in the linear ion trap (LTQ) using a normalized collision energy of 30%. For each MS2 spectrum, the five most intense fragment ions were sequentially selected for CID fragmentation in MS3. A repeat count of two was used, and ions were then dynamically excluded for 180 s. For ECD, the precursor ions were guided to the ICR cell and fragmented. The most abundant ion from an inclusion list, obtained by initial use of the CID−MS2/MS3 approach, was selected for fragmentation and irradiated with low-energy electrons produced by an emitter cathode for 80 ms using an arbitrary energy setting of 4 or 5 in duplicate fragmentation events.

For higher-energy collision dissociation (HCD) and ETD, we used Orbitrap Velos and Orbitrap XL instruments (Thermo), respectively. The reverse-phase C18 chromatography and ESI interface setups were as previously described.35 The MS run times were 70 min, and the gradient ranged from 0 to 40% acetonitrile in 0.1% formic acid. For the Velos Orbitrap experiments, the MS1 precursor scans and CID−MS2 spectra were acquired with an isotopic resolution of 30000 and 7500, respectively, in the Orbitrap. The software could thus assign the charge states of MS2 peaks, which was necessary for attaining data-dependent CID−MS3 transitions from the five most abundant peaks in each MS2 spectrum. The CID−MS3 spectra
574 dx.doi.org/10.1021/pr300963h | J. Proteome Res. 2013, 12, 573−584
were acquired as profile data in the LTQ. The normalized collision energies for CID−MS2 and −MS3 were set to 30%, and the minimum signal intensities for data-dependent triggering of CID were set to 10000 and 500 counts in the MS1 and MS2 steps, respectively. Also, one HCD-MS2 spectrum was acquired on the Orbitrap Velos, after the MS3 events, at a normalized collision energy of 40%. For the ETD experiments, using the Orbitrap XL, the normalized collision energy was set to 35% and the activation time was 200 ms. For each MS1 spectrum, at a resolution of 30000, three ETD spectra were collected, and the minimum signal required was set to 100000 counts. The ETD spectra were collected either as profile data from the Orbitrap, at a resolution of 7500, or as centroided data from the LTQ.

Analysis of MS Data
The LC−MS/MS files from CID acquisitions were converted to Mascot generic format (.mgf) using the Raw2msm application.36 The top 12 peaks per 100 Da were selected, and MS3 spectra were included. The in-house Mascot server was accessed through Mascot Daemon (version 2.3.0), and searches were performed with the enzyme specificity set to Trypsin and then changed to Semitrypsin. The human sequences of the Swiss-Prot database were searched (20249 sequences; January 25, 2011), but then the NCBI database (16392747 sequences; December 27, 2011) was used to account for sequence variations. HexHexNAc (365.1322 Da) on Ser, Thr, and Tyr residues was set as a variable modification together with neutral loss of HexHexNAc and Hex (162.0528 Da) for scoring purposes and from the “peptide” to account for neutral loss of HexHexNAc and Hex from the precursor. Alternatively, Hex2HexNAc2 (730.2644 Da) and HexHexNAc2 (568.2116 Da) on Ser, Thr, and Tyr residues were set as variable modifications together with neutral losses of the same masses in separate searches. Other variable modifications were Asn-to-Asp conversion (+0.9840 Da), methionine oxidation, and loss of NH3 for peptides with N-terminal Gln and N-terminal carbamidomethyl-Cys. Carbamidomethyl-Cys was set as a fixed modification. The Instrument setting of ion trap was selected. Peptide tolerance was set to 10 ppm, and fragment tolerance was set to 0.6 Da. All MS2 and MS3 spectra of Mascot-proposed O-glycopeptides were manually checked to contain the anticipated HexHexNAc-O- or (HexHexNAc-O-)2 structures and were further investigated for matches that pinpointed the glycan to a specific Ser/Thr/Tyr residue within the peptide.

The ECD and ETD spectra were converted and aggregated using Mascot Distiller (version 2.3.2.0, Matrix Science), and the ions were presented as singly protonated in the output Mascot file. Search parameters were set as described above, except that the fragment tolerance was set to 0.03 Da, no neutral losses were allowed for the HexHexNAc modification, and the Instrument parameters were set to consider c, z, and z + 1 ions. Also, the precursor ion masses of ECD and ETD spectra were matched manually to those of glycopeptides that had been identified by the automated Mascot search protocol. The MS-product tool from Protein Prospector (http://prospector.ucsf.edu) was used to prepare peak lists of c and z ions for glycopeptide matches, and O-glycosylation sites were pinpointed to unique Ser/Thr/Tyr residues by tracing c and z ions that included or lacked HexHexNAc-O- modifications.

Data Analysis of Glycosylation Sites
Glycosylation sites of human proteins in the Uniprot knowledge base (UniprotKB) database were compiled by specifying the topic glycosylation in the FT line, the search terms (GalNAc...) and (HexNAc...), and experimentally verified. The neighboring ±10 amino acid residues were plotted, and Weblogos37 were constructed using version 3.1 (http://weblogo.threeplusone.com), where the previously reported sites from CSF were omitted to avoid bias from the methodology used both previously12 and in this report.

■ RESULTS

PNGase F Treatment
We subjected eight deidentified CSF samples to peptide N-glycosidase F (PNGase F) treatment and enriched O-linked glycopeptides (O-glycopeptides) with the sialic acid capture-and-release protocol (Figure 1A). For two CSF samples, half of the volume was treated with PNGase F and the other half was left untreated, and then both were subjected to glycopeptide enrichment. Peaks of tryptic N-linked glycopeptides (N-glycopeptides) were virtually absent in the PNGase F treated CSF samples (Figure 1B) but were prominent in the untreated samples (Figure 1C). By inspection of the CID−MS2 and −MS3 spectra, we identified several O-glycopeptides with mainly HexHexNAc-O- structure, most likely corresponding to the core 1 (Galβ3GalNAcα-O-) glycan.

Automated Mascot Search to Identify O-Glycopeptides
To efficiently analyze the fragment-ion spectra, we designed a protocol to automate the Mascot searches for HexHexNAc-O-substituted peptides (Figure 2). Use of the Raw2msm application36 for the generation of Mascot .mgf search files allowed the precursor masses (MS1) to be assigned not only to the CID−MS2 spectrum but also to five consecutive MS3 spectra. Thus, the high mass accuracy (<5 ppm) of the FTICR or Orbitrap Velos MS1 precursor ions was implemented for the subsequent MS2 and MS3 spectra that were measured at low resolution, but with high sensitivity, in the LTQ. Accordingly, a variable modification corresponding to HexHexNAc (365.1322 Da) on Ser/Thr/Tyr residues and the simultaneous neutral loss of the same mass to account for the lack of HexHexNAc of the peptide ion were included as parameters during database searches. An advantage of using Raw2msm was that all isotopic peaks were used to calculate the precursor mass, which gives a better mass accuracy compared to merely picking the first isotopic peak (Figure S1 and Table S1, Supporting Information).

We first tested this search protocol on the LC−MS/MS files that we previously had analyzed manually.12 Of the 43 HexHexNAc-O- and (HexHexNAc-O-)2-substituted peptides that had been manually identified, we now were able to automatically identify 35 in less than 5 min as opposed to weeks of manual interpretation. The O-glycopeptides that were not automatically identified either had precursor-ion intensities that were too weak or had MS1 precursor ions that were assigned wrong charge states by the Raw2msm application. The automated Mascot search protocol identified one additional O-glycopeptide, 60-AIMGAAHEPSPPGGLDAR-77 from β-galactoside α-2,6-sialyltransferase 2 (ST6Gal II/SIAT2, UniprotKB ID used in Table 1), for which the only possible glycosylation site (Ser-69) is underlined. We then analyzed the O-glycopeptides from the PNGase F treated CSF samples and identified 85 peptides constituting 106 unique O-glycosylation sites, of which about half had not previously been described (Table 1). For identified O-glycopeptides containing several
hydroxylated residues, we were able to pinpoint 50 attachment sites correctly using CID or ECD/ETD.

Figure 1. PNGase F pretreatment and sialic acid capture-and-release protocol of CSF samples. (A) CSF samples were subjected to PNGase F treatment (step 1), subjected to periodate oxidation, captured on hydrazide beads, and trypsin-digested while still attached to the beads (steps 2−4). Desialylated O-glycopeptides were released by formic acid hydrolysis (step 5). The LC−MS total-ion chromatogram of O-glycopeptides enriched from (B) PNGase F pretreated and (C) untreated CSF. Selected parent ions corresponding to chromatographic peaks are annotated with their nominal m/z values. N, N-glycopeptide; O, O-glycopeptide.

CID Fragmentation to Pinpoint the Correct Glycosylation Site
We allowed for a neutral loss of Hex (−162.0528 Da) from the HexHexNAc-O-substituted precursor. Thus, all possible HexNAc-O-substituted b- and y-ion peaks in the MS2 and MS3 spectra were taken into account in the Mascot search. Two examples where CID was used to pinpoint glycosylation sites are given below. The O-glycosylation site of 55-YSQAVPAVTEGPIPEVLK-72 from cathepsin D (CATD) was assigned to Thr-63 with a Mascot score of 36 (p < 0.05 threshold 29), but for the Tyr-55 and Ser-56 alternatives, the scores were 32. This difference in scores was due to the diagnostic presence of a glycosylated y13 fragment ion [y(13)-162, Figure 2D], which pinpointed the glycosylation site to Thr-63 and excluded Tyr-55 and Ser-56 from being glycosylated (Figure 2B−D). A second example of CID fragmentation of the peptide backbone in the presence of intact glycosylation was for the O-glycosylation of 899-ALLIPPSSPMPGP-911 from Brevican core protein (PGCB), which was automatically assigned to Ser-905 (Figure S2, Supporting Information). In general, high abundance of b- and/or y-ion peaks, arising from fragmentation at the N-terminal side of Pro,38 was often significant for CID peptide fragmentation, also in the presence of intact or partially intact HexHexNAc-O- structures.

ECD and ETD Fragmentation to Pinpoint Correct Glycosylation Sites
To further verify O-glycopeptide identities and to assign the correct Ser, Thr, or Tyr glycosylation sites for glycopeptides containing two or more Ser/Thr/Tyr residues, we used ECD for peptide fragmentation without simultaneous fragmentation of glycans39 (Figure 3). For example, we assigned the glycosylation site of the abundant ion (m/z 993 in Figure 1B and Figure S1A, Supporting Information) of the C-terminal 301-VQAAVGTSAAPVPSDNH-317 peptide from apolipoprotein E (APOE) to Ser-308 (m/z 662 in Figure 3A), which is in accordance with other studies.3,40 Also, a Hex2HexNAc2 glycoform of this glycopeptide was present, and ECD-MS2 showed that both Ser-308 and Thr-307 were glycosylated with two separate HexHexNAc-O- structures (Figure 3B). A third ECD example was the 23-LLSDHSKPTAETVAPDNTAIPSLR-46 glycopeptide, from SPARC-like protein 1 (SPRL1), where Thr-31 and Thr-40 (both underlined) were found to be glycosylated with HexHexNAc-O- whereas the four additional Ser/Thr residues were unglycosylated (Figure 3C). However, by the use of CID−MS2/MS3, we identified (HexHexNAc-O-)3 and (HexHexNAc-O-)4 glycoforms of the same tryptic peptide (Figure S3, Supporting Information), and the identifications were based on the presence of a diagnostic y4 fragment ion (m/z 472) in common to the three glycoforms. We also used electron-transfer dissociation (ETD) fragmentation and pinpointed, for example, the O-glycosylation site of the HexHexNAc-O-substituted 649-GLTTRPGSGLTNIK-662 peptide from the amyloid precursor protein (APP/A4) to Thr-651 or Thr-652 (Figure 3D). By the ECD and ETD approach, we assigned 31 glycosylation sites to unique Ser/Thr residues of peptides with several Ser/Thr alternatives. We did not identify any Tyr-glycosylated Aβ peptides in the CSF samples,2 nor did we observe any evidence for other Tyr-glycosylated peptides.3 In total, using a combination of CID and ECD/ETD, 67 desialylated glycans of unique O-glycosylation sites were pinpointed to correct Ser/Thr residues. Seventeen O-glycopeptides contained only one HexHexNAc-O- structure and one available Ser/Thr glycosylation site.

Automated Search for More Complex Glycoforms
Apart from the core-1-like HexHexNAc-O- structure, the core 2 compatible Hex(HexHexNAc)HexNAc-O- and Hex(HexNAc)HexNAc-O- structures (730.2644 and 568.2116 Da, respectively) were introduced as allowed modifications in separate Mascot searches. A few false hits of Hex2HexNAc2 arose from O-glycopeptides containing two separate HexHexNAc-O- structures but were disqualified because of a lack of diagnostic saccharide oxonium ions otherwise typically found in the CID−
Figure 2. Mascot search method for automated identification of HexHexNAc-O- substituted peptides. (A) The MS1 precursor was measured in
FTICR or Orbitrap mode and (B) was subjected to CID to generate the MS2 spectrum. (C) Further CID of the peptide (top) and peptide +
HexNAc ion (bottom) generated the MS3 spectra that were used in Mascot searches to identify the O-glycopeptide. The CID−MS2 and −MS3
transitions follow black (straight) arrows, and the MS1 precursor assignments follow red (rounded) arrows.
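
The precursor assignment summarized in Figure 2 can be illustrated with a minimal sketch, assuming plain Mascot generic format (MGF) output in which the accurate MS1 precursor m/z and charge are written into the entry of the MS2 spectrum and into every MS3 spectrum derived from it. The function names, scan titles, and peak values are hypothetical and chosen for illustration only; this is not the Raw2msm implementation.

```python
# Sketch: give an MS2 spectrum and its MS3 spectra the same MS1 precursor
# in MGF output. All names and example values are illustrative assumptions.

def mgf_entry(title, precursor_mz, charge, peaks):
    """Format one spectrum as an MGF block (BEGIN IONS ... END IONS)."""
    lines = [
        "BEGIN IONS",
        f"TITLE={title}",
        f"PEPMASS={precursor_mz:.4f}",
        f"CHARGE={charge}+",
    ]
    lines += [f"{mz:.4f} {intensity:.1f}" for mz, intensity in peaks]
    lines.append("END IONS")
    return "\n".join(lines)


def write_ms2_with_ms3(scan_id, ms1_mz, charge, ms2_peaks, ms3_spectra):
    """Return MGF text in which every MS3 entry carries the MS1 precursor."""
    entries = [mgf_entry(f"{scan_id} MS2", ms1_mz, charge, ms2_peaks)]
    for k, peaks in enumerate(ms3_spectra, start=1):
        # The MS3 spectrum was isolated from an MS2 fragment, but the entry
        # is annotated with the accurate MS1 precursor m/z instead.
        entries.append(mgf_entry(f"{scan_id} MS3-{k}", ms1_mz, charge, peaks))
    return "\n\n".join(entries) + "\n"


if __name__ == "__main__":
    text = write_ms2_with_ms3(
        "scan1", 747.8929, 2,
        ms2_peaks=[(666.87, 100.0)],
        ms3_spectra=[[(500.30, 10.0)], [(600.40, 5.0)]],
    )
    print(text)
```

With this layout, a search engine applies the narrow peptide tolerance to the MS1 value for all three entries, while the fragment peaks remain those measured at low resolution.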
MS2/MS3 spectra of more complex O-glycopeptides.34 One novel O-glycopeptide with Hex2HexNAc2 modification was, however, identified as 210-AATVGSLAGQPLQER-224 from Apolipoprotein E (APOE) (Figure S4, Supporting Information). Peaks from the saccharide oxonium ions [HexHexNAc2]+ (m/z 569.3 in Figure S4A, Supporting Information) and [Hex2HexNAc2]+ (m/z 731.3 in Figure S4A, Supporting Information) exceeding [HexHexNAc]+ (m/z 366.3 in Figure S4A, Supporting Information) in mass were observed, which verified the presence of the more complex core-2-like Hex(HexHexNAc)HexNAc-O- structure as opposed to two separate HexHexNAc-O- structures. The CID−MS3 spectrum of the peptide + HexNAc ion (m/z 850.9) was used for the automated identification (Figure S4A, right spectrum; Supporting Information). Also, the ETD spectrum of the [M + 3H]3+ precursor indeed showed that the complete glycan was attached solely to Thr-212 (Figure S4B, Supporting Information). A manual survey of all CID−MS2 and −MS3 spectra was performed to investigate for the presence of O-glycopeptides with more complex glycans, but none were found, demonstrating that, using this methodology, sialylated HexHexNAc-O- structures appeared vastly dominating in these samples.

Reproducibility of Sample Preparation, LC−MS/MS Analysis, and Glycosylation Pattern of Human CSF Samples
The reproducibility of the sample preparation, LC−MS/MS analysis, and presence of the same O-glycosylation sites across individual CSF samples was assayed by analyzing 19 abundant glycopeptides from six CSF samples that were acquired sequentially using identical preparative and LC−MS/MS settings on the FTICR instrument (Table S2, Supporting Information). These glycopeptides were selected because they were automatically identified by Mascot searches in at least three of the six samples. The Mascot scores for these 19 glycopeptides were similar across the six samples; for example, the differences between median and average scores were <5% for all glycopeptides except for (HexHexNAc-O-)2-substituted 301-VQAAVGTSAAPVPSDNH-317 from APOE. MS1 peaks of the 19 glycopeptides were manually identified as having the correct mass (±5 ppm) and expected elution time (±2 min) in all of the six samples. We also used an alternative approach, exemplified by the (HexHexNAc-O-)2-substituted 23-LLSDHSKPTAETVAPDNTAIPSLR-46 glycopeptide from SPRL1, which was automatically identified by Mascot in only one of the six LC−MS/MS runs (Table S2, Supporting Information). However, the MS1 peak (Figure S5A inset, Supporting Information) was indeed present, although at
Table 1. O-Glycopeptides Identified from Human CSF Glycoproteins [a, b]
[a] Ser/Thr residues of glycosylation sites that were experimentally verified are shown in bold red. Pro residues in glycosylated S/T-X-X-P, P-S/T and S/T-P sequences are shown in bold blue. Ser/Thr/Tyr residues in glycopeptides with experimentally unverified glycosylation site(s) are shown in bold green. The number of HexHexNAc-O- sites is indicated in the last column when more than one are present.
[b] (a) Previously not reported glycosylation site. (b) Reported from human CSF.12 (c) Previously reported (UniprotKB). (d) Previously reported in immunopurified APP/A4 from human CSF.2 (e) Reported from human cell culture.3
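
The MS1 matching of glycopeptides such as those in Table 1 rests on simple mass arithmetic, which can be sketched as follows. The sketch assumes standard monoisotopic residue, water, and proton masses (supplied here, not taken from the article) together with the HexHexNAc mass of 365.1322 Da used in the searches; the function names are hypothetical. For the peptides quoted in the text it reproduces the nominal m/z values 993, 662, and 931.

```python
# Sketch: theoretical m/z of a HexHexNAc-substituted peptide and ppm error.
# Residue masses are standard monoisotopic values (an assumption of this
# sketch); HEXHEXNAC is the modification mass quoted in the text.

MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
HEXHEXNAC = 365.1322


def glycopeptide_mz(sequence, n_hexhexnac, charge):
    """m/z of [M + zH]z+ for a peptide carrying n HexHexNAc units."""
    mass = sum(MONO[aa] for aa in sequence) + WATER + n_hexhexnac * HEXHEXNAC
    return (mass + charge * PROTON) / charge


def ppm_error(observed, theoretical):
    """Signed deviation of an observed m/z from theory, in ppm."""
    return (observed - theoretical) / theoretical * 1e6


if __name__ == "__main__":
    for seq, z in [("VQAAVGTSAAPVPSDNH", 2), ("VQAAVGTSAAPVPSDNH", 3),
                   ("AATVGSLAGQPLQER", 2)]:
        print(seq, z, round(glycopeptide_mz(seq, 1, z), 4))
```

A candidate MS1 peak would then be accepted when `abs(ppm_error(observed, theoretical))` falls within the chosen tolerance, for example the ±5 ppm window used for the reproducibility check.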
varying intensities, in all six samples (Figure S5A−F, Supporting Information). Thus, the glycosylation patterns in the CSF samples were typically consistent across sample preparations, LC−MS/MS analyses, and individuals.

Weblogo Analysis of the O-Glycosylation Sites
We prepared a Pro frequency plot for the ±10 residues surrounding the 67 experimentally verified O-glycosylation sites and found that the fractions of Pro occurrence at S/T-X-X-P, P-S/T, and S/T-P sequences were about one-half, one-third, and one-fourth, respectively (Figure 4A, left). The sum of the fraction values exceeds 1 because more than one sequence combination often occurred per glycosylation site (e.g., in S/T-P-X-P). The Weblogo plot (Figure 4A, right) demonstrated that a Pro residue was sequence conserved at the n − 1, n + 1, and n + 3 positions where n is the O-glycosylation site. As a comparison, all experimentally verified GalNAc-O-glycosylation sites for human proteins in the UniprotKB (222 sites, release 2012_02) were analyzed in Pro frequency and Weblogo plots (Figure 4B), essentially confirming our results. However, the frequencies were not as pronounced as when only our CSF data were used. The combination of Pro in n + 1 and n + 3 (S/T-P-X-P) was found in approximately one-third of the experimentally verified glycosylation sites (Figure 4C). The frequency of Pro in the n + 2 position was low, but that of Ala and Leu was higher at the n + 2 position for the S/T-X-X-P sequence (Figure 4C). Two typically glycosylated sequences were thus T-P-A-P and T-P-L-P, where T-P-A-P was a favorable motif for the O-glycosylation of model peptides by ppGalNAc-
Figure 3. ECD and ETD spectra of O-glycopeptides. ECD spectrum of the C-terminal tryptic HexHexNAc-O-substituted 301-
VQAAVGTSAAPVPSDNH-317 peptide (A) from APOE and (B) its (HexHexNAc-O-)2 glycoform. (C) ECD spectrum of (HexHexNAc-O-)2-
substituted 23-LLSDHSKPTAETVAPDNTAIPSLR-46 from SPRL1. (D) ETD spectrum of HexHexNAc-O-substituted 649-GLTTRPGSGLTNIK-
662 from APP/A4.
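
The site assignments shown in Figure 3 depend on which c and z ions carry the glycan. A minimal sketch of that reasoning for c ions is given below, assuming standard monoisotopic residue masses and the usual relation c = sum of residues + NH3 + proton (values supplied here, not taken from the article); the function names are hypothetical. For two candidate sites, only the c ions that contain one site but not the other differ in mass.

```python
# Sketch: which c ions discriminate between two candidate O-glycosylation
# sites? Residue masses and the c-ion offset are standard values assumed
# for this illustration; HEXHEXNAC is the mass quoted in the text.

MONO = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
NH3 = 17.02655
PROTON = 1.00728
HEXHEXNAC = 365.1322


def c_ions(sequence, site):
    """Singly charged c1..c(n-1) m/z with HexHexNAc at 1-based `site`."""
    ions = []
    running = NH3 + PROTON
    for index, aa in enumerate(sequence[:-1], start=1):
        running += MONO[aa]
        if index == site:
            running += HEXHEXNAC  # glycan stays on the residue in ECD/ETD
        ions.append(round(running, 4))
    return ions


def discriminating_c_ions(sequence, site_a, site_b, tol=0.01):
    """Return the c-ion numbers whose m/z differs between the two sites."""
    a, b = c_ions(sequence, site_a), c_ions(sequence, site_b)
    return [n for n, (x, y) in enumerate(zip(a, b), start=1) if abs(x - y) > tol]


if __name__ == "__main__":
    # 649-GLTTRPGSGLTNIK-662: Thr-651 is residue 3, Thr-652 is residue 4.
    print(discriminating_c_ions("GLTTRPGSGLTNIK", 3, 4))
```

For the adjacent Thr-651 and Thr-652 candidates of the APP/A4 peptide, only c3 separates the two placements, which illustrates why neighboring sites are the hardest to resolve.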
T123 and was the first site to be glycosylated in a peptide containing multiple Ser/Thr and Pro residues by the brain-specific ppGalNAc-T13.41

Selected Examples of Identified O-Glycoproteins
Some selected examples of identified O-glycosylations were chosen because partial manual analysis was also used and a glycoform abundance study was carried out (APOE), because six previously unknown O-glycosylation sites were identified in one O-glycoprotein (ETBR2), because O-glycosylation was identified at Thr residues of the N-glycosylation Asn-X-Ser/Thr consensus sequence (ETBR2 and YIPF3), and because an unexpected lack of anticipated O-glycosylation in the PNGase F treated CSF samples was found (HEMO).

Apolipoprotein E
The dominating MS1 precursors in the LC−MS chromatograms (Figure 1B) were HexHexNAc-O-substituted 301-VQAAVGTSAAPVPSDNH-317 [m/z 993 in Figure 1B and Figure S1A (Supporting Information) and m/z 662 in Figure 3A and Table S1 (Supporting Information)] and 210-AATVGSLAGQPLQER-224 [m/z 931 in Figure 1B and Figure S1C and Table S1 (Supporting Information)] containing the well-established Ser-308 and Thr-212 glycosylation sites, respectively.3,12,40,42 Also, additional O-glycosylation of Thr-307 (Figure 3B) has been identified from cell culture3 and from CSF.12 Additionally, Steentoft et al. identified a third glycosylation site on Ser-314 of 301-VQAAVGTSAAPVPSDNH-317, where the three underlined residues were all substituted with HexNAc.3 We manually searched for the corresponding (HexHexNAc-O-)3-substituted peptide in the LC−MS/MS spectra and found one MS1 precursor that deviated by 2.3 ppm from the theoretical mass and had an ion intensity that was approximately 1% compared to the (HexHexNAc-O-)2-substituted peptide (Figure S6, Supporting Information), which, in turn, usually was in the range of 2% in relation to the HexHexNAc-O-substituted peptide (Figure S1, Supporting Information). The CID−MS2 spectrum supported the (HexHexNAc-O-)3 structure, and we thus confirmed Ser-314 to be a minor glycosylation site in APOE from a human CSF sample.

We also identified two additional minor glycosylation sites carrying core-1-like HexHexNAc-O- structure at Thr-26 of the N-terminal 19-KVEQAVETEPEPELR-33 peptide and at Thr-36 of the sequential peptide 34-QQTEWQSGQR-42 from APOE. They are minor because the two major O-glycopeptides containing the Thr-212 and Ser-308 glycosylation sites dominate the ion chromatograms and the two newly observed APOE glycopeptides are present at much lower ion intensities (Figure S1 and Table S1, Supporting Information) and were only automatically selected for CID− and ECD−MS2 in CSF samples that had been treated with PNGase F.

Endothelin B Receptor-Like Protein 2 and YIPF3
For Endothelin B receptor-like protein 2 (ETBR2), we identified six glycosylation sites, all with core-1-like HexHexNAc-O- structures, present on three glycopeptides of the extracellular part of the protein (Figure S7A, Supporting Information). We found that the 22-VSGGAPLHLGR-32
Figure 4. Proline frequency and Weblogo probability plots of O-
glycosylation sites. (A) Proline frequency (left) and Weblogo (right)
for the 67 experimentally verified O-glycosylation sites in this study.
(B) Proline frequency (left) and Weblogo (right) for 222
experimentally verified O-glycosylation sites from human proteins in
the UniprotKB database and (C) Weblogo plot for glycopeptides
containing the S/T-X-X-P glycosylation sequence and identified in this
study.
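
The proline frequency plots in Figure 4 amount to counting, for each offset from the glycosylation site n, the fraction of sites with Pro at position n + offset. A minimal sketch is given below, assuming the input is a list of (sequence, zero-based site index) pairs. The function name is hypothetical, and the toy input uses only the short tryptic peptides quoted in the text rather than the full protein sequences that a real analysis would need, so positions falling outside a peptide are simply counted as non-Pro.

```python
# Sketch: fraction of sites with Pro at each offset from the glycosylation
# site. Function name and toy data layout are illustrative assumptions.

def pro_frequency(sites, window=10):
    """Map offset (-window..+window, excluding 0) to the Pro fraction."""
    counts = {k: 0 for k in range(-window, window + 1) if k != 0}
    for sequence, n in sites:
        for offset in counts:
            position = n + offset
            # Offsets that run past the sequence ends count as non-Pro.
            if 0 <= position < len(sequence) and sequence[position] == "P":
                counts[offset] += 1
    total = len(sites)
    return {offset: count / total for offset, count in counts.items()}


if __name__ == "__main__":
    toy_sites = [
        ("VQAAVGTSAAPVPSDNH", 7),        # APOE Ser-308
        ("LPTTVLDATAK", 8),              # YIPF3 Thr-339
        ("PIHPAGLQPTKPLVATSPNPGK", 9),   # ETBR2 Thr-79
        ("PIHPAGLQPTKPLVATSPNPGK", 16),  # ETBR2 Ser-86
    ]
    freq = pro_frequency(toy_sites)
    print({k: freq[k] for k in (-1, 1, 3)})
```

On these four toy sites the Pro fractions at n − 1, n + 1, and n + 3 are 0.25, 0.25, and 0.5; the values reported in the article come from the full set of 67 sites and are not reproduced by this small example.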
tryptic peptide was glycosylated (Figure S7B, Supporting Information), which is different from the proposed signal sequence cleavage at Gly-25 (Figure S7A, Supporting Information; see entry ETBR2_HUMAN in the UniprotKB database) and indicates that O-glycosylation can indeed affect the cleavage of the signal peptide of glycoproteins. Two and three glycosylation sites were identified on the 70-PIHPAGLQPTKPLVATSPNPGK-91 peptide of ETBR2 (panels C and D, respectively, of Figure S7, Supporting Information), where Thr-85 was unglycosylated in the (HexHexNAc-O-)2-substituted peptide, supporting an initial glycosylation of Thr-79 within the P-S/T sequence and Ser-86 within the S/T-P-X-P sequence. Also, the 104-GNLTGAPGQR-113 peptide from ETBR2 was found to be glycosylated (Figure S7F, Supporting Information), and interestingly, Asn-105 had been changed to Asp-105. Because Asp is in an Asn-X-Ser/Thr N-glycosylation motif, this indicates that an N-glycan at Asn-105 was hydrolyzed during the PNGase F treatment. A second example of O-glycosylation of Ser/Thr in the Asn-X-Ser/Thr consensus was demonstrated from the HexHexNAc-O-substituted 331-LPTTVLNATAK-341 peptide from YIPF3 where the Asn also had been converted to Asp (Figure 5). The presence of HexNAc-substituted y5, y6, and y7 and unglycosylated b6, b7, and b8 fragments (expanded in Figure 5B) demonstrated that Thr-339 was the O-glycosylation site.

Figure 5. CID−MS2 and −MS3 of HexHexNAc-O-substituted LPTTVLDATAK from protein YIPF3. (A) CID−MS2 at the MS1 precursor (m/z 747.9). (B) CID−MS3 of the peptide + HexNAc ion (m/z 666.8 in spectrum A) and an expansion showing the presence of HexNAc-substituted y5-y7 fragments. Note that the Asn (N) residue was identified as Asp (D).

Hemopexin
Hemopexin (HEMO) is both N- and O-glycosylated,43 and HexHexNAc-O-substituted 24-TPLPPTSAHGNVAEGETKPD-43 at m/z 795 and HexHexNAc-O-substituted 24-TPLPPTSAHGNVAEGETKPDPDVTER-49 at m/z 771, where Thr-24 is the O-glycosylation site, were prominent in the LC−MS/MS spectra of CSF samples even without treatment with PNGase F (Figure 1C). The CID−MS2/MS3 and ECD spectra of HexHexNAc-O- and (HexHexNAc-O-)2-substituted Hemopexin peptides are shown for entry HEMO in the Additional Spectra section of the Supporting Information. We initially believed that these O-glycopeptides would be major ions also in the LC−MS/MS spectra of PNGase F treated CSF, but quite surprisingly, there was no trace of them in the PNGase F treated samples (Figure 1B). Although this was a reproducible result limited to Hemopexin, the explanation for this finding is still unclear.

■ DISCUSSION
In this study, we have added the use of PNGase F treatment to remove N-glycans prior to our sialic acid capture-and-release protocol to selectively characterize O-glycopeptides originating from CSF glycoproteins. This pretreatment was reproducibly successful and made it possible to identify a larger number of O-glycosylations because there was a reduced analytical
interference from N-glycopeptides in the LC−MS/MS spectra (Figure 1). As N-glycans are
hydrolyzed by the PNGase F treatment, formerly N-glycosylated Asn residues were changed to Asp
residues (+0.9840 Da), which was introduced as an allowed modification in the Mascot searches to
facilitate the possible identification of O-glycosylation sites also on previously
N-glycosylated tryptic peptides. Two such O-glycopeptides were identified, 331-LPTTVLDATAK-341
from YIPF3, which was glycosylated at Thr-339 (Figure 5) and contained Asp-337 instead of
Asn-337, and 104-GDLTGAPGQR-113 from ETBR2 containing an Asn-105 to Asp-105 change (Figure S7F,
Supporting Information). Interestingly, in both cases, the Thr residue of the N-glycosylation
Asn-X-Ser/Thr consensus motif was O-glycosylated, demonstrating that a preformed N-glycan
structure does not necessarily block the ppGalNAc-T from interaction with its substrate.
Simultaneous N- and O-glycosylations of the Asn-X-Ser/Thr motif have previously been
described.44

Automated strategies for the structural characterization of O-glycopeptides are known to be
demanding.29 Some protocols are available for glycan fragmentation analysis of glycopeptides
with already known peptide sequence(s),45,46 but automated protocols aimed at analyzing both the
glycan structure and the peptide sequence are scarce.29 The predominance of
HexHexNAc-O-substituted peptides in our study made it efficient to design an automated Mascot
search protocol to identify core 1 substituted O-glycosylation sites using a multistage
CID−MS2/MS3 approach. Conveniently, the HexHexNAc-O- structure is predominantly fragmented
during the MS2 step, generating the peptide and the peptide + HexNAc fragments as major ion
peaks. Subsequent MS3 of the peptide ion generates peptide backbone fragmentation into the b-
and y-ion series. Thus, by introducing HexHexNAc (+365.1322 Da) as a variable modification of
Ser/Thr/Tyr in the Mascot search, which simultaneously allowed for neutral loss of HexHexNAc,
the high accuracy measured glycopeptide mass was used as the precursor for the CID−MS3 spectrum
of the peptide ion (Figure 2). The use of a similar strategy for assignment of high-accuracy MS1
precursor masses for subsequent MS2 and MS3 of phosphopeptides has been shown to increase the
number of identified peptides.47 CID−MS2 of the HexHexNAc-O-substituted peptide ion and MS3 of
the peptide + HexNAc ion often resulted in peptide backbone fragmentation of the remaining
glycopeptide, which was used to assign the glycosylated Ser/Thr site within peptides containing
several Ser/Thr residues (Figure 2 and Figure S2, Supporting Information). The automated Mascot
search protocol could also be expanded to search for more complex glycans because a core-2-like
Hex(HexHexNAc)HexNAc-O- structure on Thr-212 of 210-AATVGSLAGQPLQER-224 from APOE was also
identified (Figure S4, Supporting Information), which is in accordance with a previous
glycoproteomics study of APOE.40 Occasionally, we also performed manual analysis of CID−MS2/MS3
spectra to further characterize O-glycosylation sites (Figures S3 and S6, Supporting
Information).

To correctly pinpoint the attachment site(s) of the O-glycopeptides, we used ECD and ETD on
FTICR and Orbitrap instruments, respectively. In total, we have successfully identified 106
O-glycosylation sites from CSF proteins and experimentally verified the exact attachment site
for 67 of these (Table 1). To check for analytical reproducibility, we selected 19
O-glycopeptides, based on their Mascot identification in at least three out of six CSF samples
that were run sequentially on the same instrument (Table S2, Supporting Information) and one
O-glycopeptide, which was automatically identified in only one of the six samples (Figure S5,
Supporting Information). Based on Mascot scores, retention times, and the presence of accurate
MS1 peaks in all of the LC−MS/MS runs, the 20 glycopeptides were reproducibly found in all six
CSF samples, thus showing analytical reproducibility and similarity with respect to
O-glycosylation pattern between individuals. We were unable to identify any Tyr O-glycosylations
in our PNGase F treated CSF samples, indicating that the recently described HexNAc-O-Tyr
modifications are relatively unusual.2,3

The sialic acid capture-and-release protocol is very specific for the enrichment of formerly
sialylated glycopeptides, and it is important to note that nonsialylated glycoproteins will not
be enriched. It is thus not possible to assay glycosylation sites and glycan structure of
nonsialylated O-glycans using this methodology. A second drawback of the protocol, which is
common for all protocols where glycopeptides are purified from their unglycosylated peptide
counterparts, is that important information regarding site occupancy, that is, the relative
distribution of glycosylation versus unglycosylation of peptides, cannot be addressed in a
quantitative manner. Another possible limitation of the protocol could be that the presence of
O-glycans in the vicinity of Lys/Arg residues might block the access of trypsin to cleave the
glycoproteins while attached to the hydrazide beads, and thus some glycopeptides could be missed
in the LC−MS/MS analysis. This would be particularly valid for mucins containing highly
O-glycosylated regions, and some highly glycosylated O-glycopeptides may be too large and
complex for the present LC−MS/MS and/or automated Mascot analysis and will thus not be
identified. However, only four of the 84 identified O-glycopeptides of Table 1 contained an
internal (i.e., not present at the glycopeptide N- or C-terminal) missed trypsin cleavage site,
and thus O-glycans do not seem to block trypsin to any larger extent from cleaving
reduced/alkylated nonmucin O-glycoproteins immobilized onto the beads.

Steentoft et al. recently identified more than 350 O-glycosylation sites from five different
human cell lines, of which mucin-16 contributed with about 100 sites.3 Interestingly, only
twelve of their reported O-glycosylation sites are in common with this study (Table 1), four of
which were from APP/A4 and four from APOE. We recently reported the identification of 57
O-glycosylation sites from human urine proteins using the sialic acid capture-and-release
strategy,34 and 15 of those glycosylation sites were in common with this study. Thus, a
combination of methods and sample sources is needed to accomplish comprehensive
O-glycoproteomic mapping of proteins in relevant cells and clinical samples. A few of the
O-glycopeptides reported here are most likely peptide fragments, that is, neuropeptides that are
released into the CSF. For instance, endogenous neuropeptides containing the three different
glycopeptide stretches from ProSAAS that we present here are generated by convertase cleavage of
the proprotein (see entry PCSK1_HUMAN in the UniProtKB database) and were also identified in a
neuropeptidomics study of human chromaffin secretory vesicles.48 More importantly, the same
ProSAAS neuropeptides were identified and found to be modified with sialylated O-glycans in a
CSF peptidomics study.49 The presence of selectively glycosylated neuropeptides is interesting
because the glycosylation might influence the proprotein-processing pathways.50
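The two mass shifts used in the Mascot searches described in this Discussion (+0.9840 Da for the Asn to Asp conversion and +365.1322 Da for HexHexNAc), and the observation that the O-glycosylated Thr sits inside an Asn-X-Ser/Thr motif in both YIPF3 and ETBR2 peptides, can be checked with a few lines of arithmetic. The following is a minimal sketch, not the search protocol itself: the monoisotopic residue and sugar masses are standard textbook values rather than numbers from this article, the helper names (`peptide_mass`, `mz`, `sequon_sites`) are hypothetical, and the exclusion of Pro at the X position is the commonly used refinement of the consensus motif, which the text above does not state.

```python
# Standard monoisotopic residue masses in Da (assumed values, not from the article).
RESIDUE = {
    "G": 57.021464, "A": 71.037114, "S": 87.032028, "P": 97.052764,
    "V": 99.068414, "T": 101.047679, "C": 103.009185, "L": 113.084064,
    "I": 113.084064, "N": 114.042927, "D": 115.026943, "Q": 128.058578,
    "K": 128.094963, "E": 129.042593, "M": 131.040485, "H": 137.058912,
    "F": 147.068414, "R": 156.101111, "Y": 163.063329, "W": 186.079313,
}
WATER = 18.010565
PROTON = 1.007276
HEX = 162.052824       # hexose residue, C6H10O5
HEXNAC = 203.079373    # N-acetylhexosamine residue, C8H13NO5

HEXHEXNAC = HEX + HEXNAC                   # variable modification on Ser/Thr/Tyr
ASN_TO_ASP = RESIDUE["D"] - RESIDUE["N"]   # mass shift left by PNGase F


def peptide_mass(seq: str) -> float:
    """Monoisotopic mass of an unmodified peptide (residues plus one water)."""
    return sum(RESIDUE[aa] for aa in seq) + WATER


def mz(mass: float, charge: int) -> float:
    """m/z of the protonated ion [M + zH]z+."""
    return (mass + charge * PROTON) / charge


def sequon_sites(seq: str, first_residue_number: int):
    """Return (Asn number, Ser/Thr number) pairs for each Asn-X-Ser/Thr, X != Pro."""
    hits = []
    for i in range(len(seq) - 2):
        if seq[i] == "N" and seq[i + 1] != "P" and seq[i + 2] in "ST":
            hits.append((first_residue_number + i, first_residue_number + i + 2))
    return hits


if __name__ == "__main__":
    # YIPF3 peptide after PNGase F (Asn-337 read as Asp-337) carrying one HexHexNAc.
    glycopeptide = peptide_mass("LPTTVLDATAK") + HEXHEXNAC
    print(f"Asn->Asp shift: {ASN_TO_ASP:.4f} Da")
    print(f"HexHexNAc:      {HEXHEXNAC:.4f} Da")
    print(f"YIPF3 glycopeptide [M+2H]2+: {mz(glycopeptide, 2):.4f}")
    print("YIPF3 sequon:", sequon_sites("LPTTVLNATAK", 331))
    print("ETBR2 sequon:", sequon_sites("GNLTGAPGQR", 104))
```

The sequon check is run on the native (Asn-containing) sequences with the residue numbering given in the text; the charge state used for the printed m/z is an arbitrary choice for illustration.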
We observed predominantly O-glycosylations on Ser/Thr residues (position n), which had Pro
residues at the n − 1, n + 1, and/or n + 3 positions (P-S/T, S/T-P, and S/T-X-X-P,
respectively). This selective glycosylation-enhancing effect of Pro has been described
previously based on data analysis of reported O-glycosylation sites.51−54 Such glycosylation
motifs have also been demonstrated for ppGalNAc-T1 and -T2 toward S/T-P and S/T-X-X-P and for
ppGalNAc-T2 toward P-S/T on model peptide libraries,21 and ppGalNAc-T3, -T5, and -T12 also
exhibit similar Pro specificities.22 In addition, model peptides containing T-P-A-P have been
identified to be prone to O-glycosylation by ppGalNAc-T1 (ref 23) and brain-specific
ppGalNAc-T13 (ref 41), a glycosylation sequence that was also identified in this study
(Figure 4C). Any of these ppGalNAc-Ts are thus likely candidates for performing the
O-glycosylations observed for CSF O-glycoproteins. Our study gives support to similar
sequence-specific interactions between the ppGalNAc-Ts and their substrates irrespective of
whether they are model peptides or natural proteins occurring in vivo. For 41 of the 106
identified O-glycosylation sites, we were not able to pinpoint the actual glycosylated Ser/Thr
residue using CID, ECD, or ETD (Table 1). Many of these O-glycopeptides indeed contained P-S/T,
S/T-P, and S/T-X-X-P sequences but could nevertheless not be confidently assigned because of a
lack of unequivocal MS/MS data.

In conclusion, by removing the N-glycans from human CSF samples by PNGase F in a pretreatment
step, it was possible to selectively enrich tryptic O-glycopeptides using a sialic acid
capture-and-release protocol. The core-1-like HexHexNAc-O- structure was vastly dominant, which
facilitated the use of an automated Mascot search protocol for identification of the
O-glycosylation sites. By using this methodology, we were able to expand our list of
O-glycosylation sites of CSF glycoproteins by a factor of 3. We believe that this strategy
should be useful for other clinical subproteomes as well, particularly those where complex
N-glycosylations are quantitatively dominating, such as human serum samples.

■ ASSOCIATED CONTENT
Supporting Information
Additional information as noted in text. This material is available free of charge via the
Internet at http://pubs.acs.org.

■ AUTHOR INFORMATION
Corresponding Author
*E-mail: jonas.nilsson@clinchem.gu.se Tel.: +46 31 342 2174. Fax: +46 31 82 84 58.
Notes
The authors declare no competing financial interest.

■ ACKNOWLEDGMENTS
We thank Prof. Henrik Zetterberg and Prof. Kaj Blennow at the Neurochemistry Laboratory,
Sahlgrenska University Hospital, for access to CSF samples. Expert MS assistance by Dr. Carina
Sihlbom and Sjoerd van der Post at the Proteomics Core Facility, The Sahlgrenska Academy, is
acknowledged. This study was supported by grants from the Swedish Research Council (8266 to
G.L.), Alzheimer Foundation, and Magn. Bergwall Foundation and governmental grants to the
Sahlgrenska University Hospital. The Inga-Britt and Arne Lundberg Research Foundation and the
Knut and Alice Wallenberg Foundation are acknowledged for MS instrumentation funding.

■ REFERENCES
(1) Varki, A.; Cummings, R.; Esko, J.; Freeze, H.; Stanley, P.; Bertozzi, C. R.; Hart, G.; Etzler, M. E. Essentials of Glycobiology; Cold Spring Harbor Laboratory Press: New York, 2009.
(2) Halim, A.; Brinkmalm, G.; Ruetschi, U.; Westman-Brinkmalm, A.; Portelius, E.; Zetterberg, H.; Blennow, K.; Larson, G.; Nilsson, J. Site-specific characterization of threonine, serine, and tyrosine glycosylations of amyloid precursor protein/amyloid β-peptides in human cerebrospinal fluid. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (29), 11848−11853.
(3) Steentoft, C.; Vakhrushev, S. Y.; Vester-Christensen, M. B.; Schjoldager, K. T.-B. G.; Kong, Y.; Bennett, E. P.; Mandel, U.; Wandall, H.; Levery, S. B.; Clausen, H. Mining the O-glycoproteome using zinc-finger nuclease−glycoengineered SimpleCell lines. Nat. Methods 2011, 8 (11), 977−982.
(4) Varki, A. Uniquely human evolution of sialic acid genetics and biology. Proc. Natl. Acad. Sci. U.S.A. 2010, 107, 8939−8946.
(5) Schauer, R. Sialic acids as regulators of molecular and cellular interactions. Curr. Opin. Struct. Biol. 2009, 19 (5), 507−514.
(6) Liu, Y.-C.; Yen, H.-Y.; Chen, C.-Y.; Chen, C.-H.; Cheng, P.-F.; Juan, Y.-H.; Chen, C.-H.; Khoo, K.-H.; Yu, C.-J.; Yang, P.-C.; Hsu, T.-L.; Wong, C.-H. Sialylation and fucosylation of epidermal growth factor receptor suppress its dimerization and activation in lung cancer cells. Proc. Natl. Acad. Sci. U.S.A. 2011, 108 (28), 11332−11337.
(7) Sørensen, A. L.; Rumjantseva, V.; Nayeb-Hashemi, S.; Clausen, H.; Hartwig, J. H.; Wandall, H. H.; Hoffmeister, K. M. Role of sialic acid for platelet life span: Exposure of β-galactose results in the rapid clearance of platelets from the circulation by asialoglycoprotein receptor-expressing liver macrophages and hepatocytes. Blood 2009, 114 (8), 1645−1654.
(8) Pang, P.-C.; Chiu, P. C. N.; Lee, C.-L.; Chang, L.-Y.; Panico, M.; Morris, H. R.; Haslam, S. M.; Khoo, K.-H.; Clark, G. F.; Yeung, W. S. B.; Dell, A. Human sperm binding is mediated by the sialyl-Lewisx oligosaccharide on the zona pellucida. Science 2011, 333 (6050), 1761−1764.
(9) Larsson, J. M. H.; Karlsson, H.; Sjovall, H.; Hansson, G. C. A complex, but uniform O-glycosylation of the human MUC2 mucin from colonic biopsies analyzed by nanoLC/MSn. Glycobiology 2009, 19 (7), 756−766.
(10) Sihlbom, C.; van Dijk, H. I.; Lidell, M. E.; Noll, T.; Hansson, G. C.; Backstrom, M. Localization of O-glycans in MUC1 glycoproteins using electron-capture dissociation fragmentation mass spectrometry. Glycobiology 2009, 19 (4), 375−381.
(11) Johansson, M. E. V.; Larsson, J. M. H.; Hansson, G. C. The two mucus layers of colon are organized by the MUC2 mucin, whereas the outer layer is a legislator of host-microbial interactions. Proc. Natl. Acad. Sci. U.S.A. 2011, 108, 4659−4665.
(12) Nilsson, J.; Ruetschi, U.; Halim, A.; Hesse, C.; Carlsohn, E.; Brinkmalm, G.; Larson, G. Enrichment of glycopeptides for glycan structure and attachment site identification. Nat. Methods 2009, 6 (11), 809−811.
(13) Darula, Z.; Medzihradszky, K. F. Affinity enrichment and characterization of mucin core-1 type glycopeptides from bovine serum. Mol. Cell. Proteomics 2009, 8 (11), 2515−2526.
(14) Balog, C.; Mayboroda, O.; Wuhrer, M. Mass spectrometric identification of aberrantly glycosylated human apolipoprotein C-III peptides in urine from Schistosoma mansoni-infected individuals. Mol. Cell. Proteomics 2010, 9 (4), 667−681.
(15) Sun, W.; Parry, S.; Ubhayasekera, W.; Engstrom, Å.; Dell, A.; Schedin-Weiss, S. Further insight into the roles of the glycans attached to human blood protein C inhibitor. Biochem. Biophys. Res. Commun. 2010, 403 (2), 198−202.
(16) Semenov, A. G.; Postnikov, A. B.; Tamm, N. N.; Seferian, K. R.; Karpova, N. S.; Bloshchitsyna, M. N.; Koshkina, E. V.; Krasnoselsky, M. I.; Serebryanaya, D. V.; Katrukha, A. G. Processing of Pro-Brain
Natriuretic Peptide Is Suppressed by O-Glycosylation in the Region Close to the Cleavage Site. Clin. Chem. 2009, 55 (3), 489−498.
(17) Schjoldager, K. T.-B. G.; Vester-Christensen, M. B.; Bennett, E. P.; Levery, S. B.; Schwientek, T.; Yin, W.; Blixt, O.; Clausen, H. O-glycosylation modulates proprotein convertase activation of angiopoietin-like protein 3: Possible role of polypeptide GalNAc-transferase-2 in regulation of concentrations of plasma lipids. J. Biol. Chem. 2010, 285 (47), 36293−36303.
(18) Maryon, E. B.; Zhang, J.; Jellison, J. W.; Kaplan, J. H. Human copper transporter 1 lacking O-linked glycosylation is proteolytically cleaved in a Rab9-positive endosomal compartment. J. Biol. Chem. 2009, 284 (41), 28104−28114.
(19) Tabak, L. A. The role of mucin-type O-glycans in eukaryotic development. Sem. Cell Dev. Biol. 2010, 21 (6), 616−621.
(20) ten Hagen, K. G.; Fritz, T. A.; Tabak, L. A. All in the family: the UDP-GalNAc:polypeptide N-acetylgalactosaminyltransferases. Glycobiology 2003, 13 (1), 1R−16R.
(21) Gerken, T. A.; Raman, J.; Fritz, T. A.; Jamison, O. Identification of common and unique peptide substrate preferences for the UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferases T1 and T2 derived from oriented random peptide substrates. J. Biol. Chem. 2006, 281 (43), 32403−32416.
(22) Gerken, T. A.; Jamison, O.; Perrine, C. L.; Collette, J. C.; Moinova, H.; Ravi, L.; Markowitz, S. D.; Shen, W.; Patel, H.; Tabak, L. A. Emerging paradigms for the initiation of mucin-type protein O-glycosylation by the polypeptide GalNAc transferase family of glycosyltransferases. J. Biol. Chem. 2011, 286 (16), 14493−14507.
(23) Yoshida, A.; Suzuki, M.; Ikenaga, H.; Takeuchi, M. Discovery of the shortest sequence motif for high level mucin-type O-glycosylation. J. Biol. Chem. 1997, 272 (27), 16884−16888.
(24) Fritz, T. A.; Hurley, J. H.; Trinh, L.-B.; Shiloach, J.; Tabak, L. A. The beginnings of mucin biosynthesis: The crystal structure of UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferase-T1. Proc. Natl. Acad. Sci. U.S.A. 2004, 101 (43), 15307−15312.
(25) Fritz, T. A.; Raman, J.; Tabak, L. A. Dynamic association between the catalytic and lectin domains of human UDP-GalNAc:polypeptide α-N-acetylgalactosaminyltransferase-2. J. Biol. Chem. 2006, 281 (13), 8613−8619.
(26) Raman, J.; Fritz, T. A.; Gerken, T. A.; Jamison, O.; Live, D.; Liu, M.; Tabak, L. A. The catalytic and lectin domains of UDP-GalNAc:polypeptide α-N-Acetylgalactosaminyltransferase function in concert to direct glycosylation site selection. J. Biol. Chem. 2008, 283 (34), 22942−22951.
(27) Wandall, H. H.; Irazoqui, F.; Tarp, M. A.; Bennett, E. P.; Mandel, U.; Takeuchi, H.; Kato, K.; Irimura, T.; Suryanarayanan, G.; Hollingsworth, M. A.; Clausen, H. The lectin domains of polypeptide GalNAc-transferases exhibit carbohydrate-binding specificity for GalNAc: Lectin binding to GalNAc-glycopeptide substrates is required for high density GalNAc-O-glycosylation. Glycobiology 2007, 17 (4), 374−387.
(28) Perrine, C. L.; Ganguli, A.; Wu, P.; Bertozzi, C. R.; Fritz, T. A.; Raman, J.; Tabak, L. A.; Gerken, T. A. Glycopeptide-preferring polypeptide GalNAc transferase 10 (ppGalNAc T10), involved in mucin-type O-glycosylation, has a unique GalNAc-O-Ser/Thr-binding site in its catalytic domain not found in ppGalNAc T1 or T2. J. Biol. Chem. 2009, 284 (30), 20387−20397.
(29) Jensen, P. H.; Kolarich, D.; Packer, N. H. Mucin-type O-glycosylation: Putting the pieces together. FEBS J. 2010, 277 (1), 81−94.
(30) Darula, Z.; Chalkley, R. J.; Baker, P.; Burlingame, A. L.; Medzihradszky, K. F. Mass spectrometric analysis, automated identification and complete annotation of O-linked glycopeptides. Eur. J. Mass Spectrom. 2010, 16 (3), 421−428.
(31) Darula, Z.; Sherman, J.; Medzihradszky, K. F. How to dig deeper? Improved enrichment methods for mucin core-1 type glycopeptides. Mol. Cell. Proteomics 2012, 11, O111−016774.
(32) Palmisano, G.; Lendal, S. E.; Engholm-Keller, K.; Leth-Larsen, R.; Parker, B. L.; Larsen, M. R. Selective enrichment of sialic acid-containing glycopeptides using titanium dioxide chromatography with analysis by HILIC and mass spectrometry. Nat. Protoc. 2010, 5 (12), 1974−1982.
(33) Zhang, H.; Li, X.-j.; Martin, D. B.; Aebersold, R. Identification and quantification of N-linked glycoproteins using hydrazide chemistry, stable isotope labeling and mass spectrometry. Nat. Biotechnol. 2003, 21 (6), 660−666.
(34) Halim, A.; Nilsson, J.; Ruetschi, U.; Hesse, C.; Larson, G. Human urinary glycoproteomics; attachment site specific analysis of N- and O-linked glycosylations by CID and ECD. Mol. Cell. Proteomics 2012, 11, M111−013649.
(35) Carlsohn, E.; Nystrom, J.; Karlsson, H.; Svennerholm, A.-M.; Nilsson, C. L. Characterization of the outer membrane protein profile from disease-related Helicobacter pylori isolates by subcellular fractionation and nano-LC FT-ICR MS analysis. J. Proteome Res. 2006, 5 (11), 3197−3204.
(36) Olsen, J. V. Parts per Million Mass Accuracy on an Orbitrap Mass Spectrometer via Lock Mass Injection into a C-trap. Mol. Cell. Proteomics 2005, 4 (12), 2010−2021.
(37) Crooks, G. E.; Hon, G.; Chandonia, J.-M.; Brenner, S. E. WebLogo: A sequence logo generator. Genome Res. 2004, 14 (6), 1188−1190.
(38) Breci, L. A.; Tabb, D. L.; Yates, J. R.; Wysocki, V. H. Cleavage N-terminal to proline: Analysis of a database of peptide tandem mass spectra. Anal. Chem. 2003, 75 (9), 1963−1971.
(39) Cooper, H. J.; Hakansson, K.; Marshall, A. G. The role of electron capture dissociation in biomolecular analysis. Mass Spectrom. Rev. 2005, 24 (2), 201−222.
(40) Lee, Y.; Kockx, M.; Raftery, M. J.; Jessup, W.; Griffith, R.; Kritharides, L. Glycosylation and sialylation of macrophage-derived human apolipoprotein E analyzed by SDS-PAGE and mass spectrometry: Evidence for a novel site of glycosylation on Ser290. Mol. Cell. Proteomics 2010, 9 (9), 1968−1981.
(41) Zhang, Y.; Iwasaki, H.; Wang, H.; Kudo, T.; Kalka, T.; Hennet, T.; Kubota, T.; Cheng, L.; Inaba, N.; Gotoh, M.; Togayachi, A.; Guo, J.; Hisatomi, H.; Nakajima, K.; Nishihara, S.; Nakamura, M.; Marth, J.; Narimatsu, H. Cloning and characterization of a new human UDP-N-acetyl-α-D-galactosamine:polypeptide N-acetylgalactosaminyltransferase, designated pp-GalNAc-T13, that is specifically expressed in neurons and synthesizes GalNAc α-serine/threonine antigen. J. Biol. Chem. 2003, 278 (1), 573−584.
(42) Wernette-Hammond, M. E.; Lauer, S. J.; Corsini, A.; Walker, D.; Taylor, J. M.; Rall, S. C. Glycosylation of human apolipoprotein E. The carbohydrate attachment site is threonine 194. J. Biol. Chem. 1989, 264 (15), 9094−9101.
(43) Takahashi, N.; Takahashi, Y.; Putnam, F. W. Structure of human hemopexin: O-Glycosyl and N-glycosyl sites and unusual clustering of tryptophan residues. Proc. Natl. Acad. Sci. U.S.A. 1984, 81 (7), 2021−2025.
(44) Bock, S. C.; Skriver, K.; Nielsen, E.; Thøgersen, H. C.; Wiman, B.; Donaldson, V. H.; Eddy, R. L.; Marrinan, J.; Radziejewska, E.; Huber, R. Human C1 inhibitor: Primary structure, cDNA cloning, and chromosomal localization. Biochemistry 1986, 25 (15), 4292−4301.
(45) Deshpande, N.; Jensen, P. H.; Packer, N. H.; Kolarich, D. GlycoSpectrumScan: Fishing glycopeptides from MS spectra of protease digests of human colostrum sIgA. J. Proteome Res. 2010, 9 (2), 1063−1075.
(46) Cooper, C. A.; Gasteiger, E.; Packer, N. H. GlycoMod: A software tool for determining glycosylation compositions from mass spectrometric data. Proteomics 2001, 1 (2), 340−349.
(47) Timm, W.; Ozlu, N.; Steen, J. J.; Steen, H. Effect of high-accuracy precursor masses on phosphopeptide identification from MS3 spectra. Anal. Chem. 2010, 82 (10), 3977−3980.
(48) Gupta, N.; Bark, S. J.; Lu, W. D.; Taupenot, L.; O’Connor, D. T.; Pevzner, P.; Hook, V. Mass Spectrometry-Based Neuropeptidomics of Secretory Vesicles from Human Adrenal Medullary Pheochromocytoma Reveals Novel Peptide Products of Prohormone Processing. J. Proteome Res. 2010, 9, 5065−5075.
(49) Zougman, A.; Pilch, B.; Podtelejnikov, A.; Kiehntopf, M.; Schnabel, C.; Kumar, C.; Mann, M. Integrated analysis of the
cerebrospinal fluid peptidome and proteome. J. Proteome Res. 2008, 7
(1), 386−399.
(50) Gram Schjoldager, K. T.-B.; Vester-Christensen, M. B.; Goth, C.
K.; Petersen, T. N.; Brunak, S.; Bennett, E. P.; Levery, S. B.; Clausen,
H. A Systematic Study of Site-specific GalNAc-type O-Glycosylation
Modulating Proprotein Convertase Processing. J. Biol. Chem. 2011,
286 (46), 40122−40132.
(51) Wilson, I. B.; Gavel, Y.; von Heijne, G. Amino acid distributions
around O-linked glycosylation sites. Biochem. J. 1991, 275 (Pt 2), 529−
534.
(52) Elhammer, A. P.; Poorman, R. A.; Brown, E.; Maggiora, L. L.;
Hoogerheide, J. G.; Kézdy, F. J. The specificity of UDP-
GalNAc:polypeptide N-acetylgalactosaminyltransferase as inferred
from a database of in vivo substrates and from the in vitro
glycosylation of proteins and peptides. J. Biol. Chem. 1993, 268
(14), 10029−10038.
(53) Gupta, R.; Birch, H.; Rapacki, K.; Brunak, S.; Hansen, J. E. O-
GLYCBASE version 4.0: A revised database of O-glycosylated
proteins. Nucleic Acids Res. 1999, 27 (1), 370−372.
(54) Thanka Christlet, T. H.; Veluraja, K. Database analysis of O-
glycosylation sites in proteins. Biophys. J. 2001, 80 (2), 952−960.
584 dx.doi.org/10.1021/pr300963h | J. Proteome Res. 2013, 12, 573−584