created: 23rd June 1997, last updated: 23rd June 1997,© 1998 ABRF
ABRF-97SEQ: SEQUENCING RESULTS OF A LOW LEVEL SAMPLE
Kathryn Stone, Joseph Fernandez, Arie Admon, William Henzel, William Lane, Michael Rohde and Laurey Steinke
Association of Biomolecular Resource Facilities (ABRF) Protein Sequence Research Committee, 9650 Rockville Pike, Bethesda MD 20814.
The ABRF-97SEQ sample is the 10th in a series of studies designed to aid participant laboratories in determining their abilities to obtain amino acid sequence data. This year's sample was a mixture of 2 peptides at an approximate pmole ratio of 10:2 and was indicative of a peak that might be obtained by RP-HPLC of a tryptic digest. Participants were asked to use Edman sequencing or a combination of Edman sequencing and MS/MS or PSD sequence analysis.
One goal of this year's study was to determine the success rate of cysteine determination when a pre-derivatized sample was provided. Thus, the cysteine(s) in the sample were modified with acrylamide to Cys-S-propionamide before the sample was sent out. A PTH-Cys-S-PAM standard was supplied along with the sample so that participant laboratories could first determine the separation of this cysteine derivative on their sequencing systems. An internal sequencing standard was also supplied. This standard, which participants were asked to co-sequence with the ABRF-97SEQ sample, allowed independent monitoring of sequencer performance. Another goal of the study was to help participants correctly use the information in the MALDI-MS spectrum of ABRF-97SEQ that was provided. Participants were asked to use this spectrum to assist in determining the length of the peptide and to help verify that the correct sequence had been called.
Introduction
ABRF-97SEQ is the tenth in a series of unknown samples distributed to those members of the Association of Biomolecular Resource Facilities that perform protein sequencing. These samples are designed to allow member laboratories to evaluate themselves using a sample that presents some of the problems facilities routinely encounter. The current sample was designed to examine the ability to sequence low level mixtures of peptides (a situation often encountered in core facilities) and to reliably identify tryptophan and derivatized cysteine in various positions in a peptide. In addition, this sample was designed to be very similar to the mixture of peptides used to provide the dataset distributed as ABRF-96SEQ. This allows comparison of sequence calling expertise with the technical aspect of sequencing peptides.
Methods
Peptides were synthesized by Janet Crawford at Yale University on a Rainin Symphony Multiple Peptide Synthesizer using double coupling and standard Fmoc chemistry. The cysteine residues were derivatized by Joe Fernandez at Rockefeller University using acrylamide. Amino acid analysis was performed by standard techniques to determine the concentration of the peptides. The peptides were then mixed at a 10:2 ratio of the 21-mer to the 14-mer. The committee members sequenced the mixture by Edman, MS/MS and PSD to ensure sample quality. Samples were then distributed to ABRF member laboratories, which were asked to sequence the mixture after adding 5 picomoles of an internal standard. The 17-mer internal sequencing standard contained norleucine in positions 1, 6, 11 and 16 and succinylated lysine in the remaining positions. The purpose of the internal sequencing standard was to allow independent monitoring of sequencer performance during the run. Results were reported to a third party, which removed identifying marks and forwarded the data to the sequencing committee for analysis.
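Repetitive yield from an internal standard of this kind is conventionally estimated from a log-linear fit of recovered PTH-norleucine against cycle number. The following is a minimal sketch of that calculation, assuming hypothetical picomole recoveries at the four norleucine cycles; the function name and the numbers are illustrative and are not taken from the study.

```python
import math

def repetitive_yield(cycles, pmol):
    """Fit ln(pmol) against cycle number by least squares.

    The repetitive yield is exp(slope): the fraction of peptide
    carried over from one Edman cycle to the next.
    """
    n = len(cycles)
    logs = [math.log(p) for p in pmol]
    mean_x = sum(cycles) / n
    mean_y = sum(logs) / n
    sxx = sum((x - mean_x) ** 2 for x in cycles)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(cycles, logs))
    return math.exp(sxy / sxx)

# Norleucine positions in the 17-mer internal standard (from the Methods).
nle_cycles = [1, 6, 11, 16]

# Hypothetical recoveries: 3.0 pmol in cycle 1, decaying by 7% per cycle.
nle_pmol = [3.0 * 0.93 ** (c - 1) for c in nle_cycles]

ry = repetitive_yield(nle_cycles, nle_pmol)
print(f"repetitive yield = {ry:.1%}")
```

Fitting all four norleucine cycles, rather than taking the ratio of two of them, makes the estimate less sensitive to a single poorly integrated peak.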
Abbreviations Key
PC Positive Correct
PW Positive Wrong [Includes calls of end before the end of the peptide]
TC Tentative Correct [Includes calls of (end) before the end of the peptide]
TW Tentative Wrong
Major Sequence: IWTCM EGANS YQCAS WAGLY FK
Minor Sequence: HYAEG DSVAT KPAR
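The four categories in the key can be tallied mechanically once a laboratory's calls are lined up against the known sequence. The sketch below assumes a hypothetical notation in which a tentative call is written in parentheses and an unassigned cycle is an empty string. It also assumes that positive accuracy means PC / (PC + PW) and that a call made past the end of the peptide counts as wrong; the report does not spell out either convention, and the example calls are invented.

```python
from collections import Counter

MAJOR = "IWTCMEGANSYQCASWAGLYFK"

def score_calls(true_seq, calls):
    """Count PC, PW, TC, TW and unassigned cycles for one set of calls."""
    counts = Counter()
    for cycle, call in enumerate(calls, start=1):
        if not call:
            counts["unassigned"] += 1
            continue
        tentative = call.startswith("(") and call.endswith(")")
        residue = call.strip("()")
        # A call beyond the last residue has no correct answer.
        expected = true_seq[cycle - 1] if cycle <= len(true_seq) else None
        correct = residue == expected
        counts[("T" if tentative else "P") + ("C" if correct else "W")] += 1
    return counts

def positive_accuracy(counts):
    """Assumed definition: PC / (PC + PW); None if no positive calls."""
    positives = counts["PC"] + counts["PW"]
    return counts["PC"] / positives if positives else None

# Invented example: no call in cycle 1, tentative calls in cycles 10, 15
# and 16 (cycle 15 wrong), and a positive overcall of K in cycle 23.
calls = ["", "W", "T", "C", "M", "E", "G", "A", "N", "(S)", "Y", "Q",
         "C", "A", "(T)", "(W)", "A", "G", "L", "Y", "F", "K", "K"]

counts = score_calls(MAJOR, calls)
print(dict(counts), positive_accuracy(counts))
```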
In the major sequence, the residues identified correctly most often were alanine in cycle 8, methionine in cycle 5, glutamate in cycle 6 and glycine in cycle 7. The residues identified incorrectly most often were isoleucine in cycle 1, tryptophan in cycle 16 and serine in cycles 15 and 10.
In the minor sequence, the residues identified correctly most often were valine in cycle 8 and glutamate in cycle 4. The residues identified incorrectly most often were histidine in cycle 1 and alanine in cycle 9. Cycle 1 can be difficult to call because any free amino acids elute in this cycle, and many respondents opted to make a "no call" for this residue. Cycle 9 was complicated by sequencing lag from the major sequence, which contained an alanine in cycle 8.
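The lag problem described for cycle 9 can be checked directly from the two sequences: carryover from the major peptide is hardest to separate from a true minor signal when the minor residue in one cycle is the same amino acid as the major residue in the previous cycle. The short sketch below (the function name is illustrative) lists those cycles.

```python
MAJOR = "IWTCMEGANSYQCASWAGLYFK"
MINOR = "HYAEGDSVATKPAR"

def lag_conflicts(major, minor):
    """Return cycles where the minor residue matches the major residue
    of the preceding cycle, so lag and true signal coincide."""
    last_cycle = min(len(major) + 1, len(minor))
    return [
        cycle
        for cycle in range(2, last_cycle + 1)
        if minor[cycle - 1] == major[cycle - 2]
    ]

print(lag_conflicts(MAJOR, MINOR))  # [9]
```

For these two peptides the only such cycle is cycle 9, where alanine in the minor sequence follows alanine in cycle 8 of the major sequence.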
[Table: sequence assignment statistics (a). Rows: Total # Cycles Assigned; Average Cycles Assigned; Total # Correct Assignments; Total # Incorrect Assignments; Total # Positive Assignments; Total # Tentative Assignments; Average # Correct Assigned; Average # Positive Assigned; Average # Tentative Assigned; Average # Incorrect Assigned; Accuracy of PC Assignments; Accuracy of TC Assignments. Cell values are not preserved in this copy.]
(a) Sequence assignments were categorized as positive correct (PC), tentative correct (TC), positive incorrect (PI; PW in the abbreviations key), or tentative incorrect (TI; TW in the abbreviations key). The number of responses (R) was 50 for ABRF-97SEQ and 95 for ABRF-96SEQ B. The number of unassigned residues was 53, 137, 15, and 177 for ABRF-97SEQ major, ABRF-97SEQ minor, ABRF-96SEQ dataset B major, and ABRF-96SEQ dataset B minor, respectively.
As indicated previously, this year's sample (ABRF-97SEQ) was designed to be similar to the previous study (ABRF-96SEQ), in which a sequencing dataset, instead of a sample, was sent out. As indicated in the above table, a greater number of amino acid residues were called positively and correctly by participants in 1996 for both the major and minor sequences. In the current study, the peptide mixture, which contained peptides similar in composition and length to those of the ABRF-96SEQ study, was sequenced in the participating laboratories. The average number of cycles called for the major sequence was 17 residues in the ABRF-97SEQ study and 22 residues in the ABRF-96SEQ study.
Comparison of Cysteine and Tryptophan Assignments for ABRF-SEQ Samples
[Table rows: STD-1; ABRF-89SEQ; ABRF-90SEQ; ABRF-91SEQ; ABRF-92SEQ; ABRF-93SEQ; ABRF-94SEQ; ABRF-95SEQ; ABRF-96SEQA; ABRF-96SEQB Major; ABRF-96SEQB Minor; ABRF-97SEQ Major; ABRF-97SEQ Minor. Cell values are not preserved in this copy.]
The positive accuracy of cysteine determination is slightly lower in ABRF-97SEQ Major (88% and 97% for the two positions) than it was in ABRF-96SEQB Major (99% for both positions, C6 and C13). In the ABRF-96SEQ study, a dataset containing carboxamidomethylated cysteine was provided for sequence calling, whereas the cysteines in the ABRF-97SEQ sample were derivatized to Cys-S-propionamide before the sample was sent out. By contrast, positive accuracy of cysteine identification was lower (63%) in ABRF-95SEQ, where laboratories were asked to derivatize cysteine in-house. Positive identification of tryptophan was similar in ABRF-96SEQB Major and ABRF-97SEQ Major when it appeared early in the peptide (83% and 90%). When tryptophan appeared late in the peptide, positive accuracy of identification was lower in the peptide sequenced in member laboratories (64% compared to 92%).
[Table: results by sequencer model (a). Rows: ABD 470; ABD 473/6; ABD 477; ABD 49X-HT; HP G100X; Porton/Beckman; ABD 494-HS. Cell values are not preserved in this copy.]
(a) HT = high throughput and HS = high sensitivity
Participants who used ABD 49X-HT were able to make an average of 17 positive correct calls for ABRF-97SEQ Major and 5 positive correct calls for ABRF-97SEQ Minor.
Use of Mass Spectrometry in Protein Sequencing
Surprisingly, little use was made of mass spectrometry in analyzing this sample. Mass spectrometry is used only in a minority of protein sequencing facilities, as indicated by the survey responses. The committee feels this is a valuable tool and would expect wider use.
Minimum uses of MS include:
Each of these will add confidence to the assigned sequence.

MS/MS sequencing has some advantages over Edman:
Sequencing by Edman degradation also has advantages over MS/MS:
In summary, Edman and mass spectrometry data are complementary. Both techniques are essential in protein characterization and should be used together whenever possible.
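One way to use a supplied mass spectrum alongside Edman data is to compute the expected mass of a called sequence and compare it with the observed peak, which checks both the length and the composition of the call. The sketch below uses standard monoisotopic residue masses and adds the acrylamide adduct (C3H5NO, +71.03711 Da) to each cysteine, as in the ABRF-97SEQ sample. The function names and the 2 Da tolerance are assumptions made for illustration, and a low-resolution instrument may report average rather than monoisotopic masses.

```python
# Standard monoisotopic residue masses in daltons.
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056
PROTON = 1.00728
PROPIONAMIDE = 71.03711  # acrylamide adduct on each cysteine

MAJOR = "IWTCMEGANSYQCASWAGLYFK"
MINOR = "HYAEGDSVATKPAR"

def mh_plus(sequence, cys_propionamide=True):
    """Expected monoisotopic [M+H]+ of an unblocked linear peptide."""
    mass = sum(RESIDUE_MASS[aa] for aa in sequence) + WATER + PROTON
    if cys_propionamide:
        mass += PROPIONAMIDE * sequence.count("C")
    return mass

def consistent_with_ms(called, observed_mh, tolerance=2.0):
    """True if the called sequence accounts for the observed mass.

    The tolerance is an assumed value, not one taken from the study.
    """
    return abs(mh_plus(called) - observed_mh) <= tolerance

for name, seq in (("major", MAJOR), ("minor", MINOR)):
    print(f"{name}: {len(seq)} residues, [M+H]+ = {mh_plus(seq):.2f}")
```

A call that stops one residue short, or that runs one residue past the true C-terminus, differs from the observed mass by at least 57 Da (glycine, the lightest residue), so even a loose tolerance flags it.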
ABRF-97SEQ Survey Demographics
All fifty respondents used Edman sequencing. Thirty-eight used ABI sequencers (21 used Procise 49X HT (High Throughput), one used 49X HS (High Sensitivity), 7 used 477A, 4 used 473/6A, and 5 used 470A), four used Beckman/Porton, and eight used Hewlett Packard instruments. The average age for all instruments was 4.02 years.
For the sample support matrix, glass fiber filters were the choice of 37 users, 4 used PVDF, one used both PVDF and glass fiber, and seven used biphasic columns (one HP user used PVDF). One survey did not specify the support used.
Forty labs did not use modified reagents or cycles. Modifications listed included: n-Me-piperidine and microcartridge, unspecified improvements to cycles, non-vendor solvents and reagents, added DTT in S1 and S2, or PITC in TMA.
Premix buffers (from concentrate or ready to use) were used by 44 facilities. About half responded that additives were used (most commonly acetone, inorganic phosphate or TFA; one lab used tryptophan and three added DMPTU to buffer B).
Twenty-five labs indicated that over the past year >75% of their Edman sequencing was in the 10-75 pmole range, twenty in the 1-10 range, two under a picomole and one in the 250-1000 pmole range.
The reconstitution volume averaged 47 µL (47 out of 50 reported a volume used); thirty-five labs used the committee's recommended solvent. Only two labs reported taking sample for MS, but the average amount reported to have been used for Edman was 32 µL versus the average of 47 µL used for reconstitution. The fraction of PTH-AAs injected for HPLC analysis was 69% on average, with a minimum of 42% and a maximum of 95%.
Use of the internal standard supplied by the committee was reported as "yes" by 42 and "no" by 6 labs. Thirty-three found it useful and 31 would use it again (9 and 15 responded "no", respectively). The PTH-PAMCys standard supplied by the committee was used by 40 labs. Most found it between Glu and His on the ABI systems, some on top of DMPTU; one placed it between Gly and Glu. Most HP users found it before His.
Only 20.5 labs reported that they routinely used MS and Edman for analysis of peptides. Sixteen reported that they used MS for purity assessment, fourteen to estimate the number of cycles, fifteen to eliminate Edman ambiguities, and seven reported using the MS/MS mode. Nine labs did their own MS analysis on the sample, five using Perceptive and four using Finnigan instrumentation; six used MALDI (one Lasermat and the rest assorted Perceptive) and three used electrospray (two TSQ and one LCQ).
When asked what type of sample would be preferred for next year's study, only thirty-three responses were received; the weighted average scores were close for either <10 pmole or blotted samples; less highly scored were the options of mixtures or blocked samples. Other suggestions included: modified or unknown amino acids (n = 2), blotted sample under 2 pmoles of a single sequence, CNBr cleavage problems of blocked PVDF or solution sample, radiosequence, and "No Sample!"
Other general comments:
Conclusions
- Three laboratories assigned all amino acid residues correctly for both the major and the minor sequences. Five more laboratories were able to assign all the residues in the major sequence correctly while not assigning all residues in ABRF-97SEQ minor.
- The accuracy of positive and tentative correct assignments is lower in the 2 picomole (minor) sequence than it is with the 10 picomole (major) sequence. This study does not provide data to determine if accuracy of the 2 picomole sequence is lower because of the low level of sample or if it is lower because this is the minor component of a mixture.
- Positive accuracy of a 10 picomole sample is slightly higher when member laboratories are provided with a dataset, as opposed to performing sequencing in-house. The accuracy of a 2 picomole sample of peptide is significantly lower when sequencing is performed in-house as opposed to a dataset being provided.
- All three labs that correctly called the entire ABRF-97SEQ use an ABD 494-HT. All three labs use Premix buffers with acetone and phosphate added to buffer A. All three also routinely use both MS and Edman analysis for peptides to screen peptides for purity, to estimate the number of Edman cycles needed and to eliminate Edman ambiguities. None of the three use MS/MS mode.
- Based on the high percentage accuracy for cysteine calls, it appears that the derivatization of cysteine, rather than the identification of derivatized cysteine, is what poses a problem for member laboratories.
- Tryptophan is identified with good positive accuracy when found early during Edman sequencing. When it is found in cycle 16, positive accuracy falls. This was not the case when data were provided to member laboratories. It is possible that differences in positive accuracy in identifying tryptophan are due in part to low or poor recoveries of this PTH amino acid.
- The positive accuracy of those laboratories using the same instrument used to generate the dataset for ABRF-96SEQB is comparable to the positive accuracy for the dataset last year. Problems associated with sequence identification are attributable more to instrumentation and sample handling than to data interpretation.
- There was no correlation between the number of positive correct identifications in either the major or minor peptide and the repetitive yields of the machines as measured by the internal standard.
- Mass Spectrometry was not yet being widely used among participating member laboratories to sequence peptides. 42% of member laboratories are using mass spectrometry for mass determination and aid in sequencing.
- The 22% of participants who made positive overcalls for the first cycle after the end of the peptide sequence had all marked that they did not use mass spectrometry in their laboratories. 30% of these overcalls were lysine, indicating a problem with lag. Positive overcalls should not have occurred, since a MALDI-MS spectrum was provided with the sample.
Protein Sequence Committee Questionnaire (accompanying the poster)
Did you participate in this year's study: ABRF-97SEQ?
Yes. (Thank you for your response. Please feel free to add your comments)
No. (Please continue with the next question.)
Why didn't you participate in the study?
No time
Sample amount was too low.
It costs too much to run a test sample.
I/We don't care about such studies.
There were too many ABRF samples this year; I/we couldn't do them all.
I/We ran it but got no results so we did not bother to send in a negative result.
Other_____________________________________________________________
Comments: