PeptideAtlas Overview

The long-term goal of the PeptideAtlas project is the full annotation of eukaryotic genomes through thorough validation of expressed proteins. PeptideAtlas provides a method and a framework for accommodating proteome information from high-throughput proteomics technologies. The online database hosts experimental data that are in the public domain. We encourage you to contribute to the database.

Details of the PeptideAtlas construction can be found in the first publication. Briefly, the process of obtaining high-quality peptide sequences, mapping them, and storing them in a database is shown in the figure below and outlined here.

A protein mixture sample is prepared (perhaps labeled, digested with trypsin, purified, and separated using chromatography).

The sample is run through a mass spectrometer (e.g., ESI MS/MS).

The MS/MS spectra are compared to theoretical spectra to identify possible peptides (SEQUEST).

The peptide identifications are scored, formed into false and true positive distributions, and subsequently filtered to retain only the highest scoring identifications (PeptideProphet).
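The filtering step can be sketched as a simple probability cutoff. This is a minimal illustration, not the PeptideProphet implementation: the `PeptideId` record, the `filter_identifications` helper, the example peptides and scores, and the 0.9 cutoff are all assumptions made for the example.

```python
from typing import NamedTuple


class PeptideId(NamedTuple):
    """Hypothetical record for one scored peptide identification."""
    sequence: str
    probability: float  # assumed to be a probability in [0, 1]


def filter_identifications(ids, min_probability=0.9):
    """Keep only identifications at or above the cutoff (0.9 is an assumed value)."""
    return [p for p in ids if p.probability >= min_probability]


# Made-up candidates, purely for illustration.
candidates = [
    PeptideId("LVNELTEFAK", 0.99),
    PeptideId("AEFVEVTK", 0.42),
    PeptideId("YLYEIAR", 0.90),
]
kept = filter_identifications(candidates)
```

With these inputs, `kept` retains the first and third candidates and drops the low-scoring one.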

The peptide sequences are compared to protein sequence databases using the NCBI BLAST program (e.g., for human we use the Ensembl protein sequence database). When a peptide is matched to a protein, its location relative to the protein start (its CDS coordinates) is recorded as well.

The search results are parsed for perfect pattern matches.
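What a perfect match means can be illustrated with a plain substring search. This sketch does not parse BLAST output; it only shows the idea of keeping exact matches and recording where they fall in the protein. The helper name `find_exact_matches`, the 1-based inclusive residue convention, and the toy protein sequence are assumptions.

```python
def find_exact_matches(peptide, protein):
    """Return (start, end) residue positions, 1-based inclusive, of every
    exact occurrence of `peptide` in `protein`, including overlapping ones."""
    if not peptide:
        return []
    matches = []
    start = protein.find(peptide)
    while start != -1:
        matches.append((start + 1, start + len(peptide)))
        # Resume one residue further along so overlapping hits are not missed.
        start = protein.find(peptide, start + 1)
    return matches


# Toy protein sequence, made up for the example.
protein = "MKTAYIAKQRQISFVKAYIAK"
hits = find_exact_matches("AYIAK", protein)     # two exact occurrences
no_hits = find_exact_matches("AYIAR", protein)  # one mismatch, so no hit
```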

The peptide locations in chromosomal coordinates are calculated from the CDS coordinates.
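The coordinate conversion can be sketched as a walk over the coding exons of the transcript. This is an illustration under stated assumptions, not the PeptideAtlas code: exons are given as (start, end) chromosomal positions, 1-based inclusive, listed in transcript order and covering only the coding region; each residue corresponds to three nucleotides; the function names and example coordinates are hypothetical.

```python
def cds_to_genomic(cds_pos, coding_exons, strand=1):
    """Map a 1-based CDS nucleotide position to a chromosomal position.

    coding_exons: (start, end) pairs, 1-based inclusive, in transcript order.
    strand: 1 for the forward strand, -1 for the reverse strand.
    """
    if cds_pos < 1:
        raise ValueError("CDS positions are 1-based")
    remaining = cds_pos
    for start, end in coding_exons:
        length = end - start + 1
        if remaining <= length:
            # On the reverse strand the transcript runs from `end` down to `start`.
            return start + remaining - 1 if strand == 1 else end - remaining + 1
        remaining -= length
    raise ValueError("CDS position lies beyond the coding sequence")


def peptide_to_genomic(aa_start, aa_end, coding_exons, strand=1):
    """Return the chromosomal span (low, high) covered by residues aa_start..aa_end."""
    first = cds_to_genomic((aa_start - 1) * 3 + 1, coding_exons, strand)
    last = cds_to_genomic(aa_end * 3, coding_exons, strand)
    return (min(first, last), max(first, last))


# Two made-up coding exons on the forward strand; the first encodes 10 residues.
exons = [(1000, 1029), (2000, 2059)]
span = peptide_to_genomic(9, 12, exons)
```

Here residues 9 to 12 straddle the exon boundary, so the resulting span `(1024, 2005)` includes the intron between the two exons. A fuller treatment would report the exon-by-exon segments rather than a single span.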

The data are stored in the SBEAMS database and can be accessed through web pages (see Browse Database). Each peptide is assigned a unique identifier of the form PAp[8-digit number], such as PAp00000001, but peptides can also be found by their sequences.
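The identifier format can be made concrete with a small formatter and validator. This follows the stated form, the prefix PAp followed by an 8-digit number. The helper names `format_accession` and `is_accession` are hypothetical, and zero-padding on the left is an assumption.

```python
import re

# "PAp" followed by exactly eight digits, per the form described above.
ACCESSION_PATTERN = re.compile(r"PAp\d{8}")


def format_accession(number):
    """Build an identifier from an integer, zero-padded to eight digits."""
    if not 0 <= number <= 99_999_999:
        raise ValueError("number does not fit in eight digits")
    return f"PAp{number:08d}"


def is_accession(text):
    """Check whether `text` has the PAp[8-digit number] form."""
    return ACCESSION_PATTERN.fullmatch(text) is not None


first_id = format_accession(1)
```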

The human peptides can be browsed as a track within Ensembl's Genome Browser. This functionality is available for the Human_P0.9_Ens30_NCBI35 build.

PeptideAtlas is stored using a database schema that accommodates different builds of PeptideAtlas, different versions of Ensembl, different organisms (for example, human, fly, and mouse), and different reference protein sequence sets as starting material.



© 2005, Institute for Systems Biology, All Rights Reserved