PeptideAtlas
PeptideAtlas builds are available for download in several file formats, each containing some information not found in the others.

FASTA format file: A file of sequences from the PeptideAtlas build where the first line is ">" followed by the PeptideAtlas accession and the second line is the peptide amino-acid sequence.

>PAp0000001
AAHEEICTTNEGVMYR
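
A minimal sketch of reading this format, assuming only what is described above: a ">" header line carrying the accession, followed by the sequence. The function name `read_peptide_fasta` is hypothetical, and joining multiple sequence lines per record is an assumption made for robustness rather than something the format description states.

```python
from typing import Dict, Iterable


def read_peptide_fasta(lines: Iterable[str]) -> Dict[str, str]:
    """Map each PeptideAtlas accession to its peptide sequence."""
    peptides: Dict[str, str] = {}
    accession = None
    for raw in lines:
        line = raw.strip()
        if not line:
            continue  # skip blank lines between records
        if line.startswith(">"):
            # Header line: everything after ">" is taken as the accession.
            accession = line[1:].strip()
            peptides[accession] = ""
        elif accession is not None:
            # Sequence line: append, in case a sequence is wrapped (assumption).
            peptides[accession] += line
    return peptides


# The example record shown above.
example = [">PAp0000001", "AAHEEICTTNEGVMYR"]
peptides = read_peptide_fasta(example)
```

Because the function accepts any iterable of lines, an open file handle for a downloaded build file can be passed to it directly.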


Biosequence Set: The FASTA file that the peptide sequences are mapped to. It includes the database (including decoys) used for the searches, and may include the Ensembl database if available.

CDS coordinates file: A file containing the peptide accession, and the position of the peptide relative to protein start (CDS coordinates).

| PeptideAtlas accession | Sequence length | Protein accession | Length of sequence match | % Identity (=% match of sequence) | Start of sequence in protein CDS | End of sequence in protein CDS | Difference between sequence and matched sequence |
|---|---|---|---|---|---|---|---|
| PAp00000135 | 10 | ENSP00000315757 | 10 | 100 | 40 | 49 | 0 |
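
As a sketch of how one row of this file might be read, the snippet below splits a line into the eight columns listed above. The field names and the `parse_cds_line` function are hypothetical, and the delimiter is an assumption: the row is split on whitespace, which works for the example row; if the downloaded file is tab-delimited, `line.split("\t")` would be the safer choice.

```python
from typing import NamedTuple


class CdsMapping(NamedTuple):
    peptide_accession: str
    peptide_length: int
    protein_accession: str
    match_length: int
    percent_identity: float
    cds_start: int
    cds_end: int
    difference: str  # kept as text; its exact type is not documented here


def parse_cds_line(line: str) -> CdsMapping:
    """Parse one data row of the CDS coordinates file (assumed whitespace-delimited)."""
    fields = line.split()
    if len(fields) != 8:
        raise ValueError(f"expected 8 columns, found {len(fields)}")
    return CdsMapping(
        peptide_accession=fields[0],
        peptide_length=int(fields[1]),
        protein_accession=fields[2],
        match_length=int(fields[3]),
        percent_identity=float(fields[4]),
        cds_start=int(fields[5]),
        cds_end=int(fields[6]),
        difference=fields[7],
    )


# The example row shown above.
row = parse_cds_line("PAp00000135 10 ENSP00000315757 10 100 40 49 0")
# Number of residues covered by the match, counting both end positions.
covered = row.cds_end - row.cds_start + 1
```

For the example row, the covered range (positions 40 through 49) is 10 residues, which equals the reported peptide length.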

CDS and Chromosomal Coordinates file: A file containing the peptide accession, the peptide's position within a protein relative to protein start (CDS coordinates), and its chromosomal coordinates.

| PeptideAtlas accession | Sequence length | Protein accession | Length of sequence match | % Identity (=% match of sequence) | Start of sequence in protein CDS | End of sequence in protein CDS | Difference between sequence and matched sequence | Protein description | Strand | Start of sequence in chromosome | End of sequence in chromosome | Transcript ID | Gene ID |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| PAp00000135 | 10 | ENSP00000315757 | 10 | 100 | 40 | 49 | 0 | chromosome:NCBI35:13:1:114142980:1 | -1 | 45631043 | 45631072 | ENST00000323076 | ENSG00000136167 |
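
The sketch below pairs each value in a row with one of the fourteen columns listed above and compares the chromosomal span with the peptide length. The short column keys and the `parse_chrom_line` function are hypothetical. Splitting on whitespace is an assumption that holds for the example row but would fail if a protein description contained spaces; for a tab-delimited file, split on `"\t"` instead.

```python
from typing import Dict, List

# Hypothetical short keys for the 14 columns, in the order listed above.
COLUMNS: List[str] = [
    "peptide_accession", "peptide_length", "protein_accession",
    "match_length", "percent_identity", "cds_start", "cds_end",
    "difference", "protein_description", "strand",
    "chrom_start", "chrom_end", "transcript_id", "gene_id",
]


def parse_chrom_line(line: str) -> Dict[str, str]:
    """Parse one data row into a dict keyed by column (assumed whitespace-delimited)."""
    values = line.split()
    if len(values) != len(COLUMNS):
        raise ValueError(f"expected {len(COLUMNS)} columns, found {len(values)}")
    return dict(zip(COLUMNS, values))


# The example row shown above.
record = parse_chrom_line(
    "PAp00000135 10 ENSP00000315757 10 100 40 49 0 "
    "chromosome:NCBI35:13:1:114142980:1 -1 45631043 45631072 "
    "ENST00000323076 ENSG00000136167"
)

# Length of the chromosomal interval, counting both end positions.
genomic_span = int(record["chrom_end"]) - int(record["chrom_start"]) + 1
# Nucleotides per residue implied by this row.
bases_per_residue = genomic_span / int(record["peptide_length"])
```

For this row the interval is 30 bases for a 10-residue peptide, or three bases per residue. That ratio would not be expected to hold for a peptide whose coding sequence is split across exons; how the file reports such cases is not stated here.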

Database export XML: This file contains the contents of the tables in the PeptideAtlas schema for a specified build, exported in XML format. It is suitable for loading into a preexisting PeptideAtlas schema using the SBEAMS DataImport.pl script. It can be loaded into MySQL or MS SQL Server databases, and could possibly be used with others.

mysqldump export file: The PeptideAtlas schema and data for a given build, exported using the mysqldump utility. The data can be loaded into an empty MySQL instance with the mysql command-line utility as follows:

    mysql -u username -D database < PA_export.mysql

This loads much faster than the XML format above, but would probably require some changes to work with a database other than MySQL because of SQL dialect differences.
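
If the load needs to be scripted, the same mysql invocation can be driven from Python. This is a sketch only: the function names are hypothetical, the username, database, and dump path are placeholders, and it assumes the mysql client is on the PATH and that authentication is handled outside the script (for example through a client configuration file).

```python
import subprocess
from typing import List


def mysql_load_command(username: str, database: str) -> List[str]:
    """Build the argument list equivalent to: mysql -u username -D database"""
    return ["mysql", "-u", username, "-D", database]


def load_dump(dump_path: str, username: str, database: str) -> None:
    """Feed the dump file to the mysql client on standard input."""
    with open(dump_path, "rb") as dump:
        # check=True raises CalledProcessError if mysql exits with an error.
        subprocess.run(mysql_load_command(username, database), stdin=dump, check=True)


command = mysql_load_command("username", "database")
```

Calling `load_dump("PA_export.mysql", "username", "database")` would then correspond to the shell redirection form of the command.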

© 2011, Institute for Systems Biology, All Rights Reserved