| Label |
Technical definition |
Practical definition |
| Canonical |
From each ProteinProphet protein_group, the protein with the
highest probability is selected to be canonical. Then, recursively,
any other protein in that group which shares fewer than 80% of its
peptides with any other canonical from that group is also labeled
canonical. During this selection process, each set of
indistinguishable proteins is considered to be a single entity, and
the one from that set with the most preferred identifier (for human
and mouse, Swiss-Prot primary splice variant) is the one labeled
canonical. |
The set of canonical proteins is a minimal, non-redundant list of
proteins derived from the set of identified peptides for an Atlas. The
number of canonicals is what we use as the protein count for the Atlas
build. |
| Possibly Distinguished |
From each ProteinProphet protein_group, any protein that is not
canonical and not subsumed is labeled possibly_distinguished. As
above, from among any set of indistinguishable proteins, only one (the
one with the most preferred identifier) is labeled
possibly_distinguished. |
The set of canonical proteins plus possibly_distinguished proteins
is a more inclusive, but also non-redundant, list of proteins derived
from the set of identified peptides for an Atlas. The canonical list
will not explain all observed peptides, but the combined canonical
plus possibly_distinguished list '''will''' explain all observed
peptides. |
| Subsumed |
Any protein labeled subsumed by ProteinProphet. As above, from
among any set of indistinguishable proteins, only one (the one with
the most preferred identifier) is labeled subsumed. |
A protein whose observed peptides are a subset of the observed
peptides of a canonical or possibly_distinguished protein is
considered subsumed. For any pair of subsuming/subsumed proteins, it
is possible that both have been observed, but it is more conservative
to claim that only the subsuming protein has been observed. Subsumed
proteins are not necessary to explain all observed peptides. |
| NTT-Subsumed |
Any protein that is possibly_distinguished by the above
definition, but whose peptides differ from a canonical only by the
number of tryptic termini. |
Proteins that are ntt-subsumed contain exactly the same set of
observed peptides as a canonical protein, but at least one of those
peptides has fewer tryptic termini in the ntt-subsumed protein. For
any pair of possibly_distinguished/ntt-subsumed proteins, it is much
more likely that the possibly_distinguished protein has been observed,
because that will be the one with the greater number of tryptic
termini among its peptides. |
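
The labeling rules in the table above can be sketched in Python as follows. This is only an illustrative sketch under several assumptions, not the code used to build an Atlas. The `Protein` class, the `label_group` function and all field names are hypothetical. Each entry is assumed to already stand for one set of indistinguishable proteins, represented by its most preferred identifier. Candidates are visited in order of decreasing probability. "Shares fewer than 80% of its peptides with any other canonical" is read as "with every canonical chosen so far". Proteins flagged as subsumed by ProteinProphet are assumed never to be canonical.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Protein:
    name: str                   # most preferred identifier of an indistinguishable set
    probability: float          # ProteinProphet probability
    peptides: Dict[str, int]    # peptide sequence -> number of tryptic termini (0, 1 or 2)
    pp_subsumed: bool = False   # True if ProteinProphet labeled the protein subsumed


def shared_fraction(candidate: Protein, other: Protein) -> float:
    """Fraction of the candidate's peptides that are also seen in `other`."""
    if not candidate.peptides:
        return 1.0
    shared = candidate.peptides.keys() & other.peptides.keys()
    return len(shared) / len(candidate.peptides)


def is_ntt_subsumed(candidate: Protein, canonical: Protein) -> bool:
    """Same peptide sequences as the canonical, but fewer tryptic termini on at least one."""
    if candidate.peptides.keys() != canonical.peptides.keys():
        return False
    return any(candidate.peptides[seq] < canonical.peptides[seq]
               for seq in candidate.peptides)


def label_group(group: List[Protein], threshold: float = 0.8) -> Dict[str, str]:
    """Assign one label to every protein of a single protein_group."""
    labels: Dict[str, str] = {}
    canonicals: List[Protein] = []
    ranked = sorted(group, key=lambda p: p.probability, reverse=True)

    # Pass 1: pick canonicals. The first non-subsumed protein (highest
    # probability) is always accepted because `canonicals` is still empty.
    for protein in ranked:
        if protein.pp_subsumed:
            labels[protein.name] = "subsumed"
        elif all(shared_fraction(protein, c) < threshold for c in canonicals):
            canonicals.append(protein)
            labels[protein.name] = "canonical"

    # Pass 2: whatever is left is possibly_distinguished, unless it differs
    # from some canonical only by the number of tryptic termini.
    for protein in ranked:
        if protein.name in labels:
            continue
        if any(is_ntt_subsumed(protein, c) for c in canonicals):
            labels[protein.name] = "ntt-subsumed"
        else:
            labels[protein.name] = "possibly_distinguished"
    return labels


# Made-up example group; peptide names P1..P8 are placeholders.
example_group = [
    Protein("A", 1.0, {"P1": 2, "P2": 2, "P3": 2, "P4": 2, "P5": 2}),
    Protein("B", 0.9, {"P4": 2, "P5": 2, "P6": 2, "P7": 2}),
    Protein("C", 0.8, {"P1": 2, "P2": 2, "P3": 2, "P4": 2, "P5": 2, "P8": 2}),
    Protein("D", 0.5, {"P1": 2, "P2": 2}, pp_subsumed=True),
    Protein("E", 0.7, {"P4": 2, "P5": 2, "P6": 1, "P7": 2}),
]
example_labels = label_group(example_group)
```

In this made-up group, B shares only half of its peptides with A and so becomes a second canonical. C shares five of its six peptides with A and is therefore possibly_distinguished. E has the same peptide sequences as B but fewer tryptic termini on one of them, so it is ntt-subsumed.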