ProForma Proteoform Notation
Published in the Journal of Proteome Research here.
Releases and Working Version
A numbered version of the standard notation PDF is found here.
The work-in-progress version of the standard is here.
Our subcommittee was tasked with proposing a notation for writing the sequence of a given proteoform. A proteoform is a specific set of amino acids arranged in a particular order, which may be further modified (cotranslationally, posttranslationally, or chemically) at designated locations. We met by phone for an hour a week between November and December 2016, working on electronically shared documents. Our task is described below:
Task: provide an unambiguous notation for writing an individual proteoform. The notation must:
- Be human readable, suitable for display in a written document or presentation.
- Be machine parsable.
- Contain the complete amino acid sequence of the observed proteoform.
- Specify the location and type of each modification.
We welcome comments, suggestions and extensions to the notation to ensure its widespread usage. They can be submitted as issues and will then be discussed to find a common solution.
Software such as stand-alone tools, plugins, and libraries can be added via pull requests and will be carefully evaluated by the committee.
Validator software checks a file of proteoform sequences for inconsistencies such as incorrect use of the nomenclature rules.
Converters translate to and from files containing proteoform sequences annotated according to the notation.
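To give a feel for the kind of check a validator might perform, here is a minimal sketch in Python. It is not the committee's validator and does not implement the rules of the standard. It assumes a simplified bracketed form in which modifications are written in square brackets after a residue, as in `EM[Oxidation]EVEES[Phospho]PEK`, and in which `-` may separate a terminal modification from the sequence. The function name `check_sequence` and the accepted letter set are hypothetical choices made for illustration only.

```python
# Toy consistency check for one annotated sequence.
# Assumption: modifications are enclosed in square brackets and everything
# outside the brackets is a one-letter amino acid code or a '-' separator.
# This is an illustrative simplification, not the rules of the standard.

# Assumption: the 20 standard one-letter codes plus common extended/ambiguity codes.
AMINO_ACIDS = set("ACDEFGHIKLMNPQRSTVWYBJOUXZ")


def check_sequence(seq):
    """Return a list of problems found in seq (an empty list means none found)."""
    problems = []
    depth = 0      # current nesting depth of '[' ... ']'
    residues = 0   # number of amino acid letters seen outside brackets
    for pos, ch in enumerate(seq):
        if ch == "[":
            depth += 1
        elif ch == "]":
            depth -= 1
            if depth < 0:
                problems.append(f"unmatched ']' at position {pos}")
                depth = 0
        elif depth == 0:
            # Outside any bracket: only residues and '-' are accepted here.
            if ch in AMINO_ACIDS:
                residues += 1
            elif ch != "-":
                problems.append(f"unexpected character {ch!r} at position {pos}")
        # Characters inside brackets are not inspected in this sketch.
    if depth > 0:
        problems.append("unclosed '['")
    if residues == 0:
        problems.append("no amino acid residues")
    return problems


if __name__ == "__main__":
    for example in ["EM[Oxidation]EVEES[Phospho]PEK", "EM[OxidationEVEES", "PEPT1DE"]:
        print(example, check_sequence(example))
```

A real validator would also need to verify the modification names themselves, for example against a controlled vocabulary, which this sketch deliberately leaves out.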