ProForma: a Standard Proteoform Notation

The Consortium for Top Down Proteomics standardized notation for writing fully characterized proteoforms.

ProForma Proteoform Notation

Version 2.0

ProForma version 2.0 is published in the Journal of Proteome Research.

ProForma is now being actively maintained by HUPO-PSI. Please see https://github.com/HUPO-PSI/ProForma.

Version 1.0

ProForma version 1.0 is also published in the Journal of Proteome Research.

Releases and Working Version

A numbered version of the standard notation PDF is found here.

The work-in-progress version of the standard is here.

Introduction

Our subcommittee was tasked with proposing a notation for writing the sequence of a given proteoform. A proteoform is a specific set of amino acids arranged in a particular order, which may be further modified (cotranslationally, posttranslationally, or chemically) at designated locations. We met by phone for an hour a week between November and December 2016, working on electronically shared documents. Our task is described below:

Task: provide an unambiguous notation for writing an individual proteoform. The notation must:

Be human readable and suitable for display in a written document or presentation.
Be machine parsable.
Contain the complete amino acid sequence of the observed proteoform.
Specify the location and type of each modification.
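
To illustrate how one string can meet all four requirements, here is a minimal Python sketch. It assumes a simplified form of the notation in which a modification is written as a name in square brackets directly after the residue it modifies. The function name `parse_proteoform` and the example sequence are hypothetical, and the sketch ignores terminal modifications and every other feature of the full standard.

```python
import re

# Assumed simplified syntax: one uppercase residue letter, optionally followed
# by a single square-bracket tag naming the modification on that residue.
TOKEN = re.compile(r"([A-Z])(?:\[([^\[\]]+)\])?")


def parse_proteoform(text):
    """Return (sequence, {1-based residue position: modification name})."""
    sequence = []
    modifications = {}
    pos = 0
    while pos < len(text):
        match = TOKEN.match(text, pos)
        if match is None:
            raise ValueError(f"unexpected character at offset {pos}: {text[pos]!r}")
        residue, tag = match.groups()
        sequence.append(residue)
        if tag is not None:
            # Record the tag against the residue that was just appended.
            modifications[len(sequence)] = tag
        pos = match.end()
    return "".join(sequence), modifications


sequence, modifications = parse_proteoform("EM[Oxidation]EVEES[Phospho]PEK")
print(sequence)       # EMEVEESPEK
print(modifications)  # {2: 'Oxidation', 7: 'Phospho'}
```

The annotated string stays readable as written, and the parser recovers both the complete amino acid sequence and the location and type of each modification.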

Contributions

Notation rules

We welcome comments, suggestions and extensions to the notation to ensure its widespread usage. They can be submitted as issues and will then be discussed to find a common solution.

Software tools

Software such as stand-alone tools, plugins, and libraries can be added via pull requests and will be carefully evaluated by the committee.

Validator software checks a file of proteoform sequences for inconsistencies, such as incorrect use of the nomenclature rules.
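
As a rough illustration of what such a check might look like, the Python sketch below scans a list of lines and reports those that fail two basic tests. It is not one of the committee's tools. It assumes one proteoform per line, uppercase residue letters, and modifications written as square-bracket tags after a residue; the function name `validate_lines` and the example entries are hypothetical.

```python
import re

# Assumed simplified syntax: residues are uppercase letters, each optionally
# followed by one square-bracket modification tag.
LINE = re.compile(r"(?:[A-Z](?:\[[^\[\]]+\])?)+")


def validate_lines(lines):
    """Return a list of (line_number, message) for entries that fail the checks."""
    problems = []
    for number, raw in enumerate(lines, start=1):
        entry = raw.strip()
        if not entry:
            continue  # skip blank lines
        if entry.count("[") != entry.count("]"):
            problems.append((number, "unbalanced square brackets"))
        elif LINE.fullmatch(entry) is None:
            problems.append((number, "not a residue sequence with bracketed tags"))
    return problems


example = ["EM[Oxidation]EVEES[Phospho]PEK", "EM[OxidationEVEES", "em[Oxidation]"]
for number, message in validate_lines(example):
    print(f"line {number}: {message}")
```

A real validator would also need to check modification names against a controlled vocabulary and handle the rest of the notation, which this sketch does not attempt.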

Converters allow conversion to and from files containing proteoforms annotated according to the notation.