Protein domains of low sequence complexity--dark matter of the proteome [Perspectives]

Steven L. McKnight

Department of Biochemistry, UT Southwestern Medical Center, Dallas, Texas 75390-9152, USA

Corresponding author: steven.mcknight@utsouthwestern.edu

Abstract

This perspective begins with a speculative consideration of the properties of the earliest proteins to appear during evolution. What did these primitive proteins look like, and how were they of benefit to early forms of life? I proceed to hypothesize that primitive proteins have been preserved through evolution and now serve diverse functions important to the dynamics of cell morphology and biological regulation. The primitive nature of these modern proteins is easy to spot. They are composed of a limited subset of the 20 amino acids used by traditionally evolved proteins and thus are of low sequence complexity. This chemical simplicity limits protein domains of low sequence complexity to forming only a crude and labile type of protein structure currently hidden from the computational powers of machine learning. I conclude by hypothesizing that this structural weakness represents the underlying virtue of proteins that, at least for the moment, constitute the dark matter of the proteome.
