[Bioinformatics] Exploring the Protein Sequence Space with Global Generative Models

Sergio Romero-Romero (1,4), Sebastian Lindner (2,4) and Noelia Ferruz (3)
1 Department of Biochemistry, University of Bayreuth, 95447 Bayreuth, Germany
2 University of Heidelberg, 69047 Heidelberg, Germany
3 Barcelona Institute of Molecular Biology, 08028 Barcelona, Spain
Correspondence: noelia.ferruz@ibmb.csic.es

4 These authors contributed equally to this work.

Recent advances in specialized large-scale architectures trained on images and language have profoundly impacted the fields of computer vision and natural language processing (NLP). Language models, such as the recent ChatGPT and GPT-4, have demonstrated exceptional capabilities in processing, translating, and generating human language. These breakthroughs have carried over to protein research, where numerous new methods with unprecedented performance have been developed in a short time. Several of these models aim to generate sequences in novel regions of the protein space. In this work, we provide an overview of protein generative models, reviewing (1) language models for the design of novel artificial proteins, (2) works that use non-transformer architectures, and (3) applications in directed evolution approaches.
