Biotechnology Bulletin ›› 2025, Vol. 41 ›› Issue (10): 143-155. doi: 10.13560/j.cnki.biotech.bull.1985.2025-0470

Advances in Protein Mining and Design Based on Artificial Intelligence

HE Yuan1,2, MOU Qiang1,2, HE Yu-bing2,3, ZHAO Xiao-yan2, WANG Jian1,2, ZHOU Guo-min4,5,6, ZHANG Jian-hua1,2

  1. Agricultural Information Institute, Chinese Academy of Agricultural Sciences, Beijing 100081
  2. National Nanfan Research Institute, Chinese Academy of Agricultural Sciences, Sanya 572024
  3. Institute of Crop Sciences, Chinese Academy of Agricultural Sciences, Beijing 100081
  4. Nanjing Institute of Agricultural Mechanization, Ministry of Agriculture and Rural Affairs, Nanjing 210014
  5. National Agricultural Science Data Center, Beijing 100081
  6. Institute of Western Agriculture, Chinese Academy of Agricultural Sciences, Changji 831100
  • Received: 2025-05-08 Online: 2025-10-26 Published: 2025-10-28
  • Contact: HE Yu-bing, ZHANG Jian-hua E-mail: 821012450699@caas.cn; heyubing@caas.cn; zhangjianhua@caas.cn

Abstract:

Proteins are fundamental components of life; their structural and functional diversity underpins complex biological processes such as cellular metabolism, signal transduction, and environmental response. Protein functional mining and rational design are core topics in the life sciences and synthetic biology, with demonstrated application potential in drug development, industrial enzyme optimization, and agricultural bioengineering. As high-throughput multi-omics data accumulate and computational biology advances, traditional approaches that rely on sequence alignment, structural analysis, and experimental screening increasingly show limitations in efficiency and scalability. In recent years, artificial intelligence (AI) technologies have been progressively integrated into protein science, driving a shift toward data-driven research. This review summarizes and analyzes representative advances in AI-driven protein functional mining and rational design, focusing on the two mainstream design frameworks: “sequence-to-structure” and “structure-to-sequence”. It also examines mining strategies based on sequence and structural similarity and discusses how key AI methodologies, such as language models, evolutionary information integration, and generative modeling, improve design efficiency and accuracy.

Key words: artificial intelligence, protein design, language models, protein mining