| [1] |
Zhao Q, Li M, Zhang M, Tan H. 2024. Glandular trichomes: the factory of artemisinin biosynthesis. Medicinal Plant Biology 3:e019 doi: 10.48130/mpb-0024-0018 |
| [2] |
Jiang B, Gao L, Wang H, Sun Y, Zhang X, et al. 2024. Characterization and heterologous reconstitution of Taxus biosynthetic enzymes leading to baccatin III. Science 383:622−29 doi: 10.1126/science.adj3484 |
| [3] |
Reed J, Orme A, El-Demerdash A, Owen C, Martin LBB, et al. 2023. Elucidation of the pathway for biosynthesis of saponin adjuvants from the soapbark tree. Science 379:1252−64 doi: 10.1126/science.adf3727 |
| [4] |
Chen W, Gao Y, Xie W, Gong L, Lu K, et al. 2014. Genome-wide association analyses provide genetic and biochemical insights into natural variation in rice metabolism. Nature Genetics 46:714−21 doi: 10.1038/ng.3007 |
| [5] |
Schulte-Sasse R, Budach S, Hnisz D, Marsico A. 2021. Integration of multiomics data with graph convolutional networks to identify new cancer genes and their associated molecular mechanisms. Nature Machine Intelligence 3:513−26 doi: 10.1038/s42256-021-00325-y |
| [6] |
Huang W, Zhang X, Li J, Lv J, Wang Y, et al. 2024. Substrate promiscuity, crystal structure, and application of a plant UDP-glycosyltransferase UGT74AN3. ACS Catalysis 14:475−88 doi: 10.1021/acscatal.3c05309 |
| [7] |
Abramson J, Adler J, Dunger J, Evans R, Green T, et al. 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630:493−500 doi: 10.1038/s41586-024-07487-w |
| [8] |
Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, et al. 2025. Simulating 500 million years of evolution with a language model. Science 387:850−58 doi: 10.1126/science.ads0018 |
| [9] |
Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, et al. 2024. Sequence modeling and design from molecular to genome scale with Evo. Science 386:eado9336 doi: 10.1126/science.ado9336 |
| [10] |
Liu Y, Zhao X, Gan F, Chen X, Deng K, et al. 2024. Complete biosynthesis of QS-21 in engineered yeast. Nature 629:937−44 doi: 10.1038/s41586-024-07345-9 |
| [11] |
Liao LX, Song XM, Wang LC, Lv HN, Chen JF, et al. 2017. Highly selective inhibition of IMPDH2 provides the basis of antineuroinflammation therapy. Proceedings of the National Academy of Sciences of the United States of America 114:E5986−E5994 doi: 10.1073/pnas.1706778114 |
| [12] |
De La Peña R, Hodgson H, Liu JC, Stephenson MJ, Martin AC, et al. 2023. Complex scaffold remodeling in plant triterpene biosynthesis. Science 379:361−68 doi: 10.1126/science.adf1017 |
| [13] |
Zhang M, Bao YO, Zhao CX, Tian YG, Wang ZL, et al. 2024. A four-step biosynthetic pathway involving C-3 oxidation–reduction reactions from cycloastragenol to astragaloside IV in Astragalus membranaceus. The Plant Journal 120:569−77 doi: 10.1111/tpj.17001 |
| [14] |
Mehta N, Meng Y, Zare R, Kamenetsky-Goldstein R, Sattely E. 2024. A developmental gradient reveals biosynthetic pathways to eukaryotic toxins in monocot geophytes. Cell 187:5620−37 doi: 10.1016/j.cell.2024.08.027 |
| [15] |
Nett RS, Lau W, Sattely ES. 2020. Discovery and engineering of colchicine alkaloid biosynthesis. Nature 584:148−53 doi: 10.1038/s41586-020-2546-8 |
| [16] |
Hong B, Grzech D, Caputi L, Sonawane P, López CER, et al. 2022. Biosynthesis of strychnine. Nature 607:617−22 doi: 10.1038/s41586-022-04950-4 |
| [17] |
Zubieta C, He X, Dixon RA, Noel JP. 2001. Structures of two natural product methyltransferases reveal the basis for substrate specificity in plant O-methyltransferases. Nature Structural Biology 8:271−79 doi: 10.1038/85029 |
| [18] |
Wang HT, Wang ZL, Chen K, Yao MJ, Zhang M, et al. 2023. Insights into the missing apiosylation step in flavonoid apiosides biosynthesis of Leguminosae plants. Nature Communications 14:6658 doi: 10.1038/s41467-023-42393-1 |
| [19] |
Peng Z, Song L, Chen M, Liu Z, Yuan Z, et al. 2024. Neofunctionalization of an OMT cluster dominates polymethoxyflavone biosynthesis associated with the domestication of citrus. Proceedings of the National Academy of Sciences of the United States of America 121:e1973352175 doi: 10.1073/pnas.2321615121 |
| [20] |
Hodgson H, De La Peña R, Stephenson MJ, Thimmappa R, Vincent JL, et al. 2019. Identification of key enzymes responsible for protolimonoid biosynthesis in plants: Opening the door to azadirachtin production. Proceedings of the National Academy of Sciences of the United States of America 116:17096−104 doi: 10.1073/pnas.1906083116 |
| [21] |
Fu S, Liu B. 2020. Recent progress in the synthesis of limonoids and limonoid-like natural products. Organic Chemistry Frontiers 7:1903−47 doi: 10.1039/D0QO00203H |
| [22] |
Liu X, Li J, Zhu X, Xu Z, Qi J. 2024. Research advances on paclitaxel biosynthesis. Synthetic Biology Journal 5(3):527−47 (in Chinese) doi: 10.12211/2096-8280.2023-085 |
| [23] |
Kautsar SA, Suarez Duran HG, Blin K, Osbourn A, Medema MH. 2017. plantiSMASH: automated identification, annotation and expression analysis of plant biosynthetic gene clusters. Nucleic Acids Research 45:W55−W63 doi: 10.1093/nar/gkx305 |
| [24] |
Skinnider MA, Johnston CW, Gunabalasingam M, Merwin NJ, Kieliszek AM, et al. 2020. Comprehensive prediction of secondary metabolite structure and biological activity from microbial genome sequences. Nature Communications 11:6058 doi: 10.1038/s41467-020-19986-1 |
| [25] |
Carroll L M, Larralde M, Fleck JS, Ponnudurai R, Milanese A, et al. 2021. Accurate de novo identification of biosynthetic gene clusters with GECCO. bioRxiv Preprint doi: 10.1101/2021.05.03.442509 |
| [26] |
Hannigan GD, Prihoda D, Palicka A, Soukup J, Klempir O, et al. 2019. A deep learning genome-mining strategy for biosynthetic gene cluster prediction. Nucleic Acids Research 47:e110 doi: 10.1093/nar/gkz654 |
| [27] |
Li Z, Liu F, Yang W, Peng S, Zhou J. 2022. A Survey of Convolutional Neural Networks: Analysis, Applications, and Prospects. IEEE Transactions on Neural Networks and Learning Systems 33:6999−7019 doi: 10.1109/TNNLS.2021.3084827 |
| [28] |
Lipton ZC, Berkowitz J, Elkan C. 2015. A critical review of recurrent neural networks for sequence learning. arXiv Preprint doi: 10.48550/arXiv.1506.00019 |
| [29] |
Chen T, Guestrin C. 2016. XGBoost: A Scalable Tree Boosting System. Proc. 22nd ACM SIGKDD International Conference on Knowledge Discovery and Data Mining, San Francisco, California, USA, 13−17 August 2016. New York: Association for Computing Machinery. pp. 785−94. doi: 10.1145/2939672.2939785 |
| [30] |
Zhang J, Huang J, Jin S, Lu S. 2024. Vision-Language Models for Vision Tasks: A Survey. IEEE Transactions on Pattern Analysis and Machine Intelligence 46:5625−44 doi: 10.1109/TPAMI.2024.3369699 |
| [31] |
Malhis N, Jacobson M, Jones S J M, Gsponer J. 2020. LIST-S2: taxonomy based sorting of deleterious missense mutations across species. Nucleic Acids Research 48:W154−W161 doi: 10.1093/nar/gkaa288 |
| [32] |
Laurens VDM, Hinton G. 2008. Visualizing Data using t-SNE. Journal of Machine Learning Research 9:2579−605 |
| [33] |
Goodfellow IJ. 2014. Generative adversarial nets. Proc. 27 th International Conference on Neural Information Processing Systems, Montreal, USA, 2014. Montreal: MIT Press. pp. 2672−80. doi: 10.3156/JSOFT.29.5_177_2 |
| [34] |
Kohonen T. 2013. Essentials of the self-organizing map. Neural Networks 37:52−65 doi: 10.1016/j.neunet.2012.09.018 |
| [35] |
Han Z, Xu Z, Xu Y, Lin J, Chen X, et al. 2024. Phylogenomics reveal DcTPS-mediated terpenoid accumulation and environmental response in Dendrobium catenatum. Industrial Crops and Products 208:117799 doi: 10.1016/j.indcrop.2023.117799 |
| [36] |
Han Z, Gong Q, Huang S, Meng X, Xu Y, et al. 2023. Machine learning uncovers accumulation mechanism of flavonoid compounds in Polygonatum cyrtonema Hua. Plant Physiology and Biochemistry 201:107839 doi: 10.1016/j.plaphy.2023.107839 |
| [37] |
Jumper J, Evans R, Pritzel A, Green T, Figurnov M, et al. 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596:583−89 doi: 10.1038/s41586-021-03819-2 |
| [38] |
Lin Z, Akin H, Rao R, Hie B, Zhu Z, et al. 2023. Evolutionary-scale prediction of atomic-level protein structure with a language model. Science 379:1123−30 doi: 10.1126/science.ade2574 |
| [39] |
Heinzinger M, Weissenow K, Sanchez JG, Henkel A, Mirdita M, et al. 2024. Bilingual language model for protein sequence and structure. NAR Genomics and Bioinformatics 6:lqae150 doi: 10.1093/nargab/lqae150 |
| [40] |
Avsec Z, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, et al. 2021. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods 18:1196−203 doi: 10.1038/s41592-021-01252-x |
| [41] |
Akiyama M, Sakakibara Y. 2022. Informative RNA base embedding for RNA structural alignment and clustering by deep representation learning. NAR Genomics and Bioinformatics 4:lqac012 doi: 10.1093/nargab/lqac012 |
| [42] |
Poplin R, Chang PC, Alexander D, Schwartz S, Colthurst T, et al. 2018. A universal SNP and small-indel variant caller using deep neural networks. Nature Biotechnology 36:983−87 doi: 10.1038/nbt.4235 |
| [43] |
Fu Y, Yu S, Li J, Lao Z, Yang X, et al. 2024. DeepMineLys: deep mining of phage lysins from human microbiome. Cell Reports 43:114583 doi: 10.1016/j.celrep.2024.114583 |
| [44] |
Madani A, Krause B, Greene ER, Subramanian S, Mohr BP, et al. 2023. Large language models generate functional protein sequences across diverse families. Nature Biotechnology 41:1099−106 doi: 10.1038/s41587-022-01618-2 |
| [45] |
Wu KE, Yang KK, van den Berg R, Alamdari S, Zou JY, et al. 2024. Protein structure generation via folding diffusion. Nature Communications 15:1059 doi: 10.1038/s41467-024-45051-2 |
| [46] |
Wang Y, Song M, Liu F, Liang Z, Hong R, et al. 2025. Artificial intelligence using a latent diffusion model enables the generation of diverse and potent antimicrobial peptides. Science Advances 11:eadp7171 doi: 10.1126/sciadv.adp7171 |
| [47] |
Gumulya Y, Baek J, Wun S, Thomson RES, Harris KL, et al. 2018. Engineering highly functional thermostable proteins using ancestral sequence reconstruction. Nature Catalysis 1:878−88 doi: 10.1038/s41929-018-0159-5 |
| [48] |
Zhang K, Yang X, Wang Y, Yu Y, Huang N, et al. 2025. Artificial intelligence in drug development. Nature Medicine 31:45−59 doi: 10.1038/s41591-024-03434-4 |
| [49] |
Chen M, Zhang W, Gou Y, Xu D, Wei Y, et al. 2023. GPS 6.0: an updated server for prediction of kinase-specific phosphorylation sites in proteins. Nucleic Acids Research 51:W243−W250 doi: 10.1093/nar/gkad383 |
| [50] |
Lobentanzer S, Feng S, Bruderer N, Maier A, The BioChatter Consortium, et al. 2025. A platform for the biomedical application of large language models. Nature Biotechnology 43:166−69 doi: 10.1038/s41587-024-02534-3 |
| [51] |
Huang Z, Bianchi F, Yuksekgonul M, Montine T J, Zou J. 2023. A visual–language foundation model for pathology image analysis using medical Twitter. Nature Medicine 29:2307−16 doi: 10.1038/s41591-023-02504-3 |
| [52] |
Liu W, Li J, Tang Y, Zhao Y, Liu C, et al. 2025. DrBioRight 2.0: an LLM-powered bioinformatics chatbot for large-scale cancer functional proteomics analysis. Nature Communications 16:2256 doi: 10.1038/s41467-025-57430-4 |