[1]

Wan Z, Wang X, Liu C, Alam S, Zheng Y, et al. 2023. Efficient large language models: a survey. arXiv Preprint

doi: 10.48550/arXiv.2312.03863
[2]

Turing AM. 1950. Computing machinery and intelligence. Mind 59(236):433−60

[3]

Raiaan MAK, Sakib S, Fahad NM, Mamun AA, Rahman MA, et al. 2024. A systematic review of hyperparameter optimization techniques in Convolutional Neural Networks. Decision Analytics Journal 11:100470

doi: 10.1016/j.dajour.2024.100470
[4]

Wu T, He S, Liu J, Sun S, Liu K, et al. 2023. A brief overview of ChatGPT: The history, status quo and potential future development. IEEE/CAA Journal of Automatica Sinica 10(5):1122−36

doi: 10.1109/JAS.2023.123618
[5]

Touvron H, Lavril T, Izacard G, Martinet X, Lachaux MA, et al. 2023. LLaMA: open and efficient foundation language models. arXiv Preprint

doi: 10.48550/arXiv.2302.13971
[6]

Wang H, Liu C, Xi N, Qiang Z, Zhao S, et al. 2023. HuaTuo: tuning LLaMA model with Chinese medical knowledge. arXiv Preprint

doi: 10.48550/arXiv.2304.06975
[7]

Nguyen HT. 2023. A brief report on LawGPT 1.0: A virtual legal assistant based on GPT-3. arXiv Preprint

doi: 10.48550/arXiv.2302.05729
[8]

Zhou Y, Ni Y, Gan Y, Yin Z, Liu X, et al. 2024. Are LLMs rational investors? A study on detecting and reducing the financial bias in LLMs. arXiv Preprint

doi: 10.48550/arXiv.2402.12713
[9]

Hopfield JJ. 1982. Neural networks and physical systems with emergent collective computational abilities. Proceedings of the National Academy of Sciences 79(8):2554−58

doi: 10.1073/pnas.79.8.2554
[10]

Jumper J, Evans R, Pritzel A, Green T, Figurnov M, et al. 2021. Highly accurate protein structure prediction with AlphaFold. Nature 596:583−89

doi: 10.1038/s41586-021-03819-2
[11]

van Dijk ADJ, Kootstra G, Kruijer W, de Ridder D. 2021. Machine learning in plant science and plant breeding. iScience 24(1):101890

doi: 10.1016/j.isci.2020.101890
[12]

Mahood EH, Kruse LH, Moghe GD. 2020. Machine learning: A powerful tool for gene function prediction in plants. Applications in Plant Sciences 8(7):e11376

doi: 10.1002/aps3.11376
[13]

Toubiana D, Puzis R, Wen L, Sikron N, Kurmanbayeva A, et al. 2019. Combined network analysis and machine learning allows the prediction of metabolic pathways from tomato metabolomics data. Communications Biology 2:214

doi: 10.1038/s42003-019-0440-4
[14]

Sun L, Liu H, Zhang L, Meng J. 2015. lncRScan-SVM: a tool for predicting long non-coding RNAs using support vector machine. PLoS One 10(10):e0139654

doi: 10.1371/journal.pone.0139654
[15]

Brenchley R, Spannagl M, Pfeifer M, Barker GL, D’Amore R, et al. 2012. Analysis of the bread wheat genome using whole-genome shotgun sequencing. Nature 491:705−10

doi: 10.1038/nature11650
[16]

Hoff KJ, Lange S, Lomsadze A, Borodovsky M, Stanke M. 2016. BRAKER1: unsupervised RNA-Seq-based genome annotation with GeneMark-ET and AUGUSTUS. Bioinformatics 32(5):767−69

doi: 10.1093/bioinformatics/btv661
[17]

Umarov RK, Solovyev VV. 2017. Recognition of prokaryotic and eukaryotic promoters using convolutional deep learning neural networks. PLoS One 12(2):e0171410

doi: 10.1371/journal.pone.0171410
[18]

Li Y, Lee KK, Walsh S, Smith C, Hadingham S, et al. 2006. Establishing glucose- and ABA-regulated transcription networks in Arabidopsis by microarray analysis and promoter classification using a Relevance Vector Machine. Genome Research 16(3):414−27

doi: 10.1101/gr.4237406
[19]

Washburn JD, Mejia-Guerra MK, Ramstein G, Kremling KA, Valluru R, et al. 2019. Evolutionarily informed deep learning methods for predicting relative transcript abundance from DNA sequence. Proceedings of the National Academy of Sciences of the United States of America 116(12):5542−49

doi: 10.1073/pnas.1814551116
[20]

Ding Z, Kihara D. 2019. Computational identification of protein-protein interactions in model plant proteomes. Scientific Reports 9:8740

doi: 10.1038/s41598-019-45072-8
[21]

Ofer D, Brandes N, Linial M. 2021. The language of proteins: NLP, machine learning & protein sequences. Computational and Structural Biotechnology Journal 19:1750−58

doi: 10.1016/j.csbj.2021.03.022
[22]

Soltis PS, Nelson G, Zare A, Meineke EK. 2020. Plants meet machines: Prospects in machine learning for plant biology. Applications in Plant Sciences 8(6):e11371

doi: 10.1002/aps3.11371
[23]

Chang Y, Wang X, Wang J, Wu Y, Yang L, et al. 2024. A survey on evaluation of large language models. ACM Transactions on Intelligent Systems and Technology 15(3):1−45

doi: 10.1145/3641289
[24]

Jin J, Yu Y, Wang R, Zeng X, Pang C, et al. 2022. iDNA-ABF: multi-scale deep biological language learning model for the interpretable prediction of DNA methylations. Genome Biology 23:219

doi: 10.1186/s13059-022-02780-1
[25]

Karollus A, Hingerl J, Gankin D, Grosshauser M, Klemon K, et al. 2024. Species-aware DNA language models capture regulatory elements and their evolution. Genome Biology 25:83

doi: 10.1186/s13059-024-03221-x
[26]

Zhang Y, Ge F, Li F, Yang X, Song J, et al. 2023. Prediction of multiple types of RNA modifications via biological language model. IEEE/ACM Transactions on Computational Biology and Bioinformatics 20:3205−14

doi: 10.1109/TCBB.2023.3283985
[27]

Duan C, Zang Z, Xu Y, He H, Liu Z, et al. 2024. FGBERT: function-driven pre-trained gene language model for metagenomics. arXiv Preprint

doi: 10.48550/arXiv.2402.16901
[28]

Chen K, Zhou Y, Ding M, Wang Y, Ren Z, et al. 2023. Self-supervised learning on millions of pre-mRNA sequences improves sequence-based RNA splicing prediction. bioRxiv Preprint

doi: 10.1101/2023.01.31.526427
[29]

Gao Z, Liu Q, Zeng W, Jiang R, Wong WH. 2024. EpiGePT: a pretrained transformer-based language model for context-specific human epigenomics. Genome Biology 25:310

doi: 10.1186/s13059-024-03449-7
[30]

Wang N, Bian J, Li Y, Li X, Mumtaz S, et al. 2024. Multi-purpose RNA language modelling with motif-aware pretraining and type-guided fine-tuning. Nature Machine Intelligence 6:548−57

doi: 10.1038/s42256-024-00836-4
[31]

Zhang Y, Lang M, Jiang J, Gao Z, Xu F, et al. 2024. Multiple sequence alignment-based RNA language model and its application to structural inference. Nucleic Acids Research 52(1):e3

doi: 10.1093/nar/gkad1031
[32]

Wang X, Gu R, Chen Z, Li Y, Ji X, et al. 2023. UNI-RNA: universal pre-trained models revolutionize RNA research. bioRxiv Preprint

doi: 10.1101/2023.07.11.548588
[33]

Benegas G, Batra SS, Song YS. 2023. DNA language models are powerful predictors of genome-wide variant effects. Proceedings of the National Academy of Sciences of the United States of America 120(44):e2311219120

doi: 10.1073/pnas.2311219120
[34]

Liu G, Chen L, Wu Y, Han Y, Bao Y, et al. 2025. PDLLMs: A group of tailored DNA large language models for analyzing plant genomes. Molecular Plant 18(2):175−78

doi: 10.1016/j.molp.2024.12.006
[35]

Mendoza-Revilla J, Trop E, Gonzalez L, Roller M, Dalla-Torre H, et al. 2024. A foundational large language model for edible plant genomes. Communications Biology 7:835

doi: 10.1038/s42003-024-06465-2
[36]

Levy B, Xu Z, Zhao L, Kremling K, Altman R, et al. 2022. FloraBERT: cross-species transfer learning with attention-based neural networks for gene expression prediction. Research Square Preprint

doi: 10.21203/rs.3.rs-1927200/v1
[37]

Lam HYI, Ong XE, Mutwil M. 2024. Large language models in plant biology. Trends in Plant Science 29(10):1145−55

doi: 10.1016/j.tplants.2024.04.013
[38]

Zhai C. 2008. Statistical language models for information retrieval: a critical review. Foundations and Trends in Information Retrieval 2:137−213

doi: 10.1561/1500000008
[39]

LeCun Y, Bottou L, Bengio Y, Haffner P. 1998. Gradient-based learning applied to document recognition. Proceedings of the IEEE 86:2278−324

doi: 10.1109/5.726791
[40]

Grossberg S. 2013. Recurrent neural networks. Scholarpedia 8(2):1888

doi: 10.4249/scholarpedia.1888
[41]

Hochreiter S, Schmidhuber J. 1997. Long short-term memory. Neural Computation 9(8):1735−80

doi: 10.1162/neco.1997.9.8.1735
[42]

Alzubaidi L, Zhang J, Humaidi AJ, Al-Dujaili A, Duan Y, et al. 2021. Review of deep learning: concepts, CNN architectures, challenges, applications, future directions. Journal of Big Data 8:53

doi: 10.1186/s40537-021-00444-8
[43]

Sherstinsky A. 2020. Fundamentals of Recurrent Neural Network (RNN) and Long Short-Term Memory (LSTM) Network. Physica D: Nonlinear Phenomena 404:132306

doi: 10.1016/j.physd.2019.132306
[44]

Min B, Ross H, Sulem E, Veyseh APB, Nguyen TH, et al. 2023. Recent advances in natural language processing via large pre-trained language models: A survey. ACM Computing Surveys 56:30

doi: 10.1145/3605943
[45]

Brown T, Mann B, Ryder N, Subbiah M, Kaplan JD, et al. 2020. Language models are few-shot learners. NIPS'20: Proceedings of the 34th International Conference on Neural Information Processing Systems, Vancouver BC, Canada, 6−12 December, 2020. Red Hook, NY, United States: Curran Associates Inc. pp. 1877−901. https://dl.acm.org/doi/abs/10.5555/3495724.3495883

[46]

Floridi L, Chiriatti M. 2020. GPT-3: its nature, scope, limits, and consequences. Minds and Machines 30:681−94

doi: 10.1007/s11023-020-09548-1
[47]

Radford A, Wu J, Child R, Luan D, Amodei D, et al. 2019. Language models are unsupervised multitask learners. OpenAI Blog 1(8):9

[48]

Liu Y, Han T, Ma S, Zhang J, Yang Y, et al. 2023. Summary of ChatGPT-related research and perspective towards the future of large language models. Meta-Radiology 1(2):100017

doi: 10.1016/j.metrad.2023.100017
[49]

Ji Y, Zhou Z, Liu H, Davuluri RV. 2021. DNABERT: pre-trained bidirectional encoder representations from transformers model for DNA-language in genome. Bioinformatics 37(15):2112−20

doi: 10.1093/bioinformatics/btab083
[50]

Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, et al. 2017. Attention is all you need. NIPS'17: Proceedings of the 31st International Conference on Neural Information Processing Systems, Long Beach, California, USA, 4–9 December 2017. Red Hook, NY, United States: Curran Associates Inc. pp. 6000–10. https://dl.acm.org/doi/10.5555/3295222.3295349

[51]

Devlin J, Chang MW, Lee K, Toutanova K. 2019. BERT: Pre-training of deep bidirectional transformers for language understanding. Proceedings of the 2019 Conference of the North American Chapter of the Association for Computational Linguistics: Human Language Technologies, Minneapolis, Minnesota, USA, 2019. USA: Association for Computational Linguistics. pp. 4171–86. doi: 10.18653/v1/N19-1423

[52]

Cordonnier JB, Loukas A, Jaggi M. 2019. On the relationship between self-attention and convolutional layers. arXiv Preprint

doi: 10.48550/arXiv.1911.03584
[53]

Katharopoulos A, Vyas A, Pappas N, Fleuret F. 2020. Transformers are RNNs: fast autoregressive transformers with linear attention. Proceedings of the 37th International Conference on Machine Learning, Online, 2020. pp. 5156−65. https://proceedings.mlr.press/v119/katharopoulos20a.html

[54]

Searls DB. 2002. The language of genes. Nature 420:211−17

doi: 10.1038/nature01255
[55]

Chowdhary KR. 2020. Natural Language Processing. In Fundamentals of Artificial Intelligence. New Delhi: Springer. pp. 603−49. doi: 10.1007/978-81-322-3972-7_19

[56]

Oudelaar AM, Higgs DR. 2021. The relationship between genome structure and function. Nature Reviews Genetics 22:154−68

doi: 10.1038/s41576-020-00303-x
[57]

Hagoort P. 2003. Interplay between syntax and semantics during sentence comprehension: ERP effects of combining syntactic and semantic violations. Journal of Cognitive Neuroscience 15(6):883−99

doi: 10.1162/089892903322370807
[58]

Hirschberg J, Manning CD. 2015. Advances in natural language processing. Science 349(6245):261−66

doi: 10.1126/science.aaa8685
[59]

Woodworth MA, Lakadamyali M. 2024. Toward a comprehensive view of gene architecture during transcription. Current Opinion in Genetics & Development 85:102154

doi: 10.1016/j.gde.2024.102154
[60]

Murrell A, Rakyan VK, Beck S. 2005. From genome to epigenome. Human Molecular Genetics 14:R3−R10

doi: 10.1093/hmg/ddi110
[61]

Bonev B, Cavalli G. 2016. Organization and function of the 3D genome. Nature Reviews Genetics 17:661−78

doi: 10.1038/nrg.2016.112
[62]

Yu Y, Si X, Hu C, Zhang J. 2019. A review of recurrent neural networks: LSTM cells and network architectures. Neural Computation 31(7):1235−70

doi: 10.1162/neco_a_01199
[63]

Bubeck S, Chandrasekaran V, Eldan R, Gehrke J, Horvitz E, et al. 2023. Sparks of artificial general intelligence: Early experiments with GPT-4. arXiv Preprint

doi: 10.48550/arXiv.2303.12712
[64]

Konieczny L, Roterman-Konieczna I, Spólnik P. 2014. The structure and function of living organisms. In Systems Biology, ed. Roterman-Konieczna I. Cham: Springer. pp. 1–32. doi: 10.1007/978-3-319-01336-7_1

[65]

Kronfeldner M. 2021. Digging the channels of inheritance: On how to distinguish between cultural and biological inheritance. Philosophical Transactions of the Royal Society of London Series B, Biological Sciences 376:20200042

doi: 10.1098/rstb.2020.0042
[66]

Brendel V, Busse HG. 1984. Genome structure described by formal languages. Nucleic Acids Research 12(5):2561−68

doi: 10.1093/nar/12.5.2561
[67]

Marsit S, Hénault M, Charron G, Fijarczyk A, Landry CR. 2021. The neutral rate of whole-genome duplication varies among yeast species and their hybrids. Nature Communications 12:3126

doi: 10.1038/s41467-021-23231-8
[68]

De Bodt S, Maere S, Van de Peer Y. 2005. Genome duplication and the origin of angiosperms. Trends in Ecology and Evolution 20(11):591−97

doi: 10.1016/j.tree.2005.07.008
[69]

Wolfe KH. 2001. Yesterday’s polyploids and the mystery of diploidization. Nature Reviews Genetics 2:333−41

doi: 10.1038/35072009
[70]

Holland PW, Garcia-Fernàndez J, Williams NA, Sidow A. 1994. Gene duplications and the origins of vertebrate development. Development Supplement 1994:125−33

doi: 10.1242/dev.1994.Supplement.125
[71]

Wood TE, Takebayashi N, Barker MS, Mayrose I, Greenspoon PB, et al. 2009. The frequency of polyploid speciation in vascular plants. Proceedings of the National Academy of Sciences of the United States of America 106(33):13875−79

doi: 10.1073/pnas.0811575106
[72]

Shafee T, Lowe R. 2017. Eukaryotic and prokaryotic gene structure. WikiJournal of Medicine 4(1):1−5

doi: 10.15347/wjm/2017.002
[73]

Kozak M. 1999. Initiation of translation in prokaryotes and eukaryotes. Gene 234(2):187−208

doi: 10.1016/S0378-1119(99)00210-3
[74]

Matera AG, Wang Z. 2014. A day in the life of the spliceosome. Nature Reviews Molecular Cell Biology 15:108−21

doi: 10.1038/nrm3742
[75]

Kersey PJ. 2019. Plant genome sequences: past, present, future. Current Opinion in Plant Biology 48:1−8

doi: 10.1016/j.pbi.2018.11.001
[76]

Dame RT, Rashid FZM, Grainger DC. 2020. Chromosome organization in bacteria: mechanistic insights into genome structure and function. Nature Reviews Genetics 21:227−42

doi: 10.1038/s41576-019-0185-4
[77]

Panchy N, Lehti-Shiu M, Shiu SH. 2016. Evolution of gene duplication in plants. Plant Physiology 171(4):2294−316

doi: 10.1104/pp.16.00523
[78]

Innan H, Kondrashov F. 2010. The evolution of gene duplications: classifying and distinguishing between models. Nature Reviews Genetics 11:97−108

doi: 10.1038/nrg2689
[79]

Kellis M, Birren BW, Lander ES. 2004. Proof and evolutionary analysis of ancient genome duplication in the yeast Saccharomyces cerevisiae. Nature 428:617−24

doi: 10.1038/nature02424
[80]

Maston GA, Evans SK, Green MR. 2006. Transcriptional regulatory elements in the human genome. Annual Review of Genomics and Human Genetics 7:29−59

doi: 10.1146/annurev.genom.7.080505.115623
[81]

Avsec Ž, Agarwal V, Visentin D, Ledsam JR, Grabska-Barwinska A, et al. 2021. Effective gene expression prediction from sequence by integrating long-range interactions. Nature Methods 18:1196−203

doi: 10.1038/s41592-021-01252-x
[82]

National Center for Biotechnology Information. 2025. Genome. www.ncbi.nlm.nih.gov/datasets/genome

[83]

Zhang Q, Ding K, Lyv T, Wang X, Yin Q, et al. 2024. Scientific large language models: A survey on biological & chemical domains. arXiv Preprint

doi: 10.48550/arXiv.2401.14656
[84]

An W, Guo Y, Bian Y, Ma H, Yang J, et al. 2022. MoDNA: motif-oriented pre-training for DNA language model. BCB '22: Proceedings of the 13th ACM International Conference on Bioinformatics, Computational Biology and Health Informatics, Northbrook, Illinois, USA, 2022. New York, United States: Association for Computing Machinery. pp. 1–5. doi: 10.1145/3535508.3545512

[85]

Luo H, Chen C, Shan W, Ding P, Luo L. 2022. iEnhancer-BERT: a novel transfer learning architecture based on DNA-language model for identifying enhancers and their strength. In Intelligent Computing Theories and Application, ICIC 2022. Lecture Notes in Computer Science. Cham: Springer. pp. 153–65. doi: 10.1007/978-3-031-13829-4_13

[86]

Luo H, Shan W, Chen C, Ding P, Luo L. 2023. Improving language model of human genome for DNA–protein binding prediction based on task-specific pre-training. Interdisciplinary Sciences: Computational Life Sciences 15:32−43

doi: 10.1007/s12539-022-00537-9
[87]

Li J, Wu Z, Lin W, Luo J, Zhang J, et al. 2023. iEnhancer-ELM: improve enhancer identification by extracting position-related multiscale contextual information based on enhancer language models. Bioinformatics Advances 3(1):vbad043

doi: 10.1093/bioadv/vbad043
[88]

Dalla-Torre H, Gonzalez L, Mendoza-Revilla J, Lopez Carranza N, Grzywaczewski AH, et al. 2025. Nucleotide Transformer: building and evaluating robust foundation models for human genomics. Nature Methods 22:287−97

doi: 10.1038/s41592-024-02523-z
[89]

Benegas G, Albors C, Aw AJ, Ye C, Song YS. 2024. GPN-MSA: an alignment-based DNA language model for genome-wide variant effect prediction. bioRxiv Preprint

doi: 10.1101/2023.10.10.561776
[90]

Wang X, Gao X, Wang G, Li D. 2023. miProBERT: identification of microRNA promoters based on the pre-trained model BERT. Briefings in Bioinformatics 24(3):bbad093

doi: 10.1093/bib/bbad093
[91]

Fishman V, Kuratov Y, Shmelev A, Petrov M, Penzar D, et al. 2025. GENA-LM: a family of open-source foundational DNA language models for long sequences. Nucleic Acids Research 53(2):gkae1310

doi: 10.1093/nar/gkae1310
[92]

Zhou Z, Ji Y, Li W, Dutta P, Davuluri R, et al. 2024. DNABERT-2: Efficient foundation model and benchmark for multi-species genome. arXiv Preprint

doi: 10.48550/arXiv.2306.15006
[93]

Theodoris CV, Xiao L, Chopra A, Chaffin MD, Al Sayed ZR, et al. 2023. Transfer learning enables predictions in network biology. Nature 618:616−24

doi: 10.1038/s41586-023-06139-9
[94]

Li Z, Jin J, Long W, Wei L. 2023. PLPMpro: Enhancing promoter sequence prediction with prompt-learning based pre-trained language model. Computers in Biology and Medicine 164:107260

doi: 10.1016/j.compbiomed.2023.107260
[95]

Penić RJ, Vlašić T, Huber RG, Wan Y, Šikić M. 2024. RiNALMo: general-purpose RNA language models can generalize well on structure prediction tasks. arXiv Preprint

doi: 10.48550/arXiv.2403.00043
[96]

Hwang Y, Cornman AL, Kellogg EH, Ovchinnikov S, Girguis PR. 2024. Genomic language model predicts protein co-regulation and function. Nature Communications 15:2880

doi: 10.1038/s41467-024-46947-9
[97]

Zvyagin M, Brace A, Hippe K, Deng Y, Zhang B, et al. 2023. GenSLMs: Genome-scale language models reveal SARS-CoV-2 evolutionary dynamics. The International Journal of High Performance Computing Applications 37(6):683−705

doi: 10.1177/10943420231201154
[98]

Zhang D, Zhang W, Zhao Y, Zhang J, He B, et al. 2024. DNAGPT: A generalized pretrained tool for multiple DNA sequence analysis tasks. bioRxiv Preprint

doi: 10.1101/2023.07.11.548628
[99]

Malusare A, Kothandaraman H, Tamboli D, Lanman NA, Aggarwal V. 2023. Understanding the natural language of DNA using encoder-decoder foundation models with byte-level precision. Bioinformatics Advances 4(1):vbae117

doi: 10.1093/bioadv/vbae117
[100]

Nguyen E, Poli M, Faizi M, Thomas A, Wornow M, et al. 2024. HyenaDNA: long-range genomic sequence modeling at single nucleotide resolution. Advances in Neural Information Processing Systems 36 (NeurIPS 2023). pp. 43177−201. https://proceedings.neurips.cc/paper_files/paper/2023/file/86ab6927ee4ae9bde4247793c46797c7-Paper-Conference.pdf

[101]

Nguyen E, Poli M, Durrant MG, Kang B, Katrekar D, et al. 2024. Sequence modeling and design from molecular to genome scale with Evo. Science 386:eado9336

doi: 10.1126/science.ado9336
[102]

Abramson J, Adler J, Dunger J, Evans R, Green T, et al. 2024. Accurate structure prediction of biomolecular interactions with AlphaFold 3. Nature 630:493−500

doi: 10.1038/s41586-024-07487-w
[103]

Iram A, Dong Y, Ignea C. 2024. Synthetic biology advances towards a bio-based society in the era of artificial intelligence. Current Opinion in Biotechnology 87:103143

doi: 10.1016/j.copbio.2024.103143
[104]

Goodstein DM, Shu S, Howson R, Neupane R, Hayes RD, et al. 2012. Phytozome: a comparative platform for green plant genomics. Nucleic Acids Research 40(D1):D1178−D1186

doi: 10.1093/nar/gkr944
[105]

Tello-Ruiz MK, Naithani S, Stein JC, Gupta P, Campbell M, et al. 2018. Gramene 2018: unifying comparative genomics and pathway resources for plant research. Nucleic Acids Research 46(D1):D1181−D1189

doi: 10.1093/nar/gkx1111
[106]

Fernandez-Pozo N, Menda N, Edwards JD, Saha S, Tecle IY, et al. 2015. The Sol Genomics Network (SGN)-from genotype to phenotype to breeding. Nucleic Acids Research 43(D1):D1036−D1041

doi: 10.1093/nar/gku1195
[107]

Yu J, Jung S, Cheng CH, Lee T, Zheng P, et al. 2021. CottonGen: The community database for cotton genomics, genetics and breeding research. Plants 10(12):2805

doi: 10.3390/plants10122805
[108]

Hamilton JP, Li C, Buell CR. 2025. The rice genome annotation project: an updated database for mining the rice genome. Nucleic Acids Research 53(D1):D1614−D1622

doi: 10.1093/nar/gkae1061
[109]

Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, et al. 2012. The Arabidopsis Information Resource (TAIR): improved gene annotation and new tools. Nucleic Acids Research 40(D1):D1202−D1210

doi: 10.1093/nar/gkr1090
[110]

Xia C, Jiang S, Tan Q, Wang W, Zhao L, et al. 2022. Chromosomal-level genome of Macadamia (Macadamia integrifolia). Tropical Plants 1:3

doi: 10.48130/tp-2022-0003
[111]

Xia Z, Huang D, Zhang S, Wang W, Ma F, et al. 2021. Chromosome-scale genome assembly provides insights into the evolution and flavor synthesis of passion fruit (Passiflora edulis Sims.). Horticulture Research 8:14

doi: 10.1038/s41438-020-00455-1
[112]

von Kistowski J, Arnold JA, Huppler K, Lange KD, Henning JL, et al. 2015. How to build a benchmark. ICPE '15: Proceedings of the 6th ACM/SPEC International Conference on Performance Engineering, Austin, Texas, USA, 2015. New York, United States: Association for Computing Machinery. pp. 333–36. doi: 10.1145/2668930.2688819

[113]

Uffelmann E, Huang QQ, Munung NS, de Vries J, Okada Y, et al. 2021. Genome-wide association studies. Nature Reviews Methods Primers 1:59

doi: 10.1038/s43586-021-00056-9
[114]

Naveed H, Khan AU, Qiu S, Saqib M, Anwar S, et al. 2023. A comprehensive overview of large language models. arXiv Preprint

doi: 10.48550/arXiv.2307.06435
[115]

Vig J. 2019. A multiscale visualization of attention in the transformer model. arXiv Preprint

doi: 10.48550/arXiv.1906.05714
[116]

Keles FD, Wijewardena PM, Hegde C. 2023. On the computational complexity of self-attention. Proceedings of the 34th International Conference on Algorithmic Learning Theory. pp. 597–619. https://proceedings.mlr.press/v201/duman-keles23a.html

[117]

Zhou Y, Zhang J, Xiong X, Cheng ZM, Chen F. 2022. De novo assembly of plant complete genomes. Tropical Plants 1:7

doi: 10.48130/tp-2022-0007
[118]

Fernández P, Amice R, Bruy D, Christenhusz MJ, Leitch IJ, et al. 2024. A 160 Gbp fork fern genome shatters size record for eukaryotes. iScience 27(6):109889

doi: 10.1016/j.isci.2024.109889
[119]

Zou M, Xia Z. 2022. Hyper-seq: a novel, effective, and flexible marker-assisted selection and genotyping approach. The Innovation 3(4):100254

doi: 10.1016/j.xinn.2022.100254