Update of the octoploid strawberry genome annotation and gene regulatory network analysis revealed key factors in strawberry fruit maturation

Wendie Ma; Yan Wang; Yuanxiu Lin; Yunting Zhang; Ya Luo; Haoru Tang; Qing Chen

doi:10.48130/frures-0026-0001

2026 Volume 6

ARTICLE Open Access

1. College of Horticulture, Sichuan Agricultural University, Sichuan 611130, China
2. Key Laboratory of Agricultural Bioinformatics, Ministry of Education, Sichuan Agricultural University, Sichuan 611130, China

Corresponding author: supnovel@sicau.edu.cn (Chen Q)

Received: 07 October 2025
Revised: 03 December 2025
Accepted: 29 December 2025
Published online: 25 March 2026
Fruit Research 6, Article number: e010 (2026)

Abstract

The existing genome annotations for the strawberry cultivar 'Benihoppe' contain incomplete or inaccurate gene models, particularly lacking alternative isoforms and complete untranslated regions, which impedes precise transcriptomic analysis. Here, FxaBHv1.0.a2 is presented, a substantially improved genome annotation generated by integrating 257 short-read and 33 full-length PacBio/Nanopore RNA-seq libraries that encompass diverse tissues, developmental stages, and abiotic stress treatments. Benchmarking Universal Single-Copy Orthologs (BUSCO) completeness increased from 96.2% to 99.5%, fragmented models were eliminated, and 89% of genes now possess complete 5′ and 3′ untranslated regions. Comprehensive functional information was assigned to 90% of the 103,316 protein-coding loci and 6,167 ncRNAs. Leveraging this high-quality annotation, a high-confidence transcriptional network controlling fruit development and ripening was reconstructed. Using an integrative approach that combined Mfuzz clustering, Weighted Gene Co-expression Network Analysis (WGCNA), and causal network inference with Gene Network Inference with Ensemble of Trees (GENIE3), key regulators were identified and prioritised. The present results pinpoint several high-confidence candidate master regulators, including the known factor MYB1 (a MYB10 homolog), the recently validated SENSITIVE TO PROTON RHIZOTOXICITY 1 (STOP1), and a novel hub, a BTB/POZ domain and ankyrin repeat-containing protein (NBCL). These transcription factors are predicted to orchestrate the critical hormonal transition from auxin repression to abscisic acid (ABA)-driven maturation by targeting core components of their respective signaling pathways. This work not only provides a foundational genomic tool for the strawberry research community but also delivers novel insights into the regulatory architecture of fruit ripening, identifying high-priority targets for future functional validation and crop improvement.
- Strawberry
- Genome annotation
- Fruit maturation
- Regulation network

Supplementary information

Supplementary Table S1 RNAseq data used in this study.
Supplementary Table S2 Primers used in this study.
Supplementary Table S3 Key bridging transcription factors that are closely related to the red stage and receptacle tissue.
Supplementary Table S4 Top 500 regulatory links between the key TFs and their deduced targets from GENIE3 analysis.
Supplementary Fig. S1 Mfuzz soft-clustering pattern of genes in achenes.
Supplementary Fig. S2 Eigengene expression pattern in the green module during strawberry fruit development.

Rights and permissions
Copyright: © 2026 by the author(s). Published by Maximum Academic Press, Fayetteville, GA. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1]	Afrin S, Gasparrini M, Forbes-Hernandez TY, Reboredo-Rodriguez P, Mezzetti B, et al. 2016. Promising health benefits of the strawberry: a focus on clinical studies. Journal of Agricultural and Food Chemistry 64(22):4435−4449 doi: 10.1021/acs.jafc.6b00857
[2]	Whitaker VM, Knapp SJ, Hardigan MA, Edger PP, Slovin JP, et al. 2020. A roadmap for research in octoploid strawberry. Horticulture Research 7(1):33 doi: 10.1038/s41438-020-0252-1
[3]	Song Y, Peng Y, Liu L, Li G, Zhao X, et al. 2024. Phased gap-free genome assembly of octoploid cultivated strawberry illustrates the genetic and epigenetic divergence among subgenomes. Horticulture Research 11(1):uhad252 doi: 10.1093/hr/uhad252
[4]	Liston A, Wei N, Tennessen JA, Li J, Dong M, et al. 2020. Revisiting the origin of octoploid strawberry. Nature Genetics 52(1):2−4 doi: 10.1038/s41588-019-0543-3
[5]	Edger PP, Poorten TJ, VanBuren R, Hardigan MA, Colle M, et al. 2019. Origin and evolution of the octoploid strawberry genome. Nature Genetics 51(3):541−547 doi: 10.1038/s41588-019-0356-4
[6]	Shulaev V, Sargent DJ, Crowhurst RN, Mockler TC, Folkerts O, et al. 2011. The genome of woodland strawberry (Fragaria vesca). Nature Genetics 43(2):109−116 doi: 10.1038/ng.740
[7]	Mao J, Wang Y, Wang B, Li J, Zhang C, et al. 2023. High-quality haplotype-resolved genome assembly of cultivated octoploid strawberry. Horticulture Research 10(1):uhad002 doi: 10.1093/hr/uhad002
[8]	Hu S, Zeng X, Liu Y, Li Y, Qu M, et al. 2024. Global characterization of somatic mutations and DNA methylation changes during vegetative propagation in strawberries. Genome Research 34:1582−1594 doi: 10.1101/gr.279378.124
[9]	Jin X, Du H, Chen M, Zheng X, He Y, et al. 2025. A fully phased octoploid strawberry genome reveals the evolutionary dynamism of centromeric satellites. Genome Biology 26(1):17 doi: 10.1186/s13059-025-03482-0
[10]	Han H, Salinas N, Barbey CR, Jang YJ, Fan Z, et al. 2025. A telomere-to-telomere phased genome of an octoploid strawberry reveals a receptor kinase conferring anthracnose resistance. GigaScience 14:giaf005 doi: 10.1093/gigascience/giaf005
[11]	Hardigan MA, Feldmann MJ, Pincot DDA, Famula RA, Vachev MV, et al. 2021. Blueprint for phasing and assembling the genomes of heterozygous polyploids: application to the octoploid genome of strawberry. bioRxiv 467115 doi: 10.1101/2021.11.03.467115
[12]	Han H, Jang YJ, Han K, Park HN, Kim DS, et al. 2025. Chromosome-level genome assembly of cultivated strawberry 'Seolhyang' (Fragaria × ananassa). Scientific Data 12(1):1002 doi: 10.1038/s41597-025-05191-6
[13]	Zhang J, Liu S, Zhao S, Nie Y, Zhang Z. 2025. A telomere-to-telomere haplotype-resolved genome of white-fruited strawberry reveals the complexity of fruit colour formation of cultivated strawberry. Plant Biotechnology Journal 23:78−80 doi: 10.1111/pbi.14479
[14]	Kim JH, Whitaker VM, Lee S. 2025. A haplotype-phased genome characterizes the genomic architecture and causal variants for RXf1 conferring resistance to Xanthomonas fragariae in strawberry (F. × ananassa). BMC Genomics 26:453 doi: 10.1186/s12864-025-11517-w
[15]	Li X, Liu S, Zhang J, Zhang Z. 2025. Genome assembly and transcriptome profiling of the woodland strawberry (Fragaria vesca) 'Ruegen'. Fruit Research 5:e031 doi: 10.48130/frures-0025-0022
[16]	Jin X, Du H, Zhu C, Wan H, Liu F, et al. 2023. Haplotype-resolved genomes of wild octoploid progenitors illuminate genomic diversifications from wild relatives to cultivated strawberry. Nature Plants 9(8):1252−1266 doi: 10.1038/s41477-023-01473-2
[17]	Chang L, Dong J, Zhong C, Sun J, Sun R, et al. 2018. Pedigree analysis of strawberry cultivars released in China. Journal of Fruit Science 35(2):158−167 (in Chinese) doi: 10.13925/j.cnki.gsxb.20170279
[18]	Wang L, Shi P, Ping Z, Huang Q, Jiang L, et al. 2024. The golden genome annotation of Ganoderma lingzhi reveals a more complex scenario of eukaryotic gene structure and transcription activity. BMC Biology 22(1):271 doi: 10.1186/s12915-024-02073-y
[19]	Zhao Y, Chen Z, Hu M, Liu H, Zhao H, et al. 2024. Integrating Iso-seq and RNA-seq data for the reannotation of the greater amberjack genome. Scientific Data 11(1):675 doi: 10.1038/s41597-024-03495-7
[20]	Shi C, Yu H, Song L, Lu Y, Wang X, et al. 2024. Comprehensive re-annotation and transcriptome analysis provide insights into pepper development. Scientia Horticulturae 336:113406 doi: 10.1016/j.scienta.2024.113406
[21]	Liu T, Li M, Liu Z, Ai X, Li Y. 2021. Reannotation of the cultivated strawberry genome and establishment of a strawberry genome database. Horticulture Research 8(1):41 doi: 10.1038/s41438-021-00476-4
[22]	Luo Y, Ge C, Ling Y, Mo F, Yang M, et al. 2020. ABA and sucrose co-regulate strawberry fruit ripening and show inhibition of glycolysis. Molecular Genetics and Genomics 295(2):421−438 doi: 10.1007/s00438-019-01629-w
[23]	Li BJ, Shi YN, Jia HR, Yang XF, Sun YF, et al. 2023. Abscisic acid mediated strawberry receptacle ripening involves the interplay of multiple phytohormone signaling networks. Frontiers in Plant Science 14:1117156 doi: 10.3389/fpls.2023.1117156
[24]	Zhong Y, Wei X, Zhang J, Wang L. 2025. Transcriptome sequencing reveals jasmonate playing a key role in ALA-induced osmotic stress tolerance in strawberry. BMC Plant Biology 25(1):41 doi: 10.1186/s12870-025-06068-x
[25]	Li S, Chang L, Sun R, Dong J, Zhong C, et al. 2022. Combined transcriptomic and metabolomic analysis reveals a role for adenosine triphosphate-binding cassette transporters and cell wall remodeling in response to salt stress in strawberry. Frontiers in Plant Science 13:996765 doi: 10.3389/fpls.2022.996765
[26]	Xiao G, Zhang Q, Zeng X, Chen X, Liu S, et al. 2022. Deciphering the molecular signatures associated with resistance to Botrytis cinerea in strawberry flower by comparative and dynamic transcriptome analysis. Frontiers in Plant Science 13:888939 doi: 10.3389/fpls.2022.888939
[27]	Yuan H, Yu H, Huang T, Shen X, Xia J, et al. 2019. The complexity of the Fragaria x ananassa (octoploid) transcriptome by single-molecule long-read sequencing. Horticulture Research 6:46 doi: 10.1038/s41438-019-0126-6
[28]	Chen Q, Lin X, Tang W, Deng Q, Wang Y, et al. 2022. Transcriptomic complexity in strawberry fruit development and maturation revealed by nanopore sequencing. Frontiers in Plant Science 13:872054 doi: 10.3389/fpls.2022.872054
[29]	Holst F, Bolger A, Günther C, Maß J, Triesch S, et al. 2023. Helixer—de novo prediction of primary eukaryotic gene models combining deep learning and a hidden Markov model. bioRxiv 527280 doi: 10.1101/2023.02.06.527280
[30]	Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK Jr, et al. 2003. Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucleic Acids Research 31(19):5654−5666 doi: 10.1093/nar/gkg770
[31]	Musacchia F, Basu S, Petrosino G, Salvemini M, Sanges R. 2015. Annocript: a flexible pipeline for the annotation of transcriptomes able to identify putative long noncoding RNAs. Bioinformatics 31(13):2199−2201 doi: 10.1093/bioinformatics/btv106
[32]	Anders S, Huber W. 2010. Differential expression analysis for sequence count data. Genome Biology 11(10):R106 doi: 10.1186/gb-2010-11-10-r106
[33]	Kumar L, Futschik ME. 2007. Mfuzz: a software package for soft clustering of microarray data. Bioinformation 2(1):5−7 doi: 10.6026/97320630002005
[34]	Langfelder P, Horvath S. 2008. WGCNA: an R package for weighted correlation network analysis. BMC Bioinformatics 9(1):559 doi: 10.1186/1471-2105-9-559
[35]	Huynh-Thu VA, Irrthum A, Wehenkel L, Geurts P. 2010. Inferring regulatory networks from expression data using tree-based methods. PLoS One 5(9):e12776 doi: 10.1371/journal.pone.0012776
[36]	Tanaka T, Haraguchi Y, Todoroki T, Saisho D, Abiko T, Kai H. 2025. Reference-based chromosome-scale assembly of Japanese barley (Hordeum vulgare ssp. vulgare) cultivar Hayakiso 2. DNA Research 32(4):dsaf016 doi: 10.1093/dnares/dsaf016
[37]	Fait A, Hanhineva K, Beleggia R, Dai N, Rogachev I, et al. 2008. Reconfiguration of the achene and receptacle metabolic networks during strawberry fruit development. Plant Physiology 148(2):730−750 doi: 10.1104/pp.108.120691
[38]	Koyama H, Wu L, Agrahari RK, Kobayashi Y. 2021. STOP1 regulatory system: Centered on multiple stress tolerance and cellular nutrient management. Molecular Plant 14(10):1615−1617 doi: 10.1016/j.molp.2021.08.014
[39]	Bian R, Yao J, Nie Y, Zhang Y, Wu Z, et al. 2025. A novel function for the transcription factor sensitive to proton rhizotoxicity1 in promoting anthocyanin accumulation in strawberry. Plant Biotechnology Journal 23(9):3727−3747 doi: 10.1111/pbi.70194
[40]	Castillejo C, Waurich V, Wagner H, Ramos R, Oiza N, et al. 2020. Allelic variation of MYB10 is the major force controlling natural variation in skin and flesh color in strawberry (Fragaria spp.) fruit. The Plant Cell 32(12):3723−3749 doi: 10.1101/2020.06.12.148015
[41]	Denoyes B, Prohaska A, Petit J, Rothan C. 2023. Deciphering the genetic architecture of fruit color in strawberry. Journal of Experimental Botany 74(20):6306−6320 doi: 10.1093/jxb/erad245
[42]	Zhang LL, Zhu QY, Sun JL, Yao ZW, Qing T, et al. 2024. XBAT31 regulates reproductive thermotolerance through controlling the accumulation of HSFB2a/B2b under heat stress conditions. Cell Reports 43(6):114349 doi: 10.1016/j.celrep.2024.114349
[43]	Gabriel L, Brůna T, Hoff KJ, Ebel M, Lomsadze A, et al. 2024. BRAKER3: fully automated genome annotation using RNA-seq and protein evidence with GeneMark-ETP, AUGUSTUS, and TSEBRA. Genome Research 34(5):769−777 doi: 10.1101/gr.278090.123
[44]	Amin MR, Yurovsky A, Tian Y, Skiena S. 2018. DeepAnnotator: genome annotation with deep learning. Proceedings of the 2018 ACM International Conference on Bioinformatics, Computational Biology, and Health Informatics, August 29−September 1, 2018, Washington DC, USA. New York, NY, USA: Association for Computing Machinery. pp. 254−259 doi: 10.1145/3233547.3233577
[45]	Li Y, Pi M, Gao Q, Liu Z, Kang C. 2019. Updated annotation of the wild strawberry Fragaria vesca V4 genome. Horticulture Research 6:61 doi: 10.1038/s41438-019-0142-6
[46]	Lu J, Makun L, Yang X, Grierson D, Yuan H, et al. 2025. Haplotype-resolved chromosome-level genome assembly of Fragaria × ananassa Duch. cv. 'Yuexin'. Scientific Data 12(1):974 doi: 10.1038/s41597-025-05322-z
[47]	Jang YJ, Kim T, Lin M, Kim J, Begcy K, et al. 2024. Genome-wide gene network uncover temporal and spatial changes of genes in auxin homeostasis during fruit development in strawberry (F. × ananassa). BMC Plant Biology 24(1):876 doi: 10.1186/s12870-024-05577-5
[48]	Zhou J, Sittmann J, Guo L, Xiao Y, Huang X, et al. 2021. Gibberellin and auxin signaling genes RGA1 and ARF8 repress accessory fruit initiation in diploid strawberry. Plant Physiology 185(3):1059−1075 doi: 10.1093/plphys/kiaa087
[49]	Ha CM, Jun JH, Nam HG, Fletcher JC. 2004. BLADE-ON-PETIOLE1 encodes a BTB/POZ domain protein required for leaf morphogenesis in Arabidopsis thaliana. Plant and Cell Physiology 45(10):1361−1370 doi: 10.1093/pcp/pch201
[50]	Li BJ, Grierson D, Shi Y, Chen KS. 2022. Roles of abscisic acid in regulating ripening and quality of strawberry, a model non-climacteric fruit. Horticulture Research 9:uhac089 doi: 10.1093/hr/uhac089
[51]	Li BJ, Shi YN, Xiao YN, Jia HR, Yang XF, et al. 2024. AUXIN RESPONSE FACTOR 2 mediates repression of strawberry receptacle ripening via auxin-ABA interplay. Plant Physiology 196(4):2638−2653 doi: 10.1093/plphys/kiae510
[52]	Martín-Pizarro C, Vallarino JG, Osorio S, Meco V, Urrutia M, et al. 2021. The NAC transcription factor FaRIF controls fruit ripening in strawberry. The Plant Cell 33(5):1574−1593 doi: 10.1093/plcell/koab070
[53]	Li X, Martín-Pizarro C, Zhou L, Hou B, Wang Y, et al. 2023. Deciphering the regulatory network of the NAC transcription factor FvRIF, a key regulator of strawberry (Fragaria vesca) fruit ripening. The Plant Cell 35(11):4020−4045 doi: 10.1093/plcell/koad210
[54]	Xiao K, Fan J, Bi X, Tu X, Li X, et al. 2025. A NAC transcription factor and a MADS-box protein antagonistically regulate sucrose accumulation in strawberry receptacles. Plant Physiology 197:kiaf043 doi: 10.1093/plphys/kiaf043
[55]	Fan J, Cao M, Bi X, Zhu Y, Gao Q, et al. 2025. A FvERF3-FvNAC073 module regulates strawberry fruit size and ripening. The Plant Journal 122:e70262 doi: 10.1111/tpj.70262
[56]	Koskela EA, Sønsteby A, Flachowsky H, Heide OM, Hanke MV, et al. 2016. TERMINAL FLOWER1 is a breeding target for a novel everbearing trait and tailored flowering responses in cultivated strawberry (Fragaria × ananassa Duch.). Plant Biotechnology Journal 14(9):1852−1861 doi: 10.1111/pbi.12545

About this article

Cite this article

Ma W, Wang Y, Lin Y, Zhang Y, Luo Y, et al. 2026. Update of the octoploid strawberry genome annotation and gene regulatory network analysis revealed key factors in strawberry fruit maturation. Fruit Research 6: e010 doi: 10.48130/frures-0026-0001

Introduction

The cultivated strawberry (Fragaria × ananassa Duch.) is a globally important perennial horticultural crop and one of the most widely consumed berry fruits. Its global annual production now exceeds nine million metric tons, with China, the United States, and Spain as the leading producers (FAOSTAT, 2023). Beyond their economic importance, strawberries are a rich source of essential nutrients, such as vitamin C and folate, and of health-promoting phytonutrients, including anthocyanins and other phenolic compounds. These compounds have been associated with a reduced risk of cardiovascular diseases and certain cancers^[1].

The exceptional agronomic traits of the cultivated strawberry are underpinned by a remarkably complex genetic background. This relatively young domesticated crop originated from an accidental hybridization in 18th-century Europe between two wild octoploid species: F. virginiana from North America and F. chiloensis from Chile^[2]. Its allo-octoploid genome (2n = 8x = 56) comprises four distinct diploid subgenomes, designated A, B, C, and D, which remain incompletely characterized^[3−5]. This intricate polyploid structure, coupled with the high heterozygosity typical of an outcrossing species, poses substantial challenges for genomic analyses and has markedly hindered progress in molecular breeding and functional genomics. Researchers frequently rely on the structurally simpler diploid ancestor, F. vesca, as a model system^[6]. However, to fully elucidate complex biological phenomena in polyploids, such as chromosomal interactions, allelic expression patterns, subgenome stabilization, and divergence, studies must ultimately focus on the octoploid strawberry itself. Moreover, octoploid strawberries dominate current commercial cultivation, creating an urgent industry need to improve traits such as disease resistance, climate resilience, and fruit quality. Improving foundational genomic resources for the octoploid strawberry is therefore critical to overcoming this long-standing 'genomic bottleneck', thereby facilitating more precise and efficient research and breeding programs.

In recent years, significant advancements have been made in cultivated strawberry genomics. The release of the first chromosome-scale reference genome for the octoploid cultivar 'Camarosa' marked a pivotal milestone. Published by Edger et al., this assembly combined short-read sequencing with PacBio long-read sequencing for scaffolding^[5]. It provided the first comprehensive insight into octoploid genome architecture, revealing a dominant subgenome derived from the F. vesca lineage that disproportionately influences key agronomic traits. Building on this foundation, and enabled by advances in sequencing technologies, particularly the widespread adoption of long-read sequencing and improved assembly algorithms, high-quality reference genomes have been developed for additional cultivars, including 'Yanli'^[7], 'Benihoppe'^[3,8], 'EA78'^[9], 'Florida Brilliance'^[10], 'Royal Royce'^[11], 'Seolhyang'^[12], 'Chulian'^[13], and FL17.68–110^[14], as well as for several wild progenitors, including F. vesca 'Ruegen'^[15] and accessions of F. chiloensis and F. virginiana^[16]. These genomes have helped elucidate fundamental biological questions related to subgenome structural variation, biased allelic expression, centromeric sequence expansion and divergence, and the genetic basis of variation and evolution in genes underlying critical agronomic traits.

Despite this progress, the corresponding gene annotations remain suboptimal. Most assemblies still define only one mRNA per locus and lack comprehensive untranslated regions. For example, current gene annotations for various strawberry genomes struggle to accurately delineate complete transcript boundaries, often resulting in missing or incorrectly predicted gene models. In the case of the 'Benihoppe' strawberry, its initial genome annotation achieved a benchmarking universal single-copy orthologs (BUSCO) completeness score of approximately 96%^[3], indicating that a subset of conserved single-copy orthologous genes was either absent or fragmented. Subsequently, a haplotype-resolved telomere-to-telomere (T2T) genome of 'Benihoppe' was assembled, with BUSCO completeness scores of 98.0%–98.4%^[8]. However, large-scale full-length transcript evidence, essential for capturing alternative splicing and complete UTRs, remains lacking. 'Benihoppe', a cultivar developed in Japan from a cross between 'Akihime' and 'Sachinoka', is one of the most widely cultivated varieties in China, and is frequently used as a parent in Asian breeding programs^[17]. These imperfections in its structural annotation have impeded research relying on this cultivar and limited the application of efficient molecular-assisted selection and genomic selection breeding strategies.

Recent technological advancements in sequencing, particularly the synergistic integration of long- and short-read RNA sequencing (RNA-seq), provide a powerful approach to substantially enhance genome annotation quality by capturing full-length transcripts and defining complete gene structures^[18−21]. This study has two primary objectives: (1) to generate a substantially improved genome annotation (FxaBHv1.0.a2) for the economically important octoploid strawberry cultivar 'Benihoppe' by integrating 257 short-read and 33 long-read (PacBio/Nanopore) RNA-seq datasets that collectively span various tissues, developmental stages, and biotic/abiotic stresses; and (2) to leverage this high-fidelity genomic resource for an in-depth analysis of the transcriptional regulatory networks governing fruit development and ripening. By applying Mfuzz clustering and constructing a causal inference network using gene network inference with an ensemble of trees (GENIE3), key master regulators orchestrating this critical agronomic process were identified. This research not only delivers an essential updated resource for the strawberry research community but also provides novel insights into the molecular regulatory mechanisms underlying fruit development and maturation.

Discussion

Accurate genome annotation is essential for precise gene expression analysis and supports applications such as molecular breeding and functional genomics. Over the past five years, strawberry genome assemblies have improved significantly^[3,5,7,9,10,12,15,16], yet annotations remain suboptimal. For example, the phase-resolved 'Benihoppe' genome defines only one mRNA per gene, with UTRs annotated for just 30% of genes^[3], while the haplotype-resolved 'Benihoppe' genome predicted gene models covering 98.0%–98.4% of BUSCO orthologs^[8]. Conventionally, two main strategies are used in combination for genome annotation. The first employs ab initio tools such as AUGUSTUS or GeneMark^[43], which predict gene structures using hidden Markov models. More recently, AI-driven tools such as DeepAnnotator and Helixer have achieved impressive prediction accuracy and efficiency using machine and deep learning^[29,44]. The second strategy, considered more reliable, uses experimental evidence from expressed sequence tags, RNA-seq reads, or homologous gene sequences. Short-read RNA-seq, while widely used, requires complex computational assembly and often produces incomplete or chimeric transcripts. In contrast, long-read RNA-seq captures full-length mRNA sequences in a single read and also reveals alternative splicing isoforms. Recent strawberry genome reannotations demonstrate the value of integrating these two data types. The diploid F. vesca reannotation, using Illumina and SMRT RNA-seq, increased BUSCO completeness from 91.1% to 98.1%, resolved fragmented models, and added UTRs to 59.1% of genes^[45]. The Camarosa reannotation, combining PacBio and Illumina data, annotated 108,447 genes with 97.85% BUSCO completeness, improved UTRs for 79.9% of genes, and identified more transcription factors^[21]. The haplotype-resolved 'Yuexin' assembly annotated 110,776 genes with 99.07% BUSCO completeness, revealing structural variants linked to quality traits^[46].

In this study, Helixer was used for initial gene prediction, and the resulting models were refined with 257 short-read and 33 long-read RNA-seq datasets, achieving 99.5% BUSCO completeness and an average gene length of 3,211 bp. Compared with Camarosa v1.0.a2 (97.85% BUSCO) and Yuexin (99.07%), the present annotation offers superior UTR coverage (89%) and no fragmented BUSCOs, highlighting the benefits of extensive transcriptome sampling. It should be noted that several haplotype-resolved genomes and associated annotations are available^[13,14,46]. They provide indispensable data for studying allele-specific expression and structural variants. However, current haplotype-aware alignment and quantification tools remain limited and immature for highly homozygous octoploid genomes. Consequently, haplotype-resolved gene annotation and expression quantification still introduce considerable uncertainty. A conservative subgenome-collapsed strategy was therefore adopted to maximize accuracy in network inference, and we anticipate that future advances in haplotype-aware pipelines will unlock the full potential of these high-quality phased assemblies.

The development and ripening of the strawberry accessory fruit is a complex process, orchestrated by an intricate interplay between phytohormones and transcription factors (TFs), particularly through signaling between the achenes and the receptacle. It is well established that achene-derived auxin promotes early receptacle growth while repressing ripening^[47]. Previous studies have identified key TFs such as RGA1, ARF8, and ARF6 that mediate auxin and gibberellin signaling during these initial stages^[47,48]. The present findings provide deeper mechanistic insight into the transition away from this auxin-dominant phase. The identification of ARF6 within the ripening-associated pink module aligns with its known role in auxin dynamics. More significantly, the GENIE3 regulatory network revealed that top-ranked TFs are predicted to directly target GH3.6, an auxin-conjugating enzyme. Among these, the novel hub NBCL acts as an adapter of the E3 ubiquitin-protein ligase complex CUL3-RBX1-BTB^[49]. This suggests that a key ripening strategy involves the active post-translational suppression of auxin signaling to permit maturation. This decline in auxin appears to be a prerequisite for the rise of abscisic acid (ABA), the primary driver of ripening^[50]. The interplay between these hormones is critical, as demonstrated by ARF2's repression of the ABA biosynthesis gene FaNCED3^[51]. The present network analysis uncovers a complementary regulatory layer: as auxin levels fall, factors such as NBCL and STOP1 become active and are predicted to directly target core ABA signaling components, including ABI5 and the ABA receptor PYL12, respectively. This creates a robust molecular switch that not only removes the auxin 'brake' but also actively 'accelerates' ABA-mediated ripening processes. Once ABA signaling is initiated, a cascade of downstream TFs executes the ripening program.
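
To make the idea behind tree-based link ranking concrete, the following is a minimal sketch and not the pipeline used in this study. GENIE3 fits a tree ensemble for each target gene and ranks candidate regulators by how much they reduce the variance of the target's expression. The sketch below is a deliberately simplified stand-in that uses bootstrapped depth-one trees (stumps) instead of full random forests. The regulator names, expression values, and parameters (`n_trees`, `n_candidates`, `seed`) are hypothetical and chosen only for illustration.

```python
import random
from statistics import pvariance

def best_split_gain(xs, ys):
    """Largest drop in the target's sum of squares from one threshold on xs."""
    total = pvariance(ys) * len(ys)
    gain = 0.0
    for threshold in sorted(set(xs))[:-1]:
        left = [y for x, y in zip(xs, ys) if x <= threshold]
        right = [y for x, y in zip(xs, ys) if x > threshold]
        within = pvariance(left) * len(left) + pvariance(right) * len(right)
        gain = max(gain, total - within)
    return gain

def rank_regulators(tf_expr, target, n_trees=200, n_candidates=2, seed=0):
    """Score each regulator by the split gain it contributes across bootstrapped stumps."""
    rng = random.Random(seed)
    names = list(tf_expr)
    n = len(target)
    importance = dict.fromkeys(names, 0.0)
    for _ in range(n_trees):
        rows = [rng.randrange(n) for _ in range(n)]   # bootstrap the samples
        ys = [target[i] for i in rows]
        tried = rng.sample(names, n_candidates)       # random subset of regulators
        gains = {tf: best_split_gain([tf_expr[tf][i] for i in rows], ys) for tf in tried}
        winner = max(gains, key=gains.get)            # the stump splits on the best one
        importance[winner] += gains[winner]
    total = sum(importance.values()) or 1.0
    return sorted(((score / total, tf) for tf, score in importance.items()), reverse=True)

# Hypothetical expression values across eight samples (not data from the study).
TF_EXPR = {
    "TF_tracking": [1, 2, 3, 4, 6, 7, 8, 9],
    "TF_unrelated_1": [5, 4, 5, 6, 5, 4, 6, 5],
    "TF_unrelated_2": [3, 7, 2, 8, 3, 7, 2, 8],
}
TARGET = [0.5, 0.6, 0.8, 1.0, 5.0, 6.0, 7.5, 8.0]

RANKED = rank_regulators(TF_EXPR, TARGET)
for weight, tf in RANKED:
    print(f"{tf} -> target: {weight:.2f}")
```

In this toy setting the regulator whose expression tracks the target receives the largest normalized weight. Such weights rank candidate links by predictive association; they do not by themselves demonstrate direct binding or regulation.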

Recent studies have established the NAC transcription factor FaRIF as a master regulator of strawberry fruit ripening, where it directly activates genes involved in anthocyanin biosynthesis, sugar metabolism, and ABA signaling^[52,53]. Additional NAC family members, such as NAC073, work antagonistically with CMB1L to control sucrose accumulation^[54], while the ERF3–NAC073 regulatory cascade coordinately modulates both fruit growth and ripening^[55]. These findings highlight the extensive crosstalk and hierarchical organization among transcription factor families in orchestrating comprehensive regulatory networks for strawberry fruit maturation. Using an integrative multi-evidence scoring approach, we identified NAC098 as a central hub regulator of receptacle maturation. The same framework confidently prioritized known key players, including MYB1, a homolog of the well-characterized anthocyanin master regulator MYB10, and SOC1, which are implicated in flavonoid biosynthesis^[40] and developmental timing^[56], respectively. In summary, the present study positions TFs such as NBCL and STOP1 as pivotal, high-level integrators of the auxin-to-ABA hormonal transition that defines strawberry ripening. These hubs, along with key downstream effectors such as NAC, bZIP, and MYB TFs, form a cohesive transcriptional network controlling fruit quality traits. Their high ranking in the present analysis underscores their potential as prime targets for CRISPR-based functional validation to enhance strawberry agronomic performance.

Metric	v1.0.a1	v1.0H (haplotype)	v1.0.a2 (this study)
Number of genes	109,320	116,232	103,316
Number of mRNAs	109,320	175,313	113,259
Mean length of genomic loci	2,462	2,823	3,211
Mean exon number	4.5	5.5	5.6
Mean CDS length	1,049	1,232	1,169
Single-exon gene number	31,703	29,344	17,048
Multi-exon gene number	77,617	86,888	86,268
Genes with 5'UTR	37,673	45,601	102,967
Genes with 3'UTR	42,605	46,873	101,449
Genes with both 5'UTR and 3'UTR	31,198	40,542	101,404
Mean 5'UTR length	173	371	248
Mean 3'UTR length	350	631	414
Shortest gene length	228	155	104
Longest gene length	151,412	808,995	67,370
Shortest intron length	4	20	20
Longest intron length	50,784	483,249	29,051
Complete BUSCOs (%)	96.2	98.7	99.5
Fragmented BUSCOs (%)	1.1	0.3	0
Missing BUSCOs (%)	2.7	1.0	0.5
Genes with GO terms	54,718	NA	64,266
Genes with KEGG assignment	20,318	NA	43,732
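
As a quick check on the comparison table, the short sketch below derives two ratios from the tabulated counts: mRNAs per gene and the share of loci with both UTRs. The counts are copied from the table; the function name and the choice of denominators are assumptions made for illustration. For v1.0.a2, dividing by the gene count gives about 98%, while dividing by the mRNA count gives about 89.5%, which is close to the 89% quoted in the text. Which denominator underlies the quoted figure is not stated in this section.

```python
# Counts copied from the annotation comparison table.
TABLE = {
    "v1.0.a1": {"genes": 109_320, "mrnas": 109_320, "both_utrs": 31_198},
    "v1.0H": {"genes": 116_232, "mrnas": 175_313, "both_utrs": 40_542},
    "v1.0.a2": {"genes": 103_316, "mrnas": 113_259, "both_utrs": 101_404},
}

def summarize(row):
    """Return (mRNAs per gene, % both UTRs over genes, % both UTRs over mRNAs)."""
    isoforms_per_gene = row["mrnas"] / row["genes"]
    pct_over_genes = 100 * row["both_utrs"] / row["genes"]
    pct_over_mrnas = 100 * row["both_utrs"] / row["mrnas"]
    return (
        round(isoforms_per_gene, 2),
        round(pct_over_genes, 1),
        round(pct_over_mrnas, 1),
    )

SUMMARY = {version: summarize(row) for version, row in TABLE.items()}

for version, (iso, by_gene, by_mrna) in SUMMARY.items():
    print(
        f"{version}: {iso} mRNAs per gene; both UTRs: "
        f"{by_gene}% of genes, {by_mrna}% of mRNAs"
    )
```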

GeneID	TotalScore	Functional description	ShortName
FxaBH_1D_001688	7.5	Protein SENSITIVE TO PROTON RHIZOTOXICITY 1	STOP1
FxaBH_7B_002588	7.5	MADS-box protein SOC1	SOC1
FxaBH_6C_001031	7.5	BTB/POZ domain and ankyrin repeat-containing protein NBCL	NBCL
FxaBH_1C_001667	7.5	Transcription factor MYB1	MYB1
FxaBH_1B_001845	7.5	Transcription factor MYB1	MYB1
FxaBH_1A_002113	7.5	Transcription factor MYB1	MYB1
FxaBH_7B_001452	6.5	Scarecrow-like protein 8	SCL8
FxaBH_6D_001039	6.5	BTB/POZ domain and ankyrin repeat-containing protein NBCL	NBCL
FxaBH_6B_002354	6.5	Probable WRKY transcription factor 75	WRKY75
FxaBH_6B_001087	6.5	BTB/POZ domain and ankyrin repeat-containing protein NBCL	NBCL
FxaBH_6A_001285	6.5	BTB/POZ domain and ankyrin repeat-containing protein NBCL	NBCL
FxaBH_3C_001503	6.5	Putative E3 ubiquitin-protein ligase XBAT31	XBAT31
FxaBH_7D_003052	4.5	Chalcone synthase	CHS1
FxaBH_7D_002223	4.5	Chalcone--flavanone isomerase 1	CHI1
FxaBH_7D_002157	4.5	Aspartic proteinase PCS1	PCS1
FxaBH_7D_002011	4.5	Chalcone isomerase-like protein 2	CHIL2
FxaBH_7D_001599	4.5	Anthocyanidin 3-O-glucosyltransferase 1	GT1
FxaBH_7C_002928	4.5	Chalcone synthase	CHS1
FxaBH_7C_002308	4.5	3-hydroxyisobutyryl-CoA hydrolase 1	CHY1
FxaBH_7C_002093	4.5	Chalcone--flavanone isomerase 2	CHI2
FxaBH_7C_001900	4.5	Chalcone isomerase-like protein 2	CHIL2
FxaBH_7C_001470	4.5	Anthocyanidin 3-O-glucosyltransferase 1	GT1
FxaBH_7B_003068	4.5	Chalcone synthase	CHS1
FxaBH_7B_002249	4.5	Chalcone--flavanone isomerase 2	CHI2
FxaBH_7B_002068	4.5	Chalcone isomerase-like protein 2	CHIL2
FxaBH_7B_000048	4.5	Polygalacturonase
FxaBH_7A_003557	4.5	Chalcone synthase	CHS1
FxaBH_7A_003556	4.5	Chalcone synthase	CHS1
FxaBH_7A_003555	4.5	Chalcone synthase	CHS1
FxaBH_7A_002558	4.5	Chalcone--flavanone isomerase 1	CHI1
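
The candidate table lists several entries that share a short name but differ in the letter embedded in the gene ID. The sketch below groups the twelve top-scoring rows by short name to show which subgenome copies of each candidate were prioritized. It assumes that the middle field of each ID (for example `6C`) encodes a chromosome number followed by a subgenome letter; that reading is inferred from the ID pattern and the A to D subgenome designation, not stated in this section. The helper names are hypothetical.

```python
from collections import defaultdict

# (GeneID, TotalScore, ShortName) rows copied from the top of the candidate table.
ROWS = [
    ("FxaBH_1D_001688", 7.5, "STOP1"),
    ("FxaBH_7B_002588", 7.5, "SOC1"),
    ("FxaBH_6C_001031", 7.5, "NBCL"),
    ("FxaBH_1C_001667", 7.5, "MYB1"),
    ("FxaBH_1B_001845", 7.5, "MYB1"),
    ("FxaBH_1A_002113", 7.5, "MYB1"),
    ("FxaBH_7B_001452", 6.5, "SCL8"),
    ("FxaBH_6D_001039", 6.5, "NBCL"),
    ("FxaBH_6B_002354", 6.5, "WRKY75"),
    ("FxaBH_6B_001087", 6.5, "NBCL"),
    ("FxaBH_6A_001285", 6.5, "NBCL"),
    ("FxaBH_3C_001503", 6.5, "XBAT31"),
]

def parse_gene_id(gene_id):
    """Split an ID into (chromosome number, subgenome letter).

    Assumed layout: <prefix>_<chromosome number><subgenome letter>_<serial>.
    """
    _, chrom, _ = gene_id.split("_")
    return int(chrom[:-1]), chrom[-1]

def group_by_name(rows):
    """Collect (subgenome, chromosome, score) tuples per short name, sorted by subgenome."""
    groups = defaultdict(list)
    for gene_id, score, name in rows:
        chrom_number, subgenome = parse_gene_id(gene_id)
        groups[name].append((subgenome, chrom_number, score))
    return {name: sorted(members) for name, members in groups.items()}

GROUPS = group_by_name(ROWS)
for name, members in GROUPS.items():
    subgenomes = "".join(sub for sub, _, _ in members)
    best = max(score for _, _, score in members)
    print(f"{name}: subgenomes {subgenomes}, best score {best}")
```

Under this reading, NBCL is represented by copies on all four subgenomes of chromosome 6 and MYB1 by copies on three subgenomes of chromosome 1, whereas STOP1 and SOC1 each appear once among these rows.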
