| [1] |
Williams AJ, Grulke CM, Edwards J, McEachran AD, Mansouri K, et al. 2017. The CompTox Chemistry Dashboard: a community data resource for environmental chemistry. |
| [2] |
Kim S, Chen J, Cheng T, Gindulyte A, He J, et al. 2025. PubChem 2025 update. |
| [3] |
OECD. 2018. Users' Handbook supplement to the Guidance Document for developing and assessing Adverse Outcome Pathways. OECD Series on Adverse Outcome Pathways, No. 1. OECD Publishing, Paris. doi: 10.1787/5jlv1m9d1g32-en |
| [4] |
Papadopoulos D, Papadakis N, Litke A. 2020. A methodology for open information extraction and representation from large scientific corpora: the CORD-19 data exploration use case. |
| [5] |
Hirschberg J, Manning CD. 2015. Advances in natural language processing. |
| [6] |
Li J, Sun A, Han J, Li C. 2022. A survey on deep learning for named entity recognition. |
| [7] |
Gonzalez Hernandez F, Nguyen Q, Smith VC, Cordero JA, Ballester MR, et al. 2024. Named entity recognition of pharmacokinetic parameters in the scientific literature. |
| [8] |
Dagdelen J, Dunn A, Lee S, Walker N, Rosen AS, et al. 2024. Structured information extraction from scientific text with large language models. |
| [9] |
Liang W, Su W, Zhong L, Yang Z, Li T, et al. 2024. Comprehensive characterization of oxidative stress-modulating chemicals using GPT-based text mining. |
| [10] |
Zhang X, Kao Y, Che S, Yan J, Zhou S, et al. 2025. Chinese medical named entity recognition integrating adversarial training and feature enhancement. |
| [11] |
Ying H, Yuan H, Lu J, Qu Z, Zhao Y, et al. 2025. GENIE: Generative Note Information Extraction model for structuring EHR data. |
| [12] |
Li K, Zhang J, Yao C, Shi C. 2016. Automatic relation extraction from text: a survey. 2016 International Conference on Identification, Information and Knowledge in the Internet of Things (IIKI), Beijing, China, 2016. USA: IEEE. pp. 83−86 doi: 10.1109/IIKI.2016.58 |
| [13] |
Hochreiter S, Schmidhuber J. 1997. Long short-term memory. |
| [14] |
Chung J, Gulcehre C, Cho K, Bengio Y. 2014. Empirical evaluation of gated recurrent neural networks on sequence modeling. |
| [15] |
Howard J, Ruder S. 2018. Universal language model fine-tuning for text classification. Proc. 56th Annual Meeting of the Association for Computational Linguistics, Melbourne, Australia, 2018. US: Association for Computational Linguistics. pp. 328−339 doi: 10.18653/v1/p18-1031 |
| [16] |
Cho K, van Merriënboer B, Gulcehre C, Bahdanau D, Bougares F, et al. 2014. Learning phrase representations using RNN encoder-decoder for statistical machine translation. Proceedings of the 2014 Conference on Empirical Methods in Natural Language Processing (EMNLP), Doha, Qatar, 2014. PA, USA: ACL. pp. 1724−1734 doi: 10.3115/v1/d14-1179 |
| [17] |
Vaswani A, Shazeer N, Parmar N, Uszkoreit J, Jones L, et al. 2023. Attention is all you need. http://arxiv.org/abs/1706.03762. (Accessed on 2025-06-17) |
| [18] |
Brown TB, Mann B, Ryder N, Subbiah M, Kaplan J, et al. 2020. Language models are few-shot learners. |
| [19] |
Huang J, Cheng F, He L, Lou X, Li H, et al. 2024. Effect driven prioritization of contaminants in wastewater treatment plants across China: a data mining-based toxicity screening approach. |
| [20] |
Srivastava H, Kumar Das S. 2023. Air pollution prediction system using XRSTH-LSTM algorithm. |
| [21] |
Cheng F, Li H, Brooks BW, You J. 2021. Signposts for aquatic toxicity evaluation in China: text mining using event-driven taxonomy within and among regions. |
| [22] |
Shrestha S, Mount J, Vald G, Sermet Y, Samuel DJ, et al. 2025. A community-centric intelligent cyberinfrastructure for addressing nitrogen pollution using web systems and conversational AI. |
| [23] |
Strogonov V, Pollert J. 2025. Artificial intelligence-enhanced web application approach to data management in the WIDER UPTAKE project. |
| [24] |
Ren Y, Zhang T, Dong X, Li W, Wang Z, et al. 2024. WaterGPT: training a large language model to become a hydrology expert. |
| [25] |
Gunasekar S, Joselin Retna Kumar G, Dileep Kumar Y. 2022. Sustainable optimized LSTM-based intelligent system for air quality prediction in Chennai. |
| [26] |
Wu Z, Liu N, Li G, Liu X, Wang Y, et al. 2023. Meta-learning-based spatial-temporal adaption for coldstart air pollution prediction. |
| [27] |
Panneerselvam V, Thiagarajan R. 2023. ACBiGRU-DAO: attention convolutional bidirectional gated recurrent unit-based dynamic arithmetic optimization for air quality prediction. |
| [28] |
Liu Z, Yang Q, Shao J, Wang G, Liu H, et al. 2022. Improving daily precipitation estimation in the data scarce area by merging rain gauge and TRMM data with a transfer learning framework. |
| [29] |
Patra SR, Chu HJ, Tatas. 2023. Regional groundwater sequential forecasting using global and local LSTM models. |
| [30] |
Zhao X, Greenberg J, An Y, Hu XT. 2021. Fine-tuning BERT model for materials named entity recognition. 2021 IEEE International Conference on Big Data (Big Data), Orlando, FL, USA, 2021. US: IEEE. pp. 3717−3720 doi: 10.1109/BigData52589.2021.9671697 |
| [31] |
Kang Y, Kim J. 2024. ChatMOF: an artificial intelligence system for predicting and generating metal-organic frameworks using large language models. |
| [32] |
Duan H, Skreta M, Cotta L, Rajaonson EM, Dhawan N, et al. 2025. Boosting the predictive power of protein representations with a corpus of text annotations. |
| [33] |
Shi H, Zhao Y. 2024. Integration of advanced large language models into the construction of adverse outcome pathways: opportunities and challenges. |
| [34] |
Yang J, Xu H, Mirzoyan S, Chen T, Liu Z, et al. 2024. Poisoning medical knowledge using large language models. |
| [35] |
Chen Q, Hu Y, Peng X, Xie Q, Jin Q, et al. 2025. Benchmarking large language models for biomedical natural language processing applications and recommendations. |
| [36] |
Zhu JJ, Yang M, Jiang J, Bai Y, Chen D, et al. 2024. Enabling GPTs for expert-level environmental engineering question answering. |
| [37] |
Boiko DA, MacKnight R, Kline B, Gomes G. 2023. Autonomous chemical research with large language models. |
| [38] |
Bran AM, Cox S, Schilter O, Baldassari C, White AD, Schwaller P. 2024. Augmenting large language models with chemistry tools. |
| [39] |
Zheng Y, Koh HY, Ju J, Nguyen ATN, May LT, et al. 2025. Large language models for scientific discovery in molecular property prediction. |
| [40] |
Lane TR, Vignaux PA, Harris JS, Snyder SH, Urbina F, et al. 2025. Machine learning and large language models for modeling complex toxicity pathways and predicting steroidogenesis. |
| [41] |
Bodnar C, Bruinsma WP, Lucic A, Stanley M, Allen A, et al. 2025. A foundation model for the Earth system. |
| [42] |
Hayes T, Rao R, Akin H, Sofroniew NJ, Oktay D, et al. 2025. Simulating 500 million years of evolution with a language model. |
| [43] |
Chan N, Parker F, Bennett W, Wu T, Jia MY, et al. 2024. MedTsLLM: leveraging LLMs for multimodal medical time series analysis. |
| [44] |
Wang Z, Jin Q, Wei CH, Tian S, Lai PT, et al. 2025. GeneAgent: self-verification language agent for gene-set analysis using domain databases. |
| [45] |
Dhar P. 2020. The carbon impact of artificial intelligence. |
| [46] |
Perković G, Drobnjak A, Botički I. 2024. Hallucinations in LLMs: understanding and addressing challenges. 2024 47th MIPRO ICT and Electronics Convention (MIPRO), Opatija, Croatia, 2024. US: IEEE. pp. 2084−2088 doi: 10.1109/MIPRO60963.2024.10569238 |
| [47] |
Strubell E, Ganesh A, McCallum A. 2019. Energy and policy considerations for deep learning in NLP. Proc. 57th Annual Meeting of the Association for Computational Linguistics, Florence, Italy, 2019. pp. 3645−3650 |
| [48] |
Herrera M, Xie X, Menapace A, Zanfei A, Brentan BM. 2025. Sustainable AI infrastructure: a scenario-based forecast of water footprint under uncertainty. |
| [49] |
Zhang Y, Lin S, Xiong Y, Li N, Zhong L, et al. 2025. Fine-tuning large language models for interdisciplinary environmental challenges. |