DeepETD: a novel deep-learning based model for endogenous metabolite target discovery

Zhixuan Xu; Xiaomin Wang; Xiaobo Yang; Xiao Yuan; Kongkai Zhu; Xinyue Min; Weilie Xiao; Heng Xu; Cheng Luo; Hao Zhang

doi:10.48130/targetome-0026-0024

2026 Volume 2

ORIGINAL ARTICLE Open Access

1.
State Key Laboratory of Discovery and Utilization of Functional Components in Traditional Chinese Medicine, Institute of Interdisciplinary Integrative Medicine Research, Shanghai University of Traditional Chinese Medicine, Shanghai 201203, China
2.
School of Chinese Materia Medica, Nanjing University of Chinese Medicine, Nanjing 210023, China
3.
Chemical Biology Research Center, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Shanghai 201203, China
4.
Zhongshan Institute for Drug Discovery, Shanghai Institute of Materia Medica, Chinese Academy of Sciences, Zhongshan 528437, China
5.
Advanced Medical Research Institute and Meili Lake Translational Research Park, Shandong University, Jinan, Shandong 250012, China
6.
Key Laboratory of Medicinal Chemistry for Natural Resource, Ministry of Education, School of Pharmacy and School of Chemical Science and Technology, Yunnan University, Kunming 650500, China
7.
Precision Medicine Research Institute of Guizhou, and Guizhou Provincial Key Laboratory of Digestive System Diseases, The Affiliated Hospital of Guizhou Medical University, Guiyang 550000, China
^#Authors contributed equally: Zhixuan Xu, Xiaomin Wang, Xiaobo Yang

Corresponding authors: idaheng@simm.ac.cn (Xu H); cluo@simm.ac.cn (Luo C); zhanghao@shutcm.edu.cn (Zhang H)

Received: 07 April 2026
Revised: 31 May 2026
Accepted: 03 June 2026
Published online: 17 June 2026
Targetome 2(3), Article number: e024 (2026)

Abstract

Metabolites are involved in almost all fundamental biological processes, and identifying their protein targets is crucial for elucidating their non-canonical signaling roles and evaluating their therapeutic potential. In recent years, advances in chemical proteomics have steadily revealed atypical phenotypic functions of classic metabolites. However, the discovery of disease-related functions of endogenous metabolites is still progressing slowly. To accelerate the identification of disease-related targets for endogenous metabolites, we propose a new hypothesis: endogenous metabolites and their molecular targets are expected to have similar disease associations. Following this hypothesis, we report the development of a novel deep-learning model called DeepETD, which integrates bioinformatics data and introduces an attention mechanism to predict the functional targets associated with specific metabolite phenotypes. Using this model, we constructed a publicly accessible database named EMTDD containing potential targets for 3,382 common human endogenous metabolites. Overall, this study presents a new computational method and resource for endogenous metabolite target discovery that complements experimental methods such as chemical proteomics.

Keywords

- DeepETD
- Metabolite target discovery
- EMTDD
- Testosterone
- Leukotriene B4

Supplementary information

Supplementary Fig. S1 Performance evaluation of DeepETD.
Supplementary Fig. S2 Negative MST validation results for predicted targets.

Rights and permissions
Copyright: © 2026 by the author(s). Published by Maximum Academic Press on behalf of China Pharmaceutical University. This article is an open access article distributed under Creative Commons Attribution License (CC BY 4.0), visit https://creativecommons.org/licenses/by/4.0/.

References

[1]	Husted AS, Trauelsen M, Rudenko O, Hjorth SA, Schwartz TW. 2017. GPCR-mediated signaling of metabolites. Cell Metabolism 25:777−796 doi: 10.1016/j.cmet.2017.03.008
[2]	Qiu S, Cai Y, Yao H, Lin C, Xie Y, et al. 2023. Small molecule metabolites: discovery of biomarkers and therapeutic targets. Signal Transduction and Targeted Therapy 8:132 doi: 10.1038/s41392-023-01399-3
[3]	Palermo A. 2023. Metabolomics- and systems-biology-guided discovery of metabolite lead compounds and druggable targets. Drug Discovery Today 28:103460 doi: 10.1016/j.drudis.2022.103460
[4]	Luzarowski M, Skirycz A. 2019. Emerging strategies for the identification of protein−metabolite interactions. Journal of Experimental Botany 70:4605−4618 doi: 10.1093/jxb/erz228
[5]	Cox MA, Bassi C, Saunders ME, Nechanitzky R, Morgado-Palacin I, et al. 2020. Beyond neurotransmission: acetylcholine in immunity and inflammation. Journal of Internal Medicine 287:120−133 doi: 10.1111/joim.13006
[6]	Kopec AM, Smith CJ, Bilbo SD. 2019. Neuro-immune mechanisms regulating social behavior: dopamine as mediator? Trends in Neurosciences 42:337−348 doi: 10.1016/j.tins.2019.02.005
[7]	Ye D, Xu H, Tang Q, Xia H, Zhang C, et al. 2021. The role of 5-HT metabolism in cancer. Biochimica et Biophysica Acta (BBA) - Reviews on Cancer 1876:188618 doi: 10.1016/j.bbcan.2021.188618
[8]	Myburgh J. 2010. Norepinephrine: more of a neurohormone than a vasopressor. Critical Care 14:196 doi: 10.1186/cc9246
[9]	Piazza I, Kochanowski K, Cappelletti V, Fuhrer T, Noor E, et al. 2018. A map of protein-metabolite interactions reveals principles of chemical communication. Cell 172:358−372.e23 doi: 10.1016/j.cell.2017.12.006
[10]	Li X, Gianoulis TA, Yip KY, Gerstein M, Snyder M. 2010. Extensive in vivo metabolite-protein interactions revealed by large-scale systematic analyses. Cell 143:639−650 doi: 10.1016/j.cell.2010.09.048
[11]	Qin W, Yang F, Wang C. 2020. Chemoproteomic profiling of protein−metabolite interactions. Current Opinion in Chemical Biology 54:28−36 doi: 10.1016/j.cbpa.2019.11.003
[12]	Nicholson JK, Lindon JC. 2008. Metabonomics. Nature 455:1054−1056 doi: 10.1038/4551054a
[13]	Xu H, Zhao H, Ding C, Jiang D, Zhao Z, et al. 2023. Celastrol suppresses colorectal cancer via covalent targeting peroxiredoxin 1. Signal Transduction and Targeted Therapy 8:51 doi: 10.1038/s41392-022-01231-4
[14]	Cheng F, Zhou Y, Li W, Liu G, Tang Y. 2012. Prediction of chemical-protein interactions network with weighted network-based inference method. PLoS One 7:e41064 doi: 10.1371/journal.pone.0041064
[15]	Lounkine E, Keiser MJ, Whitebread S, Mikhailov D, Hamon J, et al. 2012. Large-scale prediction and testing of drug activity on side-effect targets. Nature 486:361−367 doi: 10.1038/nature11159
[16]	Barabási AL, Gulbahce N, Loscalzo J. 2011. Network medicine: a network-based approach to human disease. Nature Reviews Genetics 12:56−68 doi: 10.1038/nrg2918
[17]	Chen B, Butte AJ. 2016. Leveraging big data to transform target selection and drug discovery. Journal of Clinical Pharmacology & Therapeutics 99:285−297 doi: 10.1002/cpt.318
[18]	Zhang Y, Liu C, Liu M, Liu T, Lin H, et al. 2023. Attention is all you need: utilizing attention in AI-enabled drug discovery. Briefings in Bioinformatics 25:bbad467 doi: 10.1093/bib/bbad467
[19]	Wishart DS, Guo A, Oler E, Wang F, Anjum A, et al. 2022. HMDB 5.0: the human metabolome database for 2022. Nucleic Acids Research 50:D622−D631 doi: 10.1093/nar/gkab1062
[20]	Gilson MK, Liu T, Baitaluk M, Nicola G, Hwang L, et al. 2016. BindingDB in 2015: a public database for medicinal chemistry, computational chemistry and systems pharmacology. Nucleic Acids Research 44:D1045−D1053 doi: 10.1093/nar/gkv1072
[21]	Schriml LM, Munro JB, Schor M, Olley D, McCracken C, et al. 2022. The human disease ontology 2022 update. Nucleic Acids Research 50:D1255−D1261 doi: 10.1093/nar/gkab1063
[22]	Gargano MA, Matentzoglu N, Coleman B, Addo-Lartey EB, Anagnostopoulos AV, et al. 2024. The human phenotype ontology in 2024: phenotypes around the world. Nucleic Acids Research 52:D1333−D1346 doi: 10.1093/nar/gkad1005
[23]	Zdrazil B, Felix E, Hunter F, Manners EJ, Blackshaw J, et al. 2024. The ChEMBL Database in 2023: a drug discovery platform spanning multiple bioactivity data types and time periods. Nucleic Acids Research 52:D1180−D1192 doi: 10.1093/nar/gkad1004
[24]	Yin J, Chen KM, Clark MJ, Hijazi M, Kumari P, et al. 2020. Structure of a D2 dopamine receptor−G-protein complex in a lipid membrane. Nature 584:125−129 doi: 10.1038/s41586-020-2379-5
[25]	Xu P, Huang S, Krumm BE, Zhuang Y, Mao C, et al. 2023. Structural genomics of the human dopamine receptor system. Cell Research 33:604−616 doi: 10.1038/s41422-023-00808-0
[26]	Brzozowski AM, Pike ACW, Dauter Z, Hubbard RE, Bonn T, et al. 1997. Molecular basis of agonism and antagonism in the oestrogen receptor. Nature 389:753−758 doi: 10.1038/39645
[27]	Toy W, Shen Y, Won H, Green B, Sakr RA, et al. 2013. ESR1 ligand-binding domain mutations in hormone-resistant breast cancer. Nature Genetics 45:1439−1445 doi: 10.1038/ng.2822
[28]	Fortunati N, Catalano MG, Boccuzzi G, Frairia R. 2010. Sex Hormone-Binding Globulin (SHBG), estradiol and breast cancer. Molecular and Cellular Endocrinology 316:86−92 doi: 10.1016/j.mce.2009.09.012
[29]	Maggiolini M, Vivacqua A, Fasanella G, Recchia AG, Sisci D, et al. 2004. The G protein-coupled receptor GPR30 mediates c-fos up-regulation by 17β-estradiol and phytoestrogens in breast cancer cells. Journal of Biological Chemistry 279:27008−27016 doi: 10.1074/jbc.M403588200
[30]	Wang N, He X, Zhao J, Jiang H, Cheng X, et al. 2022. Structural basis of leukotriene B4 receptor 1 activation. Nature Communications 13:1156 doi: 10.1038/s41467-022-28820-9
[31]	Grishkovskaya I, Avvakumov GV, Sklenar G, Dales D, Hammond GL, et al. 2000. Crystal structure of human sex hormone-binding globulin: steroid transport by a laminin G-like domain. The EMBO Journal 19:504−512 doi: 10.1093/emboj/19.4.504
[32]	Xing C, Zhang J, Zhao H, He B. 2022. Effect of sex hormone-binding globulin on polycystic ovary syndrome: mechanisms, manifestations, genetics, and treatment. International Journal of Women's Health 14:91−105 doi: 10.2147/IJWH.S344542
[33]	Bhasin S, Brito JP, Cunningham GR, Hayes FJ, Hodis HN, et al. 2018. Testosterone therapy in men with hypogonadism: an endocrine society clinical practice guideline. The Journal of Clinical Endocrinology & Metabolism 103:1715−1744 doi: 10.1210/jc.2018-00229
[34]	Terawaki K, Yokomizo T, Nagase T, Toda A, Taniguchi M, et al. 2005. Absence of leukotriene B4 receptor 1 confers resistance to airway hyperresponsiveness and Th2-type immune responses. The Journal of Immunology 175:4217−4225 doi: 10.4049/jimmunol.175.7.4217
[35]	Sumida H, Yanagida K, Kita Y, Abe J, Matsushima K, et al. 2014. Interplay between CXCR2 and BLT1 facilitates neutrophil infiltration and resultant keratinocyte activation in a murine model of imiquimod-induced psoriasis. The Journal of Immunology 192:4361−4369 doi: 10.4049/jimmunol.1302959
[36]	Zhou J, Lai W, Yang W, Pan J, Shen H, et al. 2018. BLT1 in dendritic cells promotes Th1/Th17 differentiation and its deficiency ameliorates TNBS-induced colitis. Cellular & Molecular Immunology 15:1047−1056 doi: 10.1038/s41423-018-0030-2
[37]	Mauvais-Jarvis F, Bhasin S. 2026. Metabolic messengers: testosterone. Nature Metabolism 8:52−61 doi: 10.1038/s42255-025-01431-6
[38]	He R, Chen Y, Cai Q. 2020. The role of the LTB4-BLT1 axis in health and disease. Pharmacological Research 158:104857 doi: 10.1016/j.phrs.2020.104857
[39]	Zheng L, Cao J, Jing L, Kang D, Wang Z, et al. 2026. Convergence of computer-aided drug discovery and artificial intelligence: towards next-generation therapeutics. Pharmaceutical Science Advances 4:100100 doi: 10.1016/j.pscia.2025.100100
[40]	Yang X, Zhang B, Wang S, Lu Y, Chen K, et al. 2023. OTTM: an automated classification tool for translational drug discovery from omics data. Briefings in Bioinformatics 24:bbad301 doi: 10.1093/bib/bbad301
[41]	Lamb J, Crawford ED, Peck D, Modell JW, Blat IC, et al. 2006. The connectivity map: using gene-expression signatures to connect small molecules, genes, and disease. Science 313:1929−1935 doi: 10.1126/science.1132939
[42]	Swinney DC, Anthony J. 2011. How were new medicines discovered? Nature Reviews Drug Discovery 10:507−519 doi: 10.1038/nrd3480
[43]	Eder J, Sedrani R, Wiesmann C. 2014. The discovery of first-in-class drugs: origins and evolution. Nature Reviews Drug Discovery 13:577−587 doi: 10.1038/nrd4336

About this article

Cite this article

Xu Z, Wang X, Yang X, Yuan X, Zhu K, et al. 2026. DeepETD: a novel deep-learning based model for endogenous metabolite target discovery. Targetome 2(3): e024 doi: 10.48130/targetome-0026-0024

Special Issue

Multidisciplinary Integration Empowers Intelligent Target Discovery and Drug Development of Traditional Chinese Medicine Active Components

Introduction

Metabolites play crucial roles in a wide range of physiological and pathological processes. As the main natural small-molecule compounds in prokaryotes and eukaryotes, metabolites show a high degree of structural and functional diversity^[1,2]. With the development of metabolomics, a large number of novel metabolites have been discovered, and their biological functions are constantly being re-annotated and expanded^[1,3]. However, compared with the mature study of protein-protein and protein-DNA interactions, the systematic identification of metabolite-protein interactions remains limited, which greatly restricts our understanding of metabolite functions.

In recent years, chemical proteomics has become a key technology for systematically and accurately revealing metabolite-protein interactions. Thanks to continuous advances in mass spectrometry, new chemoproteomic methods have been developed for identifying these interactions^[4]. These methods use labeled or unlabeled metabolites to capture their binding proteins, which greatly facilitates the discovery of new functions of classical metabolites. For example, roles for classical neurotransmitters outside the central nervous system have gradually been confirmed^[5−8]. Chemical proteomics has been used to construct metabolite-protein interactomes, showing that metabolites can either target many proteins and produce global effects or act on only a few proteins and produce restricted effects^[9,10]. However, most chemical proteomics technologies are biophysical methods, and the means to directly link metabolite-protein interactions to specific cellular phenotypes are still limited^[11]. In addition, the concentration, localization, and activity of metabolites may change significantly under different nutritional states and disease conditions^[12]. New strategies are therefore urgently needed to integrate multi-source biomedical information, such as cellular localization, concentration, functional pathways, and phenotypes, and thereby promote the accurate identification of functional targets for endogenous metabolites.

Previously, we developed a computational tool called OTTER for identifying the targets of natural products with known pharmacological activities^[13]. However, target identification is more challenging for endogenous metabolites: their binding affinities for proteins vary greatly, and more biological factors must be taken into account, such as disease relevance and subcellular localization. To address these challenges, we propose a hypothesis: if an endogenous metabolite and a protein are significantly associated or consistent across multidimensional biological characteristics, the protein is likely to be a potential target of the metabolite. In other words, endogenous metabolites should share a consistent 'fingerprint' with their protein targets. Based on this assumption, we report the development of a new method based on bioinformatics and deep learning called DeepETD (Deep learning-based Endogenous metabolites Target Discovery). This method aims to improve the identification of functional targets of endogenous metabolites through advanced data integration and deep learning.

Compared with traditional compound-protein interaction prediction methods^[14,15], DeepETD makes fuller use of comprehensive biomedical data. The model integrates multi-source biomedical information and represents the biological characteristics of metabolites and proteins (including subcellular localization, cellular phenotype, and disease association information^[16,17]) as unified feature fingerprints. The model then uses a deep learning algorithm with an attention mechanism to process high-dimensional and noisy data^[18]. The attention mechanism assigns weights to the inputs from different data sources, which significantly improves the prediction performance for metabolite-protein interactions. All prediction results are stored in a web server named EMTDD (Endogenous Metabolites Target Discovery Database), where users can query the potential targets of endogenous metabolites predicted by DeepETD.

Materials and methods

Dataset construction

A dataset containing positive and negative samples of endogenous metabolite-protein interactions was constructed using a binding affinity cutoff strategy. Endogenous metabolite-protein interaction pairs with an IC₅₀ value of ≤ 100 nmol·L⁻¹ were designated as positive samples, whereas those with an IC₅₀ value of > 100 nmol·L⁻¹ were treated as negative samples. The multidimensional biological information used to build the dataset was collected from multiple authoritative databases and literature resources, including the Human Metabolome Database (HMDB)^[19], the IUPHAR/BPS Guide to PHARMACOLOGY, and the BindingDB database^[20]. Disease association information and cellular phenotype data were integrated from the Disease Ontology (DO) database and the Human Phenotype Ontology (HPO) database^[21,22]. To enrich the dataset, PubMed abstracts were systematically searched to extract the cellular localization, associated diseases, and cellular phenotype information of each metabolite and protein.
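
The cutoff rule above can be illustrated with a minimal sketch. The record type, field names, and the three toy pairs are hypothetical and serve only to show the labeling; only the 100 nmol·L⁻¹ threshold comes from the text.

```python
from dataclasses import dataclass

IC50_CUTOFF_NM = 100.0  # cutoff stated in the text, in nmol/L


@dataclass(frozen=True)
class InteractionRecord:
    """Hypothetical container for one metabolite-protein measurement."""
    metabolite: str
    protein: str
    ic50_nm: float


def label_pair(record, cutoff_nm=IC50_CUTOFF_NM):
    """Return 1 (positive) if IC50 <= cutoff, otherwise 0 (negative)."""
    return 1 if record.ic50_nm <= cutoff_nm else 0


# Toy records, invented for illustration only.
records = [
    InteractionRecord("metabolite_A", "protein_X", 12.5),
    InteractionRecord("metabolite_A", "protein_Y", 100.0),
    InteractionRecord("metabolite_B", "protein_X", 2500.0),
]
labels = [label_pair(r) for r in records]
```

Note that a pair measured at exactly 100 nmol·L⁻¹ falls on the positive side, because the stated rule is inclusive.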

The keywords for cellular localization were selected from subcellular localization terms commonly used in the literature. For each entity, the top five cellular localization terms, the top ten associated diseases, and the top five cellular phenotypes were selected according to their occurrence frequency, and the dataset was constructed from these features. Each sample contains the features of one metabolite and one protein, covering their cellular localization, associated diseases, and cellular phenotypes. After the positive and negative samples were combined, they were divided into a training set (80%) and a validation set (20%). All features were converted into numerical feature vectors using either one-hot encoding or embedding layers.
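
As a rough sketch of the two steps described above, the following shows frequency-based top-k term selection and an 80/20 split. The function names and the toy mention list are hypothetical, and using a fixed seed for the shuffle is an assumption rather than a detail given for this step.

```python
import random
from collections import Counter


def top_terms(mentions, k):
    """Return the k most frequent terms in a list of term mentions."""
    return [term for term, _ in Counter(mentions).most_common(k)]


def split_train_val(samples, train_fraction=0.8, seed=42):
    """Shuffle the samples reproducibly and split them into train/validation."""
    shuffled = list(samples)
    random.Random(seed).shuffle(shuffled)
    cut = int(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Toy localization mentions for one entity, invented for illustration.
mentions = ["cytoplasm", "nucleus", "cytoplasm", "membrane", "cytoplasm", "nucleus"]
top_two = top_terms(mentions, k=2)

# Ten placeholder sample indices stand in for labeled metabolite-protein pairs.
train, val = split_train_val(range(10))
```

With real data, `k` would be 5 for localization terms and phenotypes and 10 for diseases, as stated in the text.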

Construction of biological feature fingerprints
For each metabolite and protein, three core biological characteristics were defined: associated disease types, cellular phenotypes, and subcellular localization. Feature extraction was based on large-scale text mining, in which the co-occurrence frequency of specific biological terms (diseases, phenotypes, etc.) and target objects (metabolites or proteins) in PubMed abstracts was calculated. If a term and an object frequently co-occur in the same abstracts, they are considered highly relevant. These co-occurrence frequencies were then quantified and encoded as high-dimensional feature fingerprints, which were used as model inputs to characterize the potential association between metabolites and proteins in a multidimensional biological context.
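
The co-occurrence counting described above might look like the following minimal sketch. It uses naive case-insensitive substring matching, which is an assumption made for brevity (a real pipeline would need synonym handling and proper entity recognition), and the four abstracts are invented toy strings, not PubMed records.

```python
from collections import Counter


def cooccurrence_fingerprint(entity, vocabulary, abstracts):
    """Count, per vocabulary term, the abstracts that mention both entity and term."""
    entity_lc = entity.lower()
    counts = Counter()
    for abstract in abstracts:
        text = abstract.lower()
        if entity_lc not in text:
            continue  # the entity must appear in the abstract for a co-occurrence
        for term in vocabulary:
            if term.lower() in text:
                counts[term] += 1
    # The fingerprint is ordered like the vocabulary, one count per term.
    return [counts[term] for term in vocabulary]


# Invented toy abstracts, for illustration only.
abstracts = [
    "Testosterone levels are altered in prostate cancer.",
    "Testosterone deficiency is linked to obesity and diabetes.",
    "Leukotriene B4 drives inflammation in asthma.",
    "Prostate cancer progression depends on testosterone signalling.",
]
disease_vocab = ["prostate cancer", "obesity", "asthma"]
fingerprint = cooccurrence_fingerprint("testosterone", disease_vocab, abstracts)
```

The same function can be applied to a protein name, so that metabolite and protein fingerprints share one vocabulary and can be compared directly.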

Model architecture
An attention-based neural network model was developed to predict the interaction between endogenous metabolites and proteins. The model comprises the following layers.

Embedding layer: The discrete input features (diseases, phenotypes, and subcellular localizations) are mapped into continuous vector representations through the embedding layer. The embedding dimensions of disease, phenotype, and subcellular localization were set to 32, 16, and 16, respectively. The model uses two embedding layers, one for the input features of metabolites and one for those of proteins. These embedding layers map the original features to the low-dimensional space, reducing the dimensions to 64, 32, and 16 in turn.

Attention mechanism: To optimize the feature weights, an independent attention layer is introduced after the embedding layer. This layer contains two fully connected layers and a Tanh activation function for outputting attention weights. By using these attention weights to weight and sum the embedding vectors, the global representation of protein features is obtained.

Feature concatenation layer: The feature representations of metabolites and proteins are concatenated to form a comprehensive feature vector.

Deep neural network: The concatenated vector is processed through two fully connected layers with hidden layer sizes of 256 and 128, respectively. The LeakyReLU activation function and a dropout layer with a rate of 0.3 were added between the fully connected layers; the dropout layer helps prevent overfitting and supports the generalization performance of the model on unseen data. Finally, the features are mapped to a scalar through an additional fully connected layer, and the sigmoid activation function is applied to output the predicted probability of the interaction.
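
To make the attention pooling and concatenation steps concrete, here is a framework-free sketch in plain Python. It is not the published implementation: placing Tanh between the two fully connected layers and normalizing the scores with a softmax are assumptions, and the tiny weight matrices are arbitrary toy values.

```python
import math


def linear(matrix, bias, vector):
    """Affine map (matrix @ vector + bias) on plain Python lists."""
    return [sum(w * x for w, x in zip(row, vector)) + b
            for row, b in zip(matrix, bias)]


def softmax(scores):
    """Numerically stable softmax over a list of scores."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def attention_pool(embeddings, w1, b1, w2, b2):
    """Score each embedding with FC -> tanh -> FC, then return the weighted sum."""
    scores = []
    for emb in embeddings:
        hidden = [math.tanh(h) for h in linear(w1, b1, emb)]
        scores.append(linear(w2, b2, hidden)[0])
    weights = softmax(scores)  # assumption: scores are normalized to sum to 1
    dim = len(embeddings[0])
    pooled = [sum(w * emb[i] for w, emb in zip(weights, embeddings))
              for i in range(dim)]
    return pooled, weights


# Toy 2-dimensional feature embeddings and arbitrary attention parameters.
embeddings = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
w1, b1 = [[0.5, -0.5], [0.25, 0.75]], [0.0, 0.0]
w2, b2 = [[1.0, 1.0]], [0.0]

metabolite_vec, weights = attention_pool(embeddings, w1, b1, w2, b2)
protein_vec, _ = attention_pool(embeddings[:2], w1, b1, w2, b2)
pair_vec = metabolite_vec + protein_vec  # list concatenation = feature concatenation
```

In the model described above, `pair_vec` would then pass through the 256- and 128-unit fully connected layers and a sigmoid output; in practice these operations would be expressed with PyTorch modules rather than lists.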

Model training and optimization
The model was trained using the binary cross-entropy loss function. The Adam optimizer was employed to update the parameters. The learning rate was set to 0.001. Training was conducted over 20 epochs. Each epoch includes forward propagation to calculate the loss and backpropagation to update the parameters. To prevent overfitting and improve the generalization ability of the model, the early stopping strategy was adopted. If the area under the receiver operating characteristic curve (AUC-ROC) on the validation set did not improve for 10 consecutive epochs, the training was stopped, and the best-performing model was retained. The AUC-ROC value and accuracy were used as evaluation metrics in both the training and validation stages. When the validation set reached the optimal AUC-ROC value, the model parameters were saved for subsequent testing. Ten-fold cross-validation was used to evaluate the generalization ability and robustness of the model on different data sets. To ensure the reproducibility of the results, the random seed was set to 42 at the beginning of training, and the randomness control was maintained during data loading, model initialization, and training. The experiment was conducted in a computing environment equipped with NVIDIA GPUs and implemented based on the PyTorch deep learning framework.
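
The early stopping rule described above can be sketched independently of any deep learning framework. The class and function names are hypothetical, the AUC-ROC is computed by direct pairwise comparison (suitable only for small examples), and the demonstration uses a patience of 3 on an invented validation history to keep it short, whereas the text states a patience of 10.

```python
def auc_roc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative sample")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


class EarlyStopping:
    """Track the best validation AUC and signal when it has stopped improving."""

    def __init__(self, patience=10):
        self.patience = patience
        self.best_auc = float("-inf")
        self.best_epoch = None
        self.stale_epochs = 0

    def update(self, epoch, val_auc):
        """Record one epoch; return True when training should stop."""
        if val_auc > self.best_auc:
            self.best_auc = val_auc
            self.best_epoch = epoch
            self.stale_epochs = 0  # a real training loop would save the model here
        else:
            self.stale_epochs += 1
        return self.stale_epochs >= self.patience


example_auc = auc_roc([1, 1, 0, 0], [0.9, 0.4, 0.5, 0.1])

stopper = EarlyStopping(patience=3)
history = [0.70, 0.75, 0.74, 0.75, 0.73, 0.72]  # invented validation AUC values
stopped_at = None
for epoch, val_auc in enumerate(history, start=1):
    if stopper.update(epoch, val_auc):
        stopped_at = epoch
        break
```

Here an equal AUC does not count as an improvement, so training stops at epoch 5 and the parameters from epoch 2 would be the ones retained.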

Disease-context-aware target prediction
For each metabolite, disease-associated contexts were identified according to the co-occurrence frequency between metabolites and disease terms extracted from PubMed abstracts. The top 10 most frequently associated diseases were selected as the disease contexts for prediction. Under each disease context, candidate proteins were filtered to retain only proteins associated with the corresponding disease. Metabolite-protein pairs were then input into DeepETD for interaction prediction, and the predicted interaction probabilities were used as the final interaction scores under the corresponding disease context.
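
The filtering and scoring procedure above can be sketched as follows. The function name, the `score_pair` callable standing in for the trained DeepETD model, and all toy identifiers and scores are hypothetical; sorting the candidates by score is added here for readability and is not stated in the text.

```python
def predict_in_disease_contexts(metabolite, metabolite_diseases, protein_diseases,
                                score_pair, top_n=10):
    """For each top disease context, score only the proteins linked to that disease."""
    results = {}
    for disease in metabolite_diseases[:top_n]:
        candidates = [p for p, diseases in protein_diseases.items() if disease in diseases]
        scored = [(p, score_pair(metabolite, p)) for p in candidates]
        results[disease] = sorted(scored, key=lambda item: item[1], reverse=True)
    return results


def toy_score(metabolite, protein):
    """Placeholder for a trained model's predicted interaction probability."""
    toy_scores = {"protein_X": 0.91, "protein_Y": 0.35, "protein_Z": 0.62}
    return toy_scores[protein]


# Invented protein-to-disease associations, for illustration only.
protein_diseases = {
    "protein_X": {"disease_1", "disease_2"},
    "protein_Y": {"disease_1"},
    "protein_Z": {"disease_3"},
}
ranked = predict_in_disease_contexts(
    "metabolite_A", ["disease_1", "disease_2"], protein_diseases, toy_score
)
```

Because `protein_Z` is associated only with a disease outside the metabolite's contexts, it is never scored, which mirrors the candidate filtering step.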

Web-server of the endogenous metabolites target discovery database (EMTDD)
Using the optimized DeepETD model, potential targets for 3,382 endogenous metabolites recorded in the Human Metabolome Database (HMDB) were predicted across 10 highly relevant diseases. Relevant data were used to build a publicly accessible web server called EMTDD. The server is available at http://otter-simm.com/EM/EMTDD.html.

Microscale Thermophoresis (MST)
MST experiments were performed using a Monolith NT.115 (Blue/Red) instrument (NanoTemper Technologies GmbH, Germany). The target protein with an EGFP fluorescent label was overexpressed in HEK293T cells, and cell lysates were used for the experiment. HEK293T cells were transfected with plasmids using EZ Trans Lipo (Life-iLab) and lysed 48 h after transfection. A 5 μL aliquot of cell lysate was mixed with 5 μL of endogenous metabolite (Testosterone, Nature-Standard, ST05380100; Leukotriene B4, MedChemExpress, HY-107608) at different concentrations, and the samples were incubated for 10 min at room temperature. Measurements were made using nanoblue excitation with the MST power set to medium. Curve fitting was performed using the K_d model in the MO Affinity Analysis v2.3 software (NanoTemper Technologies).
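
For orientation, K_d fits of this kind are generally based on the standard 1:1 binding model derived from the law of mass action. The form below is that textbook model, written under the assumption that it matches the parameterization used by the analysis software; the symbols are introduced here for illustration, with c the ligand (metabolite) concentration, c_T the concentration of the fluorescent target, and F_unbound and F_bound the signals of the free and fully bound target.

```latex
F(c) = F_{\mathrm{unbound}}
     + \left(F_{\mathrm{bound}} - F_{\mathrm{unbound}}\right)
       \frac{c + c_{T} + K_d - \sqrt{\left(c + c_{T} + K_d\right)^{2} - 4\,c\,c_{T}}}{2\,c_{T}}
```

The fraction on the right is the bound fraction of the target, which rises from 0 to 1 as c increases, so K_d is estimated from where the measured signal moves between the two plateaus.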

Molecular docking
Molecular docking was performed using Schrödinger, and the results were visualized in PyMOL. The reported ACAT1 crystal structure (PDB ID: 2IBY) and the AlphaFold-predicted structures of PRMT2 and NR2F6 were used as the initial conformations, and the ProteinPreparation module was used to optimize the proteins. After all missing hydrogen atoms were added, binding sites were detected using the SiteMap module. Testosterone and Leukotriene B4 were prepared using the LigPrep module, and the Standard Precision docking mode was selected. The docking result with the highest score was chosen to generate a 3D structure diagram in PyMOL.
