Early detection of tomato leaf spot and wilt diseases based on hyperspectral imaging technology

Chang'an Zhou; Laichen Zheng; Kaixin Meng; Wei Zheng; Kaixing Zhang; Qinghua Shi

doi:10.48130/vegres-0025-0010

This study explores the application of hyperspectral imaging technology integrated with machine learning for the early detection of tomato leaf spot and blight diseases. Pathogens causing leaf spot and blight were introduced to young tomato seedlings through foliar sprays and root irrigation. A hyperspectral imaging system was established to capture spectral images of tomato leaves during the initial stages of disease development. The collected hyperspectral data were preprocessed using Savitzky-Golay (SG) smoothing, multiplicative scatter correction (MSC), standard normal variate transformation (SNV), and the first derivative (1^st Der). A support vector machine (SVM) detection model was trained to evaluate the effectiveness of these preprocessing methods. Comparative modeling identified the 1^st Der-SG preprocessing approach as the most effective, with an overall detection accuracy of 79.3% on the test dataset. Feature extraction was then performed on the preprocessed data using competitive adaptive reweighted sampling (CARS), the successive projections algorithm (SPA), uninformative variable elimination (UVE), and principal component analysis (PCA). Subsequently, the dung beetle optimization (DBO) algorithm was employed to tune the SVM and bidirectional long short-term memory (BiLSTM) networks, yielding the DBO-SVM and DBO-BiLSTM models. The most effective model combination, 1^st Der-SG-UVE-DBO-BiLSTM, achieved an overall accuracy of 97.3% on the test set. This research highlights the potential of hyperspectral imaging technology for the early detection of tomato leaf spot and blight diseases. The findings provide technical insights for tomato disease detection and a theoretical foundation for early disease identification in other crops.

Introduction

Tomato plants are highly adaptable and easy to cultivate, making them one of the most widely grown and economically significant vegetable crops worldwide^[1]. In recent years, driven by technological advancements and economic growth, tomato cultivation has garnered substantial support from governments, societies, and markets. Currently, China leads globally in both tomato cultivation area and yield, with tomato production and sales ranking among the top agricultural crops in the country. According to a survey by the International Organization of Seedkeepers, the global tomato cultivation area has surpassed 350 million hectares, with China accounting for an impressive 150 million hectares. Global tomato production exceeds 700 million tons, of which China contributes up to 400 million tons. Despite these achievements, tomato production faces significant challenges from various diseases, particularly leaf spot and wilt diseases, which severely impact yield. In modern agricultural practices, the targeted and rational application of pesticides is essential to ensure high-quality, high-yield production, and food safety. This approach not only effectively controls diseases but also minimizes environmental pollution caused by excessive pesticide use. However, the foundation for such targeted pesticide application lies in the accurate monitoring of plant growth and disease conditions, with early detection and precise identification of plant diseases being the most critical factors^[2]. Therefore, the early, rapid, and accurate diagnosis of vegetable diseases is vital for achieving high-quality, high-yield, and safe agricultural production. This is of great significance for promoting green, safe, and sustainable production of vegetables and fruits.

Hyperspectral imaging technology, an emerging method for crop disease detection, integrates imaging and spectral techniques to capture spectral data and analyze disease-related images. Researchers have achieved significant progress in disease detection using hyperspectral imaging, providing robust technical support for agricultural production. For instance, Wu employed hyperspectral imaging to develop SPA2-ELM and CNN models for the early detection of soybean rust disease, as well as a CNN-SVM model for disease severity classification. After preprocessing and feature extraction, these models achieved test set accuracies of 87.5% and 92.08%, respectively^[3]. Similarly, Zhong utilized a bioluminescence system combined with hyperspectral imaging to monitor tomato bacterial wilt disease. By applying SNV preprocessing and SPA feature extraction, the high-throughput linear discriminant analysis model achieved a detection rate exceeding 90% for tomato bacterial wilt disease^[4]. In another study, Smigaj et al. investigated the optimal predictive factors for pine needle blight disease using hyperspectral data integrated with LiDAR technology. Their findings demonstrated that combining hyperspectral imaging and LiDAR significantly enhanced the detection capability for pine needle blight disease^[5]. Abdulridha et al. further advanced the field by employing radial basis function neural networks to detect early and late stages of squash powdery mildew disease, achieving detection rates of 82% and 99% under laboratory conditions^[6].

Despite these advancements, the majority of current research is confined to identifying and detecting a limited number of pathogens within a single crop or cultivation system, such as maize leaf spot disease and soybean powdery mildew disease. Few studies have explored dual diseases in a single crop, such as soybean anthracnose and bacterial blight^[7], or pests like aphids and red spider mites in cotton^[8]. These studies underscore the efficacy of hyperspectral imaging techniques integrated with machine learning and deep learning for crop disease detection. However, research on the early detection of plant diseases remains limited, particularly when phenotypic symptoms are not yet apparent. Early detection entails identifying diseases before visible symptoms appear on leaves—a task that traditional methods, such as manual visual inspection or single-band imaging techniques, often fail to accomplish due to their proneness to errors, time-intensive nature, and inability to deliver early, accurate, and rapid detection^[9].

In this study, we focus on leaf spot and wilt diseases, which are prevalent in tomato plants, as research subjects. By leveraging hyperspectral imaging technology combined with machine learning methods, we aim to detect these diseases at an early stage, enabling timely and effective disease control. The main research content and findings are summarized as follows:

(1) Multiple preprocessing and feature extraction methods were applied to process hyperspectral data, enabling the establishment of various early detection models. Through comparative analysis, the combination of processing methods with the highest detection performance was identified.

(2) The dung beetle optimization (DBO) algorithm was introduced to optimize the parameters of support vector machines (SVM) and bidirectional long short-term memory (BiLSTM) neural networks. This approach identified the optimal parameter configurations for each model, significantly improving the accuracy of tomato leaf spot and blight detection.

(3) A unified model capable of simultaneously detecting both diseases was developed, thereby improving disease identification efficiency. This achievement provides a robust foundation for targeted management of crop diseases.

Discussion

1^st Der processing for further analysis

In this study, the hyperspectral data were processed with four preprocessing combinations (SNV-SG, MSC-SG, 1^st Der-SG, and 2^nd Der-SG), and the model with 1^st Der-SG preprocessing achieved the highest accuracy on the test set. This advantage can be attributed to the inevitable presence of instrumental noise during hyperspectral data acquisition, as well as the physical changes in samples over extended periods under varying environmental conditions, which lead to spectral baseline drift. Baseline drift significantly impairs model recognition performance; derivative preprocessing removes it, while SG smoothing suppresses the noise. In contrast, the scattering effect caused by uneven particle size and distribution in tomato samples is relatively minor, making scattering correction methods (MSC and SNV) less effective than baseline correction.
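As a rough illustration of the 1^st Der-SG step, the sketch below applies Savitzky-Golay smoothing and a Savitzky-Golay first derivative to a single spectrum in plain Python. The 5-point window, the quadratic polynomial, the function names, and the reflectance values are assumptions made for this example; the window size and polynomial order used in the study are not stated here.

```python
# Savitzky-Golay convolution weights for a 5-point window and a quadratic fit
# (tabulated in Savitzky & Golay, 1964). Window and order are assumed values.
SMOOTH_WEIGHTS = (-3 / 35, 12 / 35, 17 / 35, 12 / 35, -3 / 35)
DERIV_WEIGHTS = (-2 / 10, -1 / 10, 0.0, 1 / 10, 2 / 10)


def _convolve(spectrum, weights):
    """Apply the weights to every full window; the edge bands are dropped."""
    half = len(weights) // 2
    return [
        sum(w * v for w, v in zip(weights, spectrum[i - half:i + half + 1]))
        for i in range(half, len(spectrum) - half)
    ]


def sg_smooth(spectrum):
    """Smoothed spectrum (two bands shorter at each end)."""
    return _convolve(spectrum, SMOOTH_WEIGHTS)


def sg_first_derivative(spectrum, band_step=1.0):
    """First derivative with respect to wavelength, for evenly spaced bands."""
    return [d / band_step for d in _convolve(spectrum, DERIV_WEIGHTS)]


# Hypothetical reflectance values, and the same spectrum with a constant
# baseline shift added to every band.
spectrum = [0.21, 0.22, 0.25, 0.31, 0.40, 0.46, 0.49, 0.50]
shifted = [v + 0.05 for v in spectrum]

print(sg_smooth(spectrum))
print(sg_first_derivative(spectrum))
print(sg_first_derivative(shifted))  # matches the previous line up to rounding
```

Because the derivative weights sum to zero, a constant baseline offset cancels out, which is the baseline-correction effect discussed above.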

The superior performance of 1^st Der over 2^nd Der is due to the primary function of the first-order derivative, which is to eliminate low-frequency noise (e.g., baseline drift) and emphasize spectral trends. By calculating the slope of the spectral curve, the first-order derivative highlights changes in absorption peaks or inflection points, which are often closely associated with substance composition or disease characteristics. On the other hand, while the second-order derivative also removes low-frequency noise, it further amplifies high-frequency changes in the spectral data, such as sharpening absorption peaks. However, the second-order derivative can inadvertently amplify high-frequency noise (e.g., random fluctuations or measurement errors), leading to suboptimal results, especially when processing noisier data.
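The trade-off between derivative orders can be checked numerically. The sketch below uses plain finite differences as a stand-in for the derivative step (an assumption; the study applies derivatives together with SG smoothing), and the spectrum values are hypothetical.

```python
def difference(values, order=1):
    """Return the finite difference of the given order between adjacent bands."""
    out = list(values)
    for _ in range(order):
        out = [b - a for a, b in zip(out, out[1:])]
    return out


def energy(values):
    """Sum of squares, used here as a simple measure of noise power."""
    return sum(v * v for v in values)


# Hypothetical spectrum and a copy with a constant baseline offset.
spectrum = [0.20, 0.22, 0.27, 0.35, 0.40, 0.42, 0.43]
drifted = [v + 0.10 for v in spectrum]

# A single-band noise spike of unit size on an otherwise flat signal.
spike = [0, 0, 0, 1, 0, 0, 0]

print(difference(spectrum, 1))
print(difference(drifted, 1))        # same as above up to rounding: offset removed
print(energy(spike))                 # 1
print(energy(difference(spike, 1)))  # 2
print(energy(difference(spike, 2)))  # 6
```

In this toy case the energy of the spike doubles after one difference and rises sixfold after two, which illustrates why the second derivative is more sensitive to high-frequency noise.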

UVE feature extraction for further analysis

In the feature extraction process, the UVE algorithm gave the best modeling performance, achieving a recall rate of 100% for healthy samples. This superior performance is attributed to the UVE algorithm's higher robustness compared to CARS, particularly when handling noisy data such as spectral data. By eliminating uninformative variables, UVE effectively reduces the impact of noise on feature selection. Additionally, UVE does not rely on large-scale sampling competition during variable selection, making it more suitable for small-sample modeling scenarios like this study.
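The variable-screening step of UVE can be sketched as follows. This is a minimal illustration, assuming the regression coefficients from repeated leave-one-out fits (for example, PLS fits on the spectra with artificial noise variables appended) are already available. The function name `uve_keep`, the input layout, and the toy coefficients are hypothetical, and the cutoff used here is the largest absolute stability among the noise variables.

```python
from statistics import mean, stdev


def uve_keep(coef_runs, n_real):
    """Return indices of real variables whose stability exceeds the noise cutoff.

    coef_runs: one coefficient vector per leave-one-out fit; the first n_real
               entries belong to spectral bands, the rest to artificial noise.
    """
    n_total = len(coef_runs[0])
    stability = []
    for j in range(n_total):
        column = [run[j] for run in coef_runs]
        # Stability = mean coefficient divided by its standard deviation.
        stability.append(mean(column) / stdev(column))
    # Any band that is no more stable than pure noise is treated as uninformative.
    cutoff = max(abs(s) for s in stability[n_real:])
    return [j for j in range(n_real) if abs(stability[j]) > cutoff]


# Toy coefficients: three spectral bands followed by two noise variables.
coef_runs = [
    [0.90, 0.02, -0.50, 0.010, -0.020],
    [1.00, -0.03, -0.55, -0.020, 0.010],
    [1.10, 0.01, -0.45, 0.015, 0.005],
]

print(uve_keep(coef_runs, n_real=3))  # [0, 2]
```

Band 1 is dropped because its coefficient fluctuates around zero, so its stability is no higher than that of the noise variables.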

In contrast to the SPA algorithm, which selects variables stepwise to maximize orthogonality among them, UVE performs better for data with nonlinear relationships, such as hyperspectral data. Furthermore, compared to PCA, which primarily focuses on dimensionality reduction and may inadvertently lose important spectral features, UVE emphasizes variable selection, ensuring the retention of critical spectral information.
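The stepwise projection idea behind SPA can be sketched in a few lines. This is a minimal illustration of the projection phase only, assuming the starting band and the number of bands to keep are given (in the full algorithm these are chosen by validation). The name `spa_select` and the toy data are hypothetical.

```python
def spa_select(columns, start, n_select):
    """Pick bands one at a time, each as orthogonal as possible to the last pick.

    columns: one list per band, holding that band's values across all samples.
    n_select must not exceed the number of linearly independent bands.
    """
    # Work on copies so the caller's data are not modified.
    vectors = [list(col) for col in columns]
    selected = [start]
    while len(selected) < n_select:
        ref = vectors[selected[-1]]
        ref_norm_sq = sum(v * v for v in ref)
        best_index, best_norm_sq = None, -1.0
        for j, vec in enumerate(vectors):
            if j in selected:
                continue
            # Remove the component of this band that lies along the reference band.
            scale = sum(a * b for a, b in zip(vec, ref)) / ref_norm_sq
            projected = [a - scale * b for a, b in zip(vec, ref)]
            vectors[j] = projected
            norm_sq = sum(p * p for p in projected)
            if norm_sq > best_norm_sq:
                best_index, best_norm_sq = j, norm_sq
        # Keep the band with the largest remaining (non-redundant) part.
        selected.append(best_index)
    return selected


columns = [
    [1.0, 0.0, 0.0],  # band 0
    [2.0, 0.0, 0.0],  # band 1: collinear with band 0
    [0.0, 3.0, 0.0],  # band 2
    [1.0, 1.0, 0.5],  # band 3
]

print(spa_select(columns, start=0, n_select=3))  # [0, 2, 3]
```

Band 1 is never chosen because it carries no information beyond band 0, which is how SPA keeps collinearity among the selected bands low.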

Some limitations of this study

The experimental environment in this study was relatively controlled, with all samples collected indoors after uniform cultivation and inoculation. While this approach ensured consistency, it may have influenced the experimental results, which might not carry over directly to field conditions. Future research could expand the scope by collecting hyperspectral images of tomato leaves grown under field conditions. This would allow for the analysis of spectral data in more complex environments and facilitate real-time detection of tomato leaf spot and wilt diseases.

Due to experimental limitations, this study focused solely on tomato leaf spot and wilt diseases. Future work could extend the research to include other prevalent diseases, such as early blight, late blight, and gray mold, or explore alternative crops like eggplant and peppers. Implementing intercropping models for multiple crops and diseases could enable simultaneous detection and classification, supporting timely disease management and enhancing crop yield and quality.

References

[1]	Liu W, Liu Z, Huang C, Lu M, Liu J, et al. 2016. Statistical analysis of the occurrence and damage of major crop pests and diseases in the past decade. Plant Protection 42(5):1−9,46 doi: 10.3969/j.issn.0529-1542.2016.05.001
[2]	Yu J, Schumann AW, Cao Z, Sharpe SM, Boyd NS. 2019. Weed detection in perennial ryegrass with deep learning convolutional neural network. Frontiers in Plant Science 10:1422 doi: 10.3389/fpls.2019.01422
[3]	Wu Z. 2018. Research on early detection and grading method of soybean mosaic disease based on hyperspectral imaging. Thesis. Zhejiang University of Technology, China
[4]	Zhong L. 2021. Detection of tomato greening disease by bioluminescence and hyperspectral imaging. Thesis. Zhejiang University, China
[5]	Smigaj M, Gaulton R, Suárez JC, Barr SL. 2019. Combined use of spectral and structural characteristics for improved red band needle blight detection in pine plantation stands. Forest Ecology and Management 434:213−23 doi: 10.1016/j.foreco.2018.12.005
[6]	Abdulridha J, Ampatzidis Y, Roberts P, Kakarla SC. 2020. Detecting powdery mildew disease in squash at different stages using UAV-based hyperspectral imaging and artificial intelligence. Biosystems Engineering 197:135−48 doi: 10.1016/j.biosystemseng.2020.07.001
[7]	Liu S, Yu H, Sui Y, Kong L, Yu Z, et al. 2023. Hyperspectral data analysis for classification of soybean leaf diseases. Spectroscopy and Spectral Analysis 43(05):1550−55 doi: 10.3964/j.issn.1000-0593(2023)05-1550-06
[8]	Wang X, Deng J, Huang H, Deng Y, Jiang T, et al. 2019. Identification of pests in cotton fields based on hyperspectral data. Journal of South China Agricultural University 40(3):97−103 doi: 10.7671/j.issn.1001-411X.201807041
[9]	Riefolo C, Antelmi I, Castrignanò A, Ruggieri S, Galeone C, et al. 2021. Assessment of the hyperspectral data analysis as a tool to diagnose Xylella fastidiosa in the asymptomatic leaves of olive plants. Plants 10(4):683 doi: 10.3390/plants10040683
[10]	Uddin MP, Mamun MA, Hossain MA. 2020. PCA-based feature reduction for hyperspectral remote sensing image classification. IETE Technical Review 38(4):377−396 doi: 10.1080/02564602.2020.1740615
[11]	Silalahi DD, Midi H, Arasan J, Mustafa MS, Caliman JP. 2018. Robust generalized multiplicative scatter correction algorithm on pretreatment of near infrared spectral data. Vibrational Spectroscopy 97:55−65 doi: 10.1016/j.vibspec.2018.05.002
[12]	Nason GP. 2010. Wavelet Methods in Statistics with R. New York: Springer. 257 pp. doi: 10.1007/978-0-387-75961-6
[13]	Wu Y, Li X, Zhang Q, Zhou X, Qiu H, et al. 2023. Recognition of spider mite infestations in jujube trees based on spectral-spatial clustering of hyperspectral images from UAVs. Frontiers in Plant Science 14:1078676 doi: 10.3389/fpls.2023.1078676
[14]	Savitzky A, Golay MJE. 1964. Smoothing and differentiation of data by simplified least squares procedures. Analytical Chemistry 36:1627−39 doi: 10.1021/ac60214a047
[15]	Helland IS, Næs T, Isaksson T. 1995. Related versions of the multiplicative scatter correction method for preprocessing spectroscopic data. Chemometrics and Intelligent Laboratory Systems 29(2):233−41 doi: 10.1016/0169-7439(95)80098-T
[16]	Saunders C, Stitson MO, Weston J. 2022. Support vector machine. Computer Science 1(4):1−28
[17]	Li J, Peng Y, Chen L, Huang W. 2014. Near-infrared hyperspectral imaging combined with CARS algorithm to quantitatively determine soluble solids content in "Y" pear. Spectroscopy and Spectral Analysis 34:1264−69 doi: 10.3964/j.issn.1000-0593(2014)05-1264-06
[18]	Qiao S, Tian Y, Wang Q, Song S, Song P. 2021. Nondestructive detection of decayed blueberry based on information fusion of hyperspectral imaging (HSI) and low-field nuclear magnetic resonance (LF-NMR). Computers and Electronics in Agriculture 184:106100 doi: 10.1016/j.compag.2021.106100
[19]	Xu L, Wang X, Chen H, Bo X, Yong H, et al. 2022. Predicting internal parameters of kiwifruit at different storage periods based on hyperspectral imaging technology. Journal of Food Measurement and Characterization 16:3910−25 doi: 10.1007/s11694-022-01477-0
[20]	Araújo MCU, Saldanha TCB, Galvão RKH, Yoneyama T, Chame HC, et al. 2001. The successive projections algorithm for variable selection in spectroscopic multicomponent analysis. Chemometrics and Intelligent Laboratory Systems 57(2):65−73 doi: 10.1016/S0169-7439(01)00119-8
[21]	Gao J, Li X, Zhu F, He Y. 2013. Application of hyperspectral imaging technology to discriminate different geographical origins of Jatropha curcas L seeds. Computers and Electronics in Agriculture 99:186−93 doi: 10.1016/j.compag.2013.09.011
[22]	Liu S, Tian Y, Zhang F, Feng D. 2017. Non-destructive detection of apple diseases using hyperspectral images based on quadratic continuous projection method and BP artificial neural network. Food Science 38(08):277−82 doi: 10.7506/spkx1002-6630-201708043
[23]	Centner V, Massart DL, de Noord OE, de Jong S, Vandeginste BM, et al. 1996. Elimination of uninformative variables for multivariate calibration. Analytical Chemistry 68(21):3851−58 doi: 10.1021/ac960321m
[24]	Chen Y, Wang Z, Wang Z. 2017. Novel variable selection method based on uninformative variable elimination and ridge extreme learning machine: CO gas concentration retrieval trial. Spectroscopy and Spectral Analysis 37(01):299−305
[25]	Pearson K. 1901. On lines and planes of closest fit to systems of points in space. The London, Edinburgh, and Dublin Philosophical Magazine and Journal of Science 2:559−72 doi: 10.1080/14786440109462720
[26]	Johnson DE. 2005. Applied multivariate methods for data analysis. Beijing: Higher Education Press. pp. 93−111
[27]	Yu Y, Wang B, Zhang L. 2009. A fast data-oriented algorithm for principal component analysis. Pattern Recognition and Artificial Intelligence 22(4):568−73
[28]	Liu Z, Zhao W. 2021. Research on cross-media correlation analysis method based on semantic and distribution features. Journal of Information Science 40(5):471−78 doi: 10.3772/j.issn.1000-0135.2021.05.004
[29]	Liu W, Li Y, Luo J, Li W, Fu S. 2020. Sentiment analysis of Chinese short text based on BERT and BiLSTM. Journal of Taiyuan Normal University (Natural Science Edition) 19(04):52−58
[30]	Shi L, Zhang J, Gao Y, Wei L, Tao Y. 2023. Network intrusion detection based on Transformer and BiLSTM. Computer Engineering (03):29−36,57 doi: 10.19678/j.issn.1000-3428.0065135
[31]	Xue J, Shen B. 2023. Dung beetle optimizer: a new meta-heuristic algorithm for global optimization. The Journal of Supercomputing 79(7):7305−36 doi: 10.1007/s11227-022-04959-6
[32]	Yuan X, Yang F, Yang T. 2023. UAV 3D path planning method based on adaptive dung beetle algorithm. Radio Engineering 54:928−36 doi: 10.3969/j.issn.1003-3106.2024.04.016
[33]	Wu Q, Xu L, Zou Z, Wang J, Zeng Q, et al. 2022. Rapid nondestructive detection of peanut varieties and peanut mildew based on hyperspectral imaging and stacked machine learning models. Frontiers in Plant Science 13:1047479 doi: 10.3389/fpls.2022.1047479

Methods	Train health recall (%)	Train blight recall (%)	Train leaf spot recall (%)	Train overall accuracy (%)	Test health recall (%)	Test blight recall (%)	Test leaf spot recall (%)	Test overall accuracy (%)
SNV-SG	89.9	61.1	82.0	77.1	86.3	39.5	78.7	71.3
MSC-SG	90.2	66.4	87.4	81.6	80.9	59.3	81.6	73.3
1^st Der-SG	93.3	70.5	87.7	84.0	90.9	64.4	80.0	79.3
2^nd Der-SG	94.8	76.0	87.7	86.2	87.0	64.0	78.9	76.3

Run	Optimal number of principal components	RMSECV
1	8	0.4789
2	12	0.4794
3	9	0.4775
4	11	0.4793
5	7	0.4789

Model	Methods	Train health recall (%)	Train blight recall (%)	Train leaf spot recall (%)	Train overall accuracy (%)	Test health recall (%)	Test blight recall (%)	Test leaf spot recall (%)	Test overall accuracy (%)
SVM	CARS	95.3	78.5	76.2	83.3	90.0	70.0	86.0	82.0
SVM	SPA	95.3	73.8	80.8	83.3	92.0	60.0	90.0	80.7
SVM	UVE	98.7	82.0	93.3	91.3	92.0	74.0	98.0	88.0
SVM	PCA	94.7	69.8	75.5	80.0	84.0	68.0	72.0	74.7
DBO-SVM	CARS	94.7	86.6	90.7	90.7	94.0	86.0	92.0	90.7
DBO-SVM	SPA	93.3	89.2	88.8	90.4	88.0	84.0	88.0	86.7
DBO-SVM	UVE	98.0	87.3	98.0	94.4	98.0	86.0	98.0	94.0
DBO-SVM	PCA	96.0	88.5	92.1	92.2	94.0	88.0	92.0	91.3
BiLSTM	CARS	90.7	85.9	80.1	85.5	90.0	76.0	92.0	86.0
BiLSTM	SPA	94.7	86.5	84.2	88.4	86.0	84.0	86.0	85.3
BiLSTM	UVE	90.0	87.9	97.4	91.8	86.0	90.0	96.0	90.7
BiLSTM	PCA	93.3	77.0	92.8	87.8	92.0	72.0	96.0	86.7
DBO-BiLSTM	CARS	98.7	90.6	98.7	96.0	96.0	96.0	96.0	96.0
DBO-BiLSTM	SPA	98.0	90.5	95.4	94.7	94.0	92.0	98.0	94.7
DBO-BiLSTM	UVE	100.0	95.3	96.7	97.3	100.0	94.0	98.0	97.3
DBO-BiLSTM	PCA	98.0	91.2	98.7	96.0	96.0	98.0	94.0	96.0
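
The test-set columns of the model comparison table can be cross-checked with a short calculation. The sketch below recomputes overall accuracy as the class-size-weighted mean of the per-class recalls. The equal class sizes are an assumption (the per-class sample counts are not given in this section), chosen because they reproduce the reported test-set values for the rows shown. The function name is illustrative.

```python
def overall_accuracy(recalls, class_sizes):
    """Overall accuracy (%) from per-class recalls (%) and class sample counts."""
    correct = sum(r * n for r, n in zip(recalls, class_sizes))
    return correct / sum(class_sizes)


# Test-set recalls (healthy, blight, leaf spot) for the UVE rows of the table.
test_recalls = {
    "SVM": (92.0, 74.0, 98.0),
    "DBO-SVM": (98.0, 86.0, 98.0),
    "BiLSTM": (86.0, 90.0, 96.0),
    "DBO-BiLSTM": (100.0, 94.0, 98.0),
}
# Test-set overall accuracy as reported in the same rows.
reported = {"SVM": 88.0, "DBO-SVM": 94.0, "BiLSTM": 90.7, "DBO-BiLSTM": 97.3}

equal_sizes = (1, 1, 1)  # assumption: the three test classes are the same size

for model, recalls in test_recalls.items():
    computed = round(overall_accuracy(recalls, equal_sizes), 1)
    print(model, computed, reported[model])
```

With unequal class sizes the same function applies once the actual sample counts are supplied in place of `equal_sizes`.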
