Diagnosis of common cauliflower diseases using image processing and deep learning

Article type: Research article

Authors

1 Department of Biosystems Engineering, Faculty of Agriculture, University of Mohaghegh Ardabili, Ardabil, Iran

2 Department of Computer Engineering, Sharif University of Technology, Tehran, Iran

10.22034/jess.2023.391624.1995

Abstract

Cauliflower is a very healthy vegetable and an important source of nutrients. It is an excellent source of vitamin C, folate, vitamin K, B-complex vitamins and vitamin E. It also contains vital minerals such as calcium, magnesium, phosphorus, potassium, sodium and iron, and supplies them without adding cholesterol. It provides antioxidants and phytonutrients that can protect against cancer. It also contains fiber for weight loss and digestion, choline, which is essential for learning and memory, and many other important nutrients. Factors such as nutrient deficiency, weather conditions and diseases cause problems in the growth of cauliflower. Cauliflower is affected by various plant diseases, but the traditional method of identifying diseased cauliflowers and separating them is tedious and time-consuming, so imaging techniques and artificial neural networks can be used to identify them in the shortest possible time. The aim of this research is to classify cauliflower into 4 groups (healthy, infected with powdery mildew, infected with black rot and infected with bacterial soft rot) using image processing and the LeNet deep learning model. First, a total of 655 color images covering the 4 classes were prepared. 70% of the data was used for training the model. The results showed that healthy cauliflowers and cauliflowers infected with black rot and powdery mildew were identified with 100% accuracy. Cauliflowers infected with bacterial soft rot were identified with 99% accuracy.

Keywords


Article title [English]

Diagnosis of common cauliflower diseases using image processing and deep learning

Authors [English]

  • Razieh Pourdarbani 1
  • Sajad Sabzi 2
1 Dept. of Biosystems, University of Mohaghegh Ardabili, Ardabil, Iran
2 Dept. of Computer Engineering, Sharif University of Technology, Tehran, Iran
Abstract [English]

Abstract
Cauliflower is a very healthy vegetable and an important source of nutrients, naturally rich in fiber and B vitamins. It provides antioxidants and phytonutrients that can protect against cancer. It also contains fiber for weight loss and digestion, choline, which is essential for learning and memory, and many other important nutrients. Factors such as lack of nutrients, weather conditions and diseases cause problems in the growth of cauliflower. Cauliflower is affected by various plant diseases, but the traditional method of identifying diseased cauliflowers and separating them is tedious and time-consuming, so imaging techniques and artificial neural networks can be used to identify them in the shortest possible time. The purpose of this research is to classify cauliflower into 4 groups (healthy, infected with powdery mildew, infected with black rot and infected with bacterial soft rot) using image processing and the LeNet deep learning model. First, a total of 655 color images covering the 4 classes were prepared. 70% of the data was used for model training. The results showed that the model identified healthy cauliflowers and cauliflowers infected with black rot and powdery mildew with 100% accuracy. Cauliflowers infected with bacterial soft rot were identified with 99% accuracy.
Introduction
Factors such as lack of nutrients, weather conditions and diseases cause problems in the growth of cauliflower. Some of the known diseases of cauliflower are black rot, powdery mildew, bacterial soft rot and Phoma stem canker.
Plant diseases are of different types. Some of them may not be visible to the naked eye, so powerful microscopes are needed. Some of them can be identified using multispectral and hyperspectral imaging techniques. If disease symptoms are visible, image processing is a promising option.
The traditional method of identifying any agricultural product is visual inspection, which is a very tedious and time-consuming task. Computer vision or machine vision systems are applied in various fields such as medical operations, traffic monitoring, inspection of industrial and agricultural products, and the food industry, for tasks such as classification, harvesting, automatic evaluation and the non-destructive identification of different grain varieties. Some researchers have used machine vision systems to classify different seeds. Zhu et al. (2022) conducted a study on the classification of cauliflower based on magnetic resonance imaging. Healthy and stressed cauliflower were classified using LDA, QDA, PLSDA and CNN binary classifiers. The results showed a binary classification rate and F score of up to 95%. Hamouda et al. (2017) proposed an algorithm based on the HSV color space to distinguish cauliflower from weeds and soil. A region of interest (ROI) was determined by filtering each of the HSV channels. The results were compared with manual interpretation. The algorithm achieved a sensitivity and accuracy of 98.91% and 99.04%, respectively.
Considering the extraordinary nutritional and therapeutic properties of cauliflower, maintaining the health of the product and separating diseased plants is of particular importance. It is worth mentioning that the infected plants can be used to prepare compost. The purpose of this research is to classify cauliflower into 4 groups (healthy, infected with powdery mildew, infected with black rot and infected with bacterial soft rot) using image processing and deep learning techniques.
Methodology
First, a total of 655 images covering 4 classes (healthy, infected with powdery mildew, infected with black rot and infected with bacterial soft rot) were prepared. Of these images, 70% were allocated to training, 10% to validation and 20% to testing.
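The 70/10/20 allocation described above can be sketched as follows. This is only an illustration: the abstract does not say how the images were assigned to each subset, so the random shuffle, the fixed seed, the function name split_dataset and the file names are all assumptions.

```python
import random


def split_dataset(items, train_pct=70, val_pct=10, seed=0):
    """Shuffle items and split them into train/validation/test subsets.

    Assumption: a simple random split. The source only gives the
    percentages (70/10/20), not the assignment procedure.
    """
    items = list(items)
    random.Random(seed).shuffle(items)  # reproducible shuffle
    n = len(items)
    n_train = n * train_pct // 100      # integer arithmetic avoids float rounding
    n_val = n * val_pct // 100
    train = items[:n_train]
    val = items[n_train:n_train + n_val]
    test = items[n_train + n_val:]      # remainder goes to the test subset
    return train, val, test


# Hypothetical file names standing in for the 655 color images.
image_names = [f"img_{i:03d}.jpg" for i in range(655)]
train, val, test = split_dataset(image_names)
print(len(train), len(val), len(test))
```

With 655 images, integer rounding gives 458 training, 65 validation and 132 test images, so the test share is slightly above 20%.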
A 2D-LeNet convolutional neural network was used to analyze the images. This model was first introduced in 1998 in an article titled Gradient-Based Learning Applied to Document Recognition. The performance of the classifier was evaluated using the confusion matrix and the receiver operating characteristic (ROC) curve. The confusion matrix relates the actual samples to the ones predicted by the classifier. Some of the criteria derived from it are examined below.
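Before turning to the individual criteria, the layer layout of a LeNet-style network can be sketched by tracing tensor shapes through it. The abstract does not give the exact configuration used, so this follows the classic LeNet-5 layout from the 1998 paper (two 5x5 convolutions, each followed by 2x2 subsampling, then fully connected layers of 120 and 84 units). The 32x32 input size, the 3 color channels and the helper names conv_out and lenet_shapes are assumptions; only the 4 output classes come from the text.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1


def lenet_shapes(input_size=32, channels=3, num_classes=4):
    """Return (layer name, output shape) pairs for a LeNet-5 style network.

    Assumptions: 32x32 input and RGB channels. Only num_classes=4
    (healthy, powdery mildew, black rot, bacterial soft rot) is from the text.
    """
    shapes = [("input", (channels, input_size, input_size))]
    s = conv_out(input_size, kernel=5)       # 5x5 convolution, 6 feature maps
    shapes.append(("conv1", (6, s, s)))
    s = conv_out(s, kernel=2, stride=2)      # 2x2 subsampling
    shapes.append(("pool1", (6, s, s)))
    s = conv_out(s, kernel=5)                # 5x5 convolution, 16 feature maps
    shapes.append(("conv2", (16, s, s)))
    s = conv_out(s, kernel=2, stride=2)      # 2x2 subsampling
    shapes.append(("pool2", (16, s, s)))
    shapes.append(("flatten", (16 * s * s,)))
    shapes.append(("fc1", (120,)))
    shapes.append(("fc2", (84,)))
    shapes.append(("output", (num_classes,)))
    return shapes


for name, shape in lenet_shapes():
    print(f"{name:8s} {shape}")
```

Tracing shapes this way is a quick check that a chosen input size is compatible with the convolution and pooling stack before any training framework is involved.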
Sensitivity or recall:
The percentage of actual positive samples that are correctly identified (Equation 2).
(2) Recall = TP/(TP+FN) × 100
Accuracy:
The overall percentage of correct answers given by the system (Equation 3).
(3) Accuracy = (TP+TN)/(TP+FN+FP+TN) × 100
Precision:
The percentage of samples predicted as positive that are actually positive (Equation 4).
(4) Precision = TP/(TP+FP) × 100
F measure:
The harmonic mean of Recall and Precision (Equation 5).
(5) F_measure = (2 × Recall × Precision)/(Recall + Precision)
Here, TP is the number of samples of a class that are correctly classified. TN is the sum of the samples on the main diagonal of the confusion matrix minus the number of samples correctly classified in the class under study. FN is the sum of the samples in the row of the class under study minus the number of samples correctly classified in that class. FP is the sum of the samples in the column of the class under study minus the number of samples correctly classified in that class.
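The definitions above can be turned into a short calculation. The sketch below follows the text's own definitions of TP, TN, FN and FP (including TN as the diagonal sum minus TP) and assumes that rows of the confusion matrix hold the actual classes and columns the predicted ones. The matrix values are hypothetical and are not the study's results; the function name per_class_metrics is likewise illustrative.

```python
def per_class_metrics(cm, k):
    """Recall, accuracy, precision and F measure (Equations 2 to 5) for class k.

    cm[i][j] = number of samples of actual class i predicted as class j
    (assumed orientation). TN follows the definition given in the text:
    sum of the main diagonal minus TP.
    """
    tp = cm[k][k]
    tn = sum(cm[i][i] for i in range(len(cm))) - tp
    fn = sum(cm[k]) - tp                   # row sum minus TP
    fp = sum(row[k] for row in cm) - tp    # column sum minus TP
    recall = 100 * tp / (tp + fn) if tp + fn else 0.0
    precision = 100 * tp / (tp + fp) if tp + fp else 0.0
    accuracy = 100 * (tp + tn) / (tp + fn + fp + tn)
    denom = recall + precision
    f_measure = 2 * recall * precision / denom if denom else 0.0
    return {"recall": recall, "accuracy": accuracy,
            "precision": precision, "f_measure": f_measure}


# Hypothetical 4-class confusion matrix (NOT the study's data).
# Class order: healthy, powdery mildew, black rot, bacterial soft rot.
cm = [
    [30, 0, 0, 0],
    [0, 35, 0, 0],
    [0, 0, 32, 0],
    [1, 0, 0, 33],
]
for k, name in enumerate(["healthy", "powdery mildew", "black rot", "soft rot"]):
    m = per_class_metrics(cm, k)
    print(name, {key: round(value, 2) for key, value in m.items()})
```

In this made-up example the single soft rot sample predicted as healthy lowers the recall of the soft rot class and the precision of the healthy class, while the two untouched classes keep 100% on every measure.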
Conclusion
The model reached a training accuracy of over 90% after the 15th epoch. After the 35th epoch, the validation and training curves almost coincided, and training and validation proceeded with very high accuracy and minimum error.
The recall-accuracy results showed an index of 100% for each class, which indicates that the classifier successfully distinguished the 4 classes from one another.
According to the receiver operating characteristic (ROC) curve for each class, all curves lie close to the value of one, which indicates that all classes were successfully identified and classified by the classifier.
Finally, the main results are as follows:
The model was trained and validated with a high accuracy of 99% and minimum error.
Healthy cauliflowers and cauliflowers infected with black rot and powdery mildew were identified with 100% accuracy.
Cauliflowers infected with bacterial soft rot were identified with 99% accuracy.

Keywords [English]

  • cauliflower
  • image processing
  • disease
  • deep learning