Smart Agriculture & Deep Learning

Sep 23, 2021 | Deepclever Ai Articles

Smart Agriculture, or more precisely precision agriculture, is an innovative model for managing agricultural activities.
Through the use of sensors, transducers and hardware/software systems, it reduces the use of pesticides and the consumption of resources while improving the quality of crops and finished products.

Deep Learning techniques, and in particular convolutional neural networks (CNNs), have quickly established themselves as the state of the art for image processing and computer vision tasks.
Among the many applications that can bring the Smart concept to the agri-food world are, for example:

  • the detection of pathologies to preventively identify diseases on the plants of a crop;
  • the optimisation of automated irrigation;
  • the recognition and classification of contamination on cereals and derivatives.

The evolution of the Smart City concept, a model for managing the dynamics of urban life (e.g. traffic, sustainability, security) using AI principles, has led to the development and spread of intelligent applications in the agri-food sector.

Smart Agriculture has made applications that were originally designed for more technology-driven sectors accessible to the agri-food world as well. Companies with a greater propensity for research and development can now increasingly differentiate themselves by moving beyond traditional methodologies.

Using Wi-Fi sensors placed in crops or mounted on drones, for example, it is possible to collect high-resolution images and data from cultivated land:
soil moisture level, plant health, ripening level and much more.
All of this data can be analysed in the cloud with Deep Learning algorithms to identify levels of health, stress and the onset of diseases. This information makes it possible to better regulate the supply of water, pesticides or fertilizers.

According to projections, agricultural turnover is expected to increase by up to 20% in the coming years, for a global market that could reach a record value of 11.23 billion dollars by 2022.

1 – Convolutional Neural Networks at the service of Smart Agriculture

Deep Learning models for image recognition and segmentation have proved to be the state of the art, outperforming even the most established families of algorithms. They have also proved to be an extremely versatile tool in the Smart Agriculture field.

This is particularly true of CNNs, a class of deep architectures inspired by the functioning of the visual cortex of the human brain and widely used in computer vision.

In general, CNNs are formed by a sequence of convolutional and pooling layers, followed by dense classification layers. [1]

Figure: Architectural diagram of a CNN. After the input layer, convolutional and pooling layers alternate up to the last fully connected layer, which produces the output of the network. [3]

A convolution is defined by a matrix called a kernel, which represents a filter that can be applied to the input image. This matrix is generally smaller than the input image and is applied by sliding the kernel across it, resulting in a feature map. Each kernel requires the optimisation of a number of parameters during the training phase, which can be obtained as follows:
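
A common way to write this count is shown here as a sketch. The notation is assumed rather than taken from the original: K_w and K_h are the kernel width and height, C_in is the number of input channels, and C_out is the number of kernels in the layer.

```latex
P_{\text{kernel}} = K_w \cdot K_h \cdot C_{in} + 1,
\qquad
P_{\text{layer}} = \left( K_w \cdot K_h \cdot C_{in} + 1 \right) \cdot C_{out}
```

The +1 accounts for the bias term of each kernel. For example, 64 kernels of size 3 × 3 applied to an RGB image (C_in = 3) give (3 · 3 · 3 + 1) · 64 = 1,792 parameters.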

A convolutional layer is usually followed by a pooling layer, which performs subsampling: it decreases the size of its input while preserving its main characteristics.
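
As an illustration of subsampling, the sketch below implements 2 × 2 max pooling with stride 2 in plain Python. Max pooling is only one common choice, assumed here for the example. The function name and the sample feature map are hypothetical.

```python
def max_pool_2x2(image):
    """Downsample a 2D list of numbers by taking the max of each 2x2 block."""
    rows, cols = len(image), len(image[0])
    pooled = []
    # Step over the image two rows and two columns at a time (stride 2).
    # A trailing odd row or column is dropped.
    for r in range(0, rows - 1, 2):
        pooled_row = []
        for c in range(0, cols - 1, 2):
            block = (image[r][c], image[r][c + 1],
                     image[r + 1][c], image[r + 1][c + 1])
            pooled_row.append(max(block))
        pooled.append(pooled_row)
    return pooled


feature_map = [
    [1, 3, 2, 0],
    [4, 6, 1, 1],
    [0, 2, 9, 5],
    [1, 1, 3, 7],
]
pooled = max_pool_2x2(feature_map)
print(pooled)  # [[6, 2], [2, 9]]
```

The 4 × 4 input becomes a 2 × 2 output that still carries the strongest response of each region.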

The dense part of the network is composed of a series of fully connected layers that recombine the features extracted by the preceding convolutional layers.

It is not uncommon for a CNN to contain other types of layers, such as dropout, which is used to avoid overfitting and help the network generalise, or batch normalisation, which is used to normalise the variability across data batches.
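
To show what a dropout layer does, here is a minimal sketch of the widely used "inverted dropout" formulation in plain Python. The function name, its parameters and the sample values are assumptions made for this example, not part of any specific library.

```python
import random


def dropout(activations, rate, training, rng=None):
    """Inverted dropout on a flat list of activations.

    During training each value is zeroed with probability `rate` and the
    survivors are scaled by 1 / (1 - rate); at inference the input is
    returned unchanged.
    """
    if not 0.0 <= rate < 1.0:
        raise ValueError("rate must be in [0, 1)")
    if not training or rate == 0.0:
        return list(activations)
    rng = rng or random.Random()
    keep_scale = 1.0 / (1.0 - rate)
    return [a * keep_scale if rng.random() >= rate else 0.0
            for a in activations]


values = [1.0, 2.0, 3.0, 4.0]
train_out = dropout(values, rate=0.5, training=True, rng=random.Random(0))
eval_out = dropout(values, rate=0.5, training=False)
print(train_out)
print(eval_out)
```

Because random units are silenced at every training step, the network cannot rely on any single feature, which is what helps it generalise.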

CNNs are suited to numerous applications because they need little image pre-processing and extract the relevant features during learning, which reduces feature engineering costs. Furthermore, numerous pre-trained networks that are already prepared to extract features from images can be found in the literature.

2 – Quality and efficiency

The entire Made in Italy agri-food sector benefits from the integration of smart technologies into its processes.

The acquired photographs, enriched and cross-referenced with weather and soil data, can indicate and predict which problems may afflict a crop: low soil moisture, reduced mineral content of the soil, or outbreaks of harmful insects.

AI can process this data with mathematical models to return forecasts and estimates.
It provides an advanced environment for integrating, connecting and analysing in real time the data generated in the various stages of a company’s production cycle.
The focus can be shifted from the entire plantation to a single plant.
Through the use of drones capable of moving autonomously between the rows of a plantation, it will become possible to identify each individual fruit with the utmost precision. The collected data can then be sent in real time to a remote server, where it is cleaned, stored and processed into explanatory maps.

2.1 – Detection of pathologies on the plants of a crop

Image processing methods are used to solve the problem of detecting diseases and infestations on the plants of a crop.

Over the years, various methods have been proposed to address this problem, but they have always required human intervention to identify diseases correctly. CNNs have proved useful for disease detection tasks. In particular, excellent performance can be achieved, and the computational cost of training reduced, by relying on transfer learning models based on the DenseNet architecture. [2] DenseNet has a distinctive design that connects each layer to every other layer in a feed-forward fashion. Whereas a traditional CNN with L levels has L connections, one between each level and the next, DenseNet has L(L+1)/2 direct connections.

For each level, the feature maps of all previous levels are used as input, and its own feature maps are used as input to all subsequent levels. The DenseNet architecture has a number of advantages: it alleviates the vanishing-gradient problem, strengthens feature propagation through the layers, encourages feature reuse and reduces the number of parameters to be trained.
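
The connectivity pattern described in [2] can be made concrete with a small sketch. It counts the direct connections in a block of L layers and lists how many feature maps each layer of a dense block receives. The function names and the example values (64 initial channels, growth rate 32) are chosen here purely for illustration.

```python
def direct_connections(num_layers, dense):
    """Number of direct layer-to-layer connections in a block of L layers."""
    if dense:
        # Every layer is connected to all subsequent layers: L(L+1)/2.
        return num_layers * (num_layers + 1) // 2
    # Plain feed-forward: one connection between each layer and the next.
    return num_layers


def dense_block_input_channels(initial_channels, growth_rate, num_layers):
    """Input channel count seen by each layer of a dense block.

    Layer i (0-based) receives the block input plus the `growth_rate`
    feature maps produced by each of the i layers before it.
    """
    return [initial_channels + i * growth_rate for i in range(num_layers)]


print(direct_connections(5, dense=False))     # 5
print(direct_connections(5, dense=True))      # 15
print(dense_block_input_channels(64, 32, 4))  # [64, 96, 128, 160]
```

Each layer only has to add a small number of new feature maps, because everything computed earlier remains available to it. This is where the parameter savings come from.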

Figure: A classic feed-forward model (a), in which the output of each layer is the input of the next layer, generally performs worse than a DenseNet structure (b), in which the output of each layer is the input of all subsequent layers. [2]

Transfer learning starting from the basic DenseNet structure requires the addition of a flatten level and fully connected classification levels. Although such a model is relatively lean and performs well compared to classic deep learning models, its limitation remains the need to rely on cloud services to receive the predictions produced by the algorithm in real time.

To make a tool of this type available to users in real time, running on the most common mobile devices, it is necessary to simplify the DenseNet structure by reducing the number of levels in the starting architecture. [3]

Figure: Lightweight version of DenseNet that can be used for the detection of pathologies on the plants of a crop. [3]

Along with the simplification of the architecture, it is important to evaluate the right trade-off between performance and the size of the input images in different scenarios. The quality and resolution of the images supplied as input to the network affect the number of parameters to be trained. Assuming we have an input of size W × H × M and we have set the stride and padding values to 1, the cost of a standard convolution operation in a CNN can be calculated as follows:
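
A common way to express this cost is as a count of multiply-accumulate operations. The form below is a sketch that assumes the layer applies N kernels of size K × K and that the padding preserves the W × H spatial size. K and N are notation assumed for this example.

```latex
\text{cost} = W \cdot H \cdot M \cdot K^{2} \cdot N
```

The cost grows linearly with the number of input pixels W · H, so the input resolution directly drives the computation required by every convolutional level.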

If we take another input with dimensions W’ × H’, where W < W’ and H < H’, the cost of each level of the CNN grows accordingly: an input of size W’ × H’ requires higher computational costs than an input of size W × H.
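
The effect of the input size can be checked numerically. This sketch assumes the usual multiply-accumulate estimate for a stride-1, size-preserving convolution, W · H · M · K² · N, with K the kernel size and N the number of kernels. The function name and the example sizes (128 × 128 versus 256 × 256, 3 × 3 kernels, 32 kernels) are hypothetical.

```python
def conv_layer_cost(width, height, in_channels, kernel_size, out_channels):
    """Multiply-accumulate count of one stride-1, size-preserving convolution."""
    return width * height * in_channels * kernel_size ** 2 * out_channels


small = conv_layer_cost(128, 128, 3, kernel_size=3, out_channels=32)
large = conv_layer_cost(256, 256, 3, kernel_size=3, out_channels=32)
print(small)          # 14155776
print(large)          # 56623104
print(large / small)  # 4.0
```

Doubling both sides of the image quadruples the cost of the layer, which is why the input resolution matters so much on a mobile device.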

A lightweight version of DenseNet, with the right input size, can be light enough to run even on a mobile device while maintaining high performance in disease detection. This makes it possible to equip personnel in the field directly with tools that provide decisive help in recognising the onset of pathologies or infestations, returning highly granular information that allows them to act in a localised way in the care of crops. The model is immediately usable on IoT devices such as tablets and smartphones.

3 – Eco-sustainability

The search for ever-higher standards of eco-sustainability is an urgent need in our lifetime.

Experimenting with and adopting effective precision agriculture practices that leverage the IoT now allows us to keep under control not only the carbon footprint of a company but also its hydrological and microbiological impact on ecosystems.

Opting for green solutions contributes to building a positive corporate image, thanks to sustainability values that are highly sought after in a world with a growing interest in these issues, where consumers make increasingly conscious choices on the market.

Investing in optimising the water demand of crops has cascading beneficial effects for the companies that decide to work in this direction. A reasoned and modulated water distribution, closely matched to the needs of the soil and the crop in question, saves 30-50% of water resources and also lowers energy consumption and optimises the use of fertilizers.

3.1 – Automated precision irrigation

The ability of CNNs to effectively process large data sets makes it possible to accurately estimate soil moisture levels directly from aerial images.
It is estimated that plants absorb only between 5% and 30% of the total water used. It is possible to create a model that takes aerial images of a crop as input and outputs a map of the soil moisture level, in order to direct irrigation towards drier areas.

To tackle a task of this kind, two CNN-based approaches can be tried:

  • the first involves cropping the aerial image to isolate a single plant and estimate its dissipation rate,
  • the second takes the entire aerial image as input and returns as output as many dissipation rates as there are plants in the image.

Both approaches involve an architecture with two convolutional levels and three levels of pooling. The second model offers better performance as it has information on all plants and therefore also on the correlation between these and the soil conditions.

In any case, both CNN-based models outperform more classic algorithms such as SVM. A distinctive feature of this type of model is its robustness to noise in the input images, a fundamental characteristic for effectiveness in real applications.

Figure 4: Schematic of the first model (a), which takes a crop of a single plant as input, and of the second model (b), which takes images of the entire vineyard as input. [6]

Once the model has been trained, it is possible to control the irrigation rate based on the output produced. If the model predicts a dissipation rate greater or less than an arbitrary constant, the irrigation level can be proportionally increased or decreased, respectively, as follows:
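
One proportional rule consistent with this description can be written as below. Its exact form is an assumption made for illustration and is not necessarily the rule used in [6]. Here u is the current irrigation rate, \hat{d} the predicted dissipation rate, a the reference constant and k the proportional gain.

```latex
u_{\text{new}} = u + k \, (\hat{d} - a)
```

When \hat{d} > a the correction is positive and irrigation increases. When \hat{d} < a it is negative and irrigation decreases.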

With the experimentally determined values a = 1 and k = 0.25, it is possible to save up to 50% of irrigation water by diverting water resources to areas of the land where the dissipation rate is higher. One such model is already in use at the Cowell Ranch Symphony Vineyards in Snelling, California. The high level of accuracy with which soil moisture can be monitored makes applications like this a fundamental tool for optimising resources.
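
The following sketch applies a proportional adjustment with a = 1 and k = 0.25 to a handful of plants. The update form, rate + k · (dissipation − a), is an assumption made for illustration. The function name, the starting rate and the predicted dissipation values are also hypothetical.

```python
def adjust_irrigation(rate, predicted_dissipation, a=1.0, k=0.25):
    """Proportional update: more water where dissipation exceeds `a`, less below.

    The form rate + k * (dissipation - a) is an assumed sketch; the result is
    clamped at zero because an irrigation rate cannot be negative.
    """
    return max(0.0, rate + k * (predicted_dissipation - a))


# Hypothetical predicted dissipation rates for four plants, same starting rate.
predictions = [0.6, 1.0, 1.4, 2.0]
new_rates = [adjust_irrigation(1.0, d) for d in predictions]
print(new_rates)
```

Plants whose predicted dissipation is below the reference receive less water than before, while those above it receive more, so water is redirected rather than simply added.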

4 – Safety

The agri-food sector has a direct relationship with people’s health.

Identification systems for contaminants and defects in industrial products can be applied directly to the production line.

Using an X-ray detector and suitably trained, highly precise neural networks, it is possible to identify in real time any non-compliance with the required standards.

This would increase the guarantees provided to the final consumer on the quality and safety standards respected during processing.

4.1 – Recognition of contamination on cereals and derivatives

Cereals such as rice, wheat and corn often suffer from insect pest contamination.
Consequently, food products processed from these raw materials are also subject to similar contamination, especially if processed in unsanitary conditions.

Recognising the presence of these insects requires the analysis of hundreds of product samples and the subsequent screening necessary to identify the presence of these and other possible contaminations.

The entire process, if entrusted only to human personnel, would be extremely expensive in terms of time and cost, as well as susceptible to errors. With the advent of Deep Learning and CNNs it has been possible to effectively automate this task. [4]

A deep learning model based on the VGG16 architecture is used, with weights trained on the ImageNet dataset and transfer learning to specialise the algorithm in recognising food contamination from insects. [5]

Figure 6: The basic structure of VGG16 with the weights of the frozen convolutional levels and the fully-connected levels that are specialized during training. [5]

The weights of the convolutional levels are kept frozen during training, while the fully connected levels are customised and trained on the specific task. The input images are cropped to eliminate as much background as possible, producing a crop of 224 × 224 pixels: exactly the input size required by a model like VGG16.
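
To give a sense of how much of the network stays frozen, the sketch below counts the parameters of the standard VGG16 convolutional base and of a replacement classifier head. The head (256 hidden units, 10 output classes) is a hypothetical choice for this example and is not taken from [5]. The function names are likewise illustrative.

```python
# VGG16 convolutional base: output channels per 3x3 conv layer, "M" = max pooling.
VGG16_CONV = [64, 64, "M", 128, 128, "M", 256, 256, 256, "M",
              512, 512, 512, "M", 512, 512, 512, "M"]


def conv_base_parameters(layout, in_channels=3, kernel_size=3):
    """Total weights and biases of the convolutional layers in `layout`."""
    total = 0
    for item in layout:
        if item == "M":  # pooling layers have no trainable parameters
            continue
        total += (kernel_size * kernel_size * in_channels + 1) * item
        in_channels = item
    return total


def dense_head_parameters(flat_features, hidden_units, num_classes):
    """Weights and biases of a flatten -> hidden -> output classifier head."""
    hidden = (flat_features + 1) * hidden_units
    output = (hidden_units + 1) * num_classes
    return hidden + output


frozen = conv_base_parameters(VGG16_CONV)
# A 224x224 input halved by five pooling layers leaves a 7x7x512 volume.
flat_features = 7 * 7 * 512
# Hypothetical head: 256 hidden units and 10 classes (not from the source).
trainable = dense_head_parameters(flat_features, hidden_units=256, num_classes=10)
print(frozen)     # 14714688
print(trainable)  # 6425354
```

With this hypothetical head, roughly 14.7 million parameters stay frozen and about 6.4 million are trained. This is why transfer learning is far cheaper than training the whole network from scratch.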

A model like this achieves an error rate of less than 10% and saves time thanks to the ability of CNNs to identify the features that best describe an input image.

This type of application can already be made available on IoT devices in order to be used by food industry operators in the screening phase, as well as by health professionals in the investigation phase.

5 – In conclusion

Offering these possibilities to the agricultural world means being able to contribute to the intelligent fight against plant diseases, to the mapping of fields, to the reduction of water and energy waste.

CNN-based applications are the most promising in this field: although they may still be heavy models to train, they remain the best class of algorithms in terms of performance and the least expensive to implement in terms of human effort, thanks to their ability to automate the construction of features relevant to the task in question.

In summary, we can identify the benefits and disadvantages of an approach aimed at Smart Agriculture in the following points:

  • Pros
    • Monitoring of crops and soil through aerial images and real-time analytics
    • Greater screening capacity for pathologies and contaminations in the agri-food chain
    • Optimised management of water resources and pesticides
    • Reduction of environmental impact
    • Improvement of the quality of crops and finished products
  • Cons
    • Necessary integration of hardware in the field such as sensors and drones
    • Dependence on the quality and quantity of data available
    • Models to be adapted for deployment on IoT devices


[1] Zheng, C., Sun, D. W., & Zheng, L. (2006). Recent developments and applications of image features for food quality evaluation and inspection–a review. Trends in Food Science & Technology, 17(12), 642-655.

[2] Huang, G., Liu, Z., Van Der Maaten, L., & Weinberger, K. Q. (2017). Densely connected convolutional networks. In Proceedings of the IEEE conference on computer vision and pattern recognition (pp. 4700-4708).

[3] Ale, L., Sheta, A., Li, L., Wang, Y., & Zhang, N. (2019, December). Deep Learning Based Plant Disease Detection for Smart Agriculture. In 2019 IEEE Globecom Workshops (GC Wkshps) (pp. 1-6). IEEE.

[4] Karunakaran, C., Jayas, D. S., & White, N. D. G. (2004). Identification of wheat kernels damaged by the red flour beetle using X-ray images. Biosystems Engineering, 87(3), 267-274.

[5] Wu, L., Liu, Z., Bera, T., Ding, H., Langley, D. A., Jenkins-Barnes, A., … & Xu, J. (2019). A deep learning model to recognize food contaminating beetle species based on elytra fragments. Computers and Electronics in Agriculture, 166, 105002.

[6] Tseng, D., Wang, D., Chen, C., Miller, L., Song, W., Viers, J., … & Goldberg, K. (2018, August). Towards automating precision irrigation: Deep learning to infer local soil moisture conditions from synthetic aerial agricultural images. In 2018 IEEE 14th International Conference on Automation Science and Engineering (CASE) (pp. 284-291). IEEE.