Can Deep Learning approaches be a valid solution in the authentication process of EVOO? Let’s find out.
In the context of the modern food industry and increasing globalization, the authenticity of food has become a major issue. Authenticity means that a food product is genuine in terms of its nature, origin, identity and declared characteristics. However, as global supply chains become more intertwined, it is increasingly difficult to trace the origin and production process of food, which raises the risk of fraud such as adulteration or false geographical indications.
To address these challenges, technologies and analytical methods have been developed to detect specific characteristics and identify possible fraud. Techniques such as spectroscopy, chromatography and electronic sensors enable in-depth analysis of food samples. There are two strategies for analysing samples: targeted and non-targeted analysis. The targeted approach focuses on quantifying specific components, offering great precision at the risk of overlooking unknown elements present in the food.
The non-targeted approach, on the other hand, provides a more complete view by detecting all possible components present in the samples. Both strategies often generate large volumes of complex data, which require advanced tools to extract useful information and overcome the limitations of manual data analysis. Deep Learning, one of the most advanced technologies for data analysis, addresses these challenges by detecting more sophisticated patterns and relationships within data, which improves the accuracy of predictions and classifications.
Deep learning, a branch of machine learning, uses multi-layered neural networks (deep neural networks) to simulate the complex decision-making abilities of the human brain. Key benefits of Deep Learning include the ability to handle large amounts of noisy or abnormal data through optimization and pre-processing techniques; the ability to analyse high-dimensional data such as spectrometric analysis results; and the ability to model complex, non-linear relationships, such as evaluating the freshness of food based on parameters such as colour and odour.
In addition, Deep Learning is particularly useful in multi-class classifications, allowing different food varieties, such as fruits, vegetables or fish, to be distinguished with high precision. An additional benefit is the ability to address overfitting, which often occurs when the training dataset is limited. Data augmentation techniques, for instance, can simulate real variations and introduce greater diversity, making models more robust and generalizable. Finally, Deep Learning is well suited to the analysis of unstructured data such as images, labels, descriptions or reviews, using specific models such as convolutional neural networks (CNN) for images and recurrent neural networks (RNN) for text.
Extra virgin olive oil (EVOO) is one of the pillars of the Mediterranean diet, renowned for its unmistakable taste and health benefits. However, its economic and cultural value often makes it the subject of food fraud. Adulteration with lower-quality oils, such as hazelnut oil or high-oleic sunflower oil, compromises product quality and safety. In this context, Deep Learning has emerged as a revolutionary tool for ensuring the authenticity of EVOO, offering fast and effective methods for detecting adulteration and protecting consumers. Deep Learning models are trained on data sets that include pure and adulterated samples, obtained by advanced analytical techniques such as Low Field Nuclear Magnetic Resonance (LF-NMR), mass spectrometry, and GC-IMS.
PRACTICAL APPLICATIONS: RECENT STUDIES ON EVOO
Feed-forward multilayer to improve the performance of olive oil classification
To date, the official classification of olive oils involves panel tests, i.e. a standardized sensory analysis conducted by experts. However, this method is slow, expensive, and subject to variability, which has led to the search for alternative approaches. This study uses deep learning techniques to classify olive oil samples into the categories EVOO (extra virgin), VOO (virgin) and LOO (lampante) based on data obtained by GC-IMS (gas chromatography-ion mobility spectrometry).
The data came from two harvests (2014-2015 and 2015-2016), for a total of 701 samples. Each sample is described by 118 attributes: 113 are marker intensities, and the rest identify and describe the sample (name, class, baseline value, and the position of the reactant ion peak, or RIP). The process included an initial data preprocessing phase, which is crucial for improving the accuracy of the model. Two basic steps were carried out. First, the samples were normalized by dividing the marker values by the maximum RIP value, in order to reduce potential instrumental variation. Second, the markers were autoscaled (standardized) so that each has zero mean and unit variance.
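The two preprocessing steps can be sketched in a few lines of dependency-free Python. This is only an illustration under stated assumptions: the function names and the toy values are hypothetical, each sample is assumed to be divided by its own maximum RIP value, and the population standard deviation is assumed for autoscaling; the study may differ on either point.

```python
from statistics import mean, pstdev

def rip_normalize(samples, rip_max):
    """Divide every marker intensity of a sample by that sample's maximum RIP value."""
    return [[v / r for v in row] for row, r in zip(samples, rip_max)]

def autoscale(samples):
    """Standardize each marker (column) to zero mean and unit variance."""
    columns = list(zip(*samples))
    means = [mean(c) for c in columns]
    # Fall back to 1.0 for constant markers to avoid dividing by zero.
    stds = [pstdev(c) or 1.0 for c in columns]
    return [[(v - m) / s for v, m, s in zip(row, means, stds)] for row in samples]

# Hypothetical toy data: 3 samples x 2 markers (the study used 113 markers).
markers = [[2.0, 4.0], [3.0, 9.0], [10.0, 5.0]]
rip_max = [2.0, 3.0, 5.0]  # assumed per-sample maximum RIP values

normalized = rip_normalize(markers, rip_max)
scaled = autoscale(normalized)
```

After these two steps every marker column in `scaled` has mean 0 and standard deviation 1, so no single marker dominates training simply because of its measurement scale.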
This preprocessing produced more consistent data for training. The model used is a feed-forward multilayer artificial neural network, consisting of an input layer, one or more hidden layers, and an output layer. The input layer receives chemical marker data, while hidden layers process nonlinear relationships between variables. The output layer provides the final classification, using the softmax function to distribute probabilities between the categories considered.
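To make the architecture concrete, here is a minimal forward pass through such a network in plain Python. It is a sketch, not the study's model: the weights are random and untrained, the single hidden layer of 16 neurons and the ReLU hidden activation are arbitrary assumptions, and all function names are hypothetical. Only the 113 inputs, the three classes and the softmax output come from the description above.

```python
import math
import random

def relu(vector):
    """Hidden-layer activation (an assumption; the article does not name one)."""
    return [max(0.0, v) for v in vector]

def softmax(logits):
    """Turn raw output scores into probabilities that sum to 1."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(v - shift) for v in logits]
    total = sum(exps)
    return [e / total for e in exps]

def dense(inputs, weights, biases):
    """One fully connected layer: weights holds one row per output neuron."""
    return [sum(w * x for w, x in zip(row, inputs)) + b
            for row, b in zip(weights, biases)]

def init_layer(n_in, n_out, rng):
    """Random, untrained parameters, for illustration only."""
    weights = [[rng.uniform(-0.5, 0.5) for _ in range(n_in)] for _ in range(n_out)]
    return weights, [0.0] * n_out

rng = random.Random(0)
n_markers, n_hidden, n_classes = 113, 16, 3  # 16 hidden neurons is arbitrary
w1, b1 = init_layer(n_markers, n_hidden, rng)
w2, b2 = init_layer(n_hidden, n_classes, rng)

# Stand-in for one autoscaled sample of 113 marker intensities.
sample = [rng.gauss(0.0, 1.0) for _ in range(n_markers)]

hidden = relu(dense(sample, w1, b1))
probabilities = softmax(dense(hidden, w2, b2))
labels = ["EVOO", "VOO", "LOO"]
predicted = labels[probabilities.index(max(probabilities))]
```

In a binary setting such as LOO/non-LOO, the same structure applies with two output neurons instead of three.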
The model configuration was optimized by varying the number of neurons in the hidden layers to suit different types of classification, both binary (e.g. LOO/non-LOO) and ternary (EVOO/VOO/LOO). The results demonstrated the effectiveness of the approach: the accuracy of ternary models reached 81.42%, while binary models, such as the distinction between LOO and non-LOO, achieved an accuracy of 95%. These results are significantly superior to those of traditional methods such as k-Nearest Neighbours, Support Vector Machine, and Decision Tree Classifier.
Rapid screening with LF-NMR and CNN
The second study introduces a fast and innovative method for identifying adulteration of extra virgin olive oil (EVOO) with hazelnut oil (HO) and high oleic sunflower oil (HOSO). This approach uses Low-Field Nuclear Magnetic Resonance (LF-NMR) relaxometry in combination with machine learning algorithms, especially Convolutional Neural Networks (CNN). LF-NMR exploits the relaxation properties of hydrogen nuclei (protons) in fats, which reflect the chemical composition of the oil, allowing pure EVOO to be distinguished from adulterated samples.
The samples analysed included adulteration rates ranging from 10% to 100%, and the signals collected were processed using five machine learning algorithms (Decision Tree, KNN, LDA, SVM, CNN). Of these, the CNN proved to be the best performing, achieving an accuracy of 89.29%, a precision of 81.25%, and a 2-minute analysis time. Compared to traditional techniques such as liquid chromatography (HPLC), gas chromatography (GC) and mass spectrometry (MS), the proposed method has several advantages. First, it is significantly faster: whereas conventional techniques require 30 minutes to several hours to prepare and analyse samples, the new approach completes the analysis in just 2 minutes. Furthermore, LF-NMR is a non-destructive technique, which does not alter the sample and allows for further analysis or possible commercial use of the oil.
Unlike traditional methods, which often require highly skilled staff and complex equipment, the proposed system is easier to use and more automated. Sensitivity is also important: many traditional techniques can only detect adulteration at concentrations above 20-25%, while the method based on LF-NMR and CNN detects adulteration at 10%, providing more effective quality control. The CNN, with its ability to learn characteristics directly from the data, adapts better to variations between samples, such as different proportions of adulterants or varying chemical compositions, demonstrating remarkable robustness.
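Accuracy and precision, the two metrics quoted for this study, are standard quantities computed from the counts of correct and incorrect predictions. The sketch below shows the definitions for a binary pure/adulterated decision. The counts are hypothetical and are not taken from the study, and the function names are chosen for illustration.

```python
def accuracy(tp, fp, tn, fn):
    """Share of all samples that were classified correctly."""
    return (tp + tn) / (tp + fp + tn + fn)

def precision(tp, fp):
    """Share of samples flagged as adulterated that really were adulterated."""
    return tp / (tp + fp)

# Hypothetical counts, treating "adulterated" as the positive class:
# tp = adulterated and flagged, fp = pure but flagged,
# tn = pure and passed,       fn = adulterated but passed.
tp, fp, tn, fn = 40, 5, 50, 5

acc = accuracy(tp, fp, tn, fn)
prec = precision(tp, fp)
print(f"accuracy = {acc:.2%}, precision = {prec:.2%}")
```

A precision lower than the accuracy, as in the reported figures, indicates that a relatively large share of the errors are pure samples wrongly flagged as adulterated.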
Machine vision oil droplet analysis
The research analysed different types of extra virgin olive oil (EVOO), together with refined corn and sunflower oils, in the form of droplets, both pure and mixed. These mixtures were used as artificially adulterated samples to train deep learning algorithms to distinguish droplets by their composition. Edible oils have unique fluid dynamic properties, which affect the speed and shape of the droplets’ diffusion.
These composition-dependent differences were captured by recording videos of the droplets’ expansion on a polyethylene plate at 30°C. The videos were divided into frames (JPEG format) to create a database used to train the models. Convolutional Neural Networks (CNN) were implemented to classify and analyse the images. A CNN uses convolutional and pooling layers to select the most significant image characteristics, reducing the spatial size of the image while increasing the number of feature maps (matrices).
Convolutional layers apply digital filters or masks to highlight distinctive image characteristics. Pooling further reduces the size by keeping only the maximum value of each block of pixels (max-pooling), taken as the most distinctive characteristic of that block. An activation function, ReLU, was used to introduce nonlinearity into the model. The final results are processed by a multilayer perceptron (MLP), which performs the classification.
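The three operations described above (convolution with a filter, ReLU, and max-pooling) can be illustrated on a tiny image in plain Python. This is a sketch under assumptions: the 5x5 image, the 2x2 edge filter and the function names are hypothetical, and real CNNs learn many filters over full-size frames rather than using one hand-written kernel.

```python
def convolve2d(image, kernel):
    """Slide the filter over the image without padding (cross-correlation, as in CNN layers)."""
    kh, kw = len(kernel), len(kernel[0])
    out_h, out_w = len(image) - kh + 1, len(image[0]) - kw + 1
    return [
        [
            sum(image[i + a][j + b] * kernel[a][b]
                for a in range(kh) for b in range(kw))
            for j in range(out_w)
        ]
        for i in range(out_h)
    ]

def relu(feature_map):
    """Set negative responses to zero, introducing nonlinearity."""
    return [[max(0, v) for v in row] for row in feature_map]

def max_pool(feature_map, size=2):
    """Keep the maximum of each non-overlapping size x size block."""
    return [
        [
            max(feature_map[i + a][j + b]
                for a in range(size) for b in range(size))
            for j in range(0, len(feature_map[0]) - size + 1, size)
        ]
        for i in range(0, len(feature_map) - size + 1, size)
    ]

# Hypothetical 5x5 grayscale patch: dark background (0), bright region (9).
image = [
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
    [0, 0, 9, 9, 9],
]
kernel = [[-1, 1], [-1, 1]]  # responds to dark-to-bright vertical edges

feature_map = relu(convolve2d(image, kernel))  # 4x4, nonzero only at the edge
pooled = max_pool(feature_map)                 # 2x2 summary of the edge response
print(pooled)  # [[18, 0], [18, 0]]
```

The 5x5 input shrinks to a 4x4 feature map and then to a 2x2 pooled map, while the edge of the bright region survives as the strongest response. A CNN stacks many such filters, which is why the number of feature maps grows as their size shrinks.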
- Classification of pure oils: The CNN analysed 86,187 images of pure oils, distinguishing five EVOO and two adulterant oils (sunflower and corn) with more than 98% accuracy.
- Adulteration quantification: Specific models were used for each type of EVOO, analysing adulterated samples with concentrations from 2.5% to 10%. The results showed an overall accuracy of more than 96%.
- Global model: A single model was developed to classify 302,387 images into 45 categories (5 EVOO × 4 concentrations of adulterants × 2 adulterant oils + 5 pure EVOO). The model achieved an overall hit rate of 96.7%, demonstrating a high ability to identify and quantify adulterations.
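The 45 categories of the global model follow directly from the experimental design, as the short enumeration below shows. The label strings are hypothetical placeholders; only the counts (five EVOO, four adulterant concentrations, two adulterant oils) come from the description above.

```python
from itertools import product

evoo_types = [f"EVOO_{i}" for i in range(1, 6)]  # five EVOO (placeholder names)
adulterants = ["sunflower", "corn"]              # two adulterant oils
levels = [f"level_{i}" for i in range(1, 5)]     # four concentrations (placeholders)

# One class per pure EVOO, plus one per (EVOO, concentration, adulterant) mixture.
pure_classes = [(evoo, None, None) for evoo in evoo_types]
adulterated_classes = list(product(evoo_types, levels, adulterants))
all_classes = pure_classes + adulterated_classes

print(len(adulterated_classes), len(all_classes))  # 40 45
```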
Concrete advantages of deep learning for olive oil
- High accuracy: Deep learning models outperform traditional methods in accuracy, detecting even minor adulterations that are difficult to identify with other approaches.
- Non-destructive analysis: Techniques such as NMR spectroscopy analyse samples without destroying them, preserving the product for further analysis.
- Fast and efficient: Neural networks enable real-time analysis, with much faster response times than traditional techniques.
- Versatility: Models can be adapted to analyse different oil types and updated with new data, making them dynamic and flexible tools.
Challenges and future prospects
Despite this progress, Deep Learning in the authentication process of EVOO still faces several challenges. The collection of sufficiently large and representative data sets is essential to ensure the reliability of the models. Standardization of protocols and large-scale validation remain essential steps. In addition, the complexity and cost of the technologies can be barriers for small manufacturers. However, the integration of these systems into portable devices could democratize access to the technology, enabling quality checks directly at the point of sale.
References
Deng, Z., Wang, T., Zheng, Y., Zhang, W., & Yun, Y. (2024). Deep learning in food authenticity: Recent advances and future trends. Trends in Food Science & Technology, 144, 104344. https://doi.org/10.1016/j.tifs.2024.104344
Hou, X., Wang, G., Wang, X., Ge, X., Fan, Y., Jiang, R., & Nie, S. (2020). Rapid screening for hazelnut oil and high‐oleic sunflower oil in extra virgin olive oil using low‐field nuclear magnetic resonance relaxometry and machine learning. Journal of the Science of Food and Agriculture, 101(6), 2389–2397. https://doi.org/10.1002/jsfa.10862
Pradana-Lopez, S., Perez-Calabuig, A. M., Cancilla, J. C., Garcia-Rodriguez, Y., & Torrecilla, J. S. (2021). Convolutional capture of the expansion of extra virgin olive oil droplets to quantify adulteration. Food Chemistry, 368, 130765. https://doi.org/10.1016/j.foodchem.2021.130765
Vega-Marquez, B., Nepomuceno-Chamorro, I., Jurado-Campos, N., & Rubio-Escudero, C. (2020). Deep learning techniques to improve the performance of olive oil classification. Frontiers in Chemistry, 7. https://doi.org/10.3389/fchem.2019.00929
Wang, Y., Gu, H., Yin, X., Geng, T., Long, W., Fu, H., & She, Y. (2024). Deep learning in food safety and authenticity detection: An integrative review and future prospects. Trends in Food Science & Technology, 146, 104396. https://doi.org/10.1016/j.tifs.2024.104396