1. Introduction
Grapes are among the most consumed fruits in the world, eaten as fresh fruit or processed into grape juice, raisins, and wine. About 36% of grape production is destined for fresh consumption (International Organization of Vine and Wine statistics). European production of table grapes (~1.9 million tons) is concentrated in the Mediterranean area and dominated by Italy (61%), Greece (16%), Spain (15%), and France (1.5%) [1]. French table grapes are grown mostly in Vaucluse and Tarn-et-Garonne, and about 80% of the production comes from only three varieties: Alphonse Lavallée, Chasselas, and Muscat de Hambourg. French table grape production (~30,000 tons) covers approximately 40% of national consumption; the remaining 60% is imported, mainly from Spain and Italy.
The appropriate commercial harvest date of table grapes is usually determined from parameters such as skin color, texture softening, titratable acidity, total soluble solids, and sometimes flavonoid content and aromatic compounds [2,3]. Visual attributes such as the intensity and uniformity of color, large berry size, and brightness are the main characteristics that influence consumer choice [4,5], and color is of high importance for assessing quality in the food industry [6]. Furthermore, some studies have found clear evidence that a greater consumption of fresh grapes decreases the risk of cardiovascular diseases and cancer [7,8]. This beneficial effect is mainly related to the presence of minerals, fibers, vitamins, and phytochemical compounds, including flavonoids and anthocyanins [9,10]. However, the concentration of these compounds changes during postharvest storage and thus influences the sensory perception and nutritional value of table grapes.
The quality parameters of table grapes can be assessed by a number of methods [11,12,13]. However, conventional analytical methods require sample preparation and are destructive, which limits their use for on-line/in-line quality monitoring in industry [14,15,16]. These methods also require time and solvents and generate chemical waste. Because it is time consuming and expensive, the destructive analytical approach provides data for only a limited number of samples, and the statistical relevance of these data could therefore be limited [17]. Several postharvest studies have consequently focused on non-destructive analytical techniques, which are fast and reliable and make it possible to analyze a larger number of samples, and repetitions of the same batch, in real time.
Developing non-destructive techniques that increase the number of samples analyzed is an objective of current research in many fields. These studies clearly aim to obtain real-time information on fruit quality attributes together with robust statistical data analysis [18]. Infrared spectroscopy (FT-NIR and ATR-FTIR) has been applied to predict procyanidin concentration in cocoa [19], total polyphenol content in green tea [20], and malvidin-3-O-glucoside, pigmented polymer, and tannin contents in fermenting red wine [21]. This technology has also been employed to determine pH, total soluble solids, glycerol, and gluconic acid in grape juice [22] and to measure condensed tannins and dry matter in homogenized red grape berries [23].
Hyperspectral imaging spectroscopy (HIS) is a non-destructive spectroscopic technique that records hundreds of narrow wavelength bands at each spatial position [24]. It combines imaging and spectroscopy [25,26], providing a spectrum for each pixel of the hyperspectral image [25,27]. A hyperspectral image is thus a three-dimensional (3D) cube that includes spatial information in two dimensions (x rows and y columns) and spectral information in one dimension (λ wavelengths) [28]. This hyperspectral image cube, or “hypercube”, consists of a series of sub-images at closely spaced wavelengths ranging from 400 to 2500 nm in the VIS and NIR spectral regions.
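To make the cube layout concrete, the following minimal Python sketch builds a toy hypercube as nested lists indexed as cube[x][y][band]. The dimensions, the reflectance values, and the helper names are illustrative assumptions and do not reflect the data layout of any particular camera software.

```python
# Toy hypercube: 2 rows (x) by 3 columns (y) by 4 bands (lambda).
# All numbers are invented for illustration.
wavelengths_nm = [450.0, 550.0, 650.0, 850.0]

cube = [
    [[0.10, 0.20, 0.30, 0.60], [0.12, 0.22, 0.32, 0.62], [0.11, 0.21, 0.31, 0.61]],
    [[0.09, 0.19, 0.29, 0.59], [0.13, 0.23, 0.33, 0.63], [0.10, 0.20, 0.30, 0.60]],
]


def pixel_spectrum(cube, x, y):
    """Return the full spectrum recorded at spatial position (x, y)."""
    return list(cube[x][y])


def band_image(cube, band):
    """Return the 2D sub-image (rows by columns) at one wavelength index."""
    return [[pixel[band] for pixel in row] for row in cube]


# (rows, columns, bands) of the toy cube
shape = (len(cube), len(cube[0]), len(cube[0][0]))
```

Slicing along the spectral axis gives one spectrum per pixel, whereas slicing along the band axis gives one grayscale sub-image per wavelength.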
Over the last decade, HIS has been applied to fruit and vegetable quality assessment [24,27], food safety control [29,30,31], and classification [32,33]. Likewise, total acidity, pH, soluble solid content, technological maturity, total anthocyanin concentration, antioxidant activity, and total phenolic compounds in grapes have been determined using VIS–NIR hyperspectral imaging of a few fruits [34,35], but studies on table grapes are lacking. Compared to conventional spectroscopy, HIS has the advantage of acquiring spectral information over a larger area of the fruit surface and therefore accounts for heterogeneity within the berries, unlike a spectrophotometer [35,36]. Moreover, conventional RGB imaging detects only surface features and cannot measure the chemical composition of the fruit. HIS, instead, acquires information in other regions of the electromagnetic spectrum, which are closely linked to the chemical composition of the samples [37]. HIS also has the advantage of recording a spatially distributed spectral response at each pixel of a fruit image. Another advantage is that, once appropriate calibration models are developed, they can be applied back to the hypercube to create chemical mapping images. For the grape industry, the interest would be to sort berries one by one according to their average representative spectrum, rather than the spatial information within each berry, in order to obtain several final batches for different transformations, storage conditions, or quality classes.
The objective of this study was to determine whether hyperspectral imaging could predict the sugar content and the concentrations of Total Flavonoids and Total Anthocyanins of white and red table grapes. Additionally, because keeping the whole spectra for all samples would generate large volumes of data to manage and lengthen the analysis time in a potential on-line tool, the second objective was to determine whether the number of wavelengths could be reduced while still obtaining good prediction models. Thus, this study developed calibration and prediction models for three quality attributes of table grapes based on hyperspectral imaging:
I. Developing partial least squares (PLS) models to validate the correlation between hyperspectral imaging spectra and Total Anthocyanin (TA) and Total Flavonoid (TF) contents and Total Soluble Solids (TSS), using the visible and short-wave near-infrared region;
II. Selecting the lowest number of optimal wavelengths, based on regression coefficient (RC) and Variable Importance in Projection (VIP) algorithms, which gave the highest correlation between the spectral data and the three selected quality parameters;
III. Developing Multiple Linear Regression (MLR) models using spectra from only the optimal wavelengths and then validating the developed calibration models.
The novelty of this study lies in assessing the potential of hyperspectral imaging to supply prediction models for quality parameters (TF, TA, and TSS) that are usable for all table grapes. Moreover, the use of specific wavelengths rather than the full spectra for these products represents a new approach.
2. Materials and Methods
2.1. Chemicals
The following chemicals were used: ethanol 96%, hydrochloric acid ≥ 37% (Fluka™, Muskegon, MI, USA), malvidin-3-O-glucoside 92.7% (Extrasynthese, Genay, France), and (+)-catechin 99.2% (Sigma–Aldrich, Saint-Louis, MO, USA). All the chemicals were at least of analytical grade. Ultrapure water was prepared from deionized water using a Milli-Q system (Millipore SAS, Molsheim, France).
2.2. Samples
Seven table grape varieties were bought in regional markets at commercial harvest ripeness: three white table grapes (Sugarone Superior Seedless, Thompson Seedless, and Victoria) and four red/black table grapes (Sable Seedless, Alphonse Lavallée, Lival, and Black Magic). Alphonse Lavallée and Lival were chosen because they are French cultivars produced in the south-east of France and widely consumed throughout the country. The other five cultivars were chosen because they are widely distributed around the world. Approximately 5 kg of randomly selected clusters were sampled for each cultivar. A subsample of 50 berries of each variety, with short attached pedicels, was collected from different parts of the bunches (shoulders, middle, and bottom). The grapes were then washed, gently dried with absorbent paper, and stored at 4 °C until the HIS acquisitions.
For each of the 7 varieties, 50 berries were analyzed in triplicate by hyperspectral imaging and then chemically analyzed, leading to 350 mean spectra, 350 TF and TSS values, and 200 TA values. PLS and MLR were then applied to the pre-treated data.
2.3. Hyperspectral Imaging System (HIS)
The system is composed of the following components (Figure 1): (a) a hyperspectral imaging camera (Pika L, Resonon, Bozeman, MT, USA) coupled with an objective lens (Xenoplan 1.4/23, Schneider-Kreuznach, Bad Kreuznach, Germany); (b) an illumination unit consisting of four 35 W quartz tungsten halogen (QTH) MR16 lamps set at an angle of 45° to illuminate the camera’s field of view; (c) a mounting tower; and (d) a transport stage (PS-12-20-1.0, Servo Systems Co., Rockaway, NJ, USA) with a motor (DMX-J-SA-17, Arcus Technology Inc., Livermore, CA, USA). The sensor has 900 spatial channels, each with 281 spectral channels covering the range from 387 to 1026 nm. The maximum spectral resolution is 2.1 nm. The camera was set up at 450 mm from the target. The spectral images were collected in a dark room where only the halogen light source was used. The HIS was controlled by a PC with the software SpectrononPRO (Resonon, Bozeman, MT, USA) for image acquisition.
2.4. Image Acquisition
The samples were kept at room temperature (20 °C) for 1 h prior to image acquisition in reflectance mode. The hyperspectral image of each sample (one berry) was recorded in three different berry positions, corresponding to berry rotations of approximately 120° between positions. The reflectance of each berry was measured along the berry “equator”, taking the pedicel as the “pole”, which is a common practice reported in several articles [38,39]. The hyperspectral images were recorded by the SpectrononPRO software (Resonon, Bozeman, MT, USA) using an exposure time of 12 ms and a stage speed of 11 mm s−1 with a gain of 10. Only the spectral data in the wavelength range of 411–1000 nm were used in the data analysis, in order to remove the noise and reduce the data redundancy found outside this range. For each berry (50 per variety), three reflectance spectra were collected, corresponding to the berry rotations, and averaged over the spatial dimension.
2.5. Preprocessing of Hyperspectral Images
All the acquired images were processed and analyzed using SpectrononPro 5.1 Hyperspectral Imaging System software (Resonon, Bozeman, MT, USA). The hyperspectral images were first corrected with a white and a dark reference (WD). The dark reference was used to remove the effect of the dark current of the CCD detectors, which are thermally sensitive.
The corrected reflectance (R) is estimated using the following Equation (1):
$$R = \frac{S - D}{W - D} \tag{1}$$
where S is the intensity of an image, W is the intensity of the white reference image (Teflon white board with 99% reflectance), and D is the intensity of the dark reference image (with 0% reflectance) recorded by turning off the lighting source with the lens of the camera completely covered. The corrected reflectances were the basis for the subsequent image analysis to extract the spectral response of each fruit, select effective wavelengths, and predict physicochemical parameters.
2.6. Data Analysis
2.6.1. Determination of Reference Parameters: Total Soluble Solids (TSS), Total Anthocyanin (TA), and Total Flavonoid Content (TF)
Immediately after image acquisition, each berry was subjected to the determination of Total Soluble Solids (TSS), Total Flavonoids (TF), and Total Anthocyanins (TA). Each berry was weighed and manually peeled, and the juice was collected separately. Total Soluble Solids were measured with a portable refractometer (Mettler Toledo Refracto 30PX) with an uncertainty of 0.2 °Brix. The skins were separately weighed and extracted four times with 7.5 mL of acidified ethanol solution (ethanol/water/hydrochloric acid 70/30/1 v/v/v). The samples were shaken for 60 min with a horizontal shaker VXR vibrax (IKA-Werke, Staufen, Germany) at 1500 rpm and centrifuged at 5000 rpm for 5 min, and the supernatant was collected in a volumetric flask. The supernatants were pooled, brought to a volume of 25 mL, and stored at −80 °C until analysis. The quantification of TA and TF was carried out spectrophotometrically by recording the UV–visible spectra in the range of 220–700 nm using a Safas UV mc2 spectrophotometer (Safas, Monaco City, Monaco) and measuring the absorption values at 280 and 520 nm, as previously reported [40]. The results were expressed as mg (+)-catechin equivalents/kg fresh grape and mg malvidin-3-O-glucoside equivalents/kg fresh grape for the flavonoids and anthocyanins, respectively.
2.6.2. Spectral Analysis for Predicting Quality Attributes
-
Collecting spectral data
Only regions of interest (ROIs) were collected as already described [41] and an average spectrum was calculated by averaging the relative reflectance spectra.
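The chain from raw intensities to one mean spectrum per berry can be sketched in a few lines of Python. This is a minimal sketch, assuming the common flat-field form (S − D)/(W − D) for the white/dark correction; the raw counts, the two-band spectra, and the function names are hypothetical and are not taken from the SpectrononPro software.

```python
def correct_reflectance(s, w, d):
    """Flat-field correction of one raw intensity: (S - D) / (W - D)."""
    if w == d:
        raise ValueError("white and dark references must differ")
    return (s - d) / (w - d)


def roi_mean_spectrum(raw_pixels, white, dark):
    """Correct every ROI pixel band by band, then average over the pixels."""
    n_bands = len(white)
    corrected = [
        [correct_reflectance(px[b], white[b], dark[b]) for b in range(n_bands)]
        for px in raw_pixels
    ]
    return [sum(px[b] for px in corrected) / len(corrected) for b in range(n_bands)]


def average_spectra(spectra):
    """Average several equal-length spectra band by band."""
    return [sum(values) / len(values) for values in zip(*spectra)]


# Hypothetical raw counts for 2 bands: reference spectra and ROI pixels.
white = [1000.0, 2000.0]
dark = [100.0, 200.0]
views = [
    [[550.0, 1100.0], [640.0, 1280.0]],  # rotation 1: two ROI pixels
    [[595.0, 1190.0], [595.0, 1190.0]],  # rotation 2
    [[505.0, 1010.0], [685.0, 1370.0]],  # rotation 3
]

# One ROI mean spectrum per rotation, then one mean spectrum per berry.
per_view = [roi_mean_spectrum(view, white, dark) for view in views]
berry_spectrum = average_spectra(per_view)
```

Averaging first within the ROI and then across the three rotations yields a single representative spectrum per berry, which is the unit used for modeling in this study.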
-
Spectra pre-treatments
To reduce unwanted spectral variation, baseline shifts, and noise, a series of pre-treatment methods was applied to the mean spectral data in order to decrease the influence of high-frequency random noise, sample nonuniformity, and surface scattering. Before building the models, the spectral pre-treatments defined in Equations (2) to (4) were applied [42]:
SNV: Standard Normal Variate (SNV). The average intensity (Amean) and standard deviation (ASD) of the spectrum are calculated and inserted in Equation (2):
$$A_{SNV} = \frac{A - A_{mean}}{A_{SD}} \tag{2}$$
1st derivative: The first derivative was calculated using the symmetric difference quotient (3):
$$A'_i = \frac{A_{i+1} - A_{i-1}}{\lambda_{i+1} - \lambda_{i-1}} \tag{3}$$
2nd derivative: The second derivative was calculated using the symmetric difference quotient (4):
$$A''_i = \frac{A_{i+1} - 2A_i + A_{i-1}}{(\lambda_{i+1} - \lambda_i)^2} \tag{4}$$
With i = 1 to N, N being the number of samples.
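The three pre-treatments can be illustrated with a short Python sketch. It assumes evenly spaced wavelengths (a single step value), uses the sample standard deviation for SNV, and evaluates the derivatives only at interior points; these choices, the function names, and the toy spectrum are assumptions made for illustration and may differ from the software actually used.

```python
from statistics import mean, stdev


def snv(spectrum):
    """Standard Normal Variate: center and scale a spectrum by its own mean and SD."""
    m = mean(spectrum)
    sd = stdev(spectrum)  # sample SD (n - 1); an assumption, some tools divide by n
    return [(a - m) / sd for a in spectrum]


def first_derivative(spectrum, step):
    """Symmetric difference quotient, evaluated at interior points only."""
    return [
        (spectrum[i + 1] - spectrum[i - 1]) / (2 * step)
        for i in range(1, len(spectrum) - 1)
    ]


def second_derivative(spectrum, step):
    """Second-order symmetric difference quotient at interior points."""
    return [
        (spectrum[i + 1] - 2 * spectrum[i] + spectrum[i - 1]) / step ** 2
        for i in range(1, len(spectrum) - 1)
    ]


# Invented spectrum sampled with a step of 1 (values follow k^2 for k = 1..5).
spectrum = [1.0, 4.0, 9.0, 16.0, 25.0]
snv_spectrum = snv(spectrum)
d1 = first_derivative(spectrum, step=1.0)
d2 = second_derivative(spectrum, step=1.0)
```

After SNV each spectrum has zero mean and unit standard deviation, which removes multiplicative scatter effects, while the derivatives remove constant (first derivative) and linear (second derivative) baseline shifts at the cost of losing the two end points.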
2.6.3. Hyperspectral Imaging Calibration
-
Model establishment
Chemometrics is widely used to model spectral data and is considered a standard procedure for building predictive models in the analysis of hyperspectral images. The partial least squares (PLS) analysis between one quality attribute (TA, TF, or TSS) and the spectral data (average spectra with 276 wavelengths in the range from 411 to 1000 nm) was conducted using XLSTAT software (Addinsoft, Paris, France, 2019). No outlier detection was performed, in order to keep all spectra and the heterogeneity due to the plant material.
A total of 350 mean reflectance spectra were obtained from 350 berries. The calibration and validation sets were established by ordering the fruit samples according to their physicochemical reference values. Briefly, 4 samples per variety, i.e., 28 samples in total, were randomly selected for the prediction set. The two highest and two lowest values were assigned to the calibration set. Afterward, two-thirds of the samples were randomly selected as calibration data and one-third of the samples were defined as validation data in a 2:1 leave-one-out procedure. Thus, the calibration set and the validation set were independent.
The PLS regression used to develop the calibration models was carried out with two calibration sample sets: (i) N = 207 samples for TF and TSS and (ii) N = 116 samples for TA. The PLS models for TF and TSS took into account both the white and red table grape cultivars, whereas only the red and rosé grape cultivars were considered for TA, since white grapes do not contain anthocyanins. To reduce the probability of overfitting the experimental data [43], PLS models with 1–15 latent variables (LVs) were fitted, and the model with the number of PLS factors that maximized the coefficient of determination of calibration (R2cal) and minimized the root mean square error of calibration (RMSEC) was selected. These two parameters were used to evaluate the models.
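The splitting procedure described above can be sketched as follows. This is one possible reading, assuming that the extreme values are taken from the pooled samples rather than per variety; the sample identifiers, the reference values, and the function name are invented, and the set sizes produced by this toy split are not expected to reproduce those reported in the study.

```python
import random


def split_samples(samples, n_pred_per_variety=4, n_extremes=2, seed=0):
    """Split (sample_id, variety, reference_value) tuples into three sets."""
    rng = random.Random(seed)
    by_variety = {}
    for sample in samples:
        by_variety.setdefault(sample[1], []).append(sample)

    # Step 1: draw the prediction set at random, variety by variety.
    prediction, pool = [], []
    for variety in sorted(by_variety):
        group = by_variety[variety][:]
        rng.shuffle(group)
        prediction.extend(group[:n_pred_per_variety])
        pool.extend(group[n_pred_per_variety:])

    # Step 2: order the remaining samples by reference value and force the
    # extremes into calibration so that the model spans the measured range.
    pool.sort(key=lambda s: s[2])
    calibration = pool[:n_extremes] + pool[-n_extremes:]
    middle = pool[n_extremes:-n_extremes]

    # Step 3: random 2:1 split of the rest into calibration and validation.
    rng.shuffle(middle)
    n_cal = round(len(middle) * 2 / 3)
    calibration.extend(middle[:n_cal])
    validation = middle[n_cal:]
    return calibration, validation, prediction


# Hypothetical data: 7 varieties x 50 berries with invented TSS-like values.
data_rng = random.Random(42)
samples = [
    (f"v{v}_b{b}", f"variety_{v}", data_rng.uniform(14.0, 25.0))
    for v in range(7)
    for b in range(50)
]
cal, val, pred = split_samples(samples)
```

Because each berry ends up in exactly one of the three lists, the validation and prediction sets stay independent of the calibration data.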
-
Hyperspectral imaging model validation
Two validation sets (N = 103 samples for TF and TSS; N = 56 samples for TA) were used to calculate the root mean square error of validation (RMSEV), the coefficient of determination (R2val), the Bias, and the Ratio of Performance to Deviation (RPD) of the PLS models as follows [42]:
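$$RMSEV = \sqrt{\frac{\sum_{i=1}^{N} (\hat{y}_i - y_i)^2}{N - R - 1}} \tag{5}$$
$$Bias = \frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i) \tag{6}$$
$$RPD = \frac{SD}{RMSEV} \tag{7}$$
where N is the number of samples, R is the number of PLS factors, $y_i$ is the reference value for sample i, $\hat{y}_i$ is the predicted value for sample i, and SD is the standard deviation of the reference values.
-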
Hyperspectral imaging prediction
The quality of prediction of the models was tested using 4 samples per variety. The level of prediction is discussed based on the R2pr and RMSEP values. The RMSEP was calculated as follows:
$$RMSEP = \sqrt{\frac{1}{N} \sum_{i=1}^{N} (\hat{y}_i - y_i)^2} \tag{8}$$
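The figures of merit used for validation and prediction can be computed with a few lines of Python. This sketch uses textbook definitions, which are assumptions here: the RMSE divides by N by default and by N − R − 1 when a number of PLS factors is passed, and RPD is taken as the standard deviation of the reference values divided by the RMSE. The numbers are invented.

```python
from math import sqrt
from statistics import stdev


def rmse(reference, predicted, n_factors=None):
    """Root mean square error; optionally corrected for the number of PLS factors."""
    n = len(reference)
    dof = n if n_factors is None else n - n_factors - 1
    return sqrt(sum((p - r) ** 2 for r, p in zip(reference, predicted)) / dof)


def bias(reference, predicted):
    """Mean signed error (predicted minus reference)."""
    return sum(p - r for r, p in zip(reference, predicted)) / len(reference)


def r_squared(reference, predicted):
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_ref = sum(reference) / len(reference)
    ss_res = sum((r - p) ** 2 for r, p in zip(reference, predicted))
    ss_tot = sum((r - mean_ref) ** 2 for r in reference)
    return 1 - ss_res / ss_tot


def rpd(reference, predicted):
    """Ratio of the SD of the reference values to the RMSE."""
    return stdev(reference) / rmse(reference, predicted)


# Invented reference and predicted values for five samples.
reference = [10.0, 12.0, 14.0, 16.0, 18.0]
predicted = [11.0, 11.0, 15.0, 15.0, 19.0]
```

With these toy values the errors alternate between +1 and −1, so the RMSE is 1.0 while the Bias is only 0.2: the two statistics capture different aspects of model quality (overall error versus systematic offset).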
-
Selection of optimal wavelengths
Hyperspectral data are characterized by high dimensionality, with collinearity and redundancy among wavelengths. Researchers are therefore often interested in finding the most important wavelengths, which contribute to the evaluation of quality parameters, and in eliminating wavelengths that have no discrimination power. After the good performance of the PLS models had been established on the validation set, the next step was to select only the wavelengths that carried the maximum spectral information.
The regression coefficients (RC), also called β-coefficients, and the Variable Importance in Projection (VIP) scores were used to select the most informative wavelengths from the best PLS calibration model built with the full spectrum as variables. The wavelengths that corresponded to the highest absolute values of β-coefficients were considered optimal wavelengths [44]. Based on the studies conducted by Olah et al. [45], all wavelengths at which the VIP scores were above a threshold of 1.0 (highly influential) were selected and compared with those identified using β-coefficients. In this study, only the wavelengths with the highest β-coefficients (absolute values) on the one hand and the highest VIP scores (above the threshold of 1.0) on the other hand were selected to establish Multiple Linear Regression (MLR) models, instead of using the whole spectral range. Moreover, all the wavelengths with a VIP score above 1 (spectral windows) were also used to build another PLS regression model in order to improve its performance.
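To illustrate the VIP criterion, the sketch below fits a single-response PLS model with the NIPALS algorithm and computes VIP scores using one common textbook definition. It is not the XLSTAT implementation used in this study; the three wavelength labels and the data are invented, and the function name is hypothetical.

```python
from math import sqrt


def pls1_vip(X, y, n_components):
    """VIP scores from a NIPALS PLS1 fit (X: list of sample rows, y: list of values)."""
    n, p = len(X), len(X[0])
    col_means = [sum(row[j] for row in X) / n for j in range(p)]
    y_mean = sum(y) / n
    E = [[row[j] - col_means[j] for j in range(p)] for row in X]  # centered X
    f = [v - y_mean for v in y]                                   # centered y

    weights, explained = [], []
    for _ in range(n_components):
        # Weight vector: direction of maximum covariance between E and f.
        w = [sum(E[i][j] * f[i] for i in range(n)) for j in range(p)]
        norm = sqrt(sum(v * v for v in w))
        w = [v / norm for v in w]
        # Scores, X loadings, and y loading for this latent variable.
        t = [sum(E[i][j] * w[j] for j in range(p)) for i in range(n)]
        tt = sum(v * v for v in t)
        load = [sum(E[i][j] * t[i] for i in range(n)) / tt for j in range(p)]
        q = sum(f[i] * t[i] for i in range(n)) / tt
        # Deflate X and y before extracting the next latent variable.
        E = [[E[i][j] - t[i] * load[j] for j in range(p)] for i in range(n)]
        f = [f[i] - q * t[i] for i in range(n)]
        weights.append(w)
        explained.append(q * q * tt)  # sum of squares of y explained by this LV

    total = sum(explained)
    return [
        sqrt(p * sum(ss * w[j] ** 2 for ss, w in zip(explained, weights)) / total)
        for j in range(p)
    ]


# Invented data: 5 samples, 3 wavelengths; y depends mostly on the first one.
wavelengths_nm = [543.4, 710.0, 943.6]
X = [[-2.0, 1.0, 0.0], [-1.0, -1.0, 1.0], [0.0, 0.0, -2.0],
     [1.0, -1.0, 1.0], [2.0, 1.0, 0.0]]
y = [2.0 * row[0] + 0.5 * row[1] for row in X]

vip = pls1_vip(X, y, n_components=2)
selected = [wl for wl, score in zip(wavelengths_nm, vip) if score > 1.0]
```

With this definition the squared VIP scores average to 1 over all variables, which is the rationale for using 1.0 as the cut-off: a wavelength with VIP > 1 contributes more than the average variable.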
2.6.4. Statistical Analyses
One-way ANOVA on the quality attributes of table grapes was performed with XLSTAT 2019.1 software (Addinsoft, Paris, France). Mean values were separated with Tukey’s test (p < 0.05) to identify the significant differences between varieties.
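For readers unfamiliar with the test, the F statistic of a one-way ANOVA can be computed as in the following sketch. The three groups stand in for varieties and contain invented values; the p-value and Tukey’s post hoc comparison are not computed here and would normally come from a statistics package.

```python
def one_way_anova_f(groups):
    """Return (F, df_between, df_within) for a list of groups of values."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]

    # Between-group and within-group sums of squares.
    ss_between = sum(
        len(g) * (m - grand_mean) ** 2 for g, m in zip(groups, group_means)
    )
    ss_within = sum(
        (v - m) ** 2 for g, m in zip(groups, group_means) for v in g
    )

    df_between, df_within = k - 1, n_total - k
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Invented values for three hypothetical varieties.
groups = [[1.0, 2.0, 3.0], [2.0, 3.0, 4.0], [5.0, 6.0, 7.0]]
f_stat, df_between, df_within = one_way_anova_f(groups)
```

A large F value indicates that the variation between variety means is large relative to the variation among berries of the same variety.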
3. Results
3.1. Grape Composition
Berries from each grape variety were characterized by their sugar content (Total Soluble Solids (TSS)), their Total Flavonoid content (TF), and their Total Anthocyanin content (TA). Table 1 shows that the selected varieties had different total flavonoid contents, from 201 mg kg−1 FW for Victoria grapes to 1642 mg kg−1 FW for Lival grapes, with white grapes presenting the lowest phenolic concentration, as expected. This result is in agreement with Mikulic-Petkovsek et al. [46], who showed that the Victoria variety has a low phenolic content. Similarly, a wide range of total anthocyanin content was observed, from 217 mg kg−1 FW for Alphonse Lavallée to 590 mg kg−1 FW for Sable Seedless. The sugar concentration ranged from 14.0 g/100 g (Victoria) to 24.8 g/100 g (Alphonse Lavallée), corresponding to the ripening level [2]. Statistics showed that TF, TA, and TSS were significantly dependent on the grape cultivar.
3.2. Spectral Profiles
The mean reflectance spectrum of each grape variety is presented in Figure 2. These spectra obtained by HIS showed clear differences between the grape varieties, as already reported by Baiano et al. [47] on 7 other varieties. White grapes exhibited substantial reflectance from about 500 to 650 nm, unlike the red varieties. Chlorophyll pigments indeed absorb around 540 nm, giving the green-yellow color to these varieties, as hypothesized by Costa et al. (2019) [48]. All grapes presented a much higher reflectance between 700 and 950 nm, with overlapping intensities for reds and whites, but the varieties showed similar trends depending on their color: whites had a higher intensity around 700–720 nm, which decreased up to 950 nm, whereas reds showed a flattened bell curve with a maximum around 820 nm. The absorption band at 840 nm is mainly due to sugar [47], and the absorption above 960 nm is due to water [48,49].
3.3. Modelization of Table Grape Composition Using the Whole Spectral Range of 411–1000 nm
PLS models were developed to establish the relationship between the spectral data and the corresponding TA, TF, and TSS contents analyzed by conventional chemical methods. First of all, the whole dataset was used to select the best pre-treatment for each quality parameter. The results are reported in Table 2. Five parameters were used to select the best model: R², LVs, RMSE, Bias, and RPD. RMSE has to be minimized and RPD has to be maximized [48]. HIS data were relevant [49] for modeling Total Flavonoid and Total Anthocyanin contents and TSS, since all determination coefficients (R2cal and R2val) were over 0.87 (Table 2). All pre-treatments showed good results. Thus, for similar ranges of R² and RMSE, we decided to select the pre-treatment leading to the lowest number of LVs, that is to say the SNV pre-treatment for all quality parameters. In detail, for the modeling of TF, the SNV pre-treatment used only 9 LVs, with R2cal = 0.94, R2val = 0.93, and RMSEV = 141 mg kg−1. For Total Anthocyanins, the SNV model was characterized by R2cal = 0.93, R2val = 0.95, and RMSEV = 47 mg/kg with only 3 LVs. Finally, for TSS, the model, with 10 latent variables, generated an R2cal = 0.94 and an R2val = 0.91 with RMSEV = 1.1 g/100 g. As for the RPD, the selected pre-treatments (mainly WD) generated values close to 4, which suggests that the models are capable of providing a good quantification and a satisfactory prediction of TF, TA, and TSS [50,51]. The relatively low number of LVs of the models generated, in particular for TA, and the fact that the models were built using grape berries of seven different cultivars contributed to the robustness of the models. Moreover, measured data vs. validated data were plotted for the three models selected (Figure 3). These graphs validated the selected models, demonstrating the ability of hyperspectral imaging data to predict TF, TA, and TSS in table grapes.
It is interesting to note that for TF and TSS a bimodal effect could be observed. That phenomenon is due to the white varieties in the case of TF, since they had the lowest TF values (as expected) compared to the other varieties. Nevertheless, the prediction of low values of TF could be considered uncertain since the concentration of TF below 500 mg/kg seemed not to follow a linear trend even if they positively contribute to the model (data not shown). For TSS, that effect was due to the rosé grapes since they presented the highest TSS values.
The predictions of these quality parameters were good, since the determination coefficients obtained ranged between 0.92 and 0.98 for the pre-treatments selected above. The RMSEP values were of the same order of magnitude as those of the validation and calibration sets, and even slightly better: the RMSEP was 33 mg/kg for TA and 0.9 for TSS, whereas the RMSEV was 47 mg/kg for TA and 1.1 for TSS. This result shows that the method used for the validation was sound. These results also showed that all grape varieties could be gathered in a single model.
3.4. Modelization of Table Grape Composition from Optimal Wavelengths Obtained by β-Coefficients
Hyperspectral data with hundreds of contiguous wavelengths for each image pixel pose a major challenge for data processing. Therefore, the selection of optimal wavelengths is very important to reduce the computation time, to simplify the potential prediction model, and ultimately to meet the requirements of real-time inspection [52]. In this section, the regression coefficients (RC) resulting from the full-spectrum PLS models were employed to select the key wavelengths used to establish the Multiple Linear Regression (MLR) models. Figure 4 shows the values of the β-coefficients for the quality attributes Total Flavonoids, Total Anthocyanins, and TSS from the HIS data. The optimal wavelengths are those having the highest absolute values of β-coefficients (framed in the figure). Thus, 17 specific wavelengths were selected for TF: 434.3, 485.5, 501.9, 543.4, 608.2, 631.4, 648.3, 675.9, 688.7, 707.9, 779, 792, 805, 807.2, 829, 905, and 945.9 nm; 8 for TA: 434.3, 543.4, 604, 616.6, 669.5, 796.3, 943.6, and 952.5 nm; and 23 for TSS: 418, 434.3, 485, 501.9, 539.2, 543.4, 585.1, 646.2, 661, 678, 697.2, 716.5, 792, 802, 805, 807.2, 829, 833, 905.9, 910.3, 939.2, 945.9, and 952.5 nm. Table 3 presents the accuracy and robustness of the RC-MLR models built using the selected wavelengths. The model for TF showed R2 = 0.94 and 0.95 for the calibration and validation sets, respectively, and RMSEV = 128 mg/kg. For TA, the model had an R2cal of 0.93 and an R2val of 0.95 with RMSEV = 48 mg/kg, and the model for TSS presented R2cal = 0.95, R2val = 0.93, and RMSEV = 1.0 g/100 g. To visualize these models, measured data vs. validated data were plotted (Figure 5). The correlations between the spectral data and the Total Flavonoid content (R2val = 0.95, Figure 5A), the Total Anthocyanin content (R2val = 0.95, Figure 5B), and the TSS (R2val = 0.93, Figure 5C) were good, with points concentrated on the line y = x and a relatively narrow scattering of the data, showing the low error of the models.
As in Figure 3, the bimodal effect for TF and TSS was observed in Figure 5. This is to be expected, since reducing the number of wavelengths in the model should not lead to a loss of information.
Thus, our models for TF, TA, and TSS showed good quantification and good prediction potential according to their RPD values (Table 3) [49,50,51]. However, the Bias values were rather high for TA, which could be improved.
The prediction of the data from the test set also showed good results, with R² over 0.93. The RMSEP values obtained were in the range of the RMSEC and RMSEV, slightly higher for TF but lower for TA and TSS.
Although approximately 92.0% of the variables were eliminated, the MLR models performed well. Compared to the full spectra, the MLR models were better for calibration and prediction for TA, poorer for TSS, and similar for TF. The improvement observed in some cases with MLR could be attributed to the use of the optimal wavelengths only, which discards unnecessary wavelengths and mitigates the problems of collinearity and overfitting [53]. Therefore, the regression coefficient algorithm proved useful and effective for the selection of key wavelengths in predicting the TF, TA, and TSS contents of table grapes.
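The reduction in variables and the MLR step can be illustrated with a small Python sketch. The counts of selected wavelengths are those listed above, compared with the 276 wavelengths of the full-spectrum models; the least-squares solver, the reflectance values, and the function names are illustrative assumptions and not the software used in this study.

```python
def solve_linear(A, b):
    """Solve A x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(M[r][col]))
        M[col], M[pivot] = M[pivot], M[col]
        for r in range(col + 1, n):
            factor = M[r][col] / M[col][col]
            for c in range(col, n + 1):
                M[r][c] -= factor * M[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        x[r] = (M[r][n] - sum(M[r][c] * x[c] for c in range(r + 1, n))) / M[r][r]
    return x


def fit_mlr(X, y):
    """Ordinary least squares with intercept; returns [b0, b1, ..., bk]."""
    rows = [[1.0] + list(r) for r in X]
    k = len(rows[0])
    xtx = [[sum(r[a] * r[b] for r in rows) for b in range(k)] for a in range(k)]
    xty = [sum(r[a] * yi for r, yi in zip(rows, y)) for a in range(k)]
    return solve_linear(xtx, xty)


def predict_mlr(coef, X):
    """Apply the fitted coefficients to new rows of selected-wavelength values."""
    return [coef[0] + sum(c * v for c, v in zip(coef[1:], r)) for r in X]


# Fraction of variables removed when only the selected wavelengths are kept.
n_full = 276
n_selected = {"TF": 17, "TA": 8, "TSS": 23}
reduction = {name: 1 - n / n_full for name, n in n_selected.items()}

# Invented reflectances at two selected wavelengths; y = 1 + 2*x1 - 3*x2.
X = [[0.1, 0.5], [0.2, 0.4], [0.3, 0.6], [0.4, 0.2], [0.5, 0.3]]
y = [1 + 2 * a - 3 * b for a, b in X]
coef = fit_mlr(X, y)
```

With these counts, between about 92% (TSS) and 97% (TA) of the wavelengths are discarded, and the resulting model has few enough variables to be fitted by ordinary least squares without the latent-variable step needed for the full spectrum.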
3.5. Modelization of Table Grape Composition from Optimal Wavelengths Obtained by VIPs Score
The VIP scores resulting from the PLS regression model with the best pre-treatment were used to develop a robust model by selecting feature-related wavelengths for the TF, TA, and TSS of table grapes. The performance of the model developed by MLR depended largely on the cut-off value of the VIP scores. Generally, the “greater-than-one” rule is used to identify optimal wavelengths [54]. Only the wavelengths with the highest VIP scores, above the threshold of 1.0, were selected to establish the MLR models, whereas all the wavelengths with VIP scores above 1 (spectral windows) were selected to build a new PLS model. As shown in Figure 6, the optimal wavebands selected from all 283 wavebands were 10 (434.3, 543.4, 610.3, 633.5, 697.2, 781.1, 785.5, 805, 905.9, and 910.3 nm), 3 (710, 785.5, and 943.6 nm), and 8 (434.3, 501.9, 543.4, 610.3, 656.8, 686.5, 802.8, and 809.4 nm) for TF, TA, and TSS, respectively. Table 4 presents the accuracy and robustness of the MLR models for TF, TA, and TSS based on the VIP scores. The model for TF led to an R2 of 0.90 for both the calibration and validation sets, with RMSEV = 178 mg/kg. The model for TA content showed an R2 of 0.93 for the calibration set and 0.95 for the validation set, with RMSEV = 37 mg/kg. For the sugar content (TSS), the VIPs-MLR model had an R2cal of 0.86 and an R2val of 0.83, with RMSEV = 1.6 g/100 g.
The MLR models based on the VIP wavelength selection showed RPD values close to or higher than 2.5, which indicated that these models were good enough to have a high utility value [52]; the RPD was over 4 for TA, showing a good prediction potential. However, these results showed a lower validation accuracy for the TF and TSS models compared to the full-spectrum PLS and RC-MLR models. In contrast, the VIPs-MLR model for TA was much better than the others, with an RMSEV of only 37 mg/kg instead of 47 mg/kg in the case of the full spectra. Once more, to check the quality of the models, the measured data vs. validated data were plotted (Figure 7). All graphs showed that the validated data fitted the measured data. The model is particularly good for TA, with a narrower spread of the data. Again, the bimodal effect can be observed for TF and TSS. In addition, as in Figure 3, the data show that information was not lost with the reduction in wavelengths.
The prediction was good for all quality parameters (R2 > 0.86) (Table 4). The prediction errors were, however, higher for TF and TSS compared to those obtained with the full spectra and the MLR models, whereas the prediction was improved for TA.
The last trial was to select all the wavelengths with a VIP score above 1 (spectral windows). New VIPs-PLS models were then built (Table 5). The VIPs-PLS model to predict TF (spectral windows: 434.3, 539.2–543.4, 608.2–610.3, 620.8–639.8, 690.8–796.3, 829, and 835.5–943.6 nm) generated R2cal = 0.96, R2val = 0.95, and RMSEV = 122 mg/kg, using 14 LVs. The model for predicting TA content (spectral windows: 697.2–802.8 and 842.1–957 nm) used 8 LVs and generated R2cal = 0.95, R2val = 0.96, and RMSEV = 33 mg/kg. For TSS (spectral windows: 420.1, 424.1, 428.2–432.3, 436.3, 479.3–481.4, 535.1–541.3, 545.4, 555.9, 560, 564.2, 585.1–639.8, 673.8–688.7, 716.5–720.8, 864, 881.6, 890.5–892.7, 899.3, 912.5–914.8, 921.4–934.7, 939.2, and 954.8–957 nm), the VIPs-PLS model led to R2cal = 0.94, R2val = 0.89, and RMSEV = 1.3 g/100 g, using 14 LVs. The RPD values suggested that all three models were good enough to quantify and predict the corresponding TF, TA, and TSS values [55]. Figure 8 shows the measured data vs. validated data curves for these best models. Again, the models fitted the measured data well, since the data spread is rather narrow for all three parameters, suggesting good validation models from specific windows of HIS data. The simplified VIPs-PLS models showed a slight increase in the validation accuracy for TF and TA compared to the full-spectrum PLS models, in terms of determination coefficient, RMSE, and RPD values. However, the best validation model for TSS was built using the whole spectral data.
Concerning the prediction ability of the VIPs-PLS models, Table 5 shows that the determination coefficients were again over 0.94, with errors in the same range as those of the calibration and validation sets. The prediction models obtained with this method using spectral windows were even better than those obtained with the full spectra, considering both R² and RMSEP, for all three quality parameters TF, TA, and TSS.
4. Discussion
The possibility to use the full spectra from HIS to generate a relevant PLS-model to predict the sugar content was indeed reported by Baiano et al. [47] using the same device. These authors developed a calibration models able to predict TSS of white and red table grape with R2val of 0.94 and 0.93, respectively. Our method was, however, valid for all grape varieties, with all reds, rosés, and whites, which would be easier to manage from an industrial point of view. In addition, the results of the present study were comparable to those of another work carried out by Gomes et al. [56], in which the prediction of TSS in wine grape was performed using two different model development techniques, i.e., PLS regression and Neural Networks. The obtained values of R2 of prediction were 0.92 for both PLS regression and Neural Networks with RMSEP of 0.94°Brix and 0.96°Brix, respectively. Hence, a good capacity of correlation was achieved in numerous other works on prediction of TSS for table and wine grapes [24,38,57,58].
Other authors have also reported good performance of linear models for predicting total anthocyanin content, with R2CV > 0.94 using spectral data in the Vis-NIR [59] and NIR ranges [60], and total phenol content, with R2CV = 0.89 using spectral data in the Vis-NIR range [57,61]. Moreover, several studies reported very good performance of nonlinear models for predicting TA content in whole Port and Cabernet Sauvignon wine grapes using hyperspectral imaging in the Vis–NIR range [38,56,62]. Thus, our results were at least as good as those of other works and, for the first time, showed the relevance of HIS for red and white table grapes.
Our results highlighted not only that hyperspectral imaging is a relevant method for assessing TA, TF, and TSS content, but also that the data can be reduced by using MLR with wavelengths selected from β-coefficients (RC method) or from variable importance in projection (VIP) scores. RC methods were already reported to be relevant for predicting sugar content in lychee fruit [55] and total polyphenol concentration in cocoa beans [60,63]. VIP selection was applied by Sen and co-workers [64] to build OPLS models for predicting chemical parameters of wine from the combined use of visible and mid-infrared (MIR) spectroscopies. These authors built models able to predict anthocyanin compounds, total phenol content, and TSS of red wine with R2val ranging between 0.77 and 0.96.
Our work is thus consistent with previous studies and showed for the first time that data reduction from HIS, using VIP scores or β-coefficients, is suitable for table grapes. No similar results have been found for table grapes regarding Total Flavonoids and Total Anthocyanins, although such results have been reported for wine grapes and other matrices with errors of the same order of magnitude [24,59,60,65].
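To make the β-coefficient (RC) selection more concrete, the sketch below keeps the wavelengths at which the absolute β-coefficient forms a local peak, optionally above a minimum magnitude. This is only one possible reading of "wavelengths with the highest β-coefficients"; the peak rule, the `min_abs` parameter, the function name `beta_peaks`, and all numbers are assumptions made for illustration.

```python
def beta_peaks(wavelengths, betas, min_abs=0.0):
    """Return wavelengths where |beta| is a local maximum.

    A band is kept when its absolute coefficient is larger than that of
    the previous band, at least as large as that of the next band, and
    not smaller than `min_abs` (a hypothetical cut-off).
    """
    magnitudes = [abs(b) for b in betas]
    selected = []
    # The first and last bands have only one neighbour and are skipped.
    for i in range(1, len(magnitudes) - 1):
        is_peak = (magnitudes[i] > magnitudes[i - 1]
                   and magnitudes[i] >= magnitudes[i + 1])
        if is_peak and magnitudes[i] >= min_abs:
            selected.append(wavelengths[i])
    return selected


# Hypothetical bands (nm) and PLS regression coefficients.
bands = [430.0, 432.1, 434.3, 436.3, 438.4, 440.5]
betas = [0.1, -0.6, 0.2, 0.9, -0.3, 0.05]
print(beta_peaks(bands, betas))               # [432.1, 436.3]
print(beta_peaks(bands, betas, min_abs=0.7))  # [436.3]
```

Working on absolute values matters because strongly negative coefficients contribute to the prediction as much as strongly positive ones.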
Turning to the other quality criterion of a calibration model, the prediction error: the measurement of TSS by refractometry had a standard deviation of up to 1.9 (Table 1), which includes the uncertainty due to the refractometer and to the heterogeneity of the berries. Using the full spectra, the PLS model gave an RMSEP of only 0.9 °Brix. Reducing the number of wavelengths by β-coefficient selection lowered it to 0.7 °Brix. For TA, the lowest RMSEP was obtained with VIPs-PLS (27 mg/kg), followed by both the full-spectrum model and VIPs-MLR from optimal wavelengths (33 mg/kg). For TF, the prediction models performed much better at high flavonoid contents. The RMSEP was 128 mg/kg with VIP spectral windows and 149 mg/kg with β-coefficients, against a standard deviation of up to 374 mg/kg for the reference method (Table 1). However, at very low flavonoid concentrations, as in the Victoria variety, the models gave a higher RMSEP.
Hyperspectral imaging is a tool that could provide relevant on-line information about Total Flavonoids, Total Anthocyanins, and Total Soluble Solids through consistent validation models. The use of SNV pre-treatment on the full spectra and the fact that the models were built from berries of seven different cultivars contributed to the robustness of our models. Being able to use the same pre-treatment for all parameters and all varieties is valuable: it could limit the complexity of the method and avoid mistakes in professional use.
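For readers unfamiliar with the SNV pre-treatment, the sketch below applies its usual definition to one spectrum: each reflectance value is centred on the mean of that spectrum and divided by its standard deviation. The use of the sample standard deviation, the function name `snv`, and the reflectance values are assumptions for illustration; the exact implementation used in this study is not described in this section.

```python
import statistics


def snv(spectrum):
    """Standard Normal Variate: centre and scale a single spectrum.

    Each value becomes (value - mean) / SD, where the mean and the sample
    standard deviation are computed over the bands of that spectrum only.
    """
    mean = statistics.fmean(spectrum)
    sd = statistics.stdev(spectrum)
    return [(value - mean) / sd for value in spectrum]


# Hypothetical reflectance values for one berry.
raw = [0.20, 0.35, 0.50, 0.42, 0.30]
# Same berry with a multiplicative scatter effect and a baseline offset.
scattered = [1.5 * value + 0.1 for value in raw]

print([round(v, 3) for v in snv(raw)])
print([round(v, 3) for v in snv(scattered)])
```

Both printed spectra coincide, which illustrates why a single SNV step can be applied to all cultivars: additive and multiplicative differences between individual spectra are removed before modelling.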
Reducing the data, either by keeping only the wavelengths with the highest absolute β-coefficients or by keeping spectral windows made of all wavelengths with VIP > 1, would allow industrial use with less data storage and faster responses. The method could also be used for quality control. The database first has to be expanded, both to strengthen our current models and to test new non-linear models. Another step would be to implement hyperspectral imaging on an industrial conveyor belt, in order to take into account factors such as conveyor vibration and the analytical speed needed to provide real-time information. Moreover, in an on-line setting, spatially localized information could be used to separate the berries of a batch into two or more final batches intended for different processing or different quality grades, depending on the average spectrum of each berry and thus on its composition. Even without these developments, the tool could already be used for rapid characterization of table grapes at producer or industrial sites.
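The on-line sorting idea can be sketched as follows, assuming an MLR model restricted to a few selected wavelengths and a single decision threshold. Everything in this example is hypothetical: the coefficients, the intercept, the threshold, the berry spectra, and the function names do not come from the models reported here.

```python
def predict_mlr(reflectance, coefficients, intercept):
    """Predict a quality value from reflectance at the selected wavelengths."""
    return intercept + sum(
        coefficient * reflectance[wavelength]
        for wavelength, coefficient in coefficients.items()
    )


def sort_berries(berries, coefficients, intercept, threshold):
    """Split berries into two batches according to their predicted value."""
    batches = {"high": [], "low": []}
    for berry_id, reflectance in berries.items():
        value = predict_mlr(reflectance, coefficients, intercept)
        batches["high" if value >= threshold else "low"].append(berry_id)
    return batches


# Hypothetical MLR model using two wavelengths (nm).
coefficients = {543.4: -20.0, 943.6: 10.0}
intercept = 15.0

# Hypothetical average reflectance per berry at those wavelengths.
berries = {
    "b1": {543.4: 0.10, 943.6: 0.50},
    "b2": {543.4: 0.30, 943.6: 0.20},
}

print(sort_berries(berries, coefficients, intercept, threshold=15.0))
# {'high': ['b1'], 'low': ['b2']}
```

Only the reflectance at the selected wavelengths has to be stored and processed for each berry, which is where the expected savings in memory and response time would come from.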
Author Contributions
Conceptualization, M.G.; methodology, M.G. and P.P.; validation, P.P.; formal analysis, M.G.; investigation, V.L.-V., M.G., and C.M.; resources, V.L.-V.; data curation, M.G.; writing—original draft preparation, M.G. and C.M.; writing—review and editing, C.M., M.G., and P.P.; visualization, C.M.; supervision, C.M.; project administration, C.M.; funding acquisition, C.M. All authors have read and agreed to the published version of the manuscript.
Funding
This research was conducted in the framework of the regional program “Objectif Végétal, Research, Education and Innovation in Pays de la Loire,” supported by the French Region Pays de la Loire, Angers Loire Métropole, and the European Regional Development Fund. The chemometric analysis was carried out thanks to Region Pays de la Loire and Interprofession des Vins du Val de Loire InterLoire in the frame of the O3VINS project.
Institutional Review Board Statement
Not applicable.
Informed Consent Statement
Not applicable.
Data Availability Statement
Not applicable.
Acknowledgments
The authors thank Dominique Le Meurlay for logistical support and Susana Pombo for checking the English language.
Conflicts of Interest
The authors declare no conflict of interest.
Publisher’s Note: MDPI stays neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Figures and Tables
Figure 1. Hyperspectral imaging system: (a) a charge-coupled device (CCD) camera, (b) a spectrograph with a standard C-mount zoom lens, (c) a quartz tungsten halogen (QTH) lighting unit, (d) a translation stage, and (e) a PC with image acquisition software.
Figure 2. Mean reflectance spectra obtained by hyperspectral imaging spectroscopy for the table grape samples: Sable Seedless (dark blue), Seedless Sugarone (red), Alphonse Lavallée (green), Thompson Seedless (purple), Lival (turquoise), Black Magic (orange), and Victoria (light blue).
Figure 3. Performance of partial least squares (PLS) models using the spectral data pre-treated by SNV. Calibration and external validation for Total Flavonoids using 9 factors (A), for Total Anthocyanins using 3 factors (B), and for Total Soluble Solids using 10 factors (C). Blue dots indicate the calibration set and red dots the validation set. R2: determination coefficient, RMSE: root mean squared error.
Figure 4. Values of β-coefficients for all wavelengths for predicting the quality attributes of table grapes: Total Flavonoids (A), Total Anthocyanins (B), and Total Soluble Solids (TSS) (C).
Figure 5. Performance of Multiple Linear Regression (MLR) models using only the optimal wavelengths extracted from β-coefficients of PLS analysis. Calibration and external validation using SNV pre-treated data: (A) for Total Flavonoids, (B) for Total Anthocyanins, and (C) for TSS. Blue dots indicate calibration set and black dots represent validation set.
Figure 6. Values of Variable Importance in Projection (VIP) scores for all wavelengths for predicting the quality attributes of table grapes: Total Flavonoids (A), Total Anthocyanins (B), and Total Soluble Solids (TSS) (C).
Figure 7. Performance of MLR model using the optimal wavelengths extracted from VIPs score of the best full-spectrum PLS analysis. Calibration and external validation from SNV pre-treated data for (A) Total Flavonoids, (B) Total Anthocyanins, and (C) Total Soluble Solids. Blue dots indicate calibration set and red dots represent validation set.
Figure 8. Performance of PLS model using the optimal wavelength windows extracted from VIPs score (>1) of PLS analysis. Calibration and external validation for (A) Total Flavonoids, (B) Total Anthocyanins, and (C) Total Soluble Solids. Blue dots indicate calibration set and red dots represent validation set.
Table 1. Grape composition: Total Flavonoids (TF), Total Anthocyanins (TA), and Total Soluble Solids (TSS) of table grapes. Different letters within the same column indicate significant differences among table grape cultivars according to the Tukey-b test (p < 0.05). FW: fresh weight.
Grape Cultivars | Origin | TF | TA | TSS |
---|---|---|---|---|
Sable Seedless | South Africa | 1131 ± 267 c | 590 ± 163 a | 19.0 ± 1.8 b |
Alphonse Lavallée | South Africa | 829 ± 153 d | 217 ± 61 c | 24.8 ± 1.1 a |
Lival | France | 1642 ± 374 a | 588 ± 222 a | 15.0 ± 1.7 cd |
Black Magic | Italy | 1279 ± 259 b | 399 ± 132 b | 15.4 ± 0.9 c |
Sugarone Superior Seedless | South Africa | 162 ± 43 e | 0 | 14.7 ± 1.0 d |
Thompson Seedless | Egypt | 826 ± 136 d | 0 | 15.5 ± 1.9 c |
Victoria | Italy | 201 ± 28 e | 0 | 14.0 ± 1.5 e |
Significance | | p < 0.001 | p < 0.001 | p < 0.001 |
Table 2. Performance of partial least squares (PLS) models depending on data pre-treatment for predicting TF, TA, and TSS, using full spectra (400–1000 nm). TF: Total Flavonoids; TA: Total Anthocyanins; TSS: Total Soluble Solids; WD: white and black correction; DER: derivative; SNV: Standard Normal Variate; LVs: number of latent variables.
Variable | Pre-Treatment | LVs | Calibration R2c | Calibration RMSEC | Validation R2val | Validation RMSEV | Validation Bias | Validation RPD | Prediction R2pr | Prediction RMSEP |
---|---|---|---|---|---|---|---|---|---|---|
TF | SNV | 9 | 0.94 | 146 | 0.93 | 141 | −9.45 | 3.90 | 0.92 | 159 |
TF | 1st DER | 9 | 0.95 | 128 | 0.93 | 148 | 5.2 | 3.70 | 0.96 | 120 |
TF | WD | 12 | 0.94 | 134 | 0.94 | 132 | −0.13 | 4.16 | 0.95 | 130 |
TF | 2nd DER | 5 | 0.93 | 149 | 0.89 | 183 | 13.0 | 3.01 | 0.89 | 196 |
TA | SNV | 3 | 0.93 | 59 | 0.95 | 47 | 6.7 | 4.61 | 0.98 | 33 |
TA | 1st DER | 4 | 0.93 | 61 | 0.92 | 56 | 6.8 | 3.87 | 0.97 | 39 |
TA | WD | 6 | 0.91 | 65 | 0.93 | 56 | 5.0 | 3.90 | 0.97 | 41 |
TA | 2nd DER | 4 | 0.90 | 70 | 0.91 | 65 | 14.0 | 3.32 | 0.96 | 50 |
TSS | SNV | 10 | 0.94 | 1.0 | 0.91 | 1.1 | −0.05 | 3.45 | 0.95 | 0.9 |
TSS | 1st DER | 6 | 0.93 | 1.0 | 0.91 | 1.2 | −0.07 | 3.33 | 0.93 | 1.1 |
TSS | WD | 15 | 0.96 | 0.8 | 0.94 | 0.9 | 0.01 | 4.17 | 0.96 | 0.8 |
TSS | 2nd DER | 5 | 0.92 | 1.1 | 0.88 | 1.4 | −0.01 | 2.90 | 0.92 | 1.1 |
Table 3. Multiple Linear Regression (MLR) model performance for Total Flavonoids (TF), Total Anthocyanins (TA), and Total Soluble Solids (TSS) from optimal wavelength selection based on the β-coefficients of the best full-spectrum PLS analysis.
Variable | Optimal Wavelengths (nm) | Calibration R2c | Calibration RMSEC | Validation R2val | Validation RMSEV | Validation Bias | Validation RPD | Prediction R2pr | Prediction RMSEP |
---|---|---|---|---|---|---|---|---|---|
TF | 434.3, 485.5, 501.9, 543.4, 608.2, 631.4, 648.3, 675.9, 688.7, 707.9, 779, 792, 805, 807.2, 829, 905, 945.9 | 0.94 | 136 | 0.95 | 128 | 0.9 | 4.27 | 0.93 | 149 |
TA | 434.3, 543.4, 604, 616.6, 669.5, 796.3, 943.6, 952.5 | 0.93 | 55 | 0.95 | 48 | 4.5 | 4.51 | 0.97 | 39 |
TSS | 418, 434.3, 485, 501.9, 539.2, 543.4, 585.1, 646.2, 661, 678, 697.2, 716.5, 792, 802, 805, 807.2, 829, 833, 905.9, 910.3, 939.2, 945.9, 952.5 | 0.95 | 0.9 | 0.93 | 1.0 | −0.06 | 3.82 | 0.97 | 0.7 |
Table 4. Performance of MLR models for predicting Total Flavonoids (TF), Total Anthocyanins (TA), and Total Soluble Solids (TSS) using the optimal wavelengths extracted from the VIPs of the best full-spectrum PLS analysis.
Variable | Optimal Wavelengths (nm) | Calibration R2c | Calibration RMSEC | Validation R2val | Validation RMSEV | Validation Bias | Validation RPD | Prediction R2pr | Prediction RMSEP |
---|---|---|---|---|---|---|---|---|---|
TF | 434.3, 543.4, 610.3, 633.5, 697.2, 781.1, 785.5, 805, 905.9, 910.3 | 0.90 | 178 | 0.90 | 178 | −11.3 | 3.09 | 0.93 | 155 |
TA | 710, 785.5, 943.6 | 0.93 | 44 | 0.95 | 37 | 5.6 | 5.90 | 0.98 | 33 |
TSS | 434.3, 501.9, 543.4, 610.3, 656.8, 686.5, 802.8, 809.4 | 0.86 | 1.5 | 0.83 | 1.6 | −0.06 | 2.46 | 0.86 | 1.4 |
Table 5. Performance of Variable Importance in Projection (VIPs)-PLS models for predicting Total Flavonoids (TF), Total Anthocyanins (TA), and Total Soluble Solids (TSS) using only the optimal wavelength windows extracted from the VIPs of the full-spectrum PLS analysis.
Variable | Spectral Windows (nm) | LVs | Calibration R2cal | Calibration RMSEC | Validation R2val | Validation RMSEV | Validation Bias | Validation RPD | Prediction R2pr | Prediction RMSEP |
---|---|---|---|---|---|---|---|---|---|---|
TF | 434.3, 539.2–543.4, 608.2–610.3, 620.8–639.8, 690.8–796.3, 829, 835.5–943.6 | 14 | 0.96 | 120 | 0.95 | 122 | 12.0 | 4.50 | 0.95 | 128 |
TA | 697.2–802.8 and 842.1–957 | 8 | 0.95 | 38 | 0.96 | 33 | 1.5 | 6.50 | 0.99 | 27 |
TSS | 420.1, 424.1, 428.2–432.3, 436.3, 479.3–481.4, 535.1–541.3, 545.4, 555.9, 560, 564.2, 585.1–639.8, 673.8–688.7, 716.5–720.8, 864, 881.6, 890.5–892.7, 899.3, 912.5–914.8, 921.4–934.7, 939.2, 954.8–957 | 14 | 0.94 | 1.0 | 0.89 | 1.3 | −0.01 | 3.00 | 0.94 | 1.0 |
© 2021 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (https://creativecommons.org/licenses/by/4.0/).
Abstract
Table grape quality is important for consumers and thus for producers. Objective quality is usually determined by destructive methods based mainly on sugar content. This study evaluated the potential of hyperspectral imaging to characterize table grape quality through its sugar (TSS), total flavonoid (TF), and total anthocyanin (TA) contents. Different data pre-treatments (WD, SNV, and 1st and 2nd derivatives) and different methods were tested to obtain the best prediction models: PLS on the full spectra, followed by Multiple Linear Regression (MLR) after selection of the optimal wavelengths from the regression coefficients (β-coefficients) and the Variable Importance in Projection (VIP) scores. All models performed well, showing that hyperspectral imaging is a relevant method for predicting sugar, total flavonoid, and total anthocyanin contents. The best predictions were obtained from optimal wavelength selection based on β-coefficients for TSS and from VIP optimal wavelength windows using SNV pre-treatment for total flavonoid and total anthocyanin contents. Thus, good prediction models were proposed to characterize grapes while reducing the data sets and limiting data storage, to enable industrial use.
1 USC 1422 GRAPPE, INRAE, Ecole Supérieure d’Agricultures, SFR 4207 QUASAV, 55 Rue Rabelais, BP 30748, 49007 Angers CEDEX 01, France;
2 USC 1422 GRAPPE, INRAE, Ecole Supérieure d’Agricultures, SFR 4207 QUASAV, 55 Rue Rabelais, BP 30748, 49007 Angers CEDEX 01, France;