Learning main drivers of crop progress and failure in Europe with interpretable machine learning
International Journal of Applied Earth Observation and Geoinformation (IF 5.933), Pub Date: 2021-10-19, DOI: 10.1016/j.jag.2021.102574
Anna Mateo-Sanchis, Maria Piles, Julia Amorós-López, Jordi Muñoz-Marí, Jose E. Adsuara, Álvaro Moreno-Martínez, Gustau Camps-Valls

A wide variety of methods now exist to address the important problem of estimating crop yields from available remote sensing and climate data. Among the different approaches, machine learning (ML) techniques are being increasingly adopted, since they can exploit all the information on crop progress and environmental conditions and their relations with crop yield, achieving reliable and accurate estimations. However, interpreting the relationships learned by the ML models, and hence gaining insight into the problem, remains a complex and usually unexplored task. Without accountability, confidence and trust in the ML models can be compromised. Here, we develop interpretable ML approaches for crop yield estimation that allow us to investigate the most informative agro-ecological drivers and influential regions learned by the models. We conduct a set of experiments to learn which selection of agro-ecological drivers leads to the best yield estimates for the main crops grown in Europe: corn, barley and wheat. As input data, we consider a variety of multi-scale Earth observation products sensitive to canopy greenness (e.g. EVI and LAI), its water-uptake dynamics (e.g. VOD) and available water (soil moisture), as well as climatic variables from the ERA5-Land reanalysis (e.g. temperature and radiation). Our results show that the best performances (R² > 0.55 for corn and R² > 0.8 for both barley and wheat) are obtained when descriptors of soil, vegetation, and atmosphere status are used as input in the ML models. This combination of variables outperforms the results obtained using single variables as inputs or all variables together. We then further analyze the relations of input features with crop yield in the developed models by means of Gaussian Process Regression (GPR). We show how the information learned by the GPR model allows us to identify atypical or anomalous crop seasons across the study region, and investigate the underlying factors behind crop progress and failure in Europe.
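
To make the last step more concrete, the following is a minimal sketch of how a GPR model with one lengthscale per input feature could be used to flag an atypical season. It is not the authors' implementation: the data are synthetic, the three input columns are hypothetical stand-ins for drivers such as EVI, VOD and temperature, and the lengthscales are fixed by hand rather than learned (in practice they would typically be fitted by maximizing the marginal likelihood). A season is flagged when its observed yield falls far outside the GPR predictive distribution.

```python
import numpy as np

def ard_rbf(A, B, lengthscales):
    """Squared-exponential kernel (unit signal variance), one lengthscale per feature."""
    A_s, B_s = A / lengthscales, B / lengthscales
    sq = (A_s**2).sum(1)[:, None] + (B_s**2).sum(1)[None, :] - 2.0 * A_s @ B_s.T
    return np.exp(-0.5 * np.maximum(sq, 0.0))

def gpr_fit_predict(X_train, y_train, X_test, lengthscales, noise_var=0.01):
    """Return the GPR predictive mean and standard deviation at X_test."""
    y_mean = y_train.mean()                      # center targets (zero-mean prior)
    K = ard_rbf(X_train, X_train, lengthscales) + noise_var * np.eye(len(X_train))
    L = np.linalg.cholesky(K)
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y_train - y_mean))
    K_s = ard_rbf(X_test, X_train, lengthscales)
    mean = K_s @ alpha + y_mean
    v = np.linalg.solve(L, K_s.T)
    var = 1.0 - (v**2).sum(0) + noise_var        # includes observation noise
    return mean, np.sqrt(var)

# Synthetic, standardized data (assumption): columns are placeholder drivers.
rng = np.random.default_rng(0)
X_train = rng.uniform(-2.0, 2.0, size=(60, 3))

def true_yield(X):
    # Toy relation: the third feature is deliberately irrelevant.
    return np.sin(X[:, 0]) + 0.5 * X[:, 1]

y_train = true_yield(X_train) + 0.05 * rng.standard_normal(60)

# Hand-picked lengthscales (assumption): a long lengthscale means the kernel
# is nearly insensitive to that feature, i.e. the feature has low relevance.
lengthscales = np.array([1.0, 2.0, 10.0])

# Two new seasons with identical conditions: one typical, one with a yield collapse.
X_new = np.array([[0.5, -0.3, 0.1], [0.5, -0.3, 0.1]])
y_new = true_yield(X_new) + np.array([0.0, -3.0])

mean, std = gpr_fit_predict(X_train, y_train, X_new, lengthscales)
z = (y_new - mean) / std          # standardized residual per season
flags = np.abs(z) > 3.0           # True marks an atypical season
print(flags)
```

The threshold of three predictive standard deviations is an arbitrary illustrative choice; a different cutoff or a different anomaly score may suit real yield data better.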