1 Introduction
A key challenge in hydrology is estimating past, present, and future hydrological conditions in rivers around the world. This is largely due to severe temporal and spatial gaps in the global river discharge observing network. In many parts of the world, there are simply not enough long-term river discharge observations at high enough spatial density, and in the vast majority of countries hydrometric data are not available in real time (Lavers et al., 2019). The lack of observations is therefore a major barrier to our ability to provide monitoring and early warning of hydrological extremes such as floods and droughts, which have, for example, implications for progressing international disaster risk reduction (UNDRR, 2015). A way forward pioneered in the field of meteorology and climate has been to optimally combine in situ and satellite earth system observations, together with advanced numerical weather prediction (NWP) models, to generate a reanalysis of land, ocean, and atmospheric variables of interest, thus providing consistent spatio-temporal “maps without gaps” (Hersbach et al., 2020). Several global hydrological products have been developed that provide estimates of runoff or river discharge with a wide range of forcing and methodological approaches (e.g. Fekete et al., 2002; Döll et al., 2003; Qian et al., 2006; Sperna Weiland et al., 2010; Reichle et al., 2011; Yamazaki et al., 2011; Beck et al., 2017; Ghiggi et al., 2019; Lin et al., 2019). While these datasets can be used to understand past variability and change in the terrestrial hydrological cycle, they are currently not produced in an operational environment in near real time and so cannot be used for monitoring current global river conditions or providing initial conditions to hydrometeorological forecasting systems.
A long-term and near-real-time river discharge reanalysis is produced operationally as part of the Global Flood Awareness System (GloFAS;
In GloFAS, ensemble river discharge forecasts are produced each day at a daily time step and provide probabilities of flood thresholds being exceeded for a given river section with a lead time of up to 30 d ahead (GloFAS 30 d; Alfieri et al., 2013). There is also a seasonal component, GloFAS-Seasonal (Emerton et al., 2018), that provides forecasts once per month at a weekly time step with a lead time of up to 4 months ahead. The river discharge reanalysis is used for two core tasks within GloFAS. First, flood thresholds at 2-, 5-, and 20-year return periods for each river cell are derived from the long-term reanalysis series. This allows the magnitude of the real-time ensemble river discharge forecasts to be directly compared to the magnitude of the long-term flood thresholds, thus raising awareness of a flood signal if a threshold is exceeded. Second, it provides the basis to derive initial hydrometeorological conditions for both GloFAS 30 d and GloFAS-Seasonal real-time forecasts. Estimating initial conditions is a key step to determine the current status of soil moisture, groundwater, snow cover, and initial state of water within rivers and other waterbodies, and it has been identified as one of the major challenges in continental- and global-scale flood forecasting given the limited availability of observational data at these scales (Emerton et al., 2016).
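To illustrate how return-period thresholds can be derived from a long daily series, the following is a minimal sketch in Python. The text above does not state the fitting method, so the sketch assumes a Gumbel distribution fitted by the method of moments to annual maximum discharge; the function names `annual_maxima` and `gumbel_return_levels` are hypothetical and the operational GloFAS procedure may differ.

```python
import math
import numpy as np

EULER_GAMMA = 0.5772156649015329  # Euler-Mascheroni constant


def annual_maxima(daily_discharge, years):
    """Largest daily discharge in each calendar year (one value per year)."""
    q = np.asarray(daily_discharge, dtype=float)
    y = np.asarray(years)
    return np.array([q[y == year].max() for year in np.unique(y)])


def gumbel_return_levels(maxima, return_periods=(2, 5, 20)):
    """Discharge thresholds for the given return periods (in years).

    Assumption: Gumbel distribution fitted by the method of moments.
    """
    x = np.asarray(maxima, dtype=float)
    scale = x.std(ddof=1) * math.sqrt(6.0) / math.pi
    loc = x.mean() - EULER_GAMMA * scale
    levels = {}
    for period in return_periods:
        p_non_exceedance = 1.0 - 1.0 / period
        levels[period] = loc - scale * math.log(-math.log(p_non_exceedance))
    return levels
```

A forecast value for a river cell would then be compared against `levels[2]`, `levels[5]`, and `levels[20]` for that cell to flag a potential flood signal.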
The aim of this data paper is to describe the newly produced operational river discharge reanalysis dataset as part of the launch of GloFAS v2.1 on 5 November 2019 (see GloFAS technical documentation for details on upgrades:
Section 2 outlines the production of the dataset and Sect. 3 describes its main attributes including available variables and file format. An evaluation of the dataset against a global network of observations is conducted in Sect. 4. The dissemination of the data through the CDS is shown in Sect. 5 before key conclusions and future work are offered in Sect. 6.
2 Data production
Pappenberger et al. (2010) first demonstrated that it was possible to achieve useful river discharge predictions by coupling a river routing scheme with the land surface model of the ECMWF global numerical weather prediction (NWP) system. The GloFAS-ERA5 river discharge reanalysis uses this concept and is produced by coupling the land surface model runoff component of the ECMWF ERA5 global reanalysis (Hersbach et al., 2020) with the LISFLOOD hydrological and channel routing model (van der Knijff et al., 2010). In ERA5, the runoff (m d⁻¹) from one cell is not connected to neighbouring cells; hence, it is not possible to estimate river discharge (m³ s⁻¹) at the catchment scale. Coupling ERA5 runoff with LISFLOOD allows for the lateral connectivity of grid cells with runoff routed through the river channel to produce river discharge. A schematic of the key components in the production of the GloFAS-ERA5 reanalysis is provided in Fig. 1. The open-access scientific publications and model documentation that describe the full methodological detail for each key component are listed in Table 1 and summarized below.
Figure 1
A schematic of the key components in the production of GloFAS-ERA5 v2.1 river discharge reanalysis dataset.
[Figure omitted. See PDF]
Table 1
Scientific papers and model documentation for the key components in the production of the GloFAS-ERA5 v2.1 river discharge reanalysis dataset.
GloFAS-ERA5 component | Description | Reference |
---|---|---|
ERA5 | Global reanalysis dataset using ECMWF Integrated Forecasting System (IFS) model cycle 41r2 from 1979 to present | Hersbach et al. (2020) |
ERA5 runoff | Surface and sub-surface runoff within ERA5 generated using the HTESSEL land surface model | Balsamo et al. (2009) |
LISFLOOD river discharge | River discharge generated using LISFLOOD hydrological and channel routing model to route runoff into and through the river network and provide groundwater storage, including lake, reservoir, and human water use routines | Burek et al. (2013) |
Lakes and reservoirs used in GloFAS | Incorporated 463 lakes and 667 reservoirs into the GloFAS river network | Zajac et al. (2017) |
Calibration of LISFLOOD used in GloFAS | LISFLOOD was calibrated against daily river discharge from 1287 observation stations worldwide | Hirpa et al. (2018) |
ERA5 runoff is produced from the HTESSEL land surface model (Hydrology Tiled ECMWF Scheme for Surface Exchanges over Land; Balsamo et al., 2009) as used within the ECMWF Integrated Forecasting System (IFS). HTESSEL computes the surface water and energy fluxes and the temporal evolution of soil temperature, soil moisture, and snowpack. Excess precipitation and snowmelt are partitioned as surface runoff or infiltrated into a four-layer soil column (7 cm depth for top layer and then 21, 72, and 189 cm) at each ERA5 grid cell before draining from the bottom of the soil column as sub-surface runoff (Balsamo et al., 2009). ERA5 uses an advanced land data assimilation system to assimilate conventional in situ and satellite observations for land surface variables such as soil moisture, soil temperature, snow water equivalent, snow density, and snow temperature, as outlined in de Rosnay et al. (2014).
ERA5 benefits from a decade's worth of numerical weather prediction developments in model physics, numerics, and data assimilation by using ECMWF IFS model cycle 41r2 (2016) compared to model cycle 31r2 (2006) as used in its predecessor, ERA-Interim (Dee et al., 2011). ERA5 has a horizontal resolution of approximately 31 km at the Equator (native octahedral grid) and since January 2019 has been openly available from 1979 to present. A key novelty of ERA5 is its operational production, which makes available an intermediate timely product, ERA5T, in near real time. This allows the GloFAS-ERA5 river discharge reanalysis to be produced operationally with a latency of between 2 and 5 d behind real time.
2.2 LISFLOOD river discharge
River discharge is currently not calculated by HTESSEL. Instead, surface and sub-surface runoff from the HTESSEL land surface model are coupled with a simplified global version of LISFLOOD, a spatially distributed grid-based hydrological and channel routing model. The details of the global version of LISFLOOD used within GloFAS v2.1 and its calibration can be found in Hirpa et al. (2018) but are briefly summarized here for context. The sub-surface runoff from HTESSEL is used as input for the LISFLOOD groundwater module, which consists of two parallel linear reservoirs that store and subsequently transport water to the river channel with a time delay. The upper zone represents quick groundwater and sub-surface flow, while the lower zone represents slow groundwater flow that generates base flow. In Hirpa et al. (2018), the upper zone time constant was given a default value of 10 d with a lower (upper) bound of 3 d (40 d) during calibration, and the lower zone time constant was given a default value of 200 d with a lower (upper) bound of 40 d (500 d). The surface runoff from HTESSEL is used as input for the LISFLOOD river channel routing module. This is a two-stage process whereby the surface runoff for each cell is first routed to the nearest downstream river channel cell, then the water in the channel is routed through the river network using the kinematic wave approach. Groundwater and river routing parameters in GloFAS were calibrated against daily river discharge observations for 1287 catchments globally by Hirpa et al. (2018). A key feature of LISFLOOD is the ability to represent features that can severely alter the timing and magnitude of river discharge, such as lakes, reservoirs, and human water use (Burek et al., 2013). A total of 463 of the largest lakes (surface area km²) and 667 of the largest reservoirs were incorporated into the GloFAS river network by Zajac et al. (2017).
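The behaviour of two parallel linear reservoirs can be sketched as follows. This is an illustrative Python example only, not the LISFLOOD implementation (see Burek et al., 2013, for that). It assumes an explicit daily time step and a hypothetical fixed fraction, `split_to_upper`, that divides the sub-surface runoff between the two zones; only the default time constants of 10 d and 200 d come from the text above.

```python
def step_linear_reservoir(storage, inflow, time_constant_d):
    """Advance a linear reservoir by one day.

    Outflow is proportional to the storage held at the start of the day
    (outflow = storage / time constant); inflow is added afterwards.
    """
    outflow = storage / time_constant_d
    return storage + inflow - outflow, outflow


def simulate_groundwater(subsurface_runoff, split_to_upper=0.5,
                         t_upper_d=10.0, t_lower_d=200.0):
    """Route daily sub-surface runoff (e.g. mm per day) through two zones.

    split_to_upper is a hypothetical parameter used only for this sketch.
    Returns the daily outflow to the channel and the final storages.
    """
    upper = 0.0
    lower = 0.0
    outflows = []
    for runoff in subsurface_runoff:
        upper, out_upper = step_linear_reservoir(
            upper, runoff * split_to_upper, t_upper_d)
        lower, out_lower = step_linear_reservoir(
            lower, runoff * (1.0 - split_to_upper), t_lower_d)
        outflows.append(out_upper + out_lower)
    return outflows, upper, lower
```

With a single pulse of input, the upper zone releases most of its share within a few weeks while the lower zone drains over many months, which is the quick-flow and base-flow separation described above.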
To generate the GloFAS-ERA5 river discharge reanalysis, the LISFLOOD model is forced with daily HTESSEL surface and sub-surface runoff from ERA5 starting from 1 January 1979 (Fig. 1). In order to be consistent with the operational GloFAS procedure, the runoff fields from ERA5 were downscaled using the simple nearest neighbour method from the native ERA5 grid to the 0.1° GloFAS grid. To avoid the need for very long spin-up periods, LISFLOOD calculates a steady-state storage amount for the lower groundwater zone during a long-term “pre-run” and thus reduces the lower zone's spin-up time (Burek et al., 2013). LISFLOOD was therefore given a 1-year model spin-up using preliminary ERA5 output for 1978. To produce the GloFAS-ERA5 reanalysis in near real time operationally, the latest available ERA5T data are used.
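Nearest neighbour downscaling can be sketched as below. This Python example is a simplified illustration that assumes the runoff field has already been placed on a regular latitude/longitude grid (the native ERA5 grid is octahedral, which this sketch does not handle) and ignores longitude wrap-around; the function names are hypothetical.

```python
import numpy as np


def nearest_index(src_coords, tgt_coords):
    """Index of the nearest source coordinate for each target coordinate."""
    src = np.asarray(src_coords, dtype=float)
    tgt = np.asarray(tgt_coords, dtype=float)
    return np.abs(src[:, None] - tgt[None, :]).argmin(axis=0)


def regrid_nearest(field, src_lat, src_lon, tgt_lat, tgt_lon):
    """Copy each target cell's value from the nearest source cell.

    field has shape (len(src_lat), len(src_lon)); no interpolation is done,
    so every fine cell inherits the runoff of the coarse cell it falls in.
    """
    ilat = nearest_index(src_lat, tgt_lat)
    ilon = nearest_index(src_lon, tgt_lon)
    return np.asarray(field)[np.ix_(ilat, ilon)]
```

Because values are copied rather than interpolated, the method adds no spatial detail; it only places the coarse runoff on the finer routing grid.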
3 Data description
The key attributes of the current operational version (v2.1) of the GloFAS-ERA5 river discharge reanalysis dataset are shown in Table 2. The daily reanalysis is global in coverage, except for Antarctica, with a horizontal grid resolution of 0.1° (approximately 11 km at the Equator). The dataset is over 40 years long, starting on 1 January 1979. An innovative aspect of the dataset is its operational production, allowing it to be available 2 to 5 d behind real time, shortly after ERA5T becomes available. The intermediate ERA5T data are not quality assured due to their timely nature. Consequently, there will be two reanalysis streams available: GloFAS (consolidated) is the final product based on the consolidated ERA5 from 1 January 1979 until 2 to 3 months behind real time, updated on the CDS on a monthly basis; and GloFAST (intermediate) is the timely product based on the intermediate ERA5T from 1 August 2019 until 2 to 5 d behind real time, updated on the CDS on a daily basis whenever ERA5T becomes available.
Table 2
Summary of GloFAS-ERA5 dataset attributes in the C3S Climate Data Store.
Dataset attribute | Details |
---|---|
Horizontal coverage | Global except for Antarctica (90° N–60° S, 180° W–180° E) |
Horizontal resolution | 0.1° (approximately 11 km at the Equator) |
Spatial reference system | Latitude/longitude (WGS 84; EPSG:4326) |
Vertical resolution | Surface level for river discharge |
Temporal resolution | Daily data |
Temporal coverage | 1 January 1979 to near real time |
Availability behind real time | (i) GloFAS (consolidated): 2 to 3 months, updated on CDS monthly (final product following availability of officially released quality assured ERA5 data) (ii) GloFAST (intermediate): 2 to 5 d, updated on CDS daily (timely product following availability of non-quality assured ERA5T data) |
Update frequency | A new river discharge reanalysis will be published with every major update of the GloFAS system. The latest version will always be the version used in operations. |
File format | NetCDF |
Data type | Grid |
Data size on disk | Approximately 21.7 MB uncompressed per global NetCDF file for 1 d (full dataset currently GB uncompressed) |
Version | GloFAS-ERA5 v2.1 |
File naming convention | “CEMS_ECMWF_dis24_YYYYMMDD_glofasT_v2.1.nc” where YYYY is year, MM is month, DD is day, and T is for timely (i.e. GloFAST). The date stamp, YYYYMMDD, represents the end of the 24 h averaging period. |
Table 3
Variables available within the GloFAS-ERA5 dataset in the C3S Climate Data Store.
Variable type | Name | Units | Description |
---|---|---|---|
Primary variable | River discharge | m³ s⁻¹ | Volume rate of water flow, including sediments and chemical and biological material, in the river channel averaged over a time step through a cross section. The value is an average over a 24 h period. |
Related variable | Upstream area | m² | Static file (“upArea.nc”), upstream area for the point in the river network |
The GloFAS-ERA5 reanalysis dataset includes the variables river discharge and the upstream area for each GloFAS grid cell (Table 3). Data are stored in NetCDF format with one file per day containing the 24 h mean river discharge (00:00 UTC to 00:00 UTC). Each daily filename follows the convention “CEMS_ECMWF_dis24_YYYYMMDD_glofasT_v2.1.nc” whereby the date stamp represents the end of the 24 h averaging period. So, for example, the file “CEMS_ECMWF_dis24_20190101_glofas_v2.1.nc” contains the daily mean flow for the 24 h period 00:00 UTC 31 December 2018 to 00:00 UTC 1 January 2019. Appendix A shows the header metadata information contained within the example NetCDF file. Each daily NetCDF file for the whole globe has an uncompressed size of approximately 21.7 MB (Table 2); therefore, the estimated size of the dataset from January 1979 to October 2019 is GB.
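Because the date stamp marks the end of the averaging window rather than its start, it is easy to misalign files by one day. The following Python sketch parses a filename that follows the convention above and returns the 24 h period it covers. The function name `averaging_period` and the regular expression are illustrative choices, not part of any GloFAS tooling.

```python
import re
from datetime import datetime, timedelta, timezone

# Matches both streams: "glofas" (consolidated) and "glofasT" (timely).
FILENAME_PATTERN = re.compile(
    r"^CEMS_ECMWF_dis24_(\d{8})_glofas(T?)_v2\.1\.nc$")


def averaging_period(filename):
    """Return (start, end, stream) for a daily GloFAS-ERA5 v2.1 file.

    The date stamp is the END of the 24 h averaging period (00:00 UTC).
    """
    match = FILENAME_PATTERN.match(filename)
    if match is None:
        raise ValueError(f"unexpected filename: {filename}")
    end = datetime.strptime(match.group(1), "%Y%m%d").replace(
        tzinfo=timezone.utc)
    start = end - timedelta(days=1)
    stream = "intermediate" if match.group(2) else "consolidated"
    return start, end, stream
```

For the example file in the text, “CEMS_ECMWF_dis24_20190101_glofas_v2.1.nc”, this gives a period from 00:00 UTC 31 December 2018 to 00:00 UTC 1 January 2019 in the consolidated stream.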
Figure 2
Mean GloFAS-ERA5 daily river discharge from 1979 to 2018 for each GloFAS river grid cell with an upstream area greater than 1000 km². Darker blue river sections have larger river discharge.
[Figure omitted. See PDF]
Figure 3
Hydrograph for GloFAS-ERA5 river discharge reanalysis (blue line) from 1 January 1979 to 12 November 2019 and observations (red line), when available, for the Santa Rosa gauging station on the Teles Pires River, a sub-catchment of the Amazon, Brazil (GloFAS ID 1250; GRDC ID 3629770). Summary statistics from the evaluation of the reanalysis against observations in top right box as used in Sect. 4.
[Figure omitted. See PDF]
Figure 2 maps the mean daily river discharge from 1979 to 2018 for each GloFAS river with an upstream area greater than 1000 km², revealing the main river arteries of the world. An example hydrograph of the long-term near-real-time reanalysis against available river discharge observations is shown in Fig. 3 for the Teles Pires River in the Amazon basin, Brazil.
4 Evaluation and limitations
GloFAS-ERA5 v2.1 river discharge reanalysis was evaluated against a global network of daily river discharge observations. As part of GloFAS, a database of global hydrological observations for 2042 stations is held, consisting predominantly (i.e. %) of data from the Global Runoff Data Centre (GRDC) and supplemented by data collected through collaboration with GloFAS partners worldwide to improve spatial coverage. A number of criteria were used to select stations for the evaluation:
- at least 4 years of daily data available between 1979 and 2018 (not necessarily contiguous) (78 stations removed);
- minimum upstream area of 500 km² (4 stations removed);
- difference between the catchment area supplied by the data provider and the upstream area of the corresponding cell on the GloFAS river network within 20 % (93 stations removed);
- first-order visual quality check on observed river discharge time series to remove stations with erroneous data; for example, time series truncated above a threshold, severe inhomogeneities, or series monitoring an artificial canal instead of a river (39 stations removed);
- station with the longest record retained when multiple observation stations were matched to the same GloFAS river cell (27 stations removed).
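The selection criteria above account for the 1801 stations used in the rest of the evaluation, as the short Python sketch below confirms. The predicate `passes_quantitative_checks` is a hypothetical illustration of the three numeric criteria. It assumes that 4 years means at least 4 × 365 daily values, that the minimum-area test uses the GloFAS upstream area, and that the area difference is expressed relative to the provider's catchment area; none of these details is stated in the text.

```python
TOTAL_STATIONS = 2042

# Stations removed by each criterion, in the order listed above.
REMOVED = {
    "fewer than 4 years of daily data": 78,
    "upstream area below 500 km2": 4,
    "catchment area mismatch above 20 %": 93,
    "failed visual quality check": 39,
    "duplicate station on the same river cell": 27,
}

remaining = TOTAL_STATIONS - sum(REMOVED.values())


def passes_quantitative_checks(n_daily_values, provider_area_km2,
                               glofas_area_km2):
    """Apply the three numeric criteria to one station (assumed definitions)."""
    enough_data = n_daily_values >= 4 * 365
    large_enough = glofas_area_km2 >= 500.0
    area_error = abs(glofas_area_km2 - provider_area_km2) / provider_area_km2
    return enough_data and large_enough and area_error <= 0.20


print(remaining)  # 1801
```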
Performance at the daily scale was assessed using the modified Kling–Gupta efficiency metric (KGE′; Gupta et al., 2009; Kling et al., 2012). The KGE′ is gaining popularity as the standard performance metric in hydrology (e.g. Beck et al., 2017; Harrigan et al., 2018; Lin et al., 2019) and can be decomposed into three components important for assessing hydrological dynamics: temporal errors through correlation, bias errors, and variability errors:

$$\mathrm{KGE}' = 1 - \sqrt{(r - 1)^2 + (\beta - 1)^2 + (\gamma - 1)^2} \tag{1}$$

$$\beta = \frac{\mu_s}{\mu_o} \tag{2}$$

$$\gamma = \frac{\sigma_s / \mu_s}{\sigma_o / \mu_o} \tag{3}$$

where $r$ is the Pearson correlation coefficient between reanalysis simulations ($s$) and observations ($o$), $\beta$ is the bias ratio, $\gamma$ is the variability ratio, $\mu$ is the mean discharge, and $\sigma$ is the discharge standard deviation. The KGE′ and its three decomposed components (correlation, bias ratio, and variability ratio) are all dimensionless with an optimum value of 1. In order to evaluate the hydrological simulation skill of GloFAS-ERA5 reanalysis, its performance is compared against a simpler benchmark. Here the observed mean flow is used as a benchmark as proposed by Knoben et al. (2019). This is not a difficult benchmark to beat but should arguably be the minimum reference for any hydrological system to be compared against. Here we represent KGE′ as a skill score, KGESS, to evaluate the performance of GloFAS-ERA5 river discharge reanalysis against the mean flow benchmark simulation, given as

$$\mathrm{KGESS} = \frac{\mathrm{KGE}'_{\mathrm{reanalysis}} - \mathrm{KGE}'_{\mathrm{bench}}}{\mathrm{KGE}'_{\mathrm{perf}} - \mathrm{KGE}'_{\mathrm{bench}}} \tag{4}$$

where $\mathrm{KGE}'_{\mathrm{reanalysis}}$ is the KGE′ value for the GloFAS-ERA5 reanalysis against observations, $\mathrm{KGE}'_{\mathrm{bench}}$ is the KGE′ value for the observed mean flow benchmark against observations (i.e. KGE′ = −0.41 from Knoben et al., 2019), and $\mathrm{KGE}'_{\mathrm{perf}}$ is the value of KGE′ for a perfect simulation, which is 1. A KGESS = 0 means the GloFAS-ERA5 reanalysis is no better than the mean flow benchmark and so has no skill, KGESS > 0 means the reanalysis is considered skilful, and KGESS < 0 means the performance is worse than the benchmark and so has negative skill. Performance metrics for all 1801 stations are included in Table S1.
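The modified Kling–Gupta efficiency and its skill score can be computed as in the following Python sketch. It is a minimal illustration, assuming paired simulated and observed series with no missing values; the function names are hypothetical. The mean flow benchmark value is taken as 1 − √2 ≈ −0.41, the value given by Knoben et al. (2019) for a constant simulation equal to the observed mean.

```python
import numpy as np

# KGE of the observed-mean-flow benchmark (Knoben et al., 2019): 1 - sqrt(2).
KGE_MEAN_FLOW_BENCHMARK = 1.0 - np.sqrt(2.0)


def kge_prime(sim, obs):
    """Modified Kling-Gupta efficiency and its three components.

    Returns (kge, r, beta, gamma): correlation r, bias ratio beta, and
    variability ratio gamma (ratio of coefficients of variation).
    """
    sim = np.asarray(sim, dtype=float)
    obs = np.asarray(obs, dtype=float)
    r = np.corrcoef(sim, obs)[0, 1]
    beta = sim.mean() / obs.mean()
    gamma = (sim.std() / sim.mean()) / (obs.std() / obs.mean())
    kge = 1.0 - np.sqrt((r - 1.0) ** 2 + (beta - 1.0) ** 2
                        + (gamma - 1.0) ** 2)
    return kge, r, beta, gamma


def kgess(kge, benchmark=KGE_MEAN_FLOW_BENCHMARK, perfect=1.0):
    """Skill score: 0 matches the benchmark, 1 is a perfect simulation."""
    return (kge - benchmark) / (perfect - benchmark)
```

As a worked case, a simulation that is exactly twice the observations has a correlation of 1, a bias ratio of 2, and a variability ratio of 1, so its KGE′ is 0 and its KGESS is about 0.29: still better than the mean flow benchmark despite the 100 % bias.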
4.1 Overall performance
Results for overall performance show that the GloFAS-ERA5 river discharge reanalysis is skilful in 86 % of catchments (Fig. 4a). The global median KGESS (KGE′) is 0.51 (0.31) with an interquartile range (IQR) of 0.30 (0.00) to 0.66 (0.52). Performance is best in Brazil (particularly the Amazon basin), central Europe, and the eastern and western regions of the United States (Fig. 5). GloFAS-ERA5 reanalysis performance is poor (i.e. KGESS < 0) in many catchments in Africa and the North American Great Plains extending into Mexico, with notable patches in eastern Brazil, Thailand, and southern Spain. Results will be biased towards regions with a larger number of stations, especially when well-performing large basins contain many sub-catchments (e.g. Amazon and Rhine basins).
Figure 4
Cumulative distribution function (CDF) of performance metrics across all 1801 stations. Modified Kling–Gupta efficiency (KGE′) and skill score (KGESS) (a) with decomposition of KGE′ into Pearson correlation (b), bias ratio (c), and variability ratio (d). The red dot marks the optimum value for each metric.
[Figure omitted. See PDF]
Figure 5
Modified Kling–Gupta efficiency skill score (KGESS) for GloFAS-ERA5 river discharge reanalysis against 1801 observation stations. Optimum value of KGESS is 1. Blue (red) dots show catchments with positive (negative) skill.
[Figure omitted. See PDF]
4.2 Decomposition into correlation, bias, and variability
An advantage of the KGE′ is that it can be decomposed into three constituent components so that greater insights can be gained into which aspects of the GloFAS-ERA5 reanalysis are driving poor and good skill. Almost all (99 %) catchments show a positive correlation (Figs. 4b and 6a) with a global median Pearson correlation coefficient of 0.61 (IQR , 0.74). Figure 4c shows that river discharge reanalysis is negatively biased in 64 % of catchments (i.e. bias ratio < 1) with a global median bias ratio of 0.84 (IQR , 1.21). In the evaluation of their global river simulation, Lin et al. (2019) consider a percentage bias within ±20 % (equivalent to a bias ratio within 0.8 to 1.2) to be very good. Whilst only 28 % of stations meet this criterion for the GloFAS-ERA5 reanalysis, results are in line with simulations in Lin et al. (2019). The worst performing catchments (dark red KGESS dots in Fig. 5) are predominantly driven by very large positive biases (dark blue dots in Fig. 6b) in drier rivers of the central United States, Africa, and eastern Brazil, as well as the western coast of South America; in total 12 % of catchments have a bias ratio (equivalent to a percent bias %). Figure 4d (shown spatially in Fig. 6c) shows lower variability in GloFAS-ERA5 reanalysis than observations in 61 % of catchments (i.e. variability ratio < 1), but errors in variability are less severe than bias errors with a global median variability ratio of 0.91 (IQR , 1.15).
Figure 6
Decomposition of the modified Kling–Gupta efficiency (KGE′) into its three components: Pearson correlation (a), bias ratio (b), and variability ratio (c) for GloFAS-ERA5 river discharge reanalysis against 1801 observation stations. The optimum value for each of the three components is 1. Blue (red) dots represent positive (negative) values.
[Figure omitted. See PDF]
Figure 7
Mean absolute error (MAE) for GloFAS-ERA5 reanalysis against 1801 observation stations. Units for both reanalysis and observations have been converted from cubic metres per second (m³ s⁻¹) to runoff depth across the catchment area (mm d⁻¹) to allow direct comparison of the magnitude of errors. Optimum value of MAE is 0; catchments with a larger magnitude of errors are shown as darker blue dots.
[Figure omitted. See PDF]
It is important to also look at the average magnitude of errors, as a small over- or underestimation in dry rivers can produce large percentage biases (and hence bias ratios). This was done by converting the units of both the reanalysis and observation time series from cubic metres per second (m³ s⁻¹) to runoff depth across the catchment area in millimetres per day (mm d⁻¹) to allow direct comparison between catchments of different sizes and then computing the mean absolute error (MAE) metric (Fig. 7). The global median MAE is 0.41 mm d⁻¹ (IQR mm d⁻¹, 0.72 mm d⁻¹). Most areas with a bias ratio (in Fig. 6b), namely much of Africa, the central United States, and eastern Brazil, have in fact a low absolute magnitude of errors given their dry locations. Other notable areas with a low absolute magnitude of errors include large parts of India, South East Asia, and Australia. There are, however, catchments on the western coast of South America, Sudan, and Ethiopia and tributaries of the River Ganges with a large MAE.
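The unit conversion and error metric described above can be sketched in Python as follows. The conversion multiplies discharge by 86 400 s d⁻¹, divides by the catchment area in square metres, and multiplies by 1000 mm m⁻¹, which simplifies to a factor of 86.4 when the area is given in km². The function names are hypothetical, and skipping days without observations is an assumption of this sketch.

```python
import numpy as np


def discharge_to_runoff_depth(q_m3s, area_km2):
    """Convert discharge (m3/s) to runoff depth over the catchment (mm/d).

    q * 86400 s/d / (area_km2 * 1e6 m2) * 1000 mm/m = q * 86.4 / area_km2
    """
    return np.asarray(q_m3s, dtype=float) * 86.4 / area_km2


def mae_mm_per_day(sim_m3s, obs_m3s, area_km2):
    """Mean absolute error in mm/d, ignoring days with missing observations."""
    sim = discharge_to_runoff_depth(sim_m3s, area_km2)
    obs = discharge_to_runoff_depth(obs_m3s, area_km2)
    valid = ~np.isnan(obs)
    return float(np.mean(np.abs(sim[valid] - obs[valid])))
```

For example, 100 m³ s⁻¹ from a catchment of 8640 km² corresponds to a runoff depth of 1 mm d⁻¹, so an error of that size is comparable across catchments once it is expressed as a depth.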
4.3 Performance by month
Figure 8 shows the global performance of GloFAS-ERA5 reanalysis for each month across all 1801 stations. Hydrological simulation skill is relatively consistent across months, with median KGESS ranging between 0.32 and 0.41 (Fig. 8a). The April to October months have the highest skill, with November to March having a higher proportion of catchments with negative skill. When the KGE′ is decomposed into correlation, bias, and variability components at the monthly scale (Fig. 8b–d, respectively), it shows that the months with higher incidence of negative KGESS are driven by a higher proportion of catchments with large positive biases in those months. Correlation and variability error metrics do not vary much from one month to the next in comparison to bias errors.
Figure 8
Performance metrics for each month for all 1801 stations. Modified Kling–Gupta efficiency skill score (KGESS) (a) with decomposition of KGE′ into Pearson correlation (b), bias ratio (c), and variability ratio (d). Boxes represent the IQR and horizontal grey line the median. Whiskers extend to the most extreme data point unless the data point is more than 1.5 times the IQR from the box and is instead represented as an outlier (grey diamond).
[Figure omitted. See PDF]
Figure 9
As in Fig. 8 but by hemisphere: Northern Hemisphere ( stations) as brown boxes and Southern Hemisphere ( stations) as green boxes.
[Figure omitted. See PDF]
Results are grouped into Northern Hemisphere ( stations) and Southern Hemisphere ( stations) in Fig. 9. The overall GloFAS-ERA5 monthly performance in each hemisphere does not change substantially from the global analysis (Fig. 8). Nevertheless, there are some differences. The KGESS and bias ratio from the Northern Hemisphere (Fig. 9a and c, respectively) tend to follow the global analysis most strongly (i.e. Fig. 8a and c, respectively), which is not surprising given 70 % of all stations are located in the Northern Hemisphere. However, a higher proportion of Southern Hemisphere stations show large positive biases from April to June compared to November to March in the Northern Hemisphere. The largest proportion of stations with negative KGESS in the Southern Hemisphere is found from August to October (Fig. 9a). These months correspond with a lower Southern Hemisphere correlation (Fig. 9b) and a higher proportion of stations with large positive variability ratios (i.e. GloFAS-ERA5 has higher variability than observed river discharge).
4.4 Performance by catchment area
The skill of GloFAS-ERA5 river discharge reanalysis grouped into seven catchment area categories is shown in Fig. 10. In general, skill is lowest for catchments in the three categories km² with median KGESS (), 0.4 (), and 0.42 (), respectively. Performance improves as catchment size increases with median KGESS for catchments km². It must be noted that results are affected by uneven samples of catchment sizes available within the GloFAS observation database, with catchments between 10 000 and 50 000 km² being dominant () and smaller catchments being under-represented.
Figure 10
Modified Kling–Gupta efficiency skill score (KGESS) grouped into seven catchment area categories. Box and whisker descriptions are as in Fig. 8.
[Figure omitted. See PDF]
Figure 11
The GloFAS-ERA5 river discharge reanalysis landing page in the C3S Climate Data Store (CDS;
[Figure omitted. See PDF]
4.5 Limitations
This first evaluation has found the dataset to be hydrologically skilful in the vast majority of catchments tested, although the strength of skill can vary considerably depending on location. The degradation in skill, as defined using KGESS, is the combination of (lower) correlation, (larger) bias errors, and (larger) variability errors. The evaluation provides users with an overview of the global-scale quality of the dataset, although users are advised to undertake a more in-depth evaluation of the dataset for their region of interest. A key limitation of the dataset is the large biases identified in several regions (see above). The attribution of such biases in the GloFAS-ERA5 reanalysis is outside the scope of this data paper, but ongoing investigations such as Zsoter et al. (2019) have shown that biases can be introduced by the real-time land data assimilation within the HTESSEL land surface model. Another expected cause of differences between river discharge reanalysis and observations is human modification within catchments and river channels (e.g. Harrigan et al., 2014). It is estimated that just 37 % of rivers remain free-flowing globally, with the construction of reservoirs and dams the main contributor to loss of connectivity (Grill et al., 2019). While GloFAS-ERA5 reanalysis does represent major dams and reservoirs on the modelled river network, simplified reservoir operating parameters were used based on expert opinion (outlined in Zajac et al., 2017) due to lack of availability of global operational release records. Given the fundamental dependence of the dataset on ERA5, it would be pertinent for users to be aware of the known ERA5 issues, which can be found in the ERA5 documentation:
5 Data availability
The GloFAS-ERA5 river discharge reanalysis is provided through the European Commission Copernicus Emergency Management Service (CEMS) and follows the Copernicus open data policy that users shall have free, full, and open access to Copernicus service information. With the drive for open data come challenges. In the era of big data, it is clear that traditional ways of hosting and disseminating large earth system datasets are no longer fit for purpose. An exciting development in the way large climate datasets are discovered, accessed, and used is the Copernicus Climate Change Service (C3S) Climate Data Store (CDS;
The GloFAS-ERA5 river discharge reanalysis product is available on the CDS:
6 Conclusions
This paper outlines the production, description, evaluation, and access to the new GloFAS-ERA5 operational global river discharge reanalysis dataset available from 1979 and updated in near real time. This dataset is central to two key steps within GloFAS: (i) the calculation of flood thresholds against which real-time ensemble forecasts are compared to determine the probability of a flood signal and (ii) more consistent hydrometeorological initial conditions for the real-time flood and seasonal forecasts. The evaluation against observations showed that the product is skilful in 86 % of catchments according to the modified Kling–Gupta efficiency skill score against a mean flow benchmark. However, skill varies considerably with location, with several regions such as the central United States, Africa, eastern Brazil, and the western coast of South America having large systematic positive biases. The results from the evaluation are comparable with other long-term global river discharge products (e.g. Lin et al., 2019). The attribution of such biases in the GloFAS-ERA5 reanalysis is outside the scope of this data paper, but ongoing investigations such as Zsoter et al. (2019) on the biases introduced by the real-time land data assimilation within the HTESSEL land surface model will help us to better understand existing limitations. GloFAS is an operational system which undergoes constant developments with intensive research on future versions of the model. It is foreseen that a new model version will be made operational in 2021 based on the full LISFLOOD hydrological model and an improved model calibration (Alfieri et al., 2020).
The long-term and operational nature of the GloFAS-ERA5 reanalysis dataset opens avenues for further applications. Forecast evaluation activities within GloFAS now include skill assessment over longer time periods, and this has allowed a new operational forecast verification suite to be developed whereby the performance of the forecasts can be tracked in near real time for every river in the world. Other applications are envisaged for monitoring the global status of flood and drought conditions, the identification of hydroclimatic variability and change, and as raw input for post-processing and machine learning methods that can add further value.
Appendix A
Appendix B
The supplement related to this article is available online at:
Author contributions
SH drafted the paper and performed the evaluation. EZ wrote the suite to produce the dataset. CB adapted the suite to produce the dataset operationally. FW and CB were responsible for the integration of the dataset into the Climate Data Store. LA, CP, PS, HC, and FP helped frame the paper. All co-authors contributed to the editing of the paper and to the discussion and interpretation of results.
Competing interests
The authors declare that they have no conflict of interest.
Acknowledgements
We thank colleagues from the Copernicus Climate Change Service (C3S) for helping with ingesting the dataset into the Climate Data Store (CDS) and Cinzia Mazzetti (ECMWF) and Domenico Nappo (JRC) for helpful discussions during the revision regarding LISFLOOD. We also thank the providers of river discharge observations, including both GloFAS partners and the Global Runoff Data Centre (GRDC), 56068 Koblenz, Germany.
Financial support
This research has been supported by the European Commission Copernicus Emergency Management Service (CEMS) (grant no. 198702).
Review statement
This paper was edited by David Carlson and reviewed by two anonymous referees.
© 2020. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Estimating how much water is flowing through rivers at the global scale is challenging due to a lack of observations in space and time. A way forward is to optimally combine the global network of earth system observations with advanced numerical weather prediction (NWP) models to generate consistent spatio-temporal maps of land, ocean, and atmospheric variables of interest, which is known as a reanalysis. While the current generation of NWP models outputs runoff at each grid cell, these models do not currently produce river discharge at catchment scales directly and thus have limited utility in hydrological applications such as flood and drought monitoring and forecasting. This is overcome in the Global Flood Awareness System (GloFAS;
1 Forecast Department, European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, UK
2 Forecast Department, European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, UK; Department of Geography and Environmental Science, University of Reading, Reading, UK
3 Disaster Risk Management Unit, European Commission Joint Research Centre (JRC), Ispra, Italy; CIMA Research Foundation, Savona, Italy
4 Forecast Department, European Centre for Medium-Range Weather Forecasts (ECMWF), Reading, UK; Centre for Ecology and Hydrology (CEH), Wallingford, UK; Department of Geography and Environment, University of Loughborough, Loughborough, UK
5 Disaster Risk Management Unit, European Commission Joint Research Centre (JRC), Ispra, Italy
6 Department of Geography and Environmental Science, University of Reading, Reading, UK; Department of Meteorology, University of Reading, Reading, UK; Department of Earth Sciences, Uppsala University, Uppsala, Sweden; Centre of Natural Hazards and Disaster Science, CNDS, Uppsala, Sweden