1 Introduction
The quantitative assessment of global water resources and their use helps to increase our understanding of the freshwater cycle and supports decision-making. Global hydrological modeling approaches have been developed since the 1990s, and one of the pioneers in this field is the global water resources and water use model WaterGAP (Water – Global Assessment and Prognosis) . To continue to answer relevant scientific and societal questions, such a modeling system needs to be at the cutting edge in terms of process representation and the databases used. Moreover, informative descriptions of specific model versions are required and are increasingly supplied in global hydrological modeling , especially when the models are part of model intercomparison exercises. This paper describes the changes to WaterGAP 2 (from now referred to as WaterGAP) from version 2.2d (v2.2d) to the most recent model version 2.2e (v2.2e) to present the modifications and extensions rather than a thorough description of the whole WaterGAP model. Furthermore, it provides a model evaluation against independent data for different model variants and explains its application in the Inter-Sectoral Impact Model Intercomparison Project phase 3 (ISIMIP3) framework (
WaterGAP was developed to quantify global-scale water resources, as well as water stress, with a focus on direct human impacts on the natural water cycle through human water use and artificial reservoirs. The model framework (Fig. ) consists of sectoral water use models that are linked in a submodel (GWSWUSE) to calculate potential net water abstractions from surface waterbodies and from groundwater. The computed net abstractions are an input for the WaterGAP Global Hydrology Model that calculates the water storages and fluxes and routes the streamflow to the basin outlet (Fig. ). WaterGAP, as described here, operates with a spatial resolution of 0.5° × 0.5° and at daily time steps.
Figure 1
Schematics of the WaterGAP framework and the WaterGAP Global Hydrology Model (both taken from ) and a summary of data updates, process updates, and new algorithms.
[Figure omitted. See PDF]
A model like WaterGAP is used to answer questions with numerical experiments, where the model is driven by alternative inputs, for example, climate data to quantify the impact of climate change on water resources or is run with different setups or algorithms. One extensively performed experiment is to switch off human water use and artificial reservoirs to evaluate these direct human impacts on the water cycle
The capability of WaterGAP to assess the impact of climate change on the freshwater system is limited, as is the case for most hydrological models, by not being able to simulate the response of vegetation to climate change and an increased atmospheric CO2 concentration. The simulation of vegetation responses (instead of assuming no changes in vegetation that affect evapotranspiration) may result in substantial differences in estimated climate change impacts, for example, on groundwater recharge . However, the simulation of vegetation responses is complex and uncertain, and a simplified approach is required. Applying the results of , who analyzed future evapotranspiration changes in an ensemble of global climate models, we developed an alternative method for calculating potential evapotranspiration (PET) under climate change applicable to the Priestley–Taylor PET method. This model variant can be used in an ensemble, together with the standard model, to approximate the range of uncertainty in future evapotranspiration and runoff changes.
Glaciers play a crucial role in the global water cycle but are represented in very few global hydrological models . Neglecting the dynamics of water storage in glaciers results in a missing component of the terrestrial water storage and hinders quantifying the impact of glacier mass loss on water resources and sea level rise. We had developed a glacier component (HYOGA) for a previous version of WaterGAP , which, however, is no longer state-of-the-art. Hence, to enable an optimal consideration of glacier water dynamics, it is preferable to include the output of a dedicated glacier model in a global hydrological model . This approach has been implemented in WaterGAP v2.2e but not in its standard version due to the limited temporal extent of the glacier model output.
An important indicator of water quality is water temperature, especially in a changing climate . Therefore, the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) has included river water temperature as a requested variable in its recent phase 3. Moreover, the new ISIMIP sector water quality has been formed that has identified water temperature as one of the essential elements (
An important rationale for developing a new model version is to update the input data basis to reflect the current state of the art. To optimally take into account reservoirs in WaterGAP and to be consistent with other global hydrological models participating in the model intercomparison project ISIMIP, it has been necessary to update the reservoir and regulated lake data to GRanD version 1.3 and include some additional reservoirs from other sources. In terms of non-irrigation water use data, two errors (one error in downscaling the national level to the grid cell level and one copy–paste error) appeared in WaterGAP v2.2d when creating the domestic water use time series; both were corrected in v2.2e. Furthermore, input data to temporally extend the time series for thermal electricity (from 2010 to 2017) and manufacturing water use (from 2010 to 2016) were available.
Models and their inputs are imperfect, and calibration can help to reduce the uncertainty in model output
The improvement of already implemented algorithms is another motivation for developing a new model version. Focused groundwater recharge below the surface waterbodies in (semi-)arid grid cells was a feature introduced in WaterGAP v2.2a. A modification in WaterGAP v2.2d regarding the handling of grid cells without outflow of liquid water, i.e., internal sinks, has led to unrealistically high values of groundwater recharge in these cells that are difficult to interpret in a water balance approach, especially when assessing the impact of climate change on groundwater resources. A good example is the Okavango Delta in Botswana, which is an endorheic basin with a surface waterbody. Here, approx. 95 % of the inflowing water is evaporated rather than recharging the groundwater, while the v2.2d model version computes very large and focused groundwater recharge under the delta. In addition, the modification to handle inland sinks in v2.2d just like any other grid cell has led to outputting a value for streamflow out of the inland sink, which does not reflect reality. Both issues motivate a modification of the handling of inland sinks in the model.
Data assimilation, which requires regular updating of the model states (water storages), was not possible with the standard version v2.2d, as the simulation could not be stopped at a certain point in time (e.g., 31 March 2004) and restarted to continue the computation (for 1 April 2004) with prescribed initial conditions that had been written out at the end of the previous model run. Therefore, the WaterGAP Global Hydrology Model was modified to enable a monthly restart and successfully applied in data assimilation . In addition, the restart capability is a prerequisite to applying WaterGAP in water resource monitoring and ensemble forecasts of water resources. Also, it reduces model runtimes, in particular in climate change assessments. The participation of the model in the ISIMIP3b simulation round requires model runs for different time periods (e.g., the pre-industrial period starting in the year 1601, the historical time period in the year 1850, and the future in the year 2015). With v2.2d, each run for the future time period would require a transient run with a start in 1601 to reach full consistency, especially between the time periods, leading to a high demand for computing resources and runtime. To perform the multiple-scenario evaluation for the 86 years from 2015–2100, starting in 1601 would lead to a runtime of 25 h, while the runtime would be only 4 h if the model could start with prescribed initial conditions in 2015.
To address these scientific demands, WaterGAP was updated to version v2.2e. The objective of this paper is to clearly describe the modifications and new options implemented in WaterGAP v2.2e and to evaluate the impact of the modifications on model results. The paper describes
- the removal of small reservoirs from the local lake storage compartment to achieve an improved simulation of naturalized conditions (Sect. );
- the updated database for reservoirs and regulated lakes (Sect. );
- the updated and bug-fixed non-irrigation water use data (Sect. );
- the updated streamflow observation data set used for model calibration (Sect. );
- the new handling of inland sinks (Sect. );
- the integration of an alternative approach for PET to improve climate change impact assessments (Sect. );
- the integration of outputs from a global glacier model (Sect. );
- the implementation of water temperature calculation (Sect. );
- the model restart capability (Sect. ).
2.1 Naturalized runs: small reservoirs are no longer considered in naturalized runs
In WaterGAP v2.2d, small reservoirs (< 0.5 km3 storage capacity) are simulated as local lakes, whether or not WaterGAP is run in nat (naturalized) mode. In WaterGAP v2.2e, the small reservoirs are removed from local lakes in nat runs, decreasing the grid-cell-specific area share covered by surface waterbodies that are simulated with the local lakes algorithm. In standard (ant) runs, small reservoirs continue to be treated like natural lakes. After integration of updates and new reservoirs from the Global Reservoir and Dam Database (GRanD) 1.3 (Sect. ), there are 5722 small reservoirs with a maximum storage capacity of less than 0.5 km3 in WaterGAP v2.2e. They cover a total maximum area of 31 630 km2.
2.2 Reservoir and regulated lake data: GRanD 1.3 integration
In WaterGAP, reservoirs with a storage capacity of at least 0.5 km3 are simulated as so-called global reservoirs that receive inflow from the upstream grid cell. Their dynamics are simulated with a filling and operational scheme, depending on their main use (irrigation or non-irrigation). Changes to reservoirs and new reservoirs from GRanD version 1.3, together with four additional reservoirs from a preliminary version of the GeoDAR data set, were implemented in WaterGAP v2.2e. Reservoirs with a commissioning year until 2020 were selected and mapped to the river network of WaterGAP DDM30. The location of the new reservoirs was manually co-registered in the drainage network with the help of web-based map information in order to match the given hydrological situation, particularly whether a reservoir is located on the main stream or its tributary. The total number of implemented reservoirs with a storage capacity of at least 0.5 km3 increased from 1082 in WaterGAP v2.2d to 1255 in WaterGAP v2.2e, and the number of regulated lakes increased from 85 to 88. The total maximum storage capacity of the global reservoirs sums up to 5672 km3.
Furthermore, parameters (i.e., commissioning year and assigned outflow cell) of 12 reservoirs were changed, either due to changes from GRanD 1.1 to 1.3 or to correct flawed parameterization. Multiple reservoirs and regulated lakes may have their outflow cell in the same grid cell. In such cases, they are simulated as one big reservoir or regulated lake by adding up their maximum area and storage capacity and assigning to this new waterbody the type (reservoir or regulated lake) and the commissioning year of the actual reservoir or regulated lake with the largest water storage capacity. Thus, for example, a regulated lake and a reservoir can become one reservoir in WaterGAP. Therefore, WaterGAP v2.2e explicitly simulates only a maximum of 1181 reservoirs and 86 regulated lakes (corresponding data available from ). In addition to these global reservoirs, local reservoirs with a storage capacity smaller than 0.5 km3 were updated to GRanD version 1.3 (Sect. ).
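The merging rule for waterbodies that share an outflow cell can be illustrated with a minimal Python sketch. It is not the WaterGAP implementation: the `Waterbody` class, its field names, and the example values are hypothetical and only mirror the rule stated above (sum area and capacity, inherit type and commissioning year from the member with the largest storage capacity).

```python
from collections import defaultdict
from dataclasses import dataclass


@dataclass
class Waterbody:
    name: str                # hypothetical identifier
    kind: str                # "reservoir" or "regulated_lake"
    outflow_cell: int        # hypothetical grid-cell ID
    max_area_km2: float
    capacity_km3: float
    commissioning_year: int


def merge_by_outflow_cell(waterbodies):
    """Merge all waterbodies that share an outflow cell into one waterbody."""
    groups = defaultdict(list)
    for wb in waterbodies:
        groups[wb.outflow_cell].append(wb)

    merged = []
    for cell, members in groups.items():
        # Type and commissioning year come from the largest member.
        largest = max(members, key=lambda wb: wb.capacity_km3)
        merged.append(Waterbody(
            name="+".join(wb.name for wb in members),
            kind=largest.kind,
            outflow_cell=cell,
            max_area_km2=sum(wb.max_area_km2 for wb in members),
            capacity_km3=sum(wb.capacity_km3 for wb in members),
            commissioning_year=largest.commissioning_year,
        ))
    return merged


# Illustrative values only, not taken from GRanD.
example = [
    Waterbody("A", "regulated_lake", 101, 300.0, 2.0, 1950),
    Waterbody("B", "reservoir", 101, 500.0, 6.0, 1972),
    Waterbody("C", "reservoir", 202, 80.0, 0.9, 1988),
]
merged = merge_by_outflow_cell(example)
for wb in merged:
    print(wb)
```

In this example, the regulated lake A and the reservoir B end up as a single reservoir, which is the case mentioned in the text.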
2.3 Water use data: updated non-irrigation water use data
In WaterGAP, domestic water use is calculated on a national level and then downscaled to the grid cells according to the population number per grid cell. Additional information, such as the ratio of rural to urban population per grid cell and the share of the population with access to safe water supply, is considered . In the 2.2d version, an error occurred for a few countries in the downscaling procedure because non-numerical values (i.e., not a number, NaN) were written in the input time series of the percentage of the population having access to a safe water supply. This bug was detected after the calibration of the model variants and fixed in the runs.
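The following sketch illustrates population-based downscaling with an explicit guard against the kind of NaN input that caused the v2.2d error. It is a simplified assumption, not the WaterGAP code: the function name, the dictionary-based inputs, and the weighting by population times access share are hypothetical, and the rural/urban ratio is left out.

```python
import math


def downscale_national_use(national_use, population, access_share):
    """Distribute a national water use total over grid cells.

    Assumed weighting: population x share of population with access
    to safe water supply. Fails loudly on NaN instead of silently
    propagating it into the gridded time series.
    """
    for cell, share in access_share.items():
        if math.isnan(share):
            raise ValueError(f"access share is NaN for cell {cell}")

    weights = {c: population[c] * access_share[c] for c in population}
    total = sum(weights.values())
    if total == 0:
        return {c: 0.0 for c in population}
    return {c: national_use * w / total for c, w in weights.items()}


# Illustrative values only.
population = {1: 1000, 2: 3000}
access = {1: 1.0, 2: 1.0}
gridded = downscale_national_use(8.0, population, access)
print(gridded)
```

Raising an error on NaN is one possible design choice; it surfaces faulty input before calibration rather than after.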
The sectoral water use estimates end in different years. For the years thereafter, the value of the last data year was copied. The thermal electricity estimates end in 2017 and manufacturing estimates end in 2016, whereas livestock estimates end already in 2011 (no change as compared to WaterGAP v2.2d, except that the year 2011 was correctly used for prolonging the time series instead of the year 2010, as done by accident in v2.2d) and domestic water use ends in 2010 (no temporal extension, but the bug fix is applied as described above).
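Prolonging a sectoral time series by repeating the last data year can be sketched as follows. The function name and the example values are hypothetical; the point is that the last available year is derived from the data rather than hard-coded, which avoids the kind of off-by-one-year mistake described for livestock in v2.2d.

```python
def extend_with_last_value(series, end_year):
    """Return a copy of `series` (year -> value) extended to `end_year`
    by repeating the value of the last year that has data."""
    last_year = max(series)
    extended = dict(series)
    for year in range(last_year + 1, end_year + 1):
        extended[year] = series[last_year]
    return extended


# Illustrative livestock values ending in 2011.
livestock = {2009: 1.0, 2010: 1.1, 2011: 1.2}
extended = extend_with_last_value(livestock, 2019)
print(extended[2012], extended[2019])
```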
2.3.1 Thermal electricity water use
WaterGAP estimates the amount of cooling water for thermal electricity production, namely water abstractions and consumptive use, for each power plant individually. The input data for the location and capacity of thermal power plants are obtained from the World Electric Power Plants Data (
A thermoelectric power plant is defined as a power-generating facility that uses heat to generate energy, which may be produced by burning fossil fuels, biomass, or nuclear energy. Additionally, geothermal power plants and concentrated solar power (CSP) plants, as well as other solar-related power plants that require water for cooling and cleaning of solar panels, have been incorporated into the database . Power plants that employ seawater or brackish water for cooling purposes are excluded. The time series of data on annual electricity production for different fuel types (
2.3.2 Manufacturing water use
The WaterGAP manufacturing water use model calculates the amount of water abstracted and consumed for production and cooling purposes in the manufacturing sector. A detailed model description can be found in and . The water use time series was prolonged to 2016, based on the key driving force manufacturing value added from
2.4 New calibration data set
The data set of streamflow calibration stations was updated for WaterGAP v2.2e, now comprising a total of 1509 stations compared to 1319 stations for WaterGAP v2.2d . An update was warranted as databases of streamflow observations had been updated or newly established since the last station update roughly a decade ago, and climate forcings now cover more recent years, e.g., until 2019 . As recent high-quality climate forcings are available only from 1979 onwards and require a concatenation to other less reliable climate forcings with potential offsets , the update of the calibration stations also aimed at increasing the number of streamflow observations after 1978. A detailed description of the updating process can be found in .
2.4.1 Databases
As in the case of previous WaterGAP versions, the Global Runoff Data Center (GRDC) is the main resource for streamflow gauging station data. The GRDC database includes mostly daily streamflow time series of national data providers, but not all nationally available streamflow data are included. During the last few years, additional databases of streamflow indices have been made available.
The Global Streamflow Indices and Metadata Archive (GSIM) provides indices such as monthly streamflow for 30 000 stations from national daily streamflow data that have been collected, homogenized, and enriched by metadata information. The start year for GSIM data is 1958.
The African Database of Hydrometric Indices (ADHI) provides indices including monthly streamflow for 1466 stations over the African continent, together with metadata. The start (end) year for ADHI data is 1950 (2018). While the GRDC database is continuously updated, this is not the case for GSIM and ADHI.
2.4.2 Station selection methodology
The criteria for considering a streamflow station to be suitable for the calibration of WaterGAP remain unchanged from WaterGAP v2.2d and include the following :
- an upstream area of at least 9000 km2,
- a time series of at least 4 complete but not necessarily consecutive calendar years (with a maximum of 2 missing days per month), and
- an inter-station catchment area of at least 30 000 km2.
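As a minimal sketch, the three criteria can be expressed as a filter over station metadata. The `Station` class and its field names are hypothetical, the areas are assumed to be in km2, and the inter-station area is assumed to be precomputed from the drainage network.

```python
from dataclasses import dataclass

# Thresholds from the criteria above; km2 is the assumed area unit.
MIN_UPSTREAM_AREA_KM2 = 9_000
MIN_COMPLETE_YEARS = 4
MIN_INTERSTATION_AREA_KM2 = 30_000


@dataclass
class Station:
    station_id: str
    upstream_area_km2: float
    complete_years: int            # years with at most 2 missing days per month
    interstation_area_km2: float   # area not covered by upstream stations


def is_suitable(station: Station) -> bool:
    """Check all three calibration-station criteria."""
    return (
        station.upstream_area_km2 >= MIN_UPSTREAM_AREA_KM2
        and station.complete_years >= MIN_COMPLETE_YEARS
        and station.interstation_area_km2 >= MIN_INTERSTATION_AREA_KM2
    )


# Illustrative stations only.
stations = [
    Station("S1", 40_000, 10, 35_000),   # passes all criteria
    Station("S2", 8_000, 25, 8_000),     # upstream area too small
    Station("S3", 50_000, 3, 45_000),    # too few complete years
    Station("S4", 90_000, 30, 20_000),   # inter-station area too small
]
suitable_ids = [s.station_id for s in stations if is_suitable(s)]
print(suitable_ids)
```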
The selected stations of all three data sources were plotted on the WaterGAP drainage network in order to (1) find and eliminate duplicates, which are not necessarily identified from the station metadata; (2) identify the stations that meet the inter-station catchment area criteria; and (3) re-map the station to a grid cell that fits with the drainage network. Re-mapping of the position focused on accurately relating the station either to the mainstream of the river or the tributary. A correcting factor for mismatches of drainage areas between the values provided by the station data producers and those calculated from the drainage direction map was not implemented, but both areas can be found in the shapefiles of . As only GRDC is regularly updated, this data source was preferred in the case of multiple stations with similar time series lengths in close-by grid cells. The time series of multiple stations in one grid cell were compared to further eliminate duplicates or to select the best-suited station. Where it was meaningful, time series were merged (e.g., for those cases where GSIM provides more recent years but GRDC years before 1958). Furthermore, each time series was visually inspected in order to check the plausibility of data and to delete data points in case of obvious errors.
2.4.3 Resulting calibration data set of streamflow observation
The final WaterGAP calibration data set with streamflow observations consists of 1509 JSON files with monthly streamflow observations (only for years with values for all calendar months). Data for 1252 gauging stations originated from GRDC, with 80 from ADHI and 177 from GSIM databases.
In the WaterGAP calibration, 30 complete years of streamflow data are ideally used for model calibration. Of the 1509 stations, 949 have more than 30 years of data, which requires the selection of a suitable start year for calibration. The later the global calibration start year is, the fewer stations and number of years are available for calibration (Fig. ). In the case of 1979 as the start year for calibration, which would allow us to use only the most reliable climate forcing, only 1375 out of 1509 gauging stations are available for calibration. In addition, the number of years that would be available for calibration is reduced drastically in several parts globally (Fig. ). Therefore, we decided to not constrain the calibration to periods starting in 1979 or later.
Figure 2
The number of gauging stations and years for calibration as a function of the year where the calibration starts. Both numbers decrease with a later start year of calibration, indicating that the year 1916 is the most recent year to start the calibration without losing data points according to the station/data selection criteria. Note that the axes do not start at zero.
[Figure omitted. See PDF]
Figure 3
Number of complete years usable for the calibration of model parameters in the calibration basins shown for 1916 and 1979 as calibration start years. The term “not used” refers to the case where fewer than 4 years of streamflow data are available for the case of starting the calibration in 1979, such that these basins would not be included in model calibration.
[Figure omitted. See PDF]
The preferred period for calibration was set to 1981–2010. If observation data are incomplete for this period for any gauging station, the following is done iteratively until 30 years of data are reached (not necessarily consecutive years) or until no further years are available for the station:
- go back to using 1979 as the start year;
- extend the years after 2010;
- go back, year by year, starting from 1978, until reaching 1901 as the start year.
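One possible reading of the steps above is a fixed priority order over candidate years, sketched below. The function name is hypothetical, and the interpretation that the steps are applied strictly one after another (rather than interleaved) is an assumption.

```python
def select_calibration_years(available_years, target=30):
    """Pick up to `target` calibration years from the complete years
    available at a station, following the priority order described above."""
    available = set(available_years)
    candidate_order = (
        list(range(1981, 2011))                                 # preferred 1981-2010
        + [1980, 1979]                                          # back to 1979
        + list(range(2011, max(available, default=2010) + 1))   # years after 2010
        + list(range(1978, 1900, -1))                           # back to 1901
    )
    selected = []
    for year in candidate_order:
        if len(selected) == target:
            break
        if year in available:
            selected.append(year)
    return sorted(selected)


# Station with complete years 1960-1999: 1981-1999, then 1979-1980,
# then back from 1978 until 30 years are reached.
years_a = select_calibration_years(range(1960, 2000))
print(years_a[0], years_a[-1], len(years_a))

# Station with complete years 2000-2015: fewer than 30 years exist.
years_b = select_calibration_years(range(2000, 2016))
print(years_b[0], years_b[-1], len(years_b))
```

The selected years need not be consecutive, and a station with fewer than 30 complete years simply uses all of them.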
In total, 38 543 full calendar years could be used for calibrating WaterGAP v2.2e, but due to the error described above, only 37 785 full calendar years were considered. For a total of 993 (597 due to the error) out of 1509 stations, a 30-year period was available. For 336 of these stations, the 30-year period matches the time span 1981–2010. For 854 (825 due to the error) stations, the calibration years (not necessarily 30 years) start before 1979, and out of these, 82 stations have all their calibration years before 1979. In contrast, the 1319 WaterGAP v2.2d calibration stations sum up to 31 184 years; hence, the update of the calibration data set increased the number of years by around 24 % (21 % due to the error). In terms of the calibration area, the overall process increased the calibration area by km2, whereas km2 are no longer included in the calibration area, e.g., due to suspicious data (Fig. ). This results in an increase in calibrated drainage area from 53.8 % in WaterGAP v2.2d to 55.1 % in WaterGAP v2.2e of the global land area outside Antarctica and Greenland. The average basin size (excluding any additional upstream basin area) decreased from 54 000 km2 in v2.2d to 48 300 km2 in v2.2e. The calibration basins and streamflow time series are provided in .
Figure 4
Areas considered for calibration in WaterGAP versions v2.2d and v2.2e. Blue colors indicate grid cells that are newly present as the calibration area in v2.2e due to the update of the data basis, whereas red colors show grid cells that are no longer calibrated in v2.2e in comparison to v2.2d.
[Figure omitted. See PDF]
2.5 New handling of inland sinks
Cells that represent inland sinks, i.e., cells without the outflow of liquid water, are handled like any other cell in WaterGAP v2.2d. Since WaterGAP v2.2a, focused groundwater recharge below the surface waterbodies (i.e., lakes and wetlands) is calculated in (semi-)arid grid cells. In the case of (semi-)arid inland sinks, the focused recharge can reach very high values, which limits assessment of this variable, e.g., in climate impact studies. Furthermore, it is unrealistic to provide a streamflow value for an inland sink as there is – other than an ocean outflow cell – no grid cell that could receive the streamflow generated in inland sinks.
Hence, inland sinks are handled in v2.2e as follows:
- no focused groundwater recharge below the surface waterbodies;
- surface runoff and groundwater outflow are routed to the surface waterbodies (no fractional routing);
- simulated streamflow of inland sinks is added to actual evapotranspiration in the model output, and streamflow is set to zero.
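The output rule in the last item can be sketched as a simple post-processing step. The function name and the list-based layout are hypothetical, and streamflow and actual evapotranspiration are assumed to be expressed in the same volumetric unit per cell so that they can be added.

```python
def apply_inland_sink_output_rule(streamflow, aet, is_inland_sink):
    """Move simulated streamflow of inland-sink cells into actual
    evapotranspiration and set their streamflow to zero."""
    new_streamflow, new_aet = [], []
    for q, e, sink in zip(streamflow, aet, is_inland_sink):
        if sink:
            new_aet.append(e + q)
            new_streamflow.append(0.0)
        else:
            new_aet.append(e)
            new_streamflow.append(q)
    return new_streamflow, new_aet


# Two cells; the second one is an inland sink.
q_out, aet_out = apply_inland_sink_output_rule(
    streamflow=[5.0, 2.0], aet=[1.0, 3.0], is_inland_sink=[False, True]
)
print(q_out, aet_out)
```

The sum of streamflow and evapotranspiration per cell is unchanged, so the cell water balance stays closed.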
3 New options for special model applications
3.1 Alternative PET calculation method to approximate the effect of vegetation response when estimating the impact of climate change on evapotranspiration
Potential evapotranspiration on land surfaces (PET) is determined by a combination of plant transpiration and evaporation from the canopy and the soil. As such, PET is influenced by vegetation characteristics and processes that are affected by human-induced climate change, in particular rising atmospheric CO2 concentrations. The physiological effect (with closing stomata decreasing transpiration), the structural effect (also known as the fertilization effect, which may increase canopy evaporation and transpiration), and biome shifts are three types of vegetation responses to rising atmospheric CO2 . These effects influence PET and, if not accounted for, lead to wrong estimates of the impact of climate change on evapotranspiration and water resources.
Typical hydrological models, such as WaterGAP, do not simulate the plant phenology processes leading to these effects or the interaction with the atmosphere. This significantly constrains the capacity of standard hydrological models to assess how water resources change under climate change. Given the intricacy and considerable uncertainty associated with simulating vegetation responses, recommended running hydrological models in two variants, namely one with the PET algorithm used for conditions where PET is not impacted by vegetation response to climate change (i.e., the standard PET), and the other in which this impact is approximated. Accordingly, in WaterGAP v2.2e, the Priestley–Taylor (PT) method is used in the standard model runs to calculate PET , and the Priestley–Taylor modified approach (PT-MA) is applied as the alternative PET computation method, where PT-MA considers the vegetation effect when computing the PET in a very simple and approximate way.
The PT method computes PET as a function of net radiation and temperature, where PET increases with temperature. However, analyzing evaporation changes in an ensemble of global climate models; found that under future climate change, PET change as computed with the PT method overestimates the increase in future PET, and the PET change is a function of net radiation change only. The impact of increasing temperature on PET is approximately canceled by the impact of changes in other processes that are taken into account by global climate models (GCMs) but not by typical hydrological models .
The new PET method, PT-MA, which was developed based on the results of , can be applied for estimating hydrological changes due to climate change between a reference period and a future period. A temperature reduction factor is calculated in pre-processing for each land grid cell and year in the future time period and stands for the difference between the annual mean temperature of a 20-year period centered around the year of interest and the mean annual temperature of the reference period. The model then applies this temperature reduction factor to adjust the daily temperature values in future scenarios, thus removing the long-term temperature trends. As a result, the model computes future PET by taking into account changes in net radiation only, while still varying temperatures at daily to inter-annual scales.
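A minimal sketch of the pre-processing step described above is given below. The function names are hypothetical, and two details are assumptions: the 20-year window is taken as the 10 years before the year of interest up to 9 years after it, and the window is shortened where years are missing at the edges of the record.

```python
from statistics import mean


def temperature_reduction(annual_mean_t, year, reference_mean_t, window=20):
    """Difference between the window-mean annual temperature around
    `year` and the reference-period mean (same unit as the input)."""
    half = window // 2
    years = [y for y in range(year - half, year + half) if y in annual_mean_t]
    return mean(annual_mean_t[y] for y in years) - reference_mean_t


def adjust_daily_temperature(daily_t, reduction):
    """Remove the long-term warming signal but keep daily variability."""
    return [t - reduction for t in daily_t]


# Illustrative cell with a linear warming trend of 0.05 degrees per year.
annual_mean_t = {y: 10.0 + 0.05 * (y - 2000) for y in range(2000, 2100)}
reduction_2050 = temperature_reduction(annual_mean_t, 2050, reference_mean_t=10.0)
adjusted = adjust_daily_temperature([15.0, 16.0], reduction_2050)
print(round(reduction_2050, 3), [round(t, 3) for t in adjusted])
```

The adjusted daily temperatures would then be passed to the Priestley–Taylor calculation in place of the original ones, while net radiation is left unchanged.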
The PT-MA method leads to a roughly similar effect of future anthropogenic climate change on PET, as computed by the ensemble of GCMs. Therefore, the PT-MA method is applicable as an alternative for estimating the change in hydrological variables between the reference period and a period in the future. Different from the standard WaterGAP, it does not neglect the impact of vegetation dynamics on actual evapotranspiration and thus runoff. With decreased evaporation as compared to climate change runs with the standard WaterGAP with PT, the PT-MA runs lead to less drying or more wetting than PT runs. Given the very simplified manner of considering the vegetation response to climate change, we recommend using both the PT and PT-MA model variants in an ensemble approach for estimating hydrological hazards of climate change. provide further details and a verification of this approach.
3.2 Integration of glaciers
WaterGAP v2.2d neither simulates water storage in glaciers nor water flows related to glacier dynamics. To take into account the water storage and flow dynamics of glaciers in WaterGAP, we implemented a glacier algorithm in WaterGAP v2.2e. This algorithm reads input data sets of glacier area and glacier mass change computed with the global glacier model of and of total precipitation (rainfall and snowfall) on glacier area from the atmospheric data set used to force the glacier model. These input data sets are used (1) to integrate a glacier area fraction in the grid cells where glaciers are located; (2) to calculate glacier runoff, i.e., the runoff generated from precipitation on glacier area and glacier mass change; and (3) to include a glacier water storage compartment in the hydrology model. The glacier runoff is added to the cell’s fast runoff, which partly flows directly into the river, while the rest flows into the other surface waterbodies. In the standard version of WaterGAP v2.2e, the glacier algorithm is switched off; i.e., glaciers are not included. This is because the algorithm relies on glacier-related input data sets that are currently only available from January 1948 to December 2016, whereas standard model runs require input data from 1901 onwards and up-to-date climate forcing data sets extend beyond 2016. WaterGAP v2.2e with glaciers was validated by comparing simulated global monthly terrestrial water storage anomalies to observations from an ensemble of four GRACE spherical harmonic solutions for the period January 2003 to August 2016. For more details regarding the glacier algorithm implementation and validation, we refer the reader to .
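The relation between precipitation on glacier area, glacier mass change, and glacier runoff can be illustrated with a simple mass balance. This is an assumption made for illustration, not the documented WaterGAP algorithm: runoff is taken as precipitation minus mass change, a positive mass change is taken to mean mass gain, and all quantities are assumed to share one volumetric unit per time step. The function names are hypothetical.

```python
def glacier_runoff(precip_on_glacier, mass_change):
    """Mass balance per time step: runoff = precipitation - mass change.

    Assumed sign convention: positive mass change means the glacier
    gains mass, so less water leaves as runoff.
    """
    return [p - ds for p, ds in zip(precip_on_glacier, mass_change)]


def glacier_storage(initial_storage, mass_change):
    """Accumulate mass change into a storage time series."""
    storage, series = initial_storage, []
    for ds in mass_change:
        storage += ds
        series.append(storage)
    return series


# Illustrative monthly values: accumulation, then two melt months.
precip = [10.0, 5.0, 2.0]
d_mass = [8.0, -1.0, -6.0]
runoff = glacier_runoff(precip, d_mass)
storage = glacier_storage(100.0, d_mass)
print(runoff, storage)
```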
3.3 Calculation of river water temperature
The estimation of water temperature of rivers is relevant, e.g., for the solubility of gases, the metabolic rate of aquatic flora and fauna, and the formation of ice. Furthermore, changes in water temperature have not only local but also downstream effects . Also, the return flows from thermal power plants increase river water temperature. Due to the importance of water temperature as a physical water quality indicator, the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) included river water temperature as a requested variable in its recent project phase 3. In WaterGAP v2.2e, and inspired by the approaches of and , the calculation of river water temperature is implemented. Implementation details, as well as a validation against observed river water temperature, can be found in . When comparing simulated river temperatures of WaterGAP with a regression approach of air temperature , the results are rather similar. initially compared the results of WaterGAP and the regression approach with observation data and concluded that the regression approach from air temperature often obtains higher-performance indicator values. They also showed that, e.g., the inclusion of warming due to return flows from thermal power plants improved model simulations. For assessing if the implemented approach is useful for impact assessments, further evaluation is required and will be conducted, e.g., in the newly formed water quality sector of ISIMIP.
3.4 Ability to start from prescribed initial conditions
A typical model run of WaterGAP starts with several years of initialization (e.g., 5 years) to allow the storage compartments to spin up from their initial conditions to more realistic ones. Stopping and restarting the model in a specific month was a functionality that was not required in earlier versions of WaterGAP. WaterGAP v2.2e is now able to store all states (storage compartments), parameters (such as area reduction factors), and additional information (such as days of the vegetation growing period) for a pre-defined month of a specific year. A model run can then be started from this prescribed stored initial state.
The ability to start the model from a prescribed initial condition is required, for example, for model runs for near-real-time monitoring and ensemble forecasts. This feature was used within the framework of the ISIMIP3b simulations as different scenarios for the future time period could be started from a given state of the historical time period, which reduced runtimes drastically when compared to a transient run.
Furthermore, this functionality enables the model to run a certain month; modify, e.g., storage compartments externally (assimilation of, e.g., GRACE data); and start the next month in WaterGAP. This offline coupling allows data assimilation studies, and in addition, WaterGAP is prepared for online coupling in the PDAF system . For this reason, WaterGAP compiles not only as an executable to run on a Linux system but also as a library that can be embedded in PDAF. As the writing and reading of physical data are omitted, this online coupling strongly reduces the runtime of monthly data assimilation.
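The stop-and-restart principle can be illustrated with a toy example. The linear-reservoir model, the JSON state file, and all names below are hypothetical stand-ins and do not reflect the WaterGAP state format; the sketch only shows that a run restarted from a written state reproduces a continuous run.

```python
import json
import os
import tempfile


def simulate(storage, inflows, k=0.1):
    """Toy linear-reservoir stand-in for the hydrology model."""
    for inflow in inflows:
        storage = storage + inflow - k * storage
    return storage


def save_state(path, state):
    with open(path, "w") as f:
        json.dump(state, f)


def load_state(path):
    with open(path) as f:
        return json.load(f)


inflows = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0]

# Reference: one continuous run over all time steps.
continuous = simulate(10.0, inflows)

# Split run: stop after three steps, write the state, then restart from it.
with tempfile.TemporaryDirectory() as tmp:
    state_file = os.path.join(tmp, "state.json")
    save_state(state_file, {"storage": simulate(10.0, inflows[:3])})
    # An external tool could modify the stored state here (offline coupling).
    restarted = simulate(load_state(state_file)["storage"], inflows[3:])

print(abs(continuous - restarted) < 1e-12)
```

In a data assimilation setting, the state file would be updated between the two runs; keeping the state in memory instead of on disk corresponds to the online coupling mentioned above.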
4 Climate forcings and model setup
4.1 Climate forcings
WaterGAP was calibrated and run with a total of four climate forcings, which are mainly from the ISIMIP phase 3a . All the climate forcings are a concatenation of two data sets – one for the period prior to 1979 and one for the period starting in 1979 (Table ). The year 1979 is the first year of the current ERA5 reanalysis, which is either directly used or is the basis for a specific bias adjustment to observation data.
Table 1
Overview of the climate forcings used to drive WaterGAP v2.2e (and v2.2d).
No. | Name | Before 1979 | From 1979 onward | Temporal coverage | Source and further info |
---|---|---|---|---|---|
1 | gswp3-w5e5 | GSWP3 v1.09 | W5E5 v2.0 | 1901–2019 | |
2 | gswp3-era5 | GSWP3 v1.09 | ERA5 | 1901–2022 | Provided by Stefan Lange∗ |
3 | 20crv3-w5e5 | 20CRv3 | W5E5 v2.0 | 1901–2019 | |
4 | 20crv3-era5 | 20CRv3 | ERA5 | 1901–2021 | |
∗ Until 2021 and extended to 2022 by the authors of this paper, based on the methodology provided by Stefan Lange.
GSWP3 in its version 1.09 is a bias-adjusted and downscaled version of the Twentieth Century Reanalysis version 2 (20CRv2) . The ensemble member 1 of the Twentieth Century Reanalysis version 3 (20CRv3) was interpolated to 0.5° spatial resolution but not bias-adjusted . ERA5 is the latest version of the ECMWF Reanalysis. The year 2022 for ERA5 is added based on the scripts that have been provided by Stefan Lange, with an ERA5 download date of 25 January 2023. W5E5 v2.0 is a bias-adjusted version of the current version of the European Reanalysis ERA5 .
The climate forcings are concatenated by applying a bias adjustment of the data set before 1979 to the data set thereafter using ISIMIP3BASD v2.5.1 . This reduces discontinuities at the 1978/1979 transition. For details, see .
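The idea of adjusting the earlier data set to the later one before concatenation can be sketched as follows. This is not ISIMIP3BASD: the sketch swaps in plain multiplicative mean scaling over an overlap period, and the function name, the use of annual values, and the existence of an overlap period are all assumptions made for illustration.

```python
from statistics import mean


def concatenate_forcings(early, late, overlap_years, switch_year=1979):
    """Merge two annual series (dicts: year -> value) at switch_year.

    The early series is rescaled so that its mean over overlap_years equals
    that of the late series (simple mean scaling, an assumption made here).
    """
    factor = mean(late[y] for y in overlap_years) / mean(early[y] for y in overlap_years)
    merged = {y: v * factor for y, v in early.items() if y < switch_year}
    merged.update({y: v for y, v in late.items() if y >= switch_year})
    return dict(sorted(merged.items())), factor


# Toy annual precipitation (mm): the early product is 10 % wetter.
early = {year: 880.0 for year in range(1975, 1985)}
late = {year: 800.0 for year in range(1979, 1985)}
merged, factor = concatenate_forcings(early, late, overlap_years=range(1979, 1985))
```

After scaling, the toy series no longer jumps between 1978 and 1979, which is the kind of discontinuity the bias adjustment is meant to reduce.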
4.2 WaterGAP model variants
The standard model variant, ant, includes human interference with the hydrological cycle, namely human water use and reservoir operation (“histsoc” in ISIMIP3 nomenclature). In contrast, the model is also run in a nat mode without water use and reservoirs, reflecting a hydrological system without those direct human impacts (“nowatermgt” in ISIMIP3 nomenclature). All model variants are calibrated with the corresponding climate forcing. The standard climate forcing of WaterGAP v2.2e is gswp3-w5e5. To isolate the effect of model development, we calibrated and ran WaterGAP v2.2d with the gswp3-w5e5 climate forcing and the calibration data basis of v2.2e. In total, the outputs of eight WaterGAP v2.2e variants are available (four climate forcings with ant and nat setups), as well as the output of two WaterGAP v2.2d variants (one climate forcing with ant and nat setups calibrated to the new WaterGAP v2.2e streamflow observations data).
5 Results of standard model modifications
5.1 Effect of removing local reservoirs from naturalized runs
The impact on the global water balance of no longer assuming that local reservoirs exist in naturalized runs is small (Table ). As fewer waterbodies are considered in v2.2e, actual evaporation decreases, and streamflow increases by the same amount. Global streamflow into oceans thus increases by less than 0.03 %. The change in water storage components is only minor (not shown).
Table 2
Global water balance components with a model variant of WaterGAP v2.2e, including local reservoirs in local lakes under a naturalized variant (as in v2.2d; labeled v2.2e_nat with local reservoirs) and in WaterGAP v2.2e, where local reservoirs are removed from local lakes in a naturalized variant (labeled v2.2e_nat). Water balance components are for the time period 1991–2019. All units are in km3 yr−1.
Component | v2.2e_nat with local reservoirs | v2.2e_nat | v2.2e_nat – v2.2e_nat with local reservoirs |
---|---|---|---|
Precipitation | 111 578.0 | 111 578.0 | 0.0 |
Actual evapotranspiration | 70 863.7 | 70 852.5 | |
Streamflow into oceans | 40 709.4 | 40 720.7 | 11.3 |
Change in total water storage | 4.8 | 4.8 | 0.0 |
Long-term average volume balance error | 0.0 | 0.0 | 0.0 |
The calibration as implemented in the standard version of WaterGAP focuses on adjusting biases in a rather simple method. More comprehensive approaches are currently in development and might be used in future model versions. While the calibration approach for WaterGAP v2.2e is the same as for WaterGAP v2.2d, the data set of observed streamflow differs, as described in Sect. . Calibration of WaterGAP v2.2e was done for all four climate forcings. To explore the impact of the model version, WaterGAP v2.2d, driven by gswp3-w5e5, was calibrated using the v2.2e streamflow observation data set, too. As described in
-
CS1 – adjust the basin-wide uniform parameter (their Eq. 18) in the range of [0.1–5.0] to match mean annual observed streamflow within %.
-
CS2 – adjust as for CS1 but within 10 % uncertainty range (90 %–110 % of observations).
-
CS3 – as for CS2 but apply the areal correction factor, CFA (adjusts runoff and, to conserve the mass balance, actual evapotranspiration as the counterpart of each grid cell within the range of [0.5–1.5]), to match mean annual observed streamflow with 10 % uncertainty.
-
CS4 – as for CS3 but apply the station correction factor, CFS (multiplies streamflow in the cell where the gauging station is located by an unconstrained factor), to match mean annual observed streamflow with 10 % uncertainty to avoid error propagation to the downstream basin.
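The four-step cascade above can be sketched as a small decision routine. This is a simplified illustration, not the WaterGAP calibration code: the function names, the toy streamflow response, the assumption that simulated streamflow decreases monotonically with the basin-wide parameter (called `gamma` here), and the treatment of CFA as a single basin-wide multiplier are all hypothetical. The CS1 tolerance is passed in as `tol_cs1` rather than hard-coded.

```python
def calibrate_basin(simulate, q_obs, tol_cs1, tol=0.10,
                    gamma_range=(0.1, 5.0), cfa_range=(0.5, 1.5)):
    """Return (calibration status, parameters) for one basin.

    simulate(gamma) gives simulated mean annual streamflow and is assumed
    to decrease monotonically with gamma (an assumption of this sketch).
    """
    def rel_err(q):
        return abs(q - q_obs) / q_obs

    # CS1/CS2: find the best gamma within its allowed range by bisection.
    lo, hi = gamma_range
    if simulate(lo) <= q_obs:      # even the smallest gamma gives too little flow
        gamma = lo
    elif simulate(hi) >= q_obs:    # even the largest gamma gives too much flow
        gamma = hi
    else:
        for _ in range(60):
            mid = 0.5 * (lo + hi)
            if simulate(mid) > q_obs:
                lo = mid
            else:
                hi = mid
        gamma = 0.5 * (lo + hi)

    q = simulate(gamma)
    params = {"gamma": gamma, "cfa": 1.0, "cfs": 1.0}
    if rel_err(q) <= tol_cs1:
        return "CS1", params
    if rel_err(q) <= tol:
        return "CS2", params

    # CS3: areal correction factor, constrained to cfa_range.
    cfa = min(max(q_obs / q, cfa_range[0]), cfa_range[1])
    params["cfa"] = cfa
    if rel_err(cfa * q) <= tol:
        return "CS3", params

    # CS4: unconstrained station correction factor closes the remaining gap.
    params["cfs"] = q_obs / (cfa * q)
    return "CS4", params


def toy_streamflow(gamma):
    """Hypothetical basin response: streamflow declines as gamma grows."""
    return 1000.0 * 0.5 ** gamma


statuses = {q: calibrate_basin(toy_streamflow, q, tol_cs1=0.01)[0]
            for q in (500.0, 960.0, 1300.0, 2000.0)}
```

Each step is only attempted if the previous one fails, so a basin's calibration status directly records how much correction was needed to match the observed mean annual streamflow.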
The calibration of WaterGAP v2.2e (v2.2d) (driven by the standard climate forcing gswp3-w5e5) results in 519 (524) basins with calibration status CS1, 216 (212) basins with calibration status CS2, 262 (323) basins with calibration status CS3, and 512 (449) basins with calibration status CS4. While the percentage of river basins that can be calibrated without applying correction factors is, at 49 %, nearly the same for both model versions, the modification/update of reservoir or water use data in v2.2e led to substantially more stations where not only the areal correction factor CFA but also the station correction factor CFS is required to match the simulated long-term annual streamflow with observations. The 69 stations that moved from CS3 in WaterGAP v2.2d to CS4 in WaterGAP v2.2e are located all around the globe in different climate zones, but many of them are located in snow-dominated regions. Of these stations, 64 have a CFS value larger than 1, indicating that streamflow is underestimated by WaterGAP v2.2e unless CFS is applied. This difference is due to a slightly different handling of the calibration routines in v2.2d and v2.2e. Whereas in v2.2d the calibration run uses a spin-up of the 5 years prior to the calibration start year, in v2.2e the calibration start year is repeated five times. Hence, different calibration results can occur, especially in the first calibration year, which can ultimately result in a different CS.
The spatial distribution of calibration parameters and the calibration status is shown for WaterGAP v2.2e and the standard forcing gswp3-w5e5 in Fig. and for v2.2d in Fig. S2 in the Supplement. For the calibration results for WaterGAP v2.2e driven by the other three climate forcings, the reader is referred to Figs. S3–S5.
Figure 5
Results of the calibration of WaterGAP v2.2e driven by the gswp3-w5e5 climate forcing, with (a) the calibration status of each of the 1509 calibration basins, (b) calibration parameter , (c) areal correction factor CFA, and (d) station correction factor (CFS). Grey areas in panel (d) indicate regions with regionalized calibration parameter , and for panels (a)–(d), dark green outlines indicate the boundaries of the calibration basins. For details of the calibration procedure, the reader is referred to .
[Figure omitted. See PDF]
5.3 Improved handling of inland sinks
The improved handling of inland sinks leads to a reduction in global streamflow, an increase in actual evapotranspiration, and a slight decrease in the total water storage change in the period 2001–2010 (Table ). This is expected as streamflow is now assumed to become actual evapotranspiration in inland sinks. Hence, the water balance component streamflow into oceans has a different meaning in WaterGAP v2.2d than in WaterGAP v2.2e. The improved handling of inland sinks increases global actual evapotranspiration by 1.1 % and decreases global streamflow into oceans and inland sinks by 2.0 %. Focused recharge is neglected in inland sinks, which leads to less groundwater storage. The water balance error is not affected.
Table 3
Global water balance components with a model version including the improved handling of inland sinks in WaterGAP v2.2e as compared to previous handling (as in WaterGAP v2.2d). Water balance components are for the time period 2001–2010. Please note that the model version used for this assessment is a pre-v2.2e version and is run with a different climate forcing (a combination of WFD-WFDEI). The purpose here is only to show the effect of the new handling of inland sink cells. The unit of all variables is km3 yr−1.
Component | v2.2e old inland | v2.2e standard | v2.2e standard – v2.2e old inland |
---|---|---|---|
Precipitation | 112 438.5 | 112 438.5 | 0.0 |
Actual evapotranspirationᵃ | 72 086.8 | 72 903.8 | 817.0 |
Streamflow into oceansᵇ | 40 332.4 | 39 518.6 | |
Change in total water storage | 19.3 | 16.0 | |
Long-term average volume balance error | 0.1 | 0.1 | 0.0 |
ᵃ Including (excluding) streamflow in inland sinks for v2.2e (v2.2d). ᵇ Including (excluding) streamflow in inland sinks for v2.2d (v2.2e).
5.4 Global water balance components
5.4.1 Major water balance components
The globally aggregated water balance components for WaterGAP v2.2e driven by gswp3-w5e5 are shown in Table . The corresponding tables for the other model variants are provided in Tables S1–S4. Due to bias adjustment of precipitation, precipitation is smaller for the climate forcings that include W5E5 compared to those that include ERA5. For all model variants, climate forcings, and time periods, the streamflow to the oceans (in Table S1 it is streamflow to the oceans and inland sinks) is between 39 000 and 40 500 km3 yr−1. As global streamflow does not vary much as a consequence of calibration, even though the precipitation varies, actual evapotranspiration differs strongly, from 70 000 to 80 000 km3 yr−1, between the model variants that are driven by either W5E5 or ERA5. Please note that as a consequence of the new handling of inland sinks (Sect. ), inland sinks do not contribute to globally aggregated streamflow in WaterGAP v2.2e, and thus the amount is lower than in previous model versions. However, we indicated the inflow into inland sinks in the tables for model version v2.2e, which is the amount of water that would have been included in row 3 for model version v2.2d but is now included in row 2. For Table S1 (WaterGAP v2.2d), row 4 is included in row 3. This different handling of inland sinks explains the differences in streamflow and actual evapotranspiration between versions v2.2d and v2.2e. For assessments of renewable water resources, it is recommended to sum up rows 3 and 4 for WaterGAP v2.2e results.
Table 4
Global-scale (excluding Antarctica and Greenland) water balance components for different time spans as simulated with WaterGAP v2.2e with gswp3-w5e5. The unit of all variables is km3 yr−1. Long-term average volume balance error is calculated as the difference between component 1 and the sum of components 2, 3, and 8.
No. | Component | 1961–1990 | 1971–2000 | 1981–2010 | 1991–2019 | 2001–2019 |
---|---|---|---|---|---|---|
1 | Precipitation | 110 637 | 111 279 | 111 350 | 111 574 | 111 655 |
2 | Actual evapotranspirationᵃ | 71 325 | 71 755 | 71 816 | 71 998 | 72 063 |
3 | Streamflow into oceans | 39 295 | 39 530 | 39 584 | 39 666 | 39 697 |
4 | Inflow into inland sinksᵇ | 776 | 794 | 795 | 841 | 846 |
5 | Actual consumptive water useᶜ | 904 | 1049 | 1195 | 1307 | 1369 |
6 | Actual net abstraction from surface water | 1036 | 1186 | 1338 | 1448 | 1501 |
7 | Actual net abstraction from groundwater | |||||
8 | Change in total water storage | 17 | ||||
9 | Long-term average volume balance error |
ᵃ Including actual consumptive water use. ᵇ Streamflow that flows into inland sinks; the simulated streamflow of inland sinks is added to actual evapotranspiration. ᶜ Sum of rows 6 and 7.
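The balance check stated in the caption of Table 4 can be written out as a short worked example. The function name is an assumption; the numbers are the 1961–1990 values of rows 1, 2, 3, and 8 in Table 4.

```python
def volume_balance_error(precipitation, evapotranspiration, streamflow, storage_change):
    """Component 1 minus the sum of components 2, 3, and 8 (all in km3 yr-1)."""
    return precipitation - (evapotranspiration + streamflow + storage_change)


# Table 4, period 1961-1990: rows 1, 2, 3, and 8.
error_1961_1990 = volume_balance_error(110_637, 71_325, 39_295, 17)
```

Inflow into inland sinks (row 4) does not appear as a separate term because it is already contained in actual evapotranspiration (row 2).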
5.4.2 Water storage components
The globally aggregated water storage component changes are shown in Table for WaterGAP v2.2e driven by gswp3-w5e5. While the increase in water storage in reservoirs and regulated lakes during the period 1961–1990, due to dam construction, more than balances the decrease in groundwater storage due to human water use, the latter dominated in all later evaluation periods. While the annual rate of groundwater loss has steadily increased from the period 1961–1990 to the period 2001–2019, the annual total water storage loss rate has steadily increased from the period 1971–2000 onward. This is also true for the other model variants (Tables S6–S9). For all four climate forcings, WaterGAP v2.2e computes a decline in snow water storage since the period 1981–2010. For other storage compartments, different climate inputs result in different signs of change without a specific component that is dominantly sensitive. When comparing the water storage changes in WaterGAP v2.2e (Table ) and WaterGAP v2.2d (Table S5), most components are similar, but in WaterGAP v2.2d, the reservoirs and global lakes gain less water than in WaterGAP v2.2e in the more recent time periods.
Table 5
Globally aggregated (excluding Antarctica and Greenland) water storage component changes during different periods, as simulated by WaterGAP v2.2e with gswp3-w5e5. All units are in km3 yr−1.
No. | Component | 1961–1990 | 1971–2000 | 1981–2010 | 1991–2019 | 2001–2019 |
---|---|---|---|---|---|---|
1 | Canopy | 0 | 0 | 0.1 | 0 | 0 |
2 | Snow | 11.4 | ||||
3 | Soil | 4.9 | 7.6 | 9.5 | ||
4 | Groundwater | |||||
5 | Local lakes | 0.3 | 1.1 | 0.9 | 0.2 | |
6 | Local wetlands | 0.7 | 4.6 | 4.4 | 9.2 | |
7 | Global lakes | 4.3 | 9.8 | |||
8 | Global wetlands | 5.0 | 0.8 | 0.0 | ||
9 | Reservoirs and regulated lakes | 70.8 | 50.8 | 36.0 | 24.9 | 25.1 |
10 | River | 0.4 | 5.4 | 3.8 | 4.1 | |
11 | Total water storage | 20.3 |
Globally aggregated sectoral potential withdrawal and consumptive water uses, as well as use fractions from groundwater are shown in Table for WaterGAP v2.2e and gswp3-w5e5; the corresponding values for the other model variants are given in Tables S10–S13. Irrigation accounts for two-thirds of potential water abstractions (WU) and 88 % of potential consumptive use. Groundwater withdrawals are estimated to cover about 22 % of all withdrawals, with the highest fraction for the domestic sector, while 35 % of total potential consumptive use is supplied by groundwater, due to the assumed higher water use efficiency in the case of irrigation with groundwater. The values in Table represent the human demand for water that cannot be completely satisfied in WaterGAP v2.2e due to a lack of surface water resources. Only 1307 km3 yr−1 of the 1342 km3 yr−1 of potential consumptive use can be fulfilled in the period 1991–2019 (row 5 in Table ). The climate forcings including ERA5 have 150 km3 yr−1 less potential withdrawal water use for irrigation than the forcings with W5E5, which is a result of more precipitation and thus less irrigation demand. Still, the potential consumptive use of 1268 km3 yr−1 cannot be fulfilled, and only 1237 km3 yr−1 is actually consumed (compare Tables S13 and S5). Global sectoral water demand differences between WaterGAP v2.2d (Table S9) and v2.2e are visible only for two updated water use sectors (cooling of thermal power plants and manufacturing).
Table 6
Globally aggregated (excluding Antarctica and Greenland) sectoral potential withdrawal water use, WU, and consumptive water use, CU (km3 yr−1), as well as use fractions from groundwater (%) as simulated by GWSWUSE of WaterGAP v2.2e for the time period 1991–2019.
Water use sector | WU | Percent of WU from groundwater | CU | Percent of CU from groundwater |
---|---|---|---|---|
Irrigation | 2541 | 25 | 1179 | 37 |
Thermal power plants | 592 | 0 | 18 | 0 |
Domestic | 352 | 35 | 57 | 36 |
Manufacturing | 298 | 27 | 60 | 25 |
Livestock | 29 | 0 | 29 | 0 |
Total | 3813 | 22 | 1342 | 35 |
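The shares quoted in the text can be reproduced from the sectoral rows of Table 6. The short sketch below recomputes them; the variable names are illustrative. Because the table values are rounded, the recomputed totals differ from the tabulated totals by 1 km3 yr−1.

```python
# Table 6 rows: sector -> (WU, % of WU from groundwater, CU, % of CU from groundwater)
sectors = {
    "Irrigation": (2541, 25, 1179, 37),
    "Thermal power plants": (592, 0, 18, 0),
    "Domestic": (352, 35, 57, 36),
    "Manufacturing": (298, 27, 60, 25),
    "Livestock": (29, 0, 29, 0),
}

total_wu = sum(wu for wu, _, _, _ in sectors.values())
total_cu = sum(cu for _, _, cu, _ in sectors.values())

# Share of irrigation in total withdrawals and in total consumptive use (%).
irrigation_wu_share = 100 * sectors["Irrigation"][0] / total_wu
irrigation_cu_share = 100 * sectors["Irrigation"][2] / total_cu

# Groundwater fraction of the totals (%), weighting each sector by its volume.
gw_wu_share = sum(wu * pct for wu, pct, _, _ in sectors.values()) / total_wu
gw_cu_share = sum(cu * pct for _, _, cu, pct in sectors.values()) / total_cu
```

The groundwater share of consumptive use exceeds that of withdrawals because irrigation dominates consumptive use and has a higher groundwater fraction for CU than for WU.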
6.1 Effect of PET calculation with PT-MA on the global water balance under climate change
The effect of the modified Priestley–Taylor PET approach (PT-MA) is tested by running WaterGAP, as driven by two ISIMIP3b GCMs (GFDL-ESM4 and CanESM5), for the future under the emissions scenario RCP8.5 with standard PT and the newly developed PT-MA approach. Analyzing the global water balance components for the period of 2071–2100, actual evapotranspiration is, as expected, lower with the PT-MA method, and global streamflow is increased by around the same amount (Table ). In the case of GFDL-ESM4 and CanESM5, the PT-MA method leads to an increase in the streamflow into oceans by 2.7 % and 4.0 %, respectively. If hydrological models neglect the effect of the active vegetation response to the increasing atmospheric CO2 concentrations, it can thus be expected that they may underestimate future water resources . Other water balance components are affected only marginally, also because the PT-MA method is not applied in WaterGAP v2.2e when computing irrigation water use.
Table 7
Globally aggregated (excluding Antarctica and Greenland) water balance components for the period 2071–2100 computed with the standard PET model variant (PT) and the alternative PET model variant (PT-MA) that takes into account – in a very simple manner – the impact of climate change on vegetation when computing PET. The WaterGAP variants are driven by the bias-adjusted output of GFDL-ESM4 and CanESM5 provided by ISIMIP. The columns labeled Diff correspond to PT-MA − PT for the respective GCM. All units are in km3 yr−1.
Component | GFDL-ESM4 PT | GFDL-ESM4 PT-MA | Diff | CanESM5 PT | CanESM5 PT-MA | Diff |
---|---|---|---|---|---|---|
Precipitation | 108 633 | 108 633 | 0 | 130 617 | 130 617 | 0 |
Actual evapotranspirationᵃ | 70 924 | 69 907 | | 82 838 | 80 894 | |
Streamflow into oceansᵇ | 37 850 | 38 859 | 1009 | 47 764 | 49 689 | 1925 |
Change in total water storage | | | 8 | 15 | 34 | 18 |
Long-term average volume balance error | 0 | 0 | 0 | 0 | 0 | 0 |
ᵃ Including actual consumptive water use. ᵇ Inland sinks are not considered.
6.2 Effect of glaciers on the global water balance
The inclusion of glaciers in a WaterGAP run influences all global water balance components (Table ). Precipitation is higher due to a different precipitation product used in the original glacier model
Table 8
Global-scale (excluding Antarctica and Greenland) water balance components for two time spans, as simulated with the standard model version WaterGAP v2.2e and the version with enabled glacier option. All units are in km3 yr−1. Long-term average volume balance error is calculated as the difference between component 1 and the sum of components 2, 3, and 7.
No. | Component | 1971–2000: Standard | 1971–2000: Glacier | 1971–2000: Glacier – standard | 2001–2016: Standard | 2001–2016: Glacier | 2001–2016: Glacier – standard |
---|---|---|---|---|---|---|---|
1 | Precipitation | 111 279 | 111 955 | 676 | 111 601 | 112 254 | 653 |
2 | Actual evapotranspirationᵃ | 71 756 | 71 642 | | 72 043 | 71 930 | |
3 | Streamflow into oceans and inland sinks | 39 529 | 40 438 | 909 | 39 696 | 40 735 | 1039 |
4 | Actual consumptive water useᵇ | 1049 | 1057 | 8 | 1364 | 1371 | 7 |
5 | Actual net abstraction from surface water | 1186 | 1206 | 20 | 1492 | 1510 | 18 |
6 | Actual net abstraction from groundwater | ||||||
7 | Change in total water storage | ||||||
8 | Long-term average volume balance error | 0.00 | 0.00 |
ᵃ Including actual consumptive water use. ᵇ Sum of rows 5 and 6.
7 Evaluation of WaterGAP v2.2e
7.1 Model variants used for the evaluation
The evaluation was done using the output of the WaterGAP runs in the anthropogenic mode, considering human water use and reservoir operation. The difference between the model versions v2.2d and v2.2e is investigated by running both versions with the climate forcing gswp3-w5e5. The effect of the different climate forcings is assessed by comparing WaterGAP v2.2e driven by the gswp3-w5e5 climate forcing to WaterGAP driven by the gswp3-era5 climate forcing. For the sake of consistency, the evaluation closely follows .
7.2 Independent data sets used for model evaluation
7.2.1 Water abstractions
AQUASTAT is the UN Food and Agriculture Organization's global information system on water and agriculture (
7.2.2 Streamflow
The streamflow data set described in Sect. and can be classified as follows:
-
all months available for the station, including months in incomplete years (ALL);
-
months in complete years that went into the calibration of the model (CAL);
-
months that remain from ALL when months for CAL are removed (VAL).
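The three subsets can be derived from a station's list of observed months as sketched below. The function name is hypothetical, and the sketch assumes that every complete calendar year enters the calibration; if only a subset of the complete years is used for calibration, CAL shrinks and VAL grows accordingly.

```python
from collections import Counter


def classify_months(observed_months):
    """Split observed (year, month) pairs into the ALL, CAL, and VAL subsets.

    Assumption of this sketch: all complete calendar years are used for
    calibration.
    """
    all_months = sorted(set(observed_months))
    months_per_year = Counter(year for year, _ in all_months)
    complete_years = {year for year, n in months_per_year.items() if n == 12}
    cal = [ym for ym in all_months if ym[0] in complete_years]
    val = [ym for ym in all_months if ym[0] not in complete_years]
    return all_months, cal, val


# Toy station record: 2000 is complete, 2001 only covers January to June.
record = [(2000, m) for m in range(1, 13)] + [(2001, m) for m in range(1, 7)]
all_months, cal_months, val_months = classify_months(record)
```

For this toy record, ALL holds 18 months, CAL the 12 months of 2000, and VAL the 6 months of 2001.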
Figure 6
Number of available months of streamflow observation data (ALL) (a), number of complete years for calibration (CAL) (b), and number of months for validation (VAL) (c).
[Figure omitted. See PDF]
7.2.3 Terrestrial water storage anomalies
The Gravity Recovery And Climate Experiment (GRACE) satellite mission was in orbit from 2002 to 2017 to observe the temporal changes in the Earth's gravity field and obtain monthly time series of terrestrial water storage anomalies (TWSAs). Its follow-on mission, GRACE-FO, started in 2018 to continue the measurements. Thus, a data gap of several months exists between the two missions. In addition, due to the aging batteries of the GRACE mission, no data were collected in specific periods, leading to further data gaps in the GRACE time series. published a strategy based on independent component analyses (ICAs) to combine data from the Swarm explorer mission and GRACE(-FO) to reconstruct a gap-free time series. The AAU Geodesy product was recently extended to include GRACE-FO TWSA data until July 2021. For the reconstruction, the release of the monthly GRACE L2 product RL06 between April 2002 and September 2016 and the release RL05 between November 2016 and January 2017 in terms of spherical harmonic coefficients up to degree and order 96 were downloaded from the Center for Space Research (CSR;
In this study, monthly GRACE(-FO) TWSA values are estimated on a regular global 0.5° grid. The grid values are spatially averaged over 148 river basins (TWSA validation basins). The TWSA validation basins were derived by combining a few of the 1509 streamflow calibration basins such that the area of each TWSA validation basin is larger than 200 000 km2. A two-step approach was applied to filter the observations and to compute and reduce leakage errors in the basin-averaged time series following the approach of . In the first step, a 2D-destriping filter was designed for the spectral domain that acknowledges the north–south striping pattern of the GRACE(-FO) error structure and aims to retain the high-frequency spatial changes while removing the noise. In the second step, an efficient averaging kernel was designed to spatially average the observations for the 148 selected river basins and simultaneously estimate the leakage in and leakage out of the signal. These estimates are used to correct the smoothed signal of step 1. The magnitude of the leakage error is used to represent the TWSA uncertainties because this error is dominant in the TWSA processing steps. We consider the time span from January 2003 to December 2019, which is limited by the common period of GRACE(-FO) data and the model output from the different WaterGAP versions.
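The basic spatial averaging step can be sketched as follows. This deliberately omits the destriping filter, the averaging kernel, and the leakage correction described above. The function names and the use of cosine-of-latitude weights as a proxy for grid-cell area on a regular latitude–longitude grid are assumptions of this illustration.

```python
import math


def basin_mean(values, latitudes):
    """Area-weighted mean of grid-cell values inside one basin.

    On a regular latitude-longitude grid, cell area is proportional to the
    cosine of the cell-centre latitude, which is used as the weight here.
    """
    weights = [math.cos(math.radians(lat)) for lat in latitudes]
    return sum(w * v for w, v in zip(weights, values)) / sum(weights)


def anomalies(series):
    """Subtract the temporal mean to turn storage values into anomalies."""
    mean = sum(series) / len(series)
    return [x - mean for x in series]


# Two cells of a hypothetical basin: one at the Equator, one at 60° N.
twsa_basin = basin_mean([10.0, 20.0], [0.0, 60.0])
```

Because the cell at 60° N covers only half the area of the equatorial cell, the weighted mean (about 13.3) lies closer to the equatorial value than the unweighted mean of 15.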
Note that we refer to the term “terrestrial water storage” specifically in a context concerning GRACE(-FO). In contrast, the term “total water storage” remains in those cases where the context concerns WaterGAP (e.g., the water balance assessments).
7.3 Evaluation metrics
The Nash–Sutcliffe efficiency metric NSE (–) and the Kling–Gupta efficiency metric KGE (–) with its components correlation KGEr (–), bias KGEb (–), and the deviation of variability KGEg (–) , as well as TWSA-related metrics, are applied here and were described in
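For orientation, the two efficiency metrics and the three KGE components can be computed as in the sketch below. It assumes the KGE variant in which the variability component is the ratio of the coefficients of variation of simulated and observed values; the definition used for the results in this paper is given in the cited reference and may differ in detail. Function names are illustrative.

```python
import math


def _mean(x):
    return sum(x) / len(x)


def _std(x):
    m = _mean(x)
    return math.sqrt(sum((v - m) ** 2 for v in x) / len(x))


def nse(sim, obs):
    """Nash-Sutcliffe efficiency; 1 is a perfect fit."""
    mo = _mean(obs)
    return 1.0 - sum((s - o) ** 2 for s, o in zip(sim, obs)) / sum((o - mo) ** 2 for o in obs)


def kge(sim, obs):
    """Return (KGE, KGEr, KGEb, KGEg); the optimum of every entry is 1.

    Assumed here: KGEg is the ratio of the coefficients of variation.
    """
    ms, mo = _mean(sim), _mean(obs)
    ss, so = _std(sim), _std(obs)
    cov = sum((s - ms) * (o - mo) for s, o in zip(sim, obs)) / len(obs)
    r = cov / (ss * so)                # correlation
    b = ms / mo                        # bias ratio
    g = (ss / ms) / (so / mo)          # variability ratio
    value = 1.0 - math.sqrt((r - 1) ** 2 + (b - 1) ** 2 + (g - 1) ** 2)
    return value, r, b, g


obs = [1.0, 2.0, 3.0, 4.0]
sim = [2.0, 4.0, 6.0, 8.0]   # simulated flow is twice the observed flow
nse_value = nse(sim, obs)
kge_value, kge_r, kge_b, kge_g = kge(sim, obs)
```

In this example the timing and relative variability are perfect, so only the bias component deviates from its optimum; this illustrates why KGE and NSE values for the same simulation are not directly comparable.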
7.4 Evaluation results
7.4.1 Water abstractions
The evaluation of simulated potential abstractions against reported abstraction values in the AQUASTAT database (
The performance of the livestock sector with an NSE of 0.4 is relatively low, and overestimations and underestimations are visible (Fig. h). However, the total volumes are mostly below 1 km3 yr−1, and the number of data points from AQUASTAT is the lowest of all variables. The differences in withdrawal water uses for the irrigation sector and for the corresponding total, groundwater, and surface water withdrawals that result from the different climate forcings are rather small in comparison to AQUASTAT, as are the differences to WaterGAP v2.2d (Figs. S6–S9). A slightly poorer fit of WaterGAP forced by ERA5 to AQUASTAT irrigation abstractions is observed (compare Figs. and S9).
Figure 7
Comparison of potential withdrawal water uses from WaterGAP v2.2e and gswp3-w5e5 with AQUASTAT (
[Figure omitted. See PDF]
7.4.2 Streamflow
The evaluation of streamflow indicates the overall best results with WaterGAP v2.2e driven by gswp3-w5e5 (Fig. and Table ). There are only very small differences between the model versions v2.2d and v2.2e under the same climate forcing. The gswp3-era5 climate forcing leads to a slightly lower performance with regard to mean bias (KGEb) and variability (KGEg). The simulations driven by climate forcings that use 20crv3 prior to 1979 have much lower performance metrics than those that use gswp3 (Figs. , ). This is also visible in the cumulative distribution functions of KGE, NSE, and the KGE components (Figs. , , , , and ).
With WaterGAP v2.2e, as driven by gswp3-w5e5, large areas of North America and Africa result in NSE values below 0.5, which is a similar pattern to that of
Performances according to the Köppen–Geiger climate zones are shown in Tables , , , , and . Please note that the assignment of a basin to a climate zone is based on the climate forcing used and can thus differ slightly among the model variants. When assessing the KGE and NSE performance indicators for Köppen–Geiger climate zones, a similar pattern is visible, although the distribution across the classes differs because the two metrics have different meanings (Table ). The highest KGEr values are generally reached for A and C climates, and it is especially here that the difference between the gswp3 and 20crv3 climate forcing combinations is visible (Table ). For KGEb, a tendency to simulate higher mean streamflow compared to the observations is visible for A and C climates, whereas for the other climate zones, the number of basins is distributed rather equally around the 10 % deviation that is introduced by the calibration routine (Table ). The variability indicator KGEg differs largely from the optimum value, especially for A, B, and D climate zones. For A (D) climates, all models underestimate variability in around half (two-thirds) of the basins. The model variants driven by ERA5 climate combinations have a tendency to underestimate variability, especially in C climates (Table ).
The assessments above have been done using all monthly observation data available for the stations, including those monthly values that have not been used in model calibration. This data set is referred to as “all data” (ALL). The monthly data that were used (in yearly aggregation) for calibration are referred to as “calibration data” (CAL). Finally, the difference between all data and calibration data, i.e., the months that are not used for calibration, is referred to as “validation data” (VAL). A slight performance decrease occurs when evaluating the fit of the simulated streamflow to the validation data set, mainly due to a reduced KGEb (see the corresponding Figs. S11–S49 in the Supplement).
Figure 8
Efficiency metrics for monthly streamflow of the WaterGAP variants at the 1509 observation stations (all data) with NSE, KGE, and its components. Outliers (outside inter-quartile range) are excluded, but the number of stations that are defined as outliers is indicated on the axis.
[Figure omitted. See PDF]
Figure 9
Cumulative distribution of the KGE efficiency metric for all monthly streamflow values at the 1509 gauging stations for all model variants.
[Figure omitted. See PDF]
Figure 10
NSE efficiency metric for all monthly data of the 1509 river basins in WaterGAP v2.2e as forced by gswp3-w5e5.
[Figure omitted. See PDF]
Figure 11
KGE efficiency metric and its components for all monthly streamflow values at the 1509 gauging stations for WaterGAP v2.2e as forced by gswp3-w5e5.
[Figure omitted. See PDF]
Table 9
Number of calibration basins in each Köppen–Geiger region for which the KGE of the monthly streamflow time series is within three performance classes for five WaterGAP variants. Note that the assignment of a basin to a climate region can differ among the climate forcings.
Model variant | KGE | A | B | C | D | E | Sum |
---|---|---|---|---|---|---|---|
v2.2d gswp3-w5e5 | > 0.7 | 127 | 17 | 163 | 167 | 15 | 489 |
v2.2d gswp3-w5e5 | 0.5–0.7 | 124 | 37 | 77 | 173 | 12 | 423 |
v2.2d gswp3-w5e5 | < 0.5 | 109 | 72 | 68 | 329 | 19 | 597 |
v2.2e gswp3-w5e5 | > 0.7 | 127 | 17 | 163 | 168 | 15 | 490 |
v2.2e gswp3-w5e5 | 0.5–0.7 | 125 | 38 | 77 | 175 | 13 | 428 |
v2.2e gswp3-w5e5 | < 0.5 | 108 | 71 | 68 | 326 | 18 | 591 |
v2.2e 20crv3-era5 | > 0.7 | 78 | 6 | 105 | 170 | 11 | 370 |
v2.2e 20crv3-era5 | 0.5–0.7 | 137 | 35 | 102 | 186 | 9 | 469 |
v2.2e 20crv3-era5 | < 0.5 | 133 | 76 | 114 | 339 | 8 | 670 |
v2.2e 20crv3-w5e5 | > 0.7 | 96 | 8 | 111 | 159 | 15 | 389 |
v2.2e 20crv3-w5e5 | 0.5–0.7 | 129 | 37 | 93 | 190 | 5 | 454 |
v2.2e 20crv3-w5e5 | < 0.5 | 132 | 83 | 106 | 326 | 19 | 666 |
v2.2e gswp3-era5 | > 0.7 | 96 | 7 | 152 | 173 | 13 | 441 |
v2.2e gswp3-era5 | 0.5–0.7 | 142 | 38 | 102 | 207 | 8 | 497 |
v2.2e gswp3-era5 | < 0.5 | 112 | 70 | 70 | 310 | 9 | 571 |
The comparison of basin-averaged TWSA of WaterGAP v2.2e forced by gswp3-w5e5 and the reconstructed gap-free time series of GRACE(-FO) for 148 basins is shown in Fig. . The annual amplitude is underestimated in most of the African basins and in some Asian basins but is overestimated in major parts of North America. The correlation between WaterGAP v2.2e and GRACE(-FO) is overall reasonable, with the majority of basins showing correlations between 0.5 and 1. However, basins where the amplitude is considerably under- or overestimated show low correlations. The comparison of TWSA trends shows that WaterGAP v2.2e generally computes considerably smaller trends in comparison to GRACE(-FO). This characteristic was also observed in the previous model evaluation .
The comparison between WaterGAP v2.2d and v2.2e shows that only a few basins differ; mainly stronger trends in (north-)east Asia can be observed for version v2.2e. The WaterGAP v2.2e versions forced by 20crv3-era5 and gswp3-era5, respectively, show only marginal differences. This is expected since both versions are forced by ERA5 during the evaluation period for TWSAs (January 2003–December 2019). When forcing the model with ERA5, stronger trends are observed in North America than with W5E5. The correlations differ in (north-)east Asia and match better in South America. The annual amplitude fits better in North America, but the annual amplitude in South America is better represented using the W5E5 forcing.
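The three TWSA indicators discussed here (amplitude ratio, correlation, and trend) can be computed from two monthly time series as sketched below. The definitions are assumptions made for illustration: the annual amplitude is taken from the 12-month harmonic of a series covering whole years, the amplitude ratio is formed as model over observation, and the trend is an ordinary least-squares slope. The processing used for the results in this paper may differ.

```python
import math


def annual_amplitude(series):
    """Amplitude of the 12-month harmonic (series must cover whole years)."""
    n = len(series)
    mean = sum(series) / n
    a = 2.0 / n * sum((x - mean) * math.cos(2 * math.pi * t / 12) for t, x in enumerate(series))
    b = 2.0 / n * sum((x - mean) * math.sin(2 * math.pi * t / 12) for t, x in enumerate(series))
    return math.hypot(a, b)


def pearson(x, y):
    """Pearson correlation coefficient of two equally long series."""
    mx, my = sum(x) / len(x), sum(y) / len(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def linear_trend(series):
    """Least-squares slope per time step (here: per month)."""
    n = len(series)
    mt, mx = (n - 1) / 2.0, sum(series) / n
    num = sum((t - mt) * (x - mx) for t, x in enumerate(series))
    den = sum((t - mt) ** 2 for t in range(n))
    return num / den


months = range(48)  # four full years of monthly values
observed = [10.0 * math.sin(2 * math.pi * t / 12) for t in months]
simulated = [5.0 * math.sin(2 * math.pi * t / 12) for t in months]

amplitude_ratio = annual_amplitude(simulated) / annual_amplitude(observed)
correlation = pearson(simulated, observed)
trend_per_year = 12 * linear_trend([0.1 * t for t in months])
```

The toy series shows that a simulation can be perfectly correlated with the observation while still underestimating the seasonal amplitude by half, which is why the three indicators are reported separately.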
Figure 12
Comparison of basin-averaged monthly TWSA time series of WaterGAP v2.2e as forced by gswp3-w5e5 (a, c, e) and gswp3-era5 (b, d, f) for 148 basins larger than 200 000 km2, with (a, b) the ratio of amplitude (reddish colors indicate amplitude underestimation by WaterGAP), (c, d) the correlation coefficient, (e, f) the trend of WaterGAP v2.2e, and (g) the trend of GRACE. All values are based on the time series from January 2003 to December 2019.
[Figure omitted. See PDF]
7.5 Performance changes due to the updated calibration data basisThe calibration data basis with observed mean annual streamflow values of WaterGAP v2.2e has 190 stations more than WaterGAP v2.2d. In particular, 77 river basins are newly included in the calibration routine (ID 1). In 6 cases, a new gauging station has been added downstream (ID 2) and, in 126 cases, upstream (ID 3) of an already existing station. For 21 basins, a station was moved compared to the previous calibration data basis (ID 4). These sum up to 230 gauging stations that differ between the calibration data basis of v2.2d and v2.2e.
To determine the impact of the updated streamflow data basis, the performance of the simulated streamflow obtained by calibrating WaterGAP v2.2d against the two different streamflow data sets (1319 vs. 1509 stations) was compared for the 230 stations. Due to the similar performance of the two model versions, we expect that the analysis results with v2.2e would be similar. The gswp3-w5e5 climate forcing was applied in both variants.
For all 230 stations, the calibration with the updated observational data basis, which is used to calibrate the standard version of WaterGAP v2.2e, led to substantially improved performance indicators, in particular NSE, KGE, and KGEb, whereas KGEr and KGEg do not differ notably (Fig. ). This improvement is a result of the calibration's objective to adjust the bias in mean simulated streamflow to a range of 10 % around the observed value.
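The calibration objective described above can be illustrated as a check on the ratio of mean simulated to mean observed streamflow. The following is a minimal sketch, assuming that the "range of 10 %" means a tolerance of ±10 % around the observed mean; the function name and the `tolerance` argument are hypothetical and are not taken from the WaterGAP code.

```python
def within_calibration_target(mean_sim: float, mean_obs: float,
                              tolerance: float = 0.10) -> bool:
    """Return True if mean simulated streamflow lies within the assumed
    tolerance band (default +/-10 %) around the observed mean."""
    if mean_obs <= 0.0:
        raise ValueError("mean observed streamflow must be positive")
    # Ratio of means; this corresponds to the bias component KGEb.
    bias_ratio = mean_sim / mean_obs
    return abs(bias_ratio - 1.0) <= tolerance


# Example: a 5 % overestimation meets the objective, a 39 % one does not.
print(within_calibration_target(105.0, 100.0))
print(within_calibration_target(139.0, 100.0))
```

Under this reading, a station with KGEb of 1.39 (the median for ID 1 in the 1319-station variant) lies well outside the target band, whereas a value of 1.00 lies inside it.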
Figure 13
Efficiency metrics for monthly streamflow of the 230 gauging stations that differ between the streamflow data basis used for calibrating WaterGAP v2.2d and the new data basis used for v2.2e, with NSE, KGE, and its components. All monthly observations available have been used to compute the metrics. Outliers (outside inter-quartile range) are excluded, but the number of stations that are defined as outliers is indicated on the axis.
[Figure omitted. See PDF]
Strong performance improvements are observed for the 77 grid cells with newly added calibration data that are outside (and also not downstream) of previously calibrated basins (ID 1), considering the median and the spread (indicated by the range of the 25th and 75th percentile) (Table ). Those grid cells that are already calibrated by a more downstream station in the case of the old calibration data basis (ID 3) show less performance gain. In particular, KGEb for the ID 3 station is already close to the optimum value due to being calibrated to a downstream observation. Here, the bias adjustment of the downstream station is effective for upstream grid cells. In contrast, the improvement is large if stations are included further downstream of an already existing station (ID 2), but the small number of stations implies a careful interpretation (Table ).
Table 10
Model performance for the two calibration variants (1509 vs. 1319 stations) and the ID∗ with the reason for change between the two variants and the corresponding number of affected stations in parentheses. The performance indicator is provided as median with its 25th and 75th percentile in parentheses.
ID∗ | Variant | NSE | KGE | KGEr | KGEb | KGEg |
---|---|---|---|---|---|---|
1 (77) | 1509 | 0.37 ( 0.68) | 0.58 (0.19 0.73) | 0.75 (0.55 0.87) | 1.00 (0.93 1.09) | 1.01 (0.78 1.19) |
1 (77) | 1319 | ( 0.40) | 0.00 ( 0.49) | 0.78 (0.57 0.87) | 1.39 (0.89 2.61) | 1.00 (0.75 1.32) |
2 (6) | 1509 | 0.55 (0.19 0.83) | 0.54 (0.43 0.81) | 0.75 (0.51 0.92) | 1.01 (0.94 1.05) | 0.93 (0.67 1.07) |
2 (6) | 1319 | ( 0.61) | 0.08 ( 0.69) | 0.76 (0.50 0.91) | 1.69 (1.08 2.39) | 0.91 (0.81 1.03) |
3 (126) | 1509 | 0.15 ( 0.61) | 0.44 (0.03 0.69) | 0.73 (0.34 0.85) | 1.02 (0.97 1.09) | 0.85 (0.59 1.31) |
3 (126) | 1319 | ( 0.44) | 0.19 ( 0.58) | 0.71 (0.35 0.85) | 1.04 (0.82 1.39) | 0.86 (0.62 1.29) |
4 (21) | 1509 | 0.55 (0.15 0.69) | 0.62 (0.49 0.78) | 0.77 (0.62 0.88) | 1.00 (0.94 1.09) | 0.89 (0.81 1.15) |
4 (21) | 1319 | 0.18 ( 0.60) | 0.45 (0.31 0.68) | 0.80 (0.57 0.87) | 1.18 (0.98 1.45) | 0.93 (0.84 1.23) |
∗ 1 are the new river basins, 2 are the added stations downstream of the already existing stations, 3 are the added stations upstream of the already existing stations, and 4 are the stations that were moved.
7.6 Performance comparison between different model variants
7.6.1 WaterGAP v2.2e vs. WaterGAP v2.2d
The performance of simulated water abstractions is nearly identical, except for the thermoelectric sector, where WaterGAP v2.2e, with the updated water use, results in a slightly worse fit to AQUASTAT data (logarithmic NSE is 0.40 for v2.2e and 0.52 for v2.2d) (Figs. and S8). With regard to the streamflow performance, WaterGAP v2.2e performs nearly identically to WaterGAP v2.2d with the same climate forcing and calibration data. This is also visible in the spatial pattern for streamflow, where differences are rare. The performance ratio of indicators (for calculation, see the Appendix ) often shows basins with a slightly different sign next to each other (Fig. ) but without a clear spatial pattern of general performance gain or loss. When aggregated to climatic characteristics, such as Köppen–Geiger regions, it can be seen that WaterGAP v2.2e has slightly more basins in a better KGE class for the cold D and E climates compared to WaterGAP v2.2d with the same climate forcing (Table ).
Figure 14
Resulting performance ratio of indicators of streamflow for the model versions v2.2d and v2.2e as driven by gswp3-w5e5 for overall KGE (a), KGEb (b), KGEr (c), and KGEg (d). Bluish colors indicate that v2.2e is closer to the optimal indicator value than v2.2d (see also the description in Appendix ). Note that the calibration procedure forces KGEb values to be close to the optimum value; hence, the strong colors here result from only small differences from the optimum value.
[Figure omitted. See PDF]
For TWSA, WaterGAP v2.2e performs better than v2.2d: the trends (in both directions) of TWSA are stronger for v2.2e and fit the observations better, and the correlation coefficients and the amplitude ratios are also improved. The performance ratio of indicators for TWSA shows a consistent direction of change for the trend and correlation for most basins (with more bluish colors, indicating more regions with a performance gain with v2.2e), while the amplitude sometimes shows the opposite signal, especially for those regions with an improved trend ratio (Fig. ). The seasonality of streamflow and TWSA is rather similar within the 12 selected river basins (Fig. S54).
Figure 15
Resulting performance ratio of indicators of TWSA for the model versions v2.2d and v2.2e as driven by gswp3-w5e5 for the amplitude ratio (a), correlation ratio (b), and trend ratio (c). Bluish colors indicate that v2.2e is closer to the optimal indicator value than v2.2d (see also the description in Appendix ).
[Figure omitted. See PDF]
7.6.2 GSWP3-W5E5 vs. GSWP3-ERA5
The impact of the selected climate forcing starting in 1979 is substantial, except for the water use (where the performance of gswp3-era5 regarding irrigation water abstractions is slightly lower).
The median streamflow performance with gswp3-w5e5 is slightly higher than with gswp3-era5 (value in parentheses) with 0.499 (0.490) for NSE, 0.582 (0.578) for KGE, 0.775 (0.774) for KGEr, 1.007 (1.018) for KGEb, and 0.858 (0.813) for KGEg. In particular, the Köppen climate zone A (equatorial climate) shows higher performance with gswp3-w5e5 (Table ). Model simulations driven by ERA5 combinations have higher NSE values in northwestern North America but lower values in China (compare Figs. and S33). Moreover, ERA5 combinations tend to have a lower KGEr in some parts of North America and large parts of South America and a generally higher variability compared to the W5E5 combinations (compare Figs. and S41).
The TWSA trend in gswp3-era5 is closer to the observations in North America and South America, and the amplitude ratio is also improved for North America. For parts of Europe and Asia, both the correlation and the trend, as driven by gswp3-w5e5, are closer to GRACE, showing an overall diverse impact of the climate forcing on TWSA (Fig. ). This is also visible in the seasonality, where large differences occur both for streamflow and for TWSA (Fig. S55). For example, the TWSA, as driven by gswp3-era5, matches the observations very closely for the Amazon, but for streamflow, gswp3-w5e5 fits better.
7.6.3 GSWP3-W5E5 vs. 20CRv3-W5E5
Performance metrics for water abstractions are identical for both variants (Figs. and S10). The median streamflow performance with gswp3-w5e5 is generally higher than with 20crv3-w5e5 (value in parentheses) with 0.499 (0.378) for NSE, 0.582 (0.539) for KGE, 0.775 (0.718) for KGEr, and 1.007 (1.015) for KGEb, except for KGEg with 0.857 (0.864). The higher performance of gswp3-w5e5 is obvious for all Köppen climate regions, with smaller differences for D and E climates (Table ). Differences in seasonality are relatively small as the time series for TWSA and streamflow start several years after 1979 and thus use W5E5. The visible differences are related to the specific calibration parameters that also depend on the years before 1979.
8 Benefits and limitations of the calibration approach
The calibration of WaterGAP is a simple but effective approach to adjust biases in simulated streamflow, runoff, and renewable water resources. As shown for the 230 grid cells with new streamflow observations used for calibrating WaterGAP v2.2e, calibration leads to an overall reduction in simulated water resources, bringing them closer to the observations (Table ). Previous assessments of WaterGAP determined that the decision to calibrate or not has the largest effect on simulated water resources, both for global-scale fluxes and for the spatial runoff pattern . The improved representation of long-term average water resources is required for evaluating water stress. In addition, this bias adjustment, which also balances out uncertainties in precipitation, is beneficial for improving the simulation of, e.g., the dynamics of downstream wetlands or reservoirs.
However, the simple approach of modifying only one parameter (γ) and up to two additional correction factors by calibration against mean annual streamflow has limitations. Reaching the calibration objective by modifying γ alone is possible only in 519 (524) basins of WaterGAP v2.2e (v2.2d), which indicates that the uncertainties in the input data, the model structure, and the many other model parameters might not be covered well by adjusting only this parameter. In most of the other basins, runoff is still overestimated with the optimum γ, and the correction factors need to lower the runoff. Another model parameter, the maximum soil water storage, has been found to strongly affect runoff generation and the seasonality and trends of terrestrial water storage anomalies, with higher values decreasing runoff and increasing seasonality and trends. Multi-variable calibration of WaterGAP in individual basins and comparison of model output to spaceborne terrestrial water storage anomalies indicate that the cell-specific maximum soil water storage values used in WaterGAP might be too low. Thus, increased values are expected to help achieve the calibration objective by adjusting γ alone.
More complex multi-variable calibration approaches, which use not only observed streamflow but also observations of other model output variables such as TWSA or snow cover, allow us to go beyond bias adjustment and adjust more model parameters. While such ensemble-based calibration approaches have been successfully applied to WaterGAP for individual basins such as the Mississippi sub-basins , they are not yet applicable as a standard approach for global-scale calibration. Such ensemble-based calibration approaches are computationally expensive and also suffer from methodological problems related, for example, to the large footprint of spaceborne terrestrial water storage anomalies ( km2) or trade-offs between the optimal simulation of the different observed variables .
9 Standard model output
Similar to , we provide standard output data for WaterGAP v2.2e driven by the four climate forcings listed in Table and, for comparison, also WaterGAP v2.2d driven by gswp3-w5e5. In addition to the standard ant runs that include direct human impacts (water use and human-made reservoirs, labeled "histsoc"), we provide, for all five variants, the model output of nat model runs, where it is assumed that there is no human water use and no human-made reservoirs (labeled "nosoc"). The data are stored using the Network Common Data Form (netCDF) format developed by UCAR/Unidata and are available from the Goethe University Data Repository (GUDe) . For two forcings and the ant runs, the storage compartments are additionally provided at daily temporal resolution . The netCDF files contain metadata with detailed information regarding characteristics of the data, e.g., whether a storage type contains anomaly values or absolute values, and a legend where applicable.
The available water storages, flows, and water use variables are listed in Tables , , and , respectively. Table includes additional data, such as the cell-specific continental area as used in WaterGAP v2.2e to convert between equivalent water heights (e.w.h.) and volumetric units (assuming a water density of 1 ). A spatial view for a range of model output is available in a web app (
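The conversion between equivalent water heights and volumetric units via the cell-specific continental area can be sketched as follows. This is a minimal illustration, assuming equivalent water height in millimeters and continental area in square kilometers; these units and the function names are assumptions made for the example and should be checked against the metadata of the netCDF files.

```python
MM_PER_KM = 1.0e6  # 1 km = 1,000,000 mm


def ewh_mm_to_volume_km3(ewh_mm: float, continental_area_km2: float) -> float:
    """Convert an equivalent water height (assumed mm) over the continental
    area of a grid cell (assumed km2) into a water volume in km3."""
    return ewh_mm / MM_PER_KM * continental_area_km2


def volume_km3_to_ewh_mm(volume_km3: float, continental_area_km2: float) -> float:
    """Inverse conversion: water volume (km3) to equivalent water height (mm)."""
    if continental_area_km2 <= 0.0:
        raise ValueError("continental area must be positive")
    return volume_km3 / continental_area_km2 * MM_PER_KM


# Example with a hypothetical cell of 2500 km2 continental area:
# 1000 mm of stored water corresponds to 2.5 km3.
print(ewh_mm_to_volume_km3(1000.0, 2500.0))
```

With the stated water density, an equivalent water height of 1 mm corresponds to a mass of 1 kg per square meter, so the same area-based conversion applies to mass-per-area values.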
10 Caveats of WaterGAP v2.2e
This section is a compilation of known issues with the model output and should give guidance to data users.
- Due to the architecture of WaterGAP, where the output of individual water use models is combined to net abstractions from groundwater and net abstractions from surface water in the linking model GWSWUSE (their Sect. 3.3), it is not possible to compute sectoral actual consumptive water use values (and the corresponding withdrawal water uses) but only the total actual consumptive water use (and corresponding withdrawal water use).
- In WaterGAP, the actual total consumptive water use (variable atotuse) is included in the actual evapotranspiration (evap). In cases where surface water abstractions are satisfied from the neighboring cell due to shortages in the original water-demanding cell, the return flows to groundwater are assigned to the original water-demanding cell. This can lead to negative values of (1) atotuse and even (2) evap.
- In dry areas around large rivers, water is often abstracted from neighboring cells with big rivers (e.g., the Nile) to satisfy the water demand in the original demand cell. The return flows increase the groundwater storage in the demanding cell and thus the groundwater outflow, which is then visible in the total runoff, qtot, and could add up to more than the precipitation (precip) in the grid cell. Furthermore, the calibration factor, CFA, can lead to more runoff than precipitation.
- When comparing globally aggregated streamflow from previous versions with WaterGAP v2.2e, it has to be considered that due to the new handling of inland sinks in WaterGAP v2.2e (Sect. ), the endorheic basins contribute to actual evaporation, and the sink cells have zero streamflow. When quantifying the renewable water resources on the global scale, inflow to all inland sinks has to be added to the water resources of the other cells (or the streamflow into oceans).
11 WaterGAP v2.2e in ISIMIP3
WaterGAP contributes to the Inter-Sectoral Impact Model Intercomparison Project (ISIMIP) in its current project phase 3 and follows the simulation protocol of
- The drainage direction map used in WaterGAP does not completely follow the ISIMIP land–sea mask definition, which was modified slightly and unintentionally. In particular, the lat/long 178.75, (an island southeast of Aotearoa / New Zealand) is defined as land, but the drainage direction map used in WaterGAP locates this island in a neighboring cell. Thus, this island is not present, and any model output for the grid cell with lat/long 178.75, is set to a missing value in all files prepared for ISIMIP.
- The WaterGAP drainage direction map differs in four grid cells at Lake Ladoga in the Neva river basin in Russia from the ISIMIP definition (lat/long coordinates of 61.25, 31.25; 60.75, 31.25; 60.75, 31.75; and 60.75, 32.25). Those grid cells are not included in WaterGAP, and the drainage direction flows around this lake, resulting in a total number of 67 420 grid cells considered in WaterGAP v2.2e.
- WaterGAP does not use the land use data as provided by ISIMIP but a static, satellite-based map of land cover classes (their Appendix C). WaterGAP considers temporally varying irrigation areas (their Sect. 3.1) but not those from ISIMIP.
- During the update of the reservoir data (Sect. ), we found better-suited grid cell locations for several dams compared to the input data provided by ISIMIP. The data used within WaterGAP v2.2e are available via .
- According to the modeling protocol, the variable qtot consists of the sum of the surface, qs, and sub-surface, qsb, runoff and is defined as total runoff. However, and specifically for WaterGAP, this implies that for qtot (but not for the net cell runoff ncrun provided in the standard model output), the horizontal water balance (i.e., the water balance of the surface waterbodies) is not considered. For users who want to assess the differences, we provide qtot and ncrun as standard model output.
12 Conclusions and outlook
Since the development of the WaterGAP model started in 1996, numerous model versions have been created and applied in many studies. This paper describes the most recent model version v2.2e, as well as the model output, with a focus on the changes from the previous model version v2.2d described in . With version v2.2e, the applicability of WaterGAP for answering scientific questions has been enhanced compared to previous versions. The performance of v2.2e regarding water use, streamflow, and TWSA does not differ much from v2.2d when using the same climate forcing and the same streamflow observations for model calibration (thus, the only difference is in the model structure). The climate forcing gswp3-w5e5 leads to the highest performance for streamflow, whereas there are distinct regions for which gswp3-era5 is superior to gswp3-w5e5, in particular for TWSA trends.
While version v2.2e has been finalized, the scientific and societal demand for future model development remains. For example, to improve the still poor simulation of the outflow and storage dynamics of artificial reservoirs, the reservoir algorithm should be modified and calibrated, benefiting from the recent availability of remote-sensing-based estimates of reservoir water storage dynamics. The achieved glacier integration into WaterGAP (Sect. ), which has led to an improved representation of TWSA , is unsustainable in the sense that it depends on updates from the glacier modeling community. Therefore, model adjustments and arrangement with the glacier modeling community are required to achieve a continuing integration of glacier model output into WaterGAP, which would particularly improve climate change impact assessments . Then, a future model version of WaterGAP could include a glacier component in its standard variant.
The WaterGAP v2.2e software, written in C/C++, started to be developed nearly 30 years ago. Generations of researchers modified, tested, and documented the code, resulting in very complex software that is difficult to understand, maintain, and enhance. Currently, the WaterGAP Global Hydrology Model and GWSWUSE are being re-programmed in Python with a modern software architecture; this research software will be available as open-source community software, alongside documentation, a user guide, and examples (
Appendix A Technical changes
- Output of monthly groundwater recharge below surface waterbodies is now possible.
- Data arrays are now stored and processed in std::vector objects.
- Several options to run WaterGAP were removed because they were not used anymore.
- Bug in the initialization of reservoir water demand in the respective commissioning years was fixed (routing routine).
- Bug in the reintroduction of return flows into groundwater due to delayed satisfaction of was fixed.
- Bug in the reallocation of unsatisfied at global lakes and reservoirs was fixed.
Appendix B Evaluation metrics
The following section is to a great extent identical to
B1 Nash–Sutcliffe efficiency
The Nash–Sutcliffe efficiency metric NSE (–) is a traditional metric in hydrological modeling. It provides an integrated measure of the model performance with respect to mean values and variability and is calculated as
$$\mathrm{NSE} = 1 - \frac{\sum_{i=1}^{n} (O_i - S_i)^2}{\sum_{i=1}^{n} (O_i - \bar{O})^2}, \tag{B1}$$
where $O_i$ is the observed value (e.g., monthly streamflow), $S_i$ is the simulated value, and $\bar{O}$ is the mean observed value. The optimal value of NSE is one. Values below zero indicate that the mean value of the observations is better than the simulation . For assessing the performance of low values of water abstraction (Sect. ), a logarithmic NSE was also calculated by applying a logarithmic transformation before the calculation of the performance indicator.
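The NSE and its logarithmic variant can be sketched in a few lines of Python. This is a minimal illustration and not the evaluation code used for the paper; the function names are hypothetical, and the logarithmic variant assumes strictly positive values because the handling of zero abstractions is not specified here.

```python
import math
from typing import Sequence


def nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Nash-Sutcliffe efficiency: 1 is optimal, values below 0 mean that
    the observed mean is a better predictor than the simulation."""
    if len(obs) != len(sim) or len(obs) == 0:
        raise ValueError("obs and sim must be non-empty and of equal length")
    mean_obs = sum(obs) / len(obs)
    squared_error = sum((o - s) ** 2 for o, s in zip(obs, sim))
    squared_deviation = sum((o - mean_obs) ** 2 for o in obs)
    if squared_deviation == 0.0:
        raise ValueError("observations have zero variance")
    return 1.0 - squared_error / squared_deviation


def log_nse(obs: Sequence[float], sim: Sequence[float]) -> float:
    """NSE after a logarithmic transformation (assumes values > 0)."""
    return nse([math.log(o) for o in obs], [math.log(s) for s in sim])


# A simulation equal to the observed mean yields an NSE of zero.
print(nse([1.0, 2.0, 3.0], [2.0, 2.0, 2.0]))
```

The base of the logarithm does not affect the result, since a change of base scales the numerator and the denominator by the same factor.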
B2 Kling–Gupta efficiency
The Kling–Gupta efficiency metric, KGE , transparently combines the evaluation of bias, variability, and timing and is calculated (in its 2012 version) as
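$$\mathrm{KGE} = 1 - \sqrt{(\mathrm{KGE}_r - 1)^2 + (\mathrm{KGE}_b - 1)^2 + (\mathrm{KGE}_g - 1)^2}, \tag{B2}$$
where KGEr is the correlation coefficient between the simulated and observed values (–) and an indicator for the timing, KGEb is the ratio of mean values (Eq. B3) (–) and an indicator of biases regarding mean values, and KGEg is the ratio of variability (Eq. B4) (–) and an indicator for the variability in simulated (S) and observed (O) values.
$$\mathrm{KGE}_b = \frac{\mu_S}{\mu_O}, \tag{B3}$$
$$\mathrm{KGE}_g = \frac{\mathrm{CV}_S}{\mathrm{CV}_O} = \frac{\sigma_S / \mu_S}{\sigma_O / \mu_O}, \tag{B4}$$
where $\mu$ is the mean value, $\sigma$ is the standard deviation, and CV is the coefficient of variation. The optimal value of KGE is one.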
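The 2012 version of the KGE and its three components can be sketched as follows. This is a minimal illustration using only the Python standard library; the function names are hypothetical, and population standard deviations are used (the choice between population and sample statistics cancels out in both the correlation and the variability ratio).

```python
import math
from statistics import mean, pstdev
from typing import Sequence, Tuple


def kge_components(obs: Sequence[float],
                   sim: Sequence[float]) -> Tuple[float, float, float]:
    """Return (KGEr, KGEb, KGEg): correlation, ratio of means, and ratio
    of the coefficients of variation of simulated vs. observed values."""
    if len(obs) != len(sim) or len(obs) < 2:
        raise ValueError("obs and sim need equal length and >= 2 values")
    mu_o, mu_s = mean(obs), mean(sim)
    sd_o, sd_s = pstdev(obs), pstdev(sim)
    if sd_o == 0.0 or sd_s == 0.0 or mu_o == 0.0 or mu_s == 0.0:
        raise ValueError("means and standard deviations must be non-zero")
    cov = sum((o - mu_o) * (s - mu_s) for o, s in zip(obs, sim)) / len(obs)
    kge_r = cov / (sd_o * sd_s)              # timing
    kge_b = mu_s / mu_o                      # bias in mean values
    kge_g = (sd_s / mu_s) / (sd_o / mu_o)    # variability (CV ratio)
    return kge_r, kge_b, kge_g


def kge(obs: Sequence[float], sim: Sequence[float]) -> float:
    """Kling-Gupta efficiency (2012 version); the optimal value is 1."""
    kge_r, kge_b, kge_g = kge_components(obs, sim)
    return 1.0 - math.sqrt((kge_r - 1.0) ** 2 + (kge_b - 1.0) ** 2
                           + (kge_g - 1.0) ** 2)


# Doubling all observed values keeps timing and CV but doubles the mean.
print(kge_components([1.0, 2.0, 3.0, 4.0], [2.0, 4.0, 6.0, 8.0]))
```

In the example, only the bias component deviates from its optimum (KGEb of 2), which illustrates why a calibration that constrains the mean bias mainly improves KGEb and the overall KGE while leaving KGEr and KGEg largely unchanged.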
B3 TWSA-related metrics
For the evaluation of TWSA performance, the following metrics were used: $R^2$ (coefficient of determination) as the strength of linear relationship between simulated and observed variables, the amplitude ratio as an indicator for variability, and the trend of GRACE and WaterGAP data. Amplitude and trends were determined by a linear regression for estimating the most dominant temporal components of the GRACE time series. The time series of monthly TWSA was approximated by a constant $a$, a linear trend $b$, and an annual and a semi-annual sinusoidal curve as follows:
$$\mathrm{TWSA}(t) = a + b\,t + c \cos(2\pi t) + d \sin(2\pi t) + e \cos(4\pi t) + f \sin(4\pi t) + \epsilon(t), \tag{B5}$$
where $t$ is the time in years and $\epsilon$ denotes the residuals. The parameters $a$ to $f$ were estimated via least squares adjustment. The annual amplitude can be computed by $A = \sqrt{c^2 + d^2}$, and thus, the annual ratio was calculated from the annual amplitudes of WaterGAP and GRACE.
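The decomposition of a monthly TWSA series into a constant, a linear trend, and annual and semi-annual sinusoids can be sketched as an ordinary least-squares fit. The example below is a sketch under stated assumptions: time is given in decimal years, the parameter names a to f are illustrative, and a small pure-Python solver of the normal equations stands in for whatever least squares routine was actually used.

```python
import math
from typing import Dict, List, Sequence


def _solve(matrix: List[List[float]], rhs: List[float]) -> List[float]:
    """Solve a small linear system by Gaussian elimination with pivoting."""
    n = len(rhs)
    m = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_twsa_components(times: Sequence[float],
                        values: Sequence[float]) -> Dict[str, float]:
    """Fit constant, trend, annual and semi-annual terms; times are in
    decimal years (assumption) and are shifted to start at zero."""
    t0 = times[0]
    rows = []
    for t in times:
        w = 2.0 * math.pi * (t - t0)
        rows.append([1.0, t - t0, math.cos(w), math.sin(w),
                     math.cos(2.0 * w), math.sin(2.0 * w)])
    k = 6
    # Normal equations (A^T A) x = A^T y of the least-squares problem.
    ata = [[sum(r[i] * r[j] for r in rows) for j in range(k)] for i in range(k)]
    aty = [sum(r[i] * y for r, y in zip(rows, values)) for i in range(k)]
    a, b, c, d, e, f = _solve(ata, aty)
    return {"constant": a, "trend": b,
            "annual_amplitude": math.hypot(c, d),
            "semiannual_amplitude": math.hypot(e, f)}


def amplitude_ratio(amplitude_model: float, amplitude_grace: float) -> float:
    """Ratio of annual amplitudes; model over GRACE is an assumption here."""
    return amplitude_model / amplitude_grace
```

Applying `fit_twsa_components` to both the WaterGAP and the GRACE(-FO) basin averages for January 2003 to December 2019 would yield the trends and annual amplitudes that enter the comparison; shifting the time axis changes the phase of the sinusoids but not their amplitudes or the trend.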
Appendix C Performance ratio of indicators
In order to find out where the difference from the optimal value of a model performance indicator is reduced or increased between the two versions (v2.2e vs. v2.2d) of WaterGAP, the indicator performance ratio (Eq. C1) was used and defined as
$$\mathrm{PR}_\mathrm{IND} = \frac{|1 - \mathrm{IND}_\mathrm{v2.2e}|}{|1 - \mathrm{IND}_\mathrm{v2.2d}|}, \tag{C1}$$
where $\mathrm{PR}_\mathrm{IND}$ is the performance ratio of the given indicator IND [–]. IND is the indicator value (KGE and its components for streamflow, with the amplitude ratio for TWSA and the ratio of the model divided by GRACE for the TWSA trend) for the particular model version [–]. The smaller the resulting $\mathrm{PR}_\mathrm{IND}$, the better v2.2e performs compared to v2.2d. For values $< 1$, v2.2e performs better than v2.2d, and vice versa. The closer $\mathrm{PR}_\mathrm{IND}$ is to zero, the better v2.2e performs compared to v2.2d.
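The interpretation given above (values below one favor v2.2e, values close to zero favor it strongly) is consistent with a ratio of absolute distances to the optimum. The following is a minimal sketch under that assumption, with an optimum of 1 for all listed indicators; the function name and the handling of a zero denominator are illustrative choices and not taken from the paper.

```python
def performance_ratio(ind_new: float, ind_old: float,
                      optimum: float = 1.0) -> float:
    """Ratio of the absolute distances to the optimum indicator value.

    ind_new: indicator value of the newer version (here v2.2e)
    ind_old: indicator value of the older version (here v2.2d)
    Values < 1 mean that the newer version is closer to the optimum.
    """
    distance_old = abs(optimum - ind_old)
    if distance_old == 0.0:
        # Assumption: the ratio is undefined if the old version is optimal.
        raise ValueError("old indicator equals the optimum; ratio undefined")
    return abs(optimum - ind_new) / distance_old


# KGE improves from 0.6 to 0.8: the distance to the optimum is halved.
print(performance_ratio(0.8, 0.6))
```

Because the distances are absolute, an indicator that overshoots the optimum (e.g., a bias ratio of 1.2) is treated the same as one that undershoots it by the same amount. This also explains why small absolute changes in an indicator that is already near its optimum, such as KGEb after calibration, can produce large ratios.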
Appendix D Additional figures and tables
Figure D1
Cumulative distribution of the NSE efficiency metric for all streamflow values at the 1509 gauging stations for all model variants.
[Figure omitted. See PDF]
Figure D2
Cumulative distribution of the KGEr for all streamflow values at the 1509 gauging stations for all model variants.
[Figure omitted. See PDF]
Figure D3
Cumulative distribution of the KGEb for all streamflow values at the 1509 gauging stations for all model variants.
[Figure omitted. See PDF]
Figure D4
Cumulative distribution of the KGEg for all streamflow values at the 1509 gauging stations for all model variants.
[Figure omitted. See PDF]
Table D1
Model performance and the NSE efficiency indicator and number of basins per Köppen–Geiger region in the particular performance class for the different WaterGAP variants.
Model variant | NSE | A | B | C | D | E | Sum |
---|---|---|---|---|---|---|---|
v2.2d gswp3-w5e5 | >0.7 | 87 | 13 | 112 | 109 | 14 | 335 |
v2.2d gswp3-w5e5 | 0.5–0.7 | 114 | 25 | 83 | 183 | 6 | 411 |
v2.2d gswp3-w5e5 | <0.5 | 159 | 88 | 113 | 377 | 26 | 763 |
v2.2e gswp3-w5e5 | >0.7 | 88 | 13 | 112 | 111 | 13 | 337 |
v2.2e gswp3-w5e5 | 0.5–0.7 | 113 | 25 | 81 | 191 | 7 | 417 |
v2.2e gswp3-w5e5 | <0.5 | 159 | 88 | 115 | 367 | 26 | 755 |
v2.2e 20crv3-era5 | >0.7 | 51 | 3 | 47 | 146 | 6 | 253 |
v2.2e 20crv3-era5 | 0.5–0.7 | 91 | 19 | 79 | 151 | 4 | 344 |
v2.2e 20crv3-era5 | <0.5 | 206 | 95 | 195 | 398 | 18 | 912 |
v2.2e 20crv3-w5e5 | >0.7 | 56 | 3 | 50 | 123 | 16 | 248 |
v2.2e 20crv3-w5e5 | 0.5–0.7 | 92 | 18 | 66 | 159 | 3 | 338 |
v2.2e 20crv3-w5e5 | <0.5 | 209 | 107 | 194 | 393 | 20 | 923 |
v2.2e gswp3-era5 | >0.7 | 77 | 6 | 103 | 127 | 7 | 320 |
v2.2e gswp3-era5 | 0.5–0.7 | 113 | 21 | 99 | 176 | 6 | 415 |
v2.2e gswp3-era5 | <0.5 | 160 | 88 | 122 | 387 | 17 | 774 |
Table D2
Model performance and the KGEr efficiency indicator and number of basins per Köppen–Geiger region in the particular performance class for the different WaterGAP variants.
Model variant | KGEr | A | B | C | D | E | Sum |
---|---|---|---|---|---|---|---|
v2.2d gswp3-w5e5 | >0.8 | 210 | 31 | 186 | 231 | 18 | 676 |
v2.2d gswp3-w5e5 | 0.5–0.8 | 120 | 53 | 99 | 258 | 17 | 547 |
v2.2d gswp3-w5e5 | <0.5 | 30 | 42 | 23 | 180 | 11 | 286 |
v2.2e gswp3-w5e5 | >0.8 | 210 | 31 | 185 | 233 | 19 | 678 |
v2.2e gswp3-w5e5 | 0.5–0.8 | 121 | 54 | 101 | 256 | 15 | 547 |
v2.2e gswp3-w5e5 | <0.5 | 29 | 41 | 22 | 180 | 12 | 284 |
v2.2e 20crv3-era5 | >0.8 | 123 | 11 | 111 | 262 | 11 | 518 |
v2.2e 20crv3-era5 | 0.5–0.8 | 182 | 57 | 156 | 246 | 13 | 654 |
v2.2e 20crv3-era5 | <0.5 | 43 | 49 | 54 | 187 | 4 | 337 |
v2.2e 20crv3-w5e5 | >0.8 | 141 | 12 | 116 | 228 | 20 | 517 |
v2.2e 20crv3-w5e5 | 0.5–0.8 | 171 | 56 | 148 | 246 | 8 | 629 |
v2.2e 20crv3-w5e5 | <0.5 | 45 | 60 | 46 | 201 | 11 | 363 |
v2.2e gswp3-era5 | >0.8 | 181 | 18 | 180 | 257 | 14 | 650 |
v2.2e gswp3-era5 | 0.5–0.8 | 137 | 58 | 121 | 268 | 11 | 595 |
v2.2e gswp3-era5 | <0.5 | 32 | 39 | 23 | 165 | 5 | 264 |
Table D3
Model performance and the KGEb efficiency indicator and number of basins per Köppen–Geiger region in the particular performance class for the different WaterGAP variants.
Model variant | KGEb | A | B | C | D | E | Sum |
---|---|---|---|---|---|---|---|
v2.2d gswp3-w5e5 | >1.5 | 0 | 4 | 0 | 1 | 0 | 5 |
v2.2d gswp3-w5e5 | 1.1–1.5 | 104 | 32 | 59 | 80 | 1 | 276 |
v2.2d gswp3-w5e5 | 0.9–1.1 | 241 | 60 | 218 | 484 | 29 | 1032 |
v2.2d gswp3-w5e5 | 0.5–0.9 | 14 | 29 | 28 | 104 | 16 | 191 |
v2.2d gswp3-w5e5 | <0.5 | 1 | 1 | 3 | 0 | 0 | 5 |
v2.2e gswp3-w5e5 | >1.5 | 1 | 4 | 0 | 1 | 0 | 6 |
v2.2e gswp3-w5e5 | 1.1–1.5 | 96 | 33 | 56 | 89 | 2 | 276 |
v2.2e gswp3-w5e5 | 0.9–1.1 | 249 | 58 | 222 | 484 | 28 | 1041 |
v2.2e gswp3-w5e5 | 0.5–0.9 | 13 | 30 | 27 | 95 | 16 | 181 |
v2.2e gswp3-w5e5 | <0.5 | 1 | 1 | 3 | 0 | 0 | 5 |
v2.2e 20crv3-era5 | >1.5 | 0 | 4 | 4 | 5 | 0 | 13 |
v2.2e 20crv3-era5 | 1.1–1.5 | 76 | 25 | 97 | 99 | 8 | 305 |
v2.2e 20crv3-era5 | 0.9–1.1 | 246 | 53 | 190 | 540 | 20 | 1049 |
v2.2e 20crv3-era5 | 0.5–0.9 | 26 | 30 | 28 | 50 | 0 | 134 |
v2.2e 20crv3-era5 | <0.5 | 0 | 5 | 2 | 1 | 0 | 8 |
v2.2e 20crv3-w5e5 | >1.5 | 0 | 4 | 5 | 4 | 0 | 13 |
v2.2e 20crv3-w5e5 | 1.1–1.5 | 86 | 35 | 88 | 96 | 3 | 308 |
v2.2e 20crv3-w5e5 | 0.9–1.1 | 251 | 63 | 184 | 481 | 24 | 1003 |
v2.2e 20crv3-w5e5 | 0.5–0.9 | 20 | 25 | 30 | 94 | 12 | 181 |
v2.2e 20crv3-w5e5 | <0.5 | 0 | 1 | 3 | 0 | 0 | 4 |
v2.2e gswp3-era5 | >1.5 | 0 | 4 | 0 | 0 | 0 | 4 |
v2.2e gswp3-era5 | 1.1–1.5 | 94 | 19 | 68 | 93 | 10 | 284 |
v2.2e gswp3-era5 | 0.9–1.1 | 232 | 61 | 224 | 540 | 18 | 1075 |
v2.2e gswp3-era5 | 0.5–0.9 | 23 | 26 | 30 | 56 | 2 | 137 |
v2.2e gswp3-era5 | <0.5 | 1 | 5 | 2 | 1 | 0 | 9 |
Table D4
Model performance and the KGEg efficiency indicator and number of basins per Köppen–Geiger region in the particular performance class for the different WaterGAP variants.
Model variant | KGEg | A | B | C | D | E | Sum |
---|---|---|---|---|---|---|---|
v2.2d gswp3-w5e5 | >1.5 | 56 | 19 | 32 | 30 | 3 | 140 |
v2.2d gswp3-w5e5 | 1.1–1.5 | 68 | 21 | 69 | 57 | 8 | 223 |
v2.2d gswp3-w5e5 | 0.9–1.1 | 68 | 18 | 110 | 109 | 9 | 314 |
v2.2d gswp3-w5e5 | 0.5–0.9 | 150 | 54 | 89 | 317 | 14 | 624 |
v2.2d gswp3-w5e5 | <0.5 | 18 | 14 | 8 | 156 | 12 | 208 |
v2.2e gswp3-w5e5 | >1.5 | 54 | 19 | 32 | 30 | 3 | 138 |
v2.2e gswp3-w5e5 | 1.1–1.5 | 70 | 23 | 70 | 57 | 8 | 228 |
v2.2e gswp3-w5e5 | 0.9–1.1 | 67 | 17 | 110 | 107 | 8 | 309 |
v2.2e gswp3-w5e5 | 0.5–0.9 | 152 | 52 | 87 | 316 | 15 | 622 |
v2.2e gswp3-w5e5 | <0.5 | 17 | 15 | 9 | 159 | 12 | 212 |
v2.2e 20crv3-era5 | >1.5 | 63 | 23 | 22 | 29 | 3 | 141 |
v2.2e 20crv3-era5 | 1.1–1.5 | 40 | 12 | 67 | 79 | 9 | 207 |
v2.2e 20crv3-era5 | 0.9–1.1 | 61 | 15 | 91 | 111 | 11 | 289 |
v2.2e 20crv3-era5 | 0.5–0.9 | 165 | 57 | 127 | 294 | 3 | 646 |
v2.2e 20crv3-era5 | <0.5 | 19 | 10 | 13 | 182 | 2 | 226 |
v2.2e 20crv3-w5e5 | >1.5 | 65 | 23 | 33 | 32 | 2 | 155 |
v2.2e 20crv3-w5e5 | 1.1–1.5 | 70 | 23 | 75 | 54 | 8 | 230 |
v2.2e 20crv3-w5e5 | 0.9–1.1 | 61 | 24 | 106 | 100 | 10 | 301 |
v2.2e 20crv3-w5e5 | 0.5–0.9 | 147 | 47 | 88 | 328 | 8 | 618 |
v2.2e 20crv3-w5e5 | <0.5 | 14 | 11 | 8 | 161 | 11 | 205 |
v2.2e gswp3-era5 | >1.5 | 50 | 18 | 28 | 26 | 3 | 125 |
v2.2e gswp3-era5 | 1.1–1.5 | 42 | 10 | 70 | 77 | 7 | 206 |
v2.2e gswp3-era5 | 0.9–1.1 | 50 | 10 | 89 | 121 | 12 | 282 |
v2.2e gswp3-era5 | 0.5–0.9 | 182 | 61 | 123 | 288 | 5 | 659 |
v2.2e gswp3-era5 | <0.5 | 26 | 16 | 14 | 178 | 3 | 237 |
Table E1
Standard WaterGAP output variables. (1) Water storage. Units are in (). Each water storage, except for reservoirstor, is also available in a naturalized variant, as indicated by the suffix, nat, in the file name. The temporal resolution is monthly, except for two climate forcings that are additionally available in a daily resolution.
Storage type | GUDe variable file | Symbol in |
---|---|---|
Total water storagea,b | tws | |
Canopy water storage | canopystor | |
Snow water storage | swe | |
Soil water storage | soilmoist | |
Groundwater storageb | groundwstor | |
Local lake storageb | loclakestor | |
Global lake storageb | glolakestor | |
Local wetland storage | locwetlandstor | |
Global wetland storage | glowetlandstor | |
Reservoir storage | reservoirstor | |
River storage | riverstor |
a Sum of all compartments below. b Relative water storage; only anomalies with respect to a reference period can be evaluated.
Table E2
Standard WaterGAP output variables. (2) Flows. Units are in (), except for for dis and for triver. The temporal resolution is monthly.
Flow type | GUDe variable file | Symbol in |
---|---|---|
Monthly precipitation | precmon | |
Fast surface and fast subsurface runoffa | qs | ; in corrigendum |
Diffuse groundwater recharge | qrdif | |
Groundwater recharge from surface waterbodies | qrswb | |
Total groundwater rechargeb | qr | |
Runoff from landc | ql | in corrigendum |
Groundwater discharged | qg | |
Total runoff from lande | qtot | sum of and |
Actual evapotranspiration f | evap | |
Potential evapotranspiration | potevap | |
Net cell runoff | ncrun | |
Streamflowg | dis | |
River water temperature | triver | NA |
NA: not available. a Fraction of total runoff from land that does not recharge the groundwater; b sum of qrdif and qrswb; c sum of qs and qrdif; d groundwater runoff; e sum of ql and qg; f sum of soil evapotranspiration, sublimation, evaporation from canopy, evaporation from waterbodies, and actual consumptive water use; g river discharge.
Table E3
Standard WaterGAP output variables. (3) Water use. Units are in (). The temporal resolution is monthly.
Flow type | GUDe variable file | Symbol in |
---|---|---|
Potential consumptive water use for domestic sector | pdomuse | |
Potential withdrawal water use for domestic sector | pdomww | |
Potential consumptive water use for thermoelectric sector | pelecuse | |
Potential withdrawal water use for thermoelectric sector | pelecww | |
Potential consumptive water use for irrigation sector | pirruse | |
Potential withdrawal water use for irrigation sector | pirrww | |
Potential withdrawal water use for irrigation sector from groundwater resources | pirrwwgw | |
Potential consumptive water use for livestock sectora | plivuse | |
Potential consumptive water use for manufacturing sector | pmanuse | |
Potential consumptive water use for manufacturing sector from groundwater resources | pmanusegw | |
Potential withdrawal water use for manufacturing sector | pmanww | |
Potential withdrawal water use for manufacturing sector from groundwater resources | pmanwwgw | |
Potential net abstraction from surface water | pnas | |
Potential net abstraction from groundwater | pnag | |
Potential consumptive water use from groundwater | pgwuse | |
Potential withdrawal water use from groundwater | pgwww | |
Potential consumptive water useb | ptotuse | |
Potential withdrawal water usec | ptotww | |
Actual net abstraction from surface water | anas | |
Actual net abstraction from groundwater | anag | |
Actual consumptive water used | atotuse |
a Equals withdrawal water use; b sum of pnas and pnag; c sum of pdomww, pelecww, pirrww, plivuse, and pmanww; d sum of anas and anag.
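The aggregate water use variables can be cross-checked against their components using the relationships stated in the table footnotes. The following is a minimal sketch, assuming the values of one grid cell and one month are held in a plain dictionary keyed by the variable names from the table; the function name and the data layout are illustrative only.

```python
from typing import Dict, Mapping


def water_use_totals(cell: Mapping[str, float]) -> Dict[str, float]:
    """Derive the aggregate water use variables from their components,
    following the table footnotes b, c, and d."""
    withdrawal_parts = ("pdomww", "pelecww", "pirrww", "plivuse", "pmanww")
    return {
        # b: potential consumptive water use = pnas + pnag
        "ptotuse": cell["pnas"] + cell["pnag"],
        # c: potential withdrawal water use = sum of sectoral withdrawals
        #    (for livestock, consumptive use equals withdrawal, footnote a)
        "ptotww": sum(cell[name] for name in withdrawal_parts),
        # d: actual consumptive water use = anas + anag
        "atotuse": cell["anas"] + cell["anag"],
    }


# Hypothetical values for one grid cell; a net abstraction can be negative
# when return flows exceed the abstraction from that source.
example = {"pnas": 1.5, "pnag": -0.5, "anas": 1.25, "anag": -0.5,
           "pdomww": 0.5, "pelecww": 0.25, "pirrww": 2.0,
           "plivuse": 0.125, "pmanww": 0.125}
print(water_use_totals(example))
```

Comparing such derived totals with the provided ptotuse, ptotww, and atotuse files is one way to confirm that the files have been read with consistent units and sign conventions.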
Table E4
Standard WaterGAP output variables. (4) Additional files provided for a better understanding of the model outputs.
Storage type | GUDe variable file | Symbol in |
---|---|---|
Calibration status of the basin | calstatus | CS |
Area correction factor from calibration | cfa | CFA |
Station correction factor from calibration | cfs | CFS |
Gamma factor from calibration | gamma | |
Continental area of the grid cell | continentalarea | |
Flow direction in D8 schema | flowdirection | |
Outflow cells to oceans and inland sinks | outflowcells | |
Rooting depth of the grid cell | rootdepth | |
Maximum soil water capacity of the soil compartment | smax | |
Commissioning year of the reservoirs | startyear | |
Code and data availability
The code of WaterGAP v2.2e is open-source under the GNU Lesser General Public License version 3 and is available at https://doi.org/10.5281/zenodo.10026943. The model output data availability is described in Sect. . The streamflow data for the evaluation are available at https://doi.org/10.5281/zenodo.7255968, and the GRACE(-FO) data are available at . For the latest papers published based on WaterGAP 2, we refer the reader to
The supplement related to this article is available online at:
Author contributions
HMS and PD led the development of WaterGAP v2.2e. HMS led the software development, supported by TT, SA, DC, TAP, and PD. The paper was conceptualized by HMS and PD. HMS did the calibrations, simulations, and data analysis; prepared the model output for the GUDe data repository; did the visualization and model validation; and was supported by MS regarding the validation against GRACE TWSA. EK provided the updated non-irrigation water use data. The original draft was written by HMS, with specific parts drafted by TT, SA, DC, MF, HG, TAP, LS, MS, and PD. All authors contributed to the final version of the paper.
Competing interests
The contact author has declared that none of the authors has any competing interests.
Disclaimer
Publisher's note: Copernicus Publications remains neutral with regard to jurisdictional claims made in the text, published maps, institutional affiliations, or any other geographical representation in this paper. While Copernicus Publications makes every effort to include appropriate place names, the final responsibility lies with the authors.
Acknowledgements
We acknowledge the ISIMIP team for producing and making available the ISIMIP input data. We thank Georg Seitfudem for support in finding and solving the bug in the domestic water use data. We furthermore thank Lukas Grittner for polishing the reference list and for technical support during the preparation of this work. We thank Seyed-Mohammad Hosseini-Moghari for reviewing the draft. We are grateful to Guillaume Attard for creating the WaterGAP Explorer. We are thankful for valuable comments and suggestions from two anonymous referees, which helped to streamline and improve the consistency of the paper.
Financial support
Maike Schumacher has been supported by a research grant from VILLUM FONDEN (grant no. VIL60779). This open-access publication was funded by Goethe University Frankfurt.
Review statement
This paper was edited by Nathaniel Chaney and reviewed by two anonymous referees.
© 2024. This work is published under https://creativecommons.org/licenses/by/4.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Water – Global Assessment and Prognosis (WaterGAP) is a modeling approach for quantifying water resources and water use for all land areas of the Earth that has served science and society since 1996. In this paper, the refinements, new algorithms, and new data of the most recent model version v2.2e are described, together with a thorough evaluation of the simulated water use, streamflow, and terrestrial water storage anomaly against observation data. WaterGAP v2.2e improves the handling of inland sinks and now excludes not only large but also small human-made reservoirs when simulating naturalized conditions. The reservoir and non-irrigation water use data were updated. In addition, the model was calibrated against an updated and extended data set of streamflow observations at 1509 gauging stations. The modifications resulted in a small decrease in the estimated global renewable water resources. The model can now be started using prescribed water storages and other conditions, facilitating data assimilation and near-real-time monitoring and forecast simulations. For specific applications, the model can consider the output of a glacier model, approximate the effect of rising CO2 concentrations on evapotranspiration, or calculate the water temperature in rivers. In the paper, the publicly available standard model output is described, and caveats of the model version are provided alongside the description of the model setup in the ISIMIP3 framework.
1 Institute of Physical Geography, Goethe University Frankfurt, Frankfurt am Main, Germany; Senckenberg Leibniz Biodiversity and Climate Research Centre (SBiK-F), Frankfurt am Main, Germany
2 Institute of Physical Geography, Goethe University Frankfurt, Frankfurt am Main, Germany
3 Chair of Engineering Hydrology and Water Resources Management, Ruhr University Bochum, Bochum, Germany
4 Institute of Geodesy and Geoinformation, University of Bonn, Bonn, Germany
5 Center for Environmental Systems Research, University of Kassel, Kassel, Germany
6 Geodesy Group, Department of Sustainability and Planning, Aalborg University, Aalborg, Denmark