Introduction
The ice thickness distribution of a glacier, ice cap, or ice sheet is a fundamental parameter for many glaciological applications. It determines the total volume of the ice body, which is crucial to quantify water availability or sea-level change, and provides the link between surface and subglacial topography, which is a prerequisite for ice flow modelling studies. Despite this importance, knowledge about the ice thickness of glaciers and ice caps around the globe is limited – a fact linked mainly to the difficulties in measuring ice thickness directly. To overcome this problem, a number of methods have been developed to infer the total volume and/or the ice thickness distribution of ice masses from characteristics of the surface.
Amongst the simplest methods, so-called “scaling approaches” are the
most popular
Methods that yield distributed information about the ice
thickness generally rely on theoretical
considerations. , for example, noted that for the case
of an idealized glacier of infinite width, ice thickness can be
calculated from the surface slope using estimates of basal shear stress
and assuming perfect plastic behaviour. successively
extended the considerations to valley glaciers of idealized shapes,
whilst additionally accounted for the effect of
side drag from the glacier margins. Common to these three approaches
is the assumption of a constant and known basal shear
stress. were the first suggesting that the latter
can be estimated from the glacier elevation range, and the corresponding
parametrization has been used in a series of recent studies
Early approaches that take into account mass conservation and ice flow
dynamics go back to and , whose
ideas were further developed by and
. The latter approach was successively
extended by , who presented the first globally
complete estimate for the ice thickness distribution of individual
glaciers. Alternative methods based on more rigorous
inverse modelling, in contrast, have often focused on
additionally inferring basal slipperiness together with bedrock topography
Overview of the test cases considered in ITMIX. Glacier type follows the GLIMS classification guidance . “Calv” indicates whether the glacier or ice cap is affected by calving (x) or not (–). The following abbreviations are used: glacier area (A); simple basin (SB); compound basin (CB); mountain (mtn.); glacier outline (OL); digital elevation model of the glacier surface (DEM); surface mass balance (SMB); surface ice flow velocity (Vel.); rate of ice thickness change (h/t); ice thickness measurements (H); unpublished data (Unpub.). References to the data are given.
Test case | Type | Calv | A (km) | Available data and source |
---|---|---|---|---|
Academy | Ice cap | x | 5587.2 | OL, DEM, H: |
Aqqutikitsoq | SB valley gl. | – | 2.8 | OL, DEM, H: |
Austfonna | Ice cap | x | 7804.8 | OL, DEM: ; , SMB: Unpub. G. Moholdt; |
Vel.: ; H: | ||||
Brewster | SB mountain gl. | – | 2.5 | OL: ; DEM: ; SMB: ; |
Vel.: Unpub. B. Anderson; H: | ||||
Columbia | CB valley gl. | x | 937.1 | OL, DEM, H: |
Devon | Ice cap | x | 14015.0 | OL, DEM, H: ; Vel.: Unpub. GAMMA |
Elbrus | Crater mnt. gl. | – | 120.8 | OL, , H: Unpub. RAS; DEM: ; |
SMB: | ||||
Freya | SB valley gl. | – | 5.3 | OL, DEM, H: Unpub. ZAMG; SMB: |
Hellstugubreen | CB valley gl. | – | 2.8 | OL: ; DEM, SMB, : ; |
Vel.: Unpub. NVE; H: | ||||
Kesselwandferner | SB mountain gl. | – | 4.1 | OL, DEM: ; SMB: ; |
H: | ||||
Mocho | Crater mnt. gl. | – | 15.2 | OL, H: ; DEM: ASTER GDEM v2; |
SMB: Unpub. M. Schaefer | ||||
North Glacier | SB valley gl. | – | 7.0 | OL, DEM, H: ; Vel.: Unpub. G. Flowers |
South Glacier | SB valley gl. | – | 5.3 | OL, DEM, H: ; SMB: ; |
Vel.: | ||||
Starbuck | CB outlet gl. | x | 259.7 | OL, H: ; DEM: |
Tasman | CB valley gl. | – | 100.3 | OL: ; DEM: ; |
SMB, Vel.: Unpub. B. Anderson; H: | ||||
Unteraar | CB valley gl. | – | 22.7 | OL, DEM, , SMB: Unpub. VAW-ETHZ; Vel.: ; |
H: | ||||
Urumqi | SB mountain gl. | – | 1.6 | OL, DEM, SMB, H: |
Washmawapta | Cirque mnt. gl. | – | 0.9 | OL, DEM, H: |
Synthetic1 | CB valley gl. | – | 10.3 | OL, DEM, SMB, Vel., , H: Unpub. C. Martin and D. Farinotti |
Synthetic2 | CB mountain gl. | – | 35.3 | OL, DEM, SMB, Vel., , H: Unpub. C. Martin and D. Farinotti |
Synthetic3 | Ice cap | – | 89.9 | OL, DEM, SMB, Vel., , H: Unpub. C. Martin and D. Farinotti |
GAMMA Remote Sensing Research and Consulting AG, Gümligen, Switzerland; contact person: T. Strozzi. Russian Academy of Sciences, Institute of Geography, Moscow, Russia; contact person: S. Kutuzov. Zentralanstalt für Meteorologie and Geodynamik (ZAMG), Vienna, Austria; contact person: D. Binder. Norwegian Water Resources and Energy Directorate (NVE), Oslo, Norway; contact person: L. M. Andreassen. ASTER GDEM is a product of NASA and METI. Laboratory of Hydraulics, Hydrology and Glaciology (VAW), ETH Zurich, Zurich, Switzerland; contact person: A. Bauder.
In the recent past, the number of methods aiming at estimating the ice
thickness distribution from characteristics of the surface has
increased at a rapid pace. Methods have been presented that include
additional data such as surface velocities and mass balance
Against this background, the working group on glacier ice thickness
estimation, hosted by the International Association of Cryospheric
Sciences (
This article presents both the experimental set-up of ITMIX and the results of the intercomparison. The accuracy of individual approaches is assessed in a unified manner, and the strengths and shortcomings of individual models are highlighted. By doing so, ITMIX not only provides quantitative constraints on the accuracies that can be expected from individual models but also aims at setting the basis for developing a new generation of improved ice thickness estimation approaches.
Experimental set-up
ITMIX was conducted as an open experiment, with a call for
participation posted on the email distribution list “Cryolist”
(
The input data referred to the surface characteristics of a predefined set of 21 test cases (see next section, Table , and Fig. ) and participants were asked to use these data for generating an estimate of the corresponding ice thickness distribution. Results were collected and compared to direct ice thickness measurements.
No prior information about ice thickness was provided, and the participants were asked not to make use of published ice thickness measurements referring to the considered test cases for model calibration. This was to mimic the general case in which the ice thickness distribution for unmeasured glaciers has to be estimated. The compliance to the above rule relied on honesty.
Participants were asked to treat as many test cases as possible and to consider data availability (see next section and Table ) as the only factor limiting the number of addressed cases. Details on the considered test cases and the participating models are given in Sects. and respectively. An overview of the solutions submitted to the experiment is given in Table .
Overview of provided and used data, as well as test cases considered by individual models. Names of ice caps are flagged with an asterisk (*). Models are named after the modeller submitting the results; alternative model identifiers that have been used in the literature are given in parentheses. The category refers to the classification provided in Sect. and includes (1) minimization approaches, (2) mass conserving approaches, (3) shear-stress-based approaches, (4) velocity-based approaches, and (5) other approaches. Abbreviations: glacier outline and digital elevation model of the surface (OL+DEM); surface mass balance (SMB); surface ice flow velocity (Vel.); rate of ice thickness change (). For “Vel.”, a distributed field of flow speeds (s) and flow directions (d) or individual point measurements (p) were provided. “x” (“.”) indicates that the given information was (not) provided/used. In the columns “Data used”, “(x)” indicates that the information was used when available, but that it is not strictly necessary for model application. References for the data source are given in Table .
Provided data | Academy* | Aqqutikitsoq | Austfonna* | Brewster | Columbia | Devon* | Elbrus | Freya | Hellstugubreen | Kesselwandferner | Mocho | North Glacier | South Glacier | Starbuck | Tasman | Unteraar | Urumqi | Washmawapta | Synthetic1 | Synthetic2 | Synthetic3 | TOTAL cases | Data used | |||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Category | OL+DEM | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | OL+DEM | SMB | Vel. | |||
SMB | . | . | x | x | . | . | x | x | x | x | x | . | x | . | x | x | x | . | x | x | x | |||||||
Vel. | . | . | sd | p | . | sd | . | . | p | . | . | p | p | . | s | sd | . | . | sd | sd | sd | |||||||
Model | . | . | x | . | . | . | x | . | x | . | . | . | . | . | . | x | . | . | x | x | x | |||||||
1 | Brinkerhoff-v2 | . | . | . | x | . | . | . | x | x | x | . | . | x | . | . | x | x | . | x | x | x | 10 | x | x | (x) | . | |
1 | Fuerst | . | . | x | . | . | . | . | . | . | . | . | . | . | . | . | x | . | . | x | x | x | 5 | x | x | x | x | |
1 | VanPeltLeclercq | . | . | . | x | . | . | x | x | x | x | x | . | . | . | . | x | . | . | x | x | x | 10 | x | x | (x) | . | |
2 | Farinotti (ITEM) | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | 21 | x | . | . | . | |
2 | GCbedstress | x | x | . | x | . | . | . | x | x | x | x | x | x | . | . | x | x | x | x | x | x | 15 | x | (x) | . | (x) | |
2 | Huss (HF-model) | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | 21 | x | . | . | . | |
2 | Maussion (OGGM) | x | x | x | x | x | x | x | x | x | x | x | x | x | . | x | x | x | x | x | x | . | 19 | x | . | . | . | |
2 | Morlighem | . | . | . | . | . | . | . | x | x | x | . | . | x | . | x | x | x | . | x | x | x | 10 | x | x | (x) | . | |
3 | Linsbauer (GlabTop) | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | 21 | x | . | . | . | |
3 | Machguth (GlabTop2) | x | x | . | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | x | 20 | x | . | . | . | |
3 | RAAJglabtop2 | . | . | . | . | . | . | . | . | . | . | . | x | . | . | x | x | . | . | . | x | . | 4 | x | . | . | . | |
4 | Gantayat | . | . | x | . | . | x | . | . | . | . | . | . | . | . | x | x | . | . | x | x | x | 7 | x | . | x | . | |
4 | Gantayat-v2 | . | . | x | . | . | x | . | . | . | . | . | . | . | . | x | x | . | . | x | x | x | 7 | x | . | x | . | |
4 | RAAJgantayat | . | . | . | x | . | . | . | . | . | . | . | x | . | . | x | x | . | . | x | . | . | 5 | x | . | x | . | |
4 | Rabatel | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | x | . | . | 1 | x | x | x | . | |
5 | Brinkerhoff | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | . | x | x | x | 3 | x | x | x | x | |
5 | GCneuralnet | . | x | . | . | . | . | . | . | . | x | x | x | x | . | x | x | . | . | x | x | x | 10 | x | . | . | . | |
TOTAL models | 6 | 7 | 7 | 9 | 5 | 7 | 6 | 9 | 9 | 10 | 8 | 9 | 9 | 4 | 11 | 15 | 8 | 6 | 16 | 15 | 13 | 189 |
Considered test cases and data
The considered test cases included 15 glaciers and 3 ice caps for which direct ice thickness measurements are available and 3 synthetically generated glaciers virtually “grown” over known bedrock topographies (more detailed information below). The real-world test cases (see Fig. for geographical distribution) were chosen to reflect different glacier morphologies (see Table ) and different climatic regions, whilst the synthetic test cases were included to have a set of experiments for which all necessary information is perfectly known. Since most published approaches for estimating ice thickness were developed for applications on mountain glaciers and smaller ice caps, ice sheets were not included in the experiment.
For each test case, the input data provided to the ITMIX participants included at a minimum (a) an outline of the glacier or ice cap and (b) a gridded digital elevation model (DEM) of the ice surface. Further information was provided on a case-by-case basis depending on data availability, including the spatial distribution of the (i) surface mass balance (SMB), (ii) rate of ice thickness change (), and (iii) surface flow velocity. An overview of the data available for individual test cases and the corresponding data sources is given in Tables and respectively.
For the real-world test cases, and whenever possible, temporal consistency was ensured between individual data sets. Glacier outlines and DEMs were snapshots for a given point in time, whereas SMB, , and velocity fields generally referred to multi-year averages for an epoch as close as possible to the corresponding DEM. Glacier-wide estimates of surface velocities were not available for any of the considered cases. For obtaining a possibly complete coverage, velocities from separate sources were therefore merged, which often led to discontinuities along the tile margins.
Ice thickness measurements were only used for quantifying model performance but were not distributed to the ITMIX participants. Bedrock elevations were obtained by subtracting observed ice thicknesses from surface elevations, and the bedrock was assumed to remain unchanged over time. The time periods the individual data sets are referring to are given in Supplement Table S1. Note that no specific information about the uncertainties associated to individual measurements were available. Reported uncertainties for ice thickness measurements, however, are typically below 5 % .
The synthetic test cases were generated by “growing” ice masses over known bedrock topographies with the Elmer/Ice ice flow model . To do so, selected deglacierized areas located in the European Alps were extracted from local high-resolution DEMs (product DHM25 by the Swiss Federal Office of Topography) and the flow model forced with a prescribed SMB field. The SMB field was either generated by prescribing an equilibrium-line altitude and two separate SMB elevation gradients for the accumulation and ablation zone (test cases “Synthetic1” and “Synthetic2”) or by constructing the field through a multiple linear regression between SMB and terrain elevation, slope, aspect, curvature, and local position (test case “Synthetic3”). In the latter case, the individual regression parameters were defined arbitrarily but such to ensure a plausible range for the resulting SMB field. The Elmer/Ice simulations were stopped after the formation of a glacier judged to be of suitable size and shape, and the corresponding and surface velocity fields were extracted. No sliding at the glacier base was assumed, and all three resulting geometries were close to steady state. Note that, to avoid numerical instabilities, the DEM used for prescribing the bedrock topography had to be smoothed significantly. For anonymizing the individual locations, the original coordinates were removed, and the individual tiles arbitrarily rotated and shifted in elevation.
All data provided as input to the ITMIX participants, as well as the results submitted by individual models, will be provided as an electronic Supplement to this article. The direct ice thickness measurements were additionally included in the Glacier Thickness Database (GlaThiDa) version 2 .
Overview of the considered real-world test cases. Note that some names are shortened for convenience (Academy is Academy of Sciences Ice Cap; Devon is Devon Ice Cap; Mocho is Glaciar Mocho-Choshuenco; Unteraar is Unteraargletscher; Urumqi is Urumqi Glacier no. 1).
[Figure omitted. See PDF]
Participating models
The ITMIX call for participation was answered by 13 research groups having access to 15 different models in total. Two modelling approaches were used twice, with two independent implementations stemming from two different groups, nine models were published prior to the call, one model consisted of a modification of an existing approach, and five models were previously unpublished. In total, thus, 17 different models submitted individual solutions (Table ).
The 17 approaches providing individual solutions can be classified into five different categories: (1) approaches casting ice thickness inversion as a minimization problem (minimization approaches), (2) approaches based on mass conservation (mass conserving approaches), (3) approaches based on a parametrization of basal shear stress (shear-stress-based approaches), (4) approaches based on observed surface velocities (velocity-based approaches), and (5) other approaches. The principle of each of the five categories is briefly described hereafter. A more detailed description, including information about parameter choices, is found in the Supplement (Supplement Sect. S1). The supplementary description is exhaustive for unpublished approaches and is held at a minimum for published ones.
Minimization approaches
Methods within this category formulate the problem of ice thickness
inversion as a minimization problem. They do so by defining a cost
function that penalizes the difference between a modelled and an
observable quantity. Typically, the observable quantity includes the
elevation of the glacier surface
“Brinkerhoff-v2” (Brinkerhoff, unpublished; see Supplement Sect. S1.2 for details) includes three terms in the cost function. The first term quantifies the difference between modelled and observed surface elevations; the second penalizes strong spatial variations in bedrock elevations; and the third is used to impose zero ice thickness outside the glacier boundaries. If available, surface flow velocities are used to additionally invert for the basal traction field. The forward model is based on the Blatter–Pattyn approximation to the Stokes equations .
“VanPeltLeclercq” (adapted from ; see Supplement Sect. S1.17) has a cost function based on the difference of modelled and observed surface elevation as well. In contrast to Brinkerhoff-v2, which evaluates the cost function for steady-state surfaces, this approach allows for transient surface geometries to be taken into account. If available, the mismatch between calculated and observed surface velocities is used for both stopping the iteration procedure and for optimizing the model parameters affecting basal sliding and deformational flow. The implemented forward model “SIADYN” is part of the ICEDYN package
Sect. 3.3 in , relies on the vertically integrated shallow ice approximatione.g. , and includes Weertman-type sliding .“Fuerst” (Fürst et al., unpublished; Supplement Sect. S1.4) differs from the two above approaches in that the cost function is not linked to surface elevations. Instead, the function penalizes (i) negative thickness values, (ii) the mismatch between modelled and observed surface velocities, (iii) the mismatch between modelled and observed SMB, and (iv) strong spatial variations in ice thickness or surface velocities. The forward model is based on Elmer/Ice and the mass conservation approach of .
Mass conserving approaches
Methods appertaining to this category are based on the principle of mass conservation. If ice is treated as an incompressible medium, the corresponding continuum equation states that the ice flux divergence has to be compensated by the rate of ice thickness change and the climatic mass balance : The methods of this category estimate the distribution of both and and use that estimate to quantify the glacier's mass turnover along the glacier. The mass flux is then converted into ice thickness by prescribing some constitutive relation. Most often, an integrated form of Glen's flow law is used. The corresponding equation, solved for ice thickness, is then generally formulated as where is glacier ice thickness, the mean specific ice volume flux, the flow rate factor, Glen's flow law exponent, the ice density, the gravitational acceleration, the surface slope, and a factor accounting for valley shape, basal sliding, and parameter uncertainty. To avoid infinite for tending to zero, a minimal surface slope is often imposed, or is averaged over a given distance. Based on theoretical considerations , this distance should correspond to 10–20 times the ice thickness. In most cases, the ice thickness is first inferred along prescribed ice flow lines and then distributed across the glacier or ice cap by choosing a suitable interpolation scheme. Five of the models participating in ITMIX belong to this category.
“Farinotti”
also referred to as ITEM in the literature; see Supplement Sect. S1.3 evaluates Eq. () for manually digitized “ice flow catchments” and along manually predefined ice flow lines. The ice volume flux across individual cross sections is estimated by integrating the SMB field of the corresponding upstream area. The method was the first suggesting that the necessity of a steady-state assumption can be circumvented when directly estimating the difference rather than imposing constraints on the two terms separately. Many of the approaches within this and other categories have adopted this idea.“Maussion” (Maussion et al., unpublished; see Supplement Sect. S1.12) is based on the same approach as . By relying on the Open Global Glacier Model version 0.1.1
OGGM v0.1.1; , however, it fully automatizes the method, thus making it applicable at larger scales. Automatization is achieved by generating multiple flow lines according to the methods presented in . The major difference between Maussion/OGGM and the approaches Farinotti or Huss is that SMB is not prescribed as a linear function of elevation but with a temperature-index model driven by gridded climate data .“Huss”
HF-model; see Supplement Sect. S1.9 extends Eq. () to account for additional factors such as basal sliding, longitudinal variations in the valley shape factor, frontal ablation, and the influence of ice temperature and the climatic regime. The latter is achieved by imposing site-specific parameters. A major difference compared to other models in this category is that all calculations are performed on elevation bands. Mean elevation-band thickness is then extrapolated to a spatially distributed field by considering local surface slope and the distance from the glacier margin. The approach was the first ice thickness model that was applied to the global scale.“GCbedstress” (; see Supplement Sec. S1.7) shares many conceptual features with as well, but it differs in its implementation. Manually delineated flow sheds are transversely dissected by ladder-like “rungs” representing flux gates. Ice flow discharges – derived from integration of the mass contribution from the upstream area – are then applied to intervening cells by interpolation. “Raw” ice thickness is derived from Eq. () and the final ice thickness is smoothed by minimizing a cost function that negotiates a tradeoff between accepting the raw estimates or maximizing the smoothness of the solution.
“Morlighem” (; Supplement Sect. S1.13) was originally designed to fill gaps between ground-penetrating radar measurements over ice sheets. As such, it was cast as an optimization problem minimizing the misfit between observed and modelled thicknesses. Since no such measurements were provided within ITMIX, the method was applied without the minimization scheme. The method is thus purely based on mass conservation. Ice thickness is computed by requiring the ice flux divergence to be balanced by the rate of thickness change and the net surface and basal mass balances (see Eq. ). For the test cases for which no ice velocities were provided, the shallow ice approximation (see below) was used together with an assumption of no-sliding to convert the computed ice mass flux into ice thickness.
Shear-stress-based approaches
Methods of this category rely on the shallow ice approximation
“Linsbauer”
GlabTop; see Sect, S1.10 was the first proposing to use the empirical relation by to solve Eq. (). This is done by considering manually digitized branch lines, and determining within 50 m elevation bins. An ice thickness distribution is then obtained by interpolating the so-obtained ice thickness along several branch lines.“Machguth”
GlabTop2; see Supplement Sect. S1.11 is based on the same concept but overcomes the need of manually drawing branch lines by applying the relation at randomly selected grid cells. During this process, is determined from the average slope of all grid cells within a predefined elevation buffer. The final ice thickness distribution is derived from interpolation of the randomly selected points and the condition of zero ice thickness at the glacier margin. The procedure by which the random points are selected has an influence on the shape of the obtained bedrock topography.“RAAJglabtop2”
re-implemented from see Supplement Sect. S1.15 is an independent re-implementation of the Machguth model. Individual differences in terms of coding solutions may exist but were not assessed during the experiment.
Velocity-based approaches
As for models in Sect. , models in this category are based on an integrated form of Glen's flow law . Differently as in Eq. (), however, the flow law is either expressed as where and are the surface and basal ice flow velocities respectively, or such to replace in Eq. () with the depth-averaged profile velocity (since ). An assumption relating to or is then made, which usually implies postulating the existence of some coefficient or for which or holds true everywhere. Four models participating in ITMIX follow this strategy:
“Gantayat” (; see Supplement Sect. S1.5) solves Eq. () in elevation bands, and by substituting according to Eq. (). The central assumption is that . A final, gridded ice thickness distribution is then obtained by smoothing the elevation-band thickness with a kernel.
“RAAJgantayat” (re-implemented from ; see Supplement Sect. S1.14) follows exactly the same procedure. In fact, the method is an independent re-implementation of the Gantayat approach.
“Gantayat-v2” (adapted from ; Supplement Sect. S1.6) closely follows the original approach by . Instead of solving Eq. () for elevation bands, however, the equation is solved for discrete points along manually digitized branch lines. Interpolation between various branch lines is then used to obtain an ice thickness distribution. Note that none of the approaches based on the ideas by account for mass conservation.
“Rabatel” (Rabatel et al., unpublished; see Supplement Sect. S1.16) is based on the knowledge of surface velocities as well but includes some elements of the mass conserving approaches. Basically, the ice thickness along individual glacier cross sections is calculated by assuming that and by determining the ice volume flux for a given cross section from an estimate of the mass flux from the upstream area. Combining this information allows for the area of a given cross section to be computed, and the spatial distribution of along the cross section is used to determine the local ice thickness. The final ice thickness distribution is obtained by interpolation of various cross sections.
Other approaches
This last category includes two additional approaches that cannot be classified in any of the categories above:
“GCneuralnet” (; see Supplement Sect. S1.8) is based on artificial neural networks (ANNs) and thus neglects any kind of glacier physics. The basic assumption is that the bedrock topography underneath glacierized areas closely resembles nearby ice-free landscapes. In principle, the method uses an elevation-dependent azimuthal stencil to “paste” ice-free landscape sections into glacierized parts of a given region.
“Brinkerhoff” (; see Supplement Sect. S1.1) poses the problem of finding bedrock elevations in the context of Bayesian inference. The main hypothesis is that both bed elevations and ice flux divergence can be modelled as Gaussian random fields with assumed covariance but unknown mean. Depth-averaged velocities are found by solving the continuity equation (Eq. ) and by prescribing normally distributed likelihoods with known covariance around the available velocity, SMB, and data. A Metropolis–Hastings algorithm is then used to generate samples from the posterior distribution of bed elevations.
Results and discussion
In total, 189 different solutions were submitted to ITMIX (Table ). Three models (Farinotti, Huss, Linsbauer) were able to handle all 21 test cases, one model handled 20 cases (Machguth), and one model handled 19 cases (Maussion). Data availability was the main factor hindering the consideration of additional test cases. This is particularly true for the approaches (a) Brinkerhoff, Brinkerhoff-v2, Morlighem, and VanPeltLeclercq, requiring SMB at least; (b) Gantayat, Gantayat-v2, and RAAJgantayat, requiring surface velocity fields; (c) Fuerst, requiring SMB, , and velocity fields simultaneously; and (d) GCneuralnet, requiring surrounding ice-free terrain for algorithm training. For the approaches GCbedstress, RAAJglabtop2, and Rabatel, the time required for model set-up was a deterrent for considering additional test cases.
Between-model comparison
Overview of the range of solutions provided by the ensemble of models. The example refers to the test case “Unteraar”. The first four panels show composites for the (a) average, (b) spread, (c) minimal, and (d) maximal ice thickness distribution of the 15 submitted solutions. The model providing the minimal and maximal ice thickness for a given location is depicted in panels (e) and (f). Models that did not consider the specific test case are greyed out on the bottom right legend.
[Figure omitted. See PDF]
Share of “extreme results” provided by individual models. An extreme result is defined as either the minimum (MIN) or maximum (MAX) ice thickness occurring in the ensemble of solutions provided for a given test case. The share is based on test case area and assigns equal weights to all cases (a 10 % “fraction of MAX solutions provided” indicates, for example, that on average, the model generated the maximal ice thickness for 10 % of the area of any considered case). The number of test cases considered by individual models is given. Models are sorted according to the categories introduced in Sect. .
[Figure omitted. See PDF]
Locally, the solutions provided by the different models can differ considerably. As an example, Fig. provides an overview of the solutions generated for the test case “Unteraar” (the real-world case considered by the largest number of models). The large differences between the solutions are particularly evident when comparing the average composite ice thickness (i.e. the distribution obtained when averaging all solutions grid cell by grid cell; Fig. a) with the local ensemble spread (i.e. the spread between all solutions at a given grid cell; Fig. b). Often, the local spread is larger than the local average. This observation holds true for most of the other test cases as well (not shown).
Figure c and d provide insights into the composition of the ensemble spread by presenting the composites of the minimum and maximum provided thicknesses respectively. The models providing the most extreme solutions are depicted in Fig. e and f. In the Unteraar example, the approaches GCneuralnet and Fuerst tend to provide the smallest and largest local ice thickness of the ensemble respectively. For the specific case, closer inspection shows that the very low ice thicknesses estimated by GCneuralnet are associated with the debris-covered parts of the glacier and with the steep slopes delimiting these parts in particular. This is an artefact introduced by the specific set-up of the stencil used within the ANN. In fact, found that including steep ice in the definition of valley walls can be advantageous for ANN training. An unforeseen consequence is that steep ice walls close to debris-covered glacier ice are interpreted as valley walls as well, thus causing the surrounding ice thickness to be too thin. Flagging debris-covered glacier parts and treating them as a special case could be an option for alleviating this issue. For Fuerst, large ice thicknesses (locally exceeding 900 m) mostly occur in the accumulation area. This is the area for which no measured ice flow velocities were available, thus precluding precise model constraint. Uncertainties in this area are propagated downstream thus perturbing the inferred ice thickness even in areas with velocity information. For the particular test case, the approach also provides the minimal ice thickness for large areas, indicating that important oscillations are present in the estimated ice thickness field.
The overall tendency for individual models to provide “extreme” solutions is shown in Fig. . Two models (Rabatel and GCbedstress) seem to be particularly prone to predict large ice thicknesses, providing the largest ice thickness of the ensemble for 33 and 25 % of the area they considered. Although for Rabatel the basis of the statement is weak (only one test case considered), possible explanations lie in (a) the possible overestimation of the area contributing to the ice volume flux of individual profiles and (b) the assumed relation between depth-averaged and surface flow velocity (see Supplement Sect. S1.16). For GCbedstress the possible reasons are less clear. The no-sliding assumption included by the model (see Supplement Sect. S1.7) – which causes systematically higher thickness estimates than if sliding is assumed – could be a reason. The model, however, seems not to be particularly sensitive to it: assuming that half of the surface velocity is due to sliding decreases the mean estimated thickness by 13 % only (not shown).
Very small ice thicknesses are often predicted by the models Maussion and GCneuralnet. The two models provided the smallest ice thickness of the ensemble in 30 and 23 % of the considered area respectively. For Maussion, the result is mainly driven by the ice thickness predicted for ice caps (Academy, Austfonna, Devon) and large glaciers (Columbia, Elbrus). This is likely related to the applied calibration procedure (see Supplement Sect. S1.12), which is based on data included in GlaThiDa v1. The observations in that data set, in fact, mostly refer to smaller glaciers . For GCneuralnet, it can be noted that the smallest ice thicknesses are often predicted along the glacier centre lines (not shown). Besides the previously discussed issue related to steep ice in proximity of e.g. medial moraines, the ad hoc solution adopted to allow the ANN stencil to be trained (see Supplement Sect. S1.8) might be an additional cause.
Although the above observations provide insights into the general behaviour of individual models, it should be noted that a tendency of providing extreme results is not necessarily an indicator of poor model performance. Actual model performance, in fact, can only be assessed through comparison against direct observations (see next section).
Comparison to ice thickness measurements
The solutions submitted by individual models are compared to ice thickness measurements in Figs. and . For every glacier, the figures show one selected profile along and one across the main ice flow direction. The previously noted large spread between individual solutions re-emerges, as well as the tendency of individual models to produce rather large oscillations. The spread is particularly pronounced for ice caps (Academy, Austfonna, Devon) and for across-flow profiles (Fig. ).
It is interesting to note that the spread between models is not reduced when individual model categories are considered separately (see also Fig. S3). We interpret this as an indication that even models based on the same conceptual principles can be regarded as independent. Whilst this is not surprising for the minimization approaches since they are based on very different forward models (see Sect. ), or for the mass conserving approaches since they differ significantly in terms of implementation (Sect. ), the observation is rather unexpected for the shear-stress-based and velocity-based approaches (Sects. and respectively). The latter two categories, in fact, both rely on very similar concepts. Figure reveals that for shear-stress-based approaches the differences are particularly prominent for ice caps, in the vicinity of ice divides in particular. This seems to be related to the way individual models (a) subdivide individual ice caps, (b) treat the resulting boundaries, and (c) handle very small surface slopes. Also for the participating velocity-based approaches, apart from Rabatel all rely on the ideas of , and it seems that the implementation differences of conceptually similar approaches (see Gantayat and Gantayat-v2) are sufficient for considering the models as independent.
The above consideration is relevant when interpreting the average
solution of the model ensemble (thick green line in
Figs. and ): this average solution
matches the direct measurements relatively well for most glaciers, with
an average deviation below 10 % in 17 out of 21 cases. This
increase in prediction accuracy is expected for an unbiased model
ensemble. For a set of independent random realizations of the same
variable, in fact, Poisson's law of large numbers predicts the
average result to converge to the expected value (the “true bedrock”
in this case) with increasing number of realizations. The so-inferred
unbiasedness of the ensemble has an important consequence, as it
suggests that future estimates could be significantly improved when
relying on such model ensembles. Model weighting – such as used in
numerical weather prediction for example
The positive effect of averaging the results of individual models is best seen in Fig. . On average over the individual model solutions, the difference between modelled and measured ice thickness is % (1 estimate) of the mean glacier thickness (first box plot in the ALL group). This value reduces to % when the average composite solution is considered and is close to the value obtained when selecting the best single solution for every test case individually (third and second box plots of the group respectively).
Two notable exceptions in the above considerations are given by the test cases Unteraar and Tasman, for which the ensembles of solutions (15 and 11 solutions provided respectively) converge to a significantly smaller ice thickness than observed (median deviations of and respectively). Two common features that might partially explain the observation are (a) the significant debris cover of the two glaciers, which might bury ice thicker than what would be expected from the present-day SMB fields, and (b) the branched nature of the glaciers, which might be insufficiently captured by the models. Both hypotheses, however, are difficult to test further, as the remaining cases show very different morphological characteristics. An erroneous interpretation of the actual ice thickness measurements, in contrast, seems unlikely. This is particularly true for Unteraar, for which the reported quality of original radio-echo soundings is high and independent verifications through borehole measurements were performed .
“Urumqi” and “Washmawapta”, for which eight and six individual solutions
were provided, respectively, are the other two cases for which the
average ice thickness composite differs largely from the observations
(median deviations of 71 and 125 % respectively;
Figs. , , and ; recall
that because the “true” ice thickness is not known everywhere,
deviations are expressed in terms of mean thickness of the
average composite). For Washmawapta – a cirque glacier mostly fed
by steep ice-free headwalls – it is interesting
to note that the Farinotti approach is the only one predicting ice
thickness in the observed range. This suggests that the concept of
“ice flow catchments”, which is used in the approach for
accommodating areas outside the glacier margin that contribute to snow
accumulation
The comparison between Figs. and also suggests that, in general, the ice thickness distribution along flow is better captured than the distribution across flow. This is likely due to the combination of the fact that most participating approaches include considerations about mass conservation and that virtually all models include surface slope as a predictor for the local ice thickness. Indeed, these two factors have a stronger control on the along-flow ice thickness distribution than they have across flow.
The results also indicate that, compared to real-world cases, the ice thickness distribution of the three synthetic cases is better reproduced. On average over individual solutions, the difference to the correct ice thickness is % (Fig. ). This difference reduces to % for the average composites, i.e. to a 1 spread reduced by a factor of 2. Again, two factors provide the most likely explanation. On the one hand, the model used for generating the synthetic cases is built upon the same theoretical knowledge as the models used for generating the ice thickness estimates. On the other hand, and more importantly, the input data from which the ice thickness distribution is inferred are known without any uncertainty in the synthetic cases. The latter is in contrast to the data available for the real-world cases: whilst the provided DEMs, fields, and outlines can be considered of good quality, SMB fields are often the product of the inter- and extrapolation of sparse in situ measurements. The inconsistencies that may arise between and SMB, together with the previously mentioned discontinuities in the available velocity fields (see Sect. ), are obviously problematic for methods that use this information. Two additional observations that might be related to the better model performance in the synthetic cases are (1) that the no-sliding assumption adopted in most models was adequate for the considered synthetic cases, but does not hold true in the real-world ones, and (2) that synthetic glacier geometries were close to steady state. Testing the importance of the second consideration is not possible with the data at hand and would require the generation of transient synthetic geometries.
In relative terms, the average composite solutions seem to better predict (smaller interquartile range, IQR) the ice thickness distribution of ice caps than that of glaciers. In fact, the 1 deviations from the measurements for ice caps and glaciers are of % and % respectively (Fig. ). This might be surprising at first, but Fig. illustrates that for all three considered ice caps, the average composites are the results of a relatively small set (six or seven) of largely differing solutions. This issue is particularly evident for the ice cap interiors, for which two model clusters emerge, predicting extremely high and extremely low ice thicknesses respectively. The relatively small IQR of the ensemble mean, thus, appears to be rather fortuitous and calls for additional work in this domain. Note, moreover, that the relative accuracy is expressed in relation to the mean ice thickness. In absolute terms, the abovementioned values translate into average deviations on the order of m for ice caps and m for glaciers. Obviously, these values are strongly affected by the particular test cases included in the intercomparison and should not be expected to hold true in general.
To put the average model performance into context, the results are compared to a benchmark model based on volume-area scaling (last box plot in Fig. ). The “model” neglects spatial variations in thickness altogether and simply assigns the mean ice thickness predicted by a scaling relation to the whole glacier. For the scaling relation, we use the form , where (m) and (km) are the mean ice thickness and the area of the glacier respectively. The parameters and are set to and for glaciers and to and for ice caps . The values of parameter have a strong theoretical foundation , whilst is a free parameter. Since the relation between and is linear, it must be noted that as long as the distribution of is symmetric and as long as the value chosen for corresponds to the mean of that distribution, the results of the above relation correspond to the maximum likelihood estimator for the mean of the distribution of . In other words, randomly sampling different values for would increase the spread of our estimates but not its mean.
This simple model deviates from the measured ice thickness by %, which is a spread (bias) more than twice (4 times) as large as estimated for the average composites of the model ensemble ( %). This result is reassuring as it suggests that the individual models have actual skill in estimating both the relative ice thickness distribution and the total glacier volume of individual glaciers. The negative sign of the bias – which is consistent with results obtained from a comprehensive data set in Norway – should not be overinterpreted, since a different choice for could be used to alter it. It has again to be noted, however, that this would not reduce the spread in the results and that for real-world applications the value of is unknown. In general, a site-specific calibration of would be required.
Comparison between estimated and measured bedrock topographies. For every test case, a longitudinal profile showing the glacier surface (thick black line), the bedrock solution of individual models (coloured lines), the average composite solution (thick green line), and the available GPR measurements (black-encircled red dots) are given. The coloured squares on the upper left of the panels indicate which models provided solutions for the considered test case (see legend on the right margin for colour key). The location of the profiles are shown on the small map on the bottom left of the panels (red), and the beginning of the profile (blue dot) is to the left. Available ice thickness measurements are shown in grey.
[Figure omitted. See PDF]
Same as Fig. , for a series of cross-sectional profiles.
[Figure omitted. See PDF]
Effect of merging individual model solutions. For every test case, the distribution of the deviations between modelled and measured ice thicknesses is shown for the case in which (i) the individual point-to-point comparisons of all available solutions are pooled (grey box plots), (ii) only the provided single best solution is considered (blue box plots), and (iii) the deviations are computed from the average composite thickness of all model solutions (green box plots). Deviations are expressed relative to the mean ice thickness. The best single solution is computed by summing the ranks for the (a) average deviation, (b) median deviation, (c) interquartile range, and (d) 95 % confidence interval. The distributions of the deviations when grouping glaciers, ice caps, and synthetic glaciers separately are additionally shown, as are the results when grouping all test cases together (ALL). When forming the groups, point-to-point deviations for every test case are resampled so that every test case has the same weight. The last box plot to the right refers to the case in which the mean ice thickness is predicted by volume–area scaling (see Sect. ). The upper part of the panel provides the number of considered model solutions and the model providing the single best solution (see Fig. for colour key; the number refers to the model category according to Sect. ). Box plots show minimum and maximum values (crosses), the 95 % confidence interval (whiskers), the interquartile range (box), and the median (lines within box).
[Figure omitted. See PDF]
Individual model performance
The considerations in the previous section refer mainly to the average composite ice thickness provided by the ensemble of models. Running a model ensemble, however, can be very impractical. This opens the question on whether individual models can be recommended for particular settings, or whether a single best model can be identified.
To address this question, we propose two separate rankings. Both are based on the (I) average, (II) median, (III) interquartile range, and (IV) 95 % confidence interval (95 % CI) of the distribution of the deviations between modelled and measured ice thicknesses (Fig. ).
The first ranking considers the individual test cases separately. All models considering a particular test case are first ranked separately for the four indicators (I–IV). When a model does not include a particular test case, no ranks are assigned. For every model, the four indicators are then averaged individually over all test cases. The final rank is computed by computing the mean of these average ranks (Table ). The ranking rewards models with a consistently high performance over a large number of test cases.
The second ranking is only based on the average model performance. In this case, ranks for the above indicators (I–IV) are assigned to the ensemble of point-to-point deviations of the various models (last row of box plots in Fig. ; same weight between test cases ensured). The ranks for the four individual indicators are then averaged to obtain the overall rank (Table ). In contrast to the first option, this ranking does not consider the test cases individually and does not account for the number of considered test cases. A model considering only one test case but performing perfectly on it, for example, would score highest.
The ranking result of every model on a case-by-case basis is given in Table S3. The distributions of the deviations between modelled and measured ice thicknesses for every model and considered test case are given in Figs. and S2. Similarly as noted during the discussion of the last section (Sect. ), the rankings do not suggest a performance advantage in any of the five model categories introduced in Sect. .
Combined over the two rankings, the model Brinkerhoff-v2 scores highest (third and first rank respectively). The good score is mainly driven by the comparatively small model spread (IQR and 95 % CI) and bias (Table ). The small model bias ( % average deviation), however, arises from a partial compensation between positive bias for glaciers ( %) and negative bias for the synthetic cases ( %) (Table ). Unfortunately, the model did not consider any ice cap, thus hampering any statement on model performance in this particular setting. Ice caps were not considered mainly because of the absence of the necessary data.
Apart from the model Brinkerhoff-v2, the first positions in the first ranking are occupied by models that consider a large number of test cases (Table ). The model by Maussion is rated highest. Similar to Brinkerhoff-v2, the good result is driven by the small IQRs and 95 % CIs, in particular for glaciers and ice caps. In the second ranking, the model is severely penalized (11th rank) for its large bias ( % on average; Table ). The bias is particularly prominent in the case of ice caps and the synthetic cases ( and % average deviation respectively) and may be related to the fact that the Maussion model was developed and calibrated by using data from valley glaciers only. For the synthetic cases in particular, the calibration with real-world glaciers (i.e. cases that include sliding) seems to be a likely explanation for a systematic underestimation of the ice thickness. This, however, appears to be only a partial explanation, as such a negative bias is apparent for most approaches, i.e. also for approaches that explicitly assumed no sliding (e.g. GCbedstress, Morlighem; see Supplement Sect. S1).
In general, the model bias can be interpreted as an indicator for the performance of the models in reproducing the total glacier ice volume. The latter is not discussed explicitly as the computation of a “measured volume” would need the available measurements to be interpolated over large distances. Seven of the considered models show a bias of less than 8 % (Table ). An interesting case in this respect is given by the model by , which yields small biases ( and %) for both considered implementations (Gantayat and RAAJgantayat respectively). The relatively low overall ranks assigned to these models (ranks 10 and 14 in the first ranking, ranks 3 and 12 in the second respectively) are an expression of the relatively small number of considered test cases (first ranking) and the relatively large model spread (second ranking). Of interest is also the observation that the version of the model considering multiple flow lines (Gantayat-v2) yields a significantly higher bias ( % on average) than the approach based on elevation bands, despite a moderate decrease in model spread. The increase is particularly visible for real-world glaciers, for which the bias changes from to %. This might hint at the difficulty in correctly subdividing a given glaciers into individual flow lines and could be an indication that the rather mechanistic procedure used in this case (see Supplement Sect. S1.6) is insufficient for achieving a sensible subdivision.
The difficulty in correctly interpreting the overall model bias is well illustrated in the case of the Linsbauer model: the model yields the smallest bias over the entire set of considered test cases ( % on average) but is the result of a compensation between (a) a moderate negative bias for glaciers and the synthetic test cases (both %) and (b) a large positive bias for ice caps ( %).
Together with Brinkerhoff-v2, the model Farinotti is the second one included in the first five places of both rankings (ranks 4 and 5 respectively; Tables and ). The relatively high ranking is due to a combination of comparatively high model performance (small bias and spread) and large number of considered test cases. The consideration of all test cases, however, should not be interpreted as capability of handling large samples of glaciers in this case. The application of the model, in fact, requires a significant amount of manual input (see Sect. ). This is in contrast to the fully automated methods of Maussion, Huss, and Machguth. In this respect it is interesting to note that the model by Huss slips from the second rank in the first ranking to the eighth in the second one. The relatively low score in the second ranking is mainly an expression of the comparatively large confidence intervals (Table ). Combined over the two rankings, however, the model can be considered as the best amongst the fully automated approaches.
The model GCbedstress (fifth and sixth in the two rankings) ranks highest when only ice caps are considered. The average deviation of % indeed suggests a very high model performance. However, it has to be noted that the result is based on one test case only (Table ). For the models considering more ice caps, the results are heterogeneous and difficult to interpret, as models showing small IQRs show large bias, and vice versa (Table ).
The model GCneuralnet is found at the other end of the ranking (penultimate and last ranks respectively). The average deviation of % highlights both the large bias and large spread of the estimates. Obviously, the performance of approaches based on ANNs are highly dependent on the data set used for algorithm training. The large deviations might therefore be an expression of the issues encountered with the provided DEMs (see Supplement Sect. S1.8) rather than an indication of generally low model performance. As already noted, however, the general absence of ice-free analogues for ice caps or crater glaciers makes the approach unsuitable for this kind of morphologies.
An interesting result emerges when considering the IQRs and 95 % CIs in the synthetic test cases (see Table ): approaches that include SMB, or velocity information in addition to the glacier outline and the DEM of the surface (e.g. approaches by Brinkerhoff, Fuerst, Morlighem, or VanPeltLeclerq) yield the smallest spreads around the average deviation (IQR 22 % for all mentioned models). This is in marked contrast to the real-world cases, in which the IQRs are about 5 times larger (average IQR for the same models 105 %), and similar considerations apply when analysing the 95 % CIs. As already noted, this is most likely linked to the differences in data quality. Whilst the input data for the synthetic cases are perfectly known, the data available for the real-world cases were necessarily retrieved from various independent data sources (see Table ). This often led to problems in the mutual consistency of the surface fields and caused particular difficulties to those models requiring all of the information. The capability of accounting for observational uncertainties, as in the Brinkerhoff approach for example (see Supplement Sect. S1.1), hence seems to be an important prerequisite when handling real-world cases. Similarly, having access to reliable uncertainty estimates for any particular data set would be important. Emphasis has to be put in this domain if significant advances are to be achieved.
Ranking of individual models based on case-by-case performance. Numbers in front of model names refer to the categories introduced in Sect. . Displayed values are mean ranks for the average (avg), median (med), interquartile range (IQR), and 95 % confidence interval (95 %) of the deviations from ice thickness measurements. For every group (glaciers, ice caps, and synthetic cases), the average of the above three ranks is given (AVG). Values for ALL correspond to the average of the three groups. The three best average results for every group are given in bold. is the number of considered test cases per model.
Glaciers only | Ice caps only | Synthetic only | ALL | ||||||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Model – version | avg | med | IQR | 95 % | AVG | avg | med | IQR | 95 % | AVG | avg | med | IQR | 95 % | AVG | avg | med | IQR | 95 % | AVG | |
2 Maussion | 19 | 4.6 | 4.3 | 2.9 | 3.0 | 3.7 | 5.3 | 5.3 | 1.3 | 1.0 | 3.2 | 13.0 | 11.0 | 12.5 | 10.5 | 11.8 | 5.6 | 5.2 | 3.7 | 3.5 | 4.5 |
2 Huss | 21 | 4.5 | 4.3 | 4.3 | 3.5 | 4.1 | 2.0 | 2.0 | 4.3 | 5.0 | 3.3 | 4.7 | 3.3 | 12.0 | 12.7 | 8.2 | 4.1 | 3.8 | 5.4 | 5.0 | 4.6 |
1 Brinkerhoff-v2 | 10 | 3.6 | 3.6 | 4.6 | 5.3 | 4.2 | – | – | – | – | – | 7.7 | 6.7 | 3.3 | 4.3 | 5.5 | 4.8 | 4.5 | 4.2 | 5.0 | 4.6 |
2 Farinotti | 21 | 3.7 | 3.9 | 5.5 | 5.6 | 4.7 | 2.7 | 3.3 | 2.7 | 3.0 | 2.9 | 5.3 | 6.0 | 9.0 | 10.7 | 7.8 | 3.8 | 4.1 | 5.6 | 6.0 | 4.9 |
2 GCbedstress | 15 | 4.6 | 4.8 | 5.7 | 5.5 | 5.2 | 2.0 | 1.0 | 1.0 | 5.0 | 2.2 | 4.0 | 5.3 | 8.3 | 7.3 | 6.2 | 4.3 | 4.7 | 5.9 | 5.9 | 5.2 |
1 VanPeltLeclercq | 10 | 4.1 | 4.3 | 7.1 | 7.9 | 5.9 | – | – | – | – | – | 4.3 | 4.0 | 5.3 | 4.7 | 4.6 | 4.2 | 4.2 | 6.6 | 6.9 | 5.5 |
3 Linsbauer | 21 | 5.4 | 5.3 | 3.5 | 3.9 | 4.5 | 6.7 | 6.7 | 6.7 | 6.7 | 6.7 | 11.0 | 8.3 | 9.0 | 8.0 | 9.1 | 6.4 | 6.0 | 4.8 | 4.9 | 5.5 |
3 Machguth | 20 | 5.9 | 5.6 | 4.1 | 3.7 | 4.8 | 4.5 | 4.5 | 5.0 | 4.5 | 4.6 | 8.7 | 10.7 | 10.0 | 10.3 | 9.9 | 6.2 | 6.2 | 5.0 | 4.8 | 5.5 |
5 Brinkerhoff | 3 | – | – | – | – | – | – | – | – | – | – | 9.7 | 9.3 | 2.3 | 2.0 | 5.8 | 9.7 | 9.3 | 2.3 | 2.0 | 5.8 |
4 RAAJgantayat | 5 | 4.8 | 4.8 | 7.8 | 9.8 | 6.8 | – | – | – | – | – | 4.0 | 3.0 | 6.0 | 5.0 | 4.5 | 4.6 | 4.4 | 7.4 | 8.8 | 6.3 |
3 RAAJglabtop2 | 4 | 7.0 | 6.7 | 6.7 | 6.3 | 6.7 | – | – | – | – | – | 1.0 | 3.0 | 9.0 | 8.0 | 5.2 | 5.5 | 5.8 | 7.2 | 6.8 | 6.3 |
1 Fuerst | 5 | 13.0 | 14.0 | 15.0 | 15.0 | 14.2 | 4.0 | 6.0 | 2.0 | 2.0 | 3.5 | 7.7 | 8.0 | 1.3 | 3.3 | 5.1 | 8.0 | 8.8 | 4.2 | 5.4 | 6.6 |
2 Morlighem | 10 | 6.7 | 6.7 | 6.0 | 4.9 | 6.1 | – | – | – | – | – | 12.0 | 11.3 | 4.3 | 3.7 | 7.8 | 8.3 | 8.1 | 5.5 | 4.5 | 6.6 |
4 Gantayat | 7 | 4.0 | 4.0 | 11.0 | 10.0 | 7.2 | 3.5 | 2.0 | 5.5 | 4.0 | 3.8 | 8.0 | 7.7 | 8.7 | 9.0 | 8.3 | 5.6 | 5.0 | 8.4 | 7.9 | 6.7 |
4 Gantayat-v2 | 7 | 7.0 | 8.0 | 3.0 | 6.5 | 6.1 | 2.5 | 2.5 | 4.0 | 3.0 | 3.0 | 11.0 | 10.0 | 9.3 | 8.7 | 9.8 | 7.4 | 7.3 | 6.0 | 6.4 | 6.8 |
5 GCneuralnet | 10 | 7.1 | 7.9 | 7.3 | 6.9 | 7.3 | – | – | – | – | – | 9.7 | 14.0 | 14.0 | 14.7 | 13.1 | 7.9 | 9.7 | 9.3 | 9.2 | 9.0 |
4 Rabatel | 1 | – | – | – | – | – | – | – | – | – | – | 5.0 | 5.0 | 16.0 | 15.0 | 10.2 | 5.0 | 5.0 | 16.0 | 15.0 | 10.2 |
Ranking of individual models based on average model performance. Numbers in front of model names refer to the categories introduced in Sect. . Displayed values are the average (avg), median (med), interquartile range (IQR), and 95 % confidence interval (95 %) of the percental deviations from ice thickness measurements. The three best results for every column is given in bold. is the number of considered test cases per model. AVG is the average of the ranks assigned to the values in the three ALL columns.
Glaciers only | Ice caps only | Synthetic only | ALL | |||||||||||||||
---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
Model – version | avg | med | IQR | 95 % | avg | med | IQR | 95 % | avg | med | IQR | 95 % | avg | med | IQR | 95 % | AVG | |
1 Brinkerhoff-v2 | 10 | – | – | – | – | 3.8 | ||||||||||||
5 Brinkerhoff | 3 | – | – | – | – | – | – | – | – | 5.8 | ||||||||
4 Gantayat | 7 | 6.0 | ||||||||||||||||
1 VanPeltLeclercq | 10 | – | – | – | – | 6.0 | ||||||||||||
2 Farinotti | 21 | 6.5 | ||||||||||||||||
2 GCbedstress | 15 | 6.5 | ||||||||||||||||
3 Linsbauer | 21 | 8.0 | ||||||||||||||||
2 Huss | 21 | 8.5 | ||||||||||||||||
4 Gantayat-v2 | 7 | 8.8 | ||||||||||||||||
1 Fuerst | 5 | 9.0 | ||||||||||||||||
2 Maussion | 19 | 10.5 | ||||||||||||||||
4 RAAJgantayat | 5 | – | – | – | – | 10.8 | ||||||||||||
3 Machguth | 20 | 11.0 | ||||||||||||||||
4 Rabatel | 1 | – | – | – | – | – | – | – | – | 11.5 | ||||||||
3 RAAJglabtop2 | 4 | – | – | – | – | 12.2 | ||||||||||||
2 Morlighem | 10 | – | – | – | – | 12.5 | ||||||||||||
5 GCneuralnet | 10 | – | – | – | – | 15.8 |
Difference between estimated and measured ice thicknesses. For every test case (rows) and every model (columns; ordered according to the categories defined in Sect. ), the distribution of the point-by-point deviations between estimated and measured ice thicknesses is shown. Differences are expressed relative to the mean ice thickness (e.g. a 100 % deviation means that the modelled ice thickness deviates from the measured one by one mean ice thickness). Circles indicate that no solution was submitted. Box plots show the 95 % confidence interval (whiskers), the interquartile range (box), and the median (lines within box). The box plots are squared to facilitate the comparison within models and within test cases. Note the scale bars in the bottom right corner.
[Figure omitted. See PDF]
Conclusions
ITMIX was the first coordinated intercomparison of approaches that estimate the ice thickness of glacier and ice caps from surface characteristics. The goal was to assess model performance for cases in which no a priori information about ice thickness is available. The experiment included 15 glaciers and 3 ice caps spread across a range of different climatic regions, as well as 3 synthetically generated test cases.
ITMIX attracted 13 research groups with 17 different models that can be classified into (1) minimization approaches, (2) mass conserving approaches, (3) shear-stress-based approaches, (4) velocity-based approaches, and (5) other approaches outside of the previous categories. The 189 solutions submitted in total provided insights into the performance of the various models and the accuracies that can be expected from their application.
The submitted results highlighted the large deviations between individual solutions and even between solutions of the same model category. The local spread often exceeded the local ice thickness. Caution is thus required when interpreting the results of individual models, especially if they are applied to individual sites. Substantial improvements in terms of accuracy, however, could be achieved when combining the results of different models. Locally, the mean deviation between an average composite solution and the measured ice thickness was on the order of % of the mean ice thickness (1 estimate). This hints at the random nature of individual model errors, and suggests that ensembles of models could help in improving the estimates. For applications at the large scale – such as the estimation of the ice thickness distribution of an entire mountain range and beyond – reducing the uncertainties through such a strategy will be challenging, as only few models are currently capable of operating at the regional or global scale.
Although no clear pattern emerged for the performance of individual model categories, the intercomparison allowed statements about the performance of individual models. The model Brinkerhoff-v2 was detected as the best single model, with average deviations for real-world glaciers on the order of %. Some caution has to be expressed, however, since the model considered only about half of the provided test cases and was not applied to any ice cap. The model Huss scored highest amongst the automated methods capable of handling large sets of glaciers. With average deviations of %, the approach ranged mid-way when considering point-to-point deviations from measurements. For ice caps, the model GCbedstress showed very promising results (average deviations of %), although generalizing this observation would be speculative, as the approach considered only one test case. For ice caps, particularly large differences between individual models were detected in the proximity of ice divides. This calls for improvements in how models treat these regions.
Somewhat surprisingly, models that include SMB, , or surface flow velocity fields in addition to the glacier outline and DEM did not perform better when compared to approaches requiring less data, in particular for real-world cases. Inconsistencies between available data sets – which are often acquired with very different techniques, spatial footprints, and temporal resolutions – appeared to be the most likely cause. Although it must be noted that the set of considered synthetic cases was generated upon the same theoretical knowledge as the approaches used for ice thickness inversion, the generally better model performance for these cases supports the previous hypothesis. In the synthetic cases, in fact, input data were known precisely, i.e. without observational errors. This highlights the importance for mutually consistent data sets and suggests that improved observational capabilities could help to improve the performance of the next generation of ice thickness estimation methods. Similarly, improving the model's capability of taking into account uncertainties in the input data should be considered a priority.
Besides improved data concerning glacier surface characteristics, a key for developing a new generation of ice thickness estimation models will be the data base against which the models can be calibrated and validated. The data utilized within ITMIX are available as a supplement to this paper (see link at the end of this section), but a much larger effort is ongoing in collaboration with the World Glacier Monitoring Service. With the initiation of the Glacier Thickness Database , the first steps towards a freely accessible, global database of ice thickness measurements have been undertaken. We anticipate that this effort, together with a second phase of ITMIX targeting at how to best integrate sparse thickness measurements to improve model performance, will foster the development of improved ice thickness estimation approaches.
To summarize, in order to improve available thickness estimates for glacier and ice caps, we make the following recommendations:
Ensemble methods comprising a variety of independent, physically based approaches should be considered. This is likely to be a more effective strategy than focusing on one individual approach.
Models should be extended to take observational uncertainty into account. The Bayesian framework used by , for example, showed promising results in this respect.
The increasing availability of surface ice flow velocity data
e.g. should be exploited. In this context, the previously mentioned necessity of accounting for observational uncertainty is crucial.The way individual models treat ice divides has to be improved. This is important when addressing ice caps and glacier complexes and to ensure consistency between subsurface topographies of adjacent ice masses.
Efforts for centralizing available ice thickness measurements should be strengthened. Initiatives such as the GlaThiDa database launched by the World Glacier Monitoring Service are essential for new generations of ice thickness models to be developed and validated.
All data used within ITMIX, as well as the individual solutions
submitted by the participating models, can be found at
The Supplement related to this article is available online at
D. Farinotti and H. Li designed ITMIX. D. Farinotti coordinated the experiment, performed the model evaluations, prepared the figures, and wrote the paper with contributions by D. J. Brinkerhoff, G. K. C. Clarke, J. F. Fürst, P. Gantayat, F. Gillet-Chaulet, M. Huss, P. W. Leclercq, A. Linsbauer, H. Machguth, F. Maussion, M. Morlighem, A. Rabatel, R. Ramsankaran, O. Sanchez, W. J. J. van Pelt. D. J. Brinkerhoff, G. K. C. Clarke, J. J. Fürst, H. Frey, P. Gantayat, F. Gillet-Chaulet, C. Girard, M. Huss, P. W. Leclercq, A. Linsbauer, H. Machguth, C. Martin, F. Maussion, M. Morlighem, C. Mosbeux, A. Pandit, A. Portmann, A. Rabatel, R. Ramsankaran, O. Sanchez, S. S. Kumari, and W. J. J. van Pelt participated in the experiment and provided modelling results. B. Anderson, L. M. Andreassen, T. Benham, D. Binder, J. A. Dowdeswell, D. Farinotti, A. Fischer, K. Helfricht, S. Kutuzov, I. Lavrentiev, H. Li, R. McNabb, and P. A. Stentoft provided the data necessary for the real-world test cases. C. Martin and D. Farinotti generated the synthetic test cases. M. Huss, L. M. Andreassen, and G. H. Gugmundsson provided advice during experimental set-up and evaluation.
The authors declare that they have no conflict of interest.
Acknowledgements
ITMIX was made possible by the International Association of Cryospheric Sciences (IACS). We are indebted to the European Space Agency's project Glaciers_cci as well as to Elisa Bjerre, Emiliano Cimoli, Gwenn Flowers, Marco Marcer, Geir Moholdt, Johnny Sanders, Marius Schäfer, Konrad Schindler, Tazio Strozzi, and Christoph Vogel for providing data necessary for the experiment. D. Farinotti acknowledges support from the Swiss National Science Foundation (SNSF). M. Morlighem and C. Girard were funded by the National Aeronautics and Space Administration, Cryospheric Sciences Program, grant NNX15AD55G. P. W. Leclercq was funded by the European Research Council under the European Union's Seventh Framework Programme (FP/2007–2013)/ERC grant agreement no. 320816. J. A. Dowdeswell acknowledges the UK Natural Environment Research Council for supporting the airborne radio-echo sounding surveys of the three Arctic ice caps included in the experiment (grants GR3/4663, GR3/9958, GR3/12469, and NE/K004999/1). A. Rabatel and O. Sanchez acknowledge the contribution of Labex OSUG@2020 (Investissements d'avenir – ANR10 LABX56) and the VIP_Mont-Blanc project (ANR-14-CE03-0006-03). F. Gillet-Chaulet and C. Mosbeux acknowledge support from French National Research Agency (ANR) under the SUMER (Blanc SIMI 6) 2012 project ANR-12BS06-0018. The constructive comments by Shin Sugiyama and an anonymous referee helped to improve the manuscript.Edited by: A. Vieli Reviewed by: S. Sugiyama and one anonymous referee
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
© 2017. This work is published under http://creativecommons.org/licenses/by/3.0/ (the “License”). Notwithstanding the ProQuest Terms and Conditions, you may use this content in accordance with the terms of the License.
Abstract
Knowledge of the ice thickness distribution of glaciers and ice caps is an important prerequisite for many glaciological and hydrological investigations. A wealth of approaches has recently been presented for inferring ice thickness from characteristics of the surface. With the Ice Thickness Models Intercomparison eXperiment (ITMIX) we performed the first coordinated assessment quantifying individual model performance. A set of 17 different models showed that individual ice thickness estimates can differ considerably – locally by a spread comparable to the observed thickness. Averaging the results of multiple models, however, significantly improved the results: on average over the 21 considered test cases, comparison against direct ice thickness measurements revealed deviations on the order of
You have requested "on-the-fly" machine translation of selected content from our databases. This functionality is provided solely for your convenience and is in no way intended to replace human translation. Show full disclaimer
Neither ProQuest nor its licensors make any representations or warranties with respect to the translations. The translations are automatically generated "AS IS" and "AS AVAILABLE" and are not retained in our systems. PROQUEST AND ITS LICENSORS SPECIFICALLY DISCLAIM ANY AND ALL EXPRESS OR IMPLIED WARRANTIES, INCLUDING WITHOUT LIMITATION, ANY WARRANTIES FOR AVAILABILITY, ACCURACY, TIMELINESS, COMPLETENESS, NON-INFRINGMENT, MERCHANTABILITY OR FITNESS FOR A PARTICULAR PURPOSE. Your use of the translations is subject to all use restrictions contained in your Electronic Products License Agreement and by using the translation functionality you agree to forgo any and all claims against ProQuest or its licensors for your use of the translation functionality and any output derived there from. Hide full disclaimer
Details













1 Laboratory of Hydraulics, Hydrology and Glaciology (VAW), ETH Zurich, Zurich, Switzerland; Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland
2 Geophysical Institute, University of Alaska Fairbanks, Fairbanks, AK, USA
3 Department of Earth, Ocean and Atmospheric Sciences, University of British Columbia, Vancouver, BC, Canada
4 Institute of Geography, Friedrich Alexander University of Erlangen-Nuremberg (FAU), Erlangen, Germany
5 Department of Geography, University of Zurich, Zurich, Switzerland
6 Divecha Centre for Climate Change, Indian Institute of Science, Bangalore, India
7 Institut des Géosciences de l'Environnement (IGE), Université Grenoble Alpes, CNRS, IRD, Grenoble, France
8 Department of Earth System Science, University of California Irvine, Irvine, CA, USA
9 Laboratory of Hydraulics, Hydrology and Glaciology (VAW), ETH Zurich, Zurich, Switzerland; Department of Geosciences, University of Fribourg, Fribourg, Switzerland
10 Department of Geosciences, University of Oslo, Oslo, Norway
11 Department of Geography, University of Zurich, Zurich, Switzerland; Department of Geosciences, University of Fribourg, Fribourg, Switzerland
12 British Antarctic Survey, Natural Environment Research Council, Cambridge, UK
13 Institute of Atmospheric and Cryospheric Sciences, University of Innsbruck, Innsbruck, Austria
14 Department of Civil Engineering, Indian Institute of Technology, Bombay, India
15 Swiss Federal Institute for Forest, Snow and Landscape Research (WSL), Birmensdorf, Switzerland
16 Institute for Marine and Atmospheric Research (IMAU), Utrecht University, Utrecht, the Netherlands
17 Arctic Technology Centre ARTEK, Technical University of Denmark, Kongens Lyngby, Denmark
18 Department of Earth Sciences, Uppsala University, Uppsala, Sweden
19 Antarctic Research Centre, Victoria University of Wellington, Wellington, New Zealand
20 Scott Polar Research Institute, University of Cambridge, Cambridge, UK
21 Central Institute for Meteorology and Geodynamics (ZAMG), Vienna, Austria
22 Institute for Interdisciplinary Mountain Research, Austrian Academy of Sciences, Innsbruck, Austria
23 Laboratory of Glaciology, Institute of Geography, Russian Academy of Science, Moscow, Russia
24 Geophysical Institute, University of Alaska Fairbanks, Fairbanks, AK, USA; Department of Geosciences, University of Oslo, Oslo, Norway
25 State Key Laboratory of Cryospheric Sciences, Tian Shan Glaciological Station, CAREERI, CAS, Lanzhou, China
26 Norwegian Water Resources and Energy Directorate (NVE), Oslo, Norway