Land subsidence caused by groundwater overpumping threatens the sustainable development in Beijing. Hazard assessments of land subsidence can provide early warning information to improve prevention measures. However, uncertainty and fuzziness are the major issues during hazard assessments of land subsidence. We propose a method that integrates fuzzy set theory and weighted Bayesian model (FWBM) to evaluate the hazard
probability of land subsidence measured by Interferometric Synthetic Aperture Radar (InSAR) technology. The model is structured as a directed
acyclic graph. The hazard probability distribution of each factor triggering land subsidence is determined using Bayes' theorem. Fuzzification of the factor significance reduces the ambiguity of the relationship between the factors and subsidence. The probability of land subsidence hazard under multiple factors is then calculated with the FWBM. The subsidence time series obtained by InSAR is used to infer the updated posterior probability. The upper and middle parts of the Chaobai River alluvial fan are taken as a case-study site, which locates the first large-scale emergency groundwater resource region in the Beijing plain. The results show that rates of groundwater level decrease more than 1 m yr

The continuous overpumping of groundwater results in dramatic piezometric drawdown and induces regional land subsidence. Many countries such as China, Mexico, Italy, USA, Spain, and Iran (Teatini et al., 2005; Tomás et al., 2010; Galloway and Burbey, 2011; Chaussard et al., 2014; Zhu et al., 2015; Motagh et al., 2017) have reported land subsidence due to groundwater pumping. Land subsidence is a complex process influenced by the anthropogenic activities and geological environment. The anthropogenic extraction of groundwater from aquifer is the principal triggering factor because the rapid decline in the groundwater level leads to the compaction of the aquitard, and consequently the land surface subsides (Xue et al., 2005; Zhu et al., 2015; Gao et al., 2018). Although the drops of groundwater level in aquifers lead to land subsidence, this process is also controlled by the geological environment, which includes hydrologic and geomechanical conditions (Zhu et al., 2015, 2017; Gambolati and Teatini, 2015). Terzaghi's effective stress principle shows that a decrease in the pore pressure leads to an increase in the effective stress, which consequently induces land subsidence. The value of land subsidence is related to the soil mechanical properties (Bonì et al., 2020). Land subsidence threatens the environment and cause economic losses, such as municipal infrastructure damage, building fracture, and increasing flood risk (Wu et al., 2017; Peduto et al., 2017; Wang et al., 2018). Assessments of the subsidence hazard are necessary for risk prevention.

Recent studies have analyzed the hazards of land subsidence to buildings using field investigation and Interferometric Synthetic Aperture Radar (InSAR) (Julio-Miranda et al., 2012; Tomás et al., 2012; Bhattarai and Kondoh, 2017; Peduto et al., 2017). Some studies assessed the regional subsidence hazard and identified the areas with high risk using spatial modeling methods based on geographic information system (GIS; Huang et al., 2012; Bhattarai and Kondoh, 2017) and multi-objective decision-making (Jiang et al., 2012; Yang et al., 2013) or advanced methods along with fuzzy set theory (Mohebbi Tafreshi et al., 2019). The latter require expert score, which is subjective, and the produced risk level map is also qualitative. Mohebbi Tafreshi et al. (2019) adopted fuzzy functions to standardize parameters with different dimensions, which did not address the fuzziness of parameter importance. Land subsidence is a geological problem depending on various uncertain natural variables. Hazard assessments are associated with an inherent degree of uncertainty, which includes aleatoric aspects due to randomness and epistemic aspects related to insufficient information (Kiureghiana and Ditlevsen, 2009). Aleatoric uncertainty may come from the randomness of natural variables and data quality (Matthies, 2007). Epistemic uncertainty may be generated by inadequate expert knowledge, the selection of evaluation factors, and their quantitative effects on a hazard (Vilares and Kording, 2011). The methods mentioned above do not fully consider these uncertainties.

To avoid these disadvantages, some researchers have adopted more objective methods, such as evidence reasoning methods (Chen et al., 2014; Pradhan et al., 2014), physically based numerical models (Xu et al., 2015; Dai et al., 2016; Jia et al., 2018; Sundell et al., 2019), and machine learning (Park et al., 2012; Yi et al., 2017). However, numerical models require detailed geo-hydrological and geological parameters, which are generally difficult to collect (Smith and Knight, 2019). Evidence reasoning has strict combination rules and becomes exponentially intensive from the computational point of view as the number of elements increases, although it can handle both certain and uncertain information regardless of whether the information is complete or incomplete and precise or imprecise (Dai et al., 1999). Furthermore, current studies mainly focus on the identification and classification of hazard levels without any quantitative analysis of the subsidence hazard.

The main challenges in the field are to reduce the uncertainty of hazard assessments and to find an objective and effective method to assess hazard areas and risks. The mentioned uncertainty can be represented with probabilities. Bayesian models (BMs) are powerful probability approaches to deal with uncertainty (Vilares and Kording, 2011). BMs have been widely applied in disaster hazard assessments, such as flooding hazard and pipeline damage assessments (Liu et al., 2017; Zhang et al., 2016).

This paper proposes a fuzzy weighted Bayesian model (FWBM) that combines a weighted Bayesian model (WBM) and fuzzy set theory to evaluate the subsidence hazard probability and analyze the hazard probability for different rates of groundwater level change. The posterior probability is calculated using InSAR-derived land subsidence as model input to reduce the epistemic uncertainty. This new approach is applied in the Chaobai River alluvial fan in Beijing, China, where the first large-scale emergency groundwater resource region (EGRR) supplying water to Beijing is located. The hazard probability inferenced with the proposed relatively objective method can offer scientific support for the prediction of land subsidence hazard.

InSAR is a microwave remote sensing technique that records the phase and amplitude of the electromagnetic waves of ground objects. The phase information is used to inversely determine the subsidence. Persistent scatterer InSAR (PS-InSAR) is the most popular method for detecting time
series of land movements by calculating the differential interferometric
phase on PS points with a millimetric accuracy (Sun et al., 2017). The
density of PS points can reach 450 per km

PS-InSAR processing includes four steps (Zhu et al., 2015):

master image selection,

construction of a series of interferograms,

PS point selection, and

unwrapping phase.

BMs consider the probability distribution of random variables and can infer
the posterior probability based on weakly informative prior probability to
address uncertainty (Weise and Woger, 1993). A BM consists of a set of random variables with complex causalities that can be plotted using a directed acyclic graph (DAG), where random variables are represented as eigenvector nodes (Ren et al., 2009). In DAG (Fig. 1a), the hazard factors related to land subsidence are parent nodes (

Bayes' theorem can be used to infer posterior probability distributions from
weakly informative prior probability distributions through observed results
(Verdin et al., 2019). The approach is formulated as follows:

For multiple factors in DAG, the jointly probability of multiple conditions
can be expressed as

The conditional independence assumption must be met for BMs. This assumption generally can be strictly met in geological studies (Webb and Pazzan, 1998). WBMs use weighted assessment variables to relax the independence assumption and address the different contributions of parent nodes to child node (Webb and Pazzan, 1998). It has been widely used in hazard-related analyses (Tang et al., 2018). However, the weight of each factor, such as the piezometric decline, soil compressibility, or high static loads on the land surface, is determined by its importance to land subsidence and is usually qualitative and fuzzy (Chen et al., 2016; Li et al., 2017). The fuzziness of the factor contribution to land subsidence may cause ambiguity in weighting when their importance must be determined. These deviations can be modeled with fuzzy set theory, which expresses fuzziness through a membership function to objectively describe the relationship between land subsidence and the factors (Mentes and Helvacioglu, 2011). Therefore, the fuzzification of factor importance is applied to eliminate ambiguity.

We developed the FWBM by extending Eq. (2) with the introduction of a fuzzy-based weight. The probability of random variable

The structure of the FWBM is shown in Fig. 1b, which is an improvement of Fig. 1a. The eigenvector nodes are fuzzy weighted.

Referring to spatial variables, a BM is used to calculate the hazard probability of each factor through its spatial features. The spatial features of

The hazard probability of the spatial feature

In the FWBM framework (Fig. 2), a BM is used to infer the hazard probability with the fuzzification of factor importance to reduce the ambiguity of the relationship between the hazard factors and land subsidence measured by InSAR.

Flowchart of subsidence hazard assessment using the FWBM.

The first part of the procedure consists of data (land subsidence measurements and hazard factors) gridding to obtain homogeneous datasets. The assessment hazard factors are derived first, which are parent nodes
(

The second part is the model implementation. Three modules corresponding to
the three variables that should be inferred in FWBM, i.e., probability of

The first module infers the subsidence hazard probability of

The second module calculates the fuzzy-based weight

The third module infers the probability of

Flowchart to infer

The study area belongs to the upper-middle part of the Chaobai River alluvial fan in the northern Beijing plain and covers approximately 1350 km

Location of the study area with the digital elevation model from the Shuttle Radar Topography Mission (SRTM). The EGRR location and the land subsidence monitoring station used to calibrate the PS-InSAR outcome are shown.

In this study, four hazard factors are considered: the rates of groundwater
level change in confined (

The contour lines of compressible soil and Quaternary unit thickness were
collected. The change in groundwater levels varied from

Assessment factors:

The distribution of the factors related to groundwater level change is split
into three periods:

from January 2003 to December 2010, a period of massive groundwater exploitation after the EGRR establishment;

from January 2011 to December 2014, when the decline rate of the groundwater level slowed due to the long-term loss of groundwater which reduces the capacity of water supply and increased rainfall (Zhang et al., 2015); and

from January 2015 to December 2017, when the SNWP-CR operation partially relieved the groundwater exploitation.

All factors and subsidence datasets were gridded into 5664 cells, with a cell size of 500 m

The fuzzification of factor importance is expressed as a triangular fuzzy
number considering the ambiguity between factors and subsidence. To compare
the model performance when ambiguity was eliminated, we implemented the WBM
with the non-fuzzy-based weight (

The hazard probability of feature

The same procedure was applied to land subsidence data from 2011 to 2014 and from 2015 to 2017, and the posterior hazard probability at a previous time interval was set as the prior probability at the current one.

With the probability of the single factor and factor weights, the hazard probability of

The proposed FWBM was applied to assess the subsidence hazard probability in the upper and middle part of the Chaobai River alluvial fans from 2003 to 2017. The hazard assessment distribution was reclassified into seven grades (Fig. 6a). A hazard probability less than 0.07 indicates a low-hazard area, and a hazard probability greater than 0.15 indicates a high-hazard area (Fig. 6b).

The changes in the land subsidence rate (Sr) detected by InSAR (Fig. 6c)
between 2010–2014 and 2015–2017 were used to validate the assessment results. A positive value means the subsidence rate decreased (SrD) and a
negative value means the subsidence rate increased (SrI). The total match
ratio is 85 % (Table 2). Notably, Table 1 showed that some points with SrD
are located in the high-hazard area. In fact, there are portions of the
study plain where, because the piezometric level did not recover significantly, land subsidence rates remained larger than 50 mm yr

Comparison of the match ratio (calculated by the ratio between the sum of the amount of SrI in the high-hazard area and SrD in the low-hazard area, and the total number of SrI and SrD) obtained with FWBM and WBM.

Fuzzy (

The FWBM results were compared with the results from a WBM that ignored the
ambiguity in the hazard assessment framework. The fuzzy-based weights (

The WBM results were also divided into seven levels. The degree of difference between WBM and FWBM outcomes is shown in Fig. 7a. Negative and positive values mean that WBM provides a higher or a lower hazard level than FWBM, respectively. In terms of the subsidence rate change (Fig. 6c), the WBM overestimates the subsidence hazard level for area 1 (Fig. 7b) and partially the level for area 2 (Fig. 7c), where the subsidence rate decreased in recent years. In addition, the WBM underestimates the hazard level for area 3 (Fig. 7d), where the subsidence rate increased. The FWBM performs better in regions with SrD and is characterized by a higher total match ratio than the WBM, as shown in Table 1.

The hazard probability of factors addressed in the analysis is shown in Fig. 8. Using the period from 2015 to 2017 as an example, a rate reduction of the groundwater level in the confined aquifer greater than 1 m yr

Hazard probability of the selected four factors for the three time periods considered in the analysis.

Because land subsidence is negligible in the northern part of the study area (Zhu et al., 2015), the temporal change in the hazard probability of land subsidence is investigated only in the southern region, where the confined aquifer system is located.

As shown in Fig. 6a, the subsidence hazard probability in Niulanshan decreased from 2003 to 2017. Conversely, the southwestern portion of the study area, especially in Tianzhu and Nanfaxin where the groundwater level decrease exceeded 1 m yr

Probability distribution of subsidence hazard for the three time periods from 2003 to 2017.

Figure 9 shows that the value of the subsidence hazard probability is between 0.01 % and 51.30 % from 2003 to 2010, between 0.01 % and 45.54 % from 2011 to 2014, and between 0.01 % and 28.33 % from 2015 to 2017.

Overall, the subsidence hazard decreased. This can be credited to the significant exploitation of groundwater resources and the following utilization policies. From 2003 to 2010, the operation of the EGRR led to a rapid drawdown of the groundwater levels. Between 2015 and 2017, the operation of the SNWP-CR conveyed a large amount of water to Beijing, reducing the pressure on the aquifer system and, consequently, slowing the rate of change in the groundwater level (Zhu et al., 2020b). Similarly, the California State Water Project, which is also a large water-transfer system, has been fundamental to controlling land subsidence in the Central Valley, California (Sneed et al., 2018).

Four subsidence hazard levels are classified from the probability map (Fig. 6b), consistent with previous studies (Yang et al., 2013; Zhu et al.,
2015). The high hazard covers 10.7 % of the total area, and the medium
hazard accounts for 17.5 % of the total area. The low and very low hazards
represent 29.7 % and 42.1 % of the total area, respectively. As the thickness of the compressible sediments increases from north to south, the
subsidence hazard probability increases accordingly. Tianzhu, Nanfaxin, Gaoliying, and Houshayu in the southwestern region experience medium–high
hazards because the compressible strata in these areas are thick and the
groundwater level dropped significantly. The InSAR results also revealed
that the maximum subsidence rate in these regions increased to 84.9 mm yr

Considering the ambiguity of the importance of various factors controlling land subsidence due to groundwater pumping and the uncertainty in the assessment process, a FWBM model was developed to assess the probability of land subsidence hazard at a regional scale. FWBM is based on a combination of BM and fuzzy set theory. The InSAR method was used to obtain land subsidence time series to adjust the posterior probability of the FWBM, thus reducing the model uncertainty.

The implementation of the FWBM in the Beijing area demonstrated the potentiality of this modeling approach and showed that it is superior when
the ambiguity of the relationship between the factors and land subsidence is
considered. The study is a first analysis of the hazard probability of land
subsidence and the related hazard factors. From the case study, we found
that subsidence probability decreased over time due to a change in water utilization, such as the operation of the SNWP-CR. A rate of groundwater
level decline greater than 1 m yr

The results of this study suggest that the proposed subsidence hazard assessment method significantly represents the uncertainty and ambiguity compared to traditional qualitative methods (Huang et al., 2012; Park et al., 2012; Yang et al., 2013; Chen et al., 2014; Tafreshi et al., 2019; Sundell et al., 2019). The hazard probability map of different time periods with different groundwater level conditions can offer scientific support for land subsidence prediction and help stakeholders and decision makers to develop more reliable water utilization strategies accounting for land subsidence hazard.

Improvements to the proposed methodology will be investigated in the future. The prior probability in this model is determined by the factor grid number ratio, which may have deviations. This ratio can be further improved by expert knowledge. Additionally, the impact of the selected assessment factors on the results will be investigated. Moreover, the cumulative land subsidence and not only the subsidence rate will be considered to assess the subsidence hazard with a more comprehensive perspective.

The calculations involved in this study are done using ArcGIS software.

The data used in this study are available from the corresponding author on reasonable request.

HL completed the experiment and paper, LZ and ZD provided rigorous guidance on the thesis and revised the manuscript, GG provided part of data and discussed the contents, YZ and LC revised the manuscript, and XL and PT provided guidance on the research conclusions and revised the manuscript.

The authors declare that they have no conflict of interest.

The authors are thankful to Tingbao Xu at the Australian National University and Yun Chen at CSIRO for their valuable comments and suggestions.

This research has been supported by the National Natural Science Foundation of China (grant nos. 41130744 and 41201420), the Natural Science Foundation of Beijing Municipality (grant no. 8202008), and the Capacity Building for Sci-Tech Innovation-Fundamental Scientific Research Funds (grant no. 025195305000/191).

This paper was edited by Olga Petrucci and reviewed by two anonymous referees.