Environmental prediction and risk analysis using fuzzy numbers and data-driven models




Khan, Usman Taqdees




Dissolved oxygen (DO) is an important water quality parameter used to assess the health of aquatic ecosystems. Physically-based numerical models are typically used to predict DO; however, these models do not capture the complexity and uncertainty seen in highly urbanised riverine environments. To overcome these limitations, this dissertation proposes an alternative approach that combines data-driven methods and fuzzy numbers to improve DO prediction in urban riverine environments. A major issue in implementing fuzzy numbers is that there is no consistent, transparent and objective method to construct them from observations. A new method to construct fuzzy numbers is proposed, which uses the relationship between probability and possibility theory. Numerical experiments demonstrate that the linear membership functions typically used are inappropriate for environmental data. A new algorithm to estimate the membership function is developed, in which a bin-size optimisation algorithm is paired with a numerical technique based on the fuzzy extension principle. The developed method requires neither assumptions about the underlying distribution nor the selection of an arbitrary bin size, and has the flexibility to create fuzzy numbers of different shapes. The impact of input data resolution and error value on the membership function is analysed. Two new fuzzy data-driven methods, fuzzy linear regression and a fuzzy neural network, are proposed to predict DO using real-time data. These methods use fuzzy inputs, fuzzy outputs and fuzzy model coefficients to characterise the total uncertainty. Existing methods cannot accommodate fuzzy numbers for each of these variables. The new fuzzy regression method was compared against two existing fuzzy regression methods, Bayesian linear regression, and error-in-variables regression.
The new method was better able to predict DO because it can incorporate different sources of uncertainty in each component. A number of model assessment metrics were proposed to quantify fuzzy model performance. Fuzzy linear regression methods outperformed probability-based methods. Similar results were seen when the method was used for peak flow rate prediction. An existing fuzzy neural network model was refined through possibility-theory-based calibration of network parameters and the use of fuzzy rather than crisp inputs. A method to find the optimum network architecture was proposed to select the number of hidden neurons and the amount of data used for training, validation and testing. The performance of the updated fuzzy neural network was compared with the crisp results. The method demonstrated an improved ability to predict low DO compared to non-fuzzy techniques. The fuzzy data-driven methods using non-linear membership functions correctly identified the occurrence of extreme events. These predictions were used to quantify the risk using a new possibility-probability transformation. All combinations of inputs that lead to a risk of low DO were identified to create a risk tool for water resource managers. Results from this research provide new tools to predict environmental factors in a highly complex and uncertain environment using fuzzy numbers.
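
As a rough illustration of how observations might be turned into a fuzzy number through the link between probability and possibility, the sketch below applies a standard Dubois-Prade style probability-to-possibility transformation to a histogram. It is not the algorithm developed in the dissertation: the bin count is fixed by hand rather than optimised, the data are synthetic, and the helper names `histogram_to_possibility` and `alpha_cut` are hypothetical.

```python
import numpy as np


def histogram_to_possibility(samples, bins=10):
    """Return bin centres and one possibility value per bin (hypothetical helper)."""
    counts, edges = np.histogram(samples, bins=bins)
    p = counts / counts.sum()                      # empirical bin probabilities
    centres = 0.5 * (edges[:-1] + edges[1:])
    # Dubois-Prade style transformation (assumed here, not taken from the thesis):
    # the possibility of bin i is the sum of all bin probabilities p_j with p_j <= p_i,
    # so the modal bin receives a possibility of 1.
    pi = np.array([p[p <= p_i].sum() for p_i in p])
    return centres, pi


def alpha_cut(centres, pi, alpha):
    """Smallest interval covering every bin whose possibility is at least alpha."""
    members = centres[pi >= alpha]
    return members.min(), members.max()


# Synthetic, skewed stand-in for DO observations in mg/L (an assumption, not real data).
rng = np.random.default_rng(0)
do_obs = rng.gamma(shape=9.0, scale=1.0, size=500)

centres, pi = histogram_to_possibility(do_obs, bins=12)
low, high = alpha_cut(centres, pi, alpha=0.5)
print(f"alpha = 0.5 interval: [{low:.2f}, {high:.2f}] mg/L")
```

Because the membership values come directly from the histogram, the resulting fuzzy number is not forced to be triangular or symmetric, which is the kind of flexibility the abstract describes. The intervals returned by `alpha_cut` are nested: lowering `alpha` can only widen the interval.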



uncertainty analysis, fuzzy numbers, data-driven models, water quality, floods, risk assessment, artificial neural network, linear regression