SimCenter Forum

Research in Natural Hazards Engineering => Uncertainty Quantification (quoFEM) => Topic started by: atish on October 25, 2023, 04:55:13 AM

Title: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: atish on October 25, 2023, 04:55:13 AM
Dear SimCenter Team,

I am a PhD student at the University of New South Wales, Australia. I am currently working on a global sensitivity analysis of a high-fidelity atmospheric model. My inputs of interest are 50 reaction rates (input parameters), whose effects I want to quantify. I ran the model with 512 sets of perturbed reaction rates and performed a Sobol analysis. However, my Sobol indices are not converging, which I think is because my sample size is too small. I have now decided to create a surrogate model using Gaussian Process (GP) regression and then perform the Sobol analysis. I am new to machine learning.

My questions:
1. Since I have 50 input parameters and 512 output samples, can I use quoFEM to perform the sensitivity analysis?
2. If not, can I use quoFEM to create a GP-based surrogate model?
Title: Re: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: Sang-ri on October 25, 2023, 08:45:04 PM
Hi,

Thanks for the post! I have some quick questions before going into details.

1. Can you let me know the dimensions of the model input (=number of the parameters whose contribution to the responses will be inspected) and output (=number of the responses from a simulation run) of interest? I am assuming that the output dimension is 50, but I just wanted to clarify. I believe the number of simulations you ran is 512.

2. Regarding the sensitivity analysis that did not converge, was it performed using quoFEM?

Thanks,
Sang-ri
Title: Re: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: atish on October 30, 2023, 02:37:26 AM
Thanks, Sang-ri.

My apologies for the late reply. I have been feeling unwell for the past few days.

1. The dimension of my model input is 50. I have chosen 50 chemical reaction rates whose contribution I would like to inspect. The atmospheric model that I am working on is the Global Ionosphere Thermosphere Model (GITM), which outputs many parameters, but I will try to focus on baseline parameters such as neutral temperature, nitric oxide density, neutral density, and electron density. Let's say, for now, my output dimension is 4.

2. No, I have not used quoFEM before. Previously, I tried to use the SALib - Sensitivity Analysis Library in Python to perform Sobol Analysis. The result was not good.

I have 512 simulation results, which I obtained by running my model with 512 different sets of perturbed reaction rates. Each set contains 50 reaction rates. I generated these 512 sets using Monte Carlo sampling.

Regards,
Atish
Title: Re: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: Sang-ri on October 30, 2023, 07:36:22 PM
Hi Atish,

Thanks very much for clarifying! I hope you make a full recovery soon.

Dealing with high-dimensional input is typically trickier than dealing with high-dimensional output, partly because of the algorithmic challenges (e.g. the number of parameters to be optimized increases), but more importantly because the sensitivity index values are then likely to be very low. As an extreme case, imagine that 50 variables contribute equally to the response: the sensitivity index of each variable can be 0.02 (1/50) or less, and estimating it to that level of accuracy requires an enormous number of samples. With 512 samples, the estimate can be significantly perturbed by sampling variability. On the other hand, if only a few variables actually dominate the response of your model, some algorithms can work. The best way to figure it out is to test it out :)
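
The two cases above can be tried on synthetic data. The following is a minimal sketch, not the algorithm quoFEM uses: it applies a simple binning estimator of first-order Sobol indices to 512 existing Monte Carlo samples of two made-up linear test functions with 50 inputs. The function name `first_order_indices`, the bin count, and the test coefficients are all assumptions chosen for illustration.

```python
import numpy as np

def first_order_indices(X, y, n_bins=8):
    """Binning estimate of first-order Sobol indices from existing samples.

    For each input, samples are grouped into equal-count bins and the
    variance of the bin means of y approximates Var(E[y | x_i]).
    """
    n, d = X.shape
    var_y = y.var()
    S = np.empty(d)
    for i in range(d):
        order = np.argsort(X[:, i])
        bins = np.array_split(order, n_bins)
        means = np.array([y[b].mean() for b in bins])
        weights = np.array([len(b) for b in bins]) / n
        S[i] = np.sum(weights * (means - y.mean()) ** 2) / var_y
    return S

rng = np.random.default_rng(0)
n, d = 512, 50
X = rng.uniform(size=(n, d))

# Case 1: all 50 inputs contribute equally, so each true index is 1/50 = 0.02.
y_equal = X.sum(axis=1)
S_equal = first_order_indices(X, y_equal)

# Case 2: three inputs dominate (hypothetical coefficients).
coef = np.full(d, 0.1)
coef[:3] = [5.0, 4.0, 3.0]
y_sparse = X @ coef
S_sparse = first_order_indices(X, y_sparse)
# For a linear model with i.i.d. inputs, S_i = c_i^2 / sum_j c_j^2.
true_sparse = coef**2 / np.sum(coef**2)

print("equal case : min %.3f, max %.3f (true value 0.02 for every input)"
      % (S_equal.min(), S_equal.max()))
print("sparse case: estimates for the 3 dominant inputs:", np.round(S_sparse[:3], 3))
print("sparse case: true values for the 3 dominant inputs:", np.round(true_sparse[:3], 3))
```

In the equal-contribution case the scatter of the 50 estimates is of the same order as the true value of 0.02, whereas the three dominant inputs in the second case remain clearly identifiable with the same 512 samples.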

To run quoFEM analysis using existing Monte Carlo results, please follow the below:
Some caveats:

For the surrogate model, I assume GP in quoFEM would not work: with 512 samples for 50 input dimensions, it will very likely result in overfitting.

Please let me know if something is unclear or if you have difficulty running the analysis.

Best,
Sang-ri

Title: Re: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: atish on October 31, 2023, 12:05:12 AM
Many thanks, Sang-ri.

1. I will try to perform the sensitivity analysis in quoFEM as per your instructions. Fingers crossed. Let's see what the sensitivity index values look like.
2. For the GP-based surrogate model, could you please suggest which tools will be easier for my case, provided I have no machine learning experience? I am comfortable with Python and MATLAB.
Title: Re: Using quoFEM for GP based Surrogate Modelling and Sensitivity Analysis,
Post by: Sang-ri on October 31, 2023, 11:31:04 PM
Hi Atish,

For the second question, I believe overfitting is a general limitation of GP for high-dimensional inputs (rather than a limitation of specific toolboxes/packages), and its effect is highly problem-specific.

You could always try running quoFEM, because once the data files are prepared to run the sensitivity analysis, the same files can be easily used for surrogate model training. The cross-validation results are provided as an output to help you understand how well the surrogate model is trained.
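
To illustrate what such a cross-validation check looks like, here is a minimal NumPy-only sketch. It is not quoFEM's or GPy's implementation: the kernel, the fixed `length_scale` and `noise` values, the helper names, and the synthetic 50-input test function are all assumptions for illustration, and real toolboxes optimize the hyperparameters instead of fixing them.

```python
import numpy as np

def rbf_kernel(A, B, length_scale):
    """Squared-exponential kernel between the rows of A and B."""
    sq = (A**2).sum(1)[:, None] + (B**2).sum(1)[None, :] - 2.0 * A @ B.T
    return np.exp(-0.5 * np.maximum(sq, 0.0) / length_scale**2)

def gp_predict(X_train, y_train, X_test, length_scale, noise=1e-6):
    """GP posterior mean with a constant (sample-mean) prior mean."""
    mu = y_train.mean()
    K = rbf_kernel(X_train, X_train, length_scale) + noise * np.eye(len(X_train))
    alpha = np.linalg.solve(K, y_train - mu)
    return mu + rbf_kernel(X_test, X_train, length_scale) @ alpha

def r2(y_true, y_pred):
    """Coefficient of determination."""
    ss_res = np.sum((y_true - y_pred) ** 2)
    ss_tot = np.sum((y_true - y_true.mean()) ** 2)
    return 1.0 - ss_res / ss_tot

def kfold_r2(X, y, length_scale, k=5, seed=0):
    """R^2 of out-of-fold predictions from k-fold cross-validation."""
    idx = np.random.default_rng(seed).permutation(len(X))
    y_pred = np.empty_like(y)
    for fold in np.array_split(idx, k):
        train = np.setdiff1d(idx, fold)
        y_pred[fold] = gp_predict(X[train], y[train], X[fold], length_scale)
    return r2(y, y_pred)

# Synthetic stand-in for the real data: 512 samples, 50 inputs, 1 output.
rng = np.random.default_rng(1)
X = rng.uniform(size=(512, 50))
y = (5.0 * X[:, 0] + 4.0 * X[:, 1] ** 2 + 3.0 * np.sin(2 * np.pi * X[:, 2])
     + 0.1 * X[:, 3:].sum(axis=1))
y = (y - y.mean()) / y.std()  # standardize the output

length_scale = 3.0  # hypothetical fixed value, not tuned
train_r2 = r2(y, gp_predict(X, y, X, length_scale))
cv_r2 = kfold_r2(X, y, length_scale)
print("training R^2: %.3f, 5-fold cross-validation R^2: %.3f" % (train_r2, cv_r2))
```

With almost no noise term the GP reproduces the training data nearly exactly, so the training R^2 says little about quality; a cross-validation R^2 that is much lower than the training value is the sign of overfitting to look for.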

quoFEM is easy for UQ beginners to use because we provide some recommended setups by default, but if you would like more control over the surrogate training algorithm by working directly with Python/MATLAB toolboxes, "GPy" is the Python package that quoFEM utilizes for GP training. Additionally, "UQpy" (Python) and "UQLab" (MATLAB) are well-established and maintained UQ packages that have surrogate training modules.

Best,
Sang-ri