Hi Atish,
Thanks very much for clarifying! I hope you make a full recovery soon.
Dealing with high-dimensional input is typically trickier than high-dimensional output, partly because of the algorithmic challenges (e.g. the number of parameters to be optimized increases), but more importantly because the sensitivity index values are then likely to be very low. As an extreme case, imagine that 50 variables contribute equally to the response: the sensitivity index of each variable would be about 0.02 or less, and estimating values this small accurately requires an enormous number of samples. With 512 samples, the estimates can be significantly perturbed by sampling variability. On the other hand, if only a few variables actually dominate the response of your model, some algorithms can work. The best way to find out is to try it.
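To give a rough feel for this sampling variability, here is a minimal sketch in plain Python. It is not the algorithm quoFEM uses. It assumes a hypothetical additive model Y = X1 + ... + X50 with independent standard normal inputs, so that every true main index is 1/50 = 0.02, and it uses the squared sample correlation as the estimator, which is only valid for a linear model like this one. The function names are made up for illustration.

```python
import random
import statistics


def squared_correlation(x, y):
    """Squared sample Pearson correlation between two equal-length lists."""
    mx, my = statistics.fmean(x), statistics.fmean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy * sxy / (sxx * syy)


def estimate_main_indices(n_samples, n_vars, rng):
    """Estimate the main index of each input for the toy model Y = sum(X).

    Assumption: for a linear model with independent inputs, the main
    (first-order) index of X_j equals corr(X_j, Y)^2.
    """
    samples = [[rng.gauss(0.0, 1.0) for _ in range(n_vars)]
               for _ in range(n_samples)]
    y = [sum(row) for row in samples]
    return [squared_correlation([row[j] for row in samples], y)
            for j in range(n_vars)]


rng = random.Random(0)  # fixed seed so the run is repeatable
n_vars = 50
true_index = 1.0 / n_vars  # 0.02 for every variable in this toy model
estimates = estimate_main_indices(512, n_vars, rng)

print("true index for every variable:", true_index)
print("smallest estimate:", min(estimates))
print("largest estimate: ", max(estimates))
print("mean estimate:    ", statistics.fmean(estimates))
```

All 50 variables have the same true index here, so the spread between the smallest and largest estimate is purely sampling noise. Comparing that spread with 0.02 gives an idea of how much to trust individual values at this sample size.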
To run a quoFEM analysis using existing Monte Carlo results, please follow the steps below:
- In the UQ tab, select "Sensitivity Analysis"-"SimCenterUQ"-"Import Data Files". Set # samples to 512 and import the data files that are prepared following the instructions.
- In the FEM tab, select "none".
- Then, if you click the RV tab, quoFEM should already have auto-populated 50 variables (nothing to change), and finally, in the QoI tab, you can set any name for the output variable and set the length to 4.
Some caveats:
- Please note that the total sensitivity index from the SimCenterUQ algorithm is likely not credible for such high-dimensional inputs (the challenge is in fitting a Gaussian mixture distribution in a 50-dimensional space; see here for the reference). So only the main index is likely to be useful.
- If you want a reliable total index, you may want to run the algorithm in the Dakota engine, but this typically requires a much larger number of simulations, and the index cannot be estimated from pre-simulated samples (you need to import the model in the FEM tab). However, this algorithm is guaranteed to converge to the exact solution as the number of samples becomes very large.
- One more caveat applies when the input variables are correlated. In that case, please note that contributions can be "double counted" across the correlated variables, so be careful when interpreting the Sobol indices.
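As a small illustration of the double counting with correlated inputs, consider a hypothetical toy model Y = X1 + X2, where X1 and X2 are standard normal with correlation rho. This is a sketch under those assumptions only, not something quoFEM computes this way. Because E[X2 | X1] = rho * X1, the main index of each variable works out to (1 + rho) / 2, so the two main indices add up to more than 1 whenever rho > 0.

```python
def main_index_sum_of_two(rho):
    """Main (first-order) Sobol index of X1 for the toy model Y = X1 + X2.

    Assumption: X1 and X2 are standard normal with correlation rho, so
    E[Y | X1] = (1 + rho) * X1 and Var(Y) = 2 + 2 * rho.
    """
    var_cond_mean = (1.0 + rho) ** 2  # Var(E[Y | X1])
    var_y = 2.0 + 2.0 * rho           # Var(X1 + X2)
    return var_cond_mean / var_y


for rho in (0.0, 0.5, 0.9):
    s1 = main_index_sum_of_two(rho)
    # By symmetry X2 has the same index, so the sum of both is 2 * s1.
    print(f"rho={rho}: main index of each variable={s1:.3f}, sum={2 * s1:.3f}")
```

With independent inputs the two indices share the variance evenly and sum to 1. As the correlation grows, each variable is also credited with the part of the response it shares with the other one.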
For the surrogate model, I assume the GP in quoFEM would not work: with 512 samples for 50 input dimensions, it will very likely overfit.
Please let me know if anything is unclear or if you have difficulty running the analysis.
Best,
Sang-ri