Statistical Modelling and Analysis of the Computer-Simulated Datasets

January 31, 2019

My first academic publication: a peer-reviewed book chapter on statistical modelling using Gaussian processes (GPs). We reviewed several GP models and correlation structures, along with methods for handling the numerical instabilities caused by near-singular correlation matrices. Finally, we reviewed several algorithms developed specifically for analysing big data obtained from computer simulators.


Over the last two decades, science has come a long way from relying only on physical experiments and observations to experimentation using computer simulators. This chapter focuses on the modelling and analysis of data arising from computer simulators. It turns out that traditional statistical metamodels are often not very useful for analyzing such datasets. For deterministic computer simulators, the realizations of Gaussian Process (GP) models are commonly used for fitting a surrogate statistical metamodel of the simulator output. The chapter starts with a quick review of the standard GP-based statistical surrogate model. The chapter also emphasizes the numerical instability due to near-singularity of the spatial correlation structure in the GP model fitting process. The authors also present a few generalizations of the GP model, review methods and algorithms specifically developed for analyzing big data obtained from computer model runs, and review the popular analysis goals of such computer experiments. A few real-life computer simulators are also briefly outlined here.
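To make the near-singularity issue concrete, here is a minimal Python sketch of a constant-mean GP surrogate with a Gaussian correlation function. It is an illustration under stated assumptions, not code from the chapter: the function names (`gauss_corr`, `gp_predict`), the toy sine "simulator", and the values of `theta` and `nugget` are all hypothetical. Adding a small nugget to the diagonal is one common remedy for an ill-conditioned correlation matrix, and it may or may not match the specific methods the chapter reviews.

```python
import numpy as np

def gauss_corr(A, B, theta):
    """Gaussian correlation exp(-theta * squared distance) between rows of A and B."""
    sq_dist = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=-1)
    return np.exp(-theta * sq_dist)

def gp_predict(X, y, X_new, theta, nugget):
    """Constant-mean GP predictor; `nugget` is added to the diagonal of R."""
    n = len(y)
    R = gauss_corr(X, X, theta) + nugget * np.eye(n)
    L = np.linalg.cholesky(R)  # fails if R is not numerically positive definite

    def solve(b):
        # Solve R x = b using the Cholesky factor R = L L^T.
        return np.linalg.solve(L.T, np.linalg.solve(L, b))

    ones = np.ones(n)
    mu = (ones @ solve(y)) / (ones @ solve(ones))  # generalized least-squares mean
    return mu + gauss_corr(X_new, X, theta) @ solve(y - mu)

# Toy deterministic "simulator" on 30 equally spaced inputs in [0, 1] (assumed example).
X = np.linspace(0.0, 1.0, 30).reshape(-1, 1)
y = np.sin(2.0 * np.pi * X[:, 0])

theta, nugget = 10.0, 1e-8  # illustrative values, not tuned or estimated
R_raw = gauss_corr(X, X, theta)
cond_raw = np.linalg.cond(R_raw)
cond_nug = np.linalg.cond(R_raw + nugget * np.eye(len(y)))

y_hat = gp_predict(X, y, X, theta, nugget)
max_abs_err = np.max(np.abs(y_hat - y))
print(f"cond without nugget: {cond_raw:.2e}, with nugget: {cond_nug:.2e}")
print(f"max abs error at design points: {max_abs_err:.2e}")
```

The trade-off in this sketch is that a nonzero nugget turns the predictor into a smoother rather than an exact interpolator of the deterministic output, so the nugget should be kept as small as numerical stability allows.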


Harshvardhan, M., & Ranjan, P. (2019). “Statistical Modelling and Analysis of the Computer-Simulated Datasets”. In B. Gupta, & D. Agrawal (Eds.), Handbook of Research on Cloud Computing and Big Data Applications in IoT (pp. 202-228). Hershey, PA: IGI Global. doi:10.4018/978-1-5225-8407-0.ch011 [arXiv:2012.11122]
