I Web, Therefore I Exist

Slides, a recorded lecture and additional resources from my talk on how to create and control your digital identity.

IndiaPIN: R Data Package

R Package for All India PIN Codes Directory with Latitude and Longitude Details (Updated: December 2021)

Today I Learnt

This is my digital kitchen sink. Basically a dump of cool things.

Is COVID-19 Data tampered?

Is there any evidence of tampering or manipulation in the daily COVID-19 cases reported by countries? Using Benford analysis in R, I try to reach a conclusion.
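For readers new to the idea, here is a minimal sketch of a first-digit Benford check. It is written in Python with only the standard library, not the R used in the post, and it is not the post's actual analysis. The function names and the chi-square comparison are my own assumptions for illustration.

```python
import math
from collections import Counter


def benford_expected():
    """Benford's law: P(first digit = d) = log10(1 + 1/d) for d = 1..9."""
    return {d: math.log10(1 + 1 / d) for d in range(1, 10)}


def first_digit(n):
    """Leading decimal digit of a non-zero integer."""
    n = abs(int(n))
    while n >= 10:
        n //= 10
    return n


def chi_square_stat(values):
    """Pearson chi-square distance between observed first-digit counts
    and the counts Benford's law would predict (zeros are skipped)."""
    digits = [first_digit(v) for v in values if int(v) != 0]
    if not digits:
        raise ValueError("need at least one non-zero value")
    n = len(digits)
    tally = Counter(digits)
    return sum(
        (tally.get(d, 0) - n * p) ** 2 / (n * p)
        for d, p in benford_expected().items()
    )


# Hypothetical input: powers of two are a textbook Benford-like sequence.
stat = chi_square_stat([2 ** k for k in range(1, 200)])
```

With nine digit classes the statistic has 8 degrees of freedom, so a value above roughly 15.5 would reject Benford's law at the 5% level. A large statistic only says the digits deviate from Benford's law. On its own it is not proof of tampering.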

Dynamic GP: Application to Malaria Vaccine Coverage Prediction

Gaussian process (GP) based statistical surrogates are popular, inexpensive substitutes for emulating the outputs of expensive computer models that simulate real-world phenomena or complex systems. Here, we discuss the evolution of the dynamic GP model, a computationally efficient statistical surrogate for a computer simulator with time series outputs. The main idea is to use a convolution of standard GP models, where the weights are guided by a singular value decomposition (SVD) of the response matrix over the time component. The dynamic GP model also adopts a localized modeling approach for building a statistical model for large datasets. In this chapter, we use several popular test-function-based computer simulators to illustrate the evolution of dynamic GP models. We also use this model to predict malaria vaccine coverage worldwide. Malaria still affects more than eighty countries, concentrated in the tropical belt. In 2019 alone, it caused more than 435,000 deaths worldwide. The disease is easy to cure if diagnosed in time, but its common symptoms make timely diagnosis difficult. We focus on a recently discovered reliable vaccine called Mosquirix (RTS,S), which is currently undergoing human trials. With the help of publicly available data on dosages, efficacy, disease incidence and communicability of other vaccines obtained from the World Health Organisation, we predict vaccine coverage for 78 malaria-prone countries.
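The SVD step mentioned above can be sketched as follows. This is only an illustration of the decomposition, assuming NumPy is available and a response matrix with one column per simulator run. The function name, the toy simulator and the choice of two basis vectors are hypothetical, and the GP fitting itself is left out.

```python
import numpy as np


def svd_basis(Y, p):
    """Split a (T x n) response matrix into p time bases and coefficients.

    Column j of Y is the time series produced by the j-th input setting.
    Returns B (T x p) and C (p x n) such that Y is approximated by B @ C.
    """
    U, d, Vt = np.linalg.svd(Y, full_matrices=False)
    B = U[:, :p] * d[:p]  # leading left singular vectors, scaled by singular values
    C = Vt[:p, :]         # one row of coefficients per basis vector
    return B, C


# Toy simulator (an assumption for illustration): y(t; x) = x sin(2 pi t) + x^2 cos(2 pi t)
t = np.linspace(0.0, 1.0, 50)
x = np.linspace(0.1, 1.0, 8)
Y = np.outer(np.sin(2 * np.pi * t), x) + np.outer(np.cos(2 * np.pi * t), x ** 2)

B, C = svd_basis(Y, p=2)
max_error = np.max(np.abs(B @ C - Y))  # reconstruction error with two bases
```

In an SVD-based surrogate of this kind, each row of `C` would then be modelled as a function of the input `x` by its own GP, so that a prediction at a new input is a weighted combination of the columns of `B`.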

Statistical Modelling and Analysis of the Computer-Simulated Datasets

Over the last two decades, science has come a long way from relying only on physical experiments and observations to experimentation using computer simulators. This chapter focuses on the modelling and analysis of data arising from computer simulators. It turns out that traditional statistical metamodels are often not very useful for analyzing such datasets. For deterministic computer simulators, realizations of Gaussian process (GP) models are commonly used to fit a surrogate statistical metamodel of the simulator output. This peer-reviewed book chapter reviews GP models and the numerical stability issues caused by near-singularity of spatial correlation structures. We also present generalisations of the GP model and review algorithms for big data.

Chai Kaapi

Analytics live project for a fast food chain in India