Forecasting Malaria Vaccine Demand

This is a summary of my research work on forecasting demand for malaria vaccines. For a preprint of the actual paper, click here. I have left out the mathematical details of the modelling and focus only on my approach and the results.

Malaria is a mosquito-borne disease caused by Plasmodium parasites. Although malaria is preventable and curable, it can cause severe illness and prove fatal if left untreated.

In February 2019, a new malaria vaccine, RTS,S (trade name Mosquirix), was approved for human trials in three countries, Ghana, Malawi and Kenya, coordinated by the WHO. The study is expected to conclude by December 2022. In the meantime, several large pharmaceutical companies have shown interest in mass-producing the vaccine over the last few months.

The companies want to estimate the coverage ratio, defined as the vaccinated population count divided by the total population. In this research, my aim was to forecast this ratio for all 78 affected countries using a dynamic Gaussian process model.
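To make the definition concrete, here is a minimal sketch of the coverage ratio as a function. The function name and the numbers in the example are made up for illustration and are not taken from the paper.

```python
def coverage_ratio(vaccinated: int, population: int) -> float:
    """Vaccinated population count divided by the total population."""
    if population <= 0:
        raise ValueError("population must be positive")
    if not 0 <= vaccinated <= population:
        raise ValueError("vaccinated must lie between 0 and population")
    return vaccinated / population


# Hypothetical counts, for illustration only.
example = coverage_ratio(vaccinated=250_000, population=1_000_000)
```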


A vaccine's coverage in a country or geography depends on several factors. How effective is the vaccine? How many people fear going unvaccinated? How many doses does it require? Is the disease contagious?

Each country has its own specific characteristics, which are difficult to quantify. Therefore, the best approach is to group countries with "similar" characteristics and model each group separately, instead of having only one model. In this work, we used the Human Development Index (HDI) to group countries.
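One simple way to form such groups is to rank countries by HDI and cut the ranking into contiguous groups of near-equal size. The sketch below does exactly that. It is only an illustration: the function name, the placeholder country labels, the HDI values, and the choice that group 1 holds the highest-HDI countries are all my assumptions here, not the grouping rule from the paper.

```python
def group_by_hdi(hdi_by_country: dict[str, float], n_groups: int) -> dict[str, int]:
    """Rank countries by HDI (highest first) and split the ranking into
    n_groups contiguous groups of near-equal size.

    Assumption: group 1 holds the highest-HDI countries.
    """
    if not 1 <= n_groups <= len(hdi_by_country):
        raise ValueError("n_groups must be between 1 and the number of countries")

    ranked = sorted(hdi_by_country, key=hdi_by_country.get, reverse=True)
    base, extra = divmod(len(ranked), n_groups)

    groups: dict[str, int] = {}
    start = 0
    for g in range(1, n_groups + 1):
        # The first `extra` groups absorb one leftover country each.
        size = base + (1 if g <= extra else 0)
        for country in ranked[start:start + size]:
            groups[country] = g
        start += size
    return groups


# Placeholder labels and made-up HDI values, for illustration only.
hdi = {"A": 0.45, "B": 0.71, "C": 0.52, "D": 0.66, "E": 0.39}
groups = group_by_hdi(hdi, n_groups=2)
```

Cutting on rank rather than on fixed HDI thresholds keeps the groups balanced in size, which matters when each group gets its own model.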

In strict modelling terms,

  • Y = time-series of coverage ratios for the next T years

  • X1 = Dosage number. The value is k if k doses of the vaccine have already been given. Multi-dose schedules result in lower coverage.

  • X2 = Dosage time. The number of months after birth when the first dosage is taken; 0 represents 'at birth'. Typically, vaccines given at birth have higher coverage as there's no need to return to the hospital.

  • X3 = Efficacy. The ability of the vaccine to actually prevent the disease. Higher efficacy creates stronger motivation for vaccination.

  • X4 = Incidence per lakh (100,000 people). Parents are more likely to vaccinate their children when the disease is common; when incidence is high, the population is more careful about prevention.

  • X5 = Communicability. Fear of contagion drives vaccination.

  • X6 = Years active. As time passes, coverage naturally grows.


The training data was too large to fit a full svdGP model on a standard laptop. We therefore fitted the localized version (the lasvdGP model) developed by Zhang et al. (2018). For each point to be predicted, the localized model uses only the observations that "closely resemble" it instead of all points.

Of the 10 countries in each group, not all are used for modelling. Instead, only a selected few are used for each variable and observation. This "closeness" is decided by clustering within the country group.

All the methods were run with the DynamicGP package in R.
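To give a feel for the "local" idea, here is a rough NumPy sketch. It is not the lasvdGP algorithm of Zhang et al. and it does not use the DynamicGP package: it simply takes the nearest training inputs by Euclidean distance, compresses their time series with a truncated SVD, and runs a plain Gaussian process regression with a fixed squared-exponential kernel on the SVD coefficients. The function names, the default neighbourhood size, and the fixed kernel settings are all assumptions made for illustration.

```python
import numpy as np


def rbf_kernel(A, B, lengthscale):
    """Squared-exponential kernel between the rows of A and the rows of B."""
    sq = ((A[:, None, :] - B[None, :, :]) ** 2).sum(axis=2)
    return np.exp(-0.5 * sq / lengthscale**2)


def local_svd_gp_predict(X, Y, x_new, n_neighbors=20, n_basis=2,
                         lengthscale=1.0, nugget=1e-6):
    """Predict a length-T series at x_new from its nearest training inputs.

    X: (N, d) training inputs.
    Y: (T, N) training series, one column per training input.
    x_new: (d,) input to predict at.
    """
    # 1. Keep only the training points closest to x_new (Euclidean distance).
    dist = np.linalg.norm(X - x_new, axis=1)
    idx = np.argsort(dist)[:n_neighbors]
    X_loc, Y_loc = X[idx], Y[:, idx]

    # 2. Compress the local series: Y_loc ~ mean + basis @ coeffs.T
    mean = Y_loc.mean(axis=1, keepdims=True)
    U, s, Vt = np.linalg.svd(Y_loc - mean, full_matrices=False)
    k = min(n_basis, len(s))
    basis = U[:, :k] * s[:k]      # (T, k) scaled basis series
    coeffs = Vt[:k].T             # (n_loc, k) coefficients per neighbour

    # 3. GP posterior mean for each coefficient (zero prior mean, fixed kernel).
    K = rbf_kernel(X_loc, X_loc, lengthscale) + nugget * np.eye(len(idx))
    k_star = rbf_kernel(x_new[None, :], X_loc, lengthscale)   # (1, n_loc)
    coeff_new = k_star @ np.linalg.solve(K, coeffs)           # (1, k)

    # 4. Map the predicted coefficients back to a time series.
    return mean[:, 0] + basis @ coeff_new[0]


# Synthetic data, for illustration only.
rng = np.random.default_rng(0)
X_demo = rng.uniform(size=(50, 6))
Y_demo = rng.uniform(size=(10, 50))
forecast = local_svd_gp_predict(X_demo, Y_demo, X_demo[0] + 0.01)
```

The point of the design is cost: the kernel matrix that gets solved is only n_neighbors by n_neighbors, no matter how many training points there are in total.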


From what was known, we assumed that the first dose of the vaccine (X1 = 0) is given to a six-month-old infant (X2 = 6). Malaria is known to be non-communicable (X5 = 0). We took the average incidence as 60% (X4 = 60) and the vaccine's efficacy as 70% (X3 = 70).

Once the model was developed, we forecast the coverage for each year for all 78 countries.

Vaccine coverage in the first year and after 38 years on the world map.
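These values can be collected into a single input row in the X1 to X6 order defined earlier. The sketch below is just one way to organise them. The class and field names are hypothetical, and because the text does not state a value for X6 (years active), it is left as an argument for the caller rather than filled in.

```python
from dataclasses import astuple, dataclass


@dataclass(frozen=True)
class VaccineInputs:
    doses_already_given: int    # X1
    months_after_birth: float   # X2
    efficacy_pct: float         # X3
    incidence: float            # X4
    communicable: int           # X5 (0 = non-communicable, as in the text)
    years_active: float         # X6

    def as_row(self) -> list[float]:
        """Return the inputs as a flat row ordered X1..X6."""
        return [float(v) for v in astuple(self)]


def malaria_scenario(years_active: float) -> VaccineInputs:
    # X1..X5 are the values quoted in the text; X6 is not stated there,
    # so the caller has to supply it (assumption).
    return VaccineInputs(
        doses_already_given=0,
        months_after_birth=6,
        efficacy_pct=70,
        incidence=60,
        communicable=0,
        years_active=years_active,
    )


# years_active=0 is a placeholder chosen for illustration.
row = malaria_scenario(years_active=0).as_row()
```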

Vaccine coverage through the years for each country group.

Other Interesting Remarks

As evident from the figure, some countries start with higher coverage ratios and still lead thirty years down the line. Some groups, like group 8, remain low for the entire duration. Groups 9 and 10 quickly catch up with groups 1 and 2.

Countries that score low on HDI get the most attention from NGOs and similar organizations. They receive more external funding and support.

Countries that score high on HDI have established infrastructure and facilities to roll out the programs quickly. The countries in groups three to eight are worse off, with group eight faring the worst.

There are also seasonal trends. Much like the original data, the coverage ratios spike at the end of every decade counted from the launch year. This could be due to 'anniversary' news coverage and attention. The agencies responsible for vaccination might also be pushing to meet their 10-year targets.

However, a more in-depth study is necessary to make any definitive conclusions.