aldvmm

This post is written by Mark Pletscher, Institute of Health Economics and Health Policy, Bern University of Applied Sciences.

Introduction

Health-related quality of life is a key outcome in health technology assessments because it is patient-relevant and it is needed to calculate quality-adjusted life years. As 100% quality of life represents perfect health, health state utilities are limited at 1. The lowest possible utility in a local value set further defines a lower limit of health state utilities in a local population, and local value sets often show gaps between 1 and the next smaller utility value. Thus, health state utilities are limited dependent variables. In addition, they can be the consequence of multiple latent classes, or they can exhibit multi-modal marginal densities (Hernández Alava et al. 2014).

The goal of the aldvmm package is to fit adjusted limited dependent variable mixture models of health state utilities using the likelihood and expected value functions proposed by Hernández Alava and Wailoo (2015). Adjusted limited dependent variable mixture models have been frequently used for mapping studies (Gray, Wailoo, and Hernández Alava 2018; Gray, Hernández Alava, and Wailoo 2018; Dixon, Hollingworth, and Sparrow 2020; Yang et al. 2019; Xu et al. 2020; Fuller et al. 2017; Pennington et al. 2020), but they can also improve assessments of incremental and average marginal effects of medical interventions or health problems.

Methods

Adjusted limited dependent variable mixture models are finite mixtures of normal distributions in \(K\) components \(c\) with conditional expectations \(\mathrm{E}[y\mid X, c] = X\beta^{c}\) and standard deviations \(\sigma^{c}\). The probabilities of component membership are modeled as a multinomial logit function \(\Pr[c\mid X]=\exp(X\delta^{c})/\sum_{k=1}^{K}\exp(X\delta^{k})\). The model accumulates the density mass of the finite mixture below a minimum value \(\Psi_1\) at the value \(\Psi_1\), and the density mass above a maximum value \(\Psi_{2}\) at 1. If the maximum value \(\Psi_2\) is smaller than 1, the model emulates a value set with a gap between 1 and the next smaller value.

\[\begin{equation} \label{eq:limits} \begin{array}{ll} y_{i}|c =& \begin{cases} \begin{array}{ll} 1 & \text{if } y_{i}\mid c > \Psi_{2}\\ \Psi_{1} & \text{if } y_{i}\mid c \leq \Psi_{1}\\ y_{i}\mid c & \text{if } \Psi_{1} < y_{i}\mid c \leq \Psi_{2}\\ \end{array} \end{cases} \end{array} \end{equation}\]

Usage

The aldvmm() function fits an adjusted limited dependent variable mixture model using analytical gradients. By default, the aldvmm() function estimates mixtures of two components, but the number of components can be set by the user using the argument ncmp. If ncmp is set to 1, the model fits a tobit-like single-component model with a gap between 1 and the next smaller utility value specified in psi. We fit a simple two-component model with gender as the only explanatory variable for component means and an intercept-only model for the probability of component membership.

library("aldvmm")
data("utility")

fit <- aldvmm(eq5d ~ female | 1,
              data = utility,
              ncmp = 2,
              init = "zero",
              psi = c(-0.594, 0.883),
              optim.method = "BFGS")

summary(fit)

The model formula in aldvmm() is an object of class “formula” with two parts on the right-hand side of ~. The first part on the left of the | delimiter represents the model of expected values of normal distributions. The second part on the right of the | delimiter represents the model of probabilities of component membership.

The argument optim.method accepts all optimization methods available in the optimr package except for “nlm”, which requires a different implementation of the likelihood function.

The argument init accepts four options for the generation of starting values of the optimization algorithm.

  1. “zero”: A vector of zeroes (default).

  2. “random”: A vector of standard normal random values.

  3. “constant”: Parameter estimates of a constant-only model as starting values for intercepts and standard deviations, and zeroes for all other parameters.

  4. “sann”: Parameter estimates of a simulated annealing algorithm.

We obtain a summary table of regression results (table 1) using the generic function summary(). The coefficients of the model of expected values of normal distributions \(E[y|c, X]\) can be interpreted as marginal effects on component means. ‘lnsigma’ denotes the natural logarithm of the estimated standard deviation \(\sigma^{c}\). The coefficients of covariates in the multinomial logit model of probabilities of component membership are log-transformed relative probabilities. Our model only includes two components, and the multinomial logit model collapses to a binomial logit model. The intercept of 1.083 means that the average probability of an observation in the data to belong to component 1 is exp(1.083) or 2.954 times the probability to belong to component 2.

Table 1: Regression results with the “Nelder-Mead” optimization method and zero-only initial values
EstimateStd. Errort valuePr(>|t|)2.5 %97.5 %
E[y|X, c]
Comp1(Intercept)0.46750.02617.957300.41610.5188
female0.14980.03214.661700.08640.2131
lnsigma-1.76060.0701-25.10470-1.8989-1.6223
Comp2(Intercept)0.03120.02321.34730.1795-0.01450.077
female-0.14170.0291-4.86240-0.1991-0.0842
lnsigma-2.37160.1081-21.94570-2.5848-2.1585
P[c|X]
Comp1(Intercept)1.08310.17726.112500.73361.4326

We cannot interpret coefficients in terms of expected quality of life, but we can use predictions to calculate incremental effects. Standard errors of incremental effects can be calculated using the delta method (See vignette for example code). Objects of class “aldvmm” can also be supplied to various functions from the sandwich package to calculate robust, clustered or bootstrapped standard errors.

Discussion

The aldvmm package makes adjusted limited dependent variable mixture models available to R users and offers a broad set of optimization algorithms and methods for generating initial values.

The comparison of different optimization methods with EQ-5D-3L utility data from English patients after hip replacement in 2011 and 2012 (NHS Digital 2013) showed that the likelihood function can be challenging to maximize and can converge at extreme solutions (see vignette). Parameter estimates varied considerably across optimization methods and even across optima with the same log-likelihood. However, fitted values were very similar across optimization approaches which suggests that the model is more robust for prediction tasks than for parameter identification.

Although coefficients of models of normal means can be interpreted as marginal effects within each component, they cannot be interpreted in terms of overall expected values. Thus, average marginal effects and average treatment effects need to be calculated from predictions using the generic function predict(). Standard errors of marginal effects or average treatment effects can be calculated using the delta method (see example code in the vignette).

In situations with repeated measures, the aldvmm package only allows fixed effects estimations with group- and time-specific fixed effects which can be an important limitation in the analysis of clinical data.

References

Dixon, Padraig, William Hollingworth, and John Sparrow. 2020. “Mapping to Quality of Life and Capability Measures in Cataract Surgery Patients: From Cat-PROM5 to EQ-5D-3L, EQ-5D-5L, and ICECAP-o Using Mixture Modelling.” MDM Policy & Practice 5 (1): 2381468320915447.
Fuller, Gordon Ward, Monica Hernandez, David Pallot, Fiona Lecky, Mathew Stevenson, and Belinda Gabbe. 2017. “Health State Preference Weights for the Glasgow Outcome Scale Following Traumatic Brain Injury: A Systematic Review and Mapping Study.” Value in Health 20 (1): 141–51.
Gray, Laura A, Mónica Hernández Alava, and Allan J Wailoo. 2018. “Development of Methods for the Mapping of Utilities Using Mixture Models: Mapping the AQLQ-s to the EQ-5D-5L and the HUI3 in Patients with Asthma.” Value in Health 21 (6): 748–57.
Gray, Laura A, Allan J Wailoo, and Mónica Hernández Alava. 2018. “Mapping the FACT-b Instrument to EQ-5D-3L in Patients with Breast Cancer Using Adjusted Limited Dependent Variable Mixture Models Versus Response Mapping.” Value in Health 21 (12): 1399–1405.
Hernández Alava, Mónica, and Allan Wailoo. 2015. “Fitting Adjusted Limited Dependent Variable Mixture Models to EQ-5D.” The Stata Journal 15 (3): 737–50.
Hernández Alava, Mónica, Allan Wailoo, Fred Wolfe, and Kaleb Michaud. 2014. “A Comparison of Direct and Indirect Methods for the Estimation of Health Utilities from Clinical Outcomes.” Medical Decision Making 34 (7): 919–30.
NHS Digital. 2013. “Finalised Patient Reported Outcome Measures (PROMs) in England - April 2011 to March 2012. Patient Reported Outcome Measures (PROMs).” Https://Digital.nhs.uk/Data-and-Information/Publications/Statistical/Patient-Reported-Outcome-Measures-Proms/Finalised-Patient-Reported-Outcome-Measures-Proms-in-England-April-2011-to-March-2012 October 15, 2013.
Pennington, Becky M, Mónica Hernández Alava, Philip Hykin, Sobha Sivaprasad, Laura Flight, Abualbishr Alshreef, and John Brazier. 2020. “Mapping from Visual Acuity to EQ-5D, EQ-5D with Vision Bolt-on, and VFQ-UI in Patients with Macular Edema in the LEAVO Trial.” Value in Health 23 (7): 928–35.
Xu, Richard Huan, Eliza Lai Yi Wong, Jun Jin, Ying Dou, and Dong Dong. 2020. “Mapping of the EORTC QLQ-C30 to EQ-5D-5L Index in Patients with Lymphomas.” The European Journal of Health Economics 21 (9): 1363–73.
Yang, Fan, Carlos KH Wong, Nan Luo, James Piercy, Rebecca Moon, and James Jackson. 2019. “Mapping the Kidney Disease Quality of Life 36-Item Short Form Survey (KDQOL-36) to the EQ-5D-3L and the EQ-5D-5L in Patients Undergoing Dialysis.” The European Journal of Health Economics 20 (8): 1195–1206.
Gianluca Baio
Gianluca Baio
Professor of Statistics and Health Economics