Characterizing the Distribution of Heterogeneity : Distributed Markov Chain Monte Carlo for Bayesian Hierarchical Models
Abstract This article proposes a distributed Markov chain Monte Carlo (MCMC) algorithm for estimating Bayesian hierarchical models when the panel size is extremely large (in the millions of consumers) and the objects of interest are the distribution of heterogeneity and the parameters that characterize it. Extant distributed MCMC methods are inherently inefficient, statistically and computationally, because they require the estimation of both the consumer-level parameters and the distribution of heterogeneity. The approach we present bypasses the estimation of the consumer-level parameters. The two-stage algorithm is asymptotically exact, has excellent variance properties, retains the flexibility of a standard MCMC algorithm, and is easy to implement. The details of the algorithm depend on the form of the prior imposed on the hierarchical model. All three possibilities for the prior are considered: i) nonparametric, ii) exponential family, and iii) nonexponential family, such as a finite mixture. The first stage constructs an estimator of the posterior predictive distribution of the consumer-level parameters, which is also the distribution of heterogeneity. For a nonparametric prior, a second stage is not needed since, by definition, the common parameters that characterize the distribution of heterogeneity are already known. For the two parametric priors (exponential and nonexponential families) for which the common parameters that characterize the distribution of heterogeneity are desired, the second stage draws auxiliary variables from the posterior predictive distribution before directly drawing the common parameters. The proposed algorithm takes particular advantage of exponential family priors by first reducing the auxiliary variables to the sufficient statistics that parameterize the posterior distribution of heterogeneity before drawing the common parameters. 
Although both stages are embarrassingly parallel, the second stage is sufficiently fast that a serial implementation may be computationally tractable. By avoiding the extensive computational, memory and network resources related to drawing, storing and communicating consumer-level parameters, the algorithm dominates the single-machine benchmark algorithm in computational and statistical efficiency by several orders of magnitude.
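The abstract's remark about exponential family priors can be illustrated with a toy case. The sketch below is not the paper's algorithm: it assumes scalar consumer-level parameters, a normal distribution of heterogeneity, and a conjugate normal-inverse-gamma hyperprior, and it substitutes simulated values for the auxiliary draws from the posterior predictive distribution. All names (`SuffStats`, `summarize`, `draw_common_parameters`) and hyperparameter defaults are hypothetical. It shows only why reducing auxiliary draws to sufficient statistics makes the common-parameter draw cheap and easy to combine across machines.

```python
import random
from dataclasses import dataclass


@dataclass
class SuffStats:
    """Sufficient statistics of scalar draws under a normal model."""
    n: int = 0
    total: float = 0.0
    total_sq: float = 0.0

    def merge(self, other: "SuffStats") -> "SuffStats":
        # Sufficient statistics are additive, so shards combine by summation.
        return SuffStats(self.n + other.n,
                         self.total + other.total,
                         self.total_sq + other.total_sq)


def summarize(draws):
    """Reduce one shard of auxiliary draws to (n, sum, sum of squares)."""
    return SuffStats(len(draws), sum(draws), sum(x * x for x in draws))


def draw_common_parameters(stats, rng, m0=0.0, k0=1e-3, a0=1e-3, b0=1e-3):
    """Draw (mu, sigma2) once from the normal-inverse-gamma posterior.

    Assumed hyperprior (illustrative, not from the paper):
    sigma2 ~ InvGamma(a0, b0) and mu | sigma2 ~ N(m0, sigma2 / k0).
    """
    n = stats.n
    mean = stats.total / n
    # Sum of squared deviations; clamp tiny negative rounding errors to zero.
    ss = max(stats.total_sq - n * mean * mean, 0.0)
    k_n = k0 + n
    m_n = (k0 * m0 + stats.total) / k_n
    a_n = a0 + n / 2.0
    b_n = b0 + 0.5 * ss + 0.5 * k0 * n * (mean - m0) ** 2 / k_n
    # gammavariate takes (shape, scale), so a rate of b_n is a scale of 1 / b_n.
    sigma2 = 1.0 / rng.gammavariate(a_n, 1.0 / b_n)
    mu = rng.gauss(m_n, (sigma2 / k_n) ** 0.5)
    return mu, sigma2


# Simulated stand-in for auxiliary draws held on four separate machines.
rng = random.Random(0)
shards = [[rng.gauss(2.0, 1.5) for _ in range(5000)] for _ in range(4)]

# Each shard is summarized independently; only three numbers per shard move.
pooled = SuffStats()
for shard in shards:
    pooled = pooled.merge(summarize(shard))

mu, sigma2 = draw_common_parameters(pooled, rng)
```

In this toy setting each machine communicates three numbers regardless of how many auxiliary draws it holds, which is the kind of saving in storage and network traffic the abstract attributes to avoiding consumer-level parameters.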
Year of publication: | [2021] |
---|---|
Authors: | Bumbaca, Federico (Rico) |
Publisher: | [S.l.] : SSRN |
Subject: | Monte Carlo simulation | Theory | Statistical distribution | Bayesian inference | Markov chain |
freely available
Extent: | 1 online resource (63 p.) |
---|---|
Type of publication: | Book / Working Paper |
Language: | English |
Notes: | According to SSRN, the original version of the document was created on June 13, 2021 |
Other identifiers: | 10.2139/ssrn.3866257 [DOI] |
Classification: | C11 - Bayesian Analysis ; C23 - Models with Panel Data |
Source: | ECONIS - Online Catalogue of the ZBW |
Persistent link: https://www.econbiz.de/10013223426
Similar items by subject
- Distributed Markov Chain Monte Carlo for Bayesian Hierarchical Models
  Bumbaca, Federico, (2017)
- Ardia, David, (2009)
- Ardia, David, (2015)
Similar items by person
- Scalable target marketing : distributed Markov chain Monte Carlo for Bayesian hierarchical models
  Bumbaca, Federico, (2020)