Homework

  1. Obtain the mean and the variance for the beta-binomial distribution. Show that it tackles the overdispersion problem. Hint: use the formulas for conditional expectations and variances.
  2. Obtain the Laplace approximation for the posterior expectation of logit(mu) and log(tau) in the cancer mortality rate example (data available from the package LearnBayes).
  3. The failure time of a pump follows a two-parameter exponential distribution, f(y|b,m) = (1/b) exp(-(y-m)/b), for y >= m.
    1. Obtain the likelihood for b and m based on an i.i.d. sample of size n
    2. Consider a suitable transformation that maps the parameters b and m to the plane
    3. Consider a sample of 8 pumps, where all pumps failed at some time. The smallest failure time was 23721 minutes and the total testing time for all pumps was 15962989 minutes. Assume a sensible prior for the transformed parameters and explore the contours of the posterior distribution.
    4. Find a normal approximation to the posterior distribution of the transformed parameters
    5. Use rejection sampling and SIR to approximate the posterior distribution. Compare.
    6. Use importance sampling as well as a Laplace approximation to estimate the posterior mean and variance of the transformed parameters.
    7. Define the reliability at time t_0 as R(t_0) = exp(-(t_0 - m)/b). Describe the posterior moments and the posterior distribution of R(10^6).
  4. Problems 3.10.1, 3.10.2, 3.10.5; 4.7.1, 4.7.2
  5. Problems 5.9.7, 5.9.8, 5.9.10, 5.9.12
  6. (Turn in 4/26/19) Repeat the SAT example with:
    1. Direct sampling from the posterior
    2. Gibbs sampling from the posterior
    3. Abrams and Sansó '98 approximations for the posterior moments
  7. Problems 5.9.14, 5.9.15
  8. (Turn in 5/3/19) Write the Bayes factor, BIC, DIC and the Gelfand and Ghosh criterion to compare a model where n observations are assumed to be sampled from a Poisson distribution with a gamma prior to a model where the observations are sampled from a binomial distribution with a fixed, large number of trials and a beta prior for the probability of success.
    1. Consider the data on red-tailed hawks. Fit the data using the two different models. Notice that there is no hierarchical structure in this case, as opposed to what was assumed in the take-home part of Test 1. Ignore the route counts.
    2. Perform a prior sensitivity analysis.
    3. Present a model comparison analysis using the criteria mentioned above.
  9. For the SAT example, use the DIC to compare the models with no pooling, total pooling, and partial pooling based on a hierarchical model with unknown variance.
  10. Problems 6.7.2, 6.7.6; 7.8.4, 7.8.5
  11. Problems 8.10.11, 8.10.14
  12. For each of the examples considered in class regarding the censored and truncated weights data develop an approach based on MCMC with auxiliary variables and write the full conditionals.
  13. Consider the following data regarding the heights in inches of male students at a college. First interval: less than 66, counts 14; second interval 66 to 68, counts 30; third interval 68 to 70, counts 49; fourth interval 70 to 72, counts 70; fifth interval 72 to 74, counts 33; sixth interval greater than 74, counts 15. Assume that the height of students is normally distributed, and assume a non-informative prior for the parameters of the normal.
    1. Use Metropolis-Hastings to estimate the parameters of the normal using a multinomial likelihood.
    2. Introduce latent variables and use Gibbs sampling to do the estimation. Compare.
  14. Find the marginal distribution of the regression coefficients in a linear normal model.
  15. Show that the posterior predictive distribution of a new observation corresponding to a regression model is a Student-t distribution, as indicated in the slides.
  16. Consider the data available as "birthweight" from the package "LearnBayes". Fit a linear regression that considers age and gender as explanatory variables for birth weight. Describe the posterior distribution of the regression parameters using a sample-based approach. Explore the posterior predictive distribution for the birth weight of children in the following four cases: (a) 36 week female/male; (b) 40 week female/male. Compare.
  17. Consider a conditional linear model with a design matrix and an error covariance matrix specified by a set of unknown parameters. Find an explicit expression for the posterior distribution of those parameters.
  18. Show that the posterior distribution of the regression parameters using an informative prior is the same when the quadratic forms are completed directly and when the prior is considered as additional data.
  19. Consider a normal linear model where the errors have a covariance matrix that is a multiple of the identity. Is it possible to obtain a conjugate prior (informative) for the regression coefficients and the variance? If so, find the posterior distribution.
  20. Obtain the expressions for the marginal distributions of the data and the Bayes factors for linear models using g-priors.
  21. Problems 14.10.1 and 14.10.10
  22. (Turn in 5/29/19) Obtain the full conditionals for a Bayesian Lasso regression model assuming that \lambda, the scale of the penalization term, is randomly distributed according to a gamma distribution (see section 20.2).
  23. (Turn in 6/5/19) Perform a re-analysis of the cancer mortality data using Gibbs sampling introducing appropriate latent variables.
  24. Show that a Negative-Binomial likelihood tackles the overdispersion problem in models for Poisson count data.
  25. Write the full conditionals for a robust regression where the errors are assumed to follow a Student-t distribution with known degrees of freedom.
  26. Consider data corresponding to a mixture of M exponential densities. Consider appropriate conjugate priors for the parameters of the model. Find the full conditionals needed to explore the posterior distribution using a Gibbs sampler.
  27. Problems 13.5, 13.6, 13.9
  28. Use the EM algorithm to estimate the mode of the posterior distribution of the location and scale parameters in a model where the observations follow a Student-t distribution with fixed degrees of freedom and the prior is non-informative.
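
As a possible starting point for item 13, the grouped heights can be encoded and the multinomial log-likelihood evaluated as sketched below. This is only an illustration under stated assumptions, not the intended solution: the function names (normal_cdf, bin_probabilities, log_posterior) are hypothetical, the parameterization (mu, log sigma) is one convenient choice, and the flat prior on (mu, log sigma) is an assumed reading of "non-informative". The interval limits and counts are those listed in item 13.

```python
import math

# Grouped heights from item 13: interior cut points (inches) and counts per interval.
CUTS = [66.0, 68.0, 70.0, 72.0, 74.0]
COUNTS = [14, 30, 49, 70, 33, 15]


def normal_cdf(x, mu, sigma):
    """CDF of a Normal(mu, sigma^2) evaluated at x, via the error function."""
    return 0.5 * (1.0 + math.erf((x - mu) / (sigma * math.sqrt(2.0))))


def bin_probabilities(mu, sigma):
    """Probability that a Normal(mu, sigma^2) height falls in each of the six intervals."""
    cdf = [0.0] + [normal_cdf(c, mu, sigma) for c in CUTS] + [1.0]
    return [hi - lo for lo, hi in zip(cdf[:-1], cdf[1:])]


def log_posterior(mu, log_sigma):
    """Unnormalized log posterior of (mu, log sigma).

    Assumption: flat prior on (mu, log sigma), so the log posterior equals the
    multinomial log-likelihood up to an additive constant.
    """
    sigma = math.exp(log_sigma)
    probs = bin_probabilities(mu, sigma)
    if min(probs) <= 0.0:
        return float("-inf")  # numerically empty interval: treat as impossible
    return sum(n * math.log(p) for n, p in zip(COUNTS, probs))
```

A function of this form can be passed to a random-walk Metropolis-Hastings sampler in (mu, log sigma); working on the log scale for sigma keeps proposals unconstrained.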