Choosing the prior

In this case, I don't need to say anything about the error covariance matrices, so I'm going to use a diffuse prior there. It is going to be non-informative. On top of that, I don't want to say anything about the demographics, so again I'm going to use a diffuse prior on those. What is going to be important is to say something about the random variation across the stores. Remember that I've got 192 parameters, and each of these 192 parameters is going to be dropped into this hierarchical distribution. The problem I run into is that I've only got 83 stores, so if I don't have an informative prior, I'm not going to have a well-defined Wishart distribution. For the Wishart to be defined, I've got to have an informative prior at this stage.

This creates a problem, because I don't know exactly what this prior should be. There are going to be all these scaling differences: for example, does the price elasticity for Minute Maid vary more than the price elasticity for Tropicana? So I'm going to use an empirical technique to set this prior. I'm going to postulate that, in the prior, the parameters are independent of one another. Then I'm going to go back and compute the least squares estimates for each store, take the variance of those estimates across the stores, and scale those variances by k. There is one more quantity I want to include in there as well, and it's easier to think about all of this in terms of the expectation of my prior. The expectation of my prior is down here, and it is essentially going to be this matrix. So the k's are telling me what the expectation is. If k is 1, my expectation essentially puts me at the same place I would start from with an empirical prior.
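
As a rough illustration of the empirical step described above, the following Python sketch fits least squares separately for each store, takes the variance of each coefficient across stores, and scales the resulting diagonal matrix by k. It is a toy under stated assumptions, not the lecture's actual computation: the data are simulated, only 3 parameters are used instead of 192, the function name `empirical_prior_scale` is hypothetical, and the additional quantity mentioned in the talk is left out.

```python
import numpy as np

def empirical_prior_scale(X_by_store, y_by_store, k=1.0):
    """Return (k * diagonal matrix of cross-store coefficient variances, per-store OLS estimates)."""
    # One least squares fit per store; rows of `betas` are stores, columns are parameters.
    betas = np.array([
        np.linalg.lstsq(X, y, rcond=None)[0]
        for X, y in zip(X_by_store, y_by_store)
    ])
    # Independence across parameters is assumed, so only the variances are kept.
    coef_var = betas.var(axis=0, ddof=1)
    return k * np.diag(coef_var), betas

# Simulated toy data (assumption): 83 stores, 120 weeks each, 3 parameters.
rng = np.random.default_rng(0)
n_stores, n_weeks, n_params = 83, 120, 3
true_mean = np.array([1.0, -2.0, 0.5])
X_by_store, y_by_store = [], []
for _ in range(n_stores):
    beta_s = true_mean + rng.normal(scale=[0.1, 0.5, 0.2])  # store-level variation
    X = rng.normal(size=(n_weeks, n_params))
    y = X @ beta_s + rng.normal(scale=0.3, size=n_weeks)
    X_by_store.append(X)
    y_by_store.append(y)

V_k1, betas = empirical_prior_scale(X_by_store, y_by_store, k=1.0)
V_half, _ = empirical_prior_scale(X_by_store, y_by_store, k=0.5)
print(np.diag(V_k1))
```

Because each parameter gets its own variance, a coefficient that varies a lot across stores receives a looser prior than one that barely varies, which is one way to handle the scaling differences mentioned above.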

For more detail on specification of the prior click here. Also, see here.

Implications of this Prior


The reason I chose this parameterization is that I'd like something that can put me close to the least squares estimates, close to the pooled estimates, or somewhere between the pooled and the least squares estimates. If k = 0 and the demographic coefficient is 0, then my prior says that there are no demographic effects, and it essentially mimics a pooled model: there's no cross-store variation, there are no demographic effects, and all the stores have the same parameters. The way to think about this is that each store's individual parameter is equal to this term, plus the demographic effects, plus this random variable.

The next case says that we've got direct demographic interactions. Here k = 0 but the demographic coefficient is not equal to zero, so what I'm saying is that there are no random effects, but the demographic effects are nonzero. Since I haven't put an informative prior on the demographic coefficient, I'm not imposing anything; I'm going to let the data determine what the results are here.

The next question is where I should set k. Should k be small, or should k be large? If I set k large, then the estimates are going to converge to the individual store models, so it's essentially like saying that each of the stores is unrelated to all the others. Another option is to set k = 1, which is essentially an empirical Bayes prior. Or I could set k to a small value, say some fraction of the empirical Bayes prior. So how do I determine what a good k would be? In a pure approach I would just say k is this number or, if I'm not sure about it, I would pick some kind of prior on k. In this case, I'm interested in what this k should be, so what I'm going to do is go back and allow k to vary.
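
The role of k can be seen in a much simpler setting than the one in the talk. The Python sketch below uses a single coefficient with known variances (a scalar normal-normal model, which is an assumption made here for illustration, not the multivariate Wishart setup of the lecture). The function name `shrink` and all numbers are hypothetical. With k = 0 the estimate sits at the pooled value, and as k grows it moves toward the individual store estimate.

```python
def shrink(b_store, b_pooled, sampling_var, prior_var, k):
    """Posterior mean of one store coefficient under a normal prior centered
    at the pooled estimate with variance k * prior_var (variances treated as known)."""
    prior_spread = k * prior_var
    # Weight on the store's own least squares estimate.
    weight = prior_spread / (prior_spread + sampling_var)
    return weight * b_store + (1.0 - weight) * b_pooled

# Hypothetical numbers: a store price elasticity of -2.5 against a pooled -2.0.
b_store, b_pooled = -2.5, -2.0
sampling_var, prior_var = 0.04, 0.25

at_zero = shrink(b_store, b_pooled, sampling_var, prior_var, k=0.0)
at_one = shrink(b_store, b_pooled, sampling_var, prior_var, k=1.0)
at_large = shrink(b_store, b_pooled, sampling_var, prior_var, k=1e9)
for k, value in [(0.0, at_zero), (1.0, at_one), (1e9, at_large)]:
    print(k, value)
```

Intermediate values of k give estimates between the pooled and the store-level values, which is the range the parameterization is meant to cover.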

How we choose our final hyperparameter is given here.
