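Should have asked about the ICC (intraclass correlation coefficient) as
a measure of rater agreement, since it is easy to calculate from the
variance components in the lmer output.  [Although it is less obvious
for models with multiple random effects, where one has to decide which
variance components go in the numerator and the denominator.]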
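
A minimal sketch of the calculation, assuming the variance components
have already been read off the "Random effects" table of an lmer
summary. The numbers, the grouping names ("subject", "rater") and the
helper `icc` are hypothetical, for illustration only. With one random
effect the ICC is the between-group variance over the total variance;
with several, the answer depends on which components are kept in the
denominator (e.g. keeping rater variance gives an absolute-agreement
style ICC, dropping it gives a consistency style ICC).

```python
def icc(variances, target, exclude=()):
    """Share of variance attributable to `target`.

    variances: dict mapping component name -> variance estimate
    target:    component placed in the numerator
    exclude:   components left out of the denominator
    """
    total = sum(v for name, v in variances.items() if name not in exclude)
    if total <= 0:
        raise ValueError("total variance must be positive")
    return variances[target] / total


# Hypothetical variance components (not from any real fit).
one_re = {"subject": 4.0, "Residual": 1.0}
two_re = {"subject": 4.0, "rater": 1.0, "Residual": 3.0}

# Single random effect: subject variance / (subject + residual).
icc_single = icc(one_re, "subject")

# Two random effects, rater variance kept in the denominator.
icc_agreement = icc(two_re, "subject")

# Two random effects, rater variance dropped from the denominator.
icc_consistency = icc(two_re, "subject", exclude=("rater",))

print(icc_single, icc_agreement, icc_consistency)
```

Which of the two-random-effect versions is appropriate depends on
whether systematic differences between raters should count as
disagreement.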

Some notes on ICC are in a separate folder here.