R Package: rube


Overview

The rube package is a Really Useful WinBUGS and JAGS Enhancer, which greatly facilitates the efficient and organized writing and running of Bayesian analyses using WinBUGS or JAGS from R.

The current rube package requires R version 3.4.1 or above (the older versions work with 2.15.0 or above), so go to www.r-project.org to download a newer release if you are using an old version of R. The current version of rube is rube_0.3-11.
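A quick way to check whether the running R session meets the version requirement above is sketched here. It uses only base R; the names minimumR and rOK are illustrative, not part of rube.

```r
# Compare the running R version against the minimum stated above (3.4.1).
minimumR <- "3.4.1"
rOK <- getRversion() >= minimumR

if (!rOK) {
  message("R ", as.character(getRversion()), " is older than ", minimumR,
          "; update R before installing the current rube.")
}
```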

The package in Windows zip format can be installed from R using the menus: Packages => Install package(s) from local zip files....
The package in Mac tar.gz format can be installed from R. First install package "stringr". Then install rube_0.3-11.tar.gz using the Installation Manager menu by changing "CRAN(binaries)" to "Local Binary Package", and then selecting the tar.gz file. If you get the message Error in install.packages: type == "both" cannot be used with 'repos=NULL', then run the install.packages() shown in your R console, but add , type="source" after repos=NULL.
The package in Unix tar.gz format can be installed from R using the command: install.packages("c:/rube_0.3-11.tar.gz",type="source",repos=NULL), substituting the path to your downloaded file.

These procedures will install other needed packages, including stringr and R2WinBUGS. In every case, you must also use library(rube) to load it in each session.
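The command-line route can be wrapped in a small helper, sketched below under stated assumptions. installRube() is a hypothetical name, not a function in rube, and the default file name assumes the tarball sits in the current working directory. The helper checks for stringr first because the Mac instructions above name it as a prerequisite.

```r
# Hypothetical helper: install rube from a downloaded source tarball.
installRube <- function(tarball = "rube_0.3-11.tar.gz") {
  if (!file.exists(tarball)) {
    stop("Cannot find ", tarball, "; download it first or give the full path.")
  }
  # stringr is named above as a prerequisite; install it from CRAN if absent.
  if (!requireNamespace("stringr", quietly = TRUE)) {
    install.packages("stringr")
  }
  # repos = NULL tells R to install from the local file, not a repository.
  install.packages(tarball, repos = NULL, type = "source")
}
```

After a successful call such as installRube(), load the package with library(rube) as usual.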

In case you are interested, the full code is at code directory.

Manual

The rube manual is in rubeManual.pdf. Please let me know of bugs, confusing input or output, deficiencies in documentation, and/or whatever you have on your wish-list.

Windows Vista/7/Macintosh notes

As mentioned in the WinBUGS documentation, Windows Vista and Windows 7 do not allow writing to the "Program Files" folder, but WinBUGS tries to do this. Therefore on these operating systems, you should install WinBUGS into a folder such as "c:\\WinBUGS14". If you do this you will need to explicitly tell rube() and getBugsExample() where you installed WinBUGS. Alternatively, you may be able to get everything to work if you start R with "Run as Administrator". See the pdf documentation for details.
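One way to supply the location is the BUGSDIR environment variable, which the version history (0.2-10) says rube() checks. A minimal sketch, assuming WinBUGS was installed in the example folder named above:

```r
# Point rube() at a WinBUGS install outside "Program Files".
# The folder is the example from the text; substitute your own location.
Sys.setenv(BUGSDIR = "C:\\WinBUGS14")

# Confirm the value that later rube() calls will see in this session.
Sys.getenv("BUGSDIR")
```

The setting lasts only for the current R session. The version history also mentions a bugs.directory argument for rube() and getBugsExample(), which may be the more explicit choice in scripts.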

Rube runs well on a Macintosh, but you must use jags, not WinBUGS. As of the beginning of July 2014, the creators of jags have not released a Mavericks version, so you must run the Snow Leopard version of R on your Mavericks machine.

Note for Macintosh Yosemite:
If you get an error message like:
jags failed: call: dyn.load(file, DLLpath = DLLpath, ...) error: unable to load shared object '/Library/Frameworks/R.framework/Versions/3.1/Resources/library/tcltk/libs/tcltk.so': dlopen(/Library/Frameworks/R.framework/Versions/3.1/Resources/library/tcltk/libs/tcltk.so, 10): Library not loaded: /opt/X11/lib/libX11.6.dylib Referenced from: /Library/Frameworks/R.framework/Versions/3.1/Resources/library/tcltk/libs/tcltk.so Reason: image not found
then you have run into the fact that Yosemite changes where X11 and Quartz are installed relative to earlier OS X versions. The quick fix is to add

, progress.bar="none"

to all of your rube() calls.
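If editing every call is tedious, a thin wrapper can add the argument for you. This is only a sketch: rubeNoBar() is a hypothetical name, and it assumes rube() is available from library(rube).

```r
# Hypothetical wrapper: forwards all arguments to rube() and always turns
# the jags progress bar off. rube() itself comes from library(rube).
rubeNoBar <- function(...) {
  rube(..., progress.bar = "none")
}
```

Call rubeNoBar() exactly as you would call rube(), but do not pass progress.bar yourself, since R would then report the argument as matched twice.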

Instructions for running rube on a Macintosh using a Windows emulator or CMU's Virtual Andrew are included in the main rube documentation file.

Rube runs fine from within R-Studio on either a PC or a Mac. The R-Studio limitation is that interactive graphics like p3() cannot be "zoomed".

Examples

A good way to learn/play is to use the rube function getBugsExample() to run the built-in WinBUGS examples. This function reads an example (by default from the standard location) from its .odc file, and returns a list object with the model, data, and initializations all in a form ready for rube to use. If multiple models, data sets, or initializations are present, they are all loaded. If you want to put the model in a text file you can use code like this:

write(ali$model1, file="aliModel.txt")

Critically, the getBugsExample() function automatically corrects for the differences in array ordering between the examples and how R works.
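To see why the ordering matters: BUGS data files list array values row by row (the last index varies fastest), while R fills arrays column by column. The base-R sketch below illustrates the kind of correction involved. It is not rube's own code, and the six numbers are made up for the illustration.

```r
# Six values as they would be listed in a BUGS data file for a 2 x 3 array:
# BUGS reads them row by row (last index varies fastest).
bugsData <- c(1, 2, 3, 4, 5, 6)
bugsDim  <- c(2, 3)

# Naive use in R fills column by column, scrambling the layout.
naive <- array(bugsData, dim = bugsDim)

# Fill with the dimensions reversed, then reverse the axes back.
corrected <- aperm(array(bugsData, dim = rev(bugsDim)))

naive[1, ]      # 1 3 5
corrected[1, ]  # 1 2 3
```

The same reverse-then-aperm idea carries over to 3-way arrays such as X in the Aligators example.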

Here are examples of how to use rube based on the "Aligators" example that comes with WinBUGS:

### Find any example as a ".odc" file in the WinBUGS example directory.
### Load the model, data, and initializations from that file:
library(rube)
ali <- getBugsExample("Aligators")
names(ali)  #  this example has only one model and one data set and one initialization.

### Show a summary of the model:
with(ali, summary(rube(model=model1, data=data1, inits=inits1)))
# No problems were found, and a nice summary of the model is produced:
## Rube Summary:
## For variables: i, j, k 
## 
## Assignments (logicals):
## alpha[], beta[,], gamma[,], mu[,,], b[,], g[,] 
## 
## Constants:
##   Size Min Max Mean SD NAs
## I    1            4      0
## J    1            2      0
## K    1            5      0
## 
## Data:
##   Distr  Size NAs Parameters (mean,sd) Initial Value(s) [Range] Flags
## X dpois 4x2x5   0            mu[i,j,k]  5.475 +/- 5.588 [0, 23]      
## 
## Stochastics:
##        Distr Size NAs  Parameters (mean,sd) Initial Value(s) [Range] Flags
## alpha  dnorm    5   1 0, 1e-05 (0, 316.228)           0 +/- 0 [0, 0]      
## beta   dnorm  4x5   8 0, 1e-05 (0, 316.228)           0 +/- 0 [0, 0]      
## gamma  dnorm  2x5   6 0, 1e-05 (0, 316.228)           0 +/- 0 [0, 0]      
## lambda dnorm  4x2   0 0, 1e-05 (0, 316.228)           0 +/- 0 [0, 0]      
## 
## No problems detected!


### Run the model through bugs by specifying parameters-to-save.
result = with(ali, rube(model1, data1, inits1, 
                        parameters.to.save=c("alpha","beta"), n.chains=1))
summary(result)
## Rube Summary:
## Run at 2010-08-10 08:04 and taking 2.98 secs 
## Based on 1000 kept iterations (n.thin=1) after burn-in of 1000 iterations
##             mean    sd MCMCerr   2.5%   25%   50%    75%    97.5%
## alpha[2]  -1.847 0.509  0.0690 -2.885 -2.14 -1.82 -1.496 -0.94746
## alpha[3]  -2.650 0.811  0.1017 -4.779 -3.01 -2.56 -2.135 -1.36475
## alpha[4]  -2.104 0.554  0.0341 -3.281 -2.47 -2.09 -1.708 -1.05980
## alpha[5]  -0.772 0.377  0.0327 -1.507 -1.01 -0.78 -0.528 -0.00548
## beta[2,2]  2.754 0.605  0.0776  1.688  2.33  2.68  3.139  3.98512
## beta[3,2]  3.012 0.618  0.0795  1.972  2.54  2.94  3.427  4.27110
## beta[4,2]  1.759 0.596  0.0732  0.679  1.34  1.72  2.126  3.02617
## 
## DIC = 191.816

p3(result) # produces graphical summaries


### bugs() wants a function for the initializations if you are 
### running multiple chains:
ali$inits1
# $alpha
# [1] NA  0  0  0  0
# ...
simpleInit = function() {
  init = ali$inits1                 # start from the example's own initial values
  init$alpha[2:5] = rnorm(4, 0, 1)  # randomize the free alpha elements for each chain
  return(init)
}
result3 = with(ali, rube(model1, data1, simpleInit, 
                         c("alpha","beta","gamma"), 
                         n.burn=1000, n.iter=9000))
p3(result3)

### Note that rube helps you document your models:
result3$startTime
result3$runTime
result3$model

### Demonstrate problem detection:
write(ali$model1, "aliModel.txt")
### Now, screw something(s) up in aliModel.txt, e.g.
### change one alpha[k] to alpha[k,m] and change a ")" to "("
### and misspell a function or density.
summary(rube("aliModel.txt", ali$data1, ali$inits1))
## You'll get something like this:
## Rube Summary:
## 
## No assignments.
## 
## No constants.
## 
## No data.
## 
## Stochastics:
## None
## 
## Problems:
## At line 12 in $model: Distribution must be ~dfun(...) or ~dfun(...)I(...). 
## beta[i, k] ~ dnorm(0, 0.00001( 
## At line 28 in $model: rpois is not on the list of valid distributions 
## X[i, j, k] ~ rpois(mu[i, j, k])

### Fixing errors may take multiple passes.  For my version of the above
### example, fixing the two errors then re-running rube() gives:
## Problems:
## Variable(s) with inconsistent dimensions: alpha

Two more examples are in basicExample.R and Rasch.R.

A more complex example on a 4-parameter-logistic curve fit with random effects and covariates is at http://www.stat.cmu.edu/~hseltman/TrajBUGS/

Version history

Note that the R command library(help=rube) can be used to show what version of rube you are running.
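For use inside a script, the version can also be read programmatically with the standard utils function packageVersion(). A small sketch, where rubeVersion is just an illustrative variable name:

```r
# Report the installed rube version, or NA if rube is not installed.
rubeVersion <- tryCatch(
  as.character(utils::packageVersion("rube")),
  error = function(e) NA_character_
)
rubeVersion
```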

Version 0.2-4, 17 June 2010

p3(), the interactive posterior viewer function, now shows the range of elements for parameters with multiple elements. This appears on the button for each parameter.
Whenever possible, p3() shows the distribution of a parameter in the title.
p3() now shows a summary of all elements of a parameter (unless it is large) first, then you can use Next/Previous to view the individual elements in detail. The parameter nSamples, which defaults to 30, determines what "large" means. For parameters with greater than nSamples elements a random sample of nSamples elements is shown, and clicking the button for that parameter shows a new sample.
The overview includes the 95% posterior interval on the left, the lag interval for non-trivial autocorrelations in the middle, and the Gelman Rhat statistic (for all parameter elements, not just a sample) on the right. Rhat, which is missing if only a single chain is run, uses color codes to show values that suggest non-convergence of the chains in orange (1-1.2) or red (>2).

Version 0.2-5, 19 June 2010

Fixed problem in syntax parser that gave "dim(X) must have a positive length" error for some model code.
Added no-action area in p3() menu, so that clicking very close to the edge between two buttons results in no action.
Changed p3() menu to handle larger numbers of parameters better.

Version 0.2-6, 19 June 2010

Added the function priorExplore() to interactively explore prior distributions. This is a working initial version with nine common distributions. To use the function just type priorExplore().

Version 0.2-7, 1 July 2010

bugsCheck() now splits
bugsCheck() now counts NAs (including non-finite values)
print.bugsCheck() and summary.bugsCheck() now wrap long lists of variables for better readability
If parameters.to.save has duplicates (which confuses p3()), duplicates are removed and a warning is given (because this is probably a user error).
Fixed rube()'s call to bugs() to pass bugs.directory.
bugsCheck() now uses the first initialization if several are supplied
Button text now is smaller for more buttons.
p3() now correctly handles the absence of Rhat when its input is a component of a sims.array component of a bugs object.
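The duplicate handling for parameters.to.save described in this version can be pictured with a few lines of base R. This is a sketch of the described behavior, not rube's actual code; dropDuplicateParams() is a hypothetical name.

```r
# Hypothetical helper mirroring the behavior described for parameters.to.save:
# drop repeated names and warn, since a repeat is probably a user error.
dropDuplicateParams <- function(params) {
  dups <- unique(params[duplicated(params)])
  if (length(dups) > 0) {
    warning("Duplicate parameter(s) removed: ", paste(dups, collapse = ", "))
  }
  unique(params)
}

suppressWarnings(dropDuplicateParams(c("alpha", "beta", "alpha")))
# "alpha" "beta"
```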

Version 0.2-8, 9 July 2010

rube() now always returns a "rube" object. This can be examined using the print() or summary() methods (which are currently equivalent). When the bugs() run is successful, these methods show the MCMC results. This improves on the default view of a bugs() object in several ways. There is a column for the MCMC error, the number of digits does not default to a very low number (as does print.bugs), and there is a "limit" parameter (with a default of 10) which limits the number of parameters shown for vector parameters. When bugs() fails (or if you specify check=TRUE), examining the rube object gives the "bugsCheck" results summarizing the model, data, and initializations.
When rube() sees that bugs() has failed, it shows you the bugs() error message, so you should rarely need to use debug=TRUE.
The summary() of a rube object when bugs() was not run (or failed) now shows the model lines corresponding to any errors on specific lines.
The bugsCheck() reporting is more robust to various dumb errors including no stochastic nodes, and handles NAs in a clearer way.
Use of "^" now explicitly reminds you to use pow() instead.
A problem where rube() allowed WinBUGS to see scientific notation forms that it does not understand was fixed.
In p3(), the params= option now allows numeric selection as well as a character vector.
Rube is now fully flexible in understanding the form of your inits() function. You can have any or none of "data", "extras", and "cases" as parameters (in any order). Passing the "cases" string vector simplifies calling rube while allowing your inits() function to work differently for different model cases.
The showDefaults() function now allows a cases= argument. The default of cases=NULL shows all defaults. You can specify specific cases or use cases="" to see the defaults for the default model.
IFCASE() now works even if there are spaces between IFCASE and ().
Rube gives an error message rather than trying to run bugs() when you supply an "inits" list of a length that doesn't match n.chains.
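As an illustration of the flexible inits() form described in this version, here is a sketch of an initialization function that takes only the "cases" argument. Everything specific in it is an assumption made for the example: the case name "hetero" and the node names mu, sigma, and tau are hypothetical and do not come from any rube model.

```r
# Hypothetical inits function that adapts to the active model cases.
# "hetero" is an illustrative case name; mu, sigma, tau are illustrative nodes.
myInits <- function(cases) {
  init <- list(mu = rnorm(1), sigma = runif(1, 0.5, 2))
  # Only initialize tau when the case that defines it is switched on.
  if ("hetero" %in% cases) {
    init$tau <- runif(1, 0.5, 2)
  }
  init
}

names(myInits("hetero"))      # "mu" "sigma" "tau"
names(myInits(character(0)))  # "mu" "sigma"
```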

Version 0.2-9, 12 July 2010

The lists of automatically culled data, initializations, and saved parameters are now wrapped for easier reading.
Culling of initializations is now stronger (logical nodes are culled) to allow the bugs()$last.value element to be used as an initialization.
print.rube() and summary.rube() now default to digits=3
A problem where multiple initializations with a function caused an error is now fixed.

Version 0.2-10, 16 July 2010

Added a compare() function that takes a list of two or more rube() results as its first argument. The posteriors are compared graphically.
Numeric expressions, e.g., y[i]~dnorm(log(2),pow(30,-2)), are now allowed as parameters to distributions. (The numeric value will be passed to bugs(), which does not allow this construction.)
rube() will cull variables from parameters.to.save if they are pure data (which would otherwise cause bugs() to fail with a misleading message).
rube() now checks the environment variable "BUGSDIR" to set the bugs.directory. This is particularly useful for Vista, where you will need to install WinBUGS in a directory other than "Program Files". You can set it in two ways:
1. Use the Windows "System" Control Panel and select "Advanced", then "Environment Variables". Under "User variables" click "New", then enter "BUGSDIR" as the name and the directory location as the value. Click OK twice to finish.
2. In R, use something like Sys.setenv(BUGSDIR="C:\\WINBUGS14"). Even better, use .First=function()Sys.setenv(BUGSDIR="C:\\WINBUGS14") to make this run automatically each time R starts (in this workspace).
The default for checkModel in rube() is now "always". In a large simulation study you may want to change this to save a little time.
print.rube() and summary.rube() now allow application of the 'limit=' option to multidimensional arrays. The default is c(10,2) which shows var[1] through var[10] and var[1,1] through var[10,2], but suppresses additional elements. (The last element in the 'limit=' option is repeated if needed.)

Version 0.2-11, 30 July 2010

Reporting is improved in many ways, including separating constants, data, and other stochastics.
Parameters that are symbols rather than numbers show up in the reports.
The new compare() function takes a list of rube() objects and graphically compares them.
The p3() function has a new "drop" parameter for use when the burn-in was insufficient. It indicates the number of (kept, not dropped) iterations to drop from the beginning of the plots.
The p3() function now has "intelligent parameter matching". If you are looking at a multi-valued parameter with more than (a default of) 20 elements, you see a random sample of 20 of the elements, and clicking on the parameter button gives another random sample of size 20. The intelligent feature is that if you click on another parameter with the same number of elements, the same subset is re-used so that you can see corresponding elements across multiple parameters.
Added warnings to rube(), so that when you use a form like
```
result = rube(model, data, inits, parameters)
```
you don't need to examine the result to know that it failed.
Missing censor limits are now explicitly filled in.
Now rube() gives more detailed information about failures, including showing you the WinBUGS error message when possible.
The getBugsExample() function is now much more robust and will handle the weird examples that don't use semicolons between multiple statements on the same line. (Although undocumented, WinBUGS actually allows this.)
Now rube() allows a data.frame as its 'data' argument by making vectors from each column. An 'N' data value is automatically added.
Now rube() doesn't bother to try to run WinBUGS if n.thinby<1.
Fixed rube() problem with use of predefined variables such as "c" in bugs models.
Fixed getBugsExample() to correctly handle 3-way arrays.
Fixed rube() problem with some forms of line continuation syntax.
Changed the behavior of summary.bugsCheck() and summary.rube() to not print anything, as is standard practice, and added corresponding print functions.
Fixed rube() problems with funny spacing in code.
For consistency sake, changed the name of the "bugs.path" argument in getBugsExample() to "bugs.directory" to match rube().
Fixed a bug with multiple valued parameters and a single chain in p3() (on 8/11).
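The data.frame support added in this version can be pictured as follows. This is a base-R sketch, not rube's code: frameToBugsData() is a hypothetical name, and taking N to be the number of rows is an assumption, since the text says only that an 'N' value is added.

```r
# Hypothetical helper: one list element per data.frame column, plus N.
# Assumption: N is the number of rows; the text only says an 'N' value is added.
frameToBugsData <- function(dtf) {
  c(as.list(dtf), list(N = nrow(dtf)))
}

dtf <- data.frame(y = c(2.1, 3.4, 1.9), x = c(0, 1, 1))
str(frameToBugsData(dtf))
```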

Version 0.2-12, 13 August 2010

An improved version of p3() now greys out unavailable menu buttons and uses "-10/-1/+1/+10" buttons exclusively for moving into and within multiple valued parameters and "Previous/Next" for moving between parameters.
In bugs() a high 'bin' parameter can increase the total number of iterations, possibly resulting in runs much longer than intended. Rube now prevents this.
A merge() function (technically a merge.rube() method) allows merging of rube results. The combined object behaves as if chains from different runs were in the same run.
rube() now returns a rube object nearly always. For early failures, there is an R warning and the rube problem list indicates "model checking as unsuccessful".

Version 0.2-13, 20 August 2010

The rube() function now sets a different random number seed for WinBUGS each time you run it. The seed is stored in the $bugs.seed component of the result, so you can use bugs.seed=myPriorModel$bugs.seed if you really want to re-generate identical results. This reverses the bugs() default of bugs.seed=NULL which uses the same random number sequences for every WinBUGS run.
The compare() function is improved to better handle comparing results collected under differing conditions.
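Re-using a stored seed, as described for this version, might look like the sketch below. rerunWithSameSeed() is a hypothetical wrapper, and it assumes rube() is available from library(rube) and that the earlier result carries the $bugs.seed component mentioned above.

```r
# Hypothetical wrapper: rerun a model with the seed stored in an earlier result.
# rube() comes from library(rube); 'earlier' is a previous rube() result.
rerunWithSameSeed <- function(earlier, ...) {
  rube(..., bugs.seed = earlier$bugs.seed)
}
```

Pass the same model, data, and initializations as in the earlier run after the first argument; otherwise identical results should not be expected.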

Version 0.2-14, 8 September 2010

p3() now uses varLists to show related parameters in a group rather than individually. A viewGroups= argument allows similar grouping to be specified manually.
rube() now gives an error for an LC() formula with an intercept but no prefix or with neither prefix nor suffix.
rube() allows you to skip specification of a varList, and defaults to an intercept only.
priPost() now has a "sdFromGammaPrecision" distribution, which shows the distribution of a posterior standard deviation compared to its prior when specified as a gamma distribution on the precision scale.
The error message for a missing ENDCASE now shows the complete corresponding IFCASE().
Miscellaneous improvements and fixes: priPost() handles HyperGamma better, commented FOR() statements are no longer expanded, unbalanced parentheses are always an explicit error, the compare() function handles a wider variety of params= values, and the mechanism to handle expressions for substitution variables is more robust.

Version 0.2-15, 9 November 2010

Fixed a bug in p3() relating to showing multiple chains.
Added only= to p3() to display a single trace of a multi-element parameter.
Fixed a bug in compare().

Version 0.2-16, 7 December 2010

Added distributions and functions from http://www.winbugs-development.org.uk/ to the lists of allowed distributions and functions.

Version 0.2-17, 27 November 2012

Added a generic function model() which can be applied to a model text, or a rube result to view the model. The 'indent' parameter can be set to tab("\t"), or any other fixed set of characters. The code lines are numbered unless 'lineNumbers' is set to FALSE.
Added code to allow rube()'s 'parameters.to.save' argument to take the value "*" which causes all parameters to be saved. If a vector starting with "*" is used, the rest of the vector is elements to be dropped from the parameters.to.save.
Added parameter 'add' to p3(). This is a named list of character strings or expressions that define new parameters based on old ones. The posterior distributions of the new parameters are displayed without having to slow down WinBUGS to compute and store them. Currently vector parameters are not allowed, and there is not yet a way to view the summary() of the new parameters.
When the 'wd' parameter is unspecified, rube() now first checks the environmental variable "BUGSWD". This solves a problem where getwd() is in a form that bugs() does not understand causing WinBUGS to hang. Just specify a "normal" working directory in this environmental variable, e.g., using Sys.setenv() to make a setting for your whole session.
The compare() function now takes its input directly as rube objects, rather than as a list.
Added augment() which allows adding in new parameters based on formulas using standard functions plus old parameters. E.g. augment(myRubeRun, add=list(sdRI="sqrt(varRI)", pHat="1/(1+exp(-LO))")) creates a new variable called 'sdRI' (assuming that 'varRI' is a scalar parameter that has been saved in the rube output called 'myRubeRun'). And it creates a set of 'pHat' values assuming that 'LO' is a vector variable that has been saved in the output.
The sims.array, sims.matrix, sims.list, summary, mean, sd, median, and MCMCerr components of the object are all updated, but the time consuming construction of all except the first can be suppressed by using 'minimal=TRUE'. The MCMCerr and n.eff components of the output are computed by the 'coda' package, and disagree with the R2WinBUGS results (and sometimes are NA).
The shrink(), merge(), and stitch() functions have all been extensively revised, mainly to provide the $summary component of the new object which was not previously available. The merge() and stitch() functions no longer take a list of rube objects as their input, instead two separate rube objects should be used.
All three functions now use two separate functions (makeStats and makeMatList) to recompute results after constructing the new $sims.array. All three have a 'minimal=FALSE' argument which can be set to TRUE to suppress the lengthier calculations if you only want to plot the new results. For all three the MCMCerr and n.eff components of the output are computed by the 'coda' package, and disagree with the R2WinBUGS results (and sometimes are NA).
The shrink() function has a new parameter, 'dropChains', which can be used to discard some chains of a multi-chain run.
The stitch() function gives appropriate errors and warnings if it can detect an inappropriate stitch, e.g., different numbers of chains, different thinning, or burn-in for the second run. The print.summary.rube() function now shows both start and run times for stitched objects.
Changed rube() to set the bugs() 'save.history' parameter to FALSE when 'DIC' and 'debug' are FALSE. This saves time by not telling WinBUGS to make the plots that flash by when the user can't really study them. (Due to a bug in R2WinBUGS:::bugs.log(), this trick does not work when DIC=TRUE.)
Changed rube() to use the new makeData() function which allows a wider variety of data input forms, including mixing data.frames with other list elements (the data.frame columns become separate list elements). A new feature is that if 'data' is a data.frame, in addition to automatically adding "N" to the data list, any numeric attributes (e.g., attr(myDtf, "NS") <- 55 to specify the number of subjects in a hierarchical model) are also automatically added.
Added reporting of number of chains to print.summary.rube().
Fixed an unneeded warning message during some model checking.
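The idea behind the 'add' parameter of p3() and the augment() function in this version is that a new parameter defined by a formula has posterior draws obtained by applying that formula to the saved draws, one draw at a time. The base-R sketch below illustrates the idea only and is not rube's implementation. The draws are simulated stand-ins rather than real MCMC output, and LO is treated as a scalar here for brevity.

```r
# Simulated draws standing in for saved MCMC output (illustrative only).
set.seed(1)
sims <- list(varRI = rgamma(1000, shape = 4, rate = 2),
             LO    = rnorm(1000, mean = 0.5, sd = 0.3))

# Formulas as character strings, in the style of the add= example above.
add <- list(sdRI = "sqrt(varRI)", pHat = "1/(1+exp(-LO))")

# Evaluate each formula with the saved draws supplied as the variables.
derived <- lapply(add, function(f) eval(parse(text = f), envir = sims))

# Posterior means of the derived parameters.
sapply(derived, mean)
```

Because the new draws are computed in R after the run, nothing extra has to be monitored by WinBUGS or jags.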

Version 0.2-17, 28 November 2012

Added support for dflat() and spatial distributions

Version 0.2-18, 13 February 2013

Added support for a generic model() function
Added support for derived posteriors via formulas entered into p3()

Version 0.3-1, 5 September, 2013

Conversion from R version 2 to R version 3
Fixed error in reporting MCMC errors (under some conditions the values were mismatched to the wrong parameters names)
Changed "only=" parameter in p3(). Default is now NULL instead of NA, and vectors are allowed when "param=" is a single vector parameter (so that a specific set of elements is shown in the "multipanel").
In p3(), moved "Rhat=" to its proper place above the trace plots.

Version 0.3-2, 20 September, 2013

Added support for jags
- The "program=" parameter for rube() now defaults to "auto", which determines the MCMC engine, with "jags" as the other new option. With "auto", if package "R2jags" is loaded, jags will be run instead of WinBUGS. An "engine" element has been added to the rube object.
- print() and summary() now output the engine (on objects created >=0.3-2).
- The rube() call now includes "progress.bar=" (defaults to "gui") to specify the jags (non-parallel) method of reporting MCMC progress.
- The rube() call now includes "RNGname=" (defaults to "Wichmann-Hill") to specify the jags random number generator.
- The rube() call now allows "parallel=TRUE" to invoke the parallel MCMC routine in jags. This requires that "inits=" be a function with no arguments (which cannot access non-local variables). It also requires that rube attach the data, so a warning is given if rube() needs to temporarily move any global variable(s) with the same name(s) as the data variables.
- With parallel jags, rube cannot protect you from specifying non-stochastic variables in your initialization function, so it stops with an annotated error, if you try that.
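For parallel=TRUE, the initialization function must take no arguments and must not rely on objects defined elsewhere. A self-contained sketch is shown below. The shape of alpha follows the Aligators example earlier on this page (five elements, the first left as NA). The name parallelInit is illustrative, and leaving the other stochastic nodes for jags to initialize is an assumption of this sketch.

```r
# Self-contained inits function: no arguments and no references to objects
# defined outside the function. The alpha layout follows the Aligators
# example, where the first element is not initialized (NA).
parallelInit <- function() {
  list(alpha = c(NA, rnorm(4, 0, 1)))
}

parallelInit()
```

By contrast, a function that reads from a workspace object such as the ali list uses a non-local variable, which the parallel routine does not allow.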

Version 0.3-3, 8 October, 2013

Fixed a bug in selectText() so that now a missing (or misspelled) variable list (varList) name produces prefix0suffix for FOR() or LC() model constructs.

Version 0.3-4, 14 October, 2013

Added dinterval(), dmstate(), and dnormmix() as distributions for jags. Remember to use the R command load.module("msm") before using a model with dmstate(), and load.module("mix") before using a model with dnormmix().

Version 0.3-5, 26 March, 2014

Added support for the jags data block. Only a single data block, which must precede the model block, is allowed. Data blocks are not allowed in WinBUGS. Use of assignment and distribution for the same variable is flagged as an error in jags, with a message that the assignment must be moved to a data block.
In jags, the power operator "^" is now allowed.

Version 0.3-6, 24 April, 2014

With jags, all of the quantile and probability functions are now allowed.
With jags, the truncation operator "T()" is now allowed.
I(value,), I(,value), T(value,), and T(,value) now behave correctly.
The compare() function now correctly displays RHat.
The augment() function now allows a parameter to be mentioned more than once in a formula.
The priPost() function now respects domain when estimating density.

Version 0.3-7, 19 June, 2014

Fixed showModel() to work even with curly brace mismatch.
Fixed bugsCheck() to use the correct formula for the sd of a dunif().

Version 0.3-8, 7 November, 2014

Added length() and dim() to valid function list for jags. Note: to avoid an "inconsistent dimensions" error, you must use code such as length(y[]) instead of length(y).
Added the density functions (as opposed to distributions) for jags.

Version 0.3-9, 14 July, 2015 + 28 September 2017

Changed some regular expression code regarding right parentheses and right curly braces to conform to new requirements in the latest version of R. Fixes the "Incorrectly nested parentheses in regexp pattern. (U_REGEX_MISMATCHED_PAREN)" error message.

Version 0.3-10, 29 November, 2017

Changed regular expression code in extractSyntax to remove "\\" inside of stringr's fixed(). Fixes the inappropriate "Distribution must be ~dfun(...) or ~dfun(...)I(...)" error message.

Version 0.3-11, 23 March, 2018

Fixed code in rube.R to accommodate a change in functionality of exists() that caused failure to run any model with jags.

Last updated 3/26/2014.
Please send comments to