I am a PhD Candidate in the Department of Statistics & Data Science at Carnegie Mellon University. My research is supervised by Jared Murray of the University of Texas at Austin and Rebecca Nugent of Carnegie Mellon University. I received my BA in mathematics from Swarthmore College. Prior to beginning my studies at Carnegie Mellon, I worked as a research analyst for an economic consulting firm.
The primary focus of my research is developing computationally tractable methods for Bayesian Record Linkage, which is also known as entity resolution. Existing record linkage methods that can be applied to large problems typically produce only a point estimate and fail to characterize the uncertainty associated with it. In contrast, Bayesian approaches typically characterize the uncertainty well but are too computationally costly to apply to large problems. My research aims to develop methods that scale to large problems while still characterizing the uncertainty associated with the estimated link structure. I have also worked on applying Approximate Bayesian Computation algorithms to problems in astronomy.
I am interested in statistical sampling and simulation methods, non-parametric testing, and clustering methods.
A current version of my CV is available here.
“Scaling Bayesian Probabilistic Record Linkage with Post-Hoc Blocking: An Application to the California Great Registers”. McVeigh, B.; Spahn, B.; Murray, J. 2019. (arXiv)
“Practical Bayesian Inference for Record Linkage”. McVeigh, B.; Murray, J. 2017. (arXiv)
“Practical Bayesian Record Linkage”. McVeigh, B.; Spahn, B.; Murray, J. Joint Statistical Meetings 2018, Vancouver, Canada. August 2018.
“A Sequential Algorithm for Bayesian Inference of Large-Scale Record Linkage Structure”. McVeigh, B.; Murray, J. Joint Statistical Meetings 2017, Baltimore, Maryland. July 2017.
Julia implementation of unsupervised methods for one-to-one record linkage. Both an EM-based method and an MCMC algorithm employing a standard Fellegi-Sunter approach are included. The module also includes methods for post-hoc blocking and a penalized-likelihood point estimate that I have developed (here).
Julia implementation of algorithms for solving linear sum assignment problems. This includes a version of the well-known Hungarian algorithm as well as auction algorithms (Bertsekas 1992).
Julia implementation of the standard ABC rejection sampler as well as a Population Monte Carlo (PMC) sampler. Methods are also provided for more easily defining prior sampling distributions and adjusting the iterative process in the PMC sampler.
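For readers unfamiliar with ABC, the basic rejection sampler can be sketched in a few lines. The example below is a minimal Python illustration of the general technique, not the interface of the Julia package described above: the function `abc_rejection`, its parameters, and the toy problem (inferring the mean of a unit-variance normal under a uniform prior) are all assumptions made for illustration.

```python
import random
import statistics


def abc_rejection(observed, prior_sample, simulate, summary, distance,
                  tolerance, n_accept, rng, max_draws=1_000_000):
    """Basic ABC rejection sampler (hypothetical helper, for illustration only).

    Repeatedly draws a parameter from the prior, simulates data under it, and
    keeps the parameter when the simulated summary statistic falls within
    `tolerance` of the observed summary statistic.
    """
    s_obs = summary(observed)
    accepted = []
    for _ in range(max_draws):
        theta = prior_sample(rng)                 # candidate drawn from the prior
        s_sim = summary(simulate(theta, rng))     # summary of data simulated under theta
        if distance(s_sim, s_obs) <= tolerance:   # accept only close matches
            accepted.append(theta)
            if len(accepted) == n_accept:
                break
    return accepted


# Toy problem (an assumption): data are Normal(mu, 1), prior on mu is Uniform(-5, 5).
rng = random.Random(0)
observed = [rng.gauss(2.0, 1.0) for _ in range(50)]

posterior = abc_rejection(
    observed,
    prior_sample=lambda r: r.uniform(-5.0, 5.0),
    simulate=lambda mu, r: [r.gauss(mu, 1.0) for _ in range(50)],
    summary=statistics.fmean,                     # sample mean as the summary statistic
    distance=lambda a, b: abs(a - b),
    tolerance=0.1,
    n_accept=200,
    rng=rng,
)
```

The accepted draws in `posterior` approximate the posterior distribution of the mean. A smaller `tolerance` gives a closer approximation at the cost of more rejected draws, which is the inefficiency that sequential schemes such as PMC are designed to reduce.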