My papers on Google Scholar and arXiv.

Working papers and preprints

  1. Chen, Y. and Lei, J. (2024)
    Minimax Optimal Probability Matrix Estimation For Graphon With Spectral Decay
  2. Zhang, T., Lee, H. and Lei, J. (2024)
    Winners with Confidence: Discrete Argmin Inference with an Application to Model Selection
  3. Lei, J., Chen, K., and Moon, H. (2024)
    Least Squares Inference for Data with Network Dependency
  4. Moon, H., Du, J., Lei, J., and Roeder, K. (2024)
    Augmented Doubly Robust Post-Imputation Inference for Proteomic Data
  5. Zhang, T., Lei, J., and Roeder, K. (2024)
    Debiased Projected Two-Sample Comparisons for Single-Cell Expression Data
  6. Lin, K. Z., and Lei, J. (2024)
    Dynamic clustering for heterophilic stochastic block models with time-varying node memberships
  7. Zhang, T. and Lei, J. (2023)
    Online estimation with rolling validation: adaptive nonparametric estimation with stream data
  8. Kissel, N. and Lei, J. (2022)
    On high-dimensional Gaussian comparisons for cross-validation
  9. Wang, Y., Lei, J., and Fienberg, S. (2016)
    A minimax theory for adaptive data analysis

Selected Peer-Reviewed Articles: Theory and Methodology

  1. Oliveira, N. L., Lei, J., and Tibshirani, R. (2023)
    Unbiased Test Error Estimation in the Poisson Means Problem via Coupled Bootstrap Techniques
    Electronic Journal of Statistics, to appear.
  2. Chen, Y., and Lei, J. (2024+)
    De-Biased Two-Sample U-Statistics With Application To Conditional Distribution Testing
    Machine Learning, to appear.
  3. Lei, J., Oliveira, N. L. and Tibshirani, R. J. (2024+)
    Discussion of ‘Data Fission: Splitting a Single Data Point’ by Leiner et al.
    Journal of the American Statistical Association, to appear.
  4. Lei, J., Zhang, A. R., and Zhu, Z. (2024+)
    Computational and statistical thresholds in multi-layer stochastic block models
    Annals of Statistics, to appear.
  5. Oliveira, N. L., Lei, J., and Tibshirani, R. (2025)
    Unbiased Risk Estimation in the Normal Means Problem via Coupled Bootstrap Techniques
    Electronic Journal of Statistics, 19(1), 361-396.
  6. Tian, J., Lei, J., and Roeder, K. (2024)
    From local to global gene co-expression estimation using single-cell RNA-seq data
    Biometrics, 80(1), ujae001.
  7. Cai, Z., Lei, J., and Roeder, K. (2024)
    Asymptotic distribution-free independence test for high dimension data
    Journal of the American Statistical Association, 119(547), 1794-1804.
  8. Hu, X. and Lei, J. (2024)
    A Two-Sample Conditional Distribution Test Using Conformal Prediction and Weighted Rank Sum [code]
    Journal of the American Statistical Association, 119(546), 1136-1154.
  9. Chakravarti, P., Kuusela, M., Lei, J., and Wasserman, L. (2023)
    Model-independent detection of new physics signals using interpretable semi-supervised classifier tests
    Annals of Applied Statistics, 17(4), 2759-2795.
  10. Qiu, Y., Lei, J., and Roeder, K. (2023)
    Gradient-based sparse principal component analysis with extensions to online learning
    Biometrika, 110(2), 339-360.
  11. Lei, J. and Lin, K. Z. (2023)
    Bias-adjusted spectral clustering in multi-layer stochastic block models
    Journal of the American Statistical Association, 118(544), 2433-2445.
  12. Cai, Z., Lei, J., and Roeder, K. (2022)
    Model-free prediction test with application to genomics data
    Proceedings of the National Academy of Sciences, 119(34), e2205518119.
  13. Lin, K. Z., Lei, J., and Roeder, K. (2021)
    Exponential-family embedding with application to cell developmental trajectories for single-cell RNA-seq data
    Journal of the American Statistical Association, 116(534), 457-470 (with discussion and rejoinder).
  14. Qiu, Y., Wang, J., Lei, J., and Roeder, K. (2021)
    Identification of cell-type-specific marker genes from co-expression patterns in tissue samples
    Bioinformatics, 37(19), 3228-3234.
  15. Lei, J. (2021)
    Network representation using graph root distributions
    Annals of Statistics, 49(2), 745-768.
  16. Lei, J. (2020)
    Cross-validation with confidence [code]
    Journal of the American Statistical Association, 115(532), 1978-1997.
  17. Lei, J. and Kadane, J. B. (2020)
    On the probability that two random integers are coprime
    Statistical Science, 35(2), 272-279.
  18. Lei, J. and Lin, K. (2020)
    Discussion of ‘Network cross-validation by edge sampling’ [code]
    Biometrika, 107(2), 285-287.
  19. Lei, J. (2020)
    Convergence and concentration of empirical measures under Wasserstein distance in unbounded functional spaces
    Bernoulli, 26(1), 767-798.
  20. Lei, J., Chen, K., and Lynch, B. (2020)
    Consistent community detection in multi-layer network data [code]
    Biometrika, 107(1), 61-73.
  21. Kim, I., Lee, A. B., and Lei, J. (2019)
    Global and local two-sample tests via regression
    Electronic Journal of Statistics, 13(2), 5253-5305.
  22. Lei, J. (2019)
    Fast exact conformalization of Lasso using piecewise linear homotopy [code]
    Biometrika, 106(4), 749-764.
  23. Zhu, L., Lei, J., Klei, L., Devlin, B., and Roeder, K. (2019)
    Semi-soft clustering of single cell data
    Proceedings of the National Academy of Sciences, 116(2), 466-471.
  24. Vu, V. Q. and Lei, J. (2019)
    Squared-norm empirical processes
    Statistics & Probability Letters, 150, 108-113.
  25. Sadinle, M., Lei, J., and Wasserman, L. (2019)
    Least ambiguous set-valued classifiers with bounded error levels
    Journal of the American Statistical Association, 114(525), 223-234.
  26. Zhu, L., Lei, J., and Roeder, K. (2018)
    A unified statistical framework for single cell and bulk RNA sequencing data
    Annals of Applied Statistics, 12(1), 609-632.
  27. Lei, J., G'Sell, M., Rinaldo, A., Tibshirani, R. J. and Wasserman, L. (2018)
    Distribution-free predictive inference for regression [R Package]
    Journal of the American Statistical Association, 113(523), 1094-1111.
  28. Lei, J., Charest, A.-S., Slavkovic, A., Smith, A., and Fienberg, S. (2018)
    Differentially private model selection with penalized and constrained likelihood
    Journal of the Royal Statistical Society, Series A, 181, 609-633.
  29. Chen, K. and Lei, J. (2018)
    Network cross-validation for determining the number of communities in network data [code]
    Journal of the American Statistical Association, 113(521), 241-251.
  30. Lei, J. and Zhu, L. (2017)
    Generic sample splitting for refined community recovery in degree corrected stochastic block models [code]
    Statistica Sinica, 27, 1639-1659.
  31. Zhu, L., Lei, J., Devlin, B., and Roeder, K. (2017)
    Testing high dimensional differential matrices, with application to detecting schizophrenia risk genes
    Annals of Applied Statistics, 11(3), 1810-1831.
  32. Wang, Y., Lei, J., and Fienberg, S. (2016)
    On-average KL-privacy and its equivalence to generalization for max-entropy mechanisms
    Privacy in Statistical Databases. PSD'2016, Dubrovnik.
  33. Wang, Y., Lei, J., and Fienberg, S. (2016)
    Learning with differential privacy: stability, learnability and the sufficiency and necessity of ERM principle
    Journal of Machine Learning Research, 17(183), 1-40.
  34. Lei, J. (2016)
    A goodness-of-fit test for stochastic block models [code]
    Annals of Statistics, 44(1), 401-424.
  35. Chen, K. and Lei, J. (2015)
    Localized functional principal component analysis [code]
    Journal of the American Statistical Association, 110, 1266-1275.
  36. Liu, L., Lei, J., and Roeder, K. (2015)
    Network assisted analysis to reveal the genetic basis of autism
    Annals of Applied Statistics, 9(3), 1571-1600.
  37. Lei, J. and Vu, V. Q. (2015)
    Sparsistency and agnostic inference in sparse PCA
    Annals of Statistics, 43(1), 299-322.
  38. Lei, J. and Rinaldo, A. (2015)
    Consistency of spectral clustering in stochastic block models
    Annals of Statistics, 43(1), 215-237.
  39. Lei, J., Rinaldo, A., and Wasserman, L. (2015)
    A conformal prediction approach to explore functional data
    Annals of Mathematics and Artificial Intelligence, 74(1-2), 29-43.
  40. Lei, J. (2014)
    Classification with confidence [code]
    Biometrika, 101(4), 755-769.
  41. Lei, J. (2014)
    Adaptive global testing for functional linear models [code]
    Journal of the American Statistical Association, 109, 624-634.
  42. Lei, J. and Wasserman, L. (2014)
    Distribution free prediction bands for nonparametric regression [code]
    Journal of the Royal Statistical Society, Series B, 76, 71-96.
  43. Vu, V. Q. and Lei, J. (2013)
    Minimax sparse principal subspace estimation in high dimensions
    Annals of Statistics, 41, 2905-2947.
  44. Vu, V. Q., Cho, J., Lei, J., and Rohe, K. (2013)
    Fantope projection and selection: A near-optimal convex relaxation of sparse PCA [R package]
    Annual Conference on Neural Information Processing Systems, 26 (NIPS'13).
  45. Lei, J., Robins, J., and Wasserman, L. (2013)
    Distribution free prediction sets [code]
    Journal of the American Statistical Association, 108, 278-287.
  46. Lei, J. and Bickel, P. (2013)
    On convergence of recursive Monte Carlo filters in non-compact state spaces
    Statistica Sinica, 23, 429-450.
  47. Vu, V. Q. and Lei, J. (2012)
    Minimax rates of estimation for sparse PCA in high dimensions
    Fifteenth International Conference on Artificial Intelligence and Statistics (AISTATS'12, Best Paper Award).
  48. Lei, J. and Bickel, P. (2011)
    A moment-matching approach to nonlinear non-Gaussian ensemble filtering
    Monthly Weather Review, 139, 3964-3973.
  49. Lei, J. (2011)
    Differentially private M-estimators [supplementary] [code]
    Annual Conference on Neural Information Processing Systems, 24 (NIPS'11).
  50. Lei, J., Bickel, P., and Snyder, C. (2010)
    Comparison of ensemble Kalman filters under non-Gaussianity
    Monthly Weather Review, 138, 1293-1306.
  51. Dwork, C. and Lei, J. (2009)
    Differential privacy and robust statistics [Full Version]
    Proceedings of the 41st Annual ACM Symposium on Theory of Computing (STOC'09).

Peer-Reviewed Articles: Application

  1. Cotney, J. et al. (2015)
    The autism-associated chromatin modifier CHD8 regulates other autism risk genes during human neurodevelopment
    Nature Communications, 6:6404.
  2. De Rubeis, S. et al. (2014)
    Synaptic, transcriptional and chromatin genes disrupted in autism
    Nature, 515, 209-215.
  3. Liu, L., Lei, J., et al. (2014)
    DAWN: A framework to identify autism genes and subnetworks using gene expression and genetics
    Molecular Autism, 5:22.
  4. Willsey, J. et al. (2013)
    Coexpression networks implicate human midfetal deep cortical projection neurons in the pathogenesis of autism
    Cell, 155, 997-1007.

Notes

  1. Bernstein's inequality using Orlicz psi_1 norm