Detect Sparse Heterogeneous Mixtures with Higher Criticism Statistic
To detect sparse and weak signals in mixtures, Donoho and Jin (2004) proposed that Higher Criticism statistic can be used, and has optimal behavior. The function HCdetection is to realize this procedure. It is assumed that the size alpha_n slowly converges to 0 for this test.
Contents
An Example with Simulated Data
Generate the data with large sample size n, and the same distribution
n = 1e+6; data = randn(n, 1);
Calculate two-sided p-value for the data
p = 2*(1 - normcdf(abs(data)));
Use the p-value to do test, the first one is classicial Higher Criticism test, and the second one is a refinement
[H01, stat01] = HCdetection(p, 1, 0); [H02, stat02] = HCdetection(p, 1/2, 1/n); H01 H02
H01 = 0 H02 = 0
The result shows that the null hypothesis is accepted.
In the alternative case, Generate the data such that with 1/1000 probability, the sample comes from normal distribution with a small mean.
eps = 1/1000; mu = sqrt(2*0.15*log(n)); r = binornd(1, eps, n, 1); data(r == 1) = data(r == 1) + mu;
Calculate two-sided p-value for the data
p = 2*(1 - normcdf(abs(data)));
Use the p-value to do test, the first one is classicial Higher Criticism test, and the second one is a refinement
[H11, stat11] = HCdetection(p, 1, 0); [H12, stat12] = HCdetection(p, 1/2, 1/n); H11 H12
H11 = 1 H12 = 1
For both statistics, the null hypothesis is rejected.