Estimate distribution of true positives given sampling results

truepos_given_sample(samplepos, n, N, replicates = 1000)

# S3 method for truepos
summary(object, alpha = 0.1, ...)

Arguments

samplepos	Number of positives observed in sample
n	Sample size
N	Population size
replicates	Number of simulated samples per candidate number of true positives
object	Simulated true positive counts to summarise (an object of class truepos)
alpha	The confidence interval is (1-alpha)*100% (i.e. alpha=0.1 => 90% CI)
...	Additional arguments (currently ignored)

Value

A vector of population true positive counts that could have generated the observed number of sample positives. It has class truepos.

Details

The idea is to simulate random samples for every possible number of true positives, keep only those simulations that reproduce the observed number of sample positives, and then use the resulting empirical distribution of simulated true positives to estimate the most likely value of the (unknown) number of true positives.
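
The procedure described above can be sketched in a few lines of base R. This is an illustrative re-implementation under stated assumptions, not the package's source: the function name simulate_truepos is hypothetical, and it assumes every candidate number of true positives from 0 to N is tried with the same number of replicates.

```r
# Hypothetical sketch of the simulation idea; not the package's own code.
simulate_truepos <- function(samplepos, n, N, replicates = 1000) {
  # Candidate numbers of true positives in the population
  candidates <- 0:N
  kept <- lapply(candidates, function(m) {
    # Draw `replicates` samples of size n from a population of N
    # that contains m positives and N - m negatives
    draws <- rhyper(replicates, m = m, n = N - m, k = n)
    # Keep one copy of m for every draw matching the observed count
    rep(m, sum(draws == samplepos))
  })
  unlist(kept)
}

set.seed(42)
sim <- simulate_truepos(samplepos = 2, n = 10, N = 48)
range(sim)
```

Because each candidate receives the same number of replicates, the retained counts are proportional to the likelihood of each candidate, up to simulation noise.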

NB this effectively estimates the unknown parameter, m, of the Hypergeometric distribution, i.e. the number of white balls in the urn.
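
For comparison, the likelihood of each candidate m can also be evaluated exactly with dhyper from base R, without any simulation. The sketch below is not part of the package; the variable names are chosen for illustration and the values are taken from the example on this page.

```r
# Exact hypergeometric likelihood of each candidate m (illustrative only)
N <- 48          # population size
n <- 10          # sample size
samplepos <- 2   # positives observed in the sample

m <- 0:N
# P(samplepos positives in a sample of n | m positives among N)
lik <- dhyper(samplepos, m = m, n = N - m, k = n)

# Candidate with the highest likelihood
mle <- m[which.max(lik)]
mle
```

For these values the exact maximum lies at m = 9. The likelihood is fairly flat near its peak, so the mode of a simulated distribution can differ from this by a few counts unless many replicates are used.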

Examples

# Imagine we have sampled 10 profiles from a tract of 48 and found 2 LHNs
tps=truepos_given_sample(samplepos = 2, n=10, N=48)

hist(tps, breaks=0:49-.5, col='red')
plot(ecdf(tps))

# the mode should be the Maximum Likelihood Estimate
# (if enough replicates were used)
summary(tps)
#>     5% Median   Mode    95% 
#>      5     12     11     22 
# 95% confidence interval
summary(tps, alpha=.05)
#>   2.5% Median   Mode  97.5% 
#>      4     12     11     24
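
A summary of this kind (lower quantile, median, mode, upper quantile) could be computed roughly as follows. This is a sketch only: summarise_counts is a hypothetical helper, not the package's summary method, and the element names it returns are illustrative.

```r
# Hypothetical helper mirroring the shape of the summary output above
summarise_counts <- function(x, alpha = 0.1) {
  # Lower bound, median and upper bound of a (1 - alpha) * 100% interval
  q <- quantile(x, probs = c(alpha / 2, 0.5, 1 - alpha / 2), names = FALSE)
  # Mode: the most frequent value in x
  tab <- table(x)
  mode <- as.numeric(names(tab)[which.max(tab)])
  c(lower = q[1], median = q[2], mode = mode, upper = q[3])
}

x <- c(1, 2, 2, 3, 3, 3, 4, 4, 5)
res <- summarise_counts(x, alpha = 0.1)
res
```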
