birthday {stats}    R Documentation

Probability of coincidences

Description

Computes approximate answers to a generalised “birthday paradox” problem. pbirthday computes the probability of a coincidence and qbirthday computes the number of observations needed to have a specified probability of coincidence.

Usage

qbirthday(prob = 0.5, classes = 365, coincident = 2)
pbirthday(n, classes = 365, coincident = 2)

Arguments

classes How many distinct categories the people could fall into
prob The desired probability of coincidence
n The number of people
coincident The number of people who must fall in the same category

Details

The birthday paradox is that a surprisingly small number of people, 23, suffices for a 50-50 chance that at least two of them share a birthday. These functions generalise the calculation to probabilities other than 0.5, numbers of coincident events other than 2, and numbers of classes other than 365.

This formula is approximate, as the example below shows. For coincident=2 the exact computation is straightforward and may be preferable.
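
The exact computation for coincident=2 can be sketched as below. This is a minimal illustration, not part of the stats package: the helper name exact_birthday is hypothetical, and it assumes equiprobable classes.

```r
# Hypothetical helper (not part of stats): exact probability that at least
# two of n people share a label, assuming `classes` equiprobable labels.
exact_birthday <- function(n, classes = 365) {
  if (n > classes) return(1)  # pigeonhole: a coincidence is certain
  if (n < 2) return(0)        # fewer than two people cannot coincide
  # P(no coincidence) = product over i = 1..n of (classes - i + 1) / classes
  1 - prod((classes - seq_len(n) + 1) / classes)
}

exact_birthday(22)  # about 0.476, just under an even chance
exact_birthday(23)  # about 0.507, just over an even chance
```

Comparing exact_birthday(n) with pbirthday(n) for the same n gives a direct view of the approximation error at a single point.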

Value

qbirthday Number of people needed for a probability prob that coincident of them share the same one of classes equiprobable labels.
pbirthday Probability of the specified coincidence
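
As a rough illustration of how the two return values relate, the sketch below feeds the result of qbirthday back into pbirthday. It assumes the default 365 classes and coincident=2; the variable name n_half is chosen only for this example.

```r
# Group size needed for an (approximately) even chance of a shared birthday
n_half <- qbirthday(prob = 0.5)
n_half                  # 23 with the default 365 classes

# Feeding that size back in should give a probability close to,
# and not below, the requested 0.5
pbirthday(n_half)

# One person fewer should fall below the target
pbirthday(n_half - 1)
```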

References

Diaconis, P. and Mosteller, F. (1989) Methods for studying coincidences. Journal of the American Statistical Association 84, 853-861.

Examples

## the standard version
qbirthday()
## same 4-digit PIN
qbirthday(classes = 10^4)
## 0.9 probability of three coincident birthdays
qbirthday(coincident = 3, prob = 0.9)
## chance of 4 coincident birthdays in 150 people
pbirthday(150, coincident = 4)
## accuracy compared to exact calculation (coincident = 2)
x1 <- sapply(10:100, pbirthday)
x2 <- 1 - sapply(10:100, function(n) prod((365:(365 - n + 1)) / 365))
## save the graphics settings so they can be restored afterwards
op <- par(mfrow = c(2, 2))
plot(x1, x2, xlab = "approximate", ylab = "exact")
abline(0, 1)
plot(x1, x1 - x2, xlab = "approximate", ylab = "error")
abline(h = 0)
plot(x1, x2, log = "xy", xlab = "approximate", ylab = "exact")
abline(0, 1)
plot(1 - x1, 1 - x2, log = "xy", xlab = "1 - approximate", ylab = "1 - exact")
abline(0, 1)
par(op)
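
For coincident greater than 2 there is no equally simple exact product, so one way to sanity-check the approximation is by simulation. The sketch below is a minimal Monte Carlo check, not part of the stats package: the helper sim_birthday, its reps argument and the seed are all assumptions made for illustration.

```r
# Hypothetical helper (not part of stats): Monte Carlo estimate of the
# probability that at least `coincident` of n people share a label,
# assuming `classes` equiprobable labels.
sim_birthday <- function(n, classes = 365, coincident = 2, reps = 10000) {
  hits <- replicate(reps, {
    labels <- sample.int(classes, n, replace = TRUE)
    # largest number of people sharing any single label
    max(tabulate(labels, nbins = classes)) >= coincident
  })
  mean(hits)
}

set.seed(1)  # arbitrary seed, used only to make the run reproducible
sim_est <- sim_birthday(88, coincident = 3)  # simulated estimate
approx  <- pbirthday(88, coincident = 3)     # value from pbirthday
c(simulated = sim_est, pbirthday = approx)
```

With 10000 replications the simulation's standard error is roughly 0.005, so differences much larger than that would point to approximation error rather than sampling noise.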
