How many people do you need in a room before you have a greater than 505 chance that two of them will have the same birthday?
Assume 356 birthdays(our containers)
% chance that two people in the room have the same birthday
100% requires 366 people (the pigeonhole principle)
Compute probability of different birthdays
Random sample of n people(birthdays) taken from k (365)days
k^n samples with replacement
(k)n=k(k-1)…(k-n+1) sample without replacement
Probability of repetition:
p = 1 – (k)n/k^n = n(n-1)/2k = 0.5 if n = √k
1-(k)n/k^n = the probability that a pair share the same birthday
If k = 365, n = 19
If there are 19 people in a room, there is a good chance that two of them share the same birthday!
There are many more ‘pigeons’ than ‘pigeonholes’
Many inputs will be mapped to the same output. That is, many input messages will have the same hash.
Conclusion: The longer the length of the hash, the fewer collisions.