A Poisson distribution is the probability distribution that results from a Poisson experiment. A probability distribution assigns a probability to each possible outcome of a random experiment. A Poisson experiment has the following properties:
- The outcomes of the experiment can be classified as either successes or failures.
- The average number of successes that occurs in a specified region is known.
- The probability that a success will occur is proportional to the size of the region.
- The probability that a success will occur in an extremely small region is virtually zero.
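One way to get a feel for these properties is to split a region into many small slices, each with a tiny chance of a success. The sketch below assumes a mean of 2 successes per day and treats every slice as an independent trial, which is my own illustration and not part of the definition above. The names `slices` and `binom_probs` are made up for this example.

```r
lambda <- 2   # assumed mean number of successes per day
x <- 3        # number of successes we want the probability of

# Split the day into more and more slices; each slice succeeds with probability lambda / slices
slices <- c(10, 100, 1000, 100000)
binom_probs <- sapply(slices, function(s) dbinom(x, size = s, prob = lambda / s))

# Binomial probabilities for each slicing, next to the Poisson probability
print(data.frame(slices = slices, binomial = binom_probs))
dpois(x, lambda)
```

As the slices shrink, the binomial probability should settle towards the Poisson probability.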
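A Poisson random variable is the number of successes that result from a Poisson experiment. Given the mean number of successes μ that occur in a specified region, we can compute the Poisson probability of observing exactly x successes based on the following formula:
$$P(x; \mu) = \frac{e^{-\mu} \mu^{x}}{x!}$$
which is also written as:
$$P(X = k) = \frac{\lambda^{k} e^{-\lambda}}{k!}$$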
Examples
The average number of homes sold is 2 homes per day. What is the probability that exactly 3 homes will be sold tomorrow?
Calculating this manually in R:
e <- exp(1)
((e^-2)*(2^3))/factorial(3)
[1] 0.180447
Using dpois():
dpois(x = 3, lambda = 2)
[1] 0.180447
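The manual calculation can be wrapped in a small function and checked against dpois() over a range of counts. This is only a sketch; `poisson_pmf` and `sold` are names I made up for the example.

```r
# Hypothetical helper: the Poisson formula written out by hand
poisson_pmf <- function(x, lambda) {
  exp(-lambda) * lambda^x / factorial(x)
}

sold <- 0:10   # possible numbers of homes sold in a day

# Probabilities for 0 to 10 sales when the average is 2 per day
round(poisson_pmf(sold, lambda = 2), 6)

# The hand-written version should agree with dpois()
all.equal(poisson_pmf(sold, lambda = 2), dpois(sold, lambda = 2))

# Probability of at most 3 sales, using the cumulative distribution function
ppois(3, lambda = 2)
```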
The Poisson distribution can be used to estimate the technical variance in high-throughput sequencing experiments.
My basic understanding is that the variance between technical replicates can be modelled using the Poisson distribution. Check out "Why Does Rna-Seq Read Count Fit Poisson Distribution?" on Biostars.
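What makes the Poisson distribution handy for this is that its variance equals its mean, so the expected spread of a count follows from the count itself. A quick simulated check, with no real sequencing data involved; the lambda values and the seed are arbitrary choices of mine:

```r
set.seed(1)  # arbitrary seed, only for reproducibility

for (lambda in c(5, 50, 500)) {
  counts <- rpois(100000, lambda)
  # For Poisson counts the sample mean and sample variance should be close
  cat(sprintf("lambda = %d  mean = %.2f  variance = %.2f\n",
              lambda, mean(counts), var(counts)))
}
```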
Calculating confidence intervals
Calculate the confidence intervals using R. Create data with 1,000,000 values that follow a Poisson distribution with lambda = 20.
set.seed(1984)
n <- 1000000
data <- rpois(n, 20)
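Functions for calculating the lower and upper limits of the exact 95% confidence interval for a Poisson mean, given an observed count (the argument n here is that count, not the sample size n defined above).
poisson_lower_tail <- function(n) {
  # 2.5% quantile of a chi-squared distribution with 2n degrees of freedom, halved
  qchisq(0.025, 2*n)/2
}
poisson_upper_tail <- function(n) {
  # 97.5% quantile of a chi-squared distribution with 2(n + 1) degrees of freedom, halved
  qchisq(0.975, 2*(n+1))/2
}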
Lower limit for lambda = 20.
poisson_lower_tail(20)
[1] 12.21652
Upper limit for lambda = 20.
poisson_upper_tail(20)
[1] 30.88838
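As a sanity check, base R's poisson.test() also reports an exact confidence interval for an observed count, so the two can be compared. The functions are repeated here only so the block runs on its own; `exact_ci` and `manual_ci` are names made up for the example.

```r
# Definitions repeated from above so that this block is self-contained
poisson_lower_tail <- function(n) qchisq(0.025, 2 * n) / 2
poisson_upper_tail <- function(n) qchisq(0.975, 2 * (n + 1)) / 2

# Exact 95% interval for an observed count of 20 (default conf.level = 0.95)
exact_ci <- poisson.test(20)$conf.int

manual_ci <- c(poisson_lower_tail(20), poisson_upper_tail(20))

# Compare the two intervals, ignoring the attributes attached by poisson.test()
all.equal(as.numeric(exact_ci), manual_ci)
```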
How many values in data are lower than the lower limit?
table(data<poisson_lower_tail(20))
 FALSE   TRUE
961213  38787
How many values in data are higher than the upper limit?
table(data>poisson_upper_tail(20))
 FALSE   TRUE
986239  13761
What percentage of values were outside of the 95% CI?
(sum(data<poisson_lower_tail(20)) + sum(data>poisson_upper_tail(20))) * 100 / n
[1] 5.2548
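The figure is not exactly 5%, and I think there are two reasons. Counts are whole numbers, so the tails cannot be trimmed to exactly 2.5% each, and the limits are built as a confidence interval for the mean given an observed count of 20 and not as a range for new counts. The expected percentage can be worked out without simulation using ppois(); `lower`, `upper`, `p_below` and `p_above` are names made up for this sketch.

```r
lambda <- 20
lower <- qchisq(0.025, 2 * lambda) / 2        # same limits as calculated above
upper <- qchisq(0.975, 2 * (lambda + 1)) / 2

# Counts are integers: "below lower" means X <= 12 and "above upper" means X >= 31
p_below <- ppois(ceiling(lower) - 1, lambda)
p_above <- ppois(floor(upper), lambda, lower.tail = FALSE)

# Expected percentage of values outside the limits
(p_below + p_above) * 100
```

This should land close to the simulated percentage.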
Plot.
hist(data)
abline(v=poisson_lower_tail(20))
abline(v=poisson_upper_tail(20))
Webtool
Entering lambda = 20 into the Poisson Confidence Interval Calculator returns:
- 99% confidence interval: 10.35327 - 34.66800
- 95% confidence interval: 12.21652 - 30.88838
- 90% confidence interval: 13.25465 - 29.06202
The 95% interval matches the values calculated in R above.
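The other two intervals can be checked by adding a confidence level argument to the chi-squared formula. I am assuming the calculator uses the same formula, which I have not confirmed; `poisson_ci` is a hypothetical helper written for this example.

```r
# Hypothetical helper: exact limits for an observed count at any confidence level
poisson_ci <- function(count, conf_level = 0.95) {
  alpha <- 1 - conf_level
  c(lower = qchisq(alpha / 2, 2 * count) / 2,
    upper = qchisq(1 - alpha / 2, 2 * (count + 1)) / 2)
}

for (level in c(0.99, 0.95, 0.90)) {
  ci <- poisson_ci(20, level)
  cat(sprintf("%.0f%% confidence interval: %.5f - %.5f\n",
              level * 100, ci["lower"], ci["upper"]))
}
```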

This work is licensed under a Creative Commons
Attribution 4.0 International License.
