Updated 2014 December 19th
The normal or Gaussian distribution is a commonly occurring continuous probability distribution. The skewness, which is a measure of symmetry (or the lack thereof), of a normal distribution is zero since the distribution is symmetrical, i.e. it looks the same to the left and right of the centre. Kurtosis describes the shape of a distribution, in particular the weight of its tails relative to a normal distribution; a data set with high kurtosis tends to have a distinct peak near the mean and heavy tails, whereas a data set with low kurtosis tends to be flatter with light tails.
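To make the two definitions concrete, here is a minimal sketch of the moment-based versions of skewness and excess kurtosis in base R. The function names moment_skewness() and moment_excess_kurtosis() are hypothetical helpers made up for this illustration; they are not part of any package, and packaged implementations may apply a sample-size adjustment that gives slightly different values.

```r
# Moment-based skewness: the third central moment divided by
# the second central moment raised to the power 3/2
# (hypothetical helper, for illustration only)
moment_skewness <- function(x) {
  d <- x - mean(x)
  mean(d^3) / mean(d^2)^(3/2)
}

# Moment-based excess kurtosis: the fourth central moment divided by
# the squared second central moment, minus 3 so that a normal
# distribution scores 0 (hypothetical helper, for illustration only)
moment_excess_kurtosis <- function(x) {
  d <- x - mean(x)
  mean(d^4) / mean(d^2)^2 - 3
}

# a small, perfectly symmetric toy data set
toy <- c(1, 2, 3, 4, 5)
moment_skewness(toy)        # 0, since the data are symmetric
moment_excess_kurtosis(toy) # -1.3, flatter than a normal distribution
```

The toy data set is evenly spread, so it has no skew and a negative excess kurtosis.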
We can generate a univariate data set that follows a normal distribution using the rnorm() function in R; the function takes three parameters: the number of data points, the mean, and the standard deviation:
#seed for reproducibility
set.seed(31)
x.norm <- rnorm(n=200, mean=10, sd=2)
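As a quick sanity check, the sample mean and standard deviation of the generated data should land close to the values passed to rnorm(), and closer still as the sample grows. The sketch below assumes the same seed and parameters as above; x.large is a name introduced here purely for illustration.

```r
# seed for reproducibility
set.seed(31)
x.norm <- rnorm(n = 200, mean = 10, sd = 2)

# the sample statistics should be close to the requested parameters
length(x.norm)
mean(x.norm)
sd(x.norm)

# with a much larger sample the estimates settle closer to 10 and 2
# (x.large is a hypothetical name used only in this example)
x.large <- rnorm(n = 100000, mean = 10, sd = 2)
mean(x.large)
sd(x.large)
```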
The histogram can be used to show both the skewness and kurtosis of a data set.
hist(x.norm, main="Histogram x.norm")
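It can help to overlay the theoretical density on the histogram, so that any asymmetry or unusually heavy tails stand out against the expected bell shape. This is only a sketch: it assumes the generating parameters (mean 10, standard deviation 2) are known, which is true here but rarely the case with real data.

```r
# seed for reproducibility
set.seed(31)
x.norm <- rnorm(n = 200, mean = 10, sd = 2)

# freq = FALSE plots densities instead of counts, so that the
# histogram and the density curve share the same y-axis scale
h <- hist(x.norm, freq = FALSE, main = "Histogram x.norm with normal density")

# curve() evaluates the expression over x across the current plot range;
# the mean and sd are the known generating parameters (an assumption)
curve(dnorm(x, mean = 10, sd = 2), add = TRUE, lwd = 2)
```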
We can use the skewness() and kurtosis() functions from the e1071 package to measure the skewness and kurtosis, respectively.
#install if necessary
install.packages('e1071')
library(e1071)
#seed for reproducibility
set.seed(31)
x.norm <- rnorm(n=200, mean=10, sd=2)
skewness(x.norm)
[1] 0.005622505
#the default is a measure of excess kurtosis
#see http://www.itl.nist.gov/div898/handbook/eda/section3/eda35b.htm
kurtosis(x.norm)
[1] -0.2955075
#the excess kurtosis of a large sample from the standard normal distribution is near 0
kurtosis(rnorm(10000, 0, 1))
[1] -0.0153943
Checking whether a dataset is normal
I’ve written about testing for normality previously. Briefly, we can use the qqnorm() function to visually assess how well a normal distribution fits the data, and the Shapiro-Wilk test to formally test for normality.
set.seed(31)
x.norm <- rnorm(n=200, mean=10, sd=2)
#The Shapiro–Wilk test checks whether a sample is normally distributed
#the null hypothesis is that the data was independently drawn from a normal distribution
#the p-value indicates that we cannot reject the null
shapiro.test(x.norm)

	Shapiro-Wilk normality test

data:  x.norm
W = 0.9956, p-value = 0.8364

#in this case, we can reject the null hypothesis
shapiro.test(rgamma(n = 200, shape = 1))

	Shapiro-Wilk normality test

data:  rgamma(n = 200, shape = 1)
W = 0.8443, p-value = 2.299e-13
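For completeness, here is a minimal sketch of the Q-Q plot mentioned above, together with one way of pulling the p-value out of the test result instead of reading it off the printed output. The 0.05 threshold is only a conventional choice assumed for this example.

```r
# seed for reproducibility
set.seed(31)
x.norm <- rnorm(n = 200, mean = 10, sd = 2)

# points that fall close to the reference line suggest the data
# are consistent with a normal distribution
qq <- qqnorm(x.norm, main = "Normal Q-Q plot x.norm")
qqline(x.norm)

# shapiro.test() returns a list-like object; the p-value is in $p.value
res <- shapiro.test(x.norm)
res$p.value

# alpha = 0.05 is an assumed, conventional significance level
alpha <- 0.05
if (res$p.value < alpha) {
  message("reject the null hypothesis of normality")
} else {
  message("cannot reject the null hypothesis of normality")
}
```

Failing to reject the null does not prove that the data are normal; it only means the test found no evidence against normality.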
This work is licensed under a Creative Commons
Attribution 4.0 International License.