Markov chain

A Markov chain is a mathematical system that undergoes transitions from one state to another on a state space in a stochastic (random) manner. Examples of Markov chains include the board game snakes and ladders, where each state represents the position of a player on the board and a player moves between states (different positions...
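
As a rough illustration of the snakes and ladders example, here is a minimal Python sketch of a chain whose next state depends only on the current position and a die roll. The board size, the ladder and snake squares, and the rule for overshooting the last square are all assumptions made up for this sketch, not details from the post.

```python
import random

BOARD_SIZE = 20                         # hypothetical small board
JUMPS = {3: 11, 6: 17, 14: 4, 19: 8}    # hypothetical ladders (up) and snakes (down)


def step(position, rng):
    """Take one turn; the next state depends only on the current position."""
    roll = rng.randint(1, 6)
    target = position + roll
    if target > BOARD_SIZE:
        # Assumed house rule: an overshoot leaves the player where they are.
        return position
    # Landing on the foot of a ladder or head of a snake moves the player.
    return JUMPS.get(target, target)


def play(seed=0, max_turns=10_000):
    """Simulate one game and return the sequence of visited states."""
    rng = random.Random(seed)
    position, path = 0, [0]
    for _ in range(max_turns):
        if position == BOARD_SIZE:
            break
        position = step(position, rng)
        path.append(position)
    return path


if __name__ == "__main__":
    print(play(seed=1))
```

Because `step` never looks at earlier positions, the simulated game has the memoryless property that defines a Markov chain.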

Tissue specificity

Wikipedia has a definition of entropy with respect to information theory. The introduction of that article gives an example using a coin toss: if the coin is fair, the entropy rate is one bit per toss. However, if the coin is not fair, then the uncertainty, and hence the...
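
The coin toss example can be checked with a short Python sketch of the Shannon entropy formula, H = -sum(p * log2(p)). The function name `entropy_bits` and the 90/10 biased coin are assumptions chosen for illustration.

```python
import math


def entropy_bits(probs):
    """Shannon entropy in bits of a discrete distribution.

    Outcomes with probability zero contribute nothing to the sum.
    """
    return -sum(p * math.log2(p) for p in probs if p > 0)


fair = entropy_bits([0.5, 0.5])     # fair coin
biased = entropy_bits([0.9, 0.1])   # hypothetical unfair coin

if __name__ == "__main__":
    print(f"fair coin:   {fair:.3f} bits per toss")
    print(f"biased coin: {biased:.3f} bits per toss")
```

The fair coin gives exactly one bit per toss, and the biased coin gives less than half a bit, since its outcome is easier to predict.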

Set notation

I've just started the Mathematical Biostatistics Boot Camp 1 and, to help me remember the set notation introduced in the first lecture, I'll include it here. The sample space, Ω (upper case omega), is the collection of possible outcomes of an experiment, such as a die roll: Ω = {1, 2, 3, 4, 5, 6}. An event, say E, is a subset of Ω,...
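
The same ideas map onto Python's built-in sets. This is only a sketch; the event "the roll is even" is a hypothetical example, not one taken from the lecture.

```python
# Sample space for a single die roll.
omega = {1, 2, 3, 4, 5, 6}

# A hypothetical event E: the roll is even.
event = {2, 4, 6}

# An event is a subset of the sample space.
is_subset = event <= omega

# The complement of E contains the outcomes in omega that are not in E.
complement = omega - event

if __name__ == "__main__":
    print(is_subset, sorted(complement))
```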

Comparing different distributions

Updated 2017 September 7th. The Kolmogorov-Smirnov test can be used to test whether two underlying one-dimensional probability distributions differ. As noted in the Wikipedia article: Note that the two-sample test checks whether the two data samples come from the same distribution. This does not specify what that common distribution is (e.g. whether it's normal or...
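
The statistic behind the two-sample test is the largest vertical gap between the two empirical cumulative distribution functions. The sketch below computes only that statistic in plain Python, not a p-value; the function name `ks_statistic` and the sample values are assumptions for illustration.

```python
from bisect import bisect_right


def ks_statistic(sample_a, sample_b):
    """Two-sample Kolmogorov-Smirnov statistic.

    Returns the maximum absolute difference between the two
    empirical CDFs, evaluated at every observed data point.
    """
    a, b = sorted(sample_a), sorted(sample_b)

    def ecdf(sorted_xs, x):
        # Fraction of observations less than or equal to x.
        return bisect_right(sorted_xs, x) / len(sorted_xs)

    return max(abs(ecdf(a, x) - ecdf(b, x)) for x in a + b)


if __name__ == "__main__":
    print(ks_statistic([1, 2, 3, 4], [3, 4, 5, 6]))
```

A value of 0 means the two empirical CDFs coincide, and a value of 1 means the samples do not overlap at all.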

The Poisson distribution

A Poisson distribution is the probability distribution that results from a Poisson experiment. A probability distribution assigns a probability to possible outcomes of a random experiment. A Poisson experiment has the following properties: The outcomes of the experiment can be classified as either successes or failures. The average number of successes that occurs in a...
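
For reference, the Poisson probability of seeing k successes when the average number of successes is lambda is P(k) = exp(-lambda) * lambda^k / k!. A minimal Python sketch follows; the function name `poisson_pmf` and the rate of 3 are assumptions for illustration.

```python
import math


def poisson_pmf(k, lam):
    """Probability of exactly k successes when the mean count is lam."""
    return math.exp(-lam) * lam ** k / math.factorial(k)


if __name__ == "__main__":
    lam = 3.0  # hypothetical average number of successes per interval
    for k in range(6):
        print(k, round(poisson_pmf(k, lam), 4))
```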

Manual linear regression analysis using R

Updated 2017 September 5th. The aim of linear regression is to find the equation of the straight line that best fits the data points; the best line is the one that minimises the sum of squared residuals of the linear regression model. The equation of a straight line is y = mx + b, where m is the slope or gradient...
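
The least-squares slope and intercept have a closed form: the slope is the sum of cross-deviations divided by the sum of squared deviations of x, and the intercept follows from the means. The sketch below is in Python rather than R, and the function name `fit_line` and the data points are assumptions for illustration.

```python
def fit_line(xs, ys):
    """Return (slope, intercept) of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sum of cross-deviations and sum of squared deviations of x.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    # The fitted line always passes through (mean_x, mean_y).
    intercept = mean_y - slope * mean_x
    return slope, intercept


def sum_squared_residuals(xs, ys, slope, intercept):
    """The quantity that the least-squares line minimises."""
    return sum((y - (slope * x + intercept)) ** 2 for x, y in zip(xs, ys))


if __name__ == "__main__":
    xs, ys = [1, 2, 3, 4], [3.1, 4.9, 7.2, 8.8]  # hypothetical data
    slope, intercept = fit_line(xs, ys)
    print(slope, intercept, sum_squared_residuals(xs, ys, slope, intercept))
```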

Step by step Principal Component Analysis using R

I've always wondered what goes on behind the scenes of a Principal Component Analysis (PCA). I found this extremely useful tutorial that explains the key concepts of PCA and shows the step by step calculations. Here, I use R to perform each step of a PCA as per the tutorial. Our dataset visualised on the...
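
The usual steps of a PCA (centre the data, compute the covariance matrix, find its eigenvalues and eigenvectors, project the data) can be sketched for two-dimensional data in plain Python. This is not the tutorial's worked example: it is written in Python rather than R, it handles only two variables so the eigen-decomposition can be done in closed form, and the function name `pca_2d` and the data are assumptions.

```python
import math


def pca_2d(points):
    """PCA for a list of (x, y) pairs.

    Returns (eigenvalues, (pc1, pc2), scores), with pc1 the direction
    of greatest variance.
    """
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n

    # Step 1: centre the data on the mean.
    centred = [(x - mean_x, y - mean_y) for x, y in points]

    # Step 2: sample covariance matrix [[sxx, sxy], [sxy, syy]].
    sxx = sum(x * x for x, _ in centred) / (n - 1)
    syy = sum(y * y for _, y in centred) / (n - 1)
    sxy = sum(x * y for x, y in centred) / (n - 1)

    # Step 3: eigenvalues of a symmetric 2x2 matrix from trace and determinant.
    trace = sxx + syy
    det = sxx * syy - sxy * sxy
    gap = math.sqrt(max(trace * trace / 4 - det, 0.0))
    eigenvalues = (trace / 2 + gap, trace / 2 - gap)

    # Step 4: unit eigenvectors; the major axis lies at this angle.
    angle = 0.5 * math.atan2(2 * sxy, sxx - syy)
    pc1 = (math.cos(angle), math.sin(angle))
    pc2 = (-math.sin(angle), math.cos(angle))

    # Step 5: project the centred data onto the principal components.
    scores = [(x * pc1[0] + y * pc1[1], x * pc2[0] + y * pc2[1])
              for x, y in centred]
    return eigenvalues, (pc1, pc2), scores


if __name__ == "__main__":
    data = [(2.0, 1.9), (0.5, 0.8), (3.1, 3.0), (1.2, 1.0)]  # hypothetical data
    print(pca_2d(data))
```

The eigenvalues are the variances along each principal component, so their sum equals the total variance of the two original variables.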