Markov chain

A Markov chain is a mathematical system that undergoes transitions from one state to another on a state space in a stochastic (random) manner. Examples of Markov chains include the board game snakes and ladders, where each state represents the position of a player on the board and a player moves between states (different positions...

Continue Reading

Tissue specificity

Wikipedia has a definition of entropy with respect to information theory. The introduction of that article gives an example using a coin toss; if a coin toss is fair, the entropy rate for a fair coin toss is one bit per toss. However, if the coin is not fair, then the uncertainty, and hence the...

Continue Reading

Set notation

I've just started the Mathematical Biostatistics Boot Camp 1 and to help me remember the set notations introduced in the first lecture, I'll include them here: The sample space, (upper case omega), is the collection of possible outcomes of an experiment, such as a die roll: An event, say E, is a subset of ,...

Continue Reading

Comparing different distributions

I recently learned of the Kolmogorov-Smirnov Test and how one can use it to test whether two datasets are likely to be different, i.e. comparing different distributions. Strictly speaking, the p-value gives us a probability of whether or not we can reject the null hypothesis, which is that two datasets have the same distribution. Using...

Continue Reading

The Poisson distribution

A Poisson distribution is the probability distribution that results from a Poisson experiment. A probability distribution assigns a probability to possible outcomes of a random experiment. A Poisson experiment has the following properties: The outcomes of the experiment can be classified as either successes or failures. The average number of successes that occurs in a...

Continue Reading

Manual linear regression analysis using R

Updated 2014 June 24th The aim of linear regression is to find the equation of the straight line that fits the data points the best; the best line is one that minimises the sum of squared residuals of the linear regression model. The equation of a straight line is: where is the slope or gradient...

Continue Reading

Step by step principal components analysis using R

I've always wondered about the calculations behind a principal components analysis (PCA). An extremely useful tutorial explains the key concepts and runs through the analysis. Here I use R to calculate each step of a PCA in hopes of better understanding the analysis. I'm still trying to figure out why the signs are inverted between...

Continue Reading