The h-index

The h-index was conceived back in 2005 as a measure of a researcher's scientific output. My impression is that it is quite widely used now. Basically, you have an h-index of 1 if 1 of your published papers has at least 1 citation, an h-index of 5 if 5 of your published papers have at least 5 citations each, and so on; your h-index is the largest number for which this holds. Your h-index would still be 5 if you published 100 papers but only 5 of them had at least 5 citations (a scenario which I think is quite improbable). If you use Google Scholar, you can create your own Google Scholar profile, which calculates your h-index from all the publications that have your name on them, as well as the total number of citations and your i10-index (the number of papers with your name on them that have at least 10 citations). If you happened to be part of a large consortium and had your name included on one of the major papers published as part of the consortium (which happened to become highly cited), Google Scholar includes all those citations under your name.
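To make the definition concrete, here is a minimal sketch in Python of how the h-index could be calculated from a list of per-paper citation counts. The function names (`h_index`, `count_with_at_least`) and the example numbers are my own and purely illustrative; this is not how Google Scholar actually implements it.

```python
def h_index(citations):
    """Return the largest h such that h papers have at least h citations each."""
    # Rank papers from most cited to least cited.
    ranked = sorted(citations, reverse=True)
    h = 0
    for rank, count in enumerate(ranked, start=1):
        if count >= rank:
            h = rank  # the top `rank` papers all have at least `rank` citations
        else:
            break  # counts only decrease from here, so h cannot grow further
    return h


def count_with_at_least(citations, threshold=10):
    """Count the papers with at least `threshold` citations (hypothetical helper)."""
    return sum(1 for count in citations if count >= threshold)


# The scenario from the text: 100 papers, only 5 of which have 5 citations.
example = [5] * 5 + [0] * 95
print(h_index(example))  # 5
```

Sorting first means the loop can stop at the first paper whose citation count falls below its rank, which is why the 95 uncited papers in the example make no difference to the result.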

Now, I just came back from a talk by a renowned physicist who condemned citation counts and therefore the h-index. He talked about researchers who artificially increase their citation counts, for example by forming small groups of people who cite each other. I agree with him; the number of citations or publications doesn't and shouldn't indicate whether someone is a good researcher or not, in the same way that IQ does not indicate how intelligent someone is. But these things exist because bureaucrats require some sort of measure for hiring and for evaluating. He recommended another citation measure in place of the h-index, but I didn't really grasp the concept; he spent less than a minute explaining it. I think that if it were implemented, people who wanted to exploit it would still be able to. I wanted to ask a question at the end of his talk regarding the balance between science and the governing bodies, but my impression was that it would have been pointless. So as an outlet, I thought I would just write about it here, which may also be quite pointless.

The repetitive landscape of the human and mouse genome

Updated on the 31st May 2013 and updated again on the 25th March 2015 in light of Chris's comment.

RepeatMasker is a program that screens DNA sequences for interspersed repeats and low complexity DNA sequences. The results of running RepeatMasker on the human and mouse genomes are provided via the UCSC Table Browser tool. In this post I will summarise the RepeatMasker results to gain an overview of the repetitive landscape of the human and mouse genomes.
