Updated: 2014 May 14th; added even more tips
I'm in the middle of writing papers and my thesis, so I've been quite busy. However, I wanted to write a quick blog post as an outlet. So here's a list of random command line tips off the top of my head (GNU bash, version 4.1.2(1)-release); I hope that there's at least one tip in this list you didn't know about beforehand.
The find tool is extremely useful; some uses include:
#in all Perl files
#execute a grep quietly
#and look for human_id
#then report which file contains a match
find . -name '*.pl' -exec grep -q 'human_id' {} \; -print

#find broken symbolic links and delete them
find -L . -type l -delete
Randomly shuffle lines using shuf:
for i in {1..10}; do echo $i; done | shuf
9
10
5
2
8
3
4
6
1
7
Use the -A and -B parameters of grep to print the lines before and after your matched line.
#echo out a bunch of lines
echo -e "3\n2\n1\nA\n1\n2\n3"
3
2
1
A
1
2
3

#show me two lines before and two lines after A
echo -e "3\n2\n1\nA\n1\n2\n3" | grep -B2 -A2 A
2
1
A
1
2

#also use -E with grep for extended regular expressions

#Basic vs Extended Regular Expressions
#In basic regular expressions the meta-characters ?, +, {, |, (, and ) lose their
#special meaning; instead use the backslashed versions \?, \+, \{, \|, \(, and \).
#
#Traditional egrep did not support the { meta-character, and some egrep
#implementations support \{ instead, so portable scripts should avoid { in
#grep -E patterns and should use [{] to match a literal {.
#
#GNU grep -E attempts to support traditional usage by assuming that { is not
#special if it would be the start of an invalid interval specification. For
#example, the command grep -E '{1' searches for the two-character string {1
#instead of reporting a syntax error in the regular expression. POSIX.2 allows
#this behavior as an extension, but portable scripts should avoid it.
Use watch to run a command periodically (the default interval is 2 seconds):
#monitor the contents of a folder
#this can be useful for monitoring output files
watch ls -lt

#monitor the statuses of UGE hosts
watch qhost

#other monitoring tips
#use "tail -f" to monitor a file if it is growing
tail -f /var/log/apache/access.log

#check out the last 10 users who have logged into a server
last | head
Use Bash's parameter expansion to split words or refer to a specific letter in a word:
#syntax ${parameter:offset:length}
word=todayisagooddaytodie

#length of word is ${#parameter}
last_letter=`expr ${#word} - 1`

for((i=0; i<${#word}; i++))
do
   printf "%s" "${word:$i:1}"
   if [ $i -eq $last_letter ]
   then
      printf "\n"
   else
      printf " "
   fi
done

#or just use sed
echo $word | sed 's/\(.\)/\1 /g'
Make all new files and directories inherit the parent directory's group by setting the setgid bit:
# now all directories and files created inside /home/dtang
# will inherit the group of /home/dtang
chmod g+s /home/dtang
Using paste to transform data (see this blog post):
cat test.txt
#one
#two
#three
#four
#five
#six
#seven
#eight

#transform the 8 lines into a 2 by 4 table
paste - - - - < test.txt
#one    two     three   four
#five   six     seven   eight

#or 4 by 2
paste - - < test.txt
#one    two
#three  four
#five   six
#seven  eight
To make directories and sub-directories that don't exist use mkdir -p:
#this doesn't work
mkdir test/test
mkdir: cannot create directory 'test/test': No such file or directory

#this works
mkdir -p test/test
ls test
test

#and use braces to create three subdirectories at once
#no spaces after the commas!
mkdir -p test/{one,two,three}
ls test
one  three  two

#list only directories
ls -d */
Use bc to do quick calculations. In the past I used expr for simple arithmetic, but bc is more precise:
#8734 divided by 44

#using expr
expr 8734 / 44
#198

#using bc -l
bc -l<<<8734/44
198.50000000000000000000

#I told you it was more precise than expr
Use readline shortcuts to navigate around the command line. For example, I always use ctrl+w to delete one word, ctrl+l to clear the screen, alt+b and alt+f to move back and forward one word, ctrl+a to move to the start of the line (or ctrl+a a if you are using GNU screen), and ctrl+e to move to the end of the line. A word of warning though: ctrl+w is also the shortcut for closing a tab in Chrome and Opera (my favourite web browsers), and I've occasionally closed tabs by accident because I thought the active window was the terminal. Not to worry, though; ctrl+shift+t re-opens the last closed tab in both Chrome and Opera.
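If you want to see which readline shortcuts are active in your shell, bash's bind builtin can list them. A quick sketch (the exact output format may vary between bash versions):

#list all readline functions that are currently bound to a key
bind -P | grep -v 'is not bound'

#for example, look up the function behind ctrl+w
#(in the default emacs mode this should show unix-word-rubout on "\C-w")
bind -P | grep 'unix-word-rubout'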
My favourite GNU application of all time is screen. I highly recommend using screen. Did I mention it was my favourite tool of all time? One cool trick with screen is to use split screens. Use ctrl+a + S for a horizontal split, ctrl+a + | for a vertical split, ctrl+a + X to remove the split window, and ctrl+a + tab to navigate between split windows. If you have a big physical screen, you can have four concurrent mini-screens.
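If you are new to screen, a minimal session workflow looks something like this (the session name "work" is just a placeholder):

#start a named screen session
screen -S work

#detach from the session with ctrl+a d, then later on...

#list running sessions
screen -ls

#re-attach to the named session
screen -r work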
Redirect the standard error stream to standard output using 2>&1. Some programs output their usage to the standard error stream, such as intersectBed. To be able to read the usage, we can redirect STDERR to STDOUT and pipe it to less:
#we can scroll through the usage
intersectBed 2>&1 | less

#combine with grep -A
#to see what the parameter -r is
intersectBed 2>&1 | grep -A2 "\s-r\s"
        -r      Require that the fraction overlap be reciprocal for A and B.
                - In other words, if -f is 0.90 and -r is used, this requires
                  that B overlap 90% of A and A _also_ overlaps 90% of B.
To pipe output from one program to the next as well as saving a copy, use tee:
#save output from 1 to 10 loop using tee
#then stream to Perl, which will add the numbers up
for i in {1..10}; do echo $i; done | tee one_to_ten.txt | perl -nle '$a+=$_; END {print $a}'
55

cat one_to_ten.txt
1
2
3
4
5
6
7
8
9
10
Show tabs in a file:
# -t = -vT = --show-nonprinting + --show-tabs
cat -t file.tsv | grep --color "\^I"
To read and write to the same file, use sponge:
#clone repository
git clone git://git.kitenet.net/moreutils

#compile
gcc sponge.c -o sponge

#I've prepared a test file, called test.txt
cat test.txt
#xyz

#this doesn't work
cat test.txt | sed 's/xyz/abc/' > test.txt

#empty
cat test.txt

#using sponge
cat test.txt
#xyz

#read test.txt, substitute, and write back to the same file
cat test.txt | sed 's/xyz/abc/' | sponge test.txt

#voila
cat test.txt
abc
Start vim without loading your .vimrc:
vim -u NONE
Quickly switch between two directories by using cd -:
cd /etc

#switch back to the previous directory
cd -
This should be common knowledge, but if you work with a lot of tables, use cut to cut out columns:
#echo out a 2 x 3 table
echo -e "chr1\t1\t2\nchr1\t2\t3"
chr1    1       2
chr1    2       3

#the default delimiter is a tab
#cut out the first two columns
echo -e "chr1\t1\t2\nchr1\t2\t3" | cut -f1,2
chr1    1
chr1    2
Use sort and uniq -c to create a tally:
#echo out a list of numbers
echo -e "11\n3\n1\n2\n2\n4\n6\n1\n1"
11
3
1
2
2
4
6
1
1

#create a tally of the numbers
#there are three 1's, two 2's, etc.
echo -e "11\n3\n1\n2\n2\n4\n6\n1\n1" | sort | uniq -c | sort -k1rn
      3 1
      2 2
      1 11
      1 3
      1 4
      1 6

#you can combine cut, sort and uniq -c
#to create quick summaries of columns
Sorting chromosomes alpha-numerically by using sort -k1,1V:
echo -e "chrY\nchr10\nchr2\nchr1\nchrM" | sort -k1,1V chr1 chr2 chr10 chrM chrY
Sorting by scientific notation:
# not sorted
echo -e "10e-10\n10e-13\n10e-7"
10e-10
10e-13
10e-7

# not sorted properly
echo -e "10e-10\n10e-13\n10e-7" | sort -n
10e-10
10e-13
10e-7

# sorted from smallest to largest
echo -e "10e-10\n10e-13\n10e-7" | sort -g
10e-13
10e-10
10e-7

# sorted from largest to smallest
echo -e "10e-10\n10e-13\n10e-7" | sort -gr
10e-7
10e-10
10e-13
For getting quick statistics at the command line, use the filo package:
for i in {1..10}; do echo $i; done | stats
Total lines:    10
Sum of lines:   55
Ari. Mean:      5.5
Geo. Mean:      4.52872868811677
Median:         5.5
Mode:           1 (N=1)
Anti-Mode:      1 (N=1)
Minimum:        1
Maximum:        10
Variance:       8.25
StdDev:         2.87228132326901
If you use Perl, use perl -le to run a one-liner on the command line. The -e enables Perl code to be executed on the command line. The -l adds a newline to everything you print.
#print out 1 to 100
perl -le 'for(1..100){ print $_ }'
If you want to pipe to Perl, use -n; this saves you the trouble of having to type while(<>){ ... }. See perldoc perlrun for more information.
echo hi | perl -nle 's/hi/bye/; print'
#bye

#I use this a lot to print out line numbers
#learn about Perl's special variables at
#http://www.kichwa.com/quik_ref/spec_variables.html
echo -e 'a\nb\nc\nd\ne\nf' | perl -nle 'print "line $.: $_"'
line 1: a
line 2: b
line 3: c
line 4: d
line 5: e
line 6: f
Use R -e to run R from the command line:
#find the number of combinations without replacement
R -e 'choose(100,2)'

#as suggested in the comments
#to keep R quiet, i.e., turn off the welcome message
R --quiet -e 'choose(100,2)'

#which can be used with vanilla mode
#the command-line option --vanilla implies --no-site-file,
#--no-init-file, --no-environ and (except for R CMD) --no-restore
R --vanilla --quiet -e 'choose(100,2)'
The convert program from ImageMagick is absolutely amazing for manipulating images at the command line.
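For example, converting between formats and resizing images are both one-liners; a couple of sketches (the file names are placeholders):

#convert a PNG to a JPEG
convert input.png output.jpg

#scale an image down to half its original dimensions
convert input.png -resize 50% input_small.png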
If you need to rearrange columns of a file, use awk:
echo -e '3\t4\t1\t2' | awk 'OFS="\t" {print $3, $4, $1, $2}'
1       2       3       4
If you want to print out the middle of a file, use sed. In the past, I used a combination of head and tail:
#I want lines 3 to 7

#using head and tail
for i in {1..10}; do echo $i; done | head -7 | tail -5
3
4
5
6
7

#using sed
for i in {1..10}; do echo $i; done | sed -n '3,7p'
3
4
5
6
7
Copy files from one directory to another, but skip files that already exist, using rsync. This is really useful, for example, when copying Bioconductor packages from an older R installation (e.g. R-3.0.2) to a newer one (e.g. R-3.0.3).
# where org/ is the existing directory
# and dup is where you want to copy the files
rsync --ignore-existing -r -v org/ dup

# copy Bioconductor packages from an older R installation
rsync --ignore-existing -r -v ~/src/R-3.0.2/library/ ~/src/R-3.0.3/library

# then open up R and run the below to update the packages
# update.packages(checkBuilt = TRUE, ask = FALSE)
# source("https://bioconductor.org/biocLite.R")
# biocLite()
Submit data to an HTML form with the POST method and save the response:
#http://www.commandlinefu.com/commands/view/2681/submit-data-to-a-html-form-with-post-method-and-save-the-response
curl -sd 'rid=value&submit=SUBMIT' <URL> > out.html
Need to work out which day you were born? Use cal:
cal -y 1983
#I was born on a Thursday!

#this is probably more useful
#prints out a calendar for every month of the current year
cal -y
Check out http://www.explainshell.com/ to have shell commands and parameters explained.
Use GNU parallel to speed up your work.
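As a minimal sketch (the *.txt glob and the job count are just placeholders), parallel can take its arguments after ::: or from a pipe:

#gzip all text files in the current directory, one job per CPU core by default
parallel gzip ::: *.txt

#the same idea reading file names from a pipe, running 8 jobs at a time
ls *.txt | parallel -j8 gzip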
See Stephen Turner's useful bash one-liners for bioinformatics.
I better get back to writing! I'll update this list periodically.

This work is licensed under a Creative Commons Attribution 4.0 International License.
For R, I would add "--quiet": R --quiet -e 'choose(100,2)'. And I usually also add "--vanilla".
Thanks for the tip!
If I may add one, I’d suggest
awk -F $'\t' '{print NF}' "$@" | uniq -c;
to quickly show the dimensions of a tab-separated table.
Hey Derek,
thanks for the tip. I don't use awk much but I found this guide to understand what your one-liner was doing.
Cheers,
Dave