PLINK

From Dave's wiki

Downloading

Download PLINK from http://pngu.mgh.harvard.edu/~purcell/plink/download.shtml

Listing the unpacked v1.07 Mac (Intel) directory:

ls -1 plink-1.07-mac-intel/
COPYING.txt
README.txt
gPLINK.jar
plink
plink.log
test.map
test.ped

Converting PED into BED

cd plink-1.07-mac-intel/
plink --file test --make-bed --out test

@----------------------------------------------------------@
|        PLINK!       |     v1.07      |   10/Aug/2009     |
|----------------------------------------------------------|
|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
|----------------------------------------------------------|
|  For documentation, citation & bug-report instructions:  |
|        http://pngu.mgh.harvard.edu/purcell/plink/       |
@----------------------------------------------------------@

Web-based version check ( --noweb to skip )
Recent cached web-check found...Problem connecting to web

Writing this text to log file [ test.log ]
Analysis started: Tue Jul 21 17:09:59 2015

Options in effect:
        --file test
        --make-bed
        --out test

2 (of 2) markers to be included from [ test.map ]
6 individuals read from [ test.ped ] 
6 individuals with nonmissing phenotypes
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
3 cases, 3 controls and 0 missing
6 males, 0 females, and 0 of unspecified sex
Before frequency and genotyping pruning, there are 2 SNPs
6 founders and 0 non-founders found
Total genotyping rate in remaining individuals is 1
0 SNPs failed missingness test ( GENO > 1 )
0 SNPs failed frequency test ( MAF < 0 )
After frequency and genotyping pruning, there are 2 SNPs
After filtering, 3 cases, 3 controls and 0 missing
After filtering, 6 males, 0 females, and 0 of unspecified sex
Writing pedigree information to [ test.fam ] 
Writing map (extended format) information to [ test.bim ] 
Writing genotype bitfile to [ test.bed ] 
Using (default) SNP-major mode

Analysis finished: Tue Jul 21 17:09:59 2015
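
The log above shows the web-based version check failing, and the banner notes that --noweb skips it. When scripting the conversion, the command line could be assembled along these lines. This is only a sketch: build_plink_command and run_plink are hypothetical helper names, and it assumes a plink 1.07 executable is on the PATH. Only flags that appear on this page are used.

```python
import subprocess
from typing import List


def build_plink_command(input_prefix: str, output_prefix: str,
                        binary_input: bool = False,
                        noweb: bool = True) -> List[str]:
    """Assemble a PLINK command line as a list of arguments.

    binary_input=False: PED/MAP -> BED/BIM/FAM (--file ... --make-bed)
    binary_input=True:  BED/BIM/FAM -> PED/MAP (--bfile ... --recode)
    """
    command = ["plink"]
    if noweb:
        # Skip the web-based version check mentioned in the log.
        command.append("--noweb")
    if binary_input:
        command += ["--bfile", input_prefix, "--recode"]
    else:
        command += ["--file", input_prefix, "--make-bed"]
    command += ["--out", output_prefix]
    return command


def run_plink(command: List[str]) -> str:
    """Run PLINK and return its standard output (assumes plink is on the PATH)."""
    completed = subprocess.run(command, check=True,
                               capture_output=True, text=True)
    return completed.stdout
```

For example, build_plink_command("test", "test") gives the same conversion as the command above with --noweb added; pass the result to run_plink to execute it.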

PED and MAP file formats

The first six columns of a PED file are:

  1. Family ID
  2. Individual ID
  3. Paternal ID
  4. Maternal ID
  5. Sex (1=male; 2=female; other=unknown)
  6. Phenotype
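
As a rough illustration of this column layout, the following Python sketch splits one PED line into its six leading columns and the allele pairs that follow them. PedRecord and parse_ped_line are names made up for this example and are not part of PLINK. The sketch assumes whitespace-delimited fields and two alleles per marker.

```python
from typing import List, NamedTuple, Tuple


class PedRecord(NamedTuple):
    """One line of a PED file (hypothetical helper type, not a PLINK API)."""
    family_id: str
    individual_id: str
    paternal_id: str
    maternal_id: str
    sex: str        # "1" = male, "2" = female, anything else = unknown
    phenotype: str
    genotypes: List[Tuple[str, str]]  # one (allele, allele) pair per marker


def parse_ped_line(line: str) -> PedRecord:
    """Split a PED line into its six fixed columns plus allele pairs."""
    fields = line.split()  # any run of spaces or tabs separates fields
    if len(fields) < 6:
        raise ValueError("a PED line needs at least six columns")
    alleles = fields[6:]
    if len(alleles) % 2 != 0:
        raise ValueError("alleles must come in pairs, two per marker")
    genotypes = [(alleles[i], alleles[i + 1])
                 for i in range(0, len(alleles), 2)]
    return PedRecord(*fields[:6], genotypes)
```

Calling parse_ped_line("1 1 0 0 1  1  A A  G T") yields a record with two genotypes, ("A", "A") and ("G", "T").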

By default, each line of the MAP file describes a single marker and must contain exactly 4 columns:

  1. Chromosome (1-22, X, Y, or 0 if unplaced)
  2. rs# or SNP identifier
  3. Genetic distance (morgans)
  4. Base-pair position (bp units)
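
The four-column MAP layout can be read in much the same way. A minimal sketch, assuming whitespace-delimited fields; MapRecord and parse_map_line are hypothetical names used only for illustration:

```python
from typing import NamedTuple


class MapRecord(NamedTuple):
    """One line of a MAP file (hypothetical helper type, not a PLINK API)."""
    chromosome: str          # "1".."22", "X", "Y" or "0" if unplaced
    snp_id: str              # rs# or other SNP identifier
    genetic_distance: float  # in morgans
    bp_position: int         # base-pair position


def parse_map_line(line: str) -> MapRecord:
    """Parse one marker line, insisting on exactly four columns."""
    fields = line.split()
    if len(fields) != 4:
        raise ValueError("a MAP line must contain exactly 4 columns")
    chromosome, snp_id, distance, position = fields
    return MapRecord(chromosome, snp_id, float(distance), int(position))
```

For instance, parse_map_line("1 snp1 0 1") describes a marker named snp1 on chromosome 1 at base-pair position 1.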

Comparing the original PED and MAP files with the three files (BED, BIM, and FAM) written by --make-bed:

#original files
cat test.ped 
1 1 0 0 1  1  A A  G T
2 1 0 0 1  1  A C  T G
3 1 0 0 1  1  C C  G G
4 1 0 0 1  2  A C  T T
5 1 0 0 1  2  C C  G T
6 1 0 0 1  2  C C  T T

cat test.map 
1 snp1 0 1
1 snp2 0 2

#converted files 
xxd -b test.bed 
00000000: 01101100 00011011 00000001 10111000 00001111 11001010  l.....
00000006: 00001110                                               .

cat test.bim 
1       snp1    0       1       A       C
1       snp2    0       2       G       T

cat test.fam
1 1 0 0 1 1
2 1 0 0 1 1
3 1 0 0 1 1
4 1 0 0 1 2
5 1 0 0 1 2
6 1 0 0 1 2
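
The xxd dump of test.bed can be decoded by hand. Assuming the standard PLINK 1 binary layout, the first two bytes (01101100 00011011) are a magic number, the third byte (00000001) marks SNP-major mode, and every following byte packs four genotypes as 2-bit codes, lowest bits first: 00 = homozygous for the first allele in the BIM file, 01 = missing, 10 = heterozygous, 11 = homozygous for the second allele. Each SNP starts on a fresh byte, so 6 individuals need 2 bytes per SNP and the file is 3 + 2 x 2 = 7 bytes long. The Python sketch below applies those rules; decode_bed and TEST_BED are hypothetical names, and the bytes are copied from the dump.

```python
from typing import List, Tuple

# The seven bytes shown by `xxd -b test.bed`.
TEST_BED = bytes([0b01101100, 0b00011011, 0b00000001,
                  0b10111000, 0b00001111,
                  0b11001010, 0b00001110])


def decode_bed(data: bytes, n_samples: int,
               alleles: List[Tuple[str, str]]) -> List[List[str]]:
    """Decode a SNP-major BED file into one genotype list per SNP.

    `alleles` holds the (A1, A2) pair of each SNP, in BIM file order.
    """
    if data[:2] != bytes([0x6C, 0x1B]):
        raise ValueError("missing BED magic number")
    if data[2] != 0x01:
        raise ValueError("only SNP-major mode is handled in this sketch")
    bytes_per_snp = (n_samples + 3) // 4  # four genotypes per byte, rounded up
    result = []
    for snp_index, (a1, a2) in enumerate(alleles):
        start = 3 + snp_index * bytes_per_snp
        block = data[start:start + bytes_per_snp]
        if len(block) != bytes_per_snp:
            raise ValueError("BED data is shorter than expected")
        # Map each 2-bit code to a genotype string ("0 0" = missing).
        by_code = {0b00: f"{a1} {a1}", 0b01: "0 0",
                   0b10: f"{a1} {a2}", 0b11: f"{a2} {a2}"}
        calls = []
        for sample in range(n_samples):
            byte = block[sample // 4]
            code = (byte >> (2 * (sample % 4))) & 0b11  # lowest bits first
            calls.append(by_code[code])
        result.append(calls)
    return result
```

With the alleles from test.bim, decode_bed(TEST_BED, 6, [("A", "C"), ("G", "T")]) returns A A, A C, C C, A C, C C, C C for snp1 and G T, G T, G G, T T, G T, T T for snp2, which agrees with the genotype columns of test.ped up to the order of alleles within a heterozygote.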

Converting a BED back to a PED

plink --bfile test --recode --out retest

@----------------------------------------------------------@
|        PLINK!       |     v1.07      |   10/Aug/2009     |
|----------------------------------------------------------|
|  (C) 2009 Shaun Purcell, GNU General Public License, v2  |
|----------------------------------------------------------|
|  For documentation, citation & bug-report instructions:  |
|        http://pngu.mgh.harvard.edu/purcell/plink/       |
@----------------------------------------------------------@

Web-based version check ( --noweb to skip )
Recent cached web-check found...Problem connecting to web

Writing this text to log file [ retest.log ]
Analysis started: Tue Jul 21 17:16:17 2015

Options in effect:
        --bfile test
        --recode
        --out retest

Reading map (extended format) from [ test.bim ] 
2 markers to be included from [ test.bim ]
Reading pedigree information from [ test.fam ] 
6 individuals read from [ test.fam ] 
6 individuals with nonmissing phenotypes
Assuming a disease phenotype (1=unaff, 2=aff, 0=miss)
Missing phenotype value is also -9
3 cases, 3 controls and 0 missing
6 males, 0 females, and 0 of unspecified sex
Reading genotype bitfile from [ test.bed ] 
Detected that binary PED file is v1.00 SNP-major mode
Before frequency and genotyping pruning, there are 2 SNPs
6 founders and 0 non-founders found
Total genotyping rate in remaining individuals is 1
0 SNPs failed missingness test ( GENO > 1 )
0 SNPs failed frequency test ( MAF < 0 )
After frequency and genotyping pruning, there are 2 SNPs
After filtering, 3 cases, 3 controls and 0 missing
After filtering, 6 males, 0 females, and 0 of unspecified sex
Writing recoded ped file to [ retest.ped ] 
Writing new map file to [ retest.map ] 

Analysis finished: Tue Jul 21 17:16:17 2015

cat retest.ped 
1 1 0 0 1 1 A A G T
2 1 0 0 1 1 A C G T
3 1 0 0 1 1 C C G G
4 1 0 0 1 2 A C T T
5 1 0 0 1 2 C C G T
6 1 0 0 1 2 C C T T
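
Note that retest.ped is not identical to test.ped: the extra spaces are gone, and the snp2 genotype of the second individual is now written G T instead of T G. The binary format only records that a genotype is heterozygous, not the order in which its two alleles were typed, so this difference is expected. A small Python sketch of a comparison that ignores both differences follows; parse_rows is a hypothetical helper, and the two strings are copied from the listings on this page.

```python
ORIGINAL_PED = """\
1 1 0 0 1  1  A A  G T
2 1 0 0 1  1  A C  T G
3 1 0 0 1  1  C C  G G
4 1 0 0 1  2  A C  T T
5 1 0 0 1  2  C C  G T
6 1 0 0 1  2  C C  T T
"""

ROUNDTRIP_PED = """\
1 1 0 0 1 1 A A G T
2 1 0 0 1 1 A C G T
3 1 0 0 1 1 C C G G
4 1 0 0 1 2 A C T T
5 1 0 0 1 2 C C G T
6 1 0 0 1 2 C C T T
"""


def parse_rows(ped_text, sort_alleles):
    """Return (six leading columns, allele pairs) for every PED line.

    With sort_alleles=True the two alleles of each genotype are sorted,
    so "T G" and "G T" compare as equal.
    """
    rows = []
    for line in ped_text.strip().splitlines():
        fields = line.split()  # collapses repeated spaces
        alleles = fields[6:]
        pairs = [tuple(alleles[i:i + 2]) for i in range(0, len(alleles), 2)]
        if sort_alleles:
            pairs = [tuple(sorted(pair)) for pair in pairs]
        rows.append((tuple(fields[:6]), pairs))
    return rows


# True when the files hold the same genotypes, ignoring allele order.
same_genotypes = (parse_rows(ORIGINAL_PED, True)
                  == parse_rows(ROUNDTRIP_PED, True))
```

Here same_genotypes is True, while the same comparison with sort_alleles=False differs on the second line only.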