Main Page
Revision as of 14:17, 9 June 2010
Welcome to PileLine Wiki
PileLine (Pileup pipeLine) is a flexible command-line toolkit for efficient handling, filtering, and comparison of locus text files produced by next-generation sequencing experiments (e.g. pileup files from SAMtools). PileLine is designed to be memory efficient: it operates directly on sorted locus files on disk instead of loading them into memory.
PileLine is available for download at: http://sourceforge.net/projects/pileline
Main Features
- Filtering and comparison of locus text files.
- Full annotation of locus files with human dbSNP, HGNC Gene Symbol and Ensembl IDs. Custom annotations are also allowed and may be supplied through standard .BED or .GFF files.
- SIFT and PolyPhen-2 compatible outputs to facilitate the biological interpretation of huge lists of variants.
- Genotyping quality control functionality to estimate performance metrics (Harismendy et al. 2009) for detecting homozygous/heterozygous variants against a given gold standard genotype.
PileLine Commands
Processing Commands
- pileline-fastseek.sh
Prints a given range of a locus file.
- pileline-fastjoin.sh
Joins two positional files.
- pileline-rfilter.sh
Filters (or annotates) a positional file with range-based annotations (in BED format). Each position that falls inside a given range is annotated.
- pileline-sort.sh
Sorts a locus text file by coordinate.
- pileline-genindex.sh
Indexes a FASTA genome so that range-based queries can then be performed on it.
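The reason sorted input matters for memory use can be illustrated with a streaming merge join. The sketch below is not PileLine's implementation; it is a minimal Python illustration that assumes tab-separated lines with chromosome and position in the first two columns, both inputs sorted in the same order (chromosome as text, position as integer), and at most one annotation line per position. The function names and the sample data are hypothetical.

```python
from typing import Iterable, Iterator, List, Optional, Tuple

Key = Tuple[str, int]  # (chromosome, position)


def parse(line: str) -> Tuple[Key, List[str]]:
    """Split a tab-separated locus line and build its sort key."""
    fields = line.rstrip("\n").split("\t")
    return (fields[0], int(fields[1])), fields


def left_outer_join(left: Iterable[str], right: Iterable[str]) -> Iterator[List[str]]:
    """Join two coordinate-sorted inputs, keeping every left line.

    Only one line from each input is held at a time, so memory use
    does not grow with file size.
    """
    right_iter = iter(right)

    def advance() -> Optional[Tuple[Key, List[str]]]:
        nxt = next(right_iter, None)
        return parse(nxt) if nxt is not None else None

    current = advance()
    for line in left:
        key, fields = parse(line)
        # Skip annotation lines that sort before the current locus.
        while current is not None and current[0] < key:
            current = advance()
        if current is not None and current[0] == key:
            # Append the annotation columns (everything after chrom, pos).
            yield fields + current[1][2:]
        else:
            # No match: a left outer join still emits the left line.
            yield fields


# Hypothetical sample data, not real dbSNP content.
loci = ["chr1\t100\tA\n", "chr1\t200\tC\n", "chr2\t50\tG\n"]
annotations = ["chr1\t200\tannotX\n", "chr2\t10\tannotY\n", "chr2\t50\tannotZ\n"]
joined = list(left_outer_join(loci, annotations))
```

Because both inputs are consumed front to back exactly once, the same function works on open file handles as well as on lists.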
Analysis Commands
- pileline-2smc.sh
Looks for genotype discrepancies between two samples (e.g. case vs. control). It can also annotate each output position using a positional file containing custom annotations (e.g. dbSNP), and it produces SIFT- and PolyPhen-2-compatible output files.
- pileline-nsmc.sh
Takes the output of several 2smc comparisons and reports where variants are reproduced.
- pileline-genotest.sh
Estimates NGS genotyping performance by surveying a set of genomic positions whose genotype in the sample is known.
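As a rough illustration of what a two-sample comparison with a minimum depth involves, the following Python sketch reports loci where two samples disagree. It is an assumption-laden simplification, not the logic of pileline-2smc.sh: genotype calls are held in in-memory dictionaries (PileLine itself works on sorted files on disk), and the `Call` layout, the function name and the sample values are all hypothetical.

```python
from typing import Dict, List, Tuple

Locus = Tuple[str, int]  # (chromosome, position)
Call = Tuple[str, int]   # (genotype, read depth); hypothetical layout


def discrepant_loci(a: Dict[Locus, Call], b: Dict[Locus, Call], min_depth: int) -> List[Locus]:
    """Return loci covered in both samples where the genotypes differ.

    A locus is only considered when both samples reach min_depth,
    so low-coverage positions cannot produce spurious discrepancies.
    """
    result = []
    for locus in sorted(a.keys() & b.keys()):
        genotype_a, depth_a = a[locus]
        genotype_b, depth_b = b[locus]
        if depth_a >= min_depth and depth_b >= min_depth and genotype_a != genotype_b:
            result.append(locus)
    return result


# Hypothetical calls for a case (a) and a control (b) sample.
sample_a = {("chr1", 100): ("A/A", 30), ("chr1", 200): ("A/G", 25), ("chr2", 50): ("C/C", 5)}
sample_b = {("chr1", 100): ("A/A", 28), ("chr1", 200): ("G/G", 40),
            ("chr2", 50): ("C/T", 35), ("chr3", 7): ("T/T", 20)}
strict = discrepant_loci(sample_a, sample_b, min_depth=10)
```

Raising `min_depth` trades sensitivity for confidence: the chr2 locus above differs between samples but is dropped at a depth cutoff of 10 because one sample covers it only 5 times.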
Use Cases
- Perform 2 samples comparison
pileline-2smc.sh -a <locusfile_A.txt> -b <locusfile_B.txt> -v <variants_locusfile_A.txt> -w <variants_locusfile_B.txt> -o <out.txt> -d <min_depth>
- Perform n samples comparison
pileline-nsmc.sh --a-samples <locusfile_a1>,<locusfile_a2>,<locusfile_a3> --b-samples <locusfile_b1>,<locusfile_b2>,<locusfile_b3>
- Sort a locus file
pileline-sort.sh -i <input_locus_file.txt> -o <outfile.sorted.txt>
- Annotate a locus file with dbSNP
pileline-fastjoin.sh -a <locus_file.txt> -b dbSNP130.txt --left-outer-join
- Annotate a locus file with genes
pileline-rfilter.sh --annotate -A <locus_file.txt> -b <genes.bed>
- Filter pileup to exon loci
pileline-rfilter.sh -A <locus_file.txt> -b <exons.bed>
- Perform a genotyping test for quality control
## Step 1.
# Create genotest file (required).
pileline-genotest --create-genotest-file <experiment.genotest> -p <locus_file.txt> -g <gold_genotype.sorted> -r <ref_genome.pileline>
## Step 2. QC analysis.
# Generate a metrics table of performance at a given threshold.
pileline-genotest -a <experiment.genotest> -t <snpq_threshold>
# Generate all performance metrics for several thresholds.
pileline-genotest -a <experiment.genotest> --batch-t 0,255,1
# Generate values for a ROC curve plot (output file compatible with the ROCR R package).
pileline-genotest -a <experiment.genotest> --roc
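To make the idea of threshold-dependent performance concrete, here is a small Python sketch of how sensitivity and specificity against a gold standard change with a SNP quality cutoff. It does not reproduce the metrics table that pileline-genotest prints, whose exact contents are not described here. The `Site` record, the rule "call a variant when quality >= threshold", and reading `0,255,1` as start, end and step are all assumptions made for illustration.

```python
from typing import List, NamedTuple, Sequence


class Site(NamedTuple):
    gold_variant: bool  # True if the gold standard says this site is a variant
    snp_quality: int    # quality score of the NGS variant call (hypothetical scale)


class Metrics(NamedTuple):
    threshold: int
    sensitivity: float
    specificity: float


def metrics_at(sites: Sequence[Site], threshold: int) -> Metrics:
    """Count true/false positives and negatives at one quality cutoff."""
    tp = fp = tn = fn = 0
    for site in sites:
        called = site.snp_quality >= threshold  # assumed calling rule
        if called and site.gold_variant:
            tp += 1
        elif called:
            fp += 1
        elif site.gold_variant:
            fn += 1
        else:
            tn += 1
    sensitivity = tp / (tp + fn) if (tp + fn) else 0.0
    specificity = tn / (tn + fp) if (tn + fp) else 0.0
    return Metrics(threshold, sensitivity, specificity)


def sweep(sites: Sequence[Site], start: int, end: int, step: int) -> List[Metrics]:
    """Evaluate every threshold from start to end inclusive."""
    return [metrics_at(sites, t) for t in range(start, end + 1, step)]


# Hypothetical surveyed sites: two true variants, two non-variants.
sites = [Site(True, 200), Site(True, 50), Site(False, 100), Site(False, 10)]
table = sweep(sites, 0, 255, 1)
```

Plotting sensitivity against 1 - specificity across the rows of `table` gives the points of a ROC curve, which is the kind of output the `--roc` option is described as producing.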



