Main Page
Revision as of 16:50, 7 June 2010
Welcome to PileLine Wiki
PileLine (Pileup pipeLine) is a flexible command-line toolkit for efficient handling, filtering, and comparison of locus text files produced by next-generation sequencing experiments (e.g. pileup files from SAMtools). PileLine is designed to be memory efficient: it operates directly on sorted locus files on disk instead of loading them into memory.
PileLine is available for download at: http://sourceforge.net/projects/pileline
Main Features
- Filtering and comparison of locus text files.
- Full annotation of locus files with human dbSNP, HGNC Gene Symbol and Ensembl IDs. Custom annotations are also allowed and may be supplied through standard .BED or .GFF files.
- SIFT and PolyPhen-2 compatible outputs to facilitate the biological interpretation of huge lists of variants.
- Genotyping quality control functionality for estimating performance metrics (Harismendy et al. 2009) on the detection of homozygous/heterozygous variants against a given gold standard genotype.
PileLine Commands
Processing Commands
- pileline-fastseek.sh
Prints a given range of a locus file:
pileline-fastseek.sh -p <locus_file.txt> -s chr10:100:10000
- pileline-fastjoin.sh
Joins two positional files:
XXXXX
- pileline-rfilter.sh
Filters (or annotates) a positional file with range-based annotations (in BED format). Each position that falls inside one of the ranges is kept or, with --annotate, annotated.
pileline-rfilter.sh -A <locus_file.txt> -i <targets.bed> -o <out.txt>
pileline-rfilter.sh --annotate -A <locus_file.txt> -i <annotations.bed> -o <out.txt>
- pileline-genindex.sh
Indexes a FASTA genome so that range-based queries can then be run against it.
pileline-genindex.sh --index -g <fasta> -i <new_index>
pileline-genindex.sh --seek -i <index> -s chr1:1000:2000
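To make the range-based annotation idea behind pileline-rfilter.sh concrete, here is a minimal Python sketch. It is not PileLine's implementation: the function names (build_index, annotate) and the in-memory layout are assumptions made for illustration, and it assumes BED intervals are 0-based and half-open while locus positions are 1-based.

```python
def build_index(bed_rows):
    """Group (chrom, start, end, name) BED rows by chromosome, sorted by start."""
    index = {}
    for chrom, start, end, name in bed_rows:
        index.setdefault(chrom, []).append((start, end, name))
    for intervals in index.values():
        intervals.sort()
    return index


def annotate(chrom, pos, index):
    """Return the names of all intervals that contain the 1-based position.

    A BED interval [start, end) covers the 1-based positions start+1 .. end,
    hence the test start < pos <= end.
    """
    return [name for start, end, name in index.get(chrom, [])
            if start < pos <= end]


# Hypothetical annotation ranges, for illustration only.
bed = [("chr1", 100, 200, "exonA"),
       ("chr1", 150, 300, "exonB"),
       ("chr2", 0, 50, "exonC")]
index = build_index(bed)
hits = annotate("chr1", 160, index)   # position inside both chr1 ranges
```

Filtering corresponds to keeping only positions for which the returned list is non-empty; annotating corresponds to appending the returned names to the position's line.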
Analysis Commands
- pileline-2smc.sh
Looks for discrepancies between the genotypes of two samples (e.g. case vs. control). It can also annotate each output position using a positional file containing custom annotations (e.g. dbSNP), and it produces SIFT- and PolyPhen-2-compatible output files.
- pileline-nsmc.sh
Takes the output of several 2smc comparisons and reports where variants are reproduced.
- pileline-genotest.sh
Estimates NGS genotyping performance by surveying a set of genomic positions whose genotype in the sample is already known.
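As a rough illustration of what checking calls against known genotypes involves, the following Python sketch computes a simple concordance rate. It is an assumption-laden simplification, not the set of metrics pileline-genotest.sh reports: the function names and the "A/G" genotype encoding are hypothetical.

```python
def normalize(genotype):
    """Treat 'A/G' and 'G/A' as the same unordered genotype."""
    return tuple(sorted(genotype.split("/")))


def genotype_concordance(called, known):
    """Fraction of surveyed sites whose called genotype matches the known one.

    Both arguments map (chrom, pos) -> genotype string. Only sites present
    in both are surveyed; returns None if there are none.
    """
    surveyed = [site for site in known if site in called]
    if not surveyed:
        return None
    matches = sum(1 for site in surveyed
                  if normalize(called[site]) == normalize(known[site]))
    return matches / len(surveyed)


# Hypothetical gold-standard and called genotypes, for illustration only.
known = {("chr1", 10): "A/G", ("chr1", 20): "C/C", ("chr1", 30): "T/T"}
called = {("chr1", 10): "G/A", ("chr1", 20): "C/T"}
rate = genotype_concordance(called, known)
```

A fuller quality-control report would typically break this down by homozygous and heterozygous sites and by sequencing depth.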
Use Cases
- Annotate a locus file with dbSNP.
pileline-fastjoin.sh -a <locus_file.txt> -b dbSNP130.txt --left-outer-join
- Annotate a locus file with genes.
pileline-rfilter.sh --annotate -A <locus_file.txt> -b <genes.bed>
- Filter pileup to exon loci.
pileline-rfilter.sh -A <locus_file.txt> -b <exons.bed>
- Perform a two-sample comparison.
pileline-2smc.sh -a <locusfile_A.txt> -b <locusfile_B.txt> -v <variants_locusfile_A.txt> -w <variants_locusfile_B.txt> -o <out.txt> -d <min_depth>
- Perform an n-sample comparison.
pileline-nsmc.sh --a-samples <pileup_a1>,<pileup_a2>,<pileup_a3> --b-samples <pileup_b1>,<pileup_b2>,<pileup_b3>
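The dbSNP annotation use case relies on a left outer join over sorted positional files, which can be done in a single streaming pass without loading either file into memory. The Python sketch below shows that general technique under stated assumptions: both inputs are sorted by the same (chrom, pos) ordering and keys in the right-hand input are unique. The function name and row layout are hypothetical and do not describe PileLine's internals.

```python
def left_outer_join(left_rows, right_rows):
    """Stream-join two iterables of (key, value) pairs sorted by key.

    Yields (key, left_value, right_value), with right_value set to None
    when the right-hand input has no row for that key.
    """
    right_iter = iter(right_rows)
    right = next(right_iter, None)
    for key, left_value in left_rows:
        # Skip right-hand rows that sort before the current left key.
        while right is not None and right[0] < key:
            right = next(right_iter, None)
        if right is not None and right[0] == key:
            yield key, left_value, right[1]
        else:
            yield key, left_value, None


# Hypothetical locus rows and dbSNP-like annotations, for illustration only.
locus = [(("chr1", 5), "A"), (("chr1", 9), "C"), (("chr2", 1), "G")]
snps = [(("chr1", 3), "rs1"), (("chr1", 9), "rs2"),
        (("chr2", 1), "rs3"), (("chr2", 7), "rs4")]
joined = list(left_outer_join(locus, snps))
```

Every locus row appears in the output exactly once, which is why the join is "left outer": positions without an annotation are kept and simply carry an empty annotation.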


