nf-core/sarek
Analysis pipeline to detect germline or somatic variants (pre-processing, variant calling and annotation) from WGS / targeted sequencing
Version 3.2.0. The latest stable release is 3.6.0.
Define where the pipeline should find input data and save output data.
Starting step
  Type: string
Path to comma-separated file containing information about the samples in the experiment.
  Type: string; pattern: ^\S+\.csv$
The output directory where the results will be saved. You have to use absolute paths to storage on Cloud infrastructure.
  Type: string

Most common options used for the pipeline
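The samplesheet path in the input/output options is constrained by the pattern ^\S+\.csv$. As a rough illustration of what that pattern accepts, here is a minimal Python sketch. The helper name `is_valid_samplesheet_path` is hypothetical and not part of the pipeline; only the regular expression is taken from the listing.

```python
import re

# Pattern copied verbatim from the samplesheet entry in this listing.
SAMPLESHEET_PATTERN = re.compile(r"^\S+\.csv$")


def is_valid_samplesheet_path(path: str) -> bool:
    """Return True if the path contains no whitespace and ends in .csv.

    Hypothetical helper for illustration only; it checks the string
    against the documented pattern and does not touch the filesystem.
    """
    return SAMPLESHEET_PATTERN.match(path) is not None


if __name__ == "__main__":
    for candidate in ["samplesheet.csv", "sample sheet.csv", "samplesheet.tsv"]:
        print(candidate, is_valid_samplesheet_path(candidate))
```

Because the pattern uses \S+, any whitespace in the path causes it to be rejected, so a file name containing spaces would need to be renamed.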
Specify how many reads each split of a FastQ file contains. Set to 0 to turn off splitting entirely.
  Type: integer; default: 50000000
Enable when exome or panel data is provided.
  Type: boolean
Path to a target BED file (for whole-exome or targeted sequencing) or an intervals file.
  Type: string
Estimate interval size.
  Type: number; default: 200000
Disable usage of intervals.
  Type: boolean
Tools to use for variant calling and/or for annotation.
  Type: string
Disable specified tools.
  Type: string

Trim FastQ files or handle UMIs
Run FastP for read trimming
  Type: boolean
Remove bp from the 5’ end of read 1
  Type: integer
Remove bp from the 5’ end of read 2
  Type: integer
Remove bp from the 3’ end of read 1
  Type: integer
Remove bp from the 3’ end of read 2
  Type: integer
Removing poly-G tails.
  Type: integer
Save trimmed FastQ file intermediates.
  Type: boolean
Specify UMI read structure
  Type: string
Default strategy with UMI
  Type: string; default: Adjacency
If set, publishes split FASTQ files. Intended for testing purposes.
  Type: boolean

Configure preprocessing tools
Specify aligner to be used to map reads to reference genome.
  Type: string
Save mapped files.
  Type: boolean
Save output from mapping (if --save_mapped), MarkDuplicates and base recalibration as BAM files instead of CRAM.
  Type: boolean
Enable usage of GATK Spark implementation for duplicate marking and/or base quality score recalibration
  Type: string

Configure variant calling tools
Option for concatenating germline VCF files.
  Type: boolean
If true, skips germline variant calling for matched normal to tumor sample. Normal samples without matched tumor will still be processed through germline variant calling tools.
  Type: boolean
Turn on joint germline variant calling for GATK HaplotypeCaller
  Type: boolean
Overwrite ASCAT min base quality required for a read to be counted.
  Type: number; default: 20
Overwrite ASCAT minimum depth required in the normal for a SNP to be considered.
  Type: number; default: 10
Overwrite ASCAT min mapping quality required for a read to be counted.
  Type: number; default: 35
Overwrite ASCAT ploidy.
  Type: number
Overwrite ASCAT purity.
  Type: number
Specify a custom chromosome length file.
  Type: string
Overwrite Control-FREEC coefficientOfVariation
  Type: number; default: 0.05
Overwrite Control-FREEC contaminationAdjustment
  Type: boolean
Define a known contamination value for Control-FREEC
  Type: number
Minimal sequencing quality for a position to be considered in BAF analysis.
  Type: number
Minimal read coverage for a position to be considered in BAF analysis.
  Type: number
Genome ploidy used by Control-FREEC
  Type: string; default: 2
Overwrite Control-FREEC window size.
  Type: number
Copy-number reference for CNVkit
  Type: string
Panel-of-normals VCF (bgzipped) for GATK Mutect2
  Type: string
Index of the panel-of-normals (PON) VCF.
  Type: string
Do not analyze soft clipped bases in the reads for GATK Mutect2.
  Type: boolean
Allow usage of fasta file for annotation with VEP
  Type: boolean
Enable the use of the VEP dbNSFP plugin.
  Type: boolean
Path to dbNSFP processed file.
  Type: string
Path to dbNSFP tabix indexed file.
  Type: string
Consequence to annotate with
  Type: string
Fields to annotate with
  Type: string; default: rs_dbSNP,HGVSc_VEP,HGVSp_VEP,1000Gp3_EAS_AF,1000Gp3_AMR_AF,LRT_score,GERP++_RS,gnomAD_exomes_AF
Enable the use of the VEP LOFTEE plugin.
  Type: boolean
Enable the use of the VEP SpliceAI plugin.
  Type: boolean
Path to SpliceAI raw scores SNV file.
  Type: string
Path to SpliceAI raw scores SNV tabix indexed file.
  Type: string
Path to SpliceAI raw scores indel file.
  Type: string
Path to SpliceAI raw scores indel tabix indexed file.
  Type: string
Enable the use of the VEP SpliceRegion plugin.
  Type: boolean
Add an extra custom argument to VEP.
  Type: string
Path to snpEff cache.
  Type: string
Path to VEP cache.
  Type: string
The output directory where the cache will be saved. You have to use absolute paths to storage on Cloud infrastructure.
  Type: string
VEP output-file format.
  Type: string

Reference genome related files and options required for the workflow.
Name of iGenomes reference.
  Type: string; default: GATK.GRCh38
ASCAT genome.
  Type: string
Path to ASCAT allele zip file.
  Type: string
Path to ASCAT loci zip file.
  Type: string
Path to ASCAT GC content correction file.
  Type: string
Path to ASCAT RT (replication timing) correction file.
  Type: string
Path to BWA mem indices.
  Type: string
Path to bwa-mem2 mem indices.
  Type: string
Path to chromosomes folder used with Control-FREEC.
  Type: string
Path to dbsnp file.
  Type: string
Path to dbsnp index.
  Type: string
Label string for VariantRecalibration (haplotypecaller joint variant calling)
  Type: string
Path to FASTA dictionary file.
  Type: string
Path to dragmap indices.
  Type: string
Path to FASTA genome file.
  Type: string; pattern: ^\S+\.fn?a(sta)?(\.gz)?$
Path to FASTA reference index.
  Type: string
Path to GATK Mutect2 Germline Resource File.
  Type: string
Path to GATK Mutect2 Germline Resource Index.
  Type: string
Path to known indels file.
  Type: string
Path to known indels file index.
  Type: string
  If you use AWS iGenomes, this has already been set for you appropriately.
1st label string for VariantRecalibration (haplotypecaller joint variant calling)
  Type: string
  If you use AWS iGenomes, this has already been set for you appropriately.
Path to known snps file.
  Type: string
Path to known snps file index.
  Type: string
  If you use AWS iGenomes, this has already been set for you appropriately.
Label string for VariantRecalibration (haplotypecaller joint variant calling)
  Type: string
Path to Control-FREEC mappability file.
  Type: string
snpEff DB version.
  Type: number
snpEff genome.
  Type: string
snpEff version.
  Type: string
VEP genome.
  Type: string
VEP species.
  Type: string
VEP cache version.
  Type: number
VEP version.
  Type: string
Save built references.
  Type: boolean
Only build references.
  Type: boolean
Download annotation cache.
  Type: boolean
Directory / URL base for iGenomes references.
  Type: string; default: s3://ngi-igenomes/igenomes/
Do not load the iGenomes reference config.
  Type: boolean

Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
  Type: string; default: master
Base directory for Institutional configs.
  Type: string; default: https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
  Type: string
Institutional config description.
  Type: string
Institutional config contact information.
  Type: string
Institutional config URL link.
  Type: string
Base path / URL for data used in the test profiles
  Type: string; default: https://raw.githubusercontent.com/nf-core/test-datasets/sarek3
Sequencing center information to be added to read group (CN field).
  Type: string
Sequencing platform information to be added to read group (PL field).
  Type: string; default: ILLUMINA

Set the top limit for requested resources for any single job.
Maximum number of CPUs that can be requested for any single job.
  Type: integer; default: 16
Maximum amount of memory that can be requested for any single job.
  Type: string; default: 128.GB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Maximum amount of time that can be requested for any single job.
  Type: string; default: 240.h; pattern: ^(\d+\.?\s*(s|m|h|day)\s*)+$

Less common options for the pipeline, typically set in a config file.
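The memory and time limits in the resource section are plain strings that must match the two patterns listed there. As a rough illustration, the following Python sketch checks candidate values against those patterns before a run is launched. The function names are hypothetical, and the pipeline's own schema validation remains the authority; only the regular expressions are copied from the listing.

```python
import re

# Patterns copied verbatim from the maximum memory and maximum time entries.
MEMORY_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")
TIME_PATTERN = re.compile(r"^(\d+\.?\s*(s|m|h|day)\s*)+$")


def is_valid_memory(value: str) -> bool:
    """Return True if value looks like a memory size such as '128.GB'."""
    return MEMORY_PATTERN.match(value) is not None


def is_valid_time(value: str) -> bool:
    """Return True if value looks like a duration such as '240.h'."""
    return TIME_PATTERN.match(value) is not None


if __name__ == "__main__":
    # The documented defaults should pass their own patterns.
    print(is_valid_memory("128.GB"), is_valid_time("240.h"))
    # A bare number has no unit suffix, so both checks reject it.
    print(is_valid_memory("128"), is_valid_time("240"))
```

Note that the time pattern only accepts the unit spellings s, m, h and day, so a value such as 2d would not match.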
Display help text.
  Type: boolean
Display version and exit.
  Type: boolean
Method used to save pipeline results to output directory.
  Type: string
Email address for completion summary.
  Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Email address for completion summary, only when pipeline fails.
  Type: string; pattern: ^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
  Type: boolean
File size limit when attaching MultiQC reports to summary emails.
  Type: string; default: 25.MB; pattern: ^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
Do not use coloured log outputs.
  Type: boolean
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
  Type: string
Custom config file to supply to MultiQC.
  Type: string
Custom logo file to supply to MultiQC. File name must also be set in the MultiQC config file
  Type: string
Custom MultiQC yaml file containing HTML including a methods description.
  Type: string
Directory to keep pipeline Nextflow logs and reports.
  Type: string; default: ${params.outdir}/pipeline_info
Boolean whether to validate parameters against the schema at runtime
  Type: boolean; default: true
Show all params when using --help
  Type: boolean
Incoming hook URL for messaging service
  Type: string
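
The two email parameters in this section share a single pattern. The Python sketch below shows which addresses that pattern accepts. The helper name `is_valid_email` is hypothetical, and the regular expression is copied from the listing rather than from any general email standard.

```python
import re

# Pattern copied verbatim from the email entries in this listing.
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def is_valid_email(address: str) -> bool:
    """Return True if the address matches the documented pattern.

    Hypothetical helper for illustration; it mirrors the listed pattern
    and makes no claim about whether the mailbox exists.
    """
    return EMAIL_PATTERN.match(address) is not None


if __name__ == "__main__":
    for candidate in ["user@example.org", "user@example.technology", "user@localhost"]:
        print(candidate, is_valid_email(candidate))
```

Because the final group is limited to two to five letters, addresses whose top-level domain is longer than five letters do not match this pattern, even though such domains exist.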