nf-core/methylseq
Methylation (Bisulfite-Sequencing) analysis pipeline using Bismark or bwa-meth + MethylDackel
Define where the pipeline should find input data and save output data.
Path to comma-separated file containing information about the samples in the experiment.
string
^\S+\.csv$
The output directory where the results will be saved. Use absolute paths when writing to storage on cloud infrastructure.
string
Email address for completion summary.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
MultiQC report title. Printed as page header, used for filename if not otherwise specified.
string
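The input/output options above are usually passed on the command line. The following is a minimal sketch of assembling such a call, assuming the flag names `--input`, `--outdir`, `--email` and `--multiqc_title` (this section lists only descriptions, so the names are an assumption based on common nf-core conventions) and hypothetical file paths. The two regular expressions are copied from the patterns listed above.

```python
import re
import shlex

# Patterns copied from the parameter descriptions above.
INPUT_PATTERN = re.compile(r"^\S+\.csv$")
EMAIL_PATTERN = re.compile(
    r"^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$"
)


def build_command(samplesheet, outdir, email=None, title=None):
    """Return an argv list for a pipeline run (flag names are assumed)."""
    if not INPUT_PATTERN.search(samplesheet):
        raise ValueError(f"samplesheet must be a .csv path: {samplesheet!r}")
    cmd = [
        "nextflow", "run", "nf-core/methylseq",
        "--input", samplesheet,
        "--outdir", outdir,
    ]
    if email is not None:
        if not EMAIL_PATTERN.search(email):
            raise ValueError(f"email does not match the schema pattern: {email!r}")
        cmd += ["--email", email]
    if title is not None:
        cmd += ["--multiqc_title", title]
    return cmd


# Hypothetical paths and address, for illustration only.
print(shlex.join(build_command("samplesheet.csv", "/data/results",
                               email="user@example.org")))
```

Note that the email pattern only accepts top-level domains of two to five letters, so longer ones are rejected by this check.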
Options for saving a variety of intermediate files
Save reference(s) to results directory
boolean
Save aligned intermediates to results directory
boolean
Bismark only - Save unmapped reads to FastQ files
boolean
Save trimmed reads to results directory.
boolean
Options for the reference genome indices used to align reads.
Name of iGenomes reference.
string
Path to FASTA genome file
string
^\S+\.fn?a(sta)?(\.gz)?$
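To see which file names the FASTA pattern above accepts, it can be tried against a few candidates. This is a small sketch using Python's `re` module; the file names are made up for illustration.

```python
import re

# Pattern copied from the FASTA parameter above.
FASTA_PATTERN = re.compile(r"^\S+\.fn?a(sta)?(\.gz)?$")


def is_fasta_path(path):
    """True if the path satisfies the FASTA file name pattern."""
    return FASTA_PATTERN.search(path) is not None


# Hypothetical file names.
for name in ["genome.fa", "genome.fna", "genome.fasta",
             "genome.fa.gz", "genome.fas", "my genome.fa"]:
    print(f"{name!r}: {is_fasta_path(name)}")
```

The pattern accepts `.fa`, `.fna` and `.fasta`, optionally gzipped, and rejects paths containing whitespace.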
Path to Fasta index file.
string
^\S+\.fn?a(sta)?.fai$
Path to a directory containing a Bismark reference index.
string
bwameth index filename base
string
Do not load the iGenomes reference config.
boolean
The base path to the iGenomes reference files
string
s3://ngi-igenomes/igenomes/
Alignment tool to use.
string
Presets for working with specific bisulfite library preparation methods.
Preset for working with PBAT libraries.
boolean
Turn on if dealing with MspI digested material.
boolean
Run Bismark in SLAM-seq mode.
boolean
Preset for EM-seq libraries.
boolean
Trimming preset for single-cell bisulfite libraries.
boolean
Trimming preset for the Accel kit.
boolean
Trimming preset for the Zymo kit.
boolean
Bisulfite libraries often require additional base pairs to be removed from the ends of the reads before alignment.
In addition to manually specifying bases to be hard-clipped, the workflow has a number of parameter presets:
| Parameter | 5’ R1 Trim | 5’ R2 Trim | 3’ R1 Trim | 3’ R2 Trim |
|---|---|---|---|---|
| --pbat | 8 | 8 | 8 | 8 |
| --single_cell | 6 | 6 | 6 | 6 |
| --accel | 10 | 15 | 10 | 10 |
| --zymo | 10 | 15 | 10 | 10 |
Note that you can use the --skip_trimming parameter to skip trimming completely.
Further, --skip_trimming_presets disables setting any presets for hard-clipping, thereby uncoupling trimming from specific alignment modes entirely.
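The preset table and the two skip flags can be summarised as a small lookup. The sketch below copies the clipping values from the table; the function name, its arguments, and the rule that a preset replaces manually supplied values are assumptions made for illustration, not a description of the pipeline's internal logic.

```python
# Hard-clipping values in bp, copied from the preset table above:
# (5' R1, 5' R2, 3' R1, 3' R2)
PRESETS = {
    "pbat": (8, 8, 8, 8),
    "single_cell": (6, 6, 6, 6),
    "accel": (10, 15, 10, 10),
    "zymo": (10, 15, 10, 10),
}

FIELDS = ("clip_r1", "clip_r2", "three_prime_clip_r1", "three_prime_clip_r2")


def resolve_clipping(preset=None, manual=(0, 0, 0, 0),
                     skip_trimming=False, skip_trimming_presets=False):
    """Return the clipping values to apply, or None when trimming is skipped.

    Assumption: a preset overrides the manual values unless
    skip_trimming_presets is set.
    """
    if skip_trimming:
        return None
    if preset is None or skip_trimming_presets:
        values = manual
    else:
        values = PRESETS[preset]
    return dict(zip(FIELDS, values))


print(resolve_clipping(preset="accel"))
print(resolve_clipping(preset="accel", skip_trimming_presets=True))
print(resolve_clipping(preset="accel", skip_trimming=True))
```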
Trim bases from the 5’ end of read 1 (or single-end reads).
integer
Trim bases from the 5’ end of read 2 (paired-end only).
integer
Trim bases from the 3’ end of read 1 AFTER adapter/quality trimming.
integer
Trim bases from the 3’ end of read 2 AFTER adapter/quality trimming
integer
Trim bases below this quality value from the 3’ end of the read, ignoring high-quality G bases
integer
Discard reads that become shorter than INT because of either quality or adapter trimming.
integer
Skip presetting trimming parameters entirely
boolean
Parameters specific to the Bismark workflow
Run alignment against all four possible strands.
boolean
Output stranded cytosine report, following Bismark’s bismark_methylation_extractor step.
boolean
Turn on to relax stringency for alignment (set allowed penalty with --num_mismatches).
boolean
Penalty factor used when alignment stringency is relaxed: 0.6 allows a penalty of bp * -0.6, i.e. -60 for 100 bp reads (Bismark default is 0.2).
number
0.6
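The effect of this factor is simple arithmetic. The sketch below assumes, as the `bp * -0.6` wording suggests, that the value acts as the slope of a linear minimum-score function with zero intercept; the function name is hypothetical.

```python
def min_alignment_score(read_length, num_mismatches=0.6):
    """Lowest alignment score still accepted for a read of this length.

    Assumes a linear function with zero intercept: -factor * read length.
    """
    return -num_mismatches * read_length


# Compare the relaxed setting (0.6) with the Bismark default (0.2).
for factor in (0.6, 0.2):
    for length in (50, 100, 150):
        score = min_alignment_score(length, factor)
        print(f"factor={factor} length={length} min_score={score:.0f}")
```

A larger factor lowers the minimum accepted score and therefore tolerates more mismatches or indels per read.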
Specify a minimum read coverage to report a methylation call
integer
Ignore read 2 methylation when it overlaps read 1
boolean
true
Ignore methylation in first n bases of 5’ end of R1
integer
Ignore methylation in first n bases of 5’ end of R2
integer
2
Ignore methylation in last n bases of 3’ end of R1
integer
Ignore methylation in last n bases of 3’ end of R2
integer
2
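The four "ignore" options above restrict which read positions contribute methylation calls. A minimal sketch follows, assuming a value of 0 wherever no default is listed (R1) and the listed default of 2 at both ends of R2; the function name is hypothetical.

```python
def called_positions(read_length, ignore_5prime=0, ignore_3prime=0):
    """0-based read positions that still contribute methylation calls."""
    start = ignore_5prime
    stop = max(start, read_length - ignore_3prime)
    return range(start, stop)


# A 10 bp read 2 with the listed defaults (2 bases ignored at each end).
r2 = called_positions(10, ignore_5prime=2, ignore_3prime=2)
print(list(r2))

# Read 1 with nothing ignored (assumed default of 0).
r1 = called_positions(10)
print(list(r1))
```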
Supply a .gtf file containing known splice sites (bismark_hisat only).
string
^\S+\.gtf(\.gz)?$
Allow soft-clipping of reads (potentially useful for single-cell experiments).
boolean
The minimum insert size for valid paired-end alignments.
integer
The maximum insert size for valid paired-end alignments.
integer
Sample is NOMe-seq or NMT-seq. Runs coverage2cytosine.
boolean
Merges methylation calls for every strand into a single, context-dependent file.
boolean
Call methylation in all three CpG, CHG and CHH contexts.
boolean
Merges methylation metrics of the cytosines in a given context.
boolean
Specify a minimum read coverage for MethylDackel to report a methylation call.
integer
MethylDackel - ignore SAM flags
boolean
Save files for use with methylKit
boolean
Qualimap configurations
A GFF or BED file containing the target regions which will be passed to Qualimap/Bamqc.
string
^\S+\.gff|\.bed(\.gz)?$
Targeted sequencing analysis configurations. They need --run_targeted_sequencing to have an effect.
A BED file containing the target regions
string
^\S+|\.bed(\.gz)?$
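Both region-file patterns in this part of the reference contain a top-level `|`, so each anchor applies to only one branch. The sketch below checks what the patterns accept as written, using an unanchored search, which is how schema patterns are normally evaluated. The file names and the stricter `STRICT_BED` pattern are hypothetical and shown for comparison only.

```python
import re

# Patterns copied verbatim from the two region-file parameters.
QUALIMAP_PATTERN = re.compile(r"^\S+\.gff|\.bed(\.gz)?$")
TARGET_PATTERN = re.compile(r"^\S+|\.bed(\.gz)?$")

# Hypothetical stricter variant, not part of the schema.
STRICT_BED = re.compile(r"^\S+\.bed(\.gz)?$")


def accepts(pattern, value):
    """True if the pattern matches anywhere in the value."""
    return pattern.search(value) is not None


for name in ["targets.bed", "targets.bed.gz", "targets.txt", "regions.gff.tmp"]:
    print(f"{name!r}: qualimap={accepts(QUALIMAP_PATTERN, name)} "
          f"target={accepts(TARGET_PATTERN, name)} "
          f"strict={accepts(STRICT_BED, name)}")
```

As written, the target-region pattern accepts any value that starts with a non-whitespace character, so it does not by itself guarantee a `.bed` extension.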
Run Picard CollectHsMetrics in the targeted analysis
boolean
Skip read trimming.
boolean
Skip deduplication step after alignment.
boolean
Skip FastQC
boolean
Skip MultiQC
boolean
Run preseq/lcextrap tool
boolean
Run qualimap/bamqc tool
boolean
Run advanced analysis for targeted methylation kits with enrichment of specific regions
boolean
Parameters used to describe centralised config profiles. These should not be edited.
Git commit id for Institutional configs.
string
master
Base directory for Institutional configs.
string
https://raw.githubusercontent.com/nf-core/configs/master
Institutional config name.
string
Institutional config description.
string
Institutional config contact information.
string
Institutional config URL link.
string
Less common options for the pipeline, typically set in a config file.
Display version and exit.
boolean
Method used to save pipeline results to output directory.
string
Email address for completion summary, only when pipeline fails.
string
^([a-zA-Z0-9_\-\.]+)@([a-zA-Z0-9_\-\.]+)\.([a-zA-Z]{2,5})$
Send plain-text email instead of HTML.
boolean
File size limit when attaching MultiQC reports to summary emails.
string
25.MB
^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$
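The size pattern above can be checked against candidate values before a run. A short sketch using Python's `re` module; the candidate strings other than the default `25.MB` are made up for illustration.

```python
import re

# Pattern copied from the file size limit parameter above.
SIZE_PATTERN = re.compile(r"^\d+(\.\d+)?\.?\s*(K|M|G|T)?B$")


def is_valid_size(value):
    """True if the value satisfies the size pattern."""
    return SIZE_PATTERN.search(value) is not None


for value in ["25.MB", "25MB", "1.5 GB", "512B", "25", "25 mb"]:
    print(f"{value!r}: {is_valid_size(value)}")
```

The unit is case-sensitive and the trailing `B` is mandatory, so `25` and `25 mb` do not match.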
Do not use coloured log outputs.
boolean
Incoming hook URL for messaging service
string
Custom config file to supply to MultiQC.
string
Custom logo file to supply to MultiQC. The file name must also be set in the MultiQC config file.
string
Custom MultiQC yaml file containing HTML including a methods description.
string
Whether to validate parameters against the schema at runtime.
boolean
true
Base URL or local path to location of pipeline test dataset files
string
https://raw.githubusercontent.com/nf-core/test-datasets/methylseq/
Suffix to add to the trace report filename. Default is the date and time in the format yyyy-MM-dd_HH-mm-ss.
string
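For reference, the default suffix format `yyyy-MM-dd_HH-mm-ss` corresponds to the `strftime` format shown below. This is a sketch of producing an equivalent string in Python; the function name is hypothetical and the pipeline itself generates the value.

```python
from datetime import datetime


def trace_suffix(now=None):
    """Format a timestamp as yyyy-MM-dd_HH-mm-ss."""
    if now is None:
        now = datetime.now()
    return now.strftime("%Y-%m-%d_%H-%M-%S")


# A fixed timestamp for illustration, then the current time.
print(trace_suffix(datetime(2024, 1, 2, 3, 4, 5)))
print(trace_suffix())
```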