Welcome

Join us in person for the October 2024 nf-core hackathon!

This hackathon will be held in advance of the Nextflow Summit 2024 in Barcelona, Spain, and the two events are organised jointly.

Nextflow Summit 2024 - Barcelona

This nf-core hackathon is sponsored by Seqera and Oxford Nanopore Technologies. Many thanks to both companies for making this event possible!

How the hackathon works

What to expect

The nf-core hackathons are collaborative, community-driven events where participants work together on projects.

Everyone is welcome; no prior experience of nf-core contributions is needed. However, we do expect that you have some experience with writing Nextflow code. Note that there is a separate training event for learning Nextflow and nf-core from scratch, running in parallel to the hackathon.

nf-core hackathons are not just about coding! We also have a lot of fun. We typically run a quiz and a bingo, with several small prizes for the winners. In addition to small social games and contests during the event, we also have a social evening.

Prerequisites

Before you arrive at the hackathon, please make sure that you have:

Where to find tasks

We collect all tasks in the “Hackathon October 2024” GitHub project board.

If you are not yet very familiar with the code base, a great starting point is to filter for issues labelled as good first issue.

Once you have found something, get in touch with the project group (e.g. ping them on Slack or find them in the room), assign yourself, and get started.

How to contribute code

We use GitHub to collaborate on code:

  1. Find a hackathon project
  2. Discuss with the group and assign yourself to an issue (create one if it does not exist yet)
  3. Fork the repository
  4. Work on a branch in your fork
  5. Once ready, open a PR to the parent repository

Note

Only assign yourself to an issue if you are ready to work on it, typically one issue at a time.

See Helpful resources below for more information.
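
The fork-and-branch steps above can be sketched as a short sequence of git commands. The Python helper below only assembles and prints those commands; it is a minimal illustration, not an official nf-core tool. The function name, the username, and the branch name are hypothetical placeholders, and the fork itself is assumed to have been created on the GitHub website beforehand.

```python
def fork_workflow_commands(user, repo, branch, org="nf-core"):
    """Return git commands for a fork-and-branch contribution.

    All arguments are placeholders supplied by the caller. The fork of
    `org/repo` under `user` is assumed to exist already on GitHub.
    """
    fork_url = f"https://github.com/{user}/{repo}.git"
    upstream_url = f"https://github.com/{org}/{repo}.git"
    return [
        f"git clone {fork_url}",                     # copy your fork locally
        f"cd {repo}",
        f"git remote add upstream {upstream_url}",   # keep a link to the parent repository
        f"git switch -c {branch}",                   # work on a branch (needs git 2.23 or newer)
        "git add --all",                             # stage your changes
        'git commit -m "Describe your change"',
        f"git push --set-upstream origin {branch}",  # publish the branch to your fork
    ]


if __name__ == "__main__":
    # "your-username" and "my-hackathon-fix" are example values only.
    for command in fork_workflow_commands("your-username", "modules", "my-hackathon-fix"):
        print(command)
```

After the final push, the pull request to the parent repository is opened from the GitHub page of your fork.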

Schedule

The hackathon will run from Monday 28th October to Wednesday 30th October. Registration opens on Monday at 9am. We will start at 10am every day and close at 5pm on Monday and Tuesday. On Wednesday we will wrap up at 1pm. For a complete schedule, visit the summit agenda page.

Projects

For this hackathon we will use projects, as opposed to the broader groups of previous events. At the end of each day, we will group the projects into categories and sum up their progress. Projects can be anything from:

  • Adding new features to existing pipelines
  • Adding and improving components (modules / subworkflows)
  • Improving the website and nf-core tooling
  • Creating entirely new pipelines
  • Discussing and planning community initiatives
  • Working on special interest group topics
  • …anything else

You can bring your own favourite topic or choose from a list of open issues in the community. Each project has a lead who can point you in the right direction.

You don’t need to commit to a single project and are free to move around groups and projects throughout the event.

Submit a new project

New projects can be proposed in the #hackathon-oct-2024 Slack channel. Use the project proposal form to submit an idea. After some community discussion, you can add your project to the list below so that others can find it.

Tip

If you are planning to start a new pipeline, please propose it on the #new-pipelines Slack channel ahead of the hackathon start to avoid delays during the event.

Once a project is approved, the project leaders should add it to this webpage and add issues to the GitHub project board ahead of the hackathon. If appropriate, label them as good first issue.

Join a project

Joining a project is as simple as turning up and getting in touch with the group. If you don’t know where to find them in the room, ping the project lead on Slack.

You can move freely between projects throughout the event.

List of projects

Pipelines

nf-core/seqinspector

pipelines
Slack: #seqinspector

nf-core/seqinspector aims to be a pipeline for initial quality control of sequencing data. Input is either FASTQ files or a run folder, and the planned output is a global MultiQC report and, if desired, MultiQC reports for groups that are defined in the sample sheet. By joining this group you can:

  1. Add existing modules to the pipeline (beginner friendly)
  2. Write a new module for your preferred QC tool if it doesn’t exist yet (intermediate level)
  3. Start implementing long-read methods (advanced level; we have only limited experience in the group, so help would be more than appreciated!)
  4. Work on how the data is displayed in the MultiQC reports (beginner - intermediate level)
  5. Write documentation

Goal

Work towards a first release

Group Leaders

Single cell analysis: nf-core/scrnaseq and nf-core/scdownstream

pipelines

nf-core/scrnaseq transforms FASTQ files into expression matrices, while nf-core/scdownstream receives expression matrices as input and performs quality control, integration, clustering and more. This project has two main parts:

  1. Move some local modules from scdownstream to nf-core/modules so that they can be re-used in scrnaseq.
  2. Add new features to scdownstream. See the open enhancement issues for details.

By joining this group you can:

  1. Create new modules and move existing local modules to nf-core/modules (beginner friendly)
  2. Integrate the newly added modules into the pipelines (beginner - intermediate level)
  3. Improve the scdownstream MultiQC report and documentation (beginner - intermediate level)
  4. Look into a potential extension of scdownstream to multi-omics analyses (advanced level, has not yet been tackled but help would be great!)

Prior experience with single cell analysis is not required, but helpful.

Goal

Move shared functionality to nf-core/modules and make nf-core/scdownstream ready for a 1.0 release

Group Leaders

nf-core/deepmodeloptim

pipelines
Slack: #deepmodeloptim

We proposed a new pipeline to nf-core, initially called STIMULUS (available here: https://github.com/mathysgrapotte/stimulus). This pipeline aims to explore how the way input data is processed affects what deep learning models learn (see GitHub or the #deepmodeloptim channel on the nf-core Slack for details).

Working on deepmodeloptim at the hackathon will mostly involve nf-core-izing the pipeline and making it easy to contribute to.

Goal

Reach nf-core/deepmodeloptim v1.0.0 release!

Group Leaders

nf-core/sarek

pipelines
Slack: #sarek

General work on sarek with a focus on maintenance:

  • Improve input validation
  • Improve documentation
  • Fix bugs

If you want to get started on a new addition, this is a great time to come by and chat.

Goal

Improve input validation, usability, and docs

Group Leaders

Sarek (preprocessing) goes GPU

pipelines
Slack: #parabricks

Variant calling has multiple time-consuming steps that could be faster if we used GPUs instead of CPUs. A first step towards that could be the integration of Parabricks, which is software developed by NVIDIA. The modules are ready and need to be integrated into sarek.

This project can also expand to other pipelines and include more tools that allow execution on GPUs.

Goal

Integration of Parabricks in sarek

Group Leaders

nf-core/genomeqc

pipelines
Slack: #genomeqc

A pipeline to compare and contrast genome assemblies and their annotations.

When you sequence a new genome, or wish to use a published genome, it is important to gauge the quality of the assembly. There are basic tools, such as BUSCO (completeness), QUAST (contiguity), and AGAT (general statistics such as numbers of chromosomes, genes, etc.), but there is no nf-core pipeline that performs all of these tasks and also documents TE content, telomere locations, and contamination level.

In addition, we would like to plot these statistics on a phylogenetic tree to help compare them. See the #genomeqc nf-core Slack channel to join!

Goal

Write a first draft of this pipeline

Group Leaders

nf-core/variantbenchmarking

pipelines
Slack: #variantbenchmarking

This is a variant benchmarking pipeline. For now, the structural variant and small variant benchmarking parts for germline samples are working. There are plans to add somatic benchmarking, including the creation of a truth file for structural variants, and to add more benchmarking tools. The pipeline needs to be tested extensively and a set of reviews is required.

Goal

Publish the first version of this pipeline

Group Leaders

nf-core/proteinfold

pipelines
Slack: #proteinfold

We have been adding new reporting capabilities to the pipeline lately, and we would like to finish adding these features and test them during the hackathon.

Apart from this, we would like to explore which other tools could be added to the pipeline (e.g. RoseTTAFold or OmegaFold) and discuss with the community what the future of the pipeline should be in terms of development.

Of course, as with any other pipeline, we will also look for “housekeeping” issues that people joining the group could work on during the hackathon.

If you are interested, please join the group!

Goal

Towards release 1.2.0 and beyond

Group Leaders

nf-core/phaseimpute

pipelines
Slack: #phaseimpute

nf-core/phaseimpute is a multi-step pipeline dedicated to genetic imputation, from simulation to validation. By joining this group you can:

  1. Contribute updated subworkflows and modules back to nf-core (beginner - intermediate level)
  2. Improve documentation and enhance readability (beginner)
  3. Assist with pre-release review (beginner to expert)
  4. Add support for SNP chip array data (simulation and imputation) (expert)
  5. Add support for sex chromosome imputation (expert)

Goal

Work towards a first release

Group Leaders

nf-core/differentialabundance

pipelines

This is a pipeline for downstream gene expression analysis, with a main focus on differential expression analysis. Currently, we are working on a new branch dev-ratio with two objectives:

  • add new methods
  • convert the pipeline into a modular and unified framework that accommodates different ways of performing differential analysis.

By joining this group you can:

  1. Implement and add nf-core modules that wrap other methods for performing differential analysis. Some modules (e.g. propd) have already been created but need to be updated (beginner friendly).
  2. Add these new modules to the pipeline and update the pipeline parameters correspondingly (beginner friendly).
  3. Add the existing modules to the new modular subworkflow in a way that will reproduce the original pipeline’s behaviour (beginner - intermediate level).
  4. Update the code required to generate the plots and reports (beginner - intermediate level).
  5. Restructure the pipeline architecture (intermediate level).
  6. Update documentation (beginner friendly).

Goal

Move towards the next release with the restructured pipeline and the new methods!

Group Leaders

Components

Image processing pipelines

components

This project focuses on creating modules and adding functionality to (highly multiplexed) imaging pipelines - nf-core/mcmicro and nf-core/molkart. By joining this group you can:

  1. Create new segmentation modules for nf-core/modules (beginner friendly) …
  2. … and integrate them into the pipelines (beginner - intermediate level)
  3. Support the spot detection implementation for MCMICRO (intermediate level)
  4. Work on improved QC metric reporting for both pipelines (beginner - intermediate level)
  5. Help us address open issues (beginner - intermediate level)

Goal

Work towards next Molkart release and implement an additional MCMICRO segmentation option.

Group Leaders

Update subworkflows meta.yml

components

We will work on updating the meta.yml file of subworkflows to have the proper description for the structure of input and output channels.

Check the issue describing the tasks and tracking the progress.

Beginners and first-time contributors are welcome!

Goal

Update the meta.yml file of all nf-core subworkflows

Group Leaders

Software packaging: ARM

components

Looking into making more packages build natively for linux/arm64 and improving performance of the important ones for faster and cheaper runs on ARM machines such as AWS Graviton.

Goal

Optimise the run time of at least one tool

Group Leaders

Tooling

References

tooling
Slack: #references

Continuing the discussion from last year’s hackathon, this group will work on tasks related to the handling and management of reference genomes. Some work has started with nf-core/references but it is at a very early stage. This hackathon group will work towards agreeing on a fundamental structure and plan.

Goal

Replacing iGenomes, then world domination.

Group Leaders

Tube map polishing

tooling

Everybody loves the nf-core tube maps, but they also need some special care to gleam in all their beauty. Come join us and refine your workflow representations to their full glory. It doesn’t matter if you already have a finished version and want a thorough review (🦅👀) or want to brainstorm some ideas and concepts to start a new one, this group is for you. Disclaimer: This will not be an introduction to vector graphic tools. You bring the tools, we bring the eyes and brains.

Goal

Make the tube maps in pipelines even more fabulous.

Group Leaders

nf-test plugins

tooling
Slack: #nft-plugins

nf-test is a very important piece of our modules, used for continuous integration testing of all our modules. However, writing tests for some file types / more advanced tests can be difficult. In this group we will try and kickstart the creation of nf-test plugins to make our testing a lot easier. This will mainly involve the development of nft-utils, the improvement of nft-bam and hopefully the creation of completely new nf-test plugins.

Goal

Fully develop nft-utils and start new nf-test plugins

Group Leaders

Infrastructure around nf-core/modules

tooling
Slack: #tools

For Pythonistas 🎉 Working on nf-core/tools by developing infrastructure related to nf-core/modules.

Issues:

  • Make nf-core modules create use the same structure for local modules as for remote modules (beginner friendly)
  • Fix bug: nf-core modules update deletes template files when there is a patch file (intermediate level)
  • Fix the structure of modules meta.yml files (intermediate level)

Goal

Develop infrastructure for nf-core/modules

Group Leaders

Special Interest Groups

Regulatory

Special Interest Group
Slack: #regulatory

This group will work on tasks for the #regulatory special interest group. Most likely, we will draw up more detailed plans for tackling the different needs of subgroups within regulatory, along with a strategy for aligning those subgroups. We will also prepare plans and proposals for the wider community on what we could add to help, for example, auditors or authorities better understand what nf-core already provides.

Goal

Clarifying the scope of the regulatory special interest group and discussing who will tackle the different subfields of the regulatory space, for future improvements to nf-core guidelines and pipelines.

Group Leaders

Meta-omics

Special Interest Group
Slack: #meta-omics

We will work together on any or all of the meta-omics pipelines (mag, ampliseq, metatdenovo, magmap, eager, funcscan, createtaxdb, taxprofiler, etc.), extending functionality but also discussing how they can be made to integrate better with each other and with a number of downstream pipelines, both within and outside nf-core.

We will have a number of documentation and/or new-module requests for newcomers to get their hands dirty, and larger implementation tasks for more advanced developers.

Goal

Widened understanding of the implementation details of all pipelines in a larger group of developers.

Group Leaders

Teaching

Special Interest Group
Slack: #training

nf-core pipelines are playing a crucial role in standardising bioinformatics workflows and their user base is growing every day. Engaging training materials are essential to complement the pipelines that are being released. To achieve this, a problem-based learning approach could be developed for several nf-core pipelines, where tutorials follow a storyline based on carefully simulated data. A first attempt at this approach has been drafted for nf-core/sarek (https://lescai-teaching.github.io/sarek-tutorial) and nf-core/rnaseq (https://lescai-teaching.github.io/rnaseq-tutorial). This project intends to gather people who are willing to discuss and further develop similar materials for these and other nf-core pipelines.

Goal

Develop tailored course materials and hands-on tutorials.

Group Leaders

Social activities

During the hackathon, we will have light-hearted fun and games! Special prizes are up for grabs for the winners!

More details will be revealed at the start of the event, but you can expect a quiz, bingo and sock-related activities.

Helpful resources

Bytesize talks

There are many talks about Nextflow and nf-core on the nf-core Bytesize playlist. In particular, the talk about using git and GitHub in an nf-core environment may be useful.

Tutorials and docs on the nf-core website

Help with coding and nf-core tools

Adding to pipelines

Creating a new pipeline

Code of conduct

Please note that our Code of Conduct applies to the Hackathon, and all participants need to abide by our guidelines to participate. We should all feel responsible for making nf-core events safe and fun for everyone.

You can also report any CoC violations directly to safety@nf-co.re. Our safety officers will contact you to follow up on your report.

In case of an immediate perceived threat at the hackathon, please reach out to any of the staff or organizers on site.