Commit 10268535 authored by Mattias

Add notebook and snakefile

parent 51f29a3b
from Bio.Seq import Seq

PROJECT = "PRJEB19467"
SAMPLES, = glob_wildcards("PRJEB19467/{sample}_1.fastq.gz")

# Amplicon primers (IUPAC codes) and their reverse complements
FW = Seq("CCTACGGGNGGCWGCAG")
FW_RC = FW.reverse_complement()
RV = Seq("GACTACHVGGGTATCTAATCC")
RV_RC = RV.reverse_complement()

rule all:
    input: expand("{project}/cutadapt/{sample}_1.fastq", project=PROJECT, sample=SAMPLES)

# Trim primers from both reads of each pair with cutadapt.
# Files are named fwd/rev because "reverse" collides with a list method name,
# which recent Snakemake versions may reject as a reserved name.
rule filter_primers:
    input:
        fwd="{project}/{sample}_1.fastq.gz",
        rev="{project}/{sample}_2.fastq.gz"
    output:
        fwd="{project}/cutadapt/{sample}_1.fastq",
        rev="{project}/cutadapt/{sample}_2.fastq"
    log: "{project}/cutadapt/cutadapt_{sample}.log"
    # "> {log} 2>&1" sends both stdout and stderr to the log file
    shell: "cutadapt -g {FW} -a {FW_RC} -G {RV} -A {RV_RC} -n 2 -o {output.fwd} -p {output.rev} {input.fwd} {input.rev} > {log} 2>&1"
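
The primers above contain IUPAC ambiguity codes (N, W, H, V), so their reverse complements must map those codes too, not only A/C/G/T. The following is a minimal pure-Python sketch of that operation, useful for checking by hand which sequences end up on the cutadapt command line. The helper name `reverse_complement` and the table `IUPAC_COMPLEMENT` are illustrative and not part of the commit; the Snakefile itself relies on Biopython's `Seq.reverse_complement()`.

```python
# Complement table for the IUPAC nucleotide alphabet, ambiguity codes included
# (e.g. R = A/G pairs with Y = C/T, while W, S and N are their own complements).
IUPAC_COMPLEMENT = str.maketrans(
    "ACGTRYKMBVDHSWN",
    "TGCAYRMKVBHDSWN",
)


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an IUPAC nucleotide string."""
    # Complement every base, then reverse the result.
    return seq.upper().translate(IUPAC_COMPLEMENT)[::-1]


# Primer sequences copied from the Snakefile.
FW = "CCTACGGGNGGCWGCAG"
RV = "GACTACHVGGGTATCTAATCC"

print(reverse_complement(FW))  # CTGCWGCCNCCCGTAGG
print(reverse_complement(RV))  # GGATTAGATACCCBDGTAGTC
```

Applying the function twice returns the original primer, which is a quick sanity check that the complement table is symmetric.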