NAME

    BioX::Workflow - A very opinionated template based workflow writer.

SYNOPSIS

    Most of the functionality can be accessed through the biox-workflow.pl
    script.

        biox-workflow.pl --workflow /path/to/workflow.yml

    This module was written with Bioinformatics workflows in mind, but
    should be extensible to any sort of workflow or pipeline.

Usage

    Please check out the full Usage Docs at BioX::Workflow::Usage

In Code Documentation

    You shouldn't really need to look here unless you have some reason to
    do some serious hacking.

 Attributes

    Moose attributes. Technically any of these can be changed, but doing so
    may break everything.

 comment_char

 coerce_paths

 select_rules

    Select a subsection of rules

  resample

    Boolean value. When true, get new samples based on indir/file_rule.

    Samples are found at the beginning of the workflow, based on the global
    indir variable and the file_rule.

    Chances are you don't want to set resample to true, because these files
    probably won't exist outside of the input directory until the pipeline
    is run.

    One example of doing so, shown in the gemini.yml in the examples
    directory, is looking for uncompressed files, .vcf extension,
    compressing them, and then resampling based on the .vcf.gz extension.

 find_by_dir

    Use this option when your sample names are given by directory. The
    default is to find samples by filename.

        /SAMPLE1
            SAMPLE1_r1.fastq.gz
            SAMPLE1_r2.fastq.gz
        /SAMPLE2
            SAMPLE2_r1.fastq.gz
            SAMPLE2_r2.fastq.gz
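
    As a rough sketch of the idea (not the module's actual code), a sample
    name can be taken from the parent directory of each file. The helper
    name samples_by_dir is hypothetical.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(basename dirname);

    # Hypothetical helper: take the sample name from each file's parent
    # directory and keep only the first occurrence of each name.
    sub samples_by_dir {
        my @files = @_;
        my %seen;
        return grep { !$seen{$_}++ } map { basename( dirname($_) ) } @files;
    }

    my @samples = samples_by_dir(
        '/SAMPLE1/SAMPLE1_r1.fastq.gz',
        '/SAMPLE1/SAMPLE1_r2.fastq.gz',
        '/SAMPLE2/SAMPLE2_r1.fastq.gz',
        '/SAMPLE2/SAMPLE2_r2.fastq.gz',
    );
    print "$_\n" for @samples;    # prints SAMPLE1, then SAMPLE2
    ```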

 by_sample_outdir

        outdir/
        /outdir/SAMPLE1
            /rule1
            /rule2
            /rule3
        /outdir/SAMPLE2
            /rule1
            /rule2
            /rule3

    Instead of

        /outdir
            /rule1
            /rule2

    This feature is not particularly well supported, and may break when
    mixed with other methods, particularly --resample
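
    The two layouts differ only in whether the sample name sits between
    outdir and the rule name. A minimal sketch, assuming a hypothetical
    helper rule_outdir that is not part of the module:

    ```perl
    use strict;
    use warnings;
    use File::Spec;

    # Hypothetical helper: build the output directory for one rule and
    # sample. With $by_sample true, the sample directory is placed between
    # outdir and the rule directory.
    sub rule_outdir {
        my ( $outdir, $sample, $rule, $by_sample ) = @_;
        return $by_sample
            ? File::Spec->catdir( $outdir, $sample, $rule )
            : File::Spec->catdir( $outdir, $rule );
    }

    # On Unix-like systems these print /outdir/SAMPLE1/rule1 and
    # /outdir/rule1 respectively.
    print rule_outdir( '/outdir', 'SAMPLE1', 'rule1', 1 ), "\n";
    print rule_outdir( '/outdir', 'SAMPLE1', 'rule1', 0 ), "\n";
    ```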

  min

    Print the workflow as 2 files.

        #run-workflow.sh
        export SAMPLE=sampleN && ./run_things

  number_rules

    Instead of

        outdir/
            rule1
            rule2

    the rule directories are numbered:

        outdir/
            001-rule1
            002-rule2
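
    The numbering in the listing above can be reproduced with sprintf. This
    is only an illustration; numbered_rule_dirs is a hypothetical helper,
    and the module may build the names differently.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: prefix each rule name with its 1-based position,
    # zero-padded to three digits.
    sub numbered_rule_dirs {
        my @rules = @_;
        my $n     = 0;
        return map { sprintf( '%03d-%s', ++$n, $_ ) } @rules;
    }

    print "$_\n" for numbered_rule_dirs(qw(rule1 rule2));
    # prints 001-rule1, then 002-rule2
    ```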

  auto_name

    Create the output directory based on the rule name.

        global:
            - outdir: /home/user/workflow/processed
        rule:
            normalize:
                process: dostuff {$self->indir}/{$sample}.in >> {$self->outdir}/{$sample}.out

    This would create the directory
    /home/user/workflow/processed/normalize (if it doesn't exist).
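
    The effect is roughly that of joining outdir and the rule name, then
    creating the result when it is missing. The sketch below uses only core
    modules and a temporary directory as a stand-in for the global outdir;
    it is not the module's own code.

    ```perl
    use strict;
    use warnings;
    use File::Path qw(make_path);
    use File::Spec;
    use File::Temp qw(tempdir);

    # Stand-in for the global outdir; removed again when the script exits.
    my $outdir = tempdir( CLEANUP => 1 );

    # The rule name becomes the last path component.
    my $rule_outdir = File::Spec->catdir( $outdir, 'normalize' );

    # Create the directory only if it doesn't exist yet.
    make_path($rule_outdir) unless -d $rule_outdir;

    print "created $rule_outdir\n" if -d $rule_outdir;
    ```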

  auto_input

    This is similar to the auto_name option in BioX::Workflow. With
    auto_input, each rule's input should be the previous rule's output.
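
    A minimal sketch of this chaining, assuming each rule is a plain hash
    with INPUT and OUTPUT keys. The helper chain_inputs, the rule names and
    the file names are all hypothetical, and the module's own bookkeeping
    may differ.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: give every rule after the first an INPUT equal
    # to the OUTPUT of the rule before it.
    sub chain_inputs {
        my @rules = @_;
        my $previous_output;
        for my $rule (@rules) {
            $rule->{INPUT} = $previous_output if defined $previous_output;
            $previous_output = $rule->{OUTPUT};
        }
        return @rules;
    }

    my @rules = chain_inputs(
        { name => 'compress', INPUT => 'sample.vcf', OUTPUT => 'sample.vcf.gz' },
        { name => 'index', OUTPUT => 'sample.vcf.gz.tbi' },
    );
    print "$_->{name}: $_->{INPUT} -> $_->{OUTPUT}\n" for @rules;
    ```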

  verbose

    Output some more things

  wait

    Print "wait" at the end of each rule

  override_process

        local:
            - override_process: 1

  indir outdir

  create_outdir

  INPUT OUTPUT

    Special variables that can have input/output

    These variables are also used in BioX::Workflow::Plugin::Drake

  file_rule

    Rule to find files

  No GetOpt Here

  attr

    attributes read in from runtime

  global_attr

    Attributes defined in the global section of the yaml file

  local_attr

    Attributes defined in the rules->rulename->local section of the yaml
    file

  local_rule

  infiles

    Infiles to be processed

  samples

  process

    Do stuff

  key

    Do stuff

  workflow

    Path to the workflow file. This must be a YAML file.

  rule_based

    This is the default. The outer loop is over the rules, not the samples.

  sample_based

    Not the default. The outer loop is over the samples, not the rules. To
    use it, set it in your global values or on the command line with
    --sample_based 1.

    If you ever have resample: 1 in your config you should NOT set this
    value to true!
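
    The difference between rule_based and sample_based is only the nesting
    order of the two loops. The following sketch lists rule/sample pairs in
    the order they would be visited; write_order is a hypothetical helper,
    not a method of the module.

    ```perl
    use strict;
    use warnings;

    # Hypothetical helper: return "rule:sample" pairs in visiting order.
    # $sample_based decides which loop is the outer one.
    sub write_order {
        my ( $rules, $samples, $sample_based ) = @_;
        my @order;
        if ($sample_based) {
            for my $sample (@$samples) {
                for my $rule (@$rules) {
                    push @order, "${rule}:${sample}";
                }
            }
        }
        else {
            for my $rule (@$rules) {
                for my $sample (@$samples) {
                    push @order, "${rule}:${sample}";
                }
            }
        }
        return @order;
    }

    my @rules   = qw(rule1 rule2);
    my @samples = qw(SAMPLE1 SAMPLE2);
    print join( ' ', write_order( \@rules, \@samples, 0 ) ), "\n";
    # rule1:SAMPLE1 rule1:SAMPLE2 rule2:SAMPLE1 rule2:SAMPLE2
    print join( ' ', write_order( \@rules, \@samples, 1 ) ), "\n";
    # rule1:SAMPLE1 rule2:SAMPLE1 rule1:SAMPLE2 rule2:SAMPLE2
    ```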

 stash

    This isn't ever used in the code. It's just there in case you want to
    do some things with override_process.

    It uses Moose::Meta::Attribute::Native::Trait::Hash and supports all
    the methods.

            set_stash     => 'set',
            get_stash     => 'get',
            has_no_stash => 'is_empty',
            num_stashs    => 'count',
            delete_stash  => 'delete',
            stash_pairs   => 'kv',
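
    To show how these delegated methods behave, here is a small
    self-contained class that declares a stash attribute with the same
    handles. My::StashDemo and the stored key and value are made up for the
    example, and the attribute options other than the handles are an
    assumption about how such an attribute is typically declared. Moose
    must be installed.

    ```perl
    use strict;
    use warnings;

    package My::StashDemo;    # hypothetical class, not part of BioX::Workflow
    use Moose;

    has 'stash' => (
        is      => 'rw',
        isa     => 'HashRef',
        traits  => ['Hash'],
        default => sub { {} },
        handles => {
            set_stash    => 'set',
            get_stash    => 'get',
            has_no_stash => 'is_empty',
            num_stashs   => 'count',
            delete_stash => 'delete',
            stash_pairs  => 'kv',
        },
    );

    package main;

    my $demo = My::StashDemo->new;
    print "stash is empty\n" if $demo->has_no_stash;

    $demo->set_stash( genome => 'hg19' );
    print $demo->get_stash('genome'), "\n";    # hg19
    print $demo->num_stashs, "\n";             # 1

    # kv returns a list of [ key, value ] array references.
    print "$_->[0]=$_->[1]\n" for $demo->stash_pairs;

    $demo->delete_stash('genome');
    ```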

 _classes

    Saves a snapshot of the entire namespace for the initial environment,
    and each rule.

 Subroutines

    Subroutines can also be overridden and/or extended in the usual Moose
    fashion.

  run

    Starting point.

 save_env

    At each rule save the env for debugging purposes.

  make_outdir

    Set initial indir and outdir

  get_samples

    Get basename of the files. Can add optional rules.

    sample.vcf.gz and sample.vcf would be sample if the file_rule is
    (.vcf)$|(.vcf.gz)$

    Also gets the full path to infiles

    Instead of doing

        foreach my $sample (@{ $self->samples }) {
            # do stuff with $sample
        }

    Could have

        foreach my $infile (@{ $self->infiles }) {
            # do stuff with $infile
        }

 match_samples

    Match samples based on regex written in file_rule
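
    As an illustration of how a file_rule regex turns a file name into a
    sample name, the sketch below strips the matched extension from the
    basename. The helper sample_name is hypothetical and the module's
    matching may work differently. Note that the dots in the example rule
    are unescaped, so they match any character; a stricter rule would
    write \. instead.

    ```perl
    use strict;
    use warnings;
    use File::Basename qw(basename);

    # The example file_rule from get_samples, compiled as a regex.
    my $file_rule = qr/(.vcf)$|(.vcf.gz)$/;

    # Hypothetical helper: return the sample name for a path, or undef
    # when the file name does not match the rule.
    sub sample_name {
        my ( $path, $rule ) = @_;
        my $name = basename($path);
        return undef unless $name =~ $rule;
        $name =~ s/$rule//;
        return $name;
    }

    print sample_name( '/data/sample.vcf.gz', $file_rule ), "\n";    # sample
    print sample_name( '/data/sample.vcf',    $file_rule ), "\n";    # sample
    ```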

  plugin_load

    Load plugins defined in yaml with MooseX::Object::Pluggable

  class_load

    Load classes defined in yaml with Class::Load

  make_template

    Make the template for interpolating strings

  create_attr

    make attributes

 check_keys

    There should be one key and one key only!

 clear_process_vars

    Clear the process vars

 init_process_vars

    Initialize the process vars

 add_attr

    Add the local attr onto the global attr

 write_rule_meta

  write_process

    Fill in the template with the process

  process_by_sample_outdir

    Make sure indir/outdirs are named appropriately for samples when using
    by_sample_outdir.

  OUTPUT_to_INPUT

    If we are using auto_input, chain INPUT/OUTPUT.

DESCRIPTION

    BioX::Workflow - A very opinionated template based workflow writer.

AUTHOR

    Jillian Rowe <jillian.e.rowe@gmail.com>

Acknowledgements

    Before version 0.03:

    This module was originally developed at and for Weill Cornell Medical
    College in Qatar within ITS Advanced Computing Team. With approval from
    WCMC-Q, this information was generalized and put on github, for which
    the authors would like to express their gratitude.

    As of version 0.03:

    This module's continuing development is supported by NYU Abu Dhabi in
    the Center for Genomics and Systems Biology. With approval from NYUAD,
    this information was generalized and put on bitbucket, for which the
    authors would like to express their gratitude.

COPYRIGHT

    Copyright 2015- Weill Cornell Medical College in Qatar

LICENSE

    This library is free software; you can redistribute it and/or modify it
    under the same terms as Perl itself.

SEE ALSO