Command line interface

Availability

The source code for the command line tool is available in our GitHub repository. Clone the repository to access the code locally.

git clone https://github.com/GoyalLab/singletCodeTools/

Using the interface

In the commandLineTools folder you will find three files; the one to run is singletCodeCommandLine.py. Two modules are available: one runs singletCode, and the other creates the input sheet for singletCode (cell ID, barcode, and sample) from the FASTQ files sequenced on a MiSeq, for data generated with the Watermelon barcoding technology.

Detailed information about the modules in the command line interface

Description:

This script contains two modules:
  • Count Module: Generates the singlet files for the input data sheet.

  • Watermelon Module: Uses the MiSeq dial-out files to create the cell ID, barcode, and sample file. This output can then be used as input for the Count Module.

Usage:

For the Count Module:

python singletCodeCommandLine.py count -i /path/to/input.txt -o /path/to/output

For the Watermelon Module:

python3 singletCodeCommandLine.py watermelon -i /path/to/fastq/files -o path/to/save/csv/file -s path/sample/sheet -use10X False -input10X path/to/barcodes/tsv

Options for Count Module:
  • -i, --input_file: Specify the path to the input barcode file.

  • -o, --output_prefix: Specify the path to the output prefix.

  • -f, --force: Force overwrite if the output file already exists.

  • -u, --cutoff: UMI cutoff ratio.

  • -d, --umi_diff_threshold: Minimum difference between the UMI count of the dominant UMI and the median UMI count.

  • -m, --min_umi_good_data_cutoff: Minimum UMI count for a barcode to be dominant.

  • -g, --min_umi_filter_threshold: Minimum UMI filter threshold.
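
When the count module has to be run over several input files, the command can be assembled in Python rather than typed by hand. The following is a minimal sketch under stated assumptions, not part of singletCode: build_count_command is a hypothetical helper, the script name is the one given in the folder description above, and the cutoff value in the example is a placeholder rather than a recommended setting. Only the short flags documented above are used.

```python
import subprocess
import sys
from typing import List, Optional


def build_count_command(
    input_file: str,
    output_prefix: str,
    cutoff: Optional[float] = None,
    script: str = "singletCodeCommandLine.py",  # assumed script name
) -> List[str]:
    """Assemble the argument list for the count module (hypothetical helper)."""
    cmd = [sys.executable, script, "count", "-i", input_file, "-o", output_prefix]
    if cutoff is not None:
        cmd += ["-u", str(cutoff)]  # UMI cutoff ratio
    return cmd


if __name__ == "__main__":
    cmd = build_count_command("/path/to/input.txt", "/path/to/output", cutoff=0.5)
    print(" ".join(cmd))
    # Uncomment to run the tool from inside the commandLineTools folder:
    # subprocess.run(cmd, check=True)
```

Passing the arguments as a list avoids shell quoting problems with paths that contain spaces.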

Options for Watermelon Module:
  • -i, --inputFolder: Specify the path to the folder containing the fastq folders output from MiSeq.

  • -s, --sampleSheet: Specify the path to the sample sheet in .csv format that contains sample name and sample number. These should match the names of the fastq files (e.g., Sample-1_S1_L001_R1_001.fastq.gz).

  • -o, --outputFolder: Specify the path to save the output CSV file containing the barcode and cell ID information.

  • -outputName: Specify the name of the output CSV file.

  • --use10X: Specify whether a 10X object is provided that has the same cells as those in the fastq files. If provided, cells in the fastq files are filtered out if they are not present in the 10X object.

  • --input10X: Path to the barcodes.tsv.gz or barcodes.tsv file that contains the cell IDs.
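
Because the sample sheet has to agree with the fastq file names, it can help to derive the sample names and numbers from the files themselves. The sketch below is an illustration only and is not taken from the singletCode source: the function names and the CSV header names (sampleName, sampleNumber) are assumptions, so check the layout the tool actually expects before using the result. It relies on the standard Illumina naming pattern seen in the example above, where the sample name is followed by _S and the sample number.

```python
import csv
import re
from pathlib import Path
from typing import Iterable, List, Tuple

# Illumina-style name: <sample name>_S<sample number>_L<lane>_<R1|R2|I1|I2>_001.fastq[.gz]
FASTQ_PATTERN = re.compile(
    r"^(?P<name>.+)_S(?P<number>\d+)_L\d{3}_[RI]\d_001\.fastq(\.gz)?$"
)


def samples_from_fastq_names(file_names: Iterable[str]) -> List[Tuple[str, int]]:
    """Return de-duplicated (sample name, sample number) pairs, sorted by number."""
    samples = set()
    for file_name in file_names:
        match = FASTQ_PATTERN.match(Path(file_name).name)
        if match:  # files that do not follow the pattern are ignored
            samples.add((match.group("name"), int(match.group("number"))))
    return sorted(samples, key=lambda pair: (pair[1], pair[0]))


def write_sample_sheet(samples: Iterable[Tuple[str, int]], csv_path: str) -> None:
    """Write the pairs to a CSV file; the header names are assumed, not documented."""
    with open(csv_path, "w", newline="") as handle:
        writer = csv.writer(handle)
        writer.writerow(["sampleName", "sampleNumber"])
        writer.writerows(samples)


if __name__ == "__main__":
    names = [
        "Sample-1_S1_L001_R1_001.fastq.gz",
        "Sample-1_S1_L001_R2_001.fastq.gz",
    ]
    print(samples_from_fastq_names(names))
```

Both reads of a pair map to the same sample, so each sample appears once in the resulting sheet.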

Authors:
  • Ziyang Zhang (Charles)

  • Keerthana M Arun