The scripts required to run the TMCSNPdb 2.0 toolkit are located in the “src” directory.
To launch the R Shiny-based GUI, which can also be opened in a web browser, use the following command:
In Linux terminal:
In Windows CMD:
Once the user executes the above command, the TMCSNPdb 2.0 toolkit GUI appears with three tabs.
Each tab is described in detail below:
dbCreator allows the user to create a custom germline database from individual .vcf or .gz (bgzip-compressed) files.
Files generated in steps 3 and 4 of the pre-processing steps can be uploaded here. The tool combines all the variants into a merged.vcf file, then applies a read-depth filter of 5 reads supporting the variant position (DP=5) and a percent cutoff that checks for recurrence of the variants in 5 percent of samples. The remaining variants are then depleted against the public databases selected under "Choose database VCF files to deplete" (such as dbSNP, gnomAD, GenomeAsia 100K, IndiGenomes, and TMC-SNPdb 2.0). The result is a set of unique, population-specific variants that are not present in any of the selected databases.
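The merge, filter, and deplete steps described above can be sketched in a few lines of Python. This is only an illustration of the logic, not the toolkit's actual implementation: the function name `build_germline_db`, its parameters, and the in-memory tuple representation of variants are all hypothetical. It also assumes that DP=5 acts as a minimum depth threshold and that the 5 percent cutoff means "present in at least 5 percent of samples"; both are interpretations of the description rather than confirmed behavior.

```python
from collections import defaultdict


def build_germline_db(samples, known_db_variants, min_dp=5, min_sample_fraction=0.05):
    """Illustrative sketch of the merge -> filter -> deplete logic.

    samples: dict mapping sample name -> list of (chrom, pos, ref, alt, dp).
    known_db_variants: iterable of (chrom, pos, ref, alt) keys to deplete against.
    """
    # Merge step: record which samples carry each variant.
    # ASSUMPTION: DP is treated as a minimum read-depth threshold.
    carriers = defaultdict(set)
    for sample_name, records in samples.items():
        for chrom, pos, ref, alt, dp in records:
            if dp >= min_dp:
                carriers[(chrom, pos, ref, alt)].add(sample_name)

    # Recurrence step.
    # ASSUMPTION: keep variants seen in at least the given fraction of samples.
    n_samples = len(samples)
    recurrent = {
        key
        for key, names in carriers.items()
        if len(names) / n_samples >= min_sample_fraction
    }

    # Depletion step: drop anything already present in the chosen databases.
    return recurrent - set(known_db_variants)


# Toy input: two samples and one "public database" variant.
samples = {
    "S1": [("chr1", 100, "A", "G", 12), ("chr2", 200, "C", "T", 3)],
    "S2": [("chr1", 100, "A", "G", 8), ("chr3", 300, "G", "A", 20)],
}
known = {("chr3", 300, "G", "A")}
unique_variants = build_germline_db(samples, known, min_dp=5, min_sample_fraction=0.5)
print(unique_variants)
```

In this toy input, the chr2 variant fails the depth filter and the chr3 variant is removed by depletion, so only the chr1 variant remains.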
dbCombiner allows the user to combine multiple .vcf.gz (bgzip-compressed) database files with the TMC-SNPdb 2.0 database.
The database generated by dbCreator can be used as input to dbCombiner. Other bgzip-compressed VCF databases can also be provided as input and merged with our individual or combined ethnic databases (TMC-SNPdb 2.0, IndiGenomes, GenomeAsia 100K).
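Conceptually, combining databases amounts to taking the union of their variants. The sketch below shows that idea on in-memory data. The function name `combine_databases` is hypothetical, and recording which databases each variant came from is an illustrative choice that the toolkit description does not confirm.

```python
def combine_databases(databases):
    """Illustrative union of several variant databases.

    databases: dict mapping database name -> iterable of (chrom, pos, ref, alt).
    Returns a dict mapping each variant to the sorted list of databases containing it.
    """
    combined = {}
    for name, variants in databases.items():
        for key in variants:
            # ASSUMPTION: two variants are identical if chrom, pos, ref and alt all match.
            combined.setdefault(key, set()).add(name)
    return {key: sorted(names) for key, names in combined.items()}


# Toy input: a custom database and a stand-in for an existing one.
merged = combine_databases({
    "custom_db": [("chr1", 100, "A", "G"), ("chr2", 200, "C", "T")],
    "existing_db": [("chr1", 100, "A", "G")],
})
print(merged)
```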
dbAnnotator allows the user to annotate and flag multiple .vcf files for the presence of variants in the TMC-SNPdb 2.0 database.
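The annotate-and-flag step can be pictured as a lookup of each VCF record in the database, followed by adding a marker to the INFO column. The following is a minimal sketch under stated assumptions: the function name `annotate_vcf_lines` and the flag name `IN_DATABASE` are hypothetical, the database is a plain in-memory set, and multi-allelic ALT fields are not split. The toolkit's actual output format may differ.

```python
def annotate_vcf_lines(lines, database, flag="IN_DATABASE"):
    """Append a flag to the INFO column of VCF records found in `database`.

    lines: iterable of VCF text lines (header and data lines).
    database: set of (chrom, pos, ref, alt) keys.
    """
    annotated = []
    for line in lines:
        line = line.rstrip("\n")
        if line.startswith("#"):
            # Header lines are passed through unchanged.
            annotated.append(line)
            continue
        fields = line.split("\t")
        # VCF columns: CHROM, POS, ID, REF, ALT, QUAL, FILTER, INFO, ...
        key = (fields[0], int(fields[1]), fields[3], fields[4])
        if key in database:
            info = fields[7]
            # Replace an empty INFO placeholder, otherwise append with ';'.
            fields[7] = flag if info in (".", "") else info + ";" + flag
        annotated.append("\t".join(fields))
    return annotated


# Toy input: one variant present in the database, one absent.
vcf_lines = [
    "##fileformat=VCFv4.2",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO",
    "chr1\t100\t.\tA\tG\t50\tPASS\tDP=12",
    "chr2\t200\t.\tC\tT\t50\tPASS\t.",
]
database = {("chr1", 100, "A", "G")}
flagged = annotate_vcf_lines(vcf_lines, database)
print("\n".join(flagged))
```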
dbCreator generates two output files:
dbCombiner creates the following output files:
dbAnnotator creates the following output files: