File formats in bioinformatics are notoriously hard to standardize. We hope that this documentation gives the user a clear idea of what is needed as input to Swan.
If you are having trouble with your GTF, Swan includes a quick GTF validator that can tell you whether your file seems to have an unconventional header or lacks entries needed to run Swan. It cannot tell you whether your gene and transcript names or IDs match across datasets, or whether your exon entries are in the correct order after the corresponding transcript entry. The validator can be run as follows:
Abundance matrix
Swan can load abundance information for more meaningful analysis and visualizations. To work with Swan, abundance matrices must:
Be tab-separated
Have transcript IDs in the first column that match those loaded via the GTF or TALON db
Have one column per dataset, labelled with the dataset name, containing raw counts for each transcript
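The requirements above can be checked before loading a file. The following is a minimal sketch using only the Python standard library; it is not part of Swan. The function name `check_abundance_matrix`, the example transcript IDs, and the dataset names are hypothetical, and the check assumes that raw counts are non-negative numbers.

```python
import csv
import io


def check_abundance_matrix(handle, known_transcript_ids):
    """Return a list of problems found in a tab-separated abundance matrix.

    Hypothetical helper, not a Swan function. `handle` is an open text file
    and `known_transcript_ids` is the set of IDs from your GTF or TALON db.
    """
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header is None or len(header) < 2:
        return ["expected a tab-separated header with an ID column "
                "and at least one dataset column"]

    problems = []
    datasets = header[1:]  # every column after the first is one dataset
    for line_no, row in enumerate(reader, start=2):
        if len(row) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(row)}")
            continue
        if row[0] not in known_transcript_ids:
            problems.append(
                f"line {line_no}: transcript ID {row[0]!r} not in annotation")
        for name, value in zip(datasets, row[1:]):
            try:
                count = float(value)
            except ValueError:
                problems.append(
                    f"line {line_no}: {name} value {value!r} is not a number")
                continue
            # Assumption: raw counts are never negative (this also rejects NaN).
            if not count >= 0:
                problems.append(
                    f"line {line_no}: {name} value {value!r} is not a valid count")
    return problems


# Made-up example data: the second transcript is unknown and has a bad count.
example = (
    "transcript_id\tdataset1\tdataset2\n"
    "ENST0001\t10\t0\n"
    "ENST0002\t5\tNA\n"
)
problems = check_abundance_matrix(io.StringIO(example), {"ENST0001"})
for problem in problems:
    print(problem)
```

For a real file, pass `open(path, newline="")` as the handle together with the set of transcript IDs from your annotation. An empty list means none of these basic checks failed; it does not guarantee that Swan will accept the file.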