salmonEC.Rd
Constructs a count matrix with equivalence class identifiers in the rows. The count matrix is generated from one or more `eq_classes.txt` files created by running salmon with the `--dumpEq` flag. Salmon reference: https://doi.org/10.1038/nmeth.4197
salmonEC(
paths,
tx2gene,
multigene = FALSE,
ignoreTxVersion = FALSE,
ignoreAfterBar = FALSE,
quiet = FALSE
)
`Character` or `character vector`, path(s) specifying the location of the `eq_classes.txt` files generated with salmon.
A `data.frame` linking transcript identifiers to their corresponding gene identifiers. Transcript identifiers must be in a column `isoform_id`. Corresponding gene identifiers must be in a column `gene_id`.
`Logical`, should equivalence classes that are compatible with multiple genes be retained? Default is `FALSE`, removing such ambiguous equivalence classes.
`Logical`, whether to split the isoform id on the '.' character to remove version information, which facilitates matching with the isoform id in `tx2gene`. Default is `FALSE`.
`Logical`, whether to split the isoform id on the '|' character, which facilitates matching with the isoform id in `tx2gene`. Default is `FALSE`.
`Logical`, set `TRUE` to avoid displaying messages.
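The following is a minimal sketch of a `tx2gene` table with the two required columns, and of the kind of identifier trimming that `ignoreAfterBar` and `ignoreTxVersion` describe. All identifiers (`txA`, `geneX`, and so on) are hypothetical, and the `sub()` calls only illustrate the described splitting in base R; they are not taken from the function's implementation.

```r
# Hypothetical transcript-to-gene table with the required column names.
tx2gene <- data.frame(
  isoform_id = c("txA", "txB", "txC"),
  gene_id    = c("geneX", "geneX", "geneY"),
  stringsAsFactors = FALSE
)

# Hypothetical raw transcript names, carrying a version suffix
# and extra '|'-separated fields.
raw_ids <- c("txA.1|geneX|extra", "txB.3|geneX|extra", "txC.2|geneY|extra")

# Effect described for ignoreAfterBar: keep the part before the first '|'.
no_bar <- sub("\\|.*", "", raw_ids)

# Effect described for ignoreTxVersion: keep the part before the first '.'.
no_version <- sub("\\..*", "", no_bar)

# After trimming, every identifier can be matched against tx2gene.
all_matched <- all(no_version %in% tx2gene$isoform_id)
```

If the raw names already match `isoform_id` exactly, both arguments can stay at their default of `FALSE`.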
A list with two elements. The first element `counts` is a sparse count matrix with equivalence class identifiers in the rows. If multiple paths are specified, the columns are in the same order as the paths. The second element `tx2gene_matched` allows for linking those identifiers to their respective transcripts and genes.
The resulting count matrix uses equivalence class identifiers as rownames. These can be linked to respective transcripts and genes using the `tx2gene_matched` element of the output. Specifically, if the equivalence class identifier reads `1|2|8`, then the equivalence class is compatible with the transcripts and their respective genes in rows 1, 2 and 8 of `tx2gene_matched`.
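The lookup described above can be sketched in base R as follows. The `tx2gene_matched` table here is a hypothetical stand-in built by hand, not real output of `salmonEC`, and the variable names `ec_id`, `rows` and `compatible` are chosen for illustration only.

```r
# Hypothetical stand-in for the `tx2gene_matched` element of the output.
tx2gene_matched <- data.frame(
  isoform_id = paste0("tx", 1:8),
  gene_id    = c("gA", "gA", "gB", "gB", "gC", "gC", "gD", "gA"),
  stringsAsFactors = FALSE
)

# An equivalence class identifier as it would appear in the rownames.
ec_id <- "1|2|8"

# Split on the literal '|' and convert to integer row indices.
rows <- as.integer(strsplit(ec_id, "|", fixed = TRUE)[[1]])

# Transcripts and genes compatible with this equivalence class.
compatible <- tx2gene_matched[rows, ]

# Genes the class maps to; a single gene here, so the class is unambiguous.
genes <- unique(compatible$gene_id)
```

In this sketch all three transcripts belong to one gene, which is the kind of equivalence class that is retained when `multigene = FALSE`.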