Update transcript metadata for importData() imported data
Source: R/mixed_reference.R
updateMetadata.Rd
This function takes as input a SummarizedExperiment as output by importData(), and will update the metadata on the transcripts when possible (updating rowData and/or rowRanges depending on the value of ranges).
importData() uses metadata pulled from digest matches in registries used by tximeta (linkedTxome, linkedTxpData, and the pre-computed digests). Additionally, GRanges or data.frame-type data can be provided on a one-time basis via the argument txpData, which will annotate transcripts with index="user".
See inspectDigests() for how to inspect which indices have matching digests, and how to link data to local metadata in a persistent manner.
Arguments
- se
the SummarizedExperiment (SE) output by importData()
- txpData
either a GRanges or data.frame-type object to use if there is no match based on digest. This is used on a one-time basis, and transcripts will be marked in metadata columns as index = "user". See makeLinkedTxome() or makeLinkedTxpData() for persistent metadata storage/retrieval
- ranges
logical, whether to add rowRanges (or just rowData)
- prefer
vector of length up to 3, giving the preferred order of tximeta's transcript registries to use when finding matches, with elements:
txome: linkedTxome, txpdata: linkedTxpData, precomputed: the pre-computed digests in tximeta
- order
order of index in which to update the metadata; by default the order is annotation, then novel, then user (the info supplied here as txpData)
- key
a named character vector of length 3. For each index (annotated, novel, and user), key is the name of the column to use for merging metadata with rownames(se). The user index corresponds to data provided here as txpData. Defaults to "tx_name", which often matches the transcript names in GENCODE
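The role of key can be pictured with a small base R sketch. This is only an illustration of key-based matching against row names, not tximeta's actual implementation; txp_ids and user_data are hypothetical stand-ins for rownames(se) and a data.frame passed as txpData.

```r
# Hypothetical transcript IDs, standing in for rownames(se)
txp_ids <- c("tx1", "novel1", "tx2", "novel2")

# Hypothetical user-supplied metadata (the kind of table given as txpData);
# the "tx_name" column plays the role of the key column
user_data <- data.frame(
  tx_name = c("novel2", "novel1", "novel3"),
  gene_id = c("geneB", "geneA", "geneC"),
  stringsAsFactors = FALSE
)

# Name of the column used to line metadata up with the transcript IDs
key <- "tx_name"

# For each transcript ID, find the matching metadata row (NA if none)
idx <- match(txp_ids, user_data[[key]])

# Assemble per-transcript metadata in the order of txp_ids;
# matched rows are labeled with index "user" (assumed labeling, for illustration)
annotated <- data.frame(
  row.names = txp_ids,
  gene_id = user_data$gene_id[idx],
  index = ifelse(is.na(idx), NA_character_, "user"),
  stringsAsFactors = FALSE
)
print(annotated)
```

Transcripts without a matching key value keep NA in both columns, and metadata rows whose key does not appear among the transcript IDs (here novel3) are simply unused.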
Examples
example(importData)
#>
#> imprtD> # oarfish files using a mix of --annotated and --novel transcripts
#> imprtD> dir <- system.file("extdata/oarfish", package="tximportData")
#>
#> imprtD> names <- paste0("rep", 2:4)
#>
#> imprtD> files <- file.path(dir, paste0("sgnex_h9_", names, ".quant.gz"))
#>
#> imprtD> coldata <- data.frame(files, names)
#>
#> imprtD> # returns an un-ranged SE object
#> imprtD> se <- importData(coldata, type="oarfish")
#> reading in files with read.delim (install 'readr' package for speed up)
#> 1
#> 2
#> 3
#>
#> returning un-ranged SummarizedExperiment, see functions:
#> -- inspectDigests() to check matching digests
#> -- makeLinkedTxome/makeLinkedTxpData() to link digests to metadata
#> -- updateMetadata() to update metadata and optionally add ranges
# build custom novel GRanges data
library(GenomicRanges)
novel <- data.frame(
  seqnames = paste0("chr", rep(1:22, each=500)),
  start = 1e6 + 1 + 0:499 * 1000,
  end = 1e6 + 1 + 0:499 * 1000 + 1000 - 1,
  strand = "+",
  tx_name = paste0("novel", 1:(22*500)),
  gene_id = paste0("novel_gene", rep(1:(22*10), each=50)),
  type = "protein_coding"
)
novel_gr <- as(novel, "GRanges")
names(novel_gr) <- novel$tx_name
# now update the metadata + ranges:
if (FALSE) { # \dontrun{
  # this requires connection to internet (will download GENCODE GTF via FTP)
  se_with_ranges <- updateMetadata(
    se, txpData=novel_gr, ranges=TRUE
  )
  mcols(se_with_ranges)
} # }