These provide top-level, powerful settings for a comprehensive
reproducible workflow. To see defaults, run
See Details below.
Below are options that can be set with
options("reproducible.xxx" = newValue),
where xxx is one of the values below and
newValue is the new value to
give the option. Sometimes these options can be placed in the user's
.Rprofile file so they persist between sessions.
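As a minimal sketch of the pattern above, the following sets an option, reads it back, and restores the previous state. It uses only base R; "reproducible.xxx" is the placeholder from the text, not a real option name, and "newValue" is an illustrative value.

```r
# Set the placeholder option; options() invisibly returns the previous values,
# which makes it easy to restore them afterwards.
old <- options("reproducible.xxx" = "newValue")

# Read the option back for the current session.
current <- getOption("reproducible.xxx")
print(current)

# Restore the previous state (here the option was unset, so it is removed again).
options(old)
```

Keeping the value returned by options() is a common way to make a temporary change without leaking it into the rest of the session.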
The following options are likely of interest to most users:
.reproducibleTempCacheDir. Used in
Cache and many others.
The default path for repositories if not passed as an argument.
"rds". What save format to use; currently,
"slow". One of
"fast" (1 or 2).
digest::digest internally, which is transferable across operating
systems, but much slower than
So, if all caching is happening on a single machine,
"fast" would be a good setting.
NULL. Sets a specific connection to a database, e.g.,
dbConnect(drv = RSQLite::SQLite()) or
dbConnect(drv = RPostgres::Postgres()).
For remote database servers, setting one connection may be far faster than using
drv, which must make a new connection every time.
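The following sketch shows one way a single connection might be created once and registered as an option. It assumes the DBI and RSQLite packages are installed, assumes the option is named "reproducible.conn" (the name is not given in this text), and uses an in-memory SQLite database purely for illustration.

```r
# Only run if the (assumed) database packages are available.
if (requireNamespace("DBI", quietly = TRUE) &&
    requireNamespace("RSQLite", quietly = TRUE)) {
  # Open one connection up front instead of reconnecting on every call.
  conn <- DBI::dbConnect(drv = RSQLite::SQLite(), dbname = ":memory:")

  # Register it under the assumed option name, keeping the old value.
  old <- options("reproducible.conn" = conn)

  # ... cached calls that should reuse the connection would go here ...

  # Restore the previous option value and close the connection.
  options(old)
  DBI::dbDisconnect(conn)
}
```

Closing the connection explicitly matters more for remote servers, where open connections are a limited resource.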
FALSE. On Linux OSes,
cloudCache have some
functionality that uses the
Default is to not use these, as they are experimental.
They may, however, be very effective in speeding up some things, specifically,
uploading cached elements via googledrive in
NULL. Used in
If set to a path, this will cause these functions to save their downloaded and preprocessed
file to this location, with a hardlink (via
file.link) to the file created in the
This can be used so that individual projects that use common data sets can maintain
modularity (by placing downloaded objects in their
destinationPath), but also minimize
re-downloading the same (perhaps large) file over and over for each project.
Because the files are hardlinks, there is no extra space taken up by the apparently
1. The number of threads to use for reading/writing cache files.
FALSE. Passed to
TRUE. Used in
FALSE, then the entire
Cache machinery is skipped and the functions are run as if no caching were occurring.
It can also take two other values:
'overwrite' will cause no recovery of objects from the cache repository; only new
ones will be created. If the hash is identical to a previous one, then the new entry will
overwrite the previous one.
'devMode' will function as
Cache normally does, except that it will use the
userTags to determine if a previous function has been run. If the
userTags are identical, but the digest value is different, the old value will be deleted
from the cache repository and the new value will be added.
This addresses a common situation during the development stage: functions are changing
frequently, so any entry in the cache repository will be stale following changes to
functions, i.e., they will likely never be relevant again.
This will therefore keep the cache repository clean of stale objects.
If there is ambiguity in the
userTags, i.e., they do not uniquely identify a single
entry in the
cacheRepo, then this option will default back to the non-dev-mode
behaviour to avoid deleting objects.
This, therefore, is most useful if the user is using unique values for userTags.
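The 'devMode' rules described above can be pictured with a toy in-memory model. This is a sketch in base R only, not the package's implementation: toy_dev_mode, the list-based cache, and the string digests are all hypothetical names and stand-ins introduced for illustration.

```r
# Hypothetical toy cache: a list of entries, each with userTags, digest, value.
toy_dev_mode <- function(cache, userTags, digest, value) {
  tag_match <- vapply(cache, function(e) identical(e$userTags, userTags), logical(1))
  digest_match <- vapply(cache, function(e) identical(e$digest, digest), logical(1))

  # Identical digest: the existing entry would be reused, nothing changes.
  if (any(digest_match)) {
    return(cache)
  }
  # Exactly one entry with the same userTags but a different digest:
  # treat it as stale and delete it before adding the new entry.
  if (sum(tag_match) == 1L) {
    cache <- cache[!tag_match]
  }
  # No match, or ambiguous userTags (several matches): delete nothing, just add.
  c(cache, list(list(userTags = userTags, digest = digest, value = value)))
}

cache <- list()
cache <- toy_dev_mode(cache, userTags = "step1", digest = "aaa", value = 1)
# The function changed, so the digest differs while the userTags stay the same.
cache <- toy_dev_mode(cache, userTags = "step1", digest = "bbb", value = 2)
print(length(cache))
```

In this model the second call replaces the first entry rather than accumulating alongside it, which is the clean-up behaviour the text describes for frequently changing functions.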
FALSE. Passed to
TRUE. As of version 0.3, the backend is now DBI instead of
TRUE. Passed to
FALSE. Used in
TRUE, recovery of cached
elements from the
cacheRepo will use
This means that the third time a function is run will be much faster than the first (create
cache entry) or second (recover from the SQLite database on disk).
NOTE: memoised values are removed when the R session is restarted.
This option will use more RAM and so may need to be turned off if RAM is limiting.
clearCache of any sort will cause all memoising to be 'forgotten'.
TRUE. This will mean that previous cache repositories will be defunct.
This new algorithm will make
Cache less sensitive to minor but irrelevant changes
(like changing the order of arguments) and will work successfully across operating systems
(especially relevant for the new
FALSE. If set to
TRUE then every
Cache call will show a
summary of the objects being cached, their
object.size and the time it took to digest
them and also the time it took to run the call and save the call to the cache repository or
load the cached copy from the repository.
This may help in diagnosing problems that occur.
The following options are likely not needed by a user.
Inf. Used in
Cache, specifically to the internal
CacheDigest. This is passed to
Mostly this would be changed from default
Inf if the digesting is taking too long.
Use this with caution, as some objects will have many
NA values in their first
User agent for downloads using this package.