CacheGeo(
targetFile = NULL,
domain,
FUN,
destinationPath = getOption("reproducible.destinationPath", "."),
useCloud = getOption("reproducible.useCloud", FALSE),
cloudFolderID = NULL,
purge = FALSE,
useCache = getOption("reproducible.useCache"),
overwrite = getOption("reproducible.overwrite"),
action = c("nothing", "update", "replace", "append"),
bufferOK = FALSE,
verbose = getOption("reproducible.verbose"),
...
)
The (optional) local file name (or path to a file) for an sf
object or data.frame that can be coerced to an sf object (i.e., has a geometry
column). If cloudFolderID is specified, then this will be the name of the
file stored and/or accessed in that cloud folder.
An sf polygon object that is the spatial area of interest. If NULL,
then this will return the whole object in targetFile.
A function call that will be run if the domain is
not already contained within the sf object at targetFile. This function
call MUST return either an sf class object or a data.frame class object
that has a geometry column (which can then be converted to sf with sf::st_as_sf).
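As a minimal sketch of the return-value requirement above, the following shows a function that returns a plain data.frame with a geometry column, and confirms that sf::st_as_sf can convert it back. The names exampleFUN and square are hypothetical and used only for illustration; the block assumes the sf package is installed.

```r
if (requireNamespace("sf", quietly = TRUE)) {
  # Hypothetical FUN: drops the sf class but keeps the geometry column
  exampleFUN <- function(domain) {
    out <- as.data.frame(domain)
    out$nFeatures <- nrow(out)
    out
  }

  # Hypothetical domain: a single unit square
  ring <- rbind(c(0, 0), c(1, 0), c(1, 1), c(0, 1), c(0, 0))
  square <- sf::st_sf(id = 1L, geometry = sf::st_sfc(sf::st_polygon(list(ring))))

  res <- exampleFUN(square)
  isPlainDF <- is.data.frame(res) && !inherits(res, "sf")

  # The geometry column allows conversion back to sf
  backToSf <- sf::st_as_sf(res)
  isSfAgain <- inherits(backToSf, "sf")
}
```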
Character string of a directory in which to download
and save the file that comes from url and is also where the function
will look for archive or targetFile. NOTE (still experimental):
To prevent repeated downloads in different locations, the user can also set
options("reproducible.inputPaths") to one or more local file paths to
search for the file before attempting to download. Default for that option is
NULL meaning do not search locally.
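A short sketch of setting that option, using only base R. The folder name sharedInputs is hypothetical; any existing local directory could be used instead.

```r
# Hypothetical shared folder that may already hold previously downloaded files
sharedDir <- file.path(tempdir(), "sharedInputs")
dir.create(sharedDir, showWarnings = FALSE, recursive = TRUE)

# Set the option, keeping the previous value so it can be restored
old <- options(reproducible.inputPaths = sharedDir)
current <- getOption("reproducible.inputPaths")

# Restore the previous setting (NULL by default, meaning no local search)
options(old)
```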
A logical.
If this is specified, then it must be either 1) a Google Drive
url to a folder where the targetFile will be read from or written to, or
2) a googledrive id or 3) an absolute path to a (possibly non-existent yet)
folder on your Google drive.
Logical or Integer. 0/FALSE (default) keeps existing CHECKSUMS.txt file and
prepInputs will write or append to it. 1/TRUE will delete the entire CHECKSUMS.txt file.
Other options, see details.
Passed to Cache in various places.
Defaults to getOption("reproducible.useCache", 2L) in prepInputs, and
getOption("reproducible.useCache", FALSE) if calling any of the inner
functions manually. For prepInputs, this means it will use Cache
only up to 2 nested levels, which includes preProcess. postProcess and
its nested *Input functions (e.g., cropInputs, projectInputs,
maskInputs) are no longer internally cached, as terra processing speeds
mean internal caching is more time consuming. We recommend caching the full
prepInputs call instead (e.g. prepInputs(...) |> Cache()).
Logical. Passed to writeTo (possibly inside postProcess) and postProcess.
A character string, with one of c("nothing", "update",
"replace", "append"). Partial matching is used ("n" is sufficient).
nothing will prevent any updating of the targetFile,
i.e., "read only". append will add the spatial elements in domain to
targetFile (and write it back to disk). update will do the same as
append, but will also remove any identical geometries before appending.
replace does nothing currently.
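The partial matching and the difference between append and update can be sketched in base R. This is not the package's implementation: resolveAction and applyAction are hypothetical helpers, geometries are stood in for by WKT strings in a wkt column, and the sketch assumes "identical geometries" refers to stored rows whose geometry matches an incoming one.

```r
# Partial matching as provided by base R's match.arg()
resolveAction <- function(action = c("nothing", "update", "replace", "append")) {
  match.arg(action)
}

# Hypothetical stand-in for how each action treats the stored object
applyAction <- function(stored, incoming, action = "nothing") {
  action <- resolveAction(action)
  switch(action,
    nothing = stored, # read only
    append = rbind(stored, incoming),
    update = {
      # assumption: drop stored rows whose geometry matches an incoming one
      keep <- !(stored$wkt %in% incoming$wkt)
      rbind(stored[keep, , drop = FALSE], incoming)
    },
    replace = stored # documented as doing nothing currently
  )
}

stored <- data.frame(id = 1:2, wkt = c("POINT (0 0)", "POINT (1 1)"))
incoming <- data.frame(id = 3L, wkt = "POINT (1 1)")

resolveAction("n") # "nothing"
nrow(applyAction(stored, incoming, "a")) # 3 rows: duplicate geometry kept
nrow(applyAction(stored, incoming, "u")) # 2 rows: duplicate geometry removed first
```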
A logical. If TRUE, then after testing whether the domain is
within the targetFile spatial object, and if it returns FALSE, then the function
will create a larger object, buffered by 2.5% of the extent of the object. If
FALSE, then it will be strict about whether the domain is within the targetFile.
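To illustrate the strict versus buffered test, here is a simplified base R sketch that works on bounding boxes rather than real geometries. The helpers bboxWithin, expandBbox and domainOK are hypothetical, and the sketch assumes the stored object is the one being enlarged by 2.5% of its extent on each side; the actual function may differ.

```r
# TRUE if bounding box `inner` lies entirely inside `outer`
bboxWithin <- function(inner, outer) {
  inner[["xmin"]] >= outer[["xmin"]] && inner[["xmax"]] <= outer[["xmax"]] &&
    inner[["ymin"]] >= outer[["ymin"]] && inner[["ymax"]] <= outer[["ymax"]]
}

# Enlarge a bounding box by a fraction of its width and height on each side
expandBbox <- function(bb, frac = 0.025) {
  dx <- (bb[["xmax"]] - bb[["xmin"]]) * frac
  dy <- (bb[["ymax"]] - bb[["ymin"]]) * frac
  c(xmin = bb[["xmin"]] - dx, ymin = bb[["ymin"]] - dy,
    xmax = bb[["xmax"]] + dx, ymax = bb[["ymax"]] + dy)
}

# Strict test first; fall back to the buffered test only if bufferOK is TRUE
domainOK <- function(domain, stored, bufferOK = FALSE) {
  if (bboxWithin(domain, stored)) return(TRUE)
  bufferOK && bboxWithin(domain, expandBbox(stored))
}

stored <- c(xmin = 0, ymin = 0, xmax = 10, ymax = 10)
domain <- c(xmin = 8, ymin = 8, xmax = 10.2, ymax = 9) # pokes out by 0.2

domainOK(domain, stored)                  # FALSE: strict test fails
domainOK(domain, stored, bufferOK = TRUE) # TRUE: 10.2 is inside 10 + 0.25
```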
Numeric, -1 silent (where possible), 0 being very quiet,
1 showing more messaging, 2 being more messaging, etc.
Default is 1. Above 3 will output much more information about the internals of
Caching, which may help diagnose Caching challenges. Can set globally with an
option, e.g., options('reproducible.verbose' = 0) to reduce to minimal
All named objects that are needed for FUN, including the function itself, if it is not in a package.
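Why the named objects are needed can be sketched with base R: a call that is only evaluated later has to be able to find every name it uses. The helper runDeferred and the function scaleBy are hypothetical, and evaluating the captured call in an environment built from the extra arguments is an assumption about the general mechanism, not a description of the package internals.

```r
# Capture the call unevaluated, then evaluate it where the named objects live
runDeferred <- function(FUN, ...) {
  deferredCall <- substitute(FUN)
  env <- list2env(list(...), parent = parent.frame())
  eval(deferredCall, envir = env)
}

# scaleBy is not defined anywhere else; it is supplied by name with its inputs
result <- runDeferred(
  scaleBy(x, k),
  scaleBy = function(v, k) v * k,
  x = 1:3,
  k = 2
)
result # 2 4 6
```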
Returns an object that results from FUN, which will possibly be a subset
of a larger spatial object that is specified with targetFile.
This function is a combination of Cache and prepInputs but for spatial
domains. This differs from Cache in that the current function call doesn't
have to have an identical function call previously run. Instead, it needs
to have had a previous function call where the domain being passed is
within the geographic limits of the targetFile.
This is similar to a geospatial operation on a remote GIS server, with 2 differences:
1. this downloads the object first before doing the GIS locally, and 2. it will optionally upload an updated object if the geographic area did not yet exist.
This has a very specific use case: assess whether an existing sf polygon
or multipolygon object (local or remote) covers the spatial
area of a domain of interest. If it does, then return only that
part of the sf object that completely covers the domain.
If it does not, then run FUN. It is expected that FUN will produce an sf
polygon or multipolygon class object. The result of FUN will then be
appended to the sf object as a new entry (feature) or it will replace
the existing "same extent" entry in the sf object.
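The decision flow described above can be sketched with sf alone. This is an illustration under stated assumptions, not the package's code: makeSquare, lookupOrRun and tagDomain are hypothetical helpers, coverage is tested with sf::st_covered_by against the union of the stored features, and the stored object is kept in memory rather than in a file.

```r
if (requireNamespace("sf", quietly = TRUE)) {
  # Hypothetical helper: a square polygon as a one-row sf object
  makeSquare <- function(xmin, ymin, size, label) {
    ring <- rbind(
      c(xmin, ymin), c(xmin + size, ymin), c(xmin + size, ymin + size),
      c(xmin, ymin + size), c(xmin, ymin)
    )
    sf::st_sf(label = label, geometry = sf::st_sfc(sf::st_polygon(list(ring))))
  }

  # If the stored object covers the domain, return the matching part;
  # otherwise run FUN and append its result as new features
  lookupOrRun <- function(stored, domain, FUN) {
    covered <- !is.null(stored) &&
      all(lengths(sf::st_covered_by(domain, sf::st_union(stored))) > 0)
    if (covered) {
      hit <- lengths(sf::st_intersects(stored, domain)) > 0
      return(list(result = stored[hit, ], stored = stored, ranFUN = FALSE))
    }
    new <- FUN(domain)
    stored <- if (is.null(stored)) new else rbind(stored, new)
    list(result = new, stored = stored, ranFUN = TRUE)
  }

  # Hypothetical FUN: marks the features it was run for
  tagDomain <- function(d) {
    d$label <- paste0("computed_", d$label)
    d
  }

  step1 <- lookupOrRun(NULL, makeSquare(0, 0, 2, "A"), tagDomain)             # nothing stored yet
  step2 <- lookupOrRun(step1$stored, makeSquare(0.5, 0.5, 1, "C"), tagDomain) # inside A
  step3 <- lookupOrRun(step2$stored, makeSquare(5, 5, 1, "B"), tagDomain)     # outside A

  ranFUN <- c(step1$ranFUN, step2$ranFUN, step3$ranFUN) # TRUE FALSE TRUE
  nStored <- nrow(step3$stored)                         # 2
}
```

Only the first and third calls run FUN; the second is answered from what is already stored, which mirrors the A, C, B pattern in the package example.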
# \donttest{
if (requireNamespace("sf", quietly = TRUE) &&
  requireNamespace("terra", quietly = TRUE)) {
  library(reproducible) # provides checkPath, tempdir2, prepInputs, writeTo, CacheGeo
  dPath <- checkPath(file.path(tempdir2()), create = TRUE)
  localFileLux <- system.file("ex/lux.shp", package = "terra")

  # 1 step for each layer
  # 1st step -- get study area
  full <- prepInputs(localFileLux, destinationPath = dPath) # default is sf::st_read
  zoneA <- full[3:6, ]
  zoneB <- full[8, ] # not in A
  zoneC <- full[3, ] # yes in A
  zoneD <- full[7:8, ] # not in A, B or C
  zoneE <- full[3:5, ] # yes in A

  # 2nd step: re-write to disk as read/write is lossy; want all "from disk" for this ex.
  writeTo(zoneA, writeTo = "zoneA.shp", destinationPath = dPath)
  writeTo(zoneB, writeTo = "zoneB.shp", destinationPath = dPath)
  writeTo(zoneC, writeTo = "zoneC.shp", destinationPath = dPath)
  writeTo(zoneD, writeTo = "zoneD.shp", destinationPath = dPath)
  writeTo(zoneE, writeTo = "zoneE.shp", destinationPath = dPath)

  # Must re-read to get identical columns
  zoneA <- sf::st_read(file.path(dPath, "zoneA.shp"))
  zoneB <- sf::st_read(file.path(dPath, "zoneB.shp"))
  zoneC <- sf::st_read(file.path(dPath, "zoneC.shp"))
  zoneD <- sf::st_read(file.path(dPath, "zoneD.shp"))
  zoneE <- sf::st_read(file.path(dPath, "zoneE.shp"))

  # The function that is to be run. This example returns a data.frame because
  # saving `sf` class objects with list-like columns does not work with
  # many st_driver()
  fun <- function(domain, newField) {
    domain |>
      as.data.frame() |>
      cbind(params = I(lapply(seq_len(NROW(domain)), function(x) newField)))
  }

  # Run sequence -- A, B will add new entries in targetFile, C will not,
  # D will, E will not
  for (z in list(zoneA, zoneB, zoneC, zoneD, zoneE)) {
    out <- CacheGeo(
      targetFile = "fireSenseParams.rds",
      domain = z,
      FUN = fun(domain, newField = I(list(list(a = 1, b = 1:2, c = "D")))),
      fun = fun, # pass whatever is needed into the function
      destinationPath = dPath,
      action = "update"
      # , cloudFolderID = "cachedObjects" # to upload/download from cloud
    )
  }
}
#> Running prepInputs
#> Running `preProcess`
#> Preparing: /home/runner/work/_temp/Library/terra/ex/lux.shp
#> alsoExtract is unspecified; assuming that all files must be extracted
#> Using sf::st_read on shapefile because sf package is available; to force old
#> behaviour with 'raster::shapefile' use fun = 'raster::shapefile' or
#> options('reproducible.shapefileRead' = 'raster::shapefile')
#> Running `process` (i.e., loading file into R)
#> targetFile located at:
#> /home/runner/work/_temp/Library/terra/ex/lux.shp
#> Loading object into R
#> Reading layer `lux' from data source
#> `/home/runner/work/_temp/Library/terra/ex/lux.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 12 features and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 5.74414 ymin: 49.44781 xmax: 6.528252 ymax: 50.18162
#> Geodetic CRS: WGS 84
#> Saved! Cache file: d598abd965f6e371.rds; fn: sf::st_read
#> writing...
#> Writing layer `zoneA' to data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneA.shp' using driver `ESRI Shapefile'
#> Writing 4 features with 6 fields and geometry type Polygon.
#> done! took: 0.0042 secs
#> writing...
#> Writing layer `zoneB' to data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneB.shp' using driver `ESRI Shapefile'
#> Writing 1 features with 6 fields and geometry type Polygon.
#> done! took: 0.00251 secs
#> writing...
#> Writing layer `zoneC' to data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneC.shp' using driver `ESRI Shapefile'
#> Writing 1 features with 6 fields and geometry type Polygon.
#> done! took: 0.00251 secs
#> writing...
#> Writing layer `zoneD' to data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneD.shp' using driver `ESRI Shapefile'
#> Writing 2 features with 6 fields and geometry type Polygon.
#> done! took: 0.00262 secs
#> writing...
#> Writing layer `zoneE' to data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneE.shp' using driver `ESRI Shapefile'
#> Writing 3 features with 6 fields and geometry type Polygon.
#> done! took: 0.00264 secs
#> Reading layer `zoneA' from data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneA.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 4 features and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 5.74414 ymin: 49.69933 xmax: 6.528252 ymax: 50.03632
#> Geodetic CRS: WGS 84
#> Reading layer `zoneB' from data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneB.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 6.169137 ymin: 49.58699 xmax: 6.516485 ymax: 49.75016
#> Geodetic CRS: WGS 84
#> Reading layer `zoneC' from data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneC.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 5.746118 ymin: 49.69933 xmax: 6.029145 ymax: 49.89461
#> Geodetic CRS: WGS 84
#> Reading layer `zoneD' from data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneD.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 2 features and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 6.169137 ymin: 49.46498 xmax: 6.516485 ymax: 49.75016
#> Geodetic CRS: WGS 84
#> Reading layer `zoneE' from data source
#> `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneE.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 3 features and 6 fields
#> Geometry type: POLYGON
#> Dimension: XY
#> Bounding box: xmin: 5.74414 ymin: 49.69933 xmax: 6.239243 ymax: 50.03632
#> Geodetic CRS: WGS 84
#> Domain is not contained within the targetFile; running FUN
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#> this will not persist across R sessions.
#> Running prepInputs
#> Running `preProcess`
#> Preparing: fireSenseParams.rds
#> alsoExtract is unspecified; assuming that all files must be extracted
#> Running `process` (i.e., loading file into R)
#> targetFile located at:
#> /tmp/RtmphO6j97/reproducible/rU9cvXhP/fireSenseParams.rds
#> Loading object into R
#> targetFile is already a binary; skipping Cache while loading
#> Saved! Cache file: a23e6daa3fc8bdc5.rds; fn: prepInputs_fireSenseParams.rds
#> Domain is not contained within the targetFile; running FUN
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#> ID_1 NAME_1 ID_2 NAME_2 AREA POP geometry
#> 1 1 Diekirch 3 Redange 259 18664 MULTIPOLYGON (((5.881378 49...
#> 2 1 Diekirch 4 Vianden 76 5163 MULTIPOLYGON (((6.131309 49...
#> 3 1 Diekirch 5 Wiltz 263 16735 MULTIPOLYGON (((5.977929 50...
#> 4 2 Grevenmacher 6 Echternach 188 18899 MULTIPOLYGON (((6.385532 49...
#> 5 2 Grevenmacher 12 Grevenmacher 210 29828 MULTIPOLYGON (((6.425158 49...
#> params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> 5 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP",
#> useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#> this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#> ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#> ID_1 NAME_1 ID_2 NAME_2 AREA POP geometry
#> 1 1 Diekirch 3 Redange 259 18664 POLYGON ((5.881378 49.87015...
#> 2 1 Diekirch 4 Vianden 76 5163 POLYGON ((6.131309 49.97256...
#> 3 1 Diekirch 5 Wiltz 263 16735 POLYGON ((5.977929 50.02602...
#> 4 2 Grevenmacher 6 Echternach 188 18899 POLYGON ((6.385532 49.83703...
#> params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP",
#> useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#> this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#> ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Domain is not contained within the targetFile; running FUN
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#> ID_1 NAME_1 ID_2 NAME_2 AREA POP geometry
#> 1 1 Diekirch 3 Redange 259 18664 MULTIPOLYGON (((5.881378 49...
#> 2 1 Diekirch 4 Vianden 76 5163 MULTIPOLYGON (((6.131309 49...
#> 3 1 Diekirch 5 Wiltz 263 16735 MULTIPOLYGON (((5.977929 50...
#> 4 2 Grevenmacher 6 Echternach 188 18899 MULTIPOLYGON (((6.385532 49...
#> 5 2 Grevenmacher 7 Remich 129 22366 MULTIPOLYGON (((6.316665 49...
#> 6 2 Grevenmacher 12 Grevenmacher 210 29828 MULTIPOLYGON (((6.425158 49...
#> params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> 5 list(a =....
#> 6 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP",
#> useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#> this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#> ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#> ID_1 NAME_1 ID_2 NAME_2 AREA POP geometry
#> 1 1 Diekirch 3 Redange 259 18664 POLYGON ((5.881378 49.87015...
#> 2 1 Diekirch 4 Vianden 76 5163 POLYGON ((6.131309 49.97256...
#> 3 1 Diekirch 5 Wiltz 263 16735 POLYGON ((5.977929 50.02602...
#> 4 2 Grevenmacher 6 Echternach 188 18899 POLYGON ((6.385532 49.83703...
#> params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP",
#> useCache = FALSE, purge = FALSE, overwrite = FALSE)
# }