experimental

CacheGeo(
  targetFile = NULL,
  domain,
  FUN,
  destinationPath = getOption("reproducible.destinationPath", "."),
  useCloud = getOption("reproducible.useCloud", FALSE),
  cloudFolderID = NULL,
  purge = FALSE,
  useCache = getOption("reproducible.useCache"),
  overwrite = getOption("reproducible.overwrite"),
  action = c("nothing", "update", "replace", "append"),
  bufferOK = FALSE,
  verbose = getOption("reproducible.verbose"),
  ...
)

Arguments

targetFile

The (optional) local file name (or path to the file) for an sf object, or for a data.frame that can be coerced to an sf object (i.e., has a geometry column). If cloudFolderID is specified, then this will also be the name of the file stored and/or accessed in that cloud folder.

domain

An sf polygon object that is the spatial area of interest. If NULL, then this will return the whole object in targetFile.

FUN

A function call that will be evaluated if the domain is not already contained within the sf object at targetFile. This function call MUST return either an sf class object or a data.frame class object that has a geometry column (which can then be converted to sf with sf::st_as_sf).

destinationPath

Character string of a directory in which to download and save the file that comes from url and is also where the function will look for archive or targetFile. NOTE (still experimental): To prevent repeated downloads in different locations, the user can also set options("reproducible.inputPaths") to one or more local file paths to search for the file before attempting to download. Default for that option is NULL meaning do not search locally.

useCloud

A logical.

cloudFolderID

If this is specified, then it must be either 1) a Google Drive url to a folder where the targetFile will be read from or written to, or 2) a googledrive id or 3) an absolute path to a (possibly non-existent yet) folder on your Google drive.

purge

Logical or Integer. 0/FALSE (default) keeps the existing CHECKSUMS.txt file, and prepInputs will write or append to it. 1/TRUE will delete the entire CHECKSUMS.txt file. For other options, see details.

useCache

Passed to Cache in various places. Defaults to getOption("reproducible.useCache", 2L) in prepInputs, and getOption("reproducible.useCache", FALSE) if calling any of the inner functions manually. For prepInputs, this means it will use Cache only up to 2 nested levels, which includes preProcess. postProcess and its nested *Input functions (e.g., cropInputs, projectInputs, maskInputs) are no longer internally cached, because terra processing is fast enough that internal caching takes more time than it saves. We recommend caching the full prepInputs call instead (e.g., prepInputs(...) |> Cache()).

overwrite

Logical. Passed to writeTo (possibly inside postProcess) and postProcess.

action

A character string, one of c("nothing", "update", "replace", "append"). Partial matching is used ("n" is sufficient). nothing will prevent any updating of the targetFile, i.e., "read only". append will add the spatial elements in domain to targetFile (and write it back to disk). update will do the same as append, but will also remove any identical geometries before appending. replace currently does nothing.
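The partial matching described above behaves like base R's match.arg. The following is a minimal sketch of that behaviour only; it assumes, without confirmation from this page, that CacheGeo resolves action in the same way, and the variable name actions is introduced here for illustration.

```r
# Hypothetical illustration of partial matching on the `action` choices.
actions <- c("nothing", "update", "replace", "append")

# A unique prefix is enough to select a choice
short <- match.arg("n", actions)
longer <- match.arg("app", actions)

# Passing the full vector of choices (i.e., the unmodified default)
# selects the first element
dflt <- match.arg(actions, actions)

print(c(short, longer, dflt))
```

An unambiguous prefix is required: a string that does not match any single choice makes match.arg raise an error rather than guess.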

bufferOK

A logical. If TRUE, and the test of whether the domain is within the targetFile spatial object returns FALSE, then the function will create a larger object, buffered by 2.5% of the extent of the object, and test again. If FALSE, then it will be strict about whether the domain is within the targetFile.
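To make the strict versus buffered test concrete, here is a hedged sketch using sf on planar rectangles without a CRS. It is not the package's internal code: the helper rect, the choice of the larger bounding-box side as "the extent", and the decision to buffer the stored object rather than the domain are all assumptions made for illustration.

```r
# Hypothetical helper: build a rectangular sfc polygon (no CRS, planar)
rect <- function(xmin, xmax, ymin, ymax) {
  m <- rbind(c(xmin, ymin), c(xmax, ymin), c(xmax, ymax),
             c(xmin, ymax), c(xmin, ymin))
  sf::st_sfc(sf::st_polygon(list(m)))
}

target <- rect(0, 10, 0, 10)     # stands in for the object in targetFile
domain <- rect(9, 10.1, 4, 5)    # pokes 0.1 units outside the target

# Strict test: is the domain entirely within the target?
strict <- sf::st_within(domain, target, sparse = FALSE)[1, 1]

# Buffered test: grow the target by 2.5% of its extent
# (assumption: "extent" taken as the larger side of the bounding box)
bb <- sf::st_bbox(target)
dist <- 0.025 * max(bb[["xmax"]] - bb[["xmin"]], bb[["ymax"]] - bb[["ymin"]])
buffered <- sf::st_within(domain, sf::st_buffer(target, dist), sparse = FALSE)[1, 1]

print(c(strict = strict, buffered = buffered))
```

With these numbers the buffer distance is 0.25, so a domain that overshoots the target by 0.1 fails the strict test but passes the buffered one.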

verbose

Numeric: -1 is silent (where possible), 0 is very quiet, 1 shows more messaging, 2 shows still more, etc. Default is 1. Values above 3 will output much more information about the internals of Caching, which may help diagnose Caching challenges. Can be set globally with an option, e.g., options('reproducible.verbose' = 0), to reduce messaging to a minimum.
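As a small illustration of the global option mentioned above, the sketch below sets reproducible.verbose, reads it back the way the default argument does, and then restores the previous state. It uses only base R; the variable names old and current are introduced here for illustration.

```r
# Set the option, keeping the previous value so it can be restored
old <- options(reproducible.verbose = 0)

# This is the lookup used in the default: verbose = getOption("reproducible.verbose")
current <- getOption("reproducible.verbose")
print(current)

# Restore whatever was set before (or unset it if it was not set)
options(old)
```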

...

All named objects that are needed for FUN, including the function itself, if it is not in a package.

Value

Returns an object that results from FUN, which will possibly be a subset of a larger spatial object that is specified with targetFile.

Details

This function is a combination of Cache and prepInputs, but for spatial domains. It differs from Cache in that the current function call does not need to be identical to a previously run function call. Instead, there needs to have been a previous function call where the domain being passed is within the geographic limits of the targetFile. This is similar to a geospatial operation on a remote GIS server, with 2 differences:

  1. It downloads the object first and then does the GIS operation locally.
  2. It will optionally upload an updated object if the geographic area did not yet exist.

This has a very specific use case: assess whether an existing sf polygon or multipolygon object (local or remote) covers the spatial area of a domain of interest. If it does, then return only that part of the sf object that completely covers the domain. If it does not, then run FUN. It is expected that FUN will produce an sf polygon or multipolygon class object. The result of FUN will then be appended to the sf object as a new entry (feature) or it will replace the existing "same extent" entry in the sf object.
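The cover-or-compute logic described above can be sketched with a toy model in base R, where axis-aligned rectangles stand in for sf polygons and an in-memory data.frame stands in for targetFile. This is only an illustration under those assumptions; covers, cacheGeoSketch, store and area are hypothetical names, not part of the package.

```r
# Does each rectangle in `outer` fully cover the single rectangle `inner`?
covers <- function(outer, inner) {
  outer$xmin <= inner$xmin & outer$xmax >= inner$xmax &
    outer$ymin <= inner$ymin & outer$ymax >= inner$ymax
}

cacheGeoSketch <- function(store, domain, FUN) {
  hit <- if (nrow(store) > 0) covers(store, domain) else logical(0)
  if (any(hit)) {
    # A stored entry already covers the domain: return it without running FUN
    return(list(store = store, result = store[which(hit)[1], ], ranFUN = FALSE))
  }
  # Otherwise run FUN and append the result as a new entry
  new <- cbind(as.data.frame(domain), value = FUN(domain))
  list(store = rbind(store, new), result = new, ranFUN = TRUE)
}

# Empty store with the columns the sketch expects
store <- data.frame(xmin = numeric(), xmax = numeric(),
                    ymin = numeric(), ymax = numeric(), value = numeric())
area <- function(d) (d$xmax - d$xmin) * (d$ymax - d$ymin)

zoneA <- list(xmin = 0, xmax = 10, ymin = 0, ymax = 10)
zoneC <- list(xmin = 2, xmax = 4, ymin = 2, ymax = 4) # inside zoneA

r1 <- cacheGeoSketch(store, zoneA, area)    # not covered yet: runs FUN
r2 <- cacheGeoSketch(r1$store, zoneC, area) # covered by zoneA: reuses entry
print(c(first = r1$ranFUN, second = r2$ranFUN))
```

The second call returns the stored zoneA entry instead of computing a new value, which mirrors how a domain inside a previously computed area avoids a new call to FUN.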

Examples

# \donttest{

if (requireNamespace("sf", quietly = TRUE) &&
    requireNamespace("terra", quietly = TRUE)) {
  dPath <- checkPath(file.path(tempdir2()), create = TRUE)
  localFileLux <- system.file("ex/lux.shp", package = "terra")

  # 1 step for each layer
  # 1st step -- get study area
  full <- prepInputs(localFileLux, destinationPath = dPath) # default is sf::st_read
  zoneA <- full[3:6, ]
  zoneB <- full[8, ] # not in A
  zoneC <- full[3, ] # yes in A
  zoneD <- full[7:8, ] # not in A, B or C
  zoneE <- full[3:5, ] # yes in A
  # 2nd step: re-write to disk as read/write is lossy; want all "from disk" for this ex.
  writeTo(zoneA, writeTo = "zoneA.shp", destinationPath = dPath)
  writeTo(zoneB, writeTo = "zoneB.shp", destinationPath = dPath)
  writeTo(zoneC, writeTo = "zoneC.shp", destinationPath = dPath)
  writeTo(zoneD, writeTo = "zoneD.shp", destinationPath = dPath)
  writeTo(zoneE, writeTo = "zoneE.shp", destinationPath = dPath)
  # Must re-read to get identical columns
  zoneA <- sf::st_read(file.path(dPath, "zoneA.shp"))
  zoneB <- sf::st_read(file.path(dPath, "zoneB.shp"))
  zoneC <- sf::st_read(file.path(dPath, "zoneC.shp"))
  zoneD <- sf::st_read(file.path(dPath, "zoneD.shp"))
  zoneE <- sf::st_read(file.path(dPath, "zoneE.shp"))

  # The function that is to be run. This example returns a data.frame because
  #    saving `sf` class objects with list-like columns does not work with
  #    many st_driver()
  fun <- function(domain, newField) {
    domain |>
      as.data.frame() |>
      cbind(params = I(lapply(seq_len(NROW(domain)), function(x) newField)))
  }

  # Run sequence -- A, B will add new entries in targetFile, C will not,
  #                 D will, E will not
  for (z in list(zoneA, zoneB, zoneC, zoneD, zoneE)) {
    out <- CacheGeo(
      targetFile = "fireSenseParams.rds",
      domain = z,
      FUN = fun(domain, newField = I(list(list(a = 1, b = 1:2, c = "D")))),
      fun = fun, # pass whatever is needed into the function
      destinationPath = dPath,
      action = "update"
      # , cloudFolderID = "cachedObjects" # to upload/download from cloud
    )
  }
}
#> Running prepInputs
#> Running `preProcess`
#> Preparing: /home/runner/work/_temp/Library/terra/ex/lux.shp
#> alsoExtract is unspecified; assuming that all files must be extracted
#>   Using sf::st_read on shapefile because sf package is available; to force old
#>     behaviour with 'raster::shapefile' use fun = 'raster::shapefile' or
#>     options('reproducible.shapefileRead' = 'raster::shapefile')
#> Running `process` (i.e., loading file into R)
#> targetFile located at:
#> /home/runner/work/_temp/Library/terra/ex/lux.shp
#> Loading object into R
#> Reading layer `lux' from data source 
#>   `/home/runner/work/_temp/Library/terra/ex/lux.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 12 features and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 5.74414 ymin: 49.44781 xmax: 6.528252 ymax: 50.18162
#> Geodetic CRS:  WGS 84
#> Saved! Cache file: d598abd965f6e371.rds; fn: sf::st_read
#> writing...
#> Writing layer `zoneA' to data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneA.shp' using driver `ESRI Shapefile'
#> Writing 4 features with 6 fields and geometry type Polygon.
#> done! took:  0.0042 secs
#> writing...
#> Writing layer `zoneB' to data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneB.shp' using driver `ESRI Shapefile'
#> Writing 1 features with 6 fields and geometry type Polygon.
#> done! took:  0.00251 secs
#> writing...
#> Writing layer `zoneC' to data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneC.shp' using driver `ESRI Shapefile'
#> Writing 1 features with 6 fields and geometry type Polygon.
#> done! took:  0.00251 secs
#> writing...
#> Writing layer `zoneD' to data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneD.shp' using driver `ESRI Shapefile'
#> Writing 2 features with 6 fields and geometry type Polygon.
#> done! took:  0.00262 secs
#> writing...
#> Writing layer `zoneE' to data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneE.shp' using driver `ESRI Shapefile'
#> Writing 3 features with 6 fields and geometry type Polygon.
#> done! took:  0.00264 secs
#> Reading layer `zoneA' from data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneA.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 4 features and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 5.74414 ymin: 49.69933 xmax: 6.528252 ymax: 50.03632
#> Geodetic CRS:  WGS 84
#> Reading layer `zoneB' from data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneB.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 6.169137 ymin: 49.58699 xmax: 6.516485 ymax: 49.75016
#> Geodetic CRS:  WGS 84
#> Reading layer `zoneC' from data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneC.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 1 feature and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 5.746118 ymin: 49.69933 xmax: 6.029145 ymax: 49.89461
#> Geodetic CRS:  WGS 84
#> Reading layer `zoneD' from data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneD.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 2 features and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 6.169137 ymin: 49.46498 xmax: 6.516485 ymax: 49.75016
#> Geodetic CRS:  WGS 84
#> Reading layer `zoneE' from data source 
#>   `/tmp/RtmphO6j97/reproducible/rU9cvXhP/zoneE.shp' using driver `ESRI Shapefile'
#> Simple feature collection with 3 features and 6 fields
#> Geometry type: POLYGON
#> Dimension:     XY
#> Bounding box:  xmin: 5.74414 ymin: 49.69933 xmax: 6.239243 ymax: 50.03632
#> Geodetic CRS:  WGS 84
#> Domain is not contained within the targetFile; running FUN
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#>   this will not persist across R sessions.
#> Running prepInputs
#> Running `preProcess`
#> Preparing: fireSenseParams.rds
#> alsoExtract is unspecified; assuming that all files must be extracted
#> Running `process` (i.e., loading file into R)
#> targetFile located at:
#> /tmp/RtmphO6j97/reproducible/rU9cvXhP/fireSenseParams.rds
#> Loading object into R
#> targetFile is already a binary; skipping Cache while loading
#> Saved! Cache file: a23e6daa3fc8bdc5.rds; fn: prepInputs_fireSenseParams.rds
#> Domain is not contained within the targetFile; running FUN
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#>   ID_1       NAME_1 ID_2       NAME_2 AREA   POP                       geometry
#> 1    1     Diekirch    3      Redange  259 18664 MULTIPOLYGON (((5.881378 49...
#> 2    1     Diekirch    4      Vianden   76  5163 MULTIPOLYGON (((6.131309 49...
#> 3    1     Diekirch    5        Wiltz  263 16735 MULTIPOLYGON (((5.977929 50...
#> 4    2 Grevenmacher    6   Echternach  188 18899 MULTIPOLYGON (((6.385532 49...
#> 5    2 Grevenmacher   12 Grevenmacher  210 29828 MULTIPOLYGON (((6.425158 49...
#>         params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> 5 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP", 
#>     useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#>   this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#>   ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#>   ID_1       NAME_1 ID_2     NAME_2 AREA   POP                       geometry
#> 1    1     Diekirch    3    Redange  259 18664 POLYGON ((5.881378 49.87015...
#> 2    1     Diekirch    4    Vianden   76  5163 POLYGON ((6.131309 49.97256...
#> 3    1     Diekirch    5      Wiltz  263 16735 POLYGON ((5.977929 50.02602...
#> 4    2 Grevenmacher    6 Echternach  188 18899 POLYGON ((6.385532 49.83703...
#>         params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP", 
#>     useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#>   this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#>   ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Domain is not contained within the targetFile; running FUN
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#>   ID_1       NAME_1 ID_2       NAME_2 AREA   POP                       geometry
#> 1    1     Diekirch    3      Redange  259 18664 MULTIPOLYGON (((5.881378 49...
#> 2    1     Diekirch    4      Vianden   76  5163 MULTIPOLYGON (((6.131309 49...
#> 3    1     Diekirch    5        Wiltz  263 16735 MULTIPOLYGON (((5.977929 50...
#> 4    2 Grevenmacher    6   Echternach  188 18899 MULTIPOLYGON (((6.385532 49...
#> 5    2 Grevenmacher    7       Remich  129 22366 MULTIPOLYGON (((6.316665 49...
#> 6    2 Grevenmacher   12 Grevenmacher  210 29828 MULTIPOLYGON (((6.425158 49...
#>         params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> 5 list(a =....
#> 6 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP", 
#>     useCache = FALSE, purge = FALSE, overwrite = FALSE)
#> No cachePath supplied and getOption('reproducible.cachePath') is inside a temporary directory;
#>   this will not persist across R sessions.
#> Object to retrieve (fn: prepInputs_fireSenseParams.rds, a23e6daa3fc8bdc5.rds)
#>   ...
#> Loaded! Cached result from previous prepInputs_fireSenseParams.rds call
#> Spatial domain is contained within the url; returning the object
#> To get the full object from googledrive, which looks like this:
#>   ID_1       NAME_1 ID_2     NAME_2 AREA   POP                       geometry
#> 1    1     Diekirch    3    Redange  259 18664 POLYGON ((5.881378 49.87015...
#> 2    1     Diekirch    4    Vianden   76  5163 POLYGON ((6.131309 49.97256...
#> 3    1     Diekirch    5      Wiltz  263 16735 POLYGON ((5.977929 50.02602...
#> 4    2 Grevenmacher    6 Echternach  188 18899 POLYGON ((6.385532 49.83703...
#>         params
#> 1 list(a =....
#> 2 list(a =....
#> 3 list(a =....
#> 4 list(a =....
#> ... run the following:
#> prepInputs(targetFile = "fireSenseParams.rds", url = NULL, destinationPath = "/tmp/RtmphO6j97/reproducible/rU9cvXhP", 
#>     useCache = FALSE, purge = FALSE, overwrite = FALSE)
# }