Downloads (via downloadFile), checksums (via Checksums), and extracts from archives (via extractFromArchive), and also cleans up input arguments (e.g., paths, function names). This is the first of the three stages used in prepInputs.

preProcess(targetFile = NULL, url = NULL, archive = NULL,
  alsoExtract = NULL, destinationPath = ".", fun = NULL,
  dlFun = NULL, quick = getOption("reproducible.quick"),
  overwrite = FALSE, purge = FALSE,
  useCache = getOption("reproducible.useCache", FALSE), ...)

Arguments

targetFile

Character string giving the path to the eventual file (raster, shapefile, csv, etc.) after downloading and extracting from a zip or tar archive. This is the file before it is passed to postProcess. Currently, the internal checksumming does not checksum the file after it is postProcessed (e.g., cropped/reprojected/masked). Using Cache around prepInputs will do a sufficient job in these cases. See table in preProcess.

url

Optional character string indicating the URL to download from. If not specified, no download will be attempted. If no entry exists in the CHECKSUMS.txt file (in destinationPath), one will be created or appended. This CHECKSUMS.txt entry will be used in subsequent calls to prepInputs or preProcess, comparing the file on hand with the ad hoc CHECKSUMS.txt entry. See table in preProcess.

archive

Optional character string giving the path of an archive containing targetFile, or a vector giving a set of nested archives (e.g., c("xxx.tar", "inner.zip")). If there are inner archives whose names are unknown, the function will try each one until it finds targetFile. See table in preProcess.

alsoExtract

Optional character string naming files other than targetFile that must be extracted from the archive. If NULL, the default, all files will be extracted. Other options: "similar" will extract all files that have the same file name as targetFile once the file extension is removed. NA will extract nothing other than targetFile. A character vector of specific file names will cause only those files to be extracted. See table in preProcess.

destinationPath

Character string giving the directory in which to download and save the file that comes from url. This is also where the function will look for archive or targetFile.

fun

Function or character string indicating the function to use to load targetFile into an R object, e.g., in the form with the package name: "raster::raster".

dlFun

Optional "download function" name, such as "raster::getData", which does custom downloading, in addition to loading into R. Still experimental.

quick

Logical. This is passed internally to Checksums (the quickCheck argument), and to Cache (the quick argument). This results in faster, though less robust checking of inputs. See the respective functions.

overwrite

Logical. Should downloading and all other actions occur even if the files pass the checksums or are all already present?

purge

Logical or Integer. 0/FALSE (default) keeps the existing CHECKSUMS.txt file, and prepInputs will write or append to it. 1/TRUE will delete the entire CHECKSUMS.txt file. For other options, see details.

useCache

Passed to Cache in various places. Defaults to getOption("reproducible.useCache").

...

Additional arguments passed to fun (i.e., user supplied), postProcess and Cache. Since ... is passed to postProcess, these arguments will also be passed into the inner functions, e.g., cropInputs. See details and examples.
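To illustrate the "similar" option of alsoExtract described above, here is a minimal base-R sketch that selects the archive members sharing targetFile's file name once the extension is removed. The helper similarFiles and the file names are hypothetical and are not part of the package; the matching rule the package actually applies may differ in its details.

```r
# Hypothetical helper (not part of the package): return the entries of
# `archiveContents` whose file name, without extension, matches that of
# `targetFile`.
similarFiles <- function(targetFile, archiveContents) {
  targetBase <- tools::file_path_sans_ext(basename(targetFile))
  contentBases <- tools::file_path_sans_ext(basename(archiveContents))
  archiveContents[contentBases == targetBase]
}

# Illustrative archive listing: a shapefile with its sidecar files,
# plus two unrelated files.
contents <- c("roads.shp", "roads.dbf", "roads.shx", "roads.prj",
              "rivers.shp", "README.txt")

similar <- similarFiles("roads.shp", contents)
print(similar)
```

This is the situation where "similar" is most useful: a shapefile is only readable when its .dbf, .shx and .prj companions are extracted alongside the .shp file.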

Value

A list with 5 elements, including: checkSums (the result of a Checksums call after downloading), dots (cleaned up ..., including deprecated argument checks), fun (the function to be used to load the preProcessed object from disk), and targetFilePath (the fully qualified path to the targetFile).
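Because the fun argument may be given either as a function or as a string such as "raster::raster", while the returned fun element is described as a function, some resolution step has to happen in between. The following base-R sketch shows one way such a step could work. The helper resolveFun is hypothetical and is not the package's implementation; utils::read.csv stands in for a loader so that the example runs without extra packages.

```r
# Hypothetical helper (not part of the package): turn a function, a
# "pkg::name" string, or a bare function name into a function object.
resolveFun <- function(fun) {
  if (is.function(fun)) {
    return(fun)
  }
  parts <- strsplit(fun, "::", fixed = TRUE)[[1]]
  if (length(parts) == 2) {
    # "pkg::name": look up the exported object in the package namespace
    getExportedValue(parts[1], parts[2])
  } else {
    # bare name: find a function of that name on the search path
    get(fun, mode = "function")
  }
}

loaderFromString <- resolveFun("utils::read.csv")
loaderFromName <- resolveFun("readRDS")
loaderFromFunction <- resolveFun(readRDS)
```

Using the "pkg::name" form avoids having to attach the package before the call.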

Combinations of targetFile, url, archive, alsoExtract

# Params | url | targetFile | archive | alsoExtract | Result | Checksum 1st time | Checksum 2nd time
------------------------------------------------
1 | char | NULL | NULL | NULL | Download, extract all files if an archive, guess at targetFile, load into R | write or append all new files | same as 1st -- no targetFile*
  | NULL | char | NULL | NULL | load targetFile into R | write or append targetFile | no downloading, so no checksums use
  | NULL | NULL | char | NULL | extract all files, guess at targetFile, load into R | write or append all new files | no downloading, so no checksums use
  | NULL | NULL | NULL | char | guess at targetFile from files in alsoExtract, load into R | write or append all new files | no downloading, so no checksums use
------------------------------------------------
2 | char | char | NULL | NULL | Download, extract all files if an archive, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | char | NULL | char | NULL | Download, extract all files, guess at targetFile, load into R | write or append all new files | same as 1st -- no targetFile*
  | char | NULL | NULL | char | Download, extract only named files in alsoExtract, guess at targetFile, load into R | write or append all new files | same as 1st -- no targetFile*
  | NULL | char | NULL | char | load targetFile into R | write or append all new files | no downloading, so no checksums use
  | NULL | char | char | NULL | Extract all files, load targetFile into R | write or append all new files | no downloading, so no checksums use
  | NULL | NULL | char | char | Extract only named files in alsoExtract, guess at targetFile, load into R | write or append all new files | no downloading, so no checksums use
------------------------------------------------
3 | char | char | char | NULL | Download, extract all files, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | char | NULL | char | char | Download, extract files named in alsoExtract, guess at targetFile, load into R | write or append all new files | use Checksums, skip downloading
  | char | NULL | char | "similar" | Download, extract all files (can't understand "similar"), guess at targetFile, load into R | write or append all new files | same as 1st -- no targetFile*
  | char | char | NULL | char | Download, if an archive, extract files named in targetFile and alsoExtract, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | char | char | NULL | "similar" | Download, if an archive, extract files with same base as targetFile, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | char | char | char | NULL | Download, extract all files from archive, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | NULL | char | char | char | Extract files named in alsoExtract from archive, load targetFile into R | write or append all new files | no downloading, so no checksums use
------------------------------------------------
4 | char | char | char | char | Download, extract files named in targetFile and alsoExtract, load targetFile into R | write or append all new files | use Checksums, skip downloading
  | char | char | char | "similar" | Download, extract all files with same base as targetFile, load targetFile into R | write or append all new files | use Checksums, skip downloading

* If the url is a file on Google Drive, checksumming will work even without a targetFile specified, because there is an initial attempt to get the remote file information (e.g., file name). With that, the connection between the url and the file name used in the CHECKSUMS.txt file can be made.
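The "use Checksums, skip downloading" entries in the table rest on comparing a file on disk with a previously recorded checksum. The base-R sketch below illustrates that idea only. It assumes MD5 (via tools::md5sum) as the hash and an in-memory data frame as the record; both are stand-ins chosen for illustration and may not match the algorithm or the CHECKSUMS.txt layout that the package uses. The helper needsDownload is hypothetical.

```r
# Set up a throwaway destination directory containing one "downloaded" file.
destinationPath <- tempfile("dest")
dir.create(destinationPath)
targetFile <- file.path(destinationPath, "data.csv")
writeLines(c("id,value", "1,10"), targetFile)

# First call: record the checksum (in-memory stand-in for CHECKSUMS.txt).
recorded <- data.frame(
  file = basename(targetFile),
  checksum = unname(tools::md5sum(targetFile)),
  stringsAsFactors = FALSE
)

# Hypothetical helper: TRUE when the file is missing, has no recorded
# entry, or no longer matches its recorded checksum.
needsDownload <- function(path, recorded) {
  if (!file.exists(path)) {
    return(TRUE)
  }
  entry <- recorded[recorded$file == basename(path), ]
  if (nrow(entry) == 0) {
    return(TRUE)
  }
  !identical(unname(tools::md5sum(path)), entry$checksum)
}

# Second call: the file matches its record, so the download can be skipped.
firstCheck <- needsDownload(targetFile, recorded)

# If the file changes on disk, the mismatch triggers a fresh download.
writeLines("corrupted", targetFile)
secondCheck <- needsDownload(targetFile, recorded)
```

The sketch also shows why a known file name matters: without one there is no key under which to look up the recorded entry, which is the case the footnote above addresses for Google Drive URLs.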