-
Bryan Mayer talking about DataPackageR and the trials of sharing data. Lots of issues, hard to notate final final versions of data... #Cascadiarconf
-
DataPackageR: Versionable data, packaged together with its processing scripts and the processed output. An @rOpenSci joint. #Cascadiarconf
-
@rOpenSci Good for mature workflows using version control, and packaging multiple datasets. Not for really large data. Data is shared as R package, easy to load in data as you need it. #Cascadiarconf
-
@rOpenSci Set up with datapackage_skeleton(). A YAML file (datapackager.yml) controls the package build process. The data-raw/ folder houses the user's data processing code. #Cascadiarconf
-
[editorial note: I was a reviewer on DataPackageR for Gates open sci, and we use DataPackageR for our work]
-
Convenience functions exist for adding/removing data objects in the package. Store data processing code in data-raw/, and get at the raw files in inst/extdata using project_extdata_path() #Cascadiarconf
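
[A rough sketch of that setup, not from the talk. The package name "demopkg", the script make_vec.R, and the objects "vec" and "vec2" are all made-up placeholders, and I'm assuming a plain .R file is accepted as a processing script alongside .Rmd.]

```r
library(DataPackageR)

# Hypothetical processing script that creates a single object called `vec`.
script <- file.path(tempdir(), "make_vec.R")
writeLines("vec <- 1:10", script)

# Create the package skeleton. The script is copied into data-raw/ and the
# build configuration is written to datapackager.yml.
datapackage_skeleton(
  name = "demopkg",
  path = tempdir(),
  force = TRUE,
  code_files = script,
  r_object_names = "vec"
)

pkg_dir <- file.path(tempdir(), "demopkg")

# Read the YAML config, register a second (placeholder) object, remove it
# again, and write the config back to the package directory.
config <- yml_find(pkg_dir)
config <- yml_add_objects(config, "vec2")
config <- yml_remove_objects(config, "vec2")
yml_write(config)
```

[yml_add_files() and yml_remove_files() do the same job for processing scripts.]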
-
Once everything is set, build the package. The build goes through the datasets and creates documentation stubs. Fill those out, then run devtools::document(). After that you can build it as a normal R package. #Cascadiarconf
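
[A minimal sketch of the build step, again with the placeholder package "demopkg" and a made-up script. The talk mentioned devtools::document(). Here I use DataPackageR's own document() wrapper instead, on the assumption that the hand-edited docs live under data-raw/. The build needs rmarkdown and pandoc available.]

```r
library(DataPackageR)

# Hypothetical one-line processing script and skeleton, as a starting point.
script <- file.path(tempdir(), "make_vec.R")
writeLines("vec <- 1:10", script)
datapackage_skeleton(
  name = "demopkg",
  path = tempdir(),
  force = TRUE,
  code_files = script,
  r_object_names = "vec"
)

pkg_dir <- file.path(tempdir(), "demopkg")

# Run the processing code, store the resulting objects in the package,
# generate documentation stubs, and build the source tarball.
tarball <- package_build(pkg_dir, install = FALSE)

# After filling out the generated documentation by hand, rebuild the docs.
document(pkg_dir)
```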