csv #69

Merged
merged 31 commits into from
Mar 17, 2022
Commits
977be55
update checklist settings
ThierryO May 21, 2021
bd10219
upgrade to 7.1.2
ThierryO Nov 3, 2021
804ee7d
add CITATION information
ThierryO Nov 3, 2021
c598ad8
fix URL in README
ThierryO Nov 3, 2021
d631c73
write_vc() stores non optimized files as comma separated values inste…
ThierryO Nov 4, 2021
a8a2d3a
update vignettes
ThierryO Nov 4, 2021
88024bd
upgrade_data() handles the new format for verbose data
ThierryO Nov 5, 2021
86b3250
update NEWS
ThierryO Nov 26, 2021
383822f
update citation
ThierryO Jan 21, 2022
5b24ce3
add verify_vc()
ThierryO Jan 25, 2022
2b977d2
fix linters
ThierryO Jan 25, 2022
087fa91
reduce cyclomatic complexity
ThierryO Jan 25, 2022
52e8a52
reduce cyclomatic complexity of upgrade_data()
ThierryO Jan 25, 2022
01cd4f6
recent_commit() handles data files that no longer exist in the worksp…
ThierryO Jan 25, 2022
8bee128
update to most recent checklist
ThierryO Jan 25, 2022
50ab6a7
move DOI from Description: to URL:
ThierryO Jan 25, 2022
53166c4
vignette: handle when 'defaultBranch' is specified in '.gitconfig'
stewid Feb 23, 2022
acf8ae2
remove LazyData from DESCRIPTION
ThierryO Feb 24, 2022
5948e7b
don't mention DOI in DESCRIPION due to problems with pkgdown and roxy…
ThierryO Feb 25, 2022
372e4b6
Merge branch 'styling' into vignette-branch-name
ThierryO Feb 25, 2022
63f80ef
Merge pull request #68 from stewid/vignette-branch-name
ThierryO Feb 25, 2022
1070289
fix lintr
ThierryO Feb 25, 2022
0574b2c
set config() for git repo in unit test
ThierryO Feb 25, 2022
52526d2
add unit test for verify_vc()
ThierryO Feb 25, 2022
aa8f7e7
update NEWS
ThierryO Feb 25, 2022
982e359
install missing linux dependency
ThierryO Feb 25, 2022
edf6af8
define language in ISO 639-3 format
ThierryO Mar 14, 2022
5a311fa
add keywords to citation files
ThierryO Mar 14, 2022
51d3131
don't clean up examples
ThierryO Mar 16, 2022
2819a1b
don't clean temp files from unit tests
ThierryO Mar 16, 2022
1815ae3
update CRAN comments
ThierryO Mar 17, 2022
27 changes: 16 additions & 11 deletions .Rbuildignore
@@ -1,17 +1,22 @@
# checklist
^_pkgdown.yml$
^.*\.Rproj$
^\.Rproj\.user$
^\.github$
^codemeta\.json$
^.zenodo\.json$
^man-roxygen$
^pkgdown$
^_pkgdown.yml$
^docs$
^cran-comments\.md$
# checklist
^\.github$
^\.httr-oauth$
^\.Rproj\.user$
^\.zenodo\.json$
^checklist.yml$
^CITATION\.cff$
^codecov.yml$
^LICENSE.md$
^\.httr-oauth$
^codecov\.yml$
^codemeta\.json$
^cran-comments\.md$
^data-raw$
^doc$
^docs$
^LICENSE.md$
^man-roxygen$
^Meta$
^pkgdown$
^README\.Rmd$
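Several patterns in this file differ only in whether the dot is escaped (`^codecov.yml$` versus `^codecov\.yml$`, `^.zenodo\.json$` versus `^\.zenodo\.json$`). The sketch below illustrates why the escaped form is stricter. It uses `grep -E` as a stand-in for R's own matching of `.Rbuildignore` entries, which is an assumption: the two engines are not identical, but they agree on these simple patterns. The helper `matches` and the file names are hypothetical.

```shell
# Hypothetical helper: prints "yes" when file name $2 matches extended regex $1.
matches() {
  if printf '%s\n' "$2" | grep -Eq -- "$1"; then echo yes; else echo no; fi
}

matches '^codecov.yml$'    'codecov.yml'   # yes
matches '^codecov.yml$'    'codecovXyml'   # yes: an unescaped dot matches any character
matches '^codecov\.yml$'   'codecovXyml'   # no: an escaped dot matches only a literal dot
matches '^.zenodo\.json$'  'xzenodo.json'  # yes: the leading dot is a wildcard
matches '^\.zenodo\.json$' 'xzenodo.json'  # no: the leading dot must be literal
```

In practice the unescaped variants rarely exclude anything unintended, so the escaped forms are mostly a matter of precision.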
7 changes: 5 additions & 2 deletions .github/workflows/check_on_branch.yml
@@ -1,6 +1,7 @@
on:
push:
branches-ignore:
- main
- master
- ghpages

@@ -10,8 +11,10 @@ jobs:
check-package:
runs-on: ubuntu-latest
name: "check package"
env:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
ORCID_TOKEN: ${{ secrets.ORCID_TOKEN }}
steps:
- uses: inbo/actions/check_pkg@master
with:
CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
ORCID_TOKEN: ${{ secrets.ORCID_TOKEN }}
token: ${{ secrets.PAT }}
12 changes: 9 additions & 3 deletions .github/workflows/check_on_different_r_os.yml
@@ -1,12 +1,14 @@
on:
push:
branches:
- main
- master
pull_request:
branches:
- main
- master

name: R-CMD-check
name: R-CMD-check-OS

jobs:
R-CMD-check:
@@ -21,10 +23,11 @@ jobs:
- {os: macOS-latest, r: 'release'}
- {os: windows-latest, r: 'release'}
- {os: ubuntu-20.04, r: 'devel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}
- {os: ubuntu-16.04, r: 'oldrel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/xenial/latest"}
- {os: ubuntu-20.04, r: 'oldrel', rspm: "https://packagemanager.rstudio.com/cran/__linux__/focal/latest"}

env:
R_REMOTES_NO_ERRORS_FROM_WARNINGS: true
_R_CHECK_SYSTEM_CLOCK_: false
RSPM: ${{ matrix.config.rspm }}
GITHUB_PAT: ${{ secrets.GITHUB_TOKEN }}
ORCID_TOKEN: ${{ secrets.ORCID_TOKEN }}
@@ -61,6 +64,8 @@ jobs:
Rscript -e "remotes::install_github('r-hub/sysreqs')"
sysreqs=$(Rscript -e "cat(sysreqs::sysreq_commands('DESCRIPTION'))")
sudo -s eval "$sysreqs"
sudo apt-get install -y libcurl4-openssl-dev

- name: Install dependencies
run: |
remotes::install_deps(dependencies = TRUE)
@@ -77,7 +82,7 @@ jobs:
- name: Check
env:
_R_CHECK_CRAN_INCOMING_: false
run: rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "warning", check_dir = "check")
run: rcmdcheck::rcmdcheck(args = c("--no-manual", "--as-cran"), error_on = "error", check_dir = "check")
shell: Rscript {0}

- name: Show testthat output
@@ -91,3 +96,4 @@ jobs:
with:
name: ${{ runner.os }}-r${{ matrix.config.r }}-results
path: check
retention-days: 5
21 changes: 21 additions & 0 deletions .github/workflows/check_on_main.yml
@@ -0,0 +1,21 @@
on:
  push:
    branches:
      - main
      - master
  schedule:
    - cron: '6 0 15 * *'

name: "check package on main"

jobs:
  check-package:
    runs-on: ubuntu-latest
    name: "check package"
    env:
      CODECOV_TOKEN: ${{ secrets.CODECOV_TOKEN }}
      ORCID_TOKEN: ${{ secrets.ORCID_TOKEN }}
    steps:
      - uses: inbo/actions/check_pkg@master
        with:
          token: ${{ secrets.PAT }}
19 changes: 0 additions & 19 deletions .github/workflows/check_on_master.yml

This file was deleted.

30 changes: 30 additions & 0 deletions .github/workflows/release.yml
@@ -0,0 +1,30 @@
on:
  push:
    tags:
      - 'v*'

name: Create Release

jobs:
  build:
    name: Create Release
    runs-on: ubuntu-latest
    steps:
      - name: Checkout code
        uses: actions/checkout@v2
      - name: Get tag message
        run: |
          TAG_BODY=$(git tag --contains ${{ github.sha }} -n100 | awk '(NR>1)')
          echo "::set-output name=TAG_BODY::$TAG_BODY"
        id: tag-body
      - name: Create Release
        id: create_release
        uses: actions/create-release@v1
        env:
          GITHUB_TOKEN: ${{ secrets.GITHUB_TOKEN }}
        with:
          tag_name: ${{ github.ref }}
          release_name: Release ${{ github.ref }}
          body: ${{ steps.tag-body.outputs.TAG_BODY }}
          draft: false
          prerelease: false
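The `Get tag message` step pipes the tag listing through `awk '(NR>1)'`, which drops the first line (tag name plus subject) and keeps the rest of the annotation as the release body. A minimal sketch of that filter follows. The function `sample_tag_output` and its text are hypothetical stand-ins for the output of `git tag --contains <sha> -n100`, and the exact layout of that output is assumed here rather than taken from this pull request.

```shell
# Hypothetical stand-in for `git tag --contains <sha> -n100`:
# the first line holds the tag name and subject, later lines hold the body.
sample_tag_output() {
  printf '%s\n' \
    'v0.4.0          git2rdata 0.4.0' \
    '    * store non optimised files as csv' \
    '    * add verify_vc()'
}

# Same filter as in the workflow: keep every line except the first.
TAG_BODY=$(sample_tag_output | awk '(NR>1)')
printf '%s\n' "$TAG_BODY"
```

With this sample input, `TAG_BODY` holds the two indented bullet lines and the tag-name line is discarded.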
11 changes: 6 additions & 5 deletions .gitignore
@@ -1,9 +1,10 @@
.Rproj.user
.Rhistory
.httr-oauth
.RData
.Rhistory
.Rproj.user
.Ruserdata
inst/doc
docs
.httr-oauth
*.html
doc
docs
inst/doc
Meta
47 changes: 47 additions & 0 deletions .zenodo.json
@@ -0,0 +1,47 @@
{
"title": "git2rdata: Store and Retrieve Data.frames in a Git Repository",
"version": "0.4.0",
"description": "The git2rdata package is an R package for writing and reading dataframes as plain text files. A metadata file stores important information. 1) Storing metadata allows to maintain the classes of variables. By default, git2rdata optimizes the data for file storage. The optimization is most effective on data containing factors. The optimization makes the data less human readable. The user can turn this off when they prefer a human readable format over smaller files. Details on the implementation are available in vignette(\"plain_text\", package = \"git2rdata\"). 2) Storing metadata also allows smaller row based diffs between two consecutive commits. This is a useful feature when storing data as plain text files under version control. Details on this part of the implementation are available in vignette(\"version_control\", package = \"git2rdata\"). Although we envisioned git2rdata with a git workflow in mind, you can use it in combination with other version control systems like subversion or mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow. vignette(\"workflow\", package = \"git2rdata\") gives a toy example. 4) vignette(\"efficiency\", package = \"git2rdata\") provides some insight into the efficiency of file storage, git repository size and speed for writing and reading.",
"creators": [
{
"name": "Onkelinx, Thierry",
"orcid": "https://orcid.org/0000-0001-8804-4216"
}
],
"upload_type": "software",
"access_right": "open",
"license": "GPL-3.0",
"communities": [
{
"identifier": "inbo"
}
],
"contributors": [
{
"name": "Vanderhaeghe, Floris",
"type": "ProjectMember",
"orcid": "https://orcid.org/0000-0002-6378-6229"
},
{
"name": "Desmet, Peter",
"type": "ProjectMember",
"orcid": "https://orcid.org/0000-0002-8442-8025"
},
{
"name": "Lommelen, Els",
"type": "ProjectMember",
"orcid": "https://orcid.org/0000-0002-3481-5684"
},
{
"name": "Research Institute for Nature and Forest",
"type": "RightsHolder"
},
{
"name": "Onkelinx, Thierry",
"type": "ContactPerson",
"orcid": "https://orcid.org/0000-0001-8804-4216"
}
],
"language": "eng",
"keywords": ["R package", "reproducible research", "version control"]
}
40 changes: 40 additions & 0 deletions CITATION.cff
@@ -0,0 +1,40 @@
cff-version: 1.2.0
message: If you use this software, please cite it as below.
authors:
- family-names: Onkelinx
  given-names: Thierry
  orcid: https://orcid.org/0000-0001-8804-4216
contact:
- email: [email protected]
  family-names: Onkelinx
  given-names: Thierry
- email: [email protected]
  name: Research Institute for Nature and Forest
title: 'git2rdata: Store and Retrieve Data.frames in a Git Repository'
version: 0.4.0
abstract: The git2rdata package is an R package for writing and reading dataframes
  as plain text files. A metadata file stores important information. 1) Storing metadata
  allows to maintain the classes of variables. By default, git2rdata optimizes the
  data for file storage. The optimization is most effective on data containing factors.
  The optimization makes the data less human readable. The user can turn this off
  when they prefer a human readable format over smaller files. Details on the implementation
  are available in vignette("plain_text", package = "git2rdata"). 2) Storing metadata
  also allows smaller row based diffs between two consecutive commits. This is a useful
  feature when storing data as plain text files under version control. Details on
  this part of the implementation are available in vignette("version_control", package
  = "git2rdata"). Although we envisioned git2rdata with a git workflow in mind, you
  can use it in combination with other version control systems like subversion or
  mercurial. 3) git2rdata is a useful tool in a reproducible and traceable workflow.
  vignette("workflow", package = "git2rdata") gives a toy example. 4) vignette("efficiency",
  package = "git2rdata") provides some insight into the efficiency of file storage,
  git repository size and speed for writing and reading.
license: GPL-3.0
type: software
repository-code: https://github.com/ropensci/git2rdata/
identifiers:
- type: url
  value: https://ropensci.github.io/git2rdata/
keywords:
- R package
- reproducible research
- version control
49 changes: 18 additions & 31 deletions DESCRIPTION
@@ -1,30 +1,17 @@
Package: git2rdata
Title: Store and Retrieve Data.frames in a Git Repository
Version: 0.3.1
Authors@R:
c(person(given = "Thierry",
family = "Onkelinx",
role = c("aut", "cre"),
email = "[email protected]",
comment = c(ORCID = "0000-0001-8804-4216")),
person(given = "Floris",
family = "Vanderhaeghe",
role = "ctb",
email = "[email protected]",
comment = c(ORCID = "0000-0002-6378-6229")),
person(given = "Peter",
family = "Desmet",
role = "ctb",
email = "[email protected]",
comment = c(ORCID = "0000-0002-8442-8025")),
person(given = "Els",
family = "Lommelen",
role = "ctb",
email = "[email protected]",
comment = c(ORCID = "0000-0002-3481-5684")),
person(given = "Research Institute for Nature and Forest",
role = c("cph", "fnd"),
email = "[email protected]"))
Version: 0.4.0
Authors@R: c(
person("Thierry", "Onkelinx", , "[email protected]", role = c("aut", "cre"),
comment = c(ORCID = "0000-0001-8804-4216")),
person("Floris", "Vanderhaeghe", , "[email protected]", role = "ctb",
comment = c(ORCID = "0000-0002-6378-6229")),
person("Peter", "Desmet", , "[email protected]", role = "ctb",
comment = c(ORCID = "0000-0002-8442-8025")),
person("Els", "Lommelen", , "[email protected]", role = "ctb",
comment = c(ORCID = "0000-0002-3481-5684")),
person("Research Institute for Nature and Forest", , , "[email protected]", role = c("cph", "fnd"))
)
Description: The git2rdata package is an R package for writing and reading
dataframes as plain text files. A metadata file stores important
information. 1) Storing metadata allows to maintain the classes of
@@ -44,10 +31,10 @@ Description: The git2rdata package is an R package for writing and reading
traceable workflow. vignette("workflow", package = "git2rdata") gives
a toy example. 4) vignette("efficiency", package = "git2rdata")
provides some insight into the efficiency of file storage, git
repository size and speed for writing and reading. Please cite using
<doi:10.5281/zenodo.1485309>.
repository size and speed for writing and reading.
License: GPL-3
URL: https://ropensci.github.io/git2rdata/
URL: https://ropensci.github.io/git2rdata/,
https://github.com/ropensci/git2rdata/
BugReports: https://github.com/ropensci/git2rdata/issues
Depends:
R (>= 3.5.0)
@@ -66,10 +53,9 @@ Suggests:
VignetteBuilder:
knitr
Encoding: UTF-8
Language: en-GB
LazyData: true
Language: eng
Roxygen: list(markdown = TRUE)
RoxygenNote: 7.1.1
RoxygenNote: 7.1.2
Collate:
'clean_data_path.R'
'datahash.R'
@@ -87,3 +73,4 @@ Collate:
'rename_variable.R'
'upgrade_data.R'
'utils.R'
'verify_vc.R'
2 changes: 2 additions & 0 deletions NAMESPACE
@@ -61,6 +61,7 @@ export(repository)
export(rm_data)
export(status)
export(upgrade_data)
export(verify_vc)
export(write_vc)
importFrom(assertthat,"on_failure<-")
importFrom(assertthat,assert_that)
@@ -81,6 +82,7 @@ importFrom(git2r,workdir)
importFrom(methods,setOldClass)
importFrom(stats,setNames)
importFrom(utils,file_test)
importFrom(utils,flush.console)
importFrom(utils,packageVersion)
importFrom(utils,read.table)
importFrom(utils,write.table)
17 changes: 17 additions & 0 deletions NEWS.md
@@ -1,3 +1,20 @@
# git2rdata 0.4.0

## New features

* `write_vc()` stores non-optimised files as comma separated values rather than
  tab separated values.
  The general public seems to recognise `.csv` files more readily than `.tsv`
  files as data files.
* Add a new function `verify_vc()` which reads a `git2rdata` object and verifies
  the presence of a set of variables.
  It returns the data upon success.

## Internal changes

* Upgrade to Roxygen2 7.1.2
* Add `inst/CITATION`, `CITATION.cff`, `.zenodo.json`

# git2rdata 0.3.1

* Use `icuSetCollate()` to define a standardised sorting.