- `read_vc()` handles empty datasets stored with `split_by`.
- `write_vc()` and `meta()` gain a `digits` argument. The argument specifies the number of significant digits to store for numeric values.
- Add `update_metadata()` to update the description of a `git2rdata` object. See `vignette("metadata")` for more details.
- Update the checklist and pkgdown infrastructure.
- `write_vc()` stores non-optimised files as comma-separated values rather than tab-separated values. The general public seems to recognise `.csv` files more readily than `.tsv` files as being data files.
- Add a new function `verify_vc()` which reads a `git2rdata` object and verifies the presence of a set of variables. It returns the data upon success.
- Upgrade to Roxygen2 7.1.2
- Add `inst/CITATION`, `CITATION.cff`, `.zenodo.json`.
- Use `icuSetCollate()` to define a standardised sorting.
- `write_vc()` gains an optional `split_by` argument. See `vignette("split_by")` for more details.
- `rename_variable()` efficiently renames variables in a stored `git2rdata` object.
- `read_vc()`, `is_git2rdata()` and `is_git2rmeta()` now yield a better message when both the data and metadata are missing.
- Use the checklist package for CI.
- Explicitly use the `stringsAsFactors` argument of `data.frame()` in the examples and unit tests if the dataframe contains characters. The upcoming change in the default value of `stringsAsFactors` requires this change. See https://developer.r-project.org/Blog/public/2020/02/16/stringsasfactors/index.html
- Calculation of data hash has changed (#53). You must use `upgrade_data()` to read data stored by an older version.
- `is_git2rdata()` and `upgrade_data()` no longer test equality in data hashes (but `read_vc()` still does).
- `write_vc()` and `read_vc()` fail when `file` is a location outside of `root` (#50).
- Reordering factor levels requires `strict = TRUE`.
- Linux and Windows machines now generate the same data hash (#49).
- Internal sorting uses the "C" locale, regardless of the current locale.
- `read_vc()` reads data stored by an older version (#44). When the version is too old, it prompts to `upgrade_data()`.
- Improve `warnings()` and `error()` messages.
- Use vector version of logo.
- Transfer to rOpenSci.
- Use new logo (@peterdesmet, #37).
- Add estimate of upper bound of the number of commits.
- `upgrade_data()` uses the same order of the metadata as `write_vc()`.
- `write_vc()` stores the `git2rdata` version number to the metadata. Use `upgrade_data()` to update existing data.
- `read_vc()` checks the meta data hash. A mismatch results in an error.
- The meta data gains a data hash. A mismatch throws a warning when reading the object. This tolerates updating the data by other software, while informing the user that such a change occurred.
- `is_git2rmeta()` validates metadata.
- `list_data()` lists files with valid metadata.
- `rm_data()` and `prune_meta()` remove files with valid metadata. They don't touch `tsv` files without metadata or `yml` files not associated with `git2rdata`.
- Files with invalid metadata yield a warning with `list_data()`, `rm_data()` and `prune_meta()`.
- `write_vc()` and `relabel()` handle empty strings (`''`) in characters and factors (#24).
- `read_vc()` no longer treats `#` as a comment character.
- `read_vc()` handles non-ASCII characters on Windows.
- Use a faster algorithm to detect duplicates (suggestion by @brodieG).
- Improve documentation.
- Fix typos in documentation, vignettes and README.
- Add an rOpenSci review badge to the README.
- The README mentions an upper bound on the size of dataframes.
- Set lifecycle to "maturing" and repo status to "active".
- The functions handle `root` containing regular expressions.
- Rework `vignette("workflow", package = "git2rdata")`.
- Update timings in `vignette("efficiency", package = "git2rdata")`.
- Minor tweaks in `vignette("plain_text", package = "git2rdata")`.
- Fix typos in documentation, vignettes and README.
- `meta()` appends the metadata as a list to the objects rather than in YAML format. `yaml::write_yaml()` writes the metadata list in YAML format.
- `write_vc()` now uses the 'strict' argument instead of 'override'.
- `rm_data()` removes the data files. Use `prune_meta()` to remove left-over metadata files (#9).
- Vignette on efficiency added (#2).
- Three separate vignettes instead of one large vignette.
    - Focus on the plain text format.
    - Focus on version control.
    - Focus on workflows.
- S3 methods replace the old S4 methods (#8).
- Optimized factors use stable indices. Adding or removing levels results in smaller diffs (#13).
- Use `relabel()` to alter factor levels without changing their index (#13).
- `write.table()` stores the raw data instead of `readr::write_tsv()` (#7). This avoids the `readr` dependency.
- `write_vc()` and `read_vc()` use the current working directory as default root (#6, @florisvdh).
- The user can specify a string to code missing values (default = `NA`). This allows the storage of the character string `"NA"`.
- `write_vc()` returns a list of issues which potentially result in large diffs.
- `list_data()` returns a vector with dataframes in the repository.
- `write_vc()` allows the use of a custom `NA` string.
- Each helpfile contains a working example (#11).
- README updated (#12).
- We removed `auto_commit()` because of limited extra functionality over `git2r::commit()`.
- Use `readr` to write and read plain text files.
- Allow storage of strings with "NA" or special characters.
- Handle ordered factors.
- Stop handling complex numbers.