Commit

Add type to one_hot_encoding + cleanup(#45)
* Add type to one_hot_encoding
* Delete fast_dates.R
* Delete test_fast_dates.R
ELToulemonde authored Sep 10, 2018
1 parent d48e7d9 commit 5a5a195
Showing 7 changed files with 28 additions and 86 deletions.
1 change: 1 addition & 0 deletions NEWS
@@ -4,6 +4,7 @@ V 0.3.8
- New features in existing functions:
- Identification of bijection through internal function *fastIsBijection* is way faster (up to 40 times faster in case of bijection). So *whichArebijection* and *fastFiltervariables* are also improved.
- Remove remaining *gc* to save time.
- In *one_hot_encoder*, added parameter *type* to choose between integer, numeric, or logical results.


V 0.3.7
1 change: 1 addition & 0 deletions NEWS.md
@@ -4,6 +4,7 @@ V 0.3.8
- New features in existing functions:
- Identification of bijection through internal function *fastIsBijection* is way faster (up to 40 times faster in case of bijection). So *whichArebijection* and *fastFiltervariables* are also improved.
- Remove remaining *gc* to save time.
- In *one_hot_encoder*, added parameter *type* to choose between integer, numeric, or logical results.

V 0.3.7
64 changes: 0 additions & 64 deletions R/fast_dates.R

This file was deleted.

15 changes: 12 additions & 3 deletions R/generateFromFactor.R
@@ -81,6 +81,8 @@ generateFromFactor <- function(dataSet, cols, verbose = TRUE, drop = FALSE, ...)
#' @param encoding Result of function \code{\link{build_encoding}}, (list, default to NULL). \cr
#' To perform the same encoding on train and test, it is recommended to compute \code{\link{build_encoding}}
#' before. If it is kept to NULL, build_encoding will be called.
#' @param type What class of columns is expected? "integer" (0L/1L), "numeric" (0/1), or "logical" (TRUE/FALSE),
#' (character, default to "integer")
#' @param drop Should \code{cols} be dropped after generation (logical, default to FALSE)
#' @param verbose Should the function log (logical, default to TRUE)
#' @return \code{dataSet} edited by \strong{reference} with new columns.
@@ -99,16 +101,23 @@ generateFromFactor <- function(dataSet, cols, verbose = TRUE, drop = FALSE, ...)
#' # Apply same encoding to adult
#' data(adult)
#' adult <- one_hot_encoder(adult, encoding = encoding, drop = TRUE)
#'
#' # To have encoding as logical (TRUE/FALSE), pass it in type argument
#' data(adult)
#' adult <- one_hot_encoder(adult, encoding = encoding, type = "logical", drop = TRUE)
#' @export
#' @import data.table
-one_hot_encoder <- function(dataSet, encoding = NULL, verbose = TRUE, drop = FALSE){
+one_hot_encoder <- function(dataSet, encoding = NULL, type = "integer", verbose = TRUE, drop = FALSE){
## Working environment
function_name <- "one_hot_encoder"

## Sanity check
dataSet <- checkAndReturnDataTable(dataSet)
is.verbose(verbose)

if (! type %in% c("integer", "logical", "numeric")){
stop(paste0(function_name, ": type should either be 'integer', 'numeric' or 'logical'."))
}
as_type <- get(paste0("as.", type)) # Build as_type function to transform result type
## Initialization
# Transform char into factor
if (is.null(encoding)){
@@ -134,7 +143,7 @@ one_hot_encoder <- function(dataSet, encoding = NULL, verbose = TRUE, drop = FAL
new_cols <- encoding[[col]]$new_cols
# Set the write value
for (i in 1:length(new_cols)){
-set(dataSet, NULL, new_cols[i], as.integer(dataSet[[col]] == encoding[[col]]$values[i]))
+set(dataSet, NULL, new_cols[i], as_type(dataSet[[col]] == encoding[[col]]$values[i]))
}

# drop col if asked
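
The change to R/generateFromFactor.R swaps the hard-coded `as.integer` for a conversion function looked up from the `type` string. A minimal base-R sketch of that idea follows, assuming a hypothetical stand-alone helper `encode_level` (not part of the package) and using `match.fun` in place of the commit's `get`, since it restricts the lookup to functions:

```r
# Hypothetical helper for illustration only; the package itself does this
# inside one_hot_encoder() on data.table columns.
encode_level <- function(x, level, type = "integer") {
  # Reject anything that is not one of the three supported result classes
  if (!type %in% c("integer", "logical", "numeric")) {
    stop("encode_level: type should either be 'integer', 'numeric' or 'logical'.")
  }
  # Resolve "integer" -> as.integer, "logical" -> as.logical, "numeric" -> as.numeric
  as_type <- match.fun(paste0("as.", type))
  # Compare every element to the level, then cast the logical result
  as_type(x == level)
}

x <- factor(c("a", "b", "a"))
as_int <- encode_level(x, "a")             # integer indicator column
as_lgl <- encode_level(x, "a", "logical")  # logical indicator column
as_num <- encode_level(x, "a", "numeric")  # double indicator column
```

Looking the converter up once, before the loop over columns, keeps the per-column work identical to the previous `as.integer` version.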
11 changes: 9 additions & 2 deletions man/one_hot_encoder.Rd


17 changes: 0 additions & 17 deletions tests/testthat/test_fast_dates.R

This file was deleted.

5 changes: 5 additions & 0 deletions tests/testthat/test_generateFromFactor.R
@@ -37,6 +37,11 @@ test_that("one_hot_encoder: ",
})


test_that("one_hot_encoder: expect error: ",
{
expect_error(one_hot_encoder(adult, type = "character"), ": type should either be ")
})


## build_encoding
# ----------------
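
The new test passes only a fragment of the message to `expect_error`, which treats that fragment as a regular expression. The sketch below reproduces a comparable check in base R, assuming a hypothetical validator `check_type` that mirrors the sanity check added in this commit; it is an illustration, not the package's test code.

```r
# Hypothetical validator mirroring the check added to one_hot_encoder()
check_type <- function(type, function_name = "one_hot_encoder") {
  if (!type %in% c("integer", "logical", "numeric")) {
    stop(paste0(function_name, ": type should either be 'integer', 'numeric' or 'logical'."))
  }
  invisible(TRUE)
}

# Capture the error condition instead of letting it abort the script
err <- tryCatch(check_type("character"), error = function(e) e)

# Match the message fragment against the captured condition message
matches <- grepl(": type should either be ", conditionMessage(err))
```

Matching only the stable part of the message keeps the test valid if the list of accepted types is reworded later.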
