Fix a latex error (#31)

* Trying to fix a latex error
* Fix URLs
* Accepted by CRAN
ELToulemonde · Oct 25, 2017 · dbe52ce
1 parent a825001
Showing 18 changed files with 79 additions and 65 deletions.
diff --git a/DESCRIPTION b/DESCRIPTION
@@ -1,6 +1,6 @@
 Package: dataPreparation
 Title: Automated Data Preparation
-Version: 0.3
+Version: 0.3.2
 Authors@R: person("Emmanuel-Lin", "Toulemonde", email = "[email protected]", role = c("aut", "cre"))
 Description: Do most of the painful data preparation for a data science project with a minimum amount of code; Take advantages of data.table efficiency and use some algorithmic trick in order to perform data preparation in a time and RAM efficient way.
 Depends:

diff --git a/NEWS.md b/NEWS.md
@@ -1,3 +1,11 @@
+V 0.3.2
+========
+- Change URLs to meet CRAN requirement
+
+v 0.3.1
+=======
+- Fix bug in Latex documentation
+
 v 0.3
 =====
 - New features:

diff --git a/R/aggregate.R b/R/aggregate.R
@@ -8,15 +8,15 @@
 #' @param ... Optional argument: \code{functions}:  aggregation functions for numeric columns 
 #' (vector of function, optional, if not set we use: c(mean, min, max, sd))
 #' @details
-#' Perform aggregation depending on column type:\cr
+#' Perform aggregation depending on column type:
 #' \itemize{
-#' \item If column is numeric \code{functions} are performed on the column. So 1 numeric column 
-#' give length(functions) new columns,
-#' \item If column is character or factor and have less than \code{thresh} different values, 
-#' frequency count of values is performed,
-#' \item If column is character or factor with more than \code{thresh} different values, number 
-#' of different values for each \code{key} is performed,
-#' \item If column is logical, count of number and rate of positive is performed.
+#'   \item If column is numeric \code{functions} are performed on the column. So 1 numeric column 
+#'     give length(functions) new columns,
+#'   \item If column is character or factor and have less than \code{thresh} different values, 
+#'     frequency count of values is performed,
+#'   \item If column is character or factor with more than \code{thresh} different values, number 
+#'     of different values for each \code{key} is performed,
+#'   \item If column is logical, count of number and rate of positive is performed.
 #' }
 #' Be careful using functions argument, given functions should be an aggregation function, 
 #' meaning that for multiple values it should only return one value.
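The per-type rules documented in this hunk can be illustrated with a minimal base R sketch. It is not the package's implementation: the toy data frame, its column names, and the `functions` list are hypothetical, and the high-cardinality case (count of distinct values per key) is left out for brevity.

```r
# Hypothetical toy data; column names are illustrative only.
df <- data.frame(
  id     = c("a", "a", "b", "b", "b"),
  amount = c(1, 3, 2, 4, 6),
  colour = c("red", "blue", "red", "red", "blue"),
  paid   = c(TRUE, FALSE, TRUE, TRUE, FALSE),
  stringsAsFactors = FALSE
)

# Each function must reduce several values to a single one.
functions <- list(mean = mean, max = max)

# Numeric column: one new column per aggregation function.
num_agg <- sapply(functions, function(f) tapply(df$amount, df$id, f))

# Character column with few distinct values: frequency count per key.
chr_agg <- table(df$id, df$colour)

# Logical column: count and rate of TRUE per key.
lgl_count <- tapply(df$paid, df$id, sum)
lgl_rate  <- tapply(df$paid, df$id, mean)
```

Passing a function that returns several values (for example `range`) would break the one-row-per-key shape, which is why the documentation asks for true aggregation functions.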

diff --git a/R/dataSet.R b/R/dataSet.R
@@ -5,7 +5,7 @@
 ###################################################################################################
 #' Adult with some ugly columns added
 #' 
-#' For examples and tutorials, messy_adult has been built using UCI \code{adult}.\cr
+#' For examples and tutorials, messy_adult has been built using UCI \code{adult}.
 #' 
 #' We added 9 really ugly columns to the data set:
 #' 

diff --git a/R/discretization.R b/R/discretization.R
@@ -9,7 +9,7 @@
 #' @param verbose Should the algorithm talk? (Logical, default to TRUE)
 #' @return A list where each element name is a column name of data set and each element contains 
 #' bins to discretize this column.
-#' @details \cr
+#' @details
 #' Using equal freq first bin will start at -Inf and last bin will end at +Inf.
 #' @examples 
 #' # Load data
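As an illustration of the `@details` sentence in this hunk, here is a minimal base R sketch of equal-frequency bins whose outer edges are opened to -Inf and +Inf. It assumes quantile-based cut points and is not taken from the package's `build_bins` code; `x` and `n_bins` are made-up values.

```r
# Hypothetical input vector and bin count.
x <- c(1, 2, 3, 4, 5, 6, 7, 8, 9, 10)
n_bins <- 2

# Quantiles give the equal-frequency cut points, including min and max.
cuts <- quantile(x, probs = seq(0, 1, length.out = n_bins + 1))

# Drop the min and max and replace them with -Inf and +Inf, so values
# outside the observed range still fall into the first or last bin.
inner <- unname(cuts[-c(1, length(cuts))])
bins <- c(-Inf, inner, Inf)

binned <- cut(x, breaks = bins)
```

With open outer bins, a value seen only at prediction time (say `-3` or `42`) is still assigned to a bin instead of becoming `NA`.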

diff --git a/R/generateFromFactor.R b/R/generateFromFactor.R
@@ -1,6 +1,6 @@
 #' Recode factor
 #' 
-#' Recode factors into 3 new columns: \cr
+#' Recode factors into 3 new columns:
 #' \itemize{
 #' \item was the value not NA, "NA", "",
 #' \item how often this value occures,

diff --git a/R/prepareSet.R b/R/prepareSet.R
@@ -3,7 +3,7 @@
 ###################################################################################################
 #' Preparation pipeline
 #' 
-#' Full pipeline for preparing your dataSet set \cr
+#' Full pipeline for preparing your dataSet set.
 #' @param dataSet Matrix, data.frame or data.table
 #' @param finalForm "data.table" or "numerical_matrix" (default to data.table)
 #' @param verbose Should the algorithm talk? (logical, default to TRUE)
@@ -26,14 +26,16 @@
 #'      \code{\link{generateFactorFromDate}}) (character, default to "yearmonth")
 #' }
 #' @return A data.table or a numerical matrix (according to \code{finalForm}). \cr
-#' It will perform the following steps: \cr
-#' - Correct set: unfactor factor with many values, id dates and numeric that are hiden in character \cr
-#' - Transform set: compute differences between every date, transform dates into factors, generate 
-#' features from character..., if \code{key} is provided, will perform aggregate according to this \code{key} \cr
-#' - Filter set: filter constant, in double or bijection variables. If `digits` is provided, 
-#' will round numeric \cr
-#' - Handle NA: will perform \code{\link{fastHandleNa}}) \cr
-#' - Shape set: will put the result in asked shape (\code{finalForm}) with acceptable columns format.
+#' It will perform the following steps:
+#' \itemize{
+#'   \item Correct set: unfactor factor with many values, id dates and numeric that are hiden in character
+#'   \item Transform set: compute differences between every date, transform dates into factors, generate 
+#'      features from character..., if \code{key} is provided, will perform aggregate according to this \code{key}
+#'   \item Filter set: filter constant, in double or bijection variables. If `digits` is provided, 
+#'      will round numeric
+#'   \item Handle NA: will perform \code{\link{fastHandleNa}})
+#'   \item Shape set: will put the result in asked shape (\code{finalForm}) with acceptable columns format.
+#' }
 #' @examples 
 #' # Load ugly set
 #' \dontrun{

diff --git a/R/shapeSet.R b/R/shapeSet.R
@@ -1,14 +1,15 @@
 #' Final preparation before ML algorithm
 #'
-#' Prepare a data.table by: \cr
-#' - transforming numeric variables into factors whenever they take less than \code{thresh} unique 
-#' variables \cr
-#' - transforming characters using \code{\link{generateFromCharacter}} \cr
-#' - transforming logical into binary integers \cr
-#' - dropping constant columns \cr
-#' - Sending the data.table to \code{\link{setAsNumericMatrix}} (when \code{finalForm == "numerical_matrix"}) will then allow 
-#' you to get a numerical matrix usable by most Machine Learning Algorithms.
-#' 
+#' Prepare a data.table by: 
+#' \itemize{
+#'  \item transforming numeric variables into factors whenever they take less than \code{thresh} unique 
+#'    variables
+#'  \item transforming characters using \code{\link{generateFromCharacter}}
+#'  \item transforming logical into binary integers
+#'  \item dropping constant columns
+#'  \item Sending the data.table to \code{\link{setAsNumericMatrix}} (when \code{finalForm == "numerical_matrix"}) 
+#'    will then allow you to get a numerical matrix usable by most Machine Learning Algorithms.
+#' }
 #' @param dataSet Matrix, data.frame or data.table
 #' @param finalForm "data.table" or "numerical_matrix" (default to data.table)
 #' @param thresh  Threshold such that  a numerical column is transformed into

diff --git a/R/whichFunctions.R b/R/whichFunctions.R
@@ -84,7 +84,7 @@ whichAreConstant <- function(dataSet, keep_cols = NULL, verbose = TRUE){
 #' first 10 lines of both columns. If they are not equal then the columns aren't identical, else
 #' it compares lines 11 to 100; then 101 to 1000... So this function is fast with dataSet set 
 #' with a large number of lines and a lot of columns that aren't equals. \cr
-#' If \code{verbose} is TRUE, the column logged will be the one returned. \cr
+#' If \code{verbose} is TRUE, the column logged will be the one returned. 
 #' @examples
 #' # First let's build a matrix with 3 columns and a lot of lines, with 1's everywhere
 #' M <- matrix(1, nrow = 1e6, ncol = 3)
@@ -172,7 +172,7 @@ whichAreBijection <- function(dataSet, keep_cols = NULL, verbose = TRUE){
 
   ## Initialization
 
-  ## Computation # to-do dé-gorifier
+  ## Computation # to-do clean it
   bijection_cols <- bi_col_test(dataSet, keep_cols, verbose = verbose, 
                                  test_function = "fastIsBijection", function_name = function_name, test_log = " is a bijection of ")
 
@@ -245,7 +245,7 @@ whichAreIncluded <- function(dataSet, keep_cols = NULL, verbose = TRUE){
     pb <- initPB(function_name, names(dataSet))
   }
   nbr_various_val <- sapply(dataSet, uniqueN)
-  ## Computation # to-do dé-gorifier
+  ## Computation # to-do clean it
   while (length(I) > 0){
     i <- I[1]
 

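The documentation in the first hunk of this file describes comparing two columns in growing chunks (lines 1 to 10, then 11 to 100, then 101 to 1000, and so on). A minimal sketch of that idea in base R follows; `chunked_identical` is a hypothetical helper written for illustration, not the package's internal function.

```r
# Hypothetical helper: compare two vectors chunk by chunk, with chunk
# boundaries at 10, 100, 1000, ... and stop at the first mismatch.
chunked_identical <- function(x, y) {
  n <- length(x)
  if (length(y) != n) return(FALSE)
  start <- 1
  end <- min(10, n)
  while (start <= n) {
    if (!identical(x[start:end], y[start:end])) return(FALSE)
    start <- end + 1
    end <- min(end * 10, n)
  }
  TRUE
}

same      <- chunked_identical(1:1000, 1:1000)
early_out <- chunked_identical(c(1, 2, 3), c(1, 2, 4))
late_diff <- chunked_identical(1:1000, c(1:999, 0L))
```

Columns that differ usually differ within the first few rows, so most non-matching pairs are rejected after comparing only 10 values, while identical columns still receive a full comparison.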
diff --git a/man/aggregateByKey.Rd b/man/aggregateByKey.Rd
diff --git a/man/build_bins.Rd b/man/build_bins.Rd
diff --git a/man/generateFromFactor.Rd b/man/generateFromFactor.Rd
diff --git a/man/messy_adult.Rd b/man/messy_adult.Rd
diff --git a/man/prepareSet.Rd b/man/prepareSet.Rd
diff --git a/man/shapeSet.Rd b/man/shapeSet.Rd
diff --git a/man/whichAreInDouble.Rd b/man/whichAreInDouble.Rd
diff --git a/vignettes/dataPreparation.Rmd b/vignettes/dataPreparation.Rmd
@@ -359,7 +359,7 @@ description(agg_adult, level = 0)
 
 
 # Conclusion
-We presented some of the functions of *dataPreparation* package. There are a few more available, plus they have some parameters to make their use easier. So if you liked it, please go check the package documentation (by installing it or on [CRAN](https://cran.r-project.org/web/packages/dataPreparation/dataPreparation.pdf))
+We presented some of the functions of *dataPreparation* package. There are a few more available, plus they have some parameters to make their use easier. So if you liked it, please go check the package documentation (by installing it or on [CRAN](https://CRAN.R-project.org/package=dataPreparation/dataPreparation.pdf))
 
 
 We hope that this package is helpful, that it helped you prepare your data in a faster way.

diff --git a/vignettes/train_test_prep.Rmd b/vignettes/train_test_prep.Rmd
@@ -34,7 +34,7 @@ In this tutorial the following points are going to be viewed:
 - Applying the same preparation to a testing set,
 - Controling that train and test sets have the same shape.
 
-Using [dataPreparation](https://cran.r-project.org/web/packages/dataPreparation/index.html) package, those sets will be
+Using [dataPreparation](https://CRAN.R-project.org/package=dataPreparation/index.html) package, those sets will be
 
 - fast (since dataPreparation is based on data.table framework and uses some computational tricks)
 - easy (since those functions are packaged and handle most of the situations)
@@ -229,7 +229,7 @@ No warning have been raised it's all is ok.
 
 
 # Conclusion
-We presented some of the functions of *dataPreparation* package. There are a few more available, plus they have some parameters to make their use easier. So if you liked it, please go check the package documentation (by installing it or on [CRAN](https://cran.r-project.org/web/packages/dataPreparation/dataPreparation.pdf))
+We presented some of the functions of *dataPreparation* package. There are a few more available, plus they have some parameters to make their use easier. So if you liked it, please go check the package documentation (by installing it or on [CRAN]( https://CRAN.R-project.org/package=dataPreparation/dataPreparation.pdf))
 
 
 We hope that this package is helpful, that it helped you prepare your data in a faster way.