Commit
* IRV now handles missing values, and psychsyn handles NA results better (#17)
* typos and renamed carelessDataset columns
  - typo in IRV / spacing in psychsyn
  - carelessDataset now has cogent column names
* changed vignettes
  removed introduction vignette from package (now hosted online) and fixed typo with linking vignette
* changed psychsyn to do a resampling
  psychsyn comes up with different correlation coefficients dependent on whether the item of an item pair is located at the x or y column.
* psychsyn now resamples
  psychsyn correlation values were dependent on the placement of items of item pairs in the x and y columns. psychsyn now does multiple random (re)placements of the items.
* irv now can handle missing values
* Updated psychsyn to do up to 10 resamples while encountering NAs
  In response to Issue #16
* updated IRV to make na.rm optional
* updating docs to match francisco's changes
* remove vignette
* adding suggestion for texlive-fonts-extra
* change to travis-ci config to attempt to account for latex issues
* trying a different before install command
* adding one more
* changing repo for texlive
* trying http mirror instead
* try again
* idfk
* still not working
* another try
* one more thing before bed
* macos why?!?
* again
* trying to work around.
* no cache
* last try
* travis is not allowed to macos anymore
* feature updates for psychsyn, mahad, irv (#18)
* typos and renamed carelessDataset columns
  - typo in IRV / spacing in psychsyn
  - carelessDataset now has cogent column names
* changed vignettes
  removed introduction vignette from package (now hosted online) and fixed typo with linking vignette
* changed psychsyn to do a resampling
  psychsyn comes up with different correlation coefficients dependent on whether the item of an item pair is located at the x or y column.
* psychsyn now resamples
  psychsyn correlation values were dependent on the placement of items of item pairs in the x and y columns. psychsyn now does multiple random (re)placements of the items.
* irv now can handle missing values
* Updated psychsyn to do up to 10 resamples while encountering NAs
  In response to Issue #16
* updated IRV to make na.rm optional
* updated mahad to handle NA properly
* Update mahad.R
  - fixed typo
  fixed typo
* Update .gitignore
* Updates to mahaD, psychsyn (#21)
* typos and renamed carelessDataset columns
  - typo in IRV / spacing in psychsyn
  - carelessDataset now has cogent column names
* changed vignettes
  removed introduction vignette from package (now hosted online) and fixed typo with linking vignette
* changed psychsyn to do a resampling
  psychsyn comes up with different correlation coefficients dependent on whether the item of an item pair is located at the x or y column.
* psychsyn now resamples
  psychsyn correlation values were dependent on the placement of items of item pairs in the x and y columns. psychsyn now does multiple random (re)placements of the items.
* irv now can handle missing values
* Updated psychsyn to do up to 10 resamples while encountering NAs
  In response to Issue #16
* updated IRV to make na.rm optional
* updated mahad to handle NA properly
* Update mahad.R
  - fixed typo
  fixed typo
* Update .gitignore
* updated psychsyn
  handling of NA improved. Added corresponding test.
  Co-authored-by: fwilhelm <[email protected]>
  Co-authored-by: Richard Yentes <[email protected]>
* changed psychsyn to default resample
* Changed evenodd so it's interpreted similarly to other metrics
  addresses #19
* fixes to psychsyn doc for resampling
* update travis.yml to skip oldrel
* Cleaning up dev
* Update .travis.yml
* removing travis ci
* Update DESCRIPTION
* updating docs
* changes to even-odd (#23)
* changes to even-odd
  1. Introduced error when calling even odd with just one factor.
  2. Introduced warning when variables indicated by factor argument do not match the input data (x).
  3. Use own warning for cases when NA arises because of 0 variance in even or odd vectors.
  4. refactored coding using apply.
* evenodd minor change
  added an explicit error when factors > x. Adjusted the call. parameter to be FALSE.
* added warning to even-odd that computation has changed (#24)

Co-authored-by: Francisco Wilhelm <[email protected]>
Co-authored-by: fwilhelm <[email protected]>
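The resampling entries in the commit message rest on one observation: the within-person correlation depends on which item of each pair ends up in the x column and which in the y column. A minimal base-R sketch of that effect, using made-up responses (the values and object names are illustrative assumptions, not taken from the package or its data):

```r
# Hypothetical responses of one person to four item pairs (one pair per row).
pairs <- cbind(x = c(1, 2, 4, 5),
               y = c(2, 3, 5, 3))

# Within-person correlation with the original placement.
r_original <- stats::cor(pairs[, "x"], pairs[, "y"])

# Swap the two items of the last pair only; the pairing itself is unchanged.
swapped <- pairs
swapped[4, ] <- rev(pairs[4, ])
r_swapped <- stats::cor(swapped[, "x"], swapped[, "y"])

# Same pairs, different placement, different coefficient.
c(original = r_original, swapped = r_swapped)
```

Because the placement is arbitrary, a single unlucky placement can also leave one column with zero variance, which is the NA case the resampling is meant to recover from.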
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,3 +1,8 @@ | ||
*.Rproj | ||
.Rproj.user | ||
.Rhistory | ||
temp.RData | ||
R/.DS_Store | ||
.RData | ||
vignettes/*.html | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,8 +1,8 @@ | ||
Package: careless | ||
Type: Package | ||
Title: Procedures for Computing Indices of Careless Responding | ||
- Version: 1.1.3 | ||
- Date: 2018-06-19 | ||
+ Version: 1.2.0 | ||
+ Date: 2020-07-25 | ||
Authors@R: c( | ||
person("Richard", "Yentes" , email = "[email protected]", role = c("cre", "aut")), | ||
person("Francisco", "Wilhelm", email = "[email protected]", role = c("aut"))) | ||
|
@@ -20,4 +20,4 @@ Suggests: | |
Encoding: UTF-8 | ||
LazyData: true | ||
VignetteBuilder: knitr | ||
- RoxygenNote: 6.0.1 | ||
+ RoxygenNote: 7.1.1 |
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -8,17 +8,18 @@ | |
#' Marjanovic et al. (2015) propose to mark persons with \emph{high} IRV scores - reflecting highly random responses (see References). | ||
#' | ||
#' @param x a matrix of data (e.g. survey responses) | ||
- #' @param split boolean indicating whether to additionally calculate the IRV on subsets of columns (of equal length). | ||
+ #' @param na.rm logical indicating whether to calculate the IRV for a person with missing values. | ||
+ #' @param split logical indicating whether to additionally calculate the IRV on subsets of columns (of equal length). | ||
#' @param num.split the number of subsets the data is to be split in. | ||
#' @author Francisco Wilhelm \email{[email protected]} | ||
#' @references | ||
#' Dunn, A. M., Heggestad, E. D., Shanock, L. R., & Theilgard, N. (2018). | ||
#' Intra-individual Response Variability as an Indicator of Insufficient Effort Responding: | ||
#' Comparison to Other Indicators and Relationships with Individual Differences. | ||
#' \emph{Journal of Business and Psychology, 33(1)}, 105-121. \doi{10.1007/s10869-016-9479-0} | ||
- #' | ||
- #' Marjanovic, Z., Holden, R., Struthers, W., Cribbie, R., & Greenglass, E. (2015). | ||
- #' The inter-item standard deviation (ISD): An index that discriminates between conscientious and random responders. | ||
+ #' | ||
+ #' Marjanovic, Z., Holden, R., Struthers, W., Cribbie, R., & Greenglass, E. (2015). | ||
+ #' The inter-item standard deviation (ISD): An index that discriminates between conscientious and random responders. | ||
#' \emph{Personality and Individual Differences}, 84, 79-83. \doi{10.1016/j.paid.2014.08.021} | ||
#' @export | ||
#' @examples | ||
|
@@ -29,20 +30,21 @@ | |
#' irv_split <- irv(careless_dataset, split = TRUE, num.split = 4) | ||
#' boxplot(irv_split$irv4) #produce a boxplot of the IRV for the fourth quarter | ||
|
||
- irv <- function(x, split = FALSE, num.split = 3) { | ||
- out <- apply(x, 1, stats::sd) | ||
+ irv <- function(x, na.rm = TRUE, split = FALSE, num.split = 3) { | ||
+ out <- apply(x, 1, stats::sd, na.rm = na.rm) | ||
|
||
if(split == TRUE) { | ||
chunk <- function(x,n) split(x, cut(seq_along(x), n, labels = FALSE)) | ||
split_x <- apply(x, 1, chunk, num.split) | ||
out_split <- t(replicate(nrow(x), rep(NA, num.split))) | ||
colnames(out_split) <- paste0("irv",1:num.split) | ||
for(k in 1:nrow(out_split)) { | ||
split_x_single <- split_x[[k]] | ||
- out_split[k,] <- unlist(lapply(split_x_single, stats::sd), use.names = FALSE) | ||
+ out_split[k,] <- unlist(lapply(split_x_single, stats::sd, na.rm = na.rm), use.names = FALSE) | ||
} | ||
out_split <- data.frame(out, out_split) | ||
colnames(out_split)[1] <- "irvTotal" | ||
- return(out_split)} else { | ||
+ return(out_split)} else { #split subsection end | ||
return(out) | ||
} | ||
} |
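The effect of the new `na.rm` argument can be checked without the package, since `irv()` simply forwards it to `stats::sd()` row by row. The sketch below uses a small made-up response matrix (an assumption for illustration; it is not `careless_dataset`) and copies the `chunk` helper from the diff to show the split branch:

```r
# Toy response matrix (hypothetical values): 3 respondents x 6 items.
# Respondent 2 skipped the third item.
responses <- matrix(c(1, 5, 2, 4, 3, 5,
                      3, 3, NA, 3, 3, 3,
                      1, 2, 3, 4, 5, 6),
                    nrow = 3, byrow = TRUE)

# Same row-wise call as in irv(): apply() forwards na.rm to stats::sd().
irv_keep_na <- apply(responses, 1, stats::sd, na.rm = FALSE)
irv_drop_na <- apply(responses, 1, stats::sd, na.rm = TRUE)

irv_keep_na  # respondent 2 is NA because sd() propagates the missing value
irv_drop_na  # respondent 2 is 0: the five observed answers are identical

# The split branch cuts each row into num.split consecutive chunks.
chunk <- function(x, n) split(x, cut(seq_along(x), n, labels = FALSE))
irv_chunks <- sapply(chunk(responses[3, ], 3), stats::sd, na.rm = TRUE)
irv_chunks  # one SD per third of respondent 3's answers
```

With `na.rm = TRUE` a score may rest on very few observed items, which is one reason to keep the argument optional.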
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -6,7 +6,7 @@ | |
#' identical responses is returned. Additionally, can return the average length of uninterrupted string of identical responses. | ||
#' | ||
#' @param x a matrix of data (e.g. item responses) | ||
- #' @param avg a boolean indicating whether to additionally return the average length of identical consecutive responses | ||
+ #' @param avg logical indicating whether to additionally return the average length of identical consecutive responses | ||
#' @author Richard Yentes \email{[email protected]}, Francisco Wilhelm \email{[email protected]} | ||
#' @references | ||
#' Johnson, J. A. (2005). Ascertaining the validity of individual protocols | ||
|
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -15,6 +15,7 @@ | |
#' @param anto determines whether psychometric antonyms are returned instead of | ||
#' psychometric synonyms. Defaults to \code{FALSE} | ||
#' @param diag additionally return the number of item pairs available for each observation. Useful if dataset contains many missing values. | ||
+ #' @param resample_na logical; if psychsyn returns NA for a respondent, resample to attempt getting a non-NA result. | ||
#' @author Richard Yentes \email{[email protected]}, Francisco Wilhelm \email{[email protected]} | ||
#' @references | ||
#' Meade, A. W., & Craig, S. B. (2012). Identifying careless responses in survey data. | ||
|
@@ -32,21 +33,22 @@ | |
#' synonyms <- psychsyn(careless_dataset, .60, diag = TRUE) | ||
#' antonyms <- psychant(careless_dataset2, .50, diag = TRUE) | ||
|
||
- psychsyn <- function(x, critval=.60, anto=FALSE, diag=FALSE) { | ||
+ psychsyn <- function(x, critval=.60, anto=FALSE, diag=FALSE, resample_na=TRUE) { | ||
x <- as.matrix(x) | ||
item_pairs <- get_item_pairs(x, critval, anto) | ||
- synonyms <- apply(x,1,syn_for_one, item_pairs) | ||
|
||
+ synonyms <- apply(x,1,syn_for_one, item_pairs, resample_na) | ||
synonyms_df <- as.data.frame(aperm(synonyms)) | ||
colnames(synonyms_df) <- c("numPairs", "cor") | ||
|
||
if(diag==TRUE) { return(synonyms_df) } | ||
else { return(synonyms_df$cor) } | ||
} | ||
|
||
# Helper function that identifies psychometric synonyms in a given dataset | ||
get_item_pairs <- function(x, critval=.60, anto=FALSE) { | ||
x <- as.matrix(x) | ||
critval <- abs(critval) #Dummy Proofing | ||
|
||
correlations <- stats::cor(x, use = "pairwise.complete.obs") | ||
correlations[upper.tri(correlations, diag=TRUE)] <- NA | ||
correlations <- as.data.frame(as.table(correlations)) | ||
|
@@ -71,15 +73,32 @@ get_item_pairs <- function(x, critval=.60, anto=FALSE) { | |
} | ||
|
||
# Helper function to calculate the within person correlation for a single individual | ||
- syn_for_one <- function(x, item_pairs) { | ||
+ syn_for_one <- function(x, item_pairs, resample_na) { | ||
item_pairs_omit_na <- which(!(is.na(x[item_pairs[,1]]) | is.na(x[item_pairs[,2]]))) | ||
sum_item_pairs <- length(item_pairs_omit_na) | ||
|
||
#only execute if more than two item pairs | ||
if(sum_item_pairs > 2) { | ||
- itemvalues <- cbind(as.numeric(x[as.numeric(item_pairs[,1])]), as.numeric(x[as.numeric(item_pairs[,2])])) | ||
- synvalue <- suppressWarnings(stats::cor(itemvalues, use = "pairwise.complete.obs", method = "pearson")[1,2]) | ||
+ itemvalues <- cbind(as.numeric(x[as.numeric(item_pairs[,1])]), as.numeric(x[as.numeric(item_pairs[,2])])) | ||
|
||
# helper that calculates within-person correlation | ||
psychsyn_cor <- function(x) { | ||
suppressWarnings(stats::cor(x, use = "pairwise.complete.obs", method = "pearson")[1,2]) | ||
} | ||
|
||
# if resample_na == TRUE, re-calculate psychsyn should a result return NA | ||
if(resample_na == TRUE) { | ||
counter <- 1 | ||
synvalue <- psychsyn_cor(itemvalues) | ||
while(counter <= 10 & is.na(synvalue)) { | ||
itemvalues <- t(apply(itemvalues, 1, sample, 2, replace = F)) | ||
synvalue <- psychsyn_cor(itemvalues) | ||
counter = counter+1 | ||
} | ||
} else { | ||
synvalue <- psychsyn_cor(itemvalues) # executes if resample_na == FALSE | ||
} | ||
|
||
- } else {synvalue = NA} | ||
+ } else {synvalue <- NA} # executes if insufficient item pairs | ||
|
||
return(c(sum_item_pairs, synvalue)) | ||
} | ||
} |
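The retry logic added to `syn_for_one` can be read as a bounded loop: compute the within-person correlation, and while it is NA, randomly re-place the two items of every pair and try again. The following is a standalone sketch of that idea, not the package code; `pair_cor`, `resample_cor` and the `max_tries` argument are hypothetical names introduced here (the diff hard-codes 10 attempts), and the respondent data are made up.

```r
# Within-person correlation across item pairs (column 1 vs. column 2).
pair_cor <- function(item_values) {
  suppressWarnings(
    stats::cor(item_values, use = "pairwise.complete.obs", method = "pearson")[1, 2]
  )
}

# Retry with randomly swapped placements, at most `max_tries` times.
resample_cor <- function(item_values, max_tries = 10) {
  value <- pair_cor(item_values)
  tries <- 0
  while (tries < max_tries && is.na(value)) {
    # sample() on a length-2 row returns its two values in random order.
    item_values <- t(apply(item_values, 1, sample, size = 2, replace = FALSE))
    value <- pair_cor(item_values)
    tries <- tries + 1
  }
  value
}

# Hypothetical respondent: the first column is constant, so cor() gives NA.
item_values <- cbind(c(3, 3, 3, 3), c(1, 2, 4, 5))
pair_cor(item_values)        # NA: zero variance in column 1

set.seed(1)
resample_cor(item_values)    # a correlation, once at least one pair is swapped
```

The sketch uses the scalar operator `&&` in the loop condition, which short-circuits; the committed code uses `&`, which gives the same result for length-one conditions.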
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,3 @@ | ||
careless_dataset_na <- careless_dataset | ||
careless_dataset_na[c(5:8),] <- NA | ||
data_careless_maha <- mahad(careless_dataset_na) |
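The fixture above blanks out rows 5 to 8 entirely and then calls `mahad()`. As an illustration of what "handling NA properly" can look like, the sketch below computes squared Mahalanobis distances on the complete rows only and returns NA for the rest. It relies on `stats::mahalanobis()` and a hypothetical helper `maha_with_na`; it is an assumption-laden stand-in and does not claim to mirror how `mahad()` is implemented.

```r
# Toy data (hypothetical): 20 respondents x 3 items, rows 5 to 8 set to NA.
set.seed(42)
toy <- matrix(sample(1:5, 60, replace = TRUE), nrow = 20)
toy[5:8, ] <- NA

# Hypothetical helper: squared Mahalanobis distance, NA for incomplete rows.
maha_with_na <- function(x) {
  x <- as.matrix(x)
  complete <- stats::complete.cases(x)
  out <- rep(NA_real_, nrow(x))
  x_complete <- x[complete, , drop = FALSE]
  centre <- colMeans(x_complete)
  covariance <- stats::cov(x_complete)
  out[complete] <- stats::mahalanobis(x_complete, centre, covariance)
  out
}

d2 <- maha_with_na(toy)
d2  # NA in positions 5 to 8, non-negative distances elsewhere
```

Keeping the output the same length as the input, with NA placeholders, lets the result be bound back onto the original data without re-indexing.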
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,13 @@ | ||
# test 1: calculate psych syn on a dataset with missings | ||
|
||
# first, create a dataset with missings | ||
dataset_na <- careless_dataset | ||
replacements <- 500 | ||
random_row <- sample(1:nrow(dataset_na), replacements, replace = TRUE) | ||
random_col <- sample(1:ncol(dataset_na), replacements, replace = TRUE) | ||
|
||
for(i in 1:replacements) { | ||
dataset_na[random_row[i], random_col[i]] <- NA | ||
} | ||
|
||
synonyms <- psychsyn(dataset_na, .60) |
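The test above draws the missing cells at random without fixing a seed, so the missingness pattern differs between runs. A seeded, vectorised variant is sketched below on a toy matrix standing in for `careless_dataset`; the seed, the dimensions and the number of replacements are arbitrary assumptions chosen for the example.

```r
set.seed(123)  # hypothetical seed so the missingness pattern is reproducible

# Toy stand-in for careless_dataset: 50 respondents x 10 items on a 1-5 scale.
dataset_na <- matrix(sample(1:5, 500, replace = TRUE), nrow = 50)

replacements <- 100
random_row <- sample(seq_len(nrow(dataset_na)), replacements, replace = TRUE)
random_col <- sample(seq_len(ncol(dataset_na)), replacements, replace = TRUE)

# A two-column (row, column) index matrix sets all sampled cells in one step.
dataset_na[cbind(random_row, random_col)] <- NA

n_na <- sum(is.na(dataset_na))
n_na  # at most `replacements`, fewer if a cell was drawn more than once
```

Because rows and columns are sampled with replacement, the same cell can be hit twice, so the number of NA cells is an upper bound rather than an exact count.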