sel_is command #9

Open
kbjarkefur opened this issue Jan 3, 2024 · 9 comments

@kbjarkefur
Collaborator

@arthur-shaw, when I looked at this with fresh eyes, I thought of what I think is a better way to implement the sel_is_numeric, sel_is_string, etc. commands. Creating one command for each sel_is_??? will be a pain to maintain across that many files. So how about we make the type a sub-command, like this: sel_is numeric, sel_is string, etc.? Then there is only one command, sel_is, and it accepts the sub-commands below:

  • sel_is numeric. All numeric questions.
  • sel_is string. All questions that contain text (i.e., Text and List types)
  • sel_is text. Text questions only.
  • sel_is list. List questions only.
  • sel_is multi_select. All multi-select questions.
  • sel_is multi_ordered. Multi-select questions where answer order is captured.
  • sel_is multi_yn. Multi-select with yes/no answers.
  • sel_is multi_checkbox. Multi-select with checkbox input.
  • sel_is gps. GPS questions.

I can create this quite quickly, but let me know what you think first.

@kbjarkefur kbjarkefur self-assigned this Jan 3, 2024
@arthur-shaw
Contributor

arthur-shaw commented Jan 4, 2024

@kbjarkefur, I like this idea. All of these commands would have shared code (e.g., capture variables in a macro, return the macro as r(varlist), message about number/list of variables, etc.). All of these commands would benefit from having shared documentation with details on each sub-command. And all of these commands would probably have a common API (e.g., limit scope of selection by variable name glob, etc.).

The only change I might propose is the name. For the main command, consider sel_vars. For the sub-command, consider is_{suso_type}. In pseudo English, this would read: select variables that are {suso_type}.

For the API, here's a link to one part of our past discussions.

For the implementation of these question type selectors, here's a synthesis of what I've done in cleanstart:

  • is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))
  • is_numeric. Numeric question: type == "NumericQuestion"
  • has_decimals. Is not an integer: is_integer == 0
  • is_text. Is a text question: type == "TextQuestion" & mi(mask)
  • follows_pattern. Text question that follows a pattern: type_var == "TextQuestion" & !mi(mask)
  • is_list. List question: type == "TextListQuestion"
  • is_multi_select. Is a multi-select question: type == "MultyOptionsQuestion"
  • is_multi_ordered. Multi-select with answer order recorded: type == "MultyOptionsQuestion" & are_answers_ordered == 1. (NOTE: in the data shared, rename are_answered_ordered to are_answers_ordered. Will be fixing this upstream momentarily.)
  • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_date. type == "DateTimeQuestion" & is_timestamp == 0
  • is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0
  • is_gps. GPS question: type == "GpsCoordinateQuestion"
  • is_variable. Computed SuSo variable: type == "Variable"
  • is_picture. Picture capture question: type == "MultimediaQuestion"
  • is_barcode. Need to look this up. type == "QRBarcodeQuestion"

My only concern is that this would require a fair number of metadata characteristics. Some could be combined (e.g., linked_to_roster_id and linked_to_question_id_var could be combined into an is_linked indicator). Most need to stay as is.

Here's my quick compilation:

  • General: type
  • Linked or not: linked_to_roster_id, linked_to_question_id
  • Numeric: is_integer
  • Text: mask
  • Multi-select: are_answers_ordered, yes_no_view
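
To make the mapping from selectors to metadata characteristics concrete, here is a minimal sketch in Python rather than Stata. It assumes one record per variable holding the characteristics compiled above, with None standing in for a Stata missing value. The mi helper, the SELECTORS table, the select function, and the example variable names are all illustrative and come from neither cleanstart nor this package. Only selectors whose conditions are unambiguous in the list are included.

```python
from typing import Callable, Dict, List

Meta = Dict[str, object]  # hypothetical record: characteristic name -> value


def mi(value: object) -> bool:
    """Stand-in for Stata's mi(): None or an empty string counts as missing."""
    return value is None or value == ""


# Each selector is a predicate over one variable's metadata record.
SELECTORS: Dict[str, Callable[[Meta], bool]] = {
    "is_single_select": lambda m: (
        m.get("type") == "SingleQuestion"
        and mi(m.get("linked_to_roster_id"))
        and mi(m.get("linked_to_question_id"))
    ),
    "is_numeric": lambda m: m.get("type") == "NumericQuestion",
    "has_decimals": lambda m: m.get("is_integer") == 0,
    "is_text": lambda m: m.get("type") == "TextQuestion" and mi(m.get("mask")),
    "follows_pattern": lambda m: (
        m.get("type") == "TextQuestion" and not mi(m.get("mask"))
    ),
    "is_list": lambda m: m.get("type") == "TextListQuestion",
    "is_multi_select": lambda m: m.get("type") == "MultyOptionsQuestion",
    "is_multi_ordered": lambda m: (
        m.get("type") == "MultyOptionsQuestion"
        and m.get("are_answers_ordered") == 1
    ),
    "is_gps": lambda m: m.get("type") == "GpsCoordinateQuestion",
}


def select(metadata: Dict[str, Meta], selector: str) -> List[str]:
    """Return the names of the variables whose metadata satisfies the selector."""
    test = SELECTORS[selector]
    return [name for name, m in metadata.items() if test(m)]


# Made-up example metadata, keyed by variable name.
metadata: Dict[str, Meta] = {
    "hh_size": {"type": "NumericQuestion", "is_integer": 1},
    "plot_area": {"type": "NumericQuestion", "is_integer": 0},
    "phone": {"type": "TextQuestion", "mask": "###-####"},
    "head_name": {"type": "TextQuestion", "mask": None},
    "member": {"type": "SingleQuestion", "linked_to_roster_id": "members"},
    "sex": {"type": "SingleQuestion"},
}

print(select(metadata, "is_numeric"))
print(select(metadata, "is_single_select"))
```

In this sketch, only type is needed by every selector; the other characteristics matter for a single family of question types each, which matches the grouping in the compilation.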

@kbjarkefur
Collaborator Author

Agreed. We have a command called sel_vars already, but I think we should change that to accommodate this.

Currently, sel_vars has the syntax sel_vars "query_string", type("NumericQuestion"). I would still like to keep the query_string, as it allows users to start using these commands on custom chars they create themselves. But let's move the query to an option and make the sub-commands you suggest the main parameter.

So it would be sel_vars is_numeric , query("query_string") where query is of course optional.
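
A rough sketch of that call shape, written in Python rather than Stata: the sub-command is the required main argument and the query is an optional extra filter. The query-string syntax is not spelled out in this thread, so the query is modeled here as a plain predicate over a variable's characteristics; that choice, the SUBCOMMANDS table, and the example data are assumptions for illustration only.

```python
from typing import Callable, Dict, List, Optional

Meta = Dict[str, object]  # hypothetical record: characteristic name -> value

# Only two sub-commands are sketched here.
SUBCOMMANDS: Dict[str, Callable[[Meta], bool]] = {
    "is_numeric": lambda m: m.get("type") == "NumericQuestion",
    "is_gps": lambda m: m.get("type") == "GpsCoordinateQuestion",
}


def sel_vars(
    metadata: Dict[str, Meta],
    subcommand: str,
    query: Optional[Callable[[Meta], bool]] = None,
) -> List[str]:
    """Select variables by sub-command, optionally narrowed by a query."""
    if subcommand not in SUBCOMMANDS:
        raise ValueError(f"unknown sub-command: {subcommand}")
    matches = SUBCOMMANDS[subcommand]
    return [
        name
        for name, m in metadata.items()
        if matches(m) and (query is None or query(m))
    ]


# Made-up metadata; "module" plays the role of a user-created custom char.
metadata: Dict[str, Meta] = {
    "hh_size": {"type": "NumericQuestion", "module": "roster"},
    "food_exp": {"type": "NumericQuestion", "module": "consumption"},
    "location": {"type": "GpsCoordinateQuestion", "module": "cover"},
}

print(sel_vars(metadata, "is_numeric"))
print(sel_vars(metadata, "is_numeric", query=lambda m: m.get("module") == "consumption"))
```

Keeping the query optional means the common case stays a single sub-command, while custom characteristics remain reachable without adding a new sub-command for each.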

I will ask about the metadata types we have not discussed yet during our call today.

@kbjarkefur
Collaborator Author

Add a varlist(varlist) option to make it possible to pipe the result of one command into the next when chaining calls of commands in this package.
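
As a sketch of what chaining through a varlist could look like (Python rather than Stata, with a hypothetical sel_vars function and made-up metadata), the second call below only considers the variables returned by the first:

```python
from typing import Callable, Dict, Iterable, List, Optional

Meta = Dict[str, object]  # hypothetical record: characteristic name -> value

SUBCOMMANDS: Dict[str, Callable[[Meta], bool]] = {
    "is_numeric": lambda m: m.get("type") == "NumericQuestion",
    "has_decimals": lambda m: m.get("is_integer") == 0,
}


def sel_vars(
    metadata: Dict[str, Meta],
    subcommand: str,
    varlist: Optional[Iterable[str]] = None,
) -> List[str]:
    """Select among varlist if given, otherwise among all variables."""
    candidates = list(metadata) if varlist is None else list(varlist)
    matches = SUBCOMMANDS[subcommand]
    # metadata[name] raises KeyError if varlist names an unknown variable.
    return [name for name in candidates if matches(metadata[name])]


metadata: Dict[str, Meta] = {
    "hh_size": {"type": "NumericQuestion", "is_integer": 1},
    "plot_area": {"type": "NumericQuestion", "is_integer": 0},
    "head_name": {"type": "TextQuestion"},
}

numeric = sel_vars(metadata, "is_numeric")
decimals = sel_vars(metadata, "has_decimals", varlist=numeric)
print(numeric, decimals)
```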

@kbjarkefur
Collaborator Author

is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))

I do not find the variable linked_to_question_id in the metadata .dta file. (I do find linked_to_roster_id.)

@kbjarkefur
Collaborator Author

kbjarkefur commented Jan 5, 2024

  • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1

These seem to have a copy-paste issue, as the conditions are duplicates.

@kbjarkefur
Collaborator Author

kbjarkefur commented Jan 5, 2024

  • is_date. type == "DateTimeQuestion" & is_timestamp == 0
  • is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0

I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1

EDIT: also, the values of the variable is_timestamp are not 0/1 in the metadata; they are TRUE/FALSE. I would prefer to change this to 0/1 in the metadata, as that is Stata practice.

@arthur-shaw
Contributor

arthur-shaw commented Jan 5, 2024

is_single_select. Single-select question whose answer options aren't linked to a roster ID or to a list question: type == "SingleQuestion" & (mi(linked_to_roster_id) & mi(linked_to_question_id))

I do not find the variable linked_to_question_id in the metadata .dta file. (I do find linked_to_roster_id.)

You're right. I updated the data files on OneDrive to include this variable. This omission is due to an error in the susometa package. This exercise is helping me improve that package. Sorry that you're suffering through that process. Thanks for flagging issues.

  • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 1

These seem to have a copy-paste issue, as the conditions are duplicates.

Correct. It should have been:

  • is_multi_yn. Multi-select yes/no: type == "MultyOptionsQuestion" & yes_no_view == 1
  • is_multi_checkbox. Multi-select checkbox: type == "MultyOptionsQuestion" & yes_no_view == 0

  • is_date. type == "DateTimeQuestion" & is_timestamp == 0
  • is_timestamp. type == "DateTimeQuestion" & is_timestamp == 0

I assume is_timestamp should be type == "DateTimeQuestion" & is_timestamp == 1

EDIT: also, the values of the variable is_timestamp are not 0/1 in the metadata; they are TRUE/FALSE. I would prefer to change this to 0/1 in the metadata, as that is Stata practice.

Your assumption is right. Sorry for the copy-paste problem.
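
For reference, the four conditions as agreed in this exchange (checkbox is yes_no_view == 0, timestamp is is_timestamp == 1) can be sketched as predicates. This is Python rather than Stata, the function names simply mirror the selector names, and the dict-based metadata record is an assumption made for illustration.

```python
from typing import Dict

Meta = Dict[str, object]  # hypothetical record: characteristic name -> value


def is_multi_yn(m: Meta) -> bool:
    """Multi-select shown as yes/no."""
    return m.get("type") == "MultyOptionsQuestion" and m.get("yes_no_view") == 1


def is_multi_checkbox(m: Meta) -> bool:
    """Multi-select shown as checkboxes."""
    return m.get("type") == "MultyOptionsQuestion" and m.get("yes_no_view") == 0


def is_date(m: Meta) -> bool:
    """Date question that is not a timestamp."""
    return m.get("type") == "DateTimeQuestion" and m.get("is_timestamp") == 0


def is_timestamp(m: Meta) -> bool:
    """Date question recorded as a timestamp."""
    return m.get("type") == "DateTimeQuestion" and m.get("is_timestamp") == 1


yn = {"type": "MultyOptionsQuestion", "yes_no_view": 1}
checkbox = {"type": "MultyOptionsQuestion", "yes_no_view": 0}
date = {"type": "DateTimeQuestion", "is_timestamp": 0}
stamp = {"type": "DateTimeQuestion", "is_timestamp": 1}

print(is_multi_yn(yn), is_multi_checkbox(checkbox), is_date(date), is_timestamp(stamp))
```

With the corrected conditions, each pair is mutually exclusive: no multi-select record satisfies both is_multi_yn and is_multi_checkbox, and no date record satisfies both is_date and is_timestamp.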

As for the values in the metadata, I've updated the data on OneDrive to have 0/1 values. While I expected TRUE/FALSE to get automatically converted to 1/0 values, I've now added some code to convert them explicitly. For the moment, I'm simply targeting variables matching is_*. If there are others, please let me know.
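
A minimal sketch of that kind of explicit conversion, in Python and not the actual susometa code: values are recoded only in columns whose names match the is_* glob. The row-of-dicts layout, the to_01 and convert_flags names, and the example values are assumptions for illustration.

```python
from fnmatch import fnmatchcase
from typing import Dict, List

Row = Dict[str, object]  # hypothetical metadata row: column name -> value


def to_01(value: object) -> object:
    """Map TRUE/FALSE (boolean or string) to 1/0; leave other values alone."""
    if value is True or value == "TRUE":
        return 1
    if value is False or value == "FALSE":
        return 0
    return value


def convert_flags(rows: List[Row], pattern: str = "is_*") -> List[Row]:
    """Recode only the columns whose names match the glob pattern."""
    return [
        {k: to_01(v) if fnmatchcase(k, pattern) else v for k, v in row.items()}
        for row in rows
    ]


rows: List[Row] = [
    {"type": "DateTimeQuestion", "is_timestamp": "TRUE", "yes_no_view": "FALSE"},
    {"type": "NumericQuestion", "is_integer": False, "yes_no_view": None},
]

print(convert_flags(rows))
```

Note that in this sketch a name such as yes_no_view does not match is_* and is left untouched, so any such column would need to be added to the pattern separately.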

@kbjarkefur
Collaborator Author

The fixes in your last comment were implemented in 62e50c4.

@kbjarkefur
Collaborator Author

Adding a reminder here for you to update the help file of sel_vars following the updates in #10.

@kbjarkefur kbjarkefur added the "solved but not yet published ✅" label (This issue is resolved, but not published on SSC yet, nor merged to main) Feb 7, 2024
@arthur-shaw arthur-shaw added this to the selector-v1 milestone Feb 7, 2024