Consider adding VariableType to metadata #127

bryanrcarlson · 2022-08-31T18:04:24Z

The Variable Catalog tends to fill up with descriptive variables (site location, treatment ID, block ID, latitude, longitude, etc.), whereas most researchers are likely looking for measurement variables (stand count, biomass, percent carbon, volumetric water content). We should consider providing a way to differentiate these variable types for better filtering.

This can easily get out of hand (see here and here).

Best option is to be consistent with other such filters (e.g. zone, processing, quality control) and allow variable types to be defined in the app-config.json file.

We will also need to provide default values. Maybe follow statistics and go with the upper level: Quantitative/Numeric, Qualitative/Categorical? Or go one level deeper and use: Discrete, Continuous, Nominal, Ordinal.

Keeping things simple (for speed-to-metadata): Numeric, Categorical.
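A minimal sketch of how configurable variable types with a simple fallback might look. This is an illustration only, not the project's implementation: the `variableTypes` key, the `zones` key in the usage line, and the `load_variable_types` helper are all assumed names, and Python is used purely for demonstration.

```python
import json

# Fallback list taken from the "keeping things simple" suggestion above.
DEFAULT_VARIABLE_TYPES = ["Numeric", "Categorical"]


def load_variable_types(config_text: str) -> list[str]:
    """Return the variable types defined in app-config.json text, or the defaults."""
    config = json.loads(config_text)
    # "variableTypes" is an assumed key name, not one confirmed by the project.
    types = config.get("variableTypes")
    if not types:
        # Key missing or empty: fall back to the simple two-value default.
        return list(DEFAULT_VARIABLE_TYPES)
    return [str(t) for t in types]


# A config that overrides the defaults with the four-level scheme.
custom = load_variable_types(
    '{"variableTypes": ["Discrete", "Continuous", "Nominal", "Ordinal"]}'
)
# A config that does not mention variable types at all ("zones" is hypothetical).
fallback = load_variable_types('{"zones": ["Field A"]}')
```

With this shape, a deployment that wants the finer-grained scheme only needs to list it in the config, while everyone else gets Numeric/Categorical without extra setup.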

bryanrcarlson · 2023-08-08T23:22:57Z

Take a cue from dimensional modeling. A "dimension" describes "who, what, where, when, why, and how". A "metric" is a quantitative measurement.

I think it's safe to say the context of the dataset can determine what is a dimension vs a metric -- so we have some leeway here. For example, measuring crop height, "height" will be a metric. But if we are describing the height of a sensor then "height" will be a dimension.
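To make the "context decides" idea concrete, here is a small sketch in which the type is stored per dataset rather than per variable name, so the same name can be a metric in one dataset and a dimension in another. The `Variable` class, the `filter_by_type` helper, and the dataset names are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Variable:
    dataset: str        # hypothetical dataset name
    name: str
    variable_type: str  # "Dimension" or "Metric", assigned in the dataset's context


catalog = [
    Variable("CropSurvey", "height", "Metric"),           # measured crop height
    Variable("SensorDeployment", "height", "Dimension"),  # describes sensor placement
    Variable("CropSurvey", "plotId", "Dimension"),
]


def filter_by_type(variables: list[Variable], variable_type: str) -> list[Variable]:
    """Keep only the variables tagged with the requested type."""
    return [v for v in variables if v.variable_type == variable_type]


metrics = filter_by_type(catalog, "Metric")
dimensions = filter_by_type(catalog, "Dimension")
```

Filtering on "Metric" here returns only the crop height entry, which is the kind of narrowing the Variable Catalog filter is meant to provide.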

Nominal and ordinal variables pose some confusion. There are arguments that ordinal vars can be treated as continuous vars (https://www.frontiersin.org/articles/10.3389/feduc.2020.589965/full), so maybe we can handwave that, although there are best practices stating that "metrics" should allow the calculation of means, min, max, etc.

I'm not sure about nominal though. Should we always treat those as dimensions? If so, something like a management dataset, which is mostly nominal values, will pose an issue. Would we want a "drill type" or "tractorId" grouped with "plotId", "NearestTown", and other descriptive variables? Maybe this is fine?

bryanrcarlson added this to the 0.4 milestone Aug 24, 2023

bryanrcarlson modified the milestones: 0.4, 0.3 Sep 1, 2023

bryanrcarlson linked a pull request Sep 1, 2023 that will close this issue

Add variable type #132

Merged

bryanrcarlson closed this as completed Sep 1, 2023
