Added Outfilling dialog #9381

Open
wants to merge 9 commits into
base: master
from

Conversation

Vitalis95
Contributor

Fixes #9363
@rdstern @lilyclements @jkmusyoka, please check the progress.

Omit Months is not working yet, and I have raised that in the issue.

@rdstern
Collaborator

rdstern commented Jan 23, 2025

@Vitalis95 many thanks. I show the current dialog here to help @jkmusyoka and @lilyclements to contribute to the review:

image

And very well done. This looks really nice and neat. I suggest it is important also for our audience. I note you already have the automatic filling of the controls, when the data are defined as climatic. I suggest having the dialog - rather than just the script - is important in our strategy of involving the ZMD staff centrally in this stage of the research on outfilling. I suggest many staff could contribute well via the dialog route and it would be much harder for them otherwise.

Also:

a) @jkmusyoka I assume you have a version of the data from Eastern Province that we used for the outfilling. Can you add a copy here, so we can try it with the dialog?
b) @lilyclements I assume we do want to use "our" version of the Omit Months, (see the Rainfall QC dialog)?
This feature is important to get the testing working ok for Zambia. I am happy if we decide instead to have a simpler control - used in the current function, which uses the month numbers. But I am nervous that we may want to use this facility with the shifted data - James what did we do in Zambia in December? We could alternatively tell users this currently is only for non-shifted data, so easy to give month numbers as Lily does currently?
c) Lily, what did Emily say about the Station to Exclude control. Before asking Vitalis to improve it, I wonder on the reply as to whether it is needed at all - given we can use filters in R-Instat. Can we omit that?
d) Vitalis, one omission you could sort out now is that we need a Store Result control at the bottom, the same as in the Calculator and other dialogs. Lily, you are going to make the corresponding change in the function? (It currently makes a new data frame, despite just producing a single column of the same length. It will be so much easier to use when it just adds to the existing data frame.) What should the default name be? Maybe outfilled!
e) While you are doing that, could you add a blank at the top of the Omit station control? Currently we have to omit a station.
Could you also add a numeric version of the Omit Months checkbox, on the left instead of "ours". If checked this opens a drop-down (same control as for your current bins) with 5,6,7,8,9 as the default and one can type numbers into it. Have 1, 2, 3, 11, 12 as a second and 1,2,12 as a third.
And in the drop down for bins, change 5 to 2 in the second option there.
f) Lily do we need a More button towards a sub-dialog that facilitates changing the other arguments, or can that wait?
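
A rough illustration of the numeric Omit Months option requested in e), written as a Python sketch rather than R-Instat or outfillingR code. The function name `omit_months`, the record layout and the sample values are all hypothetical; the sketch also assumes that "omitting" a month simply means dropping its rows before the calculation. Only the default month list 5, 6, 7, 8, 9 comes from the comment above.

```python
from datetime import date

def omit_months(records, months_to_omit=(5, 6, 7, 8, 9)):
    """Drop every (date, rainfall) record whose month number is in months_to_omit."""
    return [(d, rain) for d, rain in records if d.month not in months_to_omit]

# Made-up daily records: (date, rainfall in mm)
records = [
    (date(2020, 1, 15), 12.0),
    (date(2020, 6, 1), 0.0),
    (date(2020, 12, 3), 8.4),
]

kept = omit_months(records)                           # default: omit May to September
kept_alt = omit_months(records, (1, 2, 3, 11, 12))    # the suggested second option
```

Plain month numbers like these only make sense for non-shifted data, which is the caveat raised in b).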

@Vitalis95
Contributor Author

@rdstern, sorted out e). I will add the Store Result control once the function is changed.

@rdstern
Collaborator

rdstern commented Jan 23, 2025

zambia_data_for outfilling2.zip

@Vitalis95 @lilyclements and @jkmusyoka this is wonderful. The dialog is working already!

a) I attach the data file for Eastern Province (first data frame) together with the results from 3 runs of the dialog (Tamsat, chirps and era5) :

image

Amazing it is working immediately. So full marks to Lily, for the outfillingR package. I installed it directly from the menu. (I'm still not clear whether the importing dialog needs the James adjustment - I just use the dialog.)

Then it ran first time!

Lily I still would far prefer it to produce a new column in the same data frame, rather than a whole new data frame.
And I still think it is important to have an option of a specified random number seed. The whole thing is comparisons, and this includes comparisons of the different methods - which should use the same sequences.
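
To illustrate why a specified seed matters for comparisons, here is a minimal Python sketch. It is not the outfillingR implementation: `noisy_fill` is a hypothetical stand-in for any outfilling step that draws random numbers, and the input values are made up.

```python
import random

def noisy_fill(estimates, seed=None):
    """Hypothetical stochastic step: add Gaussian noise to each estimate."""
    rng = random.Random(seed)  # dedicated generator, so the seed controls only this run
    return [round(e + rng.gauss(0, 1), 3) for e in estimates]

estimates = [0.0, 2.5, 10.0]  # made-up values

run_a = noisy_fill(estimates, seed=123)
run_b = noisy_fill(estimates, seed=123)  # same seed: same random sequence
run_c = noisy_fill(estimates, seed=456)  # different seed: different sequence
```

With the same seed, `run_a` and `run_b` are identical, so any difference between two methods run under that seed comes from the methods themselves and not from the random draws.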

But should we merge this now, while we wait?

@Vitalis95
Contributor Author

@rdstern, have a look at it.

Collaborator

@rdstern rdstern left a comment

@Vitalis95 and @lilyclements that's wonderful - what I have tested. It seems to work.

Vitalis, the only trivial point is that the label Stations to Exclude is too long, and is therefore cut off. Could you please change it to Omit Stations?

@N-thony this is a major new facility. Can you please check so it can be merged?

@lilyclements I tried with the 3 methods and all seem to work on these data. (tamsat, chirps and era5.) I also tried era5 twice with the same initial random number seed and got the same answers. Brilliantly done therefore.

The only result I don't understand is that tamsat outfilling gave missing values when tamsat was missing. But the rainfall was not missing then. I thought that it would give the station rainfall except when that was missing. So now I'm not sure what the results are?
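
The behaviour expected in this comment can be sketched in Python as follows. This shows only the expectation described above (keep the station value, fall back to the estimate where the station value is missing); it is not a description of what outfillingR actually does, and the function name `outfill` and the data are hypothetical.

```python
def outfill(station, estimate):
    """Keep each observed station value; use the estimate only where the station value is missing (None)."""
    return [est if obs is None else obs for obs, est in zip(station, estimate)]

# Made-up values; None marks a missing observation or a missing satellite estimate
station = [1.2, None, 0.0, 4.5]
estimate = [1.0, 2.3, None, None]

filled = outfill(station, estimate)
```

Under this rule a missing estimate never removes an observed station value: the last two entries of `filled` stay at 0.0 and 4.5 even though the estimate is missing there.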

@lilyclements
Contributor

lilyclements commented Jan 24, 2025

@Vitalis95 nice. I ran this for my data with custom_bins=c(1, 3, 5, 10, 15, 20) and got an error in the R code. I did not get this error with custom_bins=c(1, 5, 10, 15, 20). Since this is a problem with the R code, I will document it in the outfillingR repo here.

I do not understand the system well enough to know a suitable fix. But essentially what is happening is an NA is generated for the SD of rainfall in one month due to small bin sizes, and so it throws an error. Something to discuss with Emily, but @rdstern if you could take a look at the issue in case it is something you can understand then that would be very useful!
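
One way an NA standard deviation can arise from small bins is sketched below in Python. This assumes the NA comes from a bin that ends up with fewer than two observations, which is a guess and has not been confirmed against the outfillingR code. The helper names and data are hypothetical, and `sd_or_nan` mimics R's `sd()`, which returns NA for a single value.

```python
import math
from bisect import bisect_right
from statistics import stdev

def sd_or_nan(values):
    """Sample SD, or NaN when there are fewer than two values (as R's sd() gives NA)."""
    return stdev(values) if len(values) >= 2 else math.nan

def sd_by_bin(satellite, station, bin_edges):
    """Group station rainfall by the bin of its satellite estimate, then take the SD per bin."""
    groups = {}
    for sat, obs in zip(satellite, station):
        groups.setdefault(bisect_right(bin_edges, sat), []).append(obs)
    return {b: sd_or_nan(v) for b, v in sorted(groups.items())}

# Made-up values for one month
satellite = [0.2, 0.5, 2.0, 4.0, 4.5, 12.0, 13.0]
station = [0.0, 0.0, 3.1, 5.2, 3.8, 10.4, 15.0]

coarse = sd_by_bin(satellite, station, [1, 5, 10, 15, 20])
fine = sd_by_bin(satellite, station, [1, 3, 5, 10, 15, 20])
```

With the coarser bins every occupied bin holds at least two values. Adding the extra edge at 3 leaves a single value in the 1 to 3 bin, so its SD is NaN.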

In terms of the dialog, this looks all good - except that one of the checkboxes is too small and cuts off the word "Exclude".

Roger Addition: Agreed with the last sentence - Could it be changed to the shorter Omit Stations

@rdstern
Collaborator

rdstern commented Jan 27, 2025

@lilyclements did you see Emily's answer to my query above? Essentially that it might not yet be running the second part of the function? If that's the case, we should also check whether the current results are useful as part of the checking process of how to adapt the different methods. We then leave the data for the main stations intact, which is what we want, when applying them to more stations.

@lilyclements
Contributor

@rdstern yes, it sounds like something we can go through next week when we meet? I'm not sure if I will be able to come in person due to my ankle, but I can certainly join on Teams for the afternoon.


Successfully merging this pull request may close these issues.

Outfilling Dialog
3 participants