Releases: nils-braun/b2luigi
v0.6.0
Announcements
I am very happy that since the last release, b2luigi has gotten many new users and the number of contributors has also increased. See our new team list (updated with #65). If you contributed to the project and find yourself missing, feel free to contact me.
I (@meliache) was promoted by @nils-braun to project maintainer. I am still learning this role, so I'm sorry that I have been a bit slow with making new releases, which led many users to install the development branch instead. From now on I'll try to keep the releases more up-to-date. I recommend pressing the watch button on our GitHub page. Via its dropdown menu you can choose to be notified about new releases only, if you're not interested in getting email for issues or PRs.
My personal recommendation is that from now on you should be fine with the PyPI release, except maybe if you use the `gbasf2` wrapper. That wrapper suffers from the fact that `gbasf2` itself is far from stable and frequently gets new backwards-incompatible releases, so we often have to introduce hot-fixes. For those users I recommend keeping an eye on our gbasf2 issue tag.
General changes
Development model
- Use GitHub Actions / workflows for CI and PyPI deployment #78
- Code-coverage tests automated and enforced with `codecov` to encourage writing unit tests. This already resulted in some new unit tests for the `htcondor` batch :)
Features
- New optional `job_name` setting for assigning human-readable names to groups of jobs in LSF and HTCondor batches. This is useful when checking job statuses by hand. See the documentation for more. #76, #79
- #55: Option to pass only known command line arguments. This is useful in scripting if you want to pass additional command line arguments that should be forwarded to the script instead of being used by b2luigi.
- #70: Users can now add a `dry_run` method to their tasks, which will be called during a dry run, e.g. if the b2luigi steering file is executed with `python3 <b2luigi_file_name>.py --dry-run`.
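The two items above (#55 and #70) can be illustrated together with a minimal, standard-library-only sketch. It is not b2luigi's implementation: `ExampleTask`, `split_args` and `execute` are hypothetical names, and the task class is a plain stand-in rather than a real b2luigi task. The sketch only shows the general idea of consuming the known `--dry-run` flag, forwarding everything else, and calling a task's `dry_run` method when the flag is set.

```python
import argparse


class ExampleTask:
    """Hypothetical stand-in for a task class (not a real b2luigi task)."""

    def run(self):
        return "run: doing the real work"

    def dry_run(self):
        # Called instead of run() when --dry-run is given.
        return "dry_run: reporting what would be done"


def split_args(argv):
    """Separate the options this sketch knows from all other arguments."""
    parser = argparse.ArgumentParser(add_help=False)
    parser.add_argument("--dry-run", action="store_true")
    # parse_known_args() returns unrecognized arguments instead of failing.
    known, forwarded = parser.parse_known_args(argv)
    return known, forwarded


def execute(task, argv):
    """Call dry_run() or run() and return the arguments left for the script."""
    known, forwarded = split_args(argv)
    if known.dry_run:
        # A task without a dry_run method is simply skipped in this sketch.
        message = task.dry_run() if hasattr(task, "dry_run") else None
    else:
        message = task.run()
    return message, forwarded
```

Calling `execute(ExampleTask(), ["--dry-run", "--my-option", "42"])` invokes `dry_run` and hands `--my-option 42` back to the caller, so the surrounding script can interpret those arguments itself.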
Gbasf2 Batch
Fixes
- Adapt download of job outputs to the new gbasf2 v5 output directory structure by adding `/sub00` to LFNs #57. Caveats are:
  - In future releases gbasf2 will split the outputs of large projects into multiple `sub<xy>` directories, but this isn't done as of now. These other subdirectories are not supported yet, but I created issue #80 as a reminder.
  - The output of `mdst`/`udst` files is moved into subdirectories deeper in the hierarchy. We don't support that yet either. I have to think about whether I can figure out in a smart way what the output is or whether the user should provide some additional info. It would be best to mirror what gbasf2 does. See issue #58 for more, help is welcome.
Features
- More stable downloads with `gb2_ds_get`. When I started developing the gbasf2 wrapper, I expected failing downloads to be a rare exception, but I realized that they are the norm and adapted the code to handle them more gracefully.
  - #72: If one job download fails, this doesn't raise a full exception anymore, so all the other tasks continue to run and download their outputs. The only thing that happens is that this particular task is marked as `failed`.
  - #67: Downloaded datasets persist after failure. If a download fails, the partially downloaded dataset remains in a directory with the `.partial` ending next to the expected output directory. On the one hand, this ensures that b2luigi doesn't prematurely mark a task as completed before all job outputs in a gbasf2 project have been downloaded completely. The `.partial` directory is only renamed to the final output directory, which b2luigi uses as a completeness target, once all jobs have been downloaded. On the other hand, keeping the partial downloads means that the download doesn't have to start from scratch every time you re-run a failed task. So, if a gbasf2 task failed downloading, you can just re-run the task and it will download the missing outputs into your `.partial` directory.
- #62: Option to disable automatic log download from the grid via the `gbasf2_download_logs` setting. Logs are useful for debugging and reproducibility and I think they should always be stored in addition to the data itself. However, for gbasf2 it can take quite a while to download logs, so if you are in a hurry it can be useful to disable the download and just look the logs up online with the DIRAC web app if you need them.
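The `.partial` directory behaviour described for #67 follows a common "download to a temporary name, rename when complete" pattern. The sketch below shows that pattern with the standard library only. It is not the b2luigi code: `download_all` and its `fetch` argument are hypothetical, and `fetch` is assumed to either write a complete file or raise `OSError` without leaving a file behind.

```python
import os


def download_all(job_files, output_dir, fetch):
    """Download into '<output_dir>.partial' and rename once complete.

    job_files: names of the expected output files (hypothetical input).
    fetch: callable(name, destination) standing in for the real download
           command; assumed to raise OSError on failure.
    Returns True if output_dir is complete after this call.
    """
    if os.path.isdir(output_dir):
        return True  # a previous call already finished the download

    partial_dir = output_dir + ".partial"
    os.makedirs(partial_dir, exist_ok=True)

    failed = []
    for name in job_files:
        destination = os.path.join(partial_dir, name)
        if os.path.exists(destination):
            continue  # kept from an earlier attempt, no need to re-download
        try:
            fetch(name, destination)
        except OSError:
            failed.append(name)  # keep going so other files still download

    if failed:
        return False  # the .partial directory stays for the next attempt

    # Only a complete download becomes visible under the final name.
    os.rename(partial_dir, output_dir)
    return True
```

Because the final directory name only appears after the rename, a completeness check on `output_dir` cannot succeed for a half-finished download, and a second call only fetches the files that are still missing.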
v0.5.1
General
Features:
- New luigi release 3 as dependency. This drops Python 2 support in luigi, which we didn't have anyway in b2luigi, so there should be no backwards-incompatibility issues. On the plus side, this solves a dependency conflict with jupyter due to different required `tornado` versions.
Fixes:
- fix issue with `core.utils.get_filename()` in jupyter #34
- fix code in some basf2 examples to work with newer basf2 releases #37, #38, #37
Development model:
- default branch is now `main` #39
gbasf2 wrapper
Features:
Fixes:
- fix logic bug in setting `gbasf2_additional_params` #43
- modified time parsing that recognizes dirac proxy validity times > 24h #46
- workaround gbasf2 wildcard bug #41
- for dirac proxy handling, replace gbasf2 command string-parsing with direct communication with the DIRAC API via a sub-script #51. Intended as a feature, but I think this also fixed a bug with a newer gbasf2 release
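To illustrate why validity times above 24 hours need special handling (#46): Python's `datetime.strptime` with `%H` only accepts hours from 00 to 23, so a remaining time such as `47:59:30` has to be parsed as a duration instead. The sketch below assumes an `HH:MM:SS` string purely for illustration. The release notes do not state the format DIRAC actually prints, and `parse_time_left` is a hypothetical helper, not a b2luigi function.

```python
from datetime import timedelta


def parse_time_left(text):
    """Parse 'HH:MM:SS' into a timedelta, allowing HH to exceed 23.

    The input format is an assumption made for this sketch.
    """
    hours, minutes, seconds = (int(part) for part in text.strip().split(":"))
    if hours < 0 or not (0 <= minutes < 60 and 0 <= seconds < 60):
        raise ValueError(f"not a valid duration: {text!r}")
    return timedelta(hours=hours, minutes=minutes, seconds=seconds)
```

Returning a `timedelta` keeps comparisons simple, for example checking whether the remaining validity is still above some threshold before submitting jobs.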
v0.5.0
Features in this release:
- deprecate some settings (#22)
- corrected path to decfile for new structure in basf2 release-04 (#23)
- Adding option to provide a user-defined location of the task executable. This can be used analogously to the optional task attribute . (#25)
- Warning if forward slash in parameter (#27)
- change link to documentation from latest to stable (#29)
- additional requirements structure (#30)
- Soft wrapper for gbasf2 as a b2luigi BatchProcess (#32)
v0.4.4
Features in this release:
- small bugfixes with envs and basf2 tasks (@nils-braun)
v0.4.3
Features in this release:
- Added documentation
- Re-add an old feature for log files, will soon be deprecated.
v0.4.2
Features in this release:
- Fixed a problem with basf2 module importing (@nils-braun)
- Better handling for filesystems (#21) (@nils-braun)
  Started supporting file copy mechanisms in htcondor, only create folders when needed, better relative path handling.
v0.4.1
Features in this release:
- Added relevant authors in docu (Nils Braun)
- Fixed travis config (Nils Braun)
v0.4.0
Features in this release:
- Batch Improvements (#20) (@nils-braun):
  Generalize and simplify the batch setup and the dispatch method.
  Updated and added a lot of documentation.
  Please see the docu or the examples to check out the new ways
  to set up the batch environment.
- HTCondor support (#19) (@welschma):
  Added long-needed support for HTCondor batch systems.
  Building block for #20.
- Added Community Documents (@nils-braun)
- Fix serialized parameters for basf2 tasks (#18) (@elimik31):
  Fixed problems after refactoring in basf2 tasks
- Fix for get_basf2_git_hash to work with new basf2 tools (#17) (@elimik31):
  Check for the correct release name or head
v0.3.2
Features in this release:
- Fixed required versions of packages (#15) (@nils-braun)