
v0.9.0

@dfalbel dfalbel released this 24 Oct 18:23

Breaking changes

  • Updated to LibTorch v1.12.1. (#889, #893, #899)
  • torch_bincount is now 1-based indexed. (#896)
  • torch_movedim() and $movedim() are now both 1-based indexed. (#905)
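
For example, dimension arguments to torch_movedim() are now 1-based, matching R's indexing conventions. A minimal sketch (the shapes shown are illustrative):

```r
library(torch)

x <- torch_randn(2, 3, 4)

# Dimensions are 1-based: move the first dimension to the last position.
y <- torch_movedim(x, source = 1, destination = 3)
y$shape  # 3 4 2
```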

New features

  • Added cuda_synchronize() to allow synchronization of CUDA operations. (#887)
  • Added support for M1 Macs, including creating Tensors in the MPS device. (#890)
  • Added support for CUDA 11.6 on Linux. (#902)
  • Added cuda_empty_cache() to allow freeing memory from the caching allocator to the system. (#903)
  • Added the $is_sparse() method to check whether a Tensor is sparse. (#903)
  • dataset_subset now adds a class to the modified dataset: the original dataset's classes suffixed with _subset. (#904)
  • Added torch_serialize() to allow creating a raw vector from torch objects. (#908)
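
For instance, torch_serialize() pairs with the existing torch_load() to round-trip a torch object through a raw vector. A minimal sketch:

```r
library(torch)

# Serialize a tensor to a raw vector (e.g. to store in a database or
# send over a connection), then restore it with torch_load().
x <- torch_randn(10)
raw_vec <- torch_serialize(x)
y <- torch_load(raw_vec)
torch_equal(x, y)  # TRUE
```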

Bug fixes

  • Fixed a bug in torch_arange that caused the end value to be excluded from the result. (#885, @skeydan)
  • Fixed bug in window functions by setting a default dtype. (#886, @skeydan)
  • Fixed bug when using install_torch(reinstall = TRUE). (#883)
  • The dims argument in torch_tile() is no longer modified, as it's not meant to be a 1-based dimension. (#905)
  • nn_module$state_dict() now detaches output tensors by default. (#916)
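
The torch_arange fix can be checked directly; with the fix the end value is part of the sequence:

```r
library(torch)

# The end value is now included in the result.
as_array(torch_arange(start = 1, end = 5))
# 1 2 3 4 5
```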

Internal

  • Re-implemented the $ method for R7 classes in C/C++ to improve speed when calling methods. (#873)
  • Re-implemented the garbage collection logic invoked from inside a backward() call. This improves speed because we no longer need to
    trigger GC every time backward is called. (#873)
  • We now use a thread pool instead of launching a new thread for backward calls. (#883)
  • Implemented options to allow configuring the activation of garbage collection when allocating more CUDA memory. (#883)
  • Some nnf_ functions have been updated to use a single torch_ kernel instead of the custom implementation. (#896)
  • Improved performance of dataloaders. (#900)
  • We now let LibTorch query the default generator, which allows using torch_bernoulli() with device = "gpu". (#906)