Halide v15.0.0

@steven-johnson steven-johnson released this 06 Mar 23:38
· 5 commits to release/15.x since this release
d7651f4

What's Changed

General Notes

  • Support for RISC-V Vector architectures.

  • Python-related:

    • Halide builds for Python are now published to PyPI, so the Halide Python bindings can be installed with pip install halide.
    • Major improvements were made to the Python bindings, with many missing or incomplete sections of the API added or filled in.
    • We now support the use of Generators from Python (for both JIT and AOT usage).
    • The standard CMake rules now support generating a Python extension directly.
    • Support for Python was removed from Halide's Makefiles; you must use CMake to build the Python bindings.
  • Halide::Func now allows you to (optionally) constrain the type(s) of Exprs that the Func can contain, and/or the dimensionality of the Func.

  • Added a new way to use the JIT (compile_to_callable) that allows calling a jitted function with the same syntax as AOT-compiled functions. This gives more control over JIT lifespan and provides thread-safe arguments without requiring ParamMap.

  • General improvements to SIMD codegen

  • Several rarely-used parts of the C++ Generator API were deprecated, and the way that autoschedulers are specified for AOT compilation is now completely different (but better for future expandability).

  • CMake builds now require >= v3.22

  • WABT usage requires >= v1.0.30

  • LLVM 12 is no longer supported

  • The target flag disable_llvm_loop_opt is deprecated, as it is now the default behavior: LLVM's autovectorization and loop unrolling are turned off. This should not affect schedules with manually specified vectorization and unrolling, other than trimming code size a little. However, schedules that do not vectorize or unroll may slow down because they were (intentionally or not) relying on LLVM to do it automatically. If you see a performance regression with Halide 15, try turning on the enable_llvm_loop_opt target flag.

Notable bug fixes

  • Make Halide::round behave as documented (#7012)
  • Incorrect folding of saturating_sub (#6883)
  • The check for race conditions didn't consider where clauses (#6808)
  • Performance regression for x86 for certain LLVM versions (#6783)
  • Fusing a specialization drops compute_withs from generated code (#6770)
  • Incorrect output when realize condition depends on tuple call (#6915)
  • Python extensions should default to throwing exceptions rather than calling abort() for errors (#6986)
  • Python bindings didn't support bool buffers (#7006)
  • Python bindings didn't support float16 buffers (#7060)
  • Python extensions that executed on GPU didn't copy back to host properly (#6869)
  • Fix bugs in div_round_to_zero and fast_integer_divide_round_to_zero (#7008)
  • Bugs in add_requirement() (#7045)

Major changes

Minor changes

Changes to public API since last release

New Deprecations (Upcoming API changes)

Other Notes

  • Although there are commits relating to a Vulkan backend, this release of Halide doesn't provide Vulkan support (it's still a work in progress).
  • It's possible that the changes in #6754 can cause performance degradation (but usually only for poorly scheduled Halide code).

New Contributors