Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix overflow in softmax #82

Closed
wants to merge 1 commit into from
Closed
Changes from all commits
Commits
File filter

Filter by extension

Filter by extension

Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
6 changes: 5 additions & 1 deletion lleaves/compiler/codegen/codegen.py
Original file line number Diff line number Diff line change
Expand Up @@ -317,6 +317,7 @@ def _populate_objective_func_block(
llvm_copysign = builder.module.declare_intrinsic(
"llvm.copysign", (DTYPE, DTYPE), ir.FunctionType(DTYPE, (DTYPE, DTYPE))
)
llvm_maxnum = builder.module.declare_intrinsic("llvm.maxnum", (DTYPE, DTYPE), ir.FunctionType(DTYPE, (DTYPE, DTYPE)))
Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any reason why you're using maxnum instead of maximum? Sounds like maximum propagates NaNs while maxnum doesn't, and when in doubt I'd probably rather propagate them.

Copy link
Author

@starkwj starkwj Aug 1, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks, I agree with you.
However, maximum cannot work for me (and maybe a common case). When I try maximum, It turns out as "LLVM ERROR: Cannot select: t65: f64 = fmaximum t55, t64".
I tried to find out the cause, listed below. I'am not familiar with llvm, so I cannot ensure the below is correct.

  1. The issue is simliar to Cannot select llvm.{min,max}imum.{f32,f64} llvm/llvm-project#53353 . From replies, it seems that maximum of float has been supported not long ago: llvm/llvm-project@a82d27a
  2. For llvmlite, I installed the latest version 0.43.0 by pre-built binary (pip or conda), and the required LLVM is integrated by llvmlite binary (https://llvmlite.readthedocs.io/en/latest/admin-guide/install.html). I believe this is a common usage. The integrated LLVM doesn't seems to support the float version of maximum.

Copy link
Owner

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh yeah it seems this is not implement. I guess it doesn't super matter, since we're currently doing this elementwise anyway (instead of e.g. using llvm.vector.reduce.fmax) you could just implement it via fcmp followed by select.


if average_output:
args[0] = builder.fdiv(args[0], get_fdtype_const(num_trees, use_fp64))
Expand Down Expand Up @@ -368,7 +369,10 @@ def _populate_sigmoid(alpha):
elif objective == "multiclass":
assert len(args)
# TODO Might profit from vectorization, needs testing
result = [builder.call(llvm_exp, [arg]) for arg in args]
arg_max = args[0]
for a in args[1:]:
arg_max = builder.call(llvm_maxnum, [arg_max, a])
result = [builder.call(llvm_exp, [builder.fsub(arg, arg_max)]) for arg in args]

denominator = get_fdtype_const(0.0, use_fp64)
for r in result:
Expand Down
Loading