word_stem: Presto vs Velox value mismatch #12341

Open
peterenescu opened this issue Feb 14, 2025 · 2 comments
Labels
bug Something isn't working fuzzer-found

Comments

peterenescu (Contributor) commented Feb 14, 2025

Description

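word_stem returns different values in Velox and Presto for the same input: in the reproduction below, the Velox result is lowercased while the Presto result keeps the original case.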

Reproduction

(The first query shows the Velox result, the second the Presto result.)

presto:di> select word_stem(normalize(c0)) from (values (varchar '0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n')) t(c0);
                                        _col0                                        
-------------------------------------------------------------------------------------
 0q7,xcii4$[;jz<vjp11ndu<k`?\nd`qg<|yj8hf4hjtfk9o+dvnz7y()=9"s9o{9i{:{/}jsv-<anwj`^n 
(1 row)
presto:di> select word_stem(normalize(varchar '0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n'));
                                        _col0                                        
-------------------------------------------------------------------------------------
 0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n 
(1 row)
presto:di> 
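
A quick way to characterize the mismatch, assuming the two results above are copied exactly: the Python sketch below checks whether they differ only in letter case. It is not part of the original report, and `differs_only_by_case` is a hypothetical helper introduced for illustration.

```python
# Results copied from the two queries above.
# Raw strings keep the backslash in the data literal.
velox_out = r'0q7,xcii4$[;jz<vjp11ndu<k`?\nd`qg<|yj8hf4hjtfk9o+dvnz7y()=9"s9o{9i{:{/}jsv-<anwj`^n'
presto_out = r'0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n'


def differs_only_by_case(a: str, b: str) -> bool:
    """Return True when a and b are different strings that match after lowercasing."""
    return a != b and a.lower() == b.lower()


# Do the two engine results differ only in letter case?
print(differs_only_by_case(velox_out, presto_out))

# Is the Velox result exactly the lowercased Presto result?
print(velox_out == presto_out.lower())
```

If both checks print True, the Velox result is just the lowercased form of the Presto result, which would point at case handling rather than stemming. That is an inference from this single input, not a confirmed root cause.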

Relevant logs

normalize produces identical results in both engines (Velox result first, then Presto):

presto:di> select normalize(c0) from (values (varchar '0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n')) t(c0);
                                        _col0                                        
-------------------------------------------------------------------------------------
 0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n 
(1 row)
presto:di> select normalize(varchar '0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n');
                                        _col0                                        
-------------------------------------------------------------------------------------
 0Q7,XcIi4$[;Jz<vJp11ndu<k`?\Nd`qG<|YJ8hf4HJtfk9o+DvNz7y()=9"S9o{9I{:{/}jsV-<AnWJ`^n 
(1 row)

@peterenescu peterenescu added bug Something isn't working fuzzer Issues related the to Velox fuzzer test components. fuzzer-found labels Feb 14, 2025
kagamiori (Contributor) commented

@peterenescu, is this mismatch in word_stem or in normalize? Can we replace normalize(c0) with its result as a constant literal to make it clearer?

@peterenescu peterenescu removed fuzzer Issues related the to Velox fuzzer test components. fuzzer-found labels Feb 20, 2025
peterenescu (Contributor, Author) commented

normalize is not the issue. I added some queries to the Relevant logs section to demonstrate this.
