Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Build failure in MacOS Intel blocked CI #5421

Closed
Yohahaha opened this issue Jun 27, 2023 · 10 comments
Closed

Build failure in MacOS Intel blocked CI #5421

Yohahaha opened this issue Jun 27, 2023 · 10 comments
Assignees
Labels

Comments

@Yohahaha
Copy link
Contributor

Problem description

https://app.circleci.com/pipelines/github/facebookincubator/velox/27747/workflows/349fe077-d792-4b9f-be55-307f05a4662b/jobs/175099

System information

Velox System Info v0.0.2
Commit: Not in a git repo.
CMake Version: 3.20.2
System: Linux-5.10.134-14.al8.x86_64
Arch: x86_64
C++ Compiler: /usr/lib64/ccache/c++
C++ Compiler Version: 10.2.1
C Compiler: /usr/lib64/ccache/cc
C Compiler Version: 10.2.1
CMake Prefix Path: /usr/local;/usr;/;/usr;/usr/local;/usr/X11R6;/usr/pkg;/opt

CMake log

FAILED: velox/functions/prestosql/registration/CMakeFiles/velox_functions_prestosql.dir/BinaryFunctionsRegistration.cpp.o 
ccache /Applications/Xcode-14.3.1.app/Contents/Developer/Toolchains/XcodeDefault.xctoolchain/usr/bin/c++ -DBOOST_ATOMIC_DYN_LINK -DBOOST_ATOMIC_NO_LIB -DBOOST_CONTEXT_DYN_LINK -DBOOST_CONTEXT_NO_LIB -DBOOST_FILESYSTEM_DYN_LINK -DBOOST_FILESYSTEM_NO_LIB -DBOOST_PROGRAM_OPTIONS_DYN_LINK -DBOOST_PROGRAM_OPTIONS_NO_LIB -DBOOST_REGEX_DYN_LINK -DBOOST_REGEX_NO_LIB -DBOOST_SYSTEM_DYN_LINK -DBOOST_SYSTEM_NO_LIB -DBOOST_THREAD_DYN_LINK -DBOOST_THREAD_NO_LIB -DFMT_LOCALE -DFOLLY_HAVE_INT128_T=1 -DGFLAGS_IS_A_DLL=0 -DSIMDJSON_THREADS_ENABLED=1 -I/Users/distiller/project/_build/debug/_deps/protobuf-src/src -I/Users/distiller/project/. -I/Users/distiller/project/velox/external/xxhash -I/Users/distiller/project/_build/debug/_deps/folly-src -I/Users/distiller/project/_build/debug/_deps/folly-build -I/Users/distiller/project/_build/debug/_deps/xsimd-src/include -I/Users/distiller/project/_build/debug/_deps/simdjson-src/include -isystem /Users/distiller/project/velox -isystem /Users/distiller/project/velox/external -isystem /Users/distiller/project/velox/external/duckdb -isystem /Users/distiller/project/velox/external/duckdb/tpch/dbgen/include -isystem /Users/distiller/deps/include -isystem /Users/distiller/project/_build/debug/_deps/gtest-src/googletest/include -isystem /Users/distiller/project/_build/debug/_deps/gtest-src/googletest -mavx2 -mfma -mavx -mf16c -mlzcnt -std=c++17 -mbmi2 -fvisibility=hidden -fvisibility-inlines-hidden -D USE_VELOX_COMMON_BASE -D HAS_UNCAUGHT_EXCEPTIONS -Wall -Wextra -Wno-unused        -Wno-unused-parameter        -Wno-sign-compare        -Wno-ignored-qualifiers        -Wno-range-loop-analysis          -Wno-mismatched-tags          -Wno-nullability-completeness -Werror -g -std=gnu++17 -isysroot /Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk -mmacosx-version-min=13.2 -fPIC -MD -MT velox/functions/prestosql/registration/CMakeFiles/velox_functions_prestosql.dir/BinaryFunctionsRegistration.cpp.o -MF velox/functions/prestosql/registration/CMakeFiles/velox_functions_prestosql.dir/BinaryFunctionsRegistration.cpp.o.d -o velox/functions/prestosql/registration/CMakeFiles/velox_functions_prestosql.dir/BinaryFunctionsRegistration.cpp.o -c /Users/distiller/project/velox/functions/prestosql/registration/BinaryFunctionsRegistration.cpp
In file included from /Users/distiller/project/velox/functions/prestosql/registration/BinaryFunctionsRegistration.cpp:16:
In file included from /Users/distiller/project/./velox/functions/Registerer.h:18:
In file included from /Users/distiller/project/./velox/core/SimpleFunctionMetadata.h:21:
In file included from /Users/distiller/project/./velox/common/base/Exceptions.h:29:
In file included from /Users/distiller/project/./velox/common/base/VeloxException.h:25:
In file included from /Users/distiller/project/_build/debug/_deps/folly-src/folly/synchronization/CallOnce.h:25:
In file included from /Users/distiller/project/_build/debug/_deps/folly-src/folly/SharedMutex.h:29:
In file included from /Users/distiller/project/_build/debug/_deps/folly-src/folly/concurrency/CacheLocality.h:33:
/Users/distiller/project/_build/debug/_deps/folly-src/folly/Memory.h:289:33: error: 'HMAC_CTX_free' is deprecated [-Werror,-Wdeprecated-declarations]
  void operator()(T* t) const { f(t); }
                                ^
/Applications/Xcode-14.3.1.app/Contents/Developer/Platforms/MacOSX.platform/Developer/SDKs/MacOSX13.3.sdk/usr/include/c++/v1/__memory/unique_ptr.h:305:7: note: in instantiation of member function 'folly::static_function_deleter<hmac_ctx_st, &HMAC_CTX_free>::operator()' requested here
      __ptr_.second()(__tmp);
      ^
/Users/distiller/project/_build/debug/_deps/folly-src/folly/ssl/OpenSSLHash.h:232:14: note: in instantiation of member function 'std::unique_ptr<hmac_ctx_st, folly::static_function_deleter<hmac_ctx_st, &HMAC_CTX_free>>::reset' requested here
        ctx_.reset(HMAC_CTX_new());
             ^
/Users/distiller/deps/include/openssl/hmac.h:35:1: note: 'HMAC_CTX_free' has been explicitly marked deprecated here
OSSL_DEPRECATEDIN_3_0 void HMAC_CTX_free(HMAC_CTX *ctx);
^
/Users/distiller/deps/include/openssl/macros.h:193:49: note: expanded from macro 'OSSL_DEPRECATEDIN_3_0'
#   define OSSL_DEPRECATEDIN_3_0                OSSL_DEPRECATED(3.0)
                                                ^
/Users/distiller/deps/include/openssl/macros.h:62:52: note: expanded from macro 'OSSL_DEPRECATED'
#     define OSSL_DEPRECATED(since) __attribute__((deprecated))
                                                   ^
1 error generated.
@Yohahaha Yohahaha added build triage Newly created issue that needs attention. labels Jun 27, 2023
@Yohahaha
Copy link
Contributor Author

CC @kgpai @majetideepak

@Yohahaha Yohahaha changed the title MacOS Intel build failure blocking CI Folly build failure in MacOS Intel blocking CI Jun 27, 2023
@Yohahaha Yohahaha changed the title Folly build failure in MacOS Intel blocking CI Build failure in MacOS Intel blocking CI Jun 27, 2023
@Yohahaha Yohahaha changed the title Build failure in MacOS Intel blocking CI Build failure in MacOS Intel blocked CI Jun 27, 2023
@kgpai
Copy link
Contributor

kgpai commented Jun 27, 2023

Cc: @assignUser

@kgpai
Copy link
Contributor

kgpai commented Jun 27, 2023

Hi @Yohahaha can you add details on which version of OpenSSL you have and when it was installed?

@assignUser
Copy link
Collaborator

That's an openssl 3 warning @kgpai I think we suppressed similar warnings in other places?

@kgpai
Copy link
Contributor

kgpai commented Jun 27, 2023

Sorry , disregard last message - I missed that this was happening on Ci . Looking into Ci failure .

@mbasmanova mbasmanova removed the triage Newly created issue that needs attention. label Jun 27, 2023
@sanumandla
Copy link

Thanks a lot for helping while you are on vacation @kgpai @assignUser.
Tagging @raulcd if you can provide some coverage :). Thank you!

@kgpai
Copy link
Contributor

kgpai commented Jun 27, 2023

This is happening because deps/include/openssl links to openssl -> ../Cellar/openssl@3/3.1.1/include/openssl.

@majetideepak
Copy link
Collaborator

This is similar to #5309

facebook-github-bot pushed a commit that referenced this issue Jun 28, 2023
Summary:
Probable fix for issue : #5421 .

Pull Request resolved: #5426

Reviewed By: kevinwilfong

Differential Revision: D47079139

Pulled By: pedroerp

fbshipit-source-id: 0ee0ff98ce0424d3abe80d6d53dabf78b2bfcc05
rrando901 pushed a commit to rrando901/velox that referenced this issue Jul 24, 2023
Add utility to register file data sinks (facebookincubator#5340)

Summary: Pull Request resolved: facebookincubator#5340

Reviewed By: Yuhta

Differential Revision: D46905800

Pulled By: xiaoxmeng

fbshipit-source-id: e842f27c72e7472cd1291a090357f04f3f3d8ec4

Updating submodules

Summary:
GitHub commits:

facebook/fb303@550ea38
facebook/fbthrift@154c26f
facebook/folly@93b2aea
facebook/ocamlrep@db3bb4c
facebook/proxygen@8d362b9
facebook/wangle@c841815
facebook/watchman@e64f1df
facebookexperimental/edencommon@1aa21ed
facebookexperimental/rust-shed@dc3d716
facebookincubator/fizz@fc71dd8
facebookincubator/katran@fb3c0cc
facebook/mvfst@072b444
facebookincubator@787b688

Reviewed By: jailby

fbshipit-source-id: 99864426247a6050d13f47cc3a3ad1a1e34144d0

Fix: mapFromEntries initialize sizes with 0. (facebookincubator#5377)

Summary:
Pull Request resolved: facebookincubator#5377

A previous fix by me PR5344 introduced another bug, when not all rows selected,
map vector still need to be valid for unselected rows, hence setting sizes to 0.

Reviewed By: mbasmanova

Differential Revision: D46984868

fbshipit-source-id: dbf08cb986b31ff1f7417ec335db4e7d282bda6f

Disable expect death memory pool tests which hang in linux build (facebookincubator#5374)

Summary:
oss gtest expect death implementation can lead to test hang.

Pull Request resolved: facebookincubator#5374

Reviewed By: kevinwilfong

Differential Revision: D46983548

Pulled By: xiaoxmeng

fbshipit-source-id: 96b34029334c6809c31be3807ef85a57ca27f626

Fix race condition in Task abort (facebookincubator#5376)

Summary:
Pull Request resolved: facebookincubator#5376

HashJoinTest.hashBuildAbortDuringInputgProcessing exposes a potential race condition when aborting a Task.

The test creates a Task for a Join query, which has 2 Drivers, one for the HashProbe and one for the HashBuild.  The Driver for the HashBuild will get suspended at which point the Task will be aborted.

It's possible that the Driver for the HashProbe has been enqueued on the Task's executor but hasn't been executed yet and isn't executed until after abort is called on its operators.

In this case:
* when the Task's terminate function is called as part of abort, it calls enterForTerminateLocked on the Driver's state and sees that this Driver is not "onThread", it sets its state to terminated and calls setThread
* when abort is called on any operator from the Driver, the MemoryReclaimer sees that the Driver is now "onThread" and the Driver is not suspended, so it fails this check https://www.internalfb.com/code/fbsource/[80b90c8613db0fc6a4f44778b41a236a4b4d92a4]/fbcode/velox/exec/Operator.cpp?lines=470

In this case it is still safe to close the Operator.  As soon as the Driver is run, it will see that it is terminated and cease to run.  In general this is the case because a Driver can only be in state terminated if it is not actively running.

I believe this is also an issue for the reclaim function.

Reviewed By: xiaoxmeng

Differential Revision: D46983765

fbshipit-source-id: 0ce1558c8ee28b0af9a545e05f0aab0ef62dd46c

Fix potential concurrency bug in SsdFile::updateStats (facebookincubator#5367)

Summary:
Pull Request resolved: facebookincubator#5367

The test AsyncDataCacheTest.ssd is currently flaky.  When it fails it reports that numPins in the stats from
the SsdCache is more than 0 when it's expected to be 0.

My theory (I'm having trouble consistently reproducing it) is that this is due to cache incoherence between
the threads that are pinning/unpinning regions and the main thread.

The numPins stats is gotten by summing up all the pins across all regions across all SsdFiles in the
SsdCache.  Each SsdFile has an std::vector, counting all the pins across all regions.  Multiple threads update
this std::vector as regions are pinned/unpinned each taking a lock as they do so.  When we call updateStats
on the SsdFile, we don't take a lock (except, somewhat tellingly, in TSAN mode which would complain
otherwise) when we read from that std::vector.

According to the description of the C++ memory model as I understand it, without something to synchronize
threads A and B, if they run on separate cores, B may not see all updates made by A due to cache
incoherence. https://en.cppreference.com/w/cpp/atomic/memory_order (and is also why TSAN is
complaining)

Acquiring a mutex is one way to synchronize threads, so we just need to make that mutex lock happen
across all modes, not just TSAN.

In terms of performance, other than tests
* SsdFile::updateStats() is only called from SsdCache::stats()
* SsdCache::stats() is only called from SsdCache::toString() (which I assume is only used for debugging
purposes) and AsyncDataCache::refreshStats()
* AsyncDataCache::refreshStats() is only called from AsyncDataCache::toString() (which I again assume is
only used for debugging)

So it shouldn't have any impact on production.

Reviewed By: laithsakka

Differential Revision: D46953325

fbshipit-source-id: 142874270de97306daadbbdc607f9481b4d0f8e3

Updating submodules

Summary:
GitHub commits:

facebook/CacheLib@b36cb54
facebook/folly@5648b86
pytorch/FBGEMM@751ce3a

Reviewed By: jailby

fbshipit-source-id: 2b3f4b9726aeb40b0bc3cbe5ff26bc215f5b5ce9

Add array with element concat Presto function (facebookincubator#5292)

Summary: Pull Request resolved: facebookincubator#5292

Reviewed By: amitkdutta

Differential Revision: D46914854

Pulled By: zacw7

fbshipit-source-id: 910f94f23b8f1d134fd130af62bac9fa854e5ff1

Updating submodules

Summary:
GitHub commits:

facebook/folly@9bd57cd

Reviewed By: jailby

fbshipit-source-id: c97b95cafd79c553831ddad6d4c52682e9f4be9c

Fix flaky concurrent arbitration test (facebookincubator#5378)

Summary:
Fix a couple tsan detected race condition in concurrent arbitration

Pull Request resolved: facebookincubator#5378

Test Plan:
(1) protect the reclaimer set using mutex lock in MemoryPoolImpl and use
tsan lock for access
(2) fix capacity access race in out of memory exception message generation
(3) fix concurrent duckdb access under tsan
(4) fix fake memory operator initialization race with reclaimer

Verified with meta internal tsan tests

Reviewed By: kevinwilfong

Differential Revision: D46988269

Pulled By: xiaoxmeng

fbshipit-source-id: 68718bd56325ee8201c85f2fe12dab041a87c837

Updating submodules

Summary:
GitHub commits:

facebook/folly@468d81a

Reviewed By: jailby

fbshipit-source-id: aed16f09ac556c2c32dab1fc1fd04a2e32b631a7

Fix out-of-range condition buffer access in SwitchExpr (facebookincubator#5355)

Summary:
Pull Request resolved: facebookincubator#5355

When evaluating a branch, SwitchExpr first evaluates the condition sub-expression into a `condition` vector. getFlatBool() is then called to extract a  uint64_t* buffer of values in `condition` to participate the subsequent bit operation that computes a SelectivityVector of rows that pass the condition. However, `condition` is guaranteed to be at least of the size `remainingRows->end()` while the bit operaiton is performed from bit 0 to rows.end() that may be larger than remainingRows->end(), hence causing out-of-range access error. This diff fixes this bug by making the bit operation performed from bit 0 to remainingRows->end().

In addition, the `condition` vector was released before its raw values buffer is used in the subsequent bit operation, which is unsafe. This diff also moves the release of the `condition` vector to after the bit operation.

Reviewed By: bikramSingh91

Differential Revision: D46920659

fbshipit-source-id: ea86d44c96e4c86e26cfbe463cb292b41fe7d1f1

Avoid quadratic calls to computeMetaData (facebookincubator#5348)

Summary:
Pull Request resolved: facebookincubator#5348

computeMetaData is called from  the expression compiler.
Input expressions are compiled from leaf to root, calling
computeMetaData on leafs then root.

We need to call computeMetaData in that manner, because
we do constant folding. This diff avoid recompilation when not
needed by adding a flag metaDataComputed_

Reviewed By: kagamiori

Differential Revision: D46909748

fbshipit-source-id: 6266515f44df704329f4ab537e0e2c7e8529b24e

Fix max_by/min_by when processing extreme values (facebookincubator#5178)

Summary:
Remove initial value for fixed-width types. Use group's null flag to check if it
is the first valid value to process.

Also, add Timestamp to isNumeric check to allow Timestamps to go over fast path.

Fixed: facebookincubator#5145

Pull Request resolved: facebookincubator#5178

Reviewed By: xiaoxmeng

Differential Revision: D46583948

Pulled By: mbasmanova

fbshipit-source-id: 98cf67de7c9986f3d3c55db992dbd52a0deb1044

Reduce linker threads in nightly benchmark build (facebookincubator#5389)

Summary:
Similar to the fix for linux-build in facebookincubator#4582

Calling make debug OOM's the CircleCI container unless we drop the number of linker threads to 6.

This fixes the nightly fuzzer runs so they no longer OOM during the build phase.

In facebookincubator#5381 I modified the CircleCI config so that the fuzzer jobs run on PRs.  All the build phases succeeded (a couple of the fuzzer tests failed at runtime which is why we need them running).

Pull Request resolved: facebookincubator#5389

Reviewed By: laithsakka

Differential Revision: D46994371

Pulled By: kevinwilfong

fbshipit-source-id: 628434a78d61b7bd28530ee408911e14745b9a77

Updating submodules

Summary:
GitHub commits:

facebook/CacheLib@d2d9f73
facebook/fbthrift@495bda5
facebook/folly@d59f984
facebook/litho@6235932
facebook/ocamlrep@5046175
facebook/rocksdb@ff1cc8a
facebook/wangle@dace022
facebookincubator/katran@b17ab0d
facebookincubator@36a1507

Reviewed By: bigfootjon

fbshipit-source-id: ad5255f4fc1d1105ef2e13d6b8fce7c34ffd72ef

Skip concat function in fuzzer test (facebookincubator#5407)

Summary: Pull Request resolved: facebookincubator#5407

Reviewed By: mbasmanova

Differential Revision: D47022269

Pulled By: zacw7

fbshipit-source-id: 544eefdeefd9b2d08b9b73b0f2503e0cea3aab5c

Updating submodules

Summary:
GitHub commits:

facebook/fbthrift@a80e449
facebook/folly@e74fe5c
facebook/rocksdb@94c247b
facebookresearch/multimodal@94979e2

Reviewed By: bigfootjon

fbshipit-source-id: ed01d5ed12880ef462ce3489abfe99f870c60fcd

redefine distinctFields and make skipPeeling optimization explicit. (facebookincubator#5354)

Summary:
Pull Request resolved: facebookincubator#5354

distinctFields was defined as
1. the distinct fields of the expression.
2. or empty list if the expression is single referenced and has the same as its parent distinct fields.

(2) was added to optimize peeling (avoiding it) when fields are known to be shared with parents. This diff separate the optimization from the definition of the distinct fields and makes it explicit using skipPeeling() function. With this, users of the library can always trust that distinct fields refer to the expression's distinct fields.

Reviewed By: bikramSingh91

Differential Revision: D46915886

fbshipit-source-id: e2120f89b899730a7addbd083de4d719a7db91b6

No need for computeMetaData to be virtual. (facebookincubator#5353)

Summary:
Pull Request resolved: facebookincubator#5353

computeMetaData was virtual, however only one expression namely FieldReference overrides it. The FieldReference computeMetaData calls back the base computeMetaData when there are inputs.
Moreover, metadata for field reference is computed in the same way as other special forms except for distinct fields when there are no inputs.
This diff remove the virtual and handles field reference in computeMetaData

Reviewed By: kagamiori

Differential Revision: D46916994

fbshipit-source-id: 3b3741be31f497024659b82f803e9696d5ba27e5

Add support for UNKNOWN type to UnsafeRow serializer (facebookincubator#5408)

Summary:
Extends UnsafeRow serde to support values of UNKNOWN types. These are always nulls.

Pull Request resolved: facebookincubator#5408

Reviewed By: miaoever

Differential Revision: D47028801

Pulled By: mbasmanova

fbshipit-source-id: 9821a38a3f4c0f8787913f2cd9d74ef67a41576a

Updating submodules

Summary:
GitHub commits:

facebook/folly@b4fabe8
facebook/litho@2168b53
facebook/ocamlrep@496d4cb
facebookincubator/fizz@d645712
facebookincubator@a9af06c
pytorch/FBGEMM@555ad07

Reviewed By: bigfootjon

fbshipit-source-id: 59f623f1e8a5311fd75ede79c731df709a8adb98

Add sanity check in computeChecksum (facebookincubator#5321)

Summary:
Add checks outside of the while loop to fail fast before going into the actually reading and checksum computing.
Moved implementations of ByteStream to its cpp file.

Pull Request resolved: facebookincubator#5321

Reviewed By: xiaoxmeng

Differential Revision: D46877120

Pulled By: tanjialiang

fbshipit-source-id: cf7266c7156d3bf1c8313961072cc8543759f38f

Make various code cleanups (open source) (facebookincubator#5395)

Summary:
Pull Request resolved: facebookincubator#5395

Various code cleanups- open source changes in this commit.

Reviewed By: DanielMunozT, abhash09

Differential Revision: D46994931

fbshipit-source-id: 0d8103c92262589db7d5bfd9abd2c4c633bd3cae

Updating submodules

Summary:
GitHub commits:

facebook/fb303@8130eb6
facebook/fbthrift@cb689e8
facebook/folly@5095e77
facebookincubator@94bc7ad

Reviewed By: bigfootjon

fbshipit-source-id: fa176b70957e2490ede22b7da84411c7eb6af91f

More flexible serializer support in remote functions (facebookincubator#5370)

Summary:
Pull Request resolved: facebookincubator#5370

Make remote functions able to specify serializer based on a serde
format in a more flexible manner, without relying on the default serializer
installed in the process. This is needed because different remote functions
might need to communicate with different remote services, using distinct serialization
formats.
Now the serialization format can be defined when registering the function, as
part of the metadata.

Reviewed By: mbasmanova

Differential Revision: D46960082

fbshipit-source-id: 87d876c4d8f577079390a22868576f417207a085

Remove static data sink registration method (facebookincubator#5402)

Summary: Pull Request resolved: facebookincubator#5402

Reviewed By: HuamengJiang, kewang1024

Differential Revision: D47024419

Pulled By: xiaoxmeng

fbshipit-source-id: df3cf67bfc4058c411089cbea6610af609dc3cf8

Add to release more memory resources in hash probe close (facebookincubator#5396)

Summary:
Currently hash probe close only free major memory resources
and this caused  a flaky unit test facebookincubator#5240: which expect all the memory
has been freed. This PR adds to fix more memory resource and the
flaky test passed 1000 times in meta internal test.

Pull Request resolved: facebookincubator#5396

Test Plan: Passed 200 test iterations: https://www.internalfb.com/intern/testinfra/testrun/5066549760861273

Reviewed By: kevinwilfong

Differential Revision: D47024400

Pulled By: xiaoxmeng

fbshipit-source-id: c55ed0d216c1dd8a6d9b1f21c760ac74fa64b9ff

Move velox_dwio_type_fbhive into velox/type (facebookincubator#5388)

Summary:
Pull Request resolved: facebookincubator#5388

Moving the hive type parser and serializer code from dwio into
velox/type. This code used to be used strictly by dwio, but now will be used
when serializing/deserializing types more generally for remote function
execution.

Reviewed By: xiaoxmeng, mbasmanova

Differential Revision: D46993770

fbshipit-source-id: 8693f3f6b874dcf4bb6bc52818a90ca01d5d8035

Fix conjunct clearing pre-existing errors beyond rows.size() (facebookincubator#5365)

Summary:
Pull Request resolved: facebookincubator#5365

Fuzzer found a bug in Conjunct when evaluating an expression
`try(coalesce(map_from_entries(c0), c1, switch(c3, switch((c4) or (c5), c7))))`.
Here `map_from_entiries(c0)` threw at row 99, while `(c4) or (c5)`
was evaluated on a SelectivityVector of size 45. ConjunctExpr calls
finalizeErrors() to clear errors that show up in the initial set of rows
but not in the final set of active rows, which is expected. But what's
wrong is that finalizeErrors() performs bit operation on bit 0--99
(i.e., the size of the pre-existing error vector) with selectivity vectors
of size 45. This leads to out-of-range memory access that either
causes ASAN error or incorrect clear of pre-existing errors. This diff
fixes this problem.

This diff fixes facebookincubator#5364.

Reviewed By: laithsakka

Differential Revision: D46943921

fbshipit-source-id: da7e2ece031e0148936b891643855dca346485a8

Add support for sorted aggregations with masks (facebookincubator#5416)

Summary:
Add support for queries like `SELECT agg(x ORDER BY y) FILTER (WHERE z) FROM t`

Pull Request resolved: facebookincubator#5416

Reviewed By: xiaoxmeng

Differential Revision: D47041831

Pulled By: mbasmanova

fbshipit-source-id: d3f0e0313a7220f4aa0710900e4d9522e34bfe89

Updating submodules

Summary:
GitHub commits:

facebook/folly@f82d679
facebook/litho@3f80da6
facebook/ocamlrep@63644ef
facebookincubator@22ecefc

Reviewed By: bigfootjon

fbshipit-source-id: 15bce5d250ba2d96cd67a0a177b14229e4d90c91

Fix capture vectors size in ExprCallable::apply[NoThrow] (facebookincubator#5418)

Summary:
ExprCallable::apply[NoThrow] must ensure that 'capture' vectors cover all rows in 'final selection'. Otherwise, peeling dictionary encoding from capture vectors fails.

```
VeloxRuntimeError: rows->end() <= vector.size() (1889 vs. 1887) presto.default.concat(item_category, //:VARCHAR)
	at Unknown.# 0  facebook::velox::VeloxException::VeloxException(char const*, unsigned long, char const*, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, std::basic_string_view<char, std::char_traits<char> >, bool, facebook::velox::VeloxException::Type, std::basic_string_view<char, std::char_traits<char> >)(Unknown Source)
	at Unknown.# 1  void facebook::velox::detail::veloxCheckFail<facebook::velox::VeloxRuntimeError, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&>(facebook::velox::detail::VeloxCheckFailArgs const&, std::__cxx11::basic_string<char, std::char_traits<char>, std::allocator<char> > const&)(Unknown Source)
	at Unknown.# 2  facebook::velox::DecodedVector::makeIndices(facebook::velox::BaseVector const&, facebook::velox::SelectivityVector const*, int)(Unknown Source)
	at Unknown.# 3  facebook::velox::exec::PeeledEncoding::peelInternal(std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > > const&, facebook::velox::SelectivityVector const&, facebook::velox::exec::LocalDecodedVector&, bool, std::vector<std::shared_ptr<facebook::velox::BaseVector>, std::allocator<std::shared_ptr<facebook::velox::BaseVector> > >&)(Unknown Source)
```

Pull Request resolved: facebookincubator#5418

Reviewed By: pedroerp

Differential Revision: D47047117

Pulled By: mbasmanova

fbshipit-source-id: a64219584fc802f35fd188938ff6bb3e74bc7cae

Make LocalWriteFile optionally create parent directory (facebookincubator#5386)

Summary:
Pull Request resolved: facebookincubator#5386

`LocalFileSink` creates parent directory if not exists. Pass flag to `LocalWriteFile` so it can do the same.

Reviewed By: DanielMunozT

Differential Revision: D46993078

fbshipit-source-id: 55d8f35c82502871ec71fb7e50b78a76c0dfc99a

Fix size reporting in WriteFileDataSink (facebookincubator#5393)

Summary:
Pull Request resolved: facebookincubator#5393

We can run into an issue where tests fail due to size mismatch between LocalWriteFile and the wrapping WriteFileDataSink. This is because:

1. We write to a file
2. We open a new WriteFileDataSink on that file
3. We check the file's size. WriteFileDataSink uses DataSink's built in size, based on total appended data (0) LocalWriteFile uses the size of the underlying file itself (however many bytes we wrote in step 1).

This fixes this by using semantics from `DataSink`- report 0 if file has not been written to.

Reviewed By: DanielMunozT

Differential Revision: D46996258

fbshipit-source-id: 0f2a4850a7b58518f83ba53f94dc9656d6872363

Add function to register WriteFileDataSink factory wrapping LocalWriteFile (facebookincubator#5387)

Summary:
Pull Request resolved: facebookincubator#5387

We need to be able to register factory for a DataSink wrapped LocalWriteFile. This does that without using the soon-to-be-deprecated `VELOX_REGISTER_DATA_SINK_METHOD_DEFINITION` macro.

Reviewed By: DanielMunozT

Differential Revision: D46953912

fbshipit-source-id: 8551b83b3b227d5b0ac4bac4d45337cadd476a34

Remove LocalFileSink from open source (facebookincubator#5390)

Summary:
Pull Request resolved: facebookincubator#5390

Removes `LocalFileSink` usages from velox, replacing with `WriteFileDataSink`.

Reviewed By: DanielMunozT

Differential Revision: D46951742

fbshipit-source-id: 77b2229214556573d342e7b0d59f7b678391ebb5

Change preadv signature to use folly::Ranges (facebookincubator#5368)

Summary:
Pull Request resolved: facebookincubator#5368

To address concerns of using raw pointers.

Reviewed By: Sullivan-Patrick

Differential Revision: D46958136

fbshipit-source-id: ec841c8c84745d14396578c78881ccda5fa8a60f

Remove vread(Segments) (facebookincubator#5375)

Summary:
Pull Request resolved: facebookincubator#5375

We moved to the IOBuf version.

Reviewed By: Sullivan-Patrick

Differential Revision: D46979164

fbshipit-source-id: b315c3071c175e6a68fae56b1f6d9e6016ba4c9f

Remove sequence and bias support from DecodedVector (facebookincubator#5430)

Summary:
Sequence and bias encodings are not being used, but the logic for processing
these in DecodedVector is complex and not well tested.

Pull Request resolved: facebookincubator#5430

Reviewed By: bikramSingh91

Differential Revision: D47070504

Pulled By: mbasmanova

fbshipit-source-id: 5702aab7a7c191bdf41b54c2c5220326b5282461

Enable huge pages for large contiguous allocations (facebookincubator#5413)

Summary:
Hash tables of tens of MB or more benefit from huge pages because of fewer TLB misses. Savings can be ~30% of CPU time.  We enable huge pages for allocations over 2MB (huge page is 2MB). Since this is an advice to the OS that does not change semantics, we do not fail even if the madvise fails.

Pull Request resolved: facebookincubator#5413

Reviewed By: xiaoxmeng

Differential Revision: D47033963

Pulled By: oerling

fbshipit-source-id: 690fe1c678847ef3fb29b23e24889943ebac0e3f

Make Timestamp constructor throw VeloxUserError instead of VeloxRuntimeError (facebookincubator#5432)

Summary:
Pull Request resolved: facebookincubator#5432

A recent diff introduced VELOX_DCHECK in Timestamp constructor. But there
are UDFs, such as from_unixtime, that construct Timestamp objects from input
values. As the result, these UDFs may throw VeloxRuntimeError during fuzzer
test and fail the test. This diff changes these checks to VELOX_USER_DCHECK.

This diff fixes facebookincubator#5424.

Reviewed By: mbasmanova

Differential Revision: D47063148

fbshipit-source-id: f4c4003252e2731cf863f8ea2960f566a4733731

Force [email protected] path. (facebookincubator#5426)

Summary:
Probable fix for issue : facebookincubator#5421 .

Pull Request resolved: facebookincubator#5426

Reviewed By: kevinwilfong

Differential Revision: D47079139

Pulled By: pedroerp

fbshipit-source-id: 0ee0ff98ce0424d3abe80d6d53dabf78b2bfcc05

Allow varchar input in json_xxx functions (facebookincubator#5436)

Summary:
Presto allows both JSON and VARCHAR values in json_xxx functions.

```
presto:di> show functions like '%json_%';
      Function       | Return Type |     Argument Types     | Function Type | Deterministic | Description | Variable Arity | Built In | Temporary | Language
---------------------+-------------+------------------------+---------------+---------------+-------------+----------------+----------+-----------+----------
 is_json_scalar      | boolean     | json                   | scalar        | true          |             | false          | true     | false     |
 is_json_scalar      | boolean     | varchar(x)             | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | json, bigint           | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | json, boolean          | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | json, double           | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | json, varchar(x)       | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | varchar(x), bigint     | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | varchar(x), boolean    | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | varchar(x), double     | scalar        | true          |             | false          | true     | false     |
 json_array_contains | boolean     | varchar(x), varchar(y) | scalar        | true          |             | false          | true     | false     |
 json_array_get      | json        | json, bigint           | scalar        | true          |             | false          | true     | false     |
 json_array_get      | json        | varchar(x), bigint     | scalar        | true          |             | false          | true     | false     |
 json_array_length   | bigint      | json                   | scalar        | true          |             | false          | true     | false     |
 json_array_length   | bigint      | varchar(x)             | scalar        | true          |             | false          | true     | false     |
 json_extract        | json        | json, JsonPath         | scalar        | true          |             | false          | true     | false     |
 json_extract        | json        | varchar(x), JsonPath   | scalar        | true          |             | false          | true     | false     |
 json_extract_scalar | varchar     | json, JsonPath         | scalar        | true          |             | false          | true     | false     |
 json_extract_scalar | varchar(x)  | varchar(x), JsonPath   | scalar        | true          |             | false          | true     | false     |
 json_format         | varchar     | json                   | scalar        | true          |             | false          | true     | false     |
 json_parse          | json        | varchar(x)             | scalar        | true          |             | false          | true     | false     |
 json_size           | bigint      | json, JsonPath         | scalar        | true          |             | false          | true     | false     |
 json_size           | bigint      | varchar(x), JsonPath   | scalar        | true          |             | false          | true     | false     |
(22 rows)
```

Pull Request resolved: facebookincubator#5436

Reviewed By: xiaoxmeng

Differential Revision: D47085189

Pulled By: mbasmanova

fbshipit-source-id: 3e758ac283d5269d0992ed3b227d2dab4235977a

Updating submodules

Summary:
GitHub commits:

facebook/folly@d2ca58d
facebook/mcrouter@124fb5b
facebook/ocamlrep@0cb1903

Reviewed By: bigfootjon

fbshipit-source-id: 99bf8379d90ab8f88bd9dfcf83619a5d76c2c449

Updating submodules

Summary:
GitHub commits:

facebook/fb303@e40198f
facebook/folly@066724c
facebook/litho@04c8de5
facebook/ocamlrep@2a05b67
facebook/wangle@3a48d0c
facebookexperimental/edencommon@626a959
facebook/mvfst@44a8935

Reviewed By: bigfootjon

fbshipit-source-id: 5eda2e11f8dc22252847bc8f29c49aae8c3a5692

Add support for map_union(map(unknown, unknown)) (facebookincubator#5440)

Summary:
Presto supports map_union(map(unknown, unknown)) when input maps are all empty.

```
WITH t as (SELECT map(array[], array[]) as m)
SELECT map_union(m) FROM t
```

Pull Request resolved: facebookincubator#5440

Reviewed By: xiaoxmeng

Differential Revision: D47088037

Pulled By: mbasmanova

fbshipit-source-id: c2215677272ef28f77223c1855060d25cf5282ed

Fix reusable buffers setting issue in hive data sink (facebookincubator#5433)

Summary:
The reusable partition rows buffer settings for hive data sink
has bug that can cause memory segment fault for partitioned table
write. Here is the input pattern that cause the problem:
1. the initial vector writes to a single partition and we apply the fast
    path optimization which doesn't set the reusable partition rows and
    leave them empty;
2. the second vector writes to multiple partitions except the first partition.
     It will cause problem when we populate the partition rows buffers:
     We have N writers but only N - 1 partition rows.
This PR fix the problem by extending the partition rows buffers when
add new writers and only set partition row buffer capacity if the partition
size is not empty. Add a unit test to reproduce the issue and verified the
fix.

Pull Request resolved: facebookincubator#5433

Test Plan: pranjalssh has verified this fix in presto batch cluster

Reviewed By: pranjalssh, mbasmanova

Differential Revision: D47083727

Pulled By: xiaoxmeng

fbshipit-source-id: 5080478e3eea5e8ca041ead84f003bbd5800b733

Add support for all types to set_agg and set_union (facebookincubator#5445)

Summary:
Add support for complex types (arrays, maps and structs) and missing scalar
types (boolean, real, double, timestamp and date).

Pull Request resolved: facebookincubator#5445

Reviewed By: xiaoxmeng

Differential Revision: D47092227

Pulled By: mbasmanova

fbshipit-source-id: 825773460281866188cd702e4cc3309bf7b993e8

Updating submodules

Summary:
GitHub commits:

facebook/folly@0657a73
facebook/litho@3ca8973
facebook/ocamlrep@41b09a7
facebookexperimental/rust-shed@1651c6e

Reviewed By: bigfootjon

fbshipit-source-id: 658b2aca961df2dd074c91eae2e5b1c0a8ef54e7

Add support for complex types to 3-arg min_by and max_by (facebookincubator#5448)

Summary: Pull Request resolved: facebookincubator#5448

Reviewed By: xiaoxmeng

Differential Revision: D47104260

Pulled By: mbasmanova

fbshipit-source-id: 3875fe0790127ae1e7252ab651d667aa65bda65d

Register Presto between function with Timestamp arguments (facebookincubator#5417)

Summary:
Presto function `between` is not registered with arguments:
(TIMESTAMP, TIMESTAMP, TIMESTAMP). Since the gte and lte is already registered
with Timestamp type, scalar function registration is the only missing part.

Pull Request resolved: facebookincubator#5417

Reviewed By: miaoever

Differential Revision: D47073227

Pulled By: JasmineWeii

fbshipit-source-id: 014f8614ebd9c09c18e0754365391b724251cac5

Fix + New gflag; correctly implement coalesce gflags. (facebookincubator#5095)

Summary:
Replaces PR#4982 to correctly add the coalesce parameters as properly configurable values. Please Review with care. All UT was locally passing.

Pull Request resolved: facebookincubator#5095

Reviewed By: mbasmanova

Differential Revision: D46947539

Pulled By: pedroerp

fbshipit-source-id: 73cfcbec23e18cc66f2d057dbbb26266ac43dffd

Updating submodules

Summary:
GitHub commits:

facebook/CacheLib@97a24b0
facebook/fbthrift@261d27f
facebook/folly@5ad717f
facebook/ocamlrep@ebb5b81
facebook/proxygen@593ffac
facebook/wangle@9b84244
facebook/watchman@6a1d1e8
facebookincubator/fizz@69bdfd9
facebook/mvfst@8a39909
facebookincubator@f37eccc

Reviewed By: bigfootjon

fbshipit-source-id: 4a4c1165a77d2270ff502b69aa47fb1973b0a1dd

Add sum_data_size_for_stats aggregation function (facebookincubator#5362)

Summary: Pull Request resolved: facebookincubator#5362

Reviewed By: xiaoxmeng

Differential Revision: D47100600

Pulled By: kewang1024

fbshipit-source-id: 0c8123350892a205cab94395df5c02eab6364449

Updating submodules

Summary:
GitHub commits:

facebook/folly@a282761
facebook/proxygen@bde3d86

Reviewed By: bigfootjon

fbshipit-source-id: b2899331e041a38c89ef6e9dbd674d95bf1b65ed

Use hive type parser/serializer for remote functions (facebookincubator#5456)

Summary:
Pull Request resolved: facebookincubator#5456

Use hive type parser/serializer for remote functions to make it
interoperable with Java implementations.

Reviewed By: mbasmanova

Differential Revision: D47114390

fbshipit-source-id: a218334f8fab1cca16722b90b796849106aa4d58

Add test for table writer special partition name (facebookincubator#5427)

Summary: Pull Request resolved: facebookincubator#5427

Reviewed By: xiaoxmeng

Differential Revision: D47112961

Pulled By: kewang1024

fbshipit-source-id: 3ca6fb2beb56eafe0b46eccf8fc36fdf6484a545

Avoid resizing child in FieldReference when parent has encoding layers (facebookincubator#5453)

Summary:
Pull Request resolved: facebookincubator#5453

FieldReference used to resize the child vector through BaseVector::ensureWritable when the child size is smaller than its parent size. This resizing was necessary to avoid errors during when wrapping the encoding layers of the struct over its child where indices in unselected rows can point to indices outside the size of the child. However, this resulted in increased memory usage since the child needed to be copied to resize. This avoids the copy by only operating on the non-null selected rows when peeling and wrapping the child so that those invalid indices are never touched. The final step of adding nulls also occurs after wrapping so that nulls can be directly added over the wrapped layer which would be unique and can be modified in place without the need for a copy.

Reviewed By: kagamiori

Differential Revision: D47110861

fbshipit-source-id: 0daea7d6bf947e0e0600f7e79285327091a85e71

Add query id to hive connector to build bucket file name (facebookincubator#5449)

Summary:
In Presto e2e test, we found the bucket file name is not compatible
with one generated by Presto java if some bucket file is missing
(computeFileNamesForMissingBuckets). The difference is that java
use query id to build bucket file name but velox use the task id.
We can just use query id as bucket id are unique under a particular
partition dir so no need to use task id.

Pull Request resolved: facebookincubator#5449

Reviewed By: mbasmanova, kewang1024

Differential Revision: D47107728

Pulled By: xiaoxmeng

fbshipit-source-id: a8b9fc4fc26bc386941fe6cb7af697af5273d198

Improve Decimal to Double cast performance (facebookincubator#5455)

Summary:
Apply Decimal to Double cast at the batch level instead of row-level.
This also improves performance of cast between primitive types.

Pull Request resolved: facebookincubator#5455

Reviewed By: amitkdutta

Differential Revision: D47115411

Pulled By: mbasmanova

fbshipit-source-id: 766994738d49b8d724a95abc79061b3392819964

Extract SetAccumulator into a separate header file (facebookincubator#5467)

Summary: Pull Request resolved: facebookincubator#5467

Reviewed By: pedroerp

Differential Revision: D47123707

Pulled By: mbasmanova

fbshipit-source-id: 074cdb3300d2102ad6b9fc54c4c82aa2e0db4769

Updating submodules

Summary:
GitHub commits:

facebook/folly@a606c63

Reviewed By: bigfootjon

fbshipit-source-id: 474cc11e0ad351814a8b0a22dc4ca977a53096e5

Skip probe input from non-spilled source when build is empty (facebookincubator#5423)

Summary:
Hash probe adds to skip probe input data processing logic for a specific
join types (see skipProbeOnEmptyBuild()), and the build table is empty
and the probe input is read from non-spilled source. This ensures the
hash probe operator keeps running until all the probe input from the sources
have been processed. It prevents the exchange hanging problem at the
producer side caused by the early query finish. Here is event sequence that
can lead into this hanging situation:
1. Hash probe in exchange consumer detects the build side is empty and
    mark the operator to finish
2. Hash probe query finishes which don't receive any new splits (exchange
    producer and no more splits signal) from the coordinator as the query has
    finished
3. Exchange producer still keeps generating data until full and then block there
    as there is no completion (abort) signal from the consumer.

Pull Request resolved: facebookincubator#5423

Reviewed By: pranjalssh

Differential Revision: D47063937

Pulled By: xiaoxmeng

fbshipit-source-id: 78b36d66bb98aea819501d1c61690494113d71ec

Fix: Only compare selected rows in the expression verifier. (facebookincubator#5425)

Summary:
Pull Request resolved: facebookincubator#5425

fix a fuzzer issue.

Under some condition our expression eval can generate output
vectors that are larger than rows.end().  While that might not ideal its
not a bug.

Reviewed By: bikramSingh91

Differential Revision: D47062031

fbshipit-source-id: 8c80e26f45bd0c31caa1c1c28f0ded93d17e004f

Enable Switch in expression fuzzer test (facebookincubator#5470)

Summary: Pull Request resolved: facebookincubator#5470

Reviewed By: bikramSingh91

Differential Revision: D47032652

fbshipit-source-id: 4fd75c68a3490e89ea04af034119afde3e197434

Creating one event base per remote function (facebookincubator#5472)

Summary:
Pull Request resolved: facebookincubator#5472

Event base manager is not setup correctly in Velox integrations like
Prestissimo. For now just creating one event base to drive the loop per
function instance (per Expr instance). In the future we can look into using a
pool of available event bases, but it doesn't seem to affect performance for
now.

Reviewed By: mbasmanova

Differential Revision: D46917299

fbshipit-source-id: 5ced044c310bee7080ed6e37d4669997cfe144bb

Remove incorrect and confusing code from ValueList (facebookincubator#5475)

Summary:
Pull Request resolved: facebookincubator#5475

> totalBytes_ += stream.size();

This code is incorrect because stream.size() returns total capacity of the stream, not how many bytes have been written.

> auto reserve = std::max<int32_t>(1024, std::min<int64_t>(128, totalBytes_));

This is confusing because the result is always 1024.

The min and max need to be swapped above, but doing this runs in to memory management bug. Hence, just replace with 1024 for now.

Reviewed By: kagamiori

Differential Revision: D47135739

fbshipit-source-id: 025b88fb88d5e0971fb0e33fbd3a309a35d6a9c5

Updating submodules

Summary:
GitHub commits:

facebook/fbthrift@76fde46
facebook/folly@3035c90
facebookincubator@195ce28

Reviewed By: bigfootjon

fbshipit-source-id: b055e4cbc28152d535d8fa3164c55f2e79379b97

Updating submodules

Summary:
GitHub commits:

facebook/folly@1f7db18
facebook/ocamlrep@992e28a

Reviewed By: bigfootjon

fbshipit-source-id: 1c5f2347cbc777a6b32372391874fc2dbe9c4f11

Add totalUsedBytes API for MemoryAllocator (facebookincubator#5474)

Summary:
Add a common interface to MemoryAllocator to retrieve the currently used bytes for the allocator. This will be used for exposing global memory usage to MemoryManager

Pull Request resolved: facebookincubator#5474

Reviewed By: xiaoxmeng

Differential Revision: D47134955

Pulled By: tanjialiang

fbshipit-source-id: 2294de1e1188d19a1d21450d57451a6853524bdd

Updating submodules

Summary:
GitHub commits:

facebook/folly@1355605
facebook/ocamlrep@176acef
facebookincubator@0e1ddb9

Reviewed By: bigfootjon

fbshipit-source-id: f4c41606a0d5ac00fc4d32c648e218689b7613d7
@652053395
Copy link

Force [email protected] path. #5426

This is happening because deps/include/openssl links to openssl -> ../Cellar/openssl@3/3.1.1/include/openssl.

hi, I still meet this problem , " error: 'HMAC_CTX_free' is deprecated ", ask for your help

@kgpai
Copy link
Contributor

kgpai commented Jan 9, 2024

@652053395 Please add some more details , like your cmake logs etc so we can help investigate. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

7 participants