Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support passing tables and closures to new threads #708

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

cshuaimin
Copy link

Hi luvit team, first thank you for this great project! Luv makes it possible for calling libuv functions in Neovim!

In this PR I changed thread_arg_* functions to support table and function types to new_thread(), queue_work() and async_send() functions.

Althrough in some feature requesting issues you said that not supporting tables is intentional (I can't find the exact reply anymore), I believe it's a very important feature to have. It will unlock possibilities like multi-threaded Neovim plugins, off-loading some CPU intensive work to a thread pool. I sincerely invite you to re-think this feature.

I mainly translated Neovim's deepcopy() function to C. It uses a cache table to properly support reused table fields.

The function type supports upvalues. Supporting functions is needed because it allows Lua OOP values to be passed to new threads.

Testing: I added test and passed valgrind, and I've tested this branch in Neovim. However not tested in Lua versions other than LuaJIT.

@SinisterRectus
Copy link
Member

SinisterRectus commented Jul 7, 2024

I can't find the exact reply anymore

Some context:

Adding table and function serializers to luv is intriguing, but they do add complexity that has been considered undesirable.

I'm not proficient enough in C to review this code. I will opine that these serializers should not be added based on the prior discussions, but I an open to being convinced otherwise.

My opinion is that, because coroutines exist, luv threads should be used sparingly in cases where states do not need to be shared across them.

Overhead and complexity aside, I'm also concerned about this adding pass-by-copy behavior for users who may not expect it.

Could you maybe show an example of using this in neovim? I'm curious to see what code would benefit from threading where copying states is important.

@cshuaimin
Copy link
Author

cshuaimin commented Jul 7, 2024

I totally understand your concerns. I'm not very sure whether this is the best solution too. However in order to use the full power of CPU cores, in my opinion this complexity is unavoidable.

I have a nvim plugin that searches a file with tree-sitter. It can only search one file. To support searching multiple files under a directory, the plugin needs to parse hundreds of files with tree-sitter, which may block nvim UI for several minutes. If I can pass tables and functions (actually the vim.treesitter.LanguageTree object) to Luv threads, then I can move parsing off main thread. Lua coroutine does not help here, because it's pure CPU work.

Aside from my own plugin, nvim's tree-sitter highlighting also runs on main thread. Opening or eding a large file with lots of tree-sitter injections was quite slow. Popular nvim distros disable tree-sitter highlights in large files. Nvim devs has already made this faster. With this PR it may be possible to make nvim UI even more responsive.

@cshuaimin
Copy link
Author

Here is the relevant code in my plugin if you are curious. It's still experimental code.

@truemedian
Copy link
Member

in my opinion this complexity is unavoidable.

This complexity is certainly avoidable in Luv. The solution you've implemented here is nothing more than a serialization and de-serialization of both tables and functions types.

Nothing is stopping you from doing this serialization outside of luv, potentially even serializing the respective table to a string entirely within Lua.

In my opinion this change is unnecessary and adds even more confusion around how luv threads work, because tables and functions are passed through the thread boundary by copy instead of by reference, which is the opposite of how the types behave normally.

From a user-facing perspective, this invisible copy of the types introduces the wrong behavior (eg. changing a table inside the thread does not change the table outside the thread; functions with upvalues behave strangely, because all upvalues are copies not references to their original values).

The most important problem I'd like to bring up is that this introduces incorrect and invalid behavior when serializing any function associated with a C library; most libraries are not thread-safe, and without a significant amount of added effort userdata will not get the correct metatable because the library was never required in the new Lua state and the metatables will not be present in the registry.

@cshuaimin
Copy link
Author

because tables and functions are passed through the thread boundary by copy instead of by reference, which is the opposite of how the types behave normally

The copy behavior isn’t that novel, the web worker API postMessage() has exactly the same deep copy semantics. We can tell users clearly the behavior in the documentation.

most libraries are not thread-safe

I had planned to fix this in a follow up PR if you are OK with this one. The idea is that to mak userdata thread safe, Luv users should 1) make sure the functions use locks properly when accessing the userdata, 2) use some reference counting mechanism to close the resource once (shared ownership). For 2) we can call a __luv_thread_copy() like method on userdata when deep copying. In the method users increase the reference counter. For example in my user case, to transfer a tree-sitter userdata I will call ts_tree_copy() in this method.

serializing the respective table to a string entirely within Lua

In Lua it’s impossible to handle userdata type, and too complex for plug-in authors. We had the API to create threads for years, but there was’t a single multi-threaded plugin for nvim.

the metatables will not be present in the registry

At least Neovim registers metatables in every Lua state, via acquire_vm_cb().

@Bilal2453
Copy link
Contributor

I am not in favor of adding this into Luv, I also believe it is an avoidable complexity that is better off being handled in Lua. But also admittedly that won't be as straight forward, and as mentioned handling userdata on the Lua side is going to be tricky to say the least.

But also threading in Luv (and Lua generally) is really lacking, and just barely works, personally I think due to the design of Lua it is really unfavorable to do any extensive threading besides crushing some numbers/strings, so any work on that side is appreciated.

How about adding support for this but outside of new_thread(), queue_work(), async_send(). For example adding new new_thread_copy, queue_work_copy, async_send_copy. This will hopefully make it very clear what it is doing with the data to not confuse the user, and at the same time it isolates the unwanted complexity into its own experimental space. Will also ensure that we don't accidentally break already working code.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants