-
-
Notifications
You must be signed in to change notification settings - Fork 263
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Feature: autogenerate profiling code for packer_compiled #221
Conversation
🙌🏿 this looks great @wbthomason, regarding some of the todos:
string.format('Chunk name: %s -------- Time in ms %d ----------- metadata %s)
|
@wbthomason just revisited this today since I'm seeing a lot of slowness in my config I use plenary's profile module to check
I think something like this will be quite useful since as you can see from the profile there are quite a few lua plugins doing expensive things on startup. A user could use something like this to start to figure out what's going on and what needs optimisation or an issue raised on a plugin's repo |
@akinsho I implemented a basic It would probably also be good to add line number information to the profile output. |
@wbthomason I think the question of if it should go in every file is one of the cost of doing so. If it's negligible than I think its useful not to have an extra layer that someone would have to discover in order to unearth this feature. If it's costly then a flag of some kind is a good idea. I think it'd be cool to just point a user at the compiled file so they could just inspect profile data without a bunch of faff but I'm guessing the profiling maybe adds time onto startup so is at least a little costly |
Ah, do you mean outputting the profile information as a comment to the end of the compiled file or something? That would be neat. I should probably also implement sorting. Regarding the overhead of profiling: I should run experiments to see if this is significant, but the basic overhead is 2 function calls per timed "chunk". Each of these calls is pretty lightweight - basically gets the time and accesses a table with string keys - but this could add up. |
@wbthomason I just tried this out since I'm really trying to get on top of my config's startup time which is a constant point of annoyance for me 😅 . It works quite well I had a I could push some commits to this branch to work on tidying up the formatting a bit if you're open to that. Was thinking that if maybe that's too much for a firs pass at this though 🤷🏿 |
@wbthomason I tinkered with this a little locally to get it into a display. I also changed the sort order to slowest first. I was immediately able to pick up a buggy personal plugin I'm developing as the culprit of my startup time which was super helpful. I think this is quite valuable with a few little tweaks. I can push (well probably a branch into your branch since I can't push to your branch) what I changed locally it's not a lot. I could potentially add some syntax rules for just this buffer to make the times and types of event stand out a bit better. Regarding usage I think using the EDIT: then again perhaps profiling is a niche enough thing to want to do to warrant specifying something in your config, alternatively thinking about plugins like startuptime that don't require anything like the above maybe there can just be a single command that runs |
@akinsho This all sounds good, thanks for tackling the formatting (I admit, output formatting/display is some of my least favorite code to work on for I'm happy to just add you as a collaborator on the repo, if you'd like - you do a lot of work for |
Regarding how to enable/disable the feature: I think I favor a |
Oh, and for the concern over multiple sourcing: we can add a global flag to prevent the issue. |
Sounds good @wbthomason I'd be happy to be added as a collaborator, it's an interesting problem space and I'd like to help out when I can. In which case I'll tidy up what I've got locally and push to this branch. I can add in the config variable as well as the command + argument. Regarding the multiple sourcing thing I guess it's more a question of if you can reliably re-run |
This is so a user can set in their config if packer should profile by default or per command
4a3d493
to
c365b67
Compare
@wbthomason heads up that I've rebased this on master and added in my changes. So a user can set config = {
--- blah
profile = {
enable = true
}
} as well as :PackerCompile profile=true
:PackerCompile profile=false I decided to go with One issue I've just noticed though is that the results aren't consistent across multple runs e.g. Start Anyway this is still pending some syntax to make it a little nicer to read and maybe confirming the times are consistent. |
Hmm, I am presumably doing something wrong, but following your example ("Start |
@wbthomason you need to set the enabled Boolean in your config or run |
Right, I did that - hence the generation of the profiling code. Should there be a display with the results? |
So the full flow would be compile with profile true as you've done the run |
Ahhh I think it might relate to my use of
|
Ok I'm convinced now that it relates to how the quoted command works, when I run the compile with the flags in an already started neovim it works fine, think that's an edge case to investigate later if necessary maybe. I've updated the docs as well so I think generally this fine, might be missing something. Think we can always iterate once it's out granted there are no lingering bugs or anything. |
Cool, thanks. The only thing I think we still might want to consider before merging is whether we should always output the profiling code and conditionally enable it (as the current PR does) or conditionally output always enabled profiling code (e.g. if the profiling code imposes a significant slowdown on startup). I'll try to run some tests on my setup tonight to see if there's a significant difference (with profiling disabled) between my startup time on this branch or |
I personally think it would be great to implement without the need to recompile. I find having to think about whether or not my compiled file is right or not unwieldy. Would be nice to remove that obstruction. But I get the need to be mindful of performance. |
I agree, but since I'm already intending to move away from manual recompilation (by introducing the plugin manifest we talked about in #248 (comment) etc.), I'm slightly more OK with operations that require a recompile, as that'll be made automatic soon (I'm hoping to pull together a draft PR for the manifest + automatic recompilation this weekend, at least the start of those changes). This could also reasonably be changed in a follow-up PR, but I'd like to see if it's even an issue first (i.e. if there's any significant overhead added). |
....actually, just did some preliminary time sampling, and I'm seeing at most ~1ms overhead for profiling enabled (not even disabled but profiling present). If you can confirm that there's similarly basically no overhead on your end, I'm happy to merge as-is. |
Those changes sound interesting, and like a real quality of life improvement. Regarding the changes here, by merging as is, do you mean with the compiled profiling always being present since at the moment it isn't it or do you mean taking out the conditional compilation? I was actually wondering idly, why technically does packer need a compiled file at all. Think I've missed some steps but most of the lazy loading is via fake commands, mappings and autocommands which can all be created on startup? |
Isn't the compiled profiling always present right now? I don't see a switch to e.g. make
So, you're absolutely right that you can generate all of those on startup, and this is indeed what e.g. Part of my motivation for designing I think there's a fair argument that the compiled file has, for a solid chunk of users, been more trouble than it's worth so far. I think that once compilation no longer needs to be manual, though, this will change. |
@wbthomason thanks for the explanation sounds like solid reasoning, can definitely see the argument for decoupling optimisations to a compilation stage separate from the startup. You're right re. The compiled timed functionality I guess it's just whether or not saving it is enabled or not which begs the question if we leave this in why bother with enabling or disabling it anyway. It's always running so a user can just decide to look at it or not. We can just skip all that complexity? EDIT: to clarify my thoughts here a bit better I mean that basically we are already doing the work of profiling the only difference that adding this flag does is whether or not the output, which we've already got but not collated, gets saved to the global variable or not. In which case why bother with a config flag and arguments to compile if ultimately we are already always doing almost all the work. If a user doesn't want to see profile data then they just won't use |
Just ran this with profiling enabled and disabled and measured using EDIT: it's worth noting that my startup time fluctuates quite a bit between runs of this plugin under normal circumstances anyway so it's hardly scientific 😆 |
Great, thanks! I want to make a couple of small tweaks (e.g. printing a helpful message if there's no profiling data), then I think we're OK to merge.
Huh, this surprises me. My setup's startup time is quite stable (~1-2ms usually) under
Yeah, given that the overhead seems to be super minimal, I'm pretty OK with this. I'll push some changes to make this so. |
Doesn't your start up time depend on the amount of buffers you have open?
Like depending on your last session
…On Sat, Apr 10, 2021, 22:10 Wil Thomason ***@***.***> wrote:
Just ran this with profiling enabled and disabled and measured using
dstein/startuptime and not seeing any change over a ms in my startup time.
Great, thanks! I want to make a couple of small tweaks (e.g. printing a
helpful message if there's no profiling data), then I think we're OK to
merge.
it's worth noting that my startup time fluctuates quite a bit between runs
of this plugin under normal circumstances anyway so it's hardly scientific
Huh, this surprises me. My setup's startup time is quite stable (~1-2ms
usually) under startuptime. Do you have nondeterministic parts of your
config or something similar?
If a user doesn't want to see profile data then they just won't use
PackerProfile otherwise it's always there.
Yeah, given that the overhead seems to be super minimal, I'm pretty OK
with this. I'll push some changes to make this so.
—
You are receiving this because you are subscribed to this thread.
Reply to this email directly, view it on GitHub
<#221 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD7WGR36UNOPY3TD335MP3TTIC5EZANCNFSM4XXAOXNA>
.
|
@dagadbm I don't think your startup time does depend on open buffers since it should normally be time till vim opens the first buffer according to |
I use session plugins yeah.
So my slow down is mostly one plugin that is executing on an open buffer
then. I get a random error with no stack trace and no way to understand wtf
is happening and where it's coming from. It's madening
…On Mon, Apr 12, 2021, 09:36 Akin ***@***.***> wrote:
@dagadbm <https://github.com/dagadbm> I don't think your startup time
does depend on open buffers since it should normally be time till vim opens
the first buffer according to :h --startuptime sourcing a session is
something that isn't done by default so vim shouldn't normally open with a
bunch of buffers. Unless vim is started with -S or a plugin sources the
session file vim startup wouldn't include sessions
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#221 (comment)>,
or unsubscribe
<https://github.com/notifications/unsubscribe-auth/AD7WGR42VAOTX2HD3BLJKZTTIKWIRANCNFSM4XXAOXNA>
.
|
@dagadbm If this slowdown is @akinsho I tested the overhead of enabled profiling on another machine, and saw a ~10ms slowdown, which is pretty significant. I'm trying to think about how to enable/disable profiling without requiring a recompile to work around this, before merging. |
Wow @wbthomason that's much larger than I would have thought likely, was it an older machine? it's just checking the time for the most part and doing a very small loop 🤔 . Anyway if it might impact performance that much makes sense to see if there's a way to avoid that. |
Not that old (a few years) - and this was measured between starting Neovim with profiling enabled and with profiling disabled. I should confirm that the slowdown wasn't some kind of fluke - I've as yet been unable to find a way to enable profiling without e.g. using an environment var or requiring a recompile. |
@wbthomason well since we are already in the recompilation track and that's what users know then I guess the solution would be to not compile in the timing code unless profile is set to true? Not sure how easy that will be to do but I if that's the path we have to go down I think it's fair enough to get this out in some form. Maybe it can be optimised later or your recompilation changes will make the compilation part less of of an overhead |
This works for me! I haven't noticed a significant performance impact on the startup time. I decided to use this because I was wondering why my neovim was taking ~350ms to start. This PR pointed the way and resulted in kabouzeid/nvim-lspinstall#38 being created. |
That's awesome @nogweii was actually looking into that i.e. lspinstall as well using this PR but hadn't got round to doing anything, glad you were able to track that down and get it fixed 🚀 . @wbthomason I could push some changes to this to make |
@akinsho The overhead is only there if we have profiling enabled always; we currently Also @nogweii I'm glad to hear this is already being useful! |
🤦🏿♂️ yep forgot that it already is/was a noop if disabled in which case I think it's fine to merge as is. I think people can decide whether or not to use it if they are seeing an impact in startup time/ just want it temporarily and as you say the ergonomics can be improved over time 👍🏿 |
Sounds good to me. Thanks for all the help! |
This PR adds timing code wrapping each "chunk" of the compiled output, which runs iff the environment variable PACKER_PROFILE is present and true. This allows a simple output of timing information for each part of the compiled loaders.
This is still WIP. The main things it needs are:
timed_chunk
function, which is simple except for figuring out what name to use for each chunk[ ] Concatenate output to compiled filePinging @akinsho re: discussion on #191.