
Speed Improvements #30

Open
blutooth opened this issue Aug 9, 2017 · 5 comments
blutooth commented Aug 9, 2017

Hey guys,
Thanks for the great implementation; it has really helped me visualise the data I'm working on.
However, I can see two ways in which the implementation could be improved:
(1) NMSLib can build the k-NN graph about 10 times as quickly as Annoy, and can serve about 10 times as many queries per second.
(2) When computing the model's objective, you could use a GPU library like PyTorch and compute it in batches. This could speed up the calculations by a large factor if large batches are feasible.

I'd be willing to work on this as a project, if you're up for it. I'm not sure about the optimisation tricks used in (2) when training the objective, so I'd be less inclined to implement that part by myself.

Thanks,
Max


buaawht commented Dec 17, 2017

Hi, there is an issue puzzling me. In LargeVis, -fea specifies whether the input file contains high-dimensional feature vectors (1) or a network (0); the default is 1. I have a file of feature vectors, and when I run LargeVis I get a set of 2-D vectors. I want to know whether the order of the generated 2-D vectors matches the order of the original feature vectors. I tried to read the source code, but I could not find the answer. Thank you.

@blutooth (Author)

Yes, the order is preserved.

@bigheiniu

Have you guys ever implemented this on the Spark platform? I would like to implement LargeVis on a distributed system to handle very large amounts of data.

tangjianpku (Collaborator) commented Jan 17, 2018 via email

@bigheiniu

@blutooth Regarding the second suggestion: do you mean using batched gradient computation on a GPU, rather than the SGD in the existing code, to speed things up?
And could you give some hints or reading material about that second suggestion?
Thanks
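Not speaking for the authors, but suggestion (2) can be sketched without GPU code: each positive edge in the LargeVis objective contributes -log(1/(1+d²)) and each negative sample contributes -γ·log(1 − 1/(1+d²)), and both terms can be evaluated for a whole batch of index pairs with array operations. The function name and structure below are assumptions for illustration, written in plain NumPy rather than the repository's C++ training code:

```python
import numpy as np

def largevis_batch_loss(Y, pos_pairs, neg_pairs, gamma=7.0, eps=1e-12):
    """Batched LargeVis-style loss over low-dimensional embeddings Y.

    Y: (n, 2) array of embedding coordinates.
    pos_pairs / neg_pairs: (m, 2) integer arrays of (i, j) index pairs.
    gamma: weight on negative samples (7 is the paper's default).
    """
    # positive edges: p(i,j) = 1 / (1 + d^2), so -log p = log(1 + d^2)
    d2_pos = np.sum((Y[pos_pairs[:, 0]] - Y[pos_pairs[:, 1]]) ** 2, axis=1)
    pos_term = np.log1p(d2_pos)

    # negative samples: -gamma * log(1 - p) = -gamma * log(d^2 / (1 + d^2))
    d2_neg = np.sum((Y[neg_pairs[:, 0]] - Y[neg_pairs[:, 1]]) ** 2, axis=1)
    neg_term = -gamma * np.log(d2_neg / (1.0 + d2_neg) + eps)

    return pos_term.sum() + neg_term.sum()

# sanity check: pulling a positive pair closer lowers the loss
Y_near = np.array([[0.0, 0.0], [0.1, 0.0], [5.0, 5.0]])
Y_far  = np.array([[0.0, 0.0], [3.0, 0.0], [5.0, 5.0]])
pos = np.array([[0, 1]])
neg = np.array([[0, 2]])
loss_near = largevis_batch_loss(Y_near, pos, neg)
loss_far  = largevis_batch_loss(Y_far, pos, neg)
```

Swapping `np` for `torch` and moving `Y` to a GPU would give the batched-gradient variant asked about above; whether that beats the asynchronous SGD in the existing code depends on how large a batch the data allows.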

4 participants