Skip to content
This repository has been archived by the owner on Aug 15, 2019. It is now read-only.

Add cholesky and gradients #1492

Open
wants to merge 2 commits into
base: master
Choose a base branch
from

Conversation

ethanluoyc
Copy link

@ethanluoyc ethanluoyc commented Jan 10, 2019

This change is Reviewable

@googlebot
Copy link

Thanks for your pull request. It looks like this may be your first contribution to a Google open source project (if not, look below for help). Before we can look at your pull request, you'll need to sign a Contributor License Agreement (CLA).

📝 Please visit https://cla.developers.google.com/ to sign.

Once you've signed (or fixed any issues), please reply here (e.g. I signed it!) and we'll verify it.


What to do if you already signed the CLA

Individual signers
Corporate signers

@DirkToewe
Copy link
Contributor

DirkToewe commented Jan 10, 2019

@ethanluoyc Have You looked at PR#1356? It already contains a Cholesky Decomposition. Your implementation is a lot cleaner and uses TFJS methods as building blocks, which is a big upside. But as of yet, PR#1356 seems to be significantly faster. The Benchmark is done on CPU backend since PR#1492 runs faster there than on WebGL on my machine. Feel free to double check the benchmark since I'm known to make mistakes and mix things up.

P.S.: I hope its okay that I included Your code in the Benchmark. I specifically mentioned the source and exempted it from the license.

@ethanluoyc
Copy link
Author

ethanluoyc commented Jan 10, 2019

Hi @DirkToewe,

Thanks for letting me know your PR! I checked the issues page but didn't see any mentions of cholesky so I thought no one has tackled this.

The version currently I have here is WIP. One downside is indeed the memory issue. I think mine has not been optimized for that yet. I guess maybe we can work together on implementing cholesky and merge it into the codebase.

BTW, do you have the gradients implemented? I have a preliminary version which I think is correct (more testing needed still). Another thing I haven't implemented is the blocked version of the algorithm, same for the gradients. I need the differentiable version because I use it as part of my models (Gaussian processes).

Updates
@DirkToewe I took a closer look at your benchmarks. I appear to be using the algorithm from wikipedia? I am using the one mentioned in https://homepages.inf.ed.ac.uk/imurray2/pub/16choldiff/choldiff.pdf, which theoretically can be made faster by blocking and in-memory updates. I currently do not know the best way to implement them
idiomatically in tfjs but I guess maybe someone from tfjs can advise?

@googlebot
Copy link

CLAs look good, thanks!

@DirkToewe
Copy link
Contributor

Yes gradients are implemented symbollically, see here. But it requires quite a few new operations (bandPart, triangularSolve, setDiag, ...) to be introduced, which is why the PR became so massive.

@DirkToewe
Copy link
Contributor

Updates
@DirkToewe I took a closer look at your benchmarks. I appear to be using the algorithm from wikipedia? I am using the one mentioned in https://homepages.inf.ed.ac.uk/imurray2/pub/16choldiff/choldiff.pdf, which theoretically can be made faster by blocking and in-memory updates. I currently do not know the best way to implement them
idiomatically in tfjs but I guess maybe someone from tfjs can advise?

That's correct. It was the implementation of choice for me because:

  • it is simple and
  • it calculates elements in 64-bit before rounding to float32
    (In NDJS, Kahan Summation was used to potentially improve float64 precision as well).

But there's no doubt there's room for performance improvements. If You want to do blocked operations, You're going to have to work with nested loops. As an example for that, take a look at the CPU MatMul implementation.

But a slight word of warning: From my experience, micro-optimizations will not yield as much performance as You may hope for.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants