Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: Fail on infinite recursion in encode.js #2099

Open
wants to merge 10 commits into
base: alpha
Choose a base branch
from

Conversation

ajmeese7
Copy link

Issue

Closes: #2098

Approach

Implemented the changes suggested by @mtrezza. Will give some clarity to the issues being encountered, and may illuminate further test cases that cause issues that can be handled in a better way in the future.

Tasks

Copy link

I will reformat the title to use the proper commit message syntax.

@parse-github-assistant parse-github-assistant bot changed the title fix: fail on infinite recursion in encode.js fix: Fail on infinite recursion in encode.js Mar 25, 2024
Copy link

Thanks for opening this pull request!

src/encode.js Outdated Show resolved Hide resolved
Copy link

codecov bot commented Mar 25, 2024

Codecov Report

Attention: Patch coverage is 96.29630% with 1 lines in your changes are missing coverage. Please review.

Project coverage is 99.96%. Comparing base (72bc9ac) to head (fe3a1bc).
Report is 15 commits behind head on alpha.

❗ Current head fe3a1bc differs from pull request most recent head a7333c4. Consider uploading reports for the commit a7333c4 to get more accurate results

Files Patch % Lines
src/encode.js 93.33% 1 Missing ⚠️
Additional details and impacted files
@@            Coverage Diff             @@
##            alpha    #2099      +/-   ##
==========================================
- Coverage   99.98%   99.96%   -0.02%     
==========================================
  Files          61       61              
  Lines        6185     6206      +21     
  Branches     1499     1503       +4     
==========================================
+ Hits         6184     6204      +20     
- Misses          1        2       +1     

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

Also included support for the `traverse` function, which experienced the same issue.
src/unsavedChildren.js Outdated Show resolved Hide resolved
@mtrezza
Copy link
Member

mtrezza commented Apr 2, 2024

After looking at the encode method again I wonder whether we really need to introduce a max limit of recursive calls, or whether encoding can be "fixed" to properly handle recursive properties. The method already has a seen argument which seems to be used for a similar purpose. If an object has a circular reference, then maybe it could just stay circular in its encoded form. That is, if we consider a circular reference to make sense in the first place. In fact, we even have a existing test for circular pointers, does not stack-overflow when encoding recursive pointers.

@mtrezza mtrezza mentioned this pull request Apr 3, 2024
4 tasks
@ajmeese7
Copy link
Author

@mtrezza found a method that seems to be working on my end with complex objects containing cyclical references, I will be able to fully test by installing locally from my branch later tonight. Will update once I have done so.

Let me know if there's anything else you want changed and I can take a stab at it.

@ajmeese7 ajmeese7 requested a review from mtrezza April 11, 2024 12:07
Comment on lines +1 to +21
/**
* Helper function that turns a string into a unique 53-bit hash.
* @ref https://stackoverflow.com/a/52171480/6456163
* @param {string} str
* @param {number} seed
* @returns {number}
*/
export const cyrb53 = (str, seed = 0) => {
let h1 = 0xdeadbeef ^ seed, h2 = 0x41c6ce57 ^ seed;
for (let i = 0, ch; i < str.length; i++) {
ch = str.charCodeAt(i);
h1 = Math.imul(h1 ^ ch, 2654435761);
h2 = Math.imul(h2 ^ ch, 1597334677);
}
h1 = Math.imul(h1 ^ (h1 >>> 16), 2246822507);
h1 ^= Math.imul(h2 ^ (h2 >>> 13), 3266489909);
h2 = Math.imul(h2 ^ (h2 >>> 16), 2246822507);
h2 ^= Math.imul(h1 ^ (h1 >>> 13), 3266489909);

return 4294967296 * (2097151 & h2) + (h1 >>> 0);
};
Copy link
Member

@mtrezza mtrezza Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We cannot accept this algo (or derivations thereof) due to its problematic usage rights situation. The author has claimed to have made this "public domain", but such a claim is invalid in some jurisdictions. I have not found an OS license attributed to this code by the author. If you have found that anywhere, please provide the link.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would this solution be satisfactory?

bryc/code#23 (comment)

Copy link
Member

@mtrezza mtrezza Apr 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the effort, but unfortunately the underlying issue is still unaddressed. I appreciate the author's rooting for public domain contributions. However, the issue is a legal one. Unless there's an official OS license attributed to the code, we cannot accept it. We also cannot accept any "self-written", non-official OS license on similar grounds.

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@mtrezza is the MIT license addition satisfactory?

Copy link
Member

@mtrezza mtrezza Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, but SO is not sufficient as source or link to the author. Has this been published anywhere on GitHub, and can this be added as a dependency? Because if we add code like this it won't be maintained by the author and the maintenance is on us. I'm not sure whether we should maintain this piece of code if there are libs for hash functions out there that are regularly maintained.

Why is this hash code even required in this PR? Are there no native Node hash functions we can use instead? This looks odd.

output[k] = encode(value[k], disallowObjects, forcePointers, seen, offline);
try {
// Attempts to get the name of the object's constructor
// Ref: https://stackoverflow.com/a/332429/6456163
Copy link
Member

@mtrezza mtrezza Apr 13, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Same topic as earlier. If you copy/pasted someone else's code from SO then we cannot accept it.

Edit: Now that I looked at the link, this is a basic JS feature, there is really no need to reference SO here.

However, the above still holds true. I'd ask you to review the whole PR at this point and identify any code that has been copied from somewhere else, so we can investigate whether we can accept it.

Copy link
Member

@mtrezza mtrezza left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have more and more doubts that we can accept this PR without a performance comparison. Encode is such a central code that any changes here can have a significant yet subtle impact on Parse Server performance. Perf issues are tough to analyze after the fact, so I'm somewhat hesitant.

@ajmeese7
Copy link
Author

A fix might still be worth implementing here, but I don't see myself having the cycles to work on this PR in the near future.

HOWEVER, I did figure out a cause of all these errors in the legacy code I inherited, if it helps anyone else. The cloud functions were returning the response of an Axios GET request, and the response object has infinite recursion in its properties with sockets. What I had to do was just parse out the data.result properties from the response instead of sending them across the wire with Parse Server, then I was good to go :)

Feel free to close this out if you want, so someone else can come in with a different implementation that you would prefer.

@mtrezza
Copy link
Member

mtrezza commented Apr 29, 2024

It would be too bad, since you've already put so much work into it. Thinking about it, adding the recursion limit that you implemented initially may be the least intrusive one and I'd think poses the least risk to performance. It's the simplest fix. But only if we set the limit ridiculously high (10k?), we can treat it as a non-breaking change, if we assume that nested objects with 10k levels deep are rather unlikely.

On the other hand the solution you implemented would not require the limit and properly handle recursions, as we already do in other parts of the code. But with hash generation, etc we run into perf issues, so we may need a perf comparison test. In fact we have parse-community/parse-server#7610 for that.

@mtrezza
Copy link
Member

mtrezza commented Jun 26, 2024

Picking this up, I think there are 2 ways to move forward:

  1. Merge the simplified version where the recursion is simply counted and limited to a fixed amount. This is unlikely to have any significant perf impact, as it's just a var counter. If we set the recursion count limit high enough we also don't have to consider it a breaking change. Probably up to d33a54a.

  2. Properly handle recursion similar to how recursion is currently handled in that part of the code. That would be similar to what the current state of the PR is, but requires addressing open issues such as hash algo.

I would be hesitant to merge (2) without having a perf comparison in our CI, because of #2099 (review). So for now we could go ahead and merge (1), so we at least fix the issue of infinite recursion.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Infinite loop in encode.js
2 participants