Storage: Reject updates that would exceed database limits #2370
Comments
I suspect this issue might be related to the BaseX node limit (#902). However, it is difficult to determine this from the error message alone.
True; the amount of data exceeds the limits of a single database instance. A common approach is to distribute documents across multiple database instances (all of which can be addressed by a single query). However, we need to be more consistent in rejecting update operations when the database limits would be exceeded by that update.
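As a rough illustration of addressing several databases from one query, the sketch below counts the documents in each database. The names part-01 to part-03 are hypothetical and assume such databases already exist; db:get is the BaseX function that returns the documents of a named database. This is only a sketch under those assumptions and has not been checked against the version discussed in this thread.

```xquery
(: Sketch only: 'part-01' to 'part-03' are hypothetical database names. :)
let $names := ('part-01', 'part-02', 'part-03')
for $name in $names
(: db:get returns the document nodes stored in the named database :)
let $docs := db:get($name)
return <database name="{ $name }" documents="{ count($docs) }"/>
```

Keeping each database in its own iteration makes it easy to see which instance a result came from, which may help when documents are split up purely to stay below the per-database limits.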
The fundamental issue of this ticket, namely that the error message does not clearly indicate the root cause, has not been resolved. Therefore, I will keep this issue open.
@ChristianGruen Thank you for your advice! Based on your suggestion, I experimented with distributing documents across multiple databases and found that it is possible to perform cross-database statistics if the paths of all XML documents remain unique. Additionally, by binding each
Here is an example demonstrating this approach:
> basex
BaseX 11.6 [Standalone]
Try 'help' to get more information.
> CREATE DB multidb-01
> OPEN multidb-01
> ADD TO 1.xml <root><item a1='hoo11'>bar1</item><item a1='hoo12' /></root>
> ADD TO 4.xml <root><item a1='hoo41'>bar4</item><item a1='hoo42' /></root>
> CREATE DB multidb-02
> OPEN multidb-02
> ADD TO 2.xml <root><item a1='hoo21'>bar2</item><item a1='hoo22' /></root>
> ADD TO 5.xml <root><item a1='hoo51'>bar5</item><item a1='hoo52' /></root>
> CREATE DB multidb-03
> OPEN multidb-03
> ADD TO 3.xml <root><item a1='hoo31'>bar3</item><item a1='hoo32' /></root>
> ADD TO 6.xml <root><item a1='hoo61'>bar6</item><item a1='hoo62' /></root>
>
> # expected result
> XQUERY let $x1 := collection('multidb-01') let $x2 := collection('multidb-02') let $x3 := collection('multidb-03') for $x in ($x1, $x2, $x3) return db:path($x)
1.xml
4.xml
2.xml
5.xml
3.xml
6.xml
>
> XQUERY let $x1 := collection('multidb-01') let $x2 := collection('multidb-02') let $x3 := collection('multidb-03') for $x in ($x1, $x2, $x3) return $x[db:path($x)!='']/root/item[ends-with(@a1, "1")]
<item a1="hoo11">bar1</item>
<item a1="hoo41">bar4</item>
<item a1="hoo21">bar2</item>
<item a1="hoo51">bar5</item>
<item a1="hoo31">bar3</item>
<item a1="hoo61">bar6</item>
While this approach is less efficient, it seems feasible for distributing documents across multiple databases. However, I encountered unexpected results when attempting more intuitive queries. The execution engine appears to process them in a way that leads to unintended behavior, but I couldn't pinpoint the exact cause.
> # unexpected result (1)
> XQUERY for $x in (collection("multidb-01"), collection("multidb-02"), collection("multidb-03")) return db:path($x)
1.xml
4.xml
1.xml
4.xml
1.xml
4.xml
>
> # unexpected result (2)
> XQUERY let $x1 := collection('multidb-01') let $x2 := collection('multidb-02') let $x3 := collection('multidb-03') for $x in ($x1, $x2, $x3) return $x/root/item[ends-with(@a1, "1")]
<item a1="hoo11">bar1</item>
<item a1="hoo41">bar4</item>
<item a1="hoo21">bar2</item>
<item a1="hoo51">bar5</item>
<item a1="hoo21">bar2</item>
<item a1="hoo51">bar5</item>
>
> # unexpected result (3)
> XQUERY for $doc in ("multidb-01", "multidb-02", "multidb-03") let $elm := collection($doc)/root/item[ends-with(@a1, "1")] return $elm
<item a1="hoo11">bar1</item>
<item a1="hoo41">bar4</item>
<item a1="hoo11">bar1</item>
<item a1="hoo41">bar4</item>
<item a1="hoo11">bar1</item>
<item a1="hoo41">bar4</item>
>
It appears that the issue stems from how the queries are being optimized or executed internally. If you have any insights into why this is happening or suggestions for improving efficiency, I would greatly appreciate it. Thanks again for your help!
@advanceboy Thanks for your new observation. Feel free to create a new issue for it… Ideally, with an example that can be reproduced, but I imagine it may take a while to formulate it. The I’ll keep the original issue open, with a slightly updated title.
I've created #2373 about #issuecomment-2624910884
Description of the Problem
When adding a total of 1,000,000 XML files (total size: 38GiB) using the add command, a java.lang.RuntimeException: Data Access out of bounds exception occurs.
Expected Behavior
Actual Behavior
In step (2), adding dir2 and dir3 results in the following error:
In step (4), adding dir1 results in the following error:
Steps to Reproduce the Behavior
(1) create-db command.
(2) add command.
(3) create-db command.
(4) add command.
Do you have an idea how to solve the issue?
No response
What is your configuration?
The XML files are divided into the following three directories:
OS: Windows 11 22H2