Write_Performance
In addition to storing native (pickle) data, Newt converts native data to JSON and saves it. I used zodbshootout to evaluate the impact of saving this extra data. The tests were run with:
- A Mac with a 2.3 GHz quad-core Intel i7 processor, 16GB of RAM, and an SSD.
- Python 3.5
- A history-free RelStorage configuration. This is much faster than history-preserving, should be the default, and is the default configuration for Newt DB.
For the initial test, I compared regular RelStorage (RS) with Newt using its default JSON index. Here are the results for various zodbshootout settings. The values are numbers of objects read or written, so larger numbers are better.
concurrency: 2, object size: 96

| What | RS | Newt |
|---|---|---|
| Add 1000 Objects | 13027 | 5182 |
| Update 1000 Objects | 12140 | 5074 |
| Read 1000 Cold Objects | 10567 | 10283 |

concurrency: 4, object size: 96

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 2411 | 1699 |
| Update 10 Objects | 2465 | 1819 |
| Read 10 Cold Objects | 9034 | 8881 |

concurrency: 4, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1831 | 1298 |
| Update 10 Objects | 1862 | 1372 |
| Read 10 Cold Objects | 7548 | 7404 |

concurrency: 8, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1218 | 1110 |
| Update 10 Objects | 1146 | 1041 |
| Read 10 Cold Objects | 10052 | 9795 |

concurrency: 16, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1555 | 1061 |
| Update 10 Objects | 1939 | 1071 |
| Read 10 Cold Objects | 9202 | 7977 |

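To make the tables easier to compare, it can help to reduce each row to a Newt/RS ratio. The following is a minimal sketch of that arithmetic; the figures are copied from the tables above, while the `RESULTS` layout and the `newt_ratio` helper are names made up for this illustration.

```python
# Figures copied from the tables above:
# (concurrency, object size, operation, RS, Newt)
RESULTS = [
    (2, 96, "add", 13027, 5182),
    (2, 96, "update", 12140, 5074),
    (2, 96, "read", 10567, 10283),
    (4, 96, "add", 2411, 1699),
    (4, 96, "update", 2465, 1819),
    (4, 96, "read", 9034, 8881),
    (4, 999, "add", 1831, 1298),
    (4, 999, "update", 1862, 1372),
    (4, 999, "read", 7548, 7404),
    (8, 999, "add", 1218, 1110),
    (8, 999, "update", 1146, 1041),
    (8, 999, "read", 10052, 9795),
    (16, 999, "add", 1555, 1061),
    (16, 999, "update", 1939, 1071),
    (16, 999, "read", 9202, 7977),
]


def newt_ratio(rs, newt):
    """Newt throughput as a fraction of RelStorage throughput (1.0 = equal)."""
    return newt / rs


for concurrency, size, op, rs, newt in RESULTS:
    print("c={:2d} size={:3d} {:6s} newt/rs={:.2f}".format(
        concurrency, size, op, newt_ratio(rs, newt)))
```

For adds, the ratio works out to roughly 0.40 at concurrency 2 and roughly 0.91 at concurrency 8, which gives a sense of how much the gap depends on the settings. Given the run-to-run variation, small differences between ratios should not be over-interpreted.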
Some things to note:
- At concurrency 16, the Mac's CPU was often fully consumed.
- Newt writes are significantly slower than writes with regular RelStorage on Postgres.
- Times can vary quite a bit, so only really large differences should be considered significant. In the results above, Newt reads seem to be slightly slower, but results in the next section show the opposite, and the regular RelStorage results there vary quite a bit from those above.
- The default zodbshootout settings use very large transactions consisting of extremely small objects. In my experience, transactions typically involve smaller numbers of larger objects.

  Note that ZODB applications that make aggressive use of ZODB-based catalogs will often involve larger numbers of objects due to the many BTrees that must be updated. Newt applications won't use ZODB-based catalogs and will tend to have much smaller transactions.
- Newt adds two additional components of load:
  - On the client: computation of a JSON representation.
  - On the server: saving data to an additional table and updating a JSON index.

  For throughput, client computations aren't as problematic, because they can happen in parallel. zodbshootout measures throughput, and we can see smaller differences between Newt and regular RelStorage as concurrency increases, until CPU resources are exhausted.
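The client-side component can be illustrated with a rough timing sketch that compares serializing a state once (pickle only) with serializing it twice (pickle plus JSON). This is only an analogy using the standard library: it is not how Newt produces its JSON, and the `state` dictionary and both helper functions are hypothetical stand-ins.

```python
import json
import pickle
import timeit

# Hypothetical stand-in for the state of a small persistent object.
state = {"title": "example", "data": "x" * 96, "tags": ["a", "b", "c"]}


def pickle_only(obj):
    """Serialize once, as a plain pickle-based store would."""
    return pickle.dumps(obj, protocol=3)


def pickle_and_json(obj):
    """Serialize twice: a pickle plus an extra JSON representation."""
    return pickle.dumps(obj, protocol=3), json.dumps(obj)


n = 10000
t_pickle = timeit.timeit(lambda: pickle_only(state), number=n)
t_both = timeit.timeit(lambda: pickle_and_json(state), number=n)
print("pickle only:   {:.3f}s for {} objects".format(t_pickle, n))
print("pickle + JSON: {:.3f}s for {} objects".format(t_both, n))
```

The absolute numbers depend on the machine and the object shape. The point is only that the extra representation is per-object CPU work on the client, which is the kind of work that separate client processes can do in parallel.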
Most applications are likely to use additional indexes, especially text indexes. A text index was added on the `data` attribute of the test data to look at the impact of additional indexes:
concurrency: 4, object size: 96

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1751 | 1312 |
| Update 10 Objects | 1780 | 1382 |
| Read 10 Cold Objects | 8825 | 8704 |

concurrency: 4, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1571 | 728 |
| Update 10 Objects | 1613 | 640 |
| Read 10 Cold Objects | 7040 | 7530 |

concurrency: 8, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1170 | 579 |
| Update 10 Objects | 1138 | 543 |
| Read 10 Cold Objects | 9702 | 9759 |

concurrency: 16, object size: 999

| What | RS | Newt |
|---|---|---|
| Add 10 Objects | 1184 | 450 |
| Update 10 Objects | 1076 | 489 |
| Read 10 Cold Objects | 9245 | 10236 |

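Because the regular RelStorage numbers differ between the two sets of runs, one way to isolate the effect of the text index is to compare Newt/RS ratios within each run rather than raw Newt numbers across runs. A small sketch of that comparison for the "Add 10 Objects" rows at object size 999 follows; the figures come from the tables in this page, and the `ADD_999` structure and `ratio` helper are illustrative names only.

```python
# "Add 10 Objects" at object size 999, as (RS, Newt) pairs keyed by
# concurrency. "json_only" is the first set of runs (default JSON index);
# "text_index" is the set of runs with the extra text index.
ADD_999 = {
    4: {"json_only": (1831, 1298), "text_index": (1571, 728)},
    8: {"json_only": (1218, 1110), "text_index": (1170, 579)},
    16: {"json_only": (1555, 1061), "text_index": (1184, 450)},
}


def ratio(pair):
    """Newt throughput as a fraction of RS throughput within one run."""
    rs, newt = pair
    return newt / rs


for concurrency in sorted(ADD_999):
    runs = ADD_999[concurrency]
    print("c={:2d} json-only={:.2f} text-index={:.2f}".format(
        concurrency, ratio(runs["json_only"]), ratio(runs["text_index"])))
```

At every concurrency level listed, the ratio with the text index is well below the ratio without it, which is a large enough difference to stand out from the run-to-run noise.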
Some things to note:
- Adding a text index significantly reduced Newt write performance.
- Basic Newt reduces write performance. Much of the cost is on the client, which is mitigated by parallelism with more clients.

  IMO, the slowdown isn't enough to be of significant concern for the basic case.
- Adding a text index slows things even further. Presumably, adding more indexes would slow processing further. For applications that need high write throughput and especially low write latency, this could be a problem.

  Something to keep in mind, however, is that if the alternative is ZODB-based catalogs, these additional indexes, implemented as catalogs, are likely to:
  - Require much more client-side computation,
  - Cause many more objects to be written,
  - Put more load on client caches, causing much more client load, and
  - Cause many more conflicts to be resolved and transactions to be retried.
It would be interesting to compare updating catalog-based indexes with Postgres-based indexes.
Newt will provide the ability to generate JSON and update indexes asynchronously. This should reduce write latency, but perhaps not help throughput, since the same work will need to be done.
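The latency-versus-throughput point can be sketched with a generic background-worker pattern: the committing code only enqueues work and returns, while a separate thread does the JSON conversion later. This is not Newt's mechanism, which isn't described here; `make_async_updater`, `commit_async`, and the in-memory `store` are hypothetical names used only to show the shape of the idea.

```python
import json
import queue
import threading


def make_async_updater(store):
    """Start a worker that converts queued states to JSON in the background."""
    pending = queue.Queue()

    def worker():
        while True:
            item = pending.get()
            if item is None:          # sentinel: shut the worker down
                pending.task_done()
                break
            oid, state = item
            store[oid] = json.dumps(state)   # the deferred JSON work
            pending.task_done()

    thread = threading.Thread(target=worker, daemon=True)
    thread.start()
    return pending, thread


def commit_async(states, pending):
    """Return quickly: the JSON work is queued, not done here."""
    for item in states:
        pending.put(item)


store = {}
pending, thread = make_async_updater(store)
commit_async([(1, {"data": "a"}), (2, {"data": "b"})], pending)
pending.join()        # wait until the background work has caught up
pending.put(None)     # stop the worker
thread.join()
print(store)
```

The caller of `commit_async` sees lower latency because it no longer waits for the conversion, but every state is still converted exactly once, so the total amount of work is unchanged.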