v0.58.1, 100M acc/100M NFT NLG load test on Latitude
new warning:
2025-01-04 07:21:06.147 231130 WARN EXCEPTION <<virtual-map: cache-cleaner #4>> StandardFuture: Future has already been cancelled
com.swirlds.common.threading.futures.StandardFuture.cancelWithError(StandardFuture.java:367)
at com.swirlds.common.threading.futures.StandardFuture.cancelWithError(StandardFuture.java:351)
at com.swirlds.virtualmap.internal.cache.ConcurrentArray.lambda$parallelTraverse$0(ConcurrentArray.java:357)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1144)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:642)
at java.lang.Thread.run(Thread.java:1583)
@poulok No visible side effects or performance degradation observed. However, as this warning is new and may indicate an unhealthy state, we need a proper evaluation from engineers here.
There are three call sites of ConcurrentArray.parallelTraverse(), the method where the exception was thrown:
VirtualNodeCache.deletedLeaves() - a part of a flush, called on the lifecycle thread
VirtualNodeCache.filterMutations() - a part of a flush, called on the lifecycle thread
VirtualNodeCache.purge() - a part of node cache release, called on a thread in the cleaning thread pool
In the first two cases, the future returned from parallelTraverse() is then checked for exceptions using getAndRethrow(). Any exception rethrown this way on the lifecycle thread would be very visible in the logs (and would most likely kill or hang the process). Therefore I assume the exception reported in this bug was thrown from purge(). A failed purge may leak memory, but it would take a lot of such exceptions for the leak to result in an OOME.
I don't know what could go wrong during purge(), so as a first step I'm going to improve logging in parallelTraverse(), so that all underlying exceptions are properly propagated.
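To illustrate what "properly propagated" could mean here, below is a minimal sketch. It does not use the real ConcurrentArray or StandardFuture code: it assumes java.util.concurrent.CompletableFuture as a stand-in, and the names ParallelTraverseSketch, parallelTraverse and awaitFailure are hypothetical. The idea is that the first failing task records its exception in the shared future, later failures do not overwrite it, and a caller that inspects the future can see the underlying cause.

```java
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.CompletionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.atomic.AtomicInteger;
import java.util.function.IntConsumer;

public class ParallelTraverseSketch {

    // Hypothetical stand-in for a parallel traversal: runs `action` once per index
    // on `pool`. The returned future completes exceptionally with the first failure,
    // or normally once every task has finished.
    public static CompletableFuture<Void> parallelTraverse(
            int tasks, ExecutorService pool, IntConsumer action) {
        CompletableFuture<Void> result = new CompletableFuture<>();
        if (tasks == 0) {
            result.complete(null);
            return result;
        }
        AtomicInteger remaining = new AtomicInteger(tasks);
        for (int i = 0; i < tasks; i++) {
            final int index = i;
            pool.execute(() -> {
                try {
                    action.accept(index);
                } catch (Throwable t) {
                    // completeExceptionally() returns false if the future is already
                    // completed, so a second failure cannot replace the first cause.
                    result.completeExceptionally(t);
                } finally {
                    if (remaining.decrementAndGet() == 0) {
                        // No-op if a task has already failed the future.
                        result.complete(null);
                    }
                }
            });
        }
        return result;
    }

    // Waits for the traversal and returns the underlying cause, or null on success.
    // A caller that never inspects the future never sees the failure.
    public static Throwable awaitFailure(CompletableFuture<Void> future) {
        try {
            future.join();
            return null;
        } catch (CompletionException e) {
            return e.getCause();
        }
    }

    public static void main(String[] args) {
        ExecutorService pool = Executors.newFixedThreadPool(4);
        try {
            CompletableFuture<Void> future = parallelTraverse(8, pool, i -> {
                if (i % 2 == 1) {
                    throw new IllegalStateException("task " + i + " failed");
                }
            });
            // Which odd-numbered task fails first depends on thread scheduling.
            System.out.println("underlying cause: " + awaitFailure(future));
        } finally {
            pool.shutdown();
        }
    }
}
```

In this sketch, the case where the result is never inspected corresponds to what I assume happens for purge(): the failure stays inside the future unless something logs or rethrows it.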
Description
v0.58.1, 100M acc/100M NFT NLG load test on Latitude
new warning:
Reproduces on all nodes.
Log: https://perf.analytics.eng.hashgraph.io/ephemeral/v0.58.1_Latitude_N1_01042025/network-node1_swirlds.log
The warning was not present in v0.58.0.
Steps to reproduce
Regular 100M NLG test
Additional context
All logs: https://perf.analytics.eng.hashgraph.io/ephemeral/v0.58.1_Latitude_N1_01042025
Hedera network
other
Version
v0.58.1
Operating system
Linux