Grakn out of memory due to non closed transactions #303

@SamuelHassine

Description

After some time running the OpenCTI platform, Grakn becomes unavailable due to an OOM exception:

2019-10-28 01:51:52,586 [transaction-listener-0] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener: 
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
	at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
	at org.janusgraph.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:269)
	at org.janusgraph.graphdb.database.StandardJanusGraph.edgeQuery(StandardJanusGraph.java:437)
	at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$null$0(SimpleVertexQueryProcessor.java:120)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:82)
	at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.lambda$getBasicIterator$1(SimpleVertexQueryProcessor.java:120)
	at org.janusgraph.graphdb.vertices.CacheVertex.loadRelations(CacheVertex.java:67)
	at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.getBasicIterator(SimpleVertexQueryProcessor.java:120)
	at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.iterator(SimpleVertexQueryProcessor.java:77)
	at com.google.common.collect.Iterables$5.iterator(Iterables.java:725)
	at org.janusgraph.graphdb.query.vertex.SimpleVertexQueryProcessor.vertexIds(SimpleVertexQueryProcessor.java:100)
	at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeIndividualVertices(BasicVertexCentricQueryBuilder.java:337)
	at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder.executeVertices(BasicVertexCentricQueryBuilder.java:331)
	at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:242)
	at org.janusgraph.graphdb.query.vertex.BasicVertexCentricQueryBuilder$VertexConstructor.getResult(BasicVertexCentricQueryBuilder.java:238)
	at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.execute(VertexCentricQueryBuilder.java:86)
	at org.janusgraph.graphdb.query.vertex.VertexCentricQueryBuilder.vertices(VertexCentricQueryBuilder.java:114)
	at org.janusgraph.graphdb.vertices.AbstractVertex.getVertexLabelInternal(AbstractVertex.java:125)
	at org.janusgraph.graphdb.vertices.AbstractVertex.vertexLabel(AbstractVertex.java:130)
	at org.janusgraph.graphdb.vertices.AbstractVertex.label(AbstractVertex.java:121)
	at grakn.core.server.kb.structure.AbstractElement.label(AbstractElement.java:183)
	at grakn.core.server.kb.concept.ElementFactory.getBaseType(ElementFactory.java:252)
	at grakn.core.server.kb.concept.ElementFactory.buildConcept(ElementFactory.java:171)
	at grakn.core.server.kb.concept.ElementFactory.buildConcept(ElementFactory.java:161)
	at grakn.core.server.session.TransactionOLTP.buildConcept(TransactionOLTP.java:475)
	at grakn.core.server.kb.concept.ThingImpl.lambda$getShortcutNeighbours$14(ThingImpl.java:254)
	at java.util.stream.ReferencePipeline$3$1.accept(ReferencePipeline.java:193)
	at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1812)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:295)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:207)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:170)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:301)
	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
	at grakn.core.server.rpc.SessionService$Iterators.next(SessionService.java:440)
	at grakn.core.server.rpc.SessionService$TransactionListener.next(SessionService.java:408)
	at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:220)
	at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:175)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent exception while executing backend operation EdgeStoreQuery
	at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:81)
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
	... 44 common frames omitted
Caused by: com.google.common.util.concurrent.ExecutionError: java.lang.OutOfMemoryError: Java heap space
	at com.google.common.cache.LocalCache$Segment.get(LocalCache.java:2216)
	at com.google.common.cache.LocalCache.get(LocalCache.java:4147)
	at com.google.common.cache.LocalCache$LocalManualCache.get(LocalCache.java:5053)
	at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:89)
	at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:272)
	at org.janusgraph.diskstorage.BackendTransaction$1.call(BackendTransaction.java:269)
	at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:68)
	... 45 common frames omitted
Caused by: java.lang.OutOfMemoryError: Java heap space
2019-10-28 01:52:25,957 [transaction-listener-0] ERROR g.c.s.r.SessionService$TransactionListener - Runtime Exception in RPC TransactionListener: 
org.janusgraph.core.JanusGraphException: Could not execute operation due to backend exception
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:56)
	at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
	at org.janusgraph.diskstorage.BackendTransaction.edgeStoreQuery(BackendTransaction.java:269)
	at org.janusgraph.graphdb.database.StandardJanusGraph.edgeQuery(StandardJanusGraph.java:437)
	at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$2.lambda$null$1(StandardJanusGraphTx.java:1182)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
	at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:82)
	at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$2.lambda$execute$2(StandardJanusGraphTx.java:1182)
	at org.janusgraph.graphdb.vertices.CacheVertex.loadRelations(CacheVertex.java:67)
	at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$2.execute(StandardJanusGraphTx.java:1182)
	at org.janusgraph.graphdb.transaction.StandardJanusGraphTx$2.execute(StandardJanusGraphTx.java:1124)
	at org.janusgraph.graphdb.query.QueryProcessor$LimitAdjustingIterator.getNewIterator(QueryProcessor.java:194)
	at org.janusgraph.graphdb.query.LimitAdjustingIterator.hasNext(LimitAdjustingIterator.java:68)
	at org.janusgraph.graphdb.query.ResultMergeSortIterator.nextInternal(ResultMergeSortIterator.java:69)
	at org.janusgraph.graphdb.query.ResultMergeSortIterator.<init>(ResultMergeSortIterator.java:49)
	at org.janusgraph.graphdb.query.QueryProcessor.getUnfoldedIterator(QueryProcessor.java:84)
	at org.janusgraph.graphdb.query.QueryProcessor.iterator(QueryProcessor.java:66)
	at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.flatMap(JanusGraphVertexStep.java:115)
	at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:49)
	at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:105)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
	at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:48)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
	at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:48)
	at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:105)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
	at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37)
	at org.apache.tinkerpop.gremlin.process.traversal.step.filter.WherePredicateStep.processNextStart(WherePredicateStep.java:150)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
	at org.apache.tinkerpop.gremlin.process.traversal.step.map.FlatMapStep.processNextStart(FlatMapStep.java:48)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep$EndStep.processNextStart(ComputerAwareStep.java:82)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
	at org.apache.tinkerpop.gremlin.process.traversal.step.branch.BranchStep.standardAlgorithm(BranchStep.java:94)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.ComputerAwareStep.processNextStart(ComputerAwareStep.java:46)
	at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
	at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.hasNext(DefaultTraversal.java:192)
	at java.util.Spliterators$IteratorSpliterator.tryAdvance(Spliterators.java:1811)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.lambda$initPartialTraversalState$0(StreamSpliterators.java:295)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.fillBuffer(StreamSpliterators.java:207)
	at java.util.stream.StreamSpliterators$AbstractWrappingSpliterator.doAdvance(StreamSpliterators.java:162)
	at java.util.stream.StreamSpliterators$WrappingSpliterator.tryAdvance(StreamSpliterators.java:301)
	at java.util.Spliterators$1Adapter.hasNext(Spliterators.java:681)
	at grakn.core.server.rpc.SessionService$Iterators.next(SessionService.java:440)
	at grakn.core.server.rpc.SessionService$TransactionListener.next(SessionService.java:408)
	at grakn.core.server.rpc.SessionService$TransactionListener.handleRequest(SessionService.java:220)
	at grakn.core.server.rpc.SessionService$TransactionListener.lambda$onNext$1(SessionService.java:175)
	at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
	at java.util.concurrent.FutureTask.run(FutureTask.java:266)
	at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
	at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
	at java.lang.Thread.run(Thread.java:748)
Caused by: org.janusgraph.diskstorage.PermanentBackendException: Permanent exception while executing backend operation EdgeStoreQuery
	at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:81)
	at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:54)
	... 58 common frames omitted
Caused by: java.lang.OutOfMemoryError: GC overhead limit exceeded
2019-10-28 01:53:36,472 [grpc-default-executor-3895] ERROR grakn.core.server.Grakn - Uncaught exception at thread [grpc-default-executor-3895]
java.lang.OutOfMemoryError: GC overhead limit exceeded

A bug is open on the Grakn GitHub: typedb/typedb#5480.

Environment

All, OpenCTI 2.0.1

Reproducible Steps

Run the platform.

Expected Output

Normal run.

Actual Output

After some time, Grakn crashes with OOM.

Additional information

After some investigation, it seems to be linked to a transaction problem. We close transactions after each query, but live threads are still active:

image
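
To illustrate what "closing transactions after each query" can look like on the client side, here is a minimal Python sketch. It is an assumption-laden illustration, not OpenCTI's code and not the Grakn client API: `FakeTransaction`, `transaction` and `run_query` are hypothetical names, and the open-transaction counter is only there to make leaks visible. It shows one way to guarantee that `close()` runs even when a query raises. It would not help if the live threads come from the server side, as the upstream issue suggests may be the case.

```python
from contextlib import contextmanager


class FakeTransaction:
    """Hypothetical stand-in for a database transaction (not the Grakn client API)."""

    open_count = 0  # transactions currently open, used here to spot leaks

    def __init__(self):
        self.closed = False
        FakeTransaction.open_count += 1

    def query(self, text):
        # Simulate a failing query so the cleanup path can be exercised
        if "fail" in text:
            raise RuntimeError("query failed")
        return [text]

    def close(self):
        # Idempotent: closing twice must not decrement the counter twice
        if not self.closed:
            self.closed = True
            FakeTransaction.open_count -= 1


@contextmanager
def transaction(factory=FakeTransaction):
    """Open a transaction and always close it, even if the body raises."""
    tx = factory()
    try:
        yield tx
    finally:
        tx.close()


def run_query(text):
    with transaction() as tx:
        # Materialize the results before the transaction is closed
        return list(tx.query(text))
```

With this pattern, `FakeTransaction.open_count` returns to zero after every call to `run_query`, whether the query succeeds or fails. If a comparable client-side counter stays at zero while the server still shows live threads, that would point toward the server rather than the caller.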

Labels

bug (use for describing something not working as expected)
solved (use to identify issue that has been solved; must be linked to the solving PR)
