Skip to content

Deadlock in grpc due to recursive grpc call #2515

@garrettjonesgoogle

Description

@garrettjonesgoogle

Please answer these questions before submitting your issue.

What version of gRPC are you using?

1.0.1

What JVM are you using (java -version)?

openjdk version "1.8.0_102"
OpenJDK Runtime Environment (build 1.8.0_102)
OpenJDK 64-Bit Server VM (build 25.102-b01, mixed mode)

What did you do?

If possible, provide a recipe for reproducing the error.

  • Turned on FINE logging
  • Used the Logging client in google-cloud-java

What did you expect to see?

Not what I saw below...

What did you see instead?

Deadlock.

Found one Java-level deadlock:
=============================
"grpc-default-worker-ELG-1-1":
  waiting to lock monitor 0x00007fe628006318 (object 0x00000005cf4cead0, a java.lang.Object),
  which is held by "main"
"main":
  waiting to lock monitor 0x00007fe628002d78 (object 0x00000005ce107468, a java.lang.Object),
  which is held by "pool-1-thread-1"
"pool-1-thread-1":
  waiting to lock monitor 0x00007fe628006318 (object 0x00000005cf4cead0, a java.lang.Object),
  which is held by "main"

Java stack information for the threads listed above:
===================================================
"grpc-default-worker-ELG-1-1":
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:121)
        - waiting to lock <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cf8472e8> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at io.netty.util.internal.logging.JdkLogger.log(JdkLogger.java:606)
        at io.netty.util.internal.logging.JdkLogger.debug(JdkLogger.java:186)
        at io.netty.util.internal.logging.AbstractInternalLogger.log(AbstractInternalLogger.java:147)
        at io.netty.handler.codec.http2.Http2FrameLogger.log(Http2FrameLogger.java:183)
        at io.netty.handler.codec.http2.Http2FrameLogger.logSettings(Http2FrameLogger.java:111)
        at io.netty.handler.codec.http2.Http2OutboundFrameLogger.writeSettings(Http2OutboundFrameLogger.java:81)
        at io.netty.handler.codec.http2.DefaultHttp2ConnectionEncoder.writeSettings(DefaultHttp2ConnectionEncoder.java:246)
        at io.netty.handler.codec.http2.DecoratingHttp2FrameWriter.writeSettings(DecoratingHttp2FrameWriter.java:70)
        at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.sendPreface(Http2ConnectionHandler.java:324)
        at io.netty.handler.codec.http2.Http2ConnectionHandler$PrefaceDecoder.channelActive(Http2ConnectionHandler.java:224)
        at io.netty.handler.codec.http2.Http2ConnectionHandler.channelActive(Http2ConnectionHandler.java:363)
        at io.grpc.netty.AbstractNettyHandler.channelActive(AbstractNettyHandler.java:78)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:202)
        at io.netty.handler.ssl.SslHandler.channelActive(SslHandler.java:1398)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.AbstractChannelHandlerContext.fireChannelActive(AbstractChannelHandlerContext.java:202)
        at io.netty.channel.DefaultChannelPipeline$HeadContext.channelActive(DefaultChannelPipeline.java:1322)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:223)
        at io.netty.channel.AbstractChannelHandlerContext.invokeChannelActive(AbstractChannelHandlerContext.java:209)
        at io.netty.channel.DefaultChannelPipeline.fireChannelActive(DefaultChannelPipeline.java:902)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.fulfillConnectPromise(AbstractNioChannel.java:305)
        at io.netty.channel.nio.AbstractNioChannel$AbstractNioUnsafe.finishConnect(AbstractNioChannel.java:335)
        at io.netty.channel.nio.NioEventLoop.processSelectedKey(NioEventLoop.java:588)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeysOptimized(NioEventLoop.java:512)
        at io.netty.channel.nio.NioEventLoop.processSelectedKeys(NioEventLoop.java:426)
        at io.netty.channel.nio.NioEventLoop.run(NioEventLoop.java:398)
        at io.netty.util.concurrent.SingleThreadEventExecutor$5.run(SingleThreadEventExecutor.java:877)
        at io.netty.util.concurrent.DefaultThreadFactory$DefaultRunnableDecorator.run(DefaultThreadFactory.java:144)
        at java.lang.Thread.run(Thread.java:745)
"main":
        at io.grpc.internal.InUseStateAggregator.updateObjectInUse(InUseStateAggregator.java:51)
        - waiting to lock <0x00000005ce107468> (a java.lang.Object)
        at io.grpc.internal.TransportSet$BaseTransportListener.transportInUse(TransportSet.java:373)
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:129)
        - locked <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cd8b7740> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at java.util.logging.Logger.doLog(Logger.java:765)
        at java.util.logging.Logger.log(Logger.java:788)
        at java.util.logging.Logger.info(Logger.java:1490)
        at com.google.cloud.runtimes.jetty9.StackDriverLogging.init(StackDriverLogging.java:37)
        at com.google.cloud.runtimes.jetty9.StackDriverLogging.main(StackDriverLogging.java:64)
"pool-1-thread-1":
        at io.grpc.internal.DelayedClientTransport.newStream(DelayedClientTransport.java:121)
        - waiting to lock <0x00000005cf4cead0> (a java.lang.Object)
        at io.grpc.internal.ClientCallImpl.start(ClientCallImpl.java:214)
        at io.grpc.auth.ClientAuthInterceptor$1.checkedStart(ClientAuthInterceptor.java:104)
        at io.grpc.ClientInterceptors$CheckedForwardingClientCall.start(ClientInterceptors.java:195)
        at io.grpc.ForwardingClientCall.start(ForwardingClientCall.java:47)
        at com.google.api.gax.grpc.HeaderInterceptor$1.start(HeaderInterceptor.java:62)
        at io.grpc.stub.ClientCalls.startCall(ClientCalls.java:273)
        at io.grpc.stub.ClientCalls.asyncUnaryRequestCall(ClientCalls.java:252)
        at io.grpc.stub.ClientCalls.futureUnaryCall(ClientCalls.java:189)
        at com.google.api.gax.grpc.DirectCallable.futureCall(DirectCallable.java:60)
        at com.google.api.gax.grpc.ReentranceDetectingCallable.futureCall(ReentranceDetectingCallable.java:56)
        at com.google.api.gax.grpc.ExceptionTransformingCallable.futureCall(ExceptionTransformingCallable.java:64)
        at com.google.api.gax.grpc.RetryingCallable$RetryingResultFuture.issueCall(RetryingCallable.java:220)
        - locked <0x00000005cfbce9e0> (a java.lang.Object)
        at com.google.api.gax.grpc.RetryingCallable.futureCall(RetryingCallable.java:88)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:238)
        at com.google.api.gax.grpc.UnaryCallable.futureCall(UnaryCallable.java:249)
        at com.google.cloud.logging.spi.DefaultLoggingRpc.write(DefaultLoggingRpc.java:202)
        at com.google.cloud.logging.LoggingImpl.writeAsync(LoggingImpl.java:504)
        at com.google.cloud.logging.LoggingImpl.write(LoggingImpl.java:499)
        at com.google.cloud.logging.LoggingHandler.write(LoggingHandler.java:382)
        at com.google.cloud.logging.LoggingHandler.flush(LoggingHandler.java:407)
        at com.google.cloud.logging.LoggingHandler.publish(LoggingHandler.java:318)
        at java.util.logging.Logger.log(Logger.java:738)
        at java.util.logging.Logger.doLog(Logger.java:765)
        at java.util.logging.Logger.log(Logger.java:851)
        at io.grpc.internal.TransportSet.startNewTransport(TransportSet.java:217)
        at io.grpc.internal.TransportSet.obtainActiveTransport(TransportSet.java:192)
        - locked <0x00000005ce107468> (a java.lang.Object)
        at io.grpc.internal.ManagedChannelImpl$3.getTransport(ManagedChannelImpl.java:637)
        at io.grpc.internal.ManagedChannelImpl$3.getTransport(ManagedChannelImpl.java:579)
        at io.grpc.DummyLoadBalancerFactory$DummyLoadBalancer$1.get(DummyLoadBalancerFactory.java:135)
        at io.grpc.internal.DelayedClientTransport$2.run(DelayedClientTransport.java:262)
        at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
        at java.util.concurrent.FutureTask.run(FutureTask.java:266)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.access$201(ScheduledThreadPoolExecutor.java:180)
        at java.util.concurrent.ScheduledThreadPoolExecutor$ScheduledFutureTask.run(ScheduledThreadPoolExecutor.java:295)
        at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1142)
        at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:617)
        at java.lang.Thread.run(Thread.java:745)

Background: I have been trying various strategies to resolve googleapis/google-cloud-java#1386 , where using the Logging service at level FINE results in grpc logging to the Logging service in a recursive way. I tried using a ThreadLocal to prevent this, but this doesn't work with grpc because the actual call is executed on a worker thread. Essentially I think I need some way to bail out of the LoggingHandler.publish call if I can detect that this is in the scope of a grpc worker thread sending a request to the Logging service.

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions