Hello,
we're using Riak over protocol buffers, with Amazon's Elastic Load Balancer (ELB) in the middle. We keep seeing exceptions like this one:
Error while storing avatar to Riak: 'Socket returned short packet length 0 - expected 4'
(Here's another person with a similar issue: http://codersdiscuss.com/4870/timeout-issues-with-the-riak-python-customer-with-process-buffers/.)
As far as we can see, the problem is the ELB's idle timeout setting: a worker can stay idle for longer than the idle timeout, and the ELB will then close one or both of the connections (to the worker and to Riak). The next Riak operation then fails with the above exception.
We will probably deal with this at the application layer. However, it would be very convenient if it could be handled transparently at the pool layer.
According to Python docs (https://docs.python.org/2/howto/sockets.html):
"When a recv returns 0 bytes, it means the other side has closed (or is in the process of closing) the connection. You will not receive any more data on this connection. Ever."
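The behaviour described in the docs can be reproduced without Riak or an ELB. This is only a minimal sketch: it assumes a local `socket.socketpair()` is a fair stand-in for the worker-to-balancer connection, and the names `worker` and `balancer` are purely illustrative.

```python
import socket

# Two connected sockets stand in for the worker and the load balancer
# (an assumption for illustration; no real ELB or Riak node is involved).
worker, balancer = socket.socketpair()

# The balancer side drops the idle connection.
balancer.close()

# The worker's next read does not raise; it returns an empty byte string,
# which is the "0 bytes" case from the Python sockets HOWTO.
data = worker.recv(4)
peer_closed = (data == b"")

worker.close()
```

A reader expecting a 4-byte length prefix would get zero bytes here, which matches the "short packet length 0 - expected 4" message.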
The RiakPbcPool class already has a set of exceptions on which to retry calls. How about adding this particular case to the set of auto-retry scenarios?
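To make the proposal concrete, here is a rough sketch of the retry behaviour we have in mind. It is not the Riak client's actual code: `ConnectionClosedError`, `read_exact` and `call_with_retry` are hypothetical names invented for this example, and `connect` stands in for however the pool obtains a connection.

```python
import socket


class ConnectionClosedError(Exception):
    """Hypothetical error: the peer closed the connection before we got all bytes."""


def read_exact(sock, length):
    """Read exactly `length` bytes, raising if recv() reports a closed peer."""
    chunks = []
    remaining = length
    while remaining > 0:
        chunk = sock.recv(remaining)
        if not chunk:
            # recv() returned 0 bytes: the other side has closed the connection.
            raise ConnectionClosedError(
                "peer closed connection with %d bytes still expected" % remaining
            )
        chunks.append(chunk)
        remaining -= len(chunk)
    return b"".join(chunks)


def call_with_retry(connect, operation, retries=1):
    """Run operation(sock); on a closed peer, discard the socket and reconnect.

    `connect` is a hypothetical zero-argument callable returning a socket.
    """
    attempt = 0
    while True:
        sock = connect()
        try:
            return operation(sock)
        except ConnectionClosedError:
            if attempt >= retries:
                raise
            attempt += 1
        finally:
            # Always release the socket, whether the call succeeded or not.
            sock.close()
```

One caveat with this approach: a blind retry is only safe if the request is idempotent, or if the failure is known to have happened before the request reached Riak.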