Skip to content

Address some minor bugs around establishing SSL connections#644

Merged
borshop merged 1 commit into
2.0from
bugfix/ssl-test-issues
Dec 15, 2014
Merged

Address some minor bugs around establishing SSL connections#644
borshop merged 1 commit into
2.0from
bugfix/ssl-test-issues

Conversation

@kellymclaughlin
Copy link
Copy Markdown
Contributor

A few minor bugs were discovered while investigating riak_test failures.

  • The ssl application is explicitly started in
    riak_core_connection:try_ssl/0. The statement in the function
    expects the call to ssl:start/0 to always return ok, but in some
    cases the ssl application is already started and the call returns
    {error, {already_started, ssl}} instead. This should not represent
    an error condition, but as written an exception is generated in this
    case. This resulted in riak_test runs of replication tests that
    exercise SSL to stall. Really there is no reason to attempt to start
    the ssl application at this point in the code. The ssl application
    is an application dependency of the riak_repl application and should
    be started by the call to riak_core_util:start_app_deps in
    riak_repl_app:start/2. Removing the attempt to start ssl in
    riak_core_connection to avoid confusion.
  • The first handle_info function clause in riak_core_connection that
    handles a message received while in the wait_for_capabilities state
    attempts to use SSL by calling the try_ssl/4 function. If it
    succeeds a pair is returned whose elements are the name of the new
    transport and a new socket for the SSL connection. However the new
    socket was not being used for subsequent calls to send and setopts
    and this caused failure of several riak_tests.
  • The non-test function clause of
    riak_core_cluster_conn:request_cluster_name/1 contained assumptions
    about the transport in use and explicitly called
    inet:setopts/2. This does not work when SSL is used and also caused
    several test failures. The function has been changed so that the
    specified transport is used for the call to setopts instead.

A few minor bugs were discovered while investigating riak_test failures.

* The ssl application is explicitly started in
  riak_core_connection:try_ssl/0. The statement in the function
  expects the call to ssl:start/0 to always return ok, but in some
  cases the ssl application is already started and the call returns
  {error, {already_started, ssl}} instead. This should not represent
  an error condition, but as written an exception is generated in this
  case. This resulted in riak_test runs of replication tests that
  exercise SSL to stall. Really there is no reason to attempt to start
  the ssl application at this point in the code. The ssl application
  is an application dependency of the riak_repl application and should
  be started by the call to riak_core_util:start_app_deps in
  riak_repl_app:start/2. Removing the attempt to start ssl in
  riak_core_connection to avoid confusion.
* The first handle_info function clause in riak_core_connection that
  handles a message received while in the wait_for_capabilities state
  attempts to use SSL by calling the try_ssl/4 function. If it
  succeeds a pair is returned whose elements are the name of the new
  transport and a new socket for the SSL connection. However the new
  socket was not being used for subsequent calls to send and setopts
  and this caused failure of several riak_tests.
* The non-test function clause of
  riak_core_cluster_conn:request_cluster_name/1 contained assumptions
  about the transport in use and explicitly called
  inet:setopts/2. This does not work when SSL is used and also caused
  several test failures. The function has been changed so that the
  specified transport is used for the call to setopts instead.
@kellymclaughlin
Copy link
Copy Markdown
Contributor Author

I uncovered these issues looking at the following failures from the 2.0.3 scorecard:

  • replication2_ssl
  • replication2_pg:test_mixed_pg_ssl
  • replication2_pg:test_bidirectional_pg_ssl
  • replication2_pg:test_12_pg_mode_repl_mixed_ssl
  • replication2_connections
  • replication2_pg:test_pg_proxy_ssl

There are probably others that failed that may be resolved by these changes, but I have not had a chance to run them yet. In general, if the console.log from a riak node involved in the tests shows error reports that include {error, {already_started, ssl}} there is a good chance it is related to these issues.

@kellymclaughlin
Copy link
Copy Markdown
Contributor Author

cc: @engelsanchez

@engelsanchez engelsanchez self-assigned this Dec 15, 2014
@engelsanchez
Copy link
Copy Markdown
Contributor

Thanks Kelly! I'll review and test this now.

@engelsanchez
Copy link
Copy Markdown
Contributor

All code changes make sense (how the heck did it work before?). I found an issue with the realtime connection unit tests that exists before this PR and filed RIAK-1338 to resolve it. The SSL tests that I tried pass with this. Let's merge so we can start running the entire suite and get a sense of what else is lurking.

👍 2da227c

@engelsanchez
Copy link
Copy Markdown
Contributor

👍 2da227c

I said!

borshop added a commit that referenced this pull request Dec 15, 2014
Address some minor bugs around establishing SSL connections

Reviewed-by: engelsanchez
borshop added a commit that referenced this pull request Dec 15, 2014
Address some minor bugs around establishing SSL connections

Reviewed-by: engelsanchez
@kellymclaughlin
Copy link
Copy Markdown
Contributor Author

@borshop merge

@borshop borshop merged commit 2da227c into 2.0 Dec 15, 2014
@kellymclaughlin kellymclaughlin deleted the bugfix/ssl-test-issues branch December 15, 2014 21:04
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

3 participants