Enhance/krab/cluster realtime rebalance rebase#650

Closed
lixen wants to merge 54 commits into develop from enhance/krab/cluster-realtime-rebalance-rebase

@lixen
Contributor

@lixen lixen commented Jan 8, 2015

Rebase and some squash of #632 .

Will start looking at mixed cluster tests as @lordnull proposes.

Mikael Lixenstrand and others added 30 commits October 6, 2014 14:12
This commit extends AAE fullsync to sample the AAE trees to estimate
the number of keys in a given partition.

This estimate is used to properly size the bloom filter, as well as
enable a new percentage-based direct send threshold (eg. direct send
up to 10% of differing keys).

To support AAE-based key estimation, this commit changes the fullsync
logic to update all AAE trees before proceeding to the exchange phase.
This change is necessary because all trees must be updated and sampled
to calculate a correct estimate.

This commit also delays the creation of the bloom filter until it is
needed. Fullsyncs that manage to send all differences directly will
therefore avoid creating a bloom filter.
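The sizing and threshold logic described above can be sketched roughly as follows. This is an illustrative Python sketch, not the code in riak_repl_aae_source.erl: the function names, the 1% false-positive rate, and the standard bloom sizing formula m = -n ln(p) / (ln 2)^2 are all assumptions made for the example.

```python
import math

def bloom_size_bits(estimated_keys, false_positive_rate=0.01):
    """Bits needed for a bloom filter holding estimated_keys entries.

    Uses the textbook formula m = -n * ln(p) / (ln 2)^2. The 1% default
    rate is an assumption, not a value taken from the commit.
    """
    n = max(1, estimated_keys)
    return math.ceil(-n * math.log(false_positive_rate) / (math.log(2) ** 2))

def direct_send_limit(estimated_keys, direct_percentage):
    """Largest number of differing keys that would still be sent directly."""
    return (estimated_keys * direct_percentage) // 100

def plan_fullsync(estimated_keys, diff_count, direct_percentage):
    """Return ("direct", None) or ("bloom", size_in_bits).

    The bloom size is only computed when it is needed, which mirrors the
    delayed bloom filter creation described in the commit message.
    """
    if diff_count <= direct_send_limit(estimated_keys, direct_percentage):
        return ("direct", None)
    return ("bloom", bloom_size_bits(estimated_keys))
```

With an estimate of 1000 keys and a 10% threshold, up to 100 differing keys would go out directly in this sketch, and only larger difference sets would pay for a bloom filter.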

Authored-by: Mikael Lixenstrand <mikael.lixenstrand@erlang-solutions.com>
Authored-by: rsltrifork <rsl@trifork.com>
Rebased-by:  Joseph Blomstedt <joe@basho.com>
… this needs to be followed up by appropriate
changes to riak_test, doc, etc.
…imate-keys

Conflicts:
	src/riak_repl_aae_source.erl
Problem: transient failures of aae, such as trees not yet built or locks not
being acquired, would cause an aae fullsync process to exit abnormally. This
could happen several times in a row, creating log spam.

Resolution: the concept of soft_exit. A soft_exit is a message sent from a soon
to be exiting process to a soft_linked process. The exiting process would then
exit normally, while any soft_linked processes could handle the soft_exit
message in a similar fashion as an exit message. This would indicate an exit
reason that should be handled, but not bad enough to have the system logger
know about it.

The soft_exit message sent from the aae worker to the fscoordinator is
as simple as `{soft_exit, pid(), term()}'.

The current implementation is not generic. There can only be one soft_link to
the aae, and there's no general mechanism to use soft_link's or soft_exits
elsewhere in the code base. Sorry.
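As a rough illustration of the idea (in Python rather than Erlang, and not the actual fscoordinator code), a coordinator could treat a soft_exit message as a quiet "retry later" signal and reserve error accounting for abnormal exits. The class, method names, and message tuples below are hypothetical.

```python
from collections import deque

class CoordinatorSketch:
    """Hypothetical coordinator that distinguishes soft exits from real ones."""

    def __init__(self, partitions):
        self.queue = deque(partitions)  # partitions still waiting to sync
        self.running = {}               # worker id -> partition in progress
        self.soft_exits = 0
        self.error_exits = 0

    def start_worker(self, worker_id):
        partition = self.queue.popleft()
        self.running[worker_id] = partition
        return partition

    def handle(self, message):
        kind, worker_id, reason = message
        if kind == "soft_exit":
            # Transient problem (for example trees not built yet):
            # requeue quietly, nothing is reported as an error.
            partition = self.running.pop(worker_id, None)
            if partition is not None:
                self.soft_exits += 1
                self.queue.append(partition)
        elif kind == "exit":
            partition = self.running.pop(worker_id, None)
            if reason != "normal":
                # Only abnormal exits count as errors.
                self.error_exits += 1
                if partition is not None:
                    self.queue.append(partition)
```

In this sketch the worker sends the soft_exit first and then exits normally, so the later normal exit finds nothing left to clean up and leaves no error behind.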

Another change rolled into this is consistent use of a #partition_info record
in the fscoordinator, and error tracking the fscoordinator's state. By swapping
to using a single data structure in the partition queue, whereis waiting list,
and purgatory queues it makes it easier to understand the fscoordinator (as
there is less code modifying structures).

This is a forward port of the fix done for 1.4. Conflicts favor existing code
where it does not directly affect the fix.

Conflicts:
	Makefile
	rebar.config
	src/riak_repl2_fssource.erl
	src/riak_repl2_rtq_proxy.erl
	src/riak_repl_aae_source.erl
	test/riak_core_cluster_mgr_tests.erl
Increment_error_dict expects the partition, elementN of error dict, and the
state. It pulls the dict out of the state so it can put it back in place, thus
just returning the state. So this call that passed the dict in was wrong.
When a partition is not available, perhaps after a number of retries,
the error exits stat should be incremented. Also, the retry exits stat
should be incremented on each retry.  This was discovered when
backporting the repl_location_failures riak_test.
The one in riak_repl2_fssource is a legit bug in the code
…ilures-2.0

2.0 port of AAE transient FS failures

Reviewed-by: lordnull
Remove loop so we can receive cancel_fullsync during
update of remote trees.
A few minor bugs were discovered while investigating riak_test failures.

* The ssl application is explicitly started in
  riak_core_connection:try_ssl/0. The statement in the function
  expects the call to ssl:start/0 to always return ok, but in some
  cases the ssl application is already started and the call returns
  {error, {already_started, ssl}} instead. This should not represent
  an error condition, but as written an exception is generated in this
  case. This caused riak_test runs of replication tests that
  exercise SSL to stall. Really there is no reason to attempt to start
  the ssl application at this point in the code. The ssl application
  is an application dependency of the riak_repl application and should
  be started by the call to riak_core_util:start_app_deps in
  riak_repl_app:start/2. Removing the attempt to start ssl in
  riak_core_connection to avoid confusion.
* The first handle_info function clause in riak_core_connection that
  handles a message received while in the wait_for_capabilities state
  attempts to use SSL by calling the try_ssl/4 function. If it
  succeeds a pair is returned whose elements are the name of the new
  transport and a new socket for the SSL connection. However the new
  socket was not being used for subsequent calls to send and setopts
  and this caused failure of several riak_tests.
* The non-test function clause of
  riak_core_cluster_conn:request_cluster_name/1 contained assumptions
  about the transport in use and explicitly called
  inet:setopts/2. This does not work when SSL is used and also caused
  several test failures. The function has been changed so that the
  specified transport is used for the call to setopts instead.
Address some minor bugs around establishing SSL connections

Reviewed-by: engelsanchez
Improve AAE fullsync by estimating number of keys

Reviewed-by: engelsanchez
…-leader

Added test and fix to coord_serv not giving list for status.

Reviewed-by: engelsanchez
Added last_fullsync_completed stat tracking.

Reviewed-by: engelsanchez
When a partition has hit the soft exit limit, we add it to the dropped
list, but forgot to remove it from the purgatory list. So it may
actually be retried later.
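The bookkeeping involved might look like the following Python sketch. The state layout (plain lists and a retry counter dict) and the function name are assumptions for illustration; the real code keeps these in the fscoordinator state record.

```python
def note_soft_exit(state, partition, soft_retry_limit):
    """Record one soft exit for a partition and update the queues.

    state is a hypothetical dict with "purgatory" and "dropped" lists and
    a "soft_retries" counter dict.
    """
    retries = state["soft_retries"].get(partition, 0) + 1
    state["soft_retries"][partition] = retries
    if retries >= soft_retry_limit:
        # Giving up: the partition must leave purgatory as well,
        # otherwise it could still be retried later.
        state["dropped"].append(partition)
        if partition in state["purgatory"]:
            state["purgatory"].remove(partition)
    elif partition not in state["purgatory"]:
        state["purgatory"].append(partition)
    return state
```

The point of the sketch is the pairing: adding to the dropped list and removing from purgatory happen in the same branch, so a dropped partition cannot linger in a retry queue.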
Remove partition from purgatory when giving up

Reviewed-by: lordnull
macintux and others added 24 commits January 8, 2015 11:00
This implements riak_core_cluster_serv {1,1}
with new membership function on the server side
to give list of {node(), {IP,Port} | unreachable}
for *all* members of remote cluster.  Nodes for
which the cluster_serv cannot RPC to the given
node yield ‘unreachable’ instead of an IP/Port.
Use new all_members message if remote is 1,1+
For 1,0 emulate new semantics by keeping old
IP:PORTs around until cluster_mgr restart.

Implement new seeded sort+shuffle for result of
calling cluster_mgr:get_ipaddrs_of_cluster/1.
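One way to read "seeded sort+shuffle" is sketched below in Python: sort first so the input order does not matter, then shuffle with a generator seeded by some stable value so the same seed always yields the same ordering. The choice of seed (a node name string here) and the function name are assumptions; the commit message does not say what the seed is.

```python
import random

def seeded_shuffle(addrs, seed):
    """Deterministically reorder addrs for a given seed.

    Sorting first makes the result independent of the order in which the
    addresses were received; the seeded generator then spreads different
    seeds across different orderings.
    """
    ordered = sorted(addrs)
    rng = random.Random(seed)  # private generator, global state untouched
    rng.shuffle(ordered)
    return ordered

addrs = [("127.0.0.1", 10106), ("127.0.0.1", 10066), ("127.0.0.1", 10096),
         ("127.0.0.1", 10076), ("127.0.0.1", 10086)]
order_a = seeded_shuffle(addrs, "dev1@127.0.0.1")
order_b = seeded_shuffle(list(reversed(addrs)), "dev1@127.0.0.1")
```

Here order_a and order_b are identical because both calls sort before shuffling with the same seed, and every input address is still present in the result.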
The first tells the caller the address currently
connected to.  The second tells the rtsource_conn
to try (if possible) to switch to an alternative
connection.
Possible to lose some addresses when ConnectedAddr is early in the list.

[{"127.0.0.1",10106},{"127.0.0.1",10066},{"127.0.0.1",10096},{"127.0.0.1",10076},{"127.0.0.1",10086}], ConnectedAddr: {"127.0.0.1",10066}
[{"127.0.0.1",10106}], UsefulAddrs [{"127.0.0.1",10106}]
Stats function now never returns error code.
@lixen lixen closed this Jan 8, 2015
@lixen lixen reopened this Jan 8, 2015
@lixen lixen closed this Jan 8, 2015
@lixen lixen deleted the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18
@lixen lixen restored the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18
@lixen lixen deleted the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18
8 participants