Enhance/krab/cluster realtime rebalance rebase by lixen · Pull Request #650 · basho/riak_repl

lixen · 2015-01-08T11:14:36Z

Rebase and partial squash of #632.

Will start looking at mixed cluster tests as @lordnull proposes.

This commit extends AAE fullsync to sample the AAE trees to estimate the number of keys in a given partition. This estimate is used to properly size the bloom filter, as well as to enable a new percentage-based direct send threshold (e.g. direct send up to 10% of differing keys).

To support AAE-based key estimation, this commit changes the fullsync logic to update all AAE trees before proceeding to the exchange phase. This change is necessary because all trees must be updated and sampled to calculate a correct estimate.

This commit also delays the creation of the bloom filter until it is needed. Fullsyncs that manage to send all differences directly will therefore avoid creating a bloom filter.

Authored-by: Mikael Lixenstrand <mikael.lixenstrand@erlang-solutions.com>
Authored-by: rsltrifork <rsl@trifork.com>
Rebased-by: Joseph Blomstedt <joe@basho.com>
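The sizing and threshold logic described in this commit message can be sketched roughly as follows. This is an illustrative Python sketch, not the Erlang code in riak_repl: the function names, the 1% false-positive rate, and the use of the textbook bloom filter sizing formula are all assumptions, and the percentage is assumed to be taken against the estimated key count.

```python
import math


def bloom_parameters(estimated_keys, false_positive_rate=0.01):
    """Textbook bloom filter sizing (assumed, not taken from riak_repl).

    Returns (bits, hashes) for the given expected number of keys.
    """
    n = max(1, estimated_keys)
    bits = math.ceil(-n * math.log(false_positive_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n * math.log(2)))
    return bits, hashes


def direct_send_limit(estimated_keys, percent):
    """Number of differing keys that may be sent directly (hypothetical helper)."""
    return estimated_keys * percent // 100


def plan_sends(diff_keys, estimated_keys, percent):
    """Split differences into a direct batch and an overflow batch.

    Bloom filter parameters are only computed when there is overflow,
    mirroring the idea of delaying filter creation until it is needed.
    """
    limit = direct_send_limit(estimated_keys, percent)
    direct = diff_keys[:limit]
    overflow = diff_keys[limit:]
    bloom = bloom_parameters(estimated_keys) if overflow else None
    return direct, overflow, bloom
```

With an estimate of 100 keys and a 10% threshold, up to 10 differences go out directly; only an 11th difference would trigger the bloom filter in this sketch.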

Problem: transient failures of aae, such as trees not yet built or locks not being acquired, would cause an aae fullsync process to exit abnormally. This could happen several times in a row, creating log spam.

Resolution: the concept of soft_exit. A soft_exit is a message sent from a soon-to-be-exiting process to a soft_linked process. The exiting process then exits normally, while any soft_linked process can handle the soft_exit message much as it would an exit message. This indicates an exit reason that should be handled, but that is not bad enough for the system logger to know about. The soft_exit message sent from the aae worker to the fscoordinator is as simple as `{soft_exit, pid(), term()}'.

The current implementation is not generic. There can only be one soft_link to the aae, and there's no general mechanism to use soft_links or soft_exits elsewhere in the code base. Sorry.

Another change rolled into this is consistent use of a #partition_info record in the fscoordinator, and error tracking in the fscoordinator's state. Swapping to a single data structure in the partition queue, whereis waiting list, and purgatory queues makes the fscoordinator easier to understand (as there is less code modifying structures).

This is a forward port of the fix done for 1.4. Conflicts favor existing code where it does not directly affect the fix.

Conflicts:
Makefile
rebar.config
src/riak_repl2_fssource.erl
src/riak_repl2_rtq_proxy.erl
src/riak_repl_aae_source.erl
test/riak_core_cluster_mgr_tests.erl
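The difference between a soft_exit and an ordinary exit, as this commit message describes it, can be sketched as a small message classifier. This is an illustrative Python sketch rather than the Erlang in riak_repl: messages are modeled as plain tuples, and `handle_message`, the retry queue, and the returned action strings are hypothetical.

```python
import logging

log = logging.getLogger("fscoordinator_sketch")


def handle_message(msg, retry_queue):
    """Classify a worker message and return the action taken.

    msg is a (tag, worker, reason) tuple, loosely modeled on
    {soft_exit, pid(), term()} and an Erlang exit message.
    """
    tag, worker, reason = msg
    if tag == "soft_exit":
        # Transient failure: queue a retry and keep it out of the error log.
        log.debug("soft exit from %s: %s", worker, reason)
        retry_queue.append((worker, reason))
        return "retry"
    if tag == "EXIT" and reason == "normal":
        # The worker finished (or already announced a soft_exit).
        return "done"
    if tag == "EXIT":
        # A genuine crash still deserves a loud log entry.
        log.error("worker %s exited abnormally: %s", worker, reason)
        return "failed"
    raise ValueError("unexpected message: %r" % (msg,))
```

The point of the design is that the worker exits normally in both the success and the transient-failure case, so only real crashes reach the error log.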

When a partition is not available, perhaps after a number of retries, the error exits stat should be incremented. Also, the retry exits stat should be incremented on each retry. This was discovered when backporting the repl_location_failures riak_test.

A few minor bugs were discovered while investigating riak_test failures.

* The ssl application is explicitly started in riak_core_connection:try_ssl/0. The statement in the function expects the call to ssl:start/0 to always return ok, but in some cases the ssl application is already started and the call returns {error, {already_started, ssl}} instead. This should not represent an error condition, but as written an exception is generated in this case. This resulted in riak_test runs of replication tests that exercise SSL to stall. Really there is no reason to attempt to start the ssl application at this point in the code. The ssl application is an application dependency of the riak_repl application and should be started by the call to riak_core_util:start_app_deps in riak_repl_app:start/2. Removing the attempt to start ssl in riak_core_connection to avoid confusion.
* The first handle_info function clause in riak_core_connection that handles a message received while in the wait_for_capabilities state attempts to use SSL by calling the try_ssl/4 function. If it succeeds a pair is returned whose elements are the name of the new transport and a new socket for the SSL connection. However the new socket was not being used for subsequent calls to send and setopts and this caused failure of several riak_tests.
* The non-test function clause of riak_core_cluster_conn:request_cluster_name/1 contained assumptions about the transport in use and explicitly called inet:setopts/2. This does not work when SSL is used and also caused several test failures. The function has been changed so that the specified transport is used for the call to setopts instead.

This implements riak_core_cluster_serv {1,1} with a new membership function on the server side that gives a list of {node(), {IP,Port} | unreachable} for *all* members of the remote cluster. Nodes that the cluster_serv cannot reach via RPC yield ‘unreachable’ instead of an IP/Port.
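The shape of that membership reply might look roughly like the following. This is an illustrative Python sketch, not the riak_core_cluster_serv implementation: `lookup` stands in for the RPC to each node, and the use of `ConnectionError` to signal a failed RPC is an assumption made for the example.

```python
def all_members(nodes, lookup):
    """Return (node, (ip, port)) or (node, "unreachable") for every node.

    lookup is a hypothetical callable that raises ConnectionError when
    the node cannot be reached.
    """
    members = []
    for node in nodes:
        try:
            members.append((node, lookup(node)))
        except ConnectionError:
            members.append((node, "unreachable"))
    return members


def reachable_addrs(members):
    """Keep only the addresses of nodes that answered."""
    return [addr for _node, addr in members if addr != "unreachable"]
```

Reporting unreachable nodes explicitly, rather than omitting them, lets the caller tell "node is down" apart from "node is not a member".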

Possible to lose some addresses when ConnectedAddr is early in the list.
[{"127.0.0.1",10106},{"127.0.0.1",10066},{"127.0.0.1",10096},{"127.0.0.1",10076},{"127.0.0.1",10086}], ConnectedAddr: {"127.0.0.1",10066}
[{"127.0.0.1",10106}], UsefulAddrs [{"127.0.0.1",10106}]

Mikael Lixenstrand and others added 30 commits October 6, 2014 14:12

Add timeout to natmap_test_

e6d1ec1

Use deduce_wire_version_from_proto for AAE

c8ec0d9

Keep legacy for riak_repl_aae_source state.proto

a13c865

Rename fullsync config parameters

3045282

… this needs to be followed up by appropriate changes to riak_test, doc, etc.

Merge branch 'miklix_estimate_keys-2.0' into feature/aae-fullsync-est…

d1c4130

…imate-keys Conflicts: src/riak_repl_aae_source.erl

Remove unused riak_repl_aae_source:replicate_diff/3

779c1ec

Remove redundant and unnecessary logging

83dd192

Revert some of "Remove redundant and unnecessary logging"

e43257a

Keep needed for testing as debug.

Fix xref and dialyzer

67b9535

Fixed invalid call to update error count track.

a1166e1

Increment_error_dict expects the partition, elementN of error dict, and the state. It pulls the dict out of the state and puts it back in place, returning just the state. The call that passed the dict in was therefore wrong.

Fix dialyzer warnings

5479089

The one in riak_repl2_fssource is a legit bug in the code

Merge pull request #640 from basho/feature/chatty-aae-transient-fs-fa…

75ef166

…ilures-2.0 2.0 port of AAE transient FS failures Reviewed-by: lordnull

Added last_fullsync_completed stat tracking.

4bf87a6

Revert riak_kv deps changes

38b7b4b

Update logging after review

ce0c1b4

Cancel directly on not_responsible from remote cluster

da47f0a

Remove loop so we can receive cancel_fullsync during update of remote trees.

Merge pull request #644 from basho/bugfix/ssl-test-issues

fa4478f

Address some minor bugs around establishing SSL connections Reviewed-by: engelsanchez

handle not_responsible for local partitions

6bdc099

Added test and fix to coord_serv not giving list for status.

90733b6

Removed pointless check for leader.

39fa59f

Merge pull request #623 from basho/feature/aae-fullsync-estimate-keys

df42bcf

Improve AAE fullsync by estimating number of keys Reviewed-by: engelsanchez

Merge pull request #645 from basho/bugfix/mw/snmp-stats-crash-when-no…

9cd1ed0

…-leader Added test and fix to coord_serv not giving list for status. Reviewed-by: engelsanchez

Merge pull request #642 from basho/feature/mw/last-fullsync-completed

d5aa377

Added last_fullsync_completed stat tracking. Reviewed-by: engelsanchez

Remove partition from purgatory when giving up

8f079a2

When a partition hit the soft exit limit, we added it to the dropped list but forgot to remove it from the purgatory list, so it could still be retried later.
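The fix described here amounts to keeping the two lists consistent when giving up on a partition. A minimal Python sketch, with hypothetical names and plain lists standing in for the fscoordinator's queues:

```python
def give_up_if_exhausted(partition, soft_exits, limit, purgatory, dropped):
    """Drop a partition once it has reached the soft exit limit.

    Returns True if the partition was dropped. The names and list-based
    queues are illustrative, not the fscoordinator's actual state.
    """
    if soft_exits < limit:
        return False
    dropped.append(partition)
    # The step that was missing: a dropped partition must also leave
    # purgatory, otherwise it can be picked up and retried later.
    if partition in purgatory:
        purgatory.remove(partition)
    return True
```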

Silence noisy lager output during tests

31a1657

Merge pull request #648 from basho/bugfix/remove-from-purgatory

8585a89

Remove partition from purgatory when giving up Reviewed-by: lordnull

macintux and others added 24 commits January 8, 2015 11:00

Fixes to make edoc happy

982eb52

Add client code for {1,1} cluster protocol

b0a173f

Use new all_members message if remote is 1,1+ For 1,0 emulate new semantics by keeping old IP:PORTs around until cluster_mgr restart. Implement new seeded sort+shuffle for result of calling cluster_mgr:get_ipaddrs_of_cluster/1.
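One way to read "seeded sort+shuffle" is: sort first so the starting order does not depend on how the addresses arrived, then shuffle with a seeded generator so the result is reproducible for a given seed. The sketch below is illustrative Python under that reading. The commit message does not say what the seed is, so the per-node seed here is an assumption, as is the function name.

```python
import random


def seeded_shuffle(addrs, seed):
    """Return addrs in an order that depends only on the set and the seed."""
    ordered = sorted(addrs)      # canonical starting order
    rng = random.Random(seed)    # private generator, global state untouched
    rng.shuffle(ordered)
    return ordered
```

A usage example with addresses borrowed from this page:

```python
addrs = [("127.0.0.1", 10106), ("127.0.0.1", 10066), ("127.0.0.1", 10096)]
first = seeded_shuffle(addrs, "node1")
again = seeded_shuffle(list(reversed(addrs)), "node1")
assert first == again  # input order does not matter, only the seed does
```

If each source node used a different seed, nodes would tend to prefer different sink addresses, which is one plausible way to spread realtime connections.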

Register default all_member_fun

3c9f7ee

Add rtsource_conn:address, and :reconnect

570420d

The first tells the caller the address currently connected to. The second tells the rtsource_conn to try (if possible) to switch to an alternative connection.

Expose filter_blacklisted_ipaddrs function

3efd0d7

Extra code in ring change hook to maybe reconnect

06ebbe5

improve ip address comparison

eb0bd2c

fixes to make riak start

651ee80

improved logging

981ec6a

change blacklist behaviour to not blacklist unknown endpoints

c1b18ab

delayed rebalance of rt connections

b4df4c9

refac

ef29773

indent

3d33b54

remove spamming log statement

4b8b350

Only send one rebalance_now.

b54555e

Remove extra multiplication with 1000

51c417b

Remove duplicated code

a6873bd

Refactor maybe_rebalance

610508e

Fix eunit tests.

5188180

Indent & Refac

5c71490

Fix dialyzer

6e61348

Fix exometer induced dialyzer errors

23e81b5

Stats function now never returns error code.

lixen closed this Jan 8, 2015

lixen reopened this Jan 8, 2015

lixen closed this Jan 8, 2015

lixen deleted the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18

lixen restored the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18

lixen deleted the enhance/krab/cluster-realtime-rebalance-rebase branch January 8, 2015 11:18

lixen wants to merge 54 commits into develop from enhance/krab/cluster-realtime-rebalance-rebase

8 participants
