Integrate statistical measurement in the final OptimalAssignment output#486
Integrate statistical measurement in the final OptimalAssignment output#486i3wangyi wants to merge 1 commit into
Conversation
| for (AssignableNode instance : instances) { | ||
| Map<String, Integer> capacityUsage = instance.getCapacityUsage(); | ||
| for (String key : capacityUsage.keySet()) { | ||
| usages.computeIfAbsent(key, k -> new ArrayList<>()).add(capacityUsage.get(key)); |
There was a problem hiding this comment.
Could you spell out lambda variables? Like capacityKey, usageKey, etc.
| } | ||
|
|
||
| return usages.entrySet().stream() | ||
| .collect(Collectors.toMap(Map.Entry::getKey, e -> getCoefficientOfVariation(e.getValue()))); |
There was a problem hiding this comment.
Could you spell out lambda variables? Like capacityKey, usageKey, etc.
| * @return a simple cumulative count of total movements where differences of movements in terms of | ||
| * location, size, etc is ignored | ||
| */ | ||
| public int getTotalPartitionMovements(Map<String, ResourceAssignment> baseAssignment) { |
There was a problem hiding this comment.
baseAssignment -> previousAssignment?
There was a problem hiding this comment.
Commented in the javadoc, the purpose of the method is to allow caller to pass any resource assignments (baseline or bestpossible) and it will calculate the movements
| */ | ||
| public int getTotalPartitionMovements(Map<String, ResourceAssignment> baseAssignment) { | ||
| Map<String, ResourceAssignment> optimalAssignment = getOptimalResourceAssignment(); | ||
| int movements = 0; |
There was a problem hiding this comment.
movements -> movedPartitionCount to be clear.
| ResourceAssignment lastResourceAssignment = baseAssignment.get(resource); | ||
| for (Partition partition : resourceAssignment.getMappedPartitions()) { | ||
| Map<String, String> thisInstanceToStates = | ||
| new HashMap<>(resourceAssignment.getReplicaMap(partition)); |
There was a problem hiding this comment.
Do we really need to copy the map? Possible to synchronize on the assignment or something?
I'm worried that this might add a significant amount of GC pressure since we'd be doing this for the entire assignment.
There was a problem hiding this comment.
On my understanding, this logic is for tuning/testing purposes. So if they are not in this file, but in some tuning class instead, it should be fine.
| Partition partition = new Partition(replica.getPartitionName()); | ||
| ResourceAssignment resourceAssignment = assignmentMap | ||
| .computeIfAbsent(resourceName, key -> new ResourceAssignment(resourceName)); | ||
| ResourceAssignment resourceAssignment = assignmentMap.computeIfAbsent(resourceName, |
There was a problem hiding this comment.
Why does formatting keep changing?
| public String getFailures() { | ||
| // TODO: format the error string | ||
| return _failedAssignments.toString(); | ||
| public String getErrorMessage() { |
There was a problem hiding this comment.
- I'm afraid this might pollute the log big time.
- What value do we have in printing out all failed assignments? I thought as soon as we encounter a failed assignment, we just terminate? So what we'll see here is always just the first failed assignment?
jiajunwang
left a comment
There was a problem hiding this comment.
In general, the measurement logic is for testing purpose only. It is not a good idea to pollute the production code with testing logic.
Could you please create a separate class for the testing? One option is to create a child class of OptimalAssignment. Then adding the measurement code there. Or you just have a standalone measurement method in the tuning code.
Note that if you put the code in OptimalAssignment, anyone who modifies it will be forced to maintain the measurement code as well.
| String errorMessage = String.format( | ||
| "Unable to find any available candidate node for partition %s; Fail reasons: %s", | ||
| replica.getPartitionName(), optimalAssignment.getFailures()); | ||
| replica.getPartitionName(), optimalAssignment.getErrorMessage()); |
There was a problem hiding this comment.
OptimalAssignment is not an Exception.
The meaning of "ErrorMessage" is not clear.
I would prefer to call it AssignmentFailures or just keep it as it was.
|
|
||
|
|
||
| //TODO: will implement by merging other changes | ||
| public Map<String, Integer> getCapacityUsage() { |
There was a problem hiding this comment.
Let's don't make it public since it is not supporting the feature.
I would prefer the model class to be clean and no extra method, as long as it is possible.
| ResourceAssignment lastResourceAssignment = baseAssignment.get(resource); | ||
| for (Partition partition : resourceAssignment.getMappedPartitions()) { | ||
| Map<String, String> thisInstanceToStates = | ||
| new HashMap<>(resourceAssignment.getReplicaMap(partition)); |
There was a problem hiding this comment.
On my understanding, this logic is for tuning/testing purposes. So if they are not in this file, but in some tuning class instead, it should be fine.
Issues
(#480)
Description
(Write a concise description including what, why, how)
Added CV and movements measurement tools for the statistical understanding of the final optimal assignment
Tests
(List the names of added unit/integration tests)
(Copy & paste the result of "mvn test")
Commits
Documentation
(Link the GitHub wiki you added)
Code Quality