IGNITE-12108 TCP Communication Metrics ported to a new framework. #6814
| * @param mreg Metrics registry. | ||
| */ | ||
| protected GridAbstractCommunicationClient(int connIdx, @Nullable GridNioMetricsListener metricsLsnr) { | ||
| protected GridAbstractCommunicationClient(int connIdx, @Nullable MetricRegistry mreg) { |
There was a problem hiding this comment.
This class might be used by components other than TCP SPI. You can see that "GridNioMetricsListener" was nullable as well.
| IgniteBiInClosure<GridNioSession, Integer> msgQueueLsnr, | ||
| boolean readWriteSelectorsAssign, | ||
| @Nullable GridWorkerListener workerLsnr, | ||
| @Nullable MetricRegistry mreg, |
There was a problem hiding this comment.
Let me please argue with you about this particular constructor =)
I totally forgot that it's a private constructor and only "GridNioServer.Builder#build" invokes it. There's no point in splitting it because no one will invoke another constructor with fewer parameters. Let's leave it as is.
Just in case: the GridAbstractCommunicationClient constructor will be fixed as you asked. It's not about arguing for the sake of arguing.
There was a problem hiding this comment.
Thanks, Ilya.
Please let me know when the PR is ready for review.
| return -1; | ||
|
|
||
| return res; | ||
| return (int) mreg.longAdderMetric( |
There was a problem hiding this comment.
We should store LongMetric inside object instance.
There was a problem hiding this comment.
I don't get it, can you please explain this comment in more detail? Is this about excessive memory allocation?
There was a problem hiding this comment.
It's about unnecessary Map#get operation.
Every `mreg.longAdderMetric` call translates to `Map#get`.
Why do we need it? We can store the metric in a variable, can't we?
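To illustrate the point about the repeated lookup, here is a minimal sketch, not the PR's code. `Registry` is a hypothetical stand-in for a metric registry backed by a map (it is not the Ignite `MetricRegistry` API), and the `lookups` counter exists only to make the difference observable. The client resolves the metric once in its constructor and keeps the reference in a field.

```java
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.LongAdder;

/** Sketch: resolve a metric once and keep the reference instead of a map lookup per call. */
final class CachedMetricSketch {
    /** Hypothetical stand-in for a map-backed metric registry; not the Ignite API. */
    static final class Registry {
        /** Metrics by name. */
        private final ConcurrentMap<String, LongAdder> metrics = new ConcurrentHashMap<>();

        /** Number of lookups performed; kept only to make the cost visible. */
        final LongAdder lookups = new LongAdder();

        /** Returns the metric with the given name, creating it on first access. */
        LongAdder longAdderMetric(String name) {
            lookups.increment();

            return metrics.computeIfAbsent(name, k -> new LongAdder());
        }
    }

    /** Looks the metric up in the constructor and reuses the reference afterwards. */
    static final class Client {
        /** Cached metric reference. */
        private final LongAdder sentBytes;

        Client(Registry reg) {
            sentBytes = reg.longAdderMetric("sentBytes");
        }

        void onBytesSent(long n) {
            sentBytes.add(n); // No registry access on the hot path.
        }

        long sentBytesCount() {
            return sentBytes.sum();
        }
    }

    public static void main(String[] args) {
        Registry reg = new Registry();
        Client client = new Client(reg);

        for (int i = 0; i < 1000; i++)
            client.onBytesSent(10);

        System.out.println(client.sentBytesCount() + " bytes, " + reg.lookups.sum() + " lookup(s)");
    }
}
```

The trade-off is one extra field per metric in exchange for removing the map access from every read and update.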
| RECEIVED_MESSAGES_BY_TYPE_METRIC_DESC | ||
| ); | ||
|
|
||
| sentMsgsCntByNodeIdMetricFactory = nodeId -> mmgr.registry(COMMUNICATION_METRICS_GROUP_NAME + "." + nodeId) |
There was a problem hiding this comment.
Let's use MetricUtils#metricName instead of COMMUNICATION_METRICS_GROUP_NAME + "." + nodeId
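A rough sketch of what the suggestion amounts to: build every dotted name through one helper rather than by ad hoc string concatenation. The `metricName` method below is a hypothetical stand-in written for this example, and the dot separator is an assumption inferred from the "communication.tcp" constant in the diff; the real `MetricUtils#metricName` may differ in details.

```java
import java.util.UUID;

/** Sketch: build dotted metric and registry names through a single helper. */
final class MetricNameSketch {
    /** Separator assumed for this example. */
    static final String SEPARATOR = ".";

    /** Hypothetical stand-in for MetricUtils#metricName: joins name parts with the separator. */
    static String metricName(String... names) {
        return String.join(SEPARATOR, names);
    }

    /** Group name assembled from its parts instead of a hard-coded literal. */
    static final String COMMUNICATION_METRICS_GROUP_NAME = metricName("communication", "tcp");

    /** Registry name for a particular remote node. */
    static String registryName(UUID nodeId) {
        return metricName(COMMUNICATION_METRICS_GROUP_NAME, nodeId.toString());
    }

    public static void main(String[] args) {
        System.out.println(registryName(UUID.randomUUID()));
    }
}
```

Keeping the separator in one place means a change to the naming scheme does not require touching every call site.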
| public static final short HANDSHAKE_WAIT_MSG_TYPE = -28; | ||
|
|
||
| /** Communication metrics group name. */ | ||
| public static final String COMMUNICATION_METRICS_GROUP_NAME = "communication.tcp"; |
There was a problem hiding this comment.
Let's use MetricUtils#metricName instead of "communication.tcp".
|
|
||
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Let's declare all constants first and the methods after them.
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { | ||
| return "sentMessagesByType." + directType; |
There was a problem hiding this comment.
Let's use MetricUtils#metricName instead of this and similar methods.
| this.log = rootLog.getLogger(getClass()); | ||
| this.jmx = prepareMBeanServer(); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log, this.cfg)); | ||
| log = rootLog.getLogger(getClass()); |
There was a problem hiding this comment.
"this." was not necessary so I removed it while adding new field.
There was a problem hiding this comment.
Please, revert unnecessary changes.
| this.jmx = jmx; | ||
| this.log = rootLog.getLogger(getClass()); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log)); | ||
| log = rootLog.getLogger(getClass()); |
There was a problem hiding this comment.
Please, revert unnecessary changes.
| this.log = log.getLogger(getClass()); | ||
| this.jmx = prepareMBeanServer(); | ||
| this.rsrcProc = new GridResourceProcessor(new GridTestKernalContext(this.log)); | ||
| jmx = prepareMBeanServer(); |
There was a problem hiding this comment.
Please, revert unnecessary changes.
| new GridResourceLoggerInjector(ctx.config().getGridLogger()); | ||
| injectorByAnnotation[GridResourceIoc.ResourceAnnotation.IGNITE_INSTANCE.ordinal()] = | ||
| new GridResourceBasicInjector<>(ctx.grid()); | ||
| injectorByAnnotation[GridResourceIoc.ResourceAnnotation.METRIC_MANAGER.ordinal()] = |
There was a problem hiding this comment.
Why do we need the new GridResourceSupplierInjector?
Can we use the existing GridResourceBasicInjector?
There was a problem hiding this comment.
Because the supplier injector is lazy. At this particular point the metric manager might not be initialized yet.
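The difference between the two approaches can be sketched as follows. All class names here are hypothetical and only mimic the idea; they are not the Ignite injector classes. An eager injector captures the value at construction time, while a supplier-based one defers the read until the resource is actually requested.

```java
import java.util.function.Supplier;

/** Sketch: eager versus lazy (supplier-based) resource injection. */
final class LazyInjectionSketch {
    /** Hypothetical context whose component becomes available only after start(). */
    static final class Context {
        /** Stand-in for a component that is initialized late. */
        private volatile Object metricManager;

        void start() {
            metricManager = "metric-manager";
        }

        Object metricManager() {
            return metricManager;
        }
    }

    /** Captures the resource value when the injector is created. */
    static final class BasicInjector {
        private final Object rsrc;

        BasicInjector(Object rsrc) {
            this.rsrc = rsrc;
        }

        Object resource() {
            return rsrc;
        }
    }

    /** Asks the supplier each time, so late initialization is picked up. */
    static final class SupplierInjector {
        private final Supplier<?> supplier;

        SupplierInjector(Supplier<?> supplier) {
            this.supplier = supplier;
        }

        Object resource() {
            return supplier.get();
        }
    }

    public static void main(String[] args) {
        Context ctx = new Context();

        // Both injectors are created before the component exists.
        BasicInjector eager = new BasicInjector(ctx.metricManager());
        SupplierInjector lazy = new SupplierInjector(ctx::metricManager);

        ctx.start();

        System.out.println("eager: " + eager.resource() + ", lazy: " + lazy.resource());
    }
}
```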
|
|
||
| this.formatter = formatter; | ||
|
|
||
| sentBytesCntMetric = mreg == null ? |
There was a problem hiding this comment.
Good catch, thank you!
| * Gets sent messages count. | ||
| * | ||
| * @return Sent messages count. | ||
| * @deprecated Will be removed in the next major release and replaced with new metrics API. |
There was a problem hiding this comment.
We should tell users how to obtain the information that was previously available from the deprecated method.
Let's write "Use metric 'xxx.yyy.zzz' instead", where xxx.yyy.zzz is the name of the metric providing the required number.
Please apply this to the other deprecated method comments as well.
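As a sketch of the requested Javadoc shape, assuming a hypothetical interface name and keeping the reviewer's 'xxx.yyy.zzz' placeholder rather than a real metric name:

```java
/** Hypothetical interface used only to show the Javadoc shape. */
interface CommunicationMetricsSketch {
    /**
     * Gets sent messages count.
     *
     * @return Sent messages count.
     * @deprecated Use metric 'xxx.yyy.zzz' instead (placeholder name, to be replaced with the actual metric).
     */
    @Deprecated
    int getSentMessagesCount();
}
```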
There was a problem hiding this comment.
I need your advice about the comment. Are there any plans to drop JMX support for these values? I didn't think about it long enough. Maybe we should just remove the Deprecated annotation.
Anyway, specifying metric names here would be wrong: CommunicationSpi is an interface after all, and some implementations may want to use other metric names, or no metrics at all.
| } | ||
|
|
||
| /** */ | ||
| public static String receivedMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Let's move this method to TcpCommunicationMetricsListener.
As far as I can see, it is used only there.
| public static final String RECEIVED_MESSAGES_BY_NODE_ID_METRIC_DESC = "Total number of messages received by current node from the given node"; | ||
|
|
||
| /** */ | ||
| public static String sentMessagesByTypeMetricName(Short directType) { |
There was a problem hiding this comment.
Let's move this method to TcpCommunicationMetricsListener.
As far as I can see, it is used only there.
| } | ||
| }; | ||
| /** Sent bytes count metric.*/ | ||
| private final AtomicLongMetric sentBytesMetric; |
There was a problem hiding this comment.
Let's keep LongAdder as the internal storage for all metrics that were based on it previously.
You can use MetricRegistry#longAdderMetric for that.
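For context on why `LongAdder` suits counters like these: it is built for values that many threads update often and that are read rarely, such as statistics. The sketch below uses only `java.util.concurrent.atomic.LongAdder` from the JDK; the class and method names are made up for this example and do not come from the PR.

```java
import java.util.concurrent.atomic.LongAdder;

/** Sketch: a write-heavy counter updated from several threads via LongAdder. */
final class LongAdderSketch {
    /** Starts the given number of threads, each incrementing a shared counter, and returns the total. */
    static long countConcurrently(int threads, int incrementsPerThread) {
        LongAdder sentBytes = new LongAdder();
        Thread[] workers = new Thread[threads];

        for (int i = 0; i < threads; i++) {
            workers[i] = new Thread(() -> {
                for (int j = 0; j < incrementsPerThread; j++)
                    sentBytes.increment();
            });

            workers[i].start();
        }

        try {
            for (Thread worker : workers)
                worker.join();
        }
        catch (InterruptedException e) {
            Thread.currentThread().interrupt();

            throw new IllegalStateException("Interrupted while waiting for workers", e);
        }

        // sum() is exact here because all writers have finished.
        return sentBytes.sum();
    }

    public static void main(String[] args) {
        System.out.println(countConcurrently(4, 100_000));
    }
}
```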
|
|
||
| /** Sent bytes count.*/ | ||
| private final LongAdder sentBytesCnt = new LongAdder(); | ||
| /** */ |
There was a problem hiding this comment.
Please, add javadoc for this variable.
| /** All registered metrics. */ | ||
| private final Set<ThreadMetrics> allMetrics = Collections.newSetFromMap(new ConcurrentHashMap<>()); | ||
| /** */ | ||
| private final Function<Short, AtomicLongMetric> rcvdMsgsCntByTypeMetricFactory; |
There was a problem hiding this comment.
Please, add javadoc for this variable.
| @Override protected ThreadMetrics initialValue() { | ||
| ThreadMetrics metrics = new ThreadMetrics(); | ||
| /** */ | ||
| private final Function<Object, AtomicLongMetric> sentMsgsCntByConsistentIdMetricFactory; |
There was a problem hiding this comment.
Please, add javadoc for this variable.
| ConcurrentHashMap<UUID, AtomicLong> rcvdMsgsMetricsByNodeId = new ConcurrentHashMap<>(); | ||
|
|
||
| /** Sent messages count metrics grouped by message node consistent id. */ | ||
| ConcurrentHashMap<Object, AtomicLongMetric> sentMsgsMetricsByConsistentId = new ConcurrentHashMap<>(); |
There was a problem hiding this comment.
I'm not sure we should change the way we collect metrics in this PR.
Let's keep existing behaviour?
This also will simplify PR a lot.
There was a problem hiding this comment.
I changed that because the old implementation stored basically the same map in each thread separately. That's the obvious answer.
The other thing is that the old implementation had concurrency issues that no one cared about, such as broken value visibility or potential ConcurrentModificationExceptions. I wanted the new implementation to be correct.
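The shared-map approach described here can be sketched roughly as below. This is an illustration under assumptions, not the PR's implementation: the class and method names are hypothetical. A single `ConcurrentHashMap` holds one counter per node, and `computeIfAbsent` creates each counter atomically, so all threads see the same instance.

```java
import java.util.UUID;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.atomic.LongAdder;

/** Sketch: one shared, thread-safe map of per-node counters instead of a map per thread. */
final class SharedPerNodeCountersSketch {
    /** Sent messages count by node id. */
    private final ConcurrentHashMap<UUID, LongAdder> sentByNode = new ConcurrentHashMap<>();

    /** Records one sent message for the given node; safe to call from any thread. */
    void onMessageSent(UUID nodeId) {
        sentByNode.computeIfAbsent(nodeId, id -> new LongAdder()).increment();
    }

    /** Returns the number of messages sent to the given node, or zero if none were recorded. */
    long sentTo(UUID nodeId) {
        LongAdder cnt = sentByNode.get(nodeId);

        return cnt == null ? 0 : cnt.sum();
    }

    public static void main(String[] args) {
        SharedPerNodeCountersSketch metrics = new SharedPerNodeCountersSketch();
        UUID nodeId = UUID.randomUUID();

        metrics.onMessageSent(nodeId);
        metrics.onMessageSent(nodeId);

        System.out.println(metrics.sentTo(nodeId));
    }
}
```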
There was a problem hiding this comment.
You introduced metrics based on the node consistent id.
I think we should avoid introducing new metrics here.
The only goal of this ticket is to migrate existing metrics to the new framework.
Let's:
- remove the metrics based on consistent id;
- revert the changes in ConnectionKey and TcpCommunicationSpi that were made to support them.
@ibessonov Thanks for the PR update. Looks much better now!
| import org.apache.ignite.plugin.extensions.communication.MessageFormatter; | ||
| import org.jetbrains.annotations.Nullable; | ||
|
|
||
| import static org.apache.ignite.internal.util.nio.GridNioServer.SENT_BYTES_METRIC_DESC; |
|
|
||
| /** | ||
| * Resets metrics for this SPI instance. | ||
| * |
Everything else LGTM. Thanks!
@ibessonov As we discussed privately, we should resolve potential contention on We also should perform a yardstick check of these changes.
| private static synchronized MetricRegistry getOrCreateMetricRegistry(GridMetricManager mmgr, UUID nodeId) { | ||
| String regName = MetricUtils.metricName(COMMUNICATION_METRICS_GROUP_NAME, nodeId.toString()); | ||
|
|
||
| for (MetricRegistry mreg : mmgr) { |
There was a problem hiding this comment.
We can eliminate the iteration over all registries if we use GridMetricManager#addMetricRegistryCreationListener to implement the registry initialization logic:
    mmgr.addMetricRegistryCreationListener(mreg -> {
        if (!mreg.name().startsWith(COMMUNICATION_METRICS_GROUP_NAME))
            return;

        mreg.longAdderMetric(SENT_MESSAGES_BY_NODE_ID_METRIC_NAME, SENT_MESSAGES_BY_NODE_ID_METRIC_DESC);
        mreg.longAdderMetric(RECEIVED_MESSAGES_BY_NODE_ID_METRIC_NAME, RECEIVED_MESSAGES_BY_NODE_ID_METRIC_DESC);
    });
| res.put(typeName, ((LongMetric)metric).value()); | ||
| } | ||
| } | ||
| catch (NumberFormatException ignore) { |
There was a problem hiding this comment.
Let's print the warning to log here.
There was a problem hiding this comment.
I'll just remove it, it's excessive
|
|
||
| res.put(nodeId, mreg.<LongMetric>findMetric(metricName).value()); | ||
| } | ||
| catch (IllegalArgumentException ignore) { |
There was a problem hiding this comment.
We shouldn't swallow the exception here. Let's remove the empty catch block.
…cks; metrics registration moved to listeners.
LGTM. @ibessonov Thank you so much for the valuable contribution!