[7.x][ML] Gain upper bound estimation for classification and regression by valeriy42 · Pull Request #1568 · elastic/ml-cpp

valeriy42 · 2020-11-11T10:21:13Z

In this PR we start computing an upper bound on the potential gain from splitting a node. If this upper bound is lower than the smallest gain among all current candidates, we ignore the node and thereby avoid computations that are especially expensive on large datasets.

Since we only skip computing splits that would not be added to the tree anyway, this PR does not change the qualitative results.
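To illustrate the pruning idea described above, here is a minimal sketch. It is not the ml-cpp implementation. The names `Node`, `Result` and `selectCandidates` are hypothetical, the candidate set is assumed to have a fixed capacity, and the upper bound is assumed to be given and never smaller than the exact gain. The exact gain is stored in the struct only so the example stays small; in practice it is the expensive quantity being avoided.

```cpp
#include <algorithm>
#include <cstddef>
#include <functional>
#include <vector>

// Hypothetical summary of a node that might be split.
struct Node {
    double gainUpperBound; // cheap bound, assumed >= exactGain
    double exactGain;      // stands in for the expensive split search
};

struct Result {
    std::vector<double> keptGains;        // gains of retained candidates, descending
    std::size_t expensiveEvaluations = 0; // how often the exact gain was "computed"
};

// Keep the `capacity` best candidates by exact gain. When `useBound` is true,
// skip any node whose upper bound is below the smallest gain currently kept.
Result selectCandidates(const std::vector<Node>& nodes,
                        std::size_t capacity,
                        bool useBound) {
    Result result;
    if (capacity == 0) {
        return result;
    }
    std::vector<double>& heap = result.keptGains;
    const auto cmp = std::greater<double>{}; // min-heap: front() is the smallest gain

    for (const Node& node : nodes) {
        const bool full = heap.size() == capacity;
        if (useBound && full && node.gainUpperBound < heap.front()) {
            continue; // the node cannot displace any current candidate
        }
        ++result.expensiveEvaluations;
        const double gain = node.exactGain;
        if (!full) {
            heap.push_back(gain);
            std::push_heap(heap.begin(), heap.end(), cmp);
        } else if (gain > heap.front()) {
            std::pop_heap(heap.begin(), heap.end(), cmp); // smallest moves to back()
            heap.back() = gain;
            std::push_heap(heap.begin(), heap.end(), cmp);
        }
    }
    std::sort(heap.begin(), heap.end(), cmp); // report gains in descending order
    return result;
}
```

Calling `selectCandidates` twice on the same nodes, once with `useBound` set to `false` and once with `true`, should return the same `keptGains` while the second call performs fewer expensive evaluations, provided the bound really is an upper bound.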

At the moment, we can only compute the upper bound for regression and binary classification. For multiclass classification we proceed as before.

Note: this PR contains additional instrumentation to assess the performance improvement. I will remove this instrumentation in a follow-up PR after tests.

Backport of #1537


valeriy42 added >enhancement :ml backport v7.11.0 labels Nov 11, 2020

valeriy42 added 3 commits November 12, 2020 13:16

windows compilation error fixed

7510300

fix compiler issues

06e9677

fixing compiling issues

d29e86c

valeriy42 merged commit ba4e091 into elastic:7.x Nov 12, 2020

valeriy42 deleted the backport-pr1537 branch November 12, 2020 16:12

valeriy42 merged 4 commits into elastic:7.x from valeriy42:backport-pr1537
