[SPARK-15767][ML][SparkR] Decision Tree wrapper in SparkR#17981
zhengruifeng wants to merge 5 commits into
Conversation
Test build #76926 has finished for PR 17981 at commit
Test build #76927 has finished for PR 17981 at commit
Jenkins, please retest this please
Test build #76930 has finished for PR 17981 at commit
Test build #76938 has finished for PR 17981 at commit
@felixcheung I sent this PR following your implementation of
nit: remove double empty line
@felixcheung Updated. Thanks for reviewing!
Test build #76966 has finished for PR 17981 at commit
#' @param cacheNodeIds If FALSE, the algorithm will pass trees to executors to match instances with
#'                     nodes. If TRUE, the algorithm will cache node IDs for each instance. Caching
#'                     can speed up training of deeper trees. Users can set how often should the
#'                     cache be checkpointed or disable it by setting checkpointInterval.
This is kind of confusing: "Users can set how often should the cache be checkpointed"
wording can be improved a bit I guess but this matches the Scaladoc...
function(data, formula, type = c("regression", "classification"),
         maxDepth = 5, maxBins = 32, impurity = NULL, seed = NULL,
         minInstancesPerNode = 1, minInfoGain = 0.0, checkpointInterval = 10,
         maxMemoryInMB = 256, cacheNodeIds = FALSE) {
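The `type = c("regression", "classification")` default in the signature above is the usual R idiom for listing the allowed values of an argument; wrappers typically resolve it with `match.arg`. The following is a minimal sketch in plain R of how that idiom behaves. `resolve_type` is a hypothetical helper written for illustration, not code from this PR, and the PR's actual handling of `type` may differ.

```r
# Hypothetical helper, not part of SparkR: shows how a vector default
# such as type = c("regression", "classification") is usually resolved.
resolve_type <- function(type = c("regression", "classification")) {
  # match.arg() returns the first choice when the argument is missing
  # and raises an error for values outside the allowed set.
  match.arg(type)
}

# No argument supplied: the first listed choice is used.
default_type <- resolve_type()

# An explicit, valid choice is returned unchanged.
explicit_type <- resolve_type("classification")

# An unsupported value fails; the error is caught here so the script continues.
invalid <- tryCatch(resolve_type("clustering"), error = function(e) "error")
```

Under this idiom, a caller who omits `type` gets a regression tree, so classification has to be requested explicitly.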
consider adding thresholds parameter - possibly as a follow up PR.
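For context on what a thresholds parameter would do: as I understand Spark ML's probabilistic classifiers, the predicted class is the one with the largest ratio p/t, where p is the class probability and t is that class's threshold. The sketch below illustrates that rule in plain R. `predict_with_thresholds` is a hypothetical helper, not part of SparkR or this PR, and it assumes strictly positive thresholds as a simplification.

```r
# Hypothetical helper, not part of SparkR: picks the class with the
# largest probability-to-threshold ratio p / t.
predict_with_thresholds <- function(probs, thresholds) {
  # Simplifying assumption: one strictly positive threshold per class.
  stopifnot(length(probs) == length(thresholds), all(thresholds > 0))
  classes <- names(probs)
  classes[which.max(probs / thresholds)]
}

# Illustrative class probabilities for a single instance (made-up values).
probs <- c(setosa = 0.5, versicolor = 0.3, virginica = 0.2)

# Equal thresholds: the most probable class wins.
plain <- predict_with_thresholds(probs, c(1, 1, 1))

# Lowering the threshold for versicolor raises its ratio to 0.6,
# so it overtakes setosa (ratio 0.5).
adjusted <- predict_with_thresholds(probs, c(1, 0.5, 1))
```

Lowering a class's threshold therefore makes that class easier to predict without retraining the tree.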
Any more comments?
Merged to master. Thanks!
## What changes were proposed in this pull request?

support decision tree in R

## How was this patch tested?

added tests

Author: Zheng RuiFeng <ruifengz@foxmail.com>

Closes apache#17981 from zhengruifeng/dt_r.