Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Welcome To Ask or Share your Answers For Others


I have used sklearn.tree.DecisionTreeRegressor on a regression problem with two independent variables (the features "X" and "Y") and one predicted dependent variable, "Z". When I plot the tree, the leaves do not seem to differ much from those of a classification tree. The result at each leaf is not a function but a single value, just like in classification.

Can someone explain why this is called regression and how it differs from a classification tree?

Since I seem to have misunderstood the sklearn class: is there a tree package for Python that does a "real" regression and has a function as the output at each leaf? With X, Y and Z, this would probably be some kind of surface at each leaf.



1 Answer

This is to be expected. The output at each leaf is not a function; it is a single value, the predicted numeric (hence regression) output for all instances in that leaf. The output is a "function" in the sense that you get different values depending on which leaf you land in. A classification tree works exactly the same way, but the output value represents a class probability, not the predicted value of Z.
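To make the "single value per leaf" point concrete, here is a minimal sketch. It assumes the default squared-error criterion, under which the leaf value is the mean of the training targets that land in that leaf. The data, the variable names, and the choice of max_depth=3 are made up for illustration and are not from the question.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

rng = np.random.default_rng(0)

# Hypothetical data: two features (X, Y) and a numeric target Z.
features = rng.uniform(-1.0, 1.0, size=(200, 2))
z = features[:, 0] ** 2 + features[:, 1]

# A shallow tree (depth chosen arbitrarily) so there are only a few leaves.
reg = DecisionTreeRegressor(max_depth=3, random_state=0).fit(features, z)

leaf_ids = reg.apply(features)       # index of the leaf each training row lands in
predictions = reg.predict(features)  # one constant value per leaf

# Mean of the training targets inside each leaf.
leaf_means = {leaf: z[leaf_ids == leaf].mean() for leaf in np.unique(leaf_ids)}
expected = np.array([leaf_means[leaf] for leaf in leaf_ids])

# With the default criterion, the prediction is that per-leaf mean.
matches_leaf_means = np.allclose(predictions, expected)
print(len(leaf_means), "leaves; predictions equal leaf means:", matches_leaf_means)
```

A DecisionTreeClassifier fitted on class labels would store class proportions at each leaf instead, which is the only real difference in what the leaves hold.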

In other words, a regression model outputs a function that maps inputs to arbitrary numeric values, but there is no rule that this function must be continuous. With trees, the function is piecewise constant, more of a "stair step".
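The stair-step shape is easy to see on a toy problem. The sketch below uses a single feature instead of the question's X and Y, purely to keep it short; the sine target, the grid, and max_depth=2 are illustrative assumptions. It counts how many distinct values the fitted tree can output over a dense grid of inputs.

```python
import numpy as np
from sklearn.tree import DecisionTreeRegressor

# Hypothetical 1-D problem: a smooth target sampled on [0, 6].
x = np.linspace(0.0, 6.0, 200).reshape(-1, 1)
z = np.sin(x).ravel()

# A depth-2 tree has at most 4 leaves.
step_model = DecisionTreeRegressor(max_depth=2, random_state=0).fit(x, z)

# Evaluate on a much denser grid than the training data.
grid = np.linspace(0.0, 6.0, 1000).reshape(-1, 1)
grid_pred = step_model.predict(grid)

n_levels = len(np.unique(grid_pred))   # distinct output values over the grid
n_leaves = step_model.get_n_leaves()   # number of leaves in the fitted tree

print(n_levels, "distinct predictions from", n_leaves, "leaves")
```

However fine the grid, the number of distinct predictions never exceeds the number of leaves: the prediction is flat inside each leaf's region and jumps at the split thresholds. A deeper tree just gives more, narrower steps.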

