Note: Most tree-growing methods favor the greatest impurity reduction near the root node. Ex.
Assigning a category to a leaf node is easy:
If the sample data at the leaf is pure
=> assign that class to the leaf.
else
=> assign the most frequent class.
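The leaf rule above can be sketched in a few lines of Python. This is only an illustration: the function name leaf_label is hypothetical, and the tie-breaking behavior (the class seen first wins when counts are equal) is an assumption, since the notes do not say how ties are resolved.

```python
from collections import Counter

def leaf_label(labels):
    """Return the class to assign to a leaf holding these training labels."""
    if not labels:
        raise ValueError("a leaf must contain at least one sample")
    counts = Counter(labels)
    if len(counts) == 1:
        # Pure leaf: every sample has the same class.
        return labels[0]
    # Impure leaf: take the most frequent class.
    # Assumption: on a tie, the class encountered first wins.
    return counts.most_common(1)[0][0]

pure = leaf_label(["w1", "w1", "w1"])
mixed = leaf_label(["w1", "w2", "w2"])
```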
Note: The problem of building a decision tree is "ill-conditioned", i.e. small changes in the training data can yield large variations in the decision rules obtained.
Ex. p.405 (D&H): A small move of a single sample can change the decision rules a lot.
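The sensitivity described above can be seen on a toy problem. The sketch below is not the D&H example: the data are made up, Gini impurity is assumed as the impurity measure, and best_split is a hypothetical helper that searches axis-parallel thresholds for the root test only.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels."""
    n = len(labels)
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(points, labels):
    """Return (impurity, feature index, threshold) of the best root test."""
    best = None
    n = len(points)
    for f in range(len(points[0])):
        values = sorted(set(p[f] for p in points))
        # Candidate thresholds: midpoints between consecutive distinct values.
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2
            left = [y for p, y in zip(points, labels) if p[f] <= t]
            right = [y for p, y in zip(points, labels) if p[f] > t]
            imp = (len(left) * gini(left) + len(right) * gini(right)) / n
            if best is None or imp < best[0]:
                best = (imp, f, t)
    return best

# Made-up data: four samples of class 0, four of class 1.
labels = [0, 0, 0, 0, 1, 1, 1, 1]
others = [(5, 7), (7, 5), (8, 8), (6, 6)]
original = [(1, 2), (2, 1), (3, 3), (4.75, 5.25)] + others
# Same data, but the fourth sample is moved by 0.5 in each coordinate.
moved = [(1, 2), (2, 1), (3, 3), (5.25, 4.75)] + others

before = best_split(original, labels)
after = best_split(moved, labels)
print("root test before move: feature", before[1], "threshold", before[2])
print("root test after move:  feature", after[1], "threshold", after[2])
```

In this toy data a single sample moves by 0.5 in each coordinate, yet the root test switches from the first feature to the second, so every rule below the root changes as well.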
Reference about clustering "Data clustering, a review" A. K. Jain, M. N.