Decision Trees – Part 6

Similar to Information Gain, alternative measures can be used to measure the impurity of a node and thus play a role in selecting the attribute used to split a node into sub nodes (branches) or leaves.

GINI: Like the entropy used for Information Gain, GINI measures the impurity of a node. It is an alternative that can be used in place of Information Gain. For example, CART (Classification and Regression Trees) uses the GINI index for splitting decision tree nodes.

If a node (N) with “s” total data elements contains “s_i” data elements of class “i”, then

$$GINI(N) = 1 - \sum_{i} \left( \frac{s_i}{s} \right)^2$$
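As a minimal sketch of this computation, assuming the usual definition of GINI impurity (one minus the sum of squared class proportions), the function below computes it from a list of class labels. The name `gini_impurity` and the sample labels are illustrative only, and treating an empty node as pure is an assumption.

```python
from collections import Counter


def gini_impurity(labels):
    """Return 1 - sum((s_i / s)^2) over the classes present in labels."""
    s = len(labels)
    if s == 0:
        return 0.0  # assumption: an empty node is treated as pure
    counts = Counter(labels)  # s_i for each class i
    return 1.0 - sum((s_i / s) ** 2 for s_i in counts.values())


# Illustrative node: 1 element of class "yes", 9 of class "no"
mixed = gini_impurity(["yes"] * 1 + ["no"] * 9)
pure = gini_impurity(["no"] * 10)
print(round(mixed, 2), round(pure, 2))
```

A node holding a single class gives 0, and the value grows as the classes become more evenly mixed.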

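Similar to entropy, the GINI impurity index is 0 for a pure node and largest when the classes are evenly mixed. For a node with two classes, its values range from 0 to 0.5. A graph of the GINI index values is below.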

[Figure: graph of GINI index values]

For two classes, entropy values range from 0 to 1, while GINI index values range from 0 to 0.5. Additionally, because GINI takes the number of classes (and their counts) into account, it may not need a separate correction such as the “Gain Ratio” used with Information Gain.

The comparison below shows Entropy and GINI values for different splits of Hired and Not Hired among 10 total candidates.

Hired   Not Hired   Total   Entropy    GINI
0       10          10      0          0
1       9           10      0.468996   0.18
2       8           10      0.721928   0.32
3       7           10      0.881291   0.42
4       6           10      0.970951   0.48
5       5           10      1          0.5
6       4           10      0.970951   0.48
7       3           10      0.881291   0.42
8       2           10      0.721928   0.32
9       1           10      0.468996   0.18
10      0           10      0          0
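The values in this comparison can be reproduced with a short script. This is a sketch assuming base-2 entropy and the standard two-class GINI formula. The function names are illustrative.

```python
import math


def entropy(hired, not_hired):
    """Two-class entropy in bits; a class with zero count contributes 0."""
    total = hired + not_hired
    result = 0.0
    for count in (hired, not_hired):
        if count > 0:
            p = count / total
            result -= p * math.log2(p)
    return result


def gini(hired, not_hired):
    """Two-class GINI impurity: 1 - p_hired^2 - p_not_hired^2."""
    total = hired + not_hired
    return 1.0 - (hired / total) ** 2 - (not_hired / total) ** 2


# One row per split of 10 candidates: (hired, not hired, entropy, GINI)
rows = [(h, 10 - h, entropy(h, 10 - h), gini(h, 10 - h)) for h in range(11)]
for hired, not_hired, e, g in rows:
    print(f"{hired:>5} {not_hired:>9} {e:>9.6f} {g:>5.2f}")
```

Both measures are symmetric around the 5 / 5 split, where each reaches its maximum.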

As when computing “Information Gain” for a split on attribute “A” at a node, GINI impurity is first computed at the parent node. The attribute used to split the node is then selected based on the reduction in impurity. If a node (N) with “t” total elements is split into “k” sub nodes N_1, ..., N_k, with sub node N_i containing “t_i” elements, the aggregated GINI impurity is

$$GINI_{split}(N) = \sum_{i=1}^{k} \frac{t_i}{t} \, GINI(N_i)$$
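A minimal sketch of the aggregated impurity, assuming it is the size-weighted average of the sub node impurities. The class counts in `parent` and `split` are hypothetical and are not taken from the article's data.

```python
def gini(counts):
    """GINI impurity of a node given its per-class element counts."""
    total = sum(counts)
    if total == 0:
        return 0.0  # assumption: an empty node is treated as pure
    return 1.0 - sum((c / total) ** 2 for c in counts)


def aggregated_gini(sub_nodes):
    """Weighted sum of sub node impurities, each weighted by t_i / t."""
    t = sum(sum(counts) for counts in sub_nodes)
    return sum((sum(counts) / t) * gini(counts) for counts in sub_nodes)


# Hypothetical example: a parent with 5 Hired / 5 Not Hired,
# split by some attribute into two sub nodes of 5 elements each.
parent = [5, 5]
split = [[4, 1], [1, 4]]

reduction = gini(parent) - aggregated_gini(split)
print(round(gini(parent), 2), round(aggregated_gini(split), 2), round(reduction, 2))
```

Among candidate attributes, the one whose split gives the largest reduction (equivalently, the lowest aggregated GINI) would be chosen.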

As in Decision Trees built with “Information Gain”, when the GINI impurity index is used, attributes that split the parent node into larger sub nodes (containing more values) with higher purity are preferred.

Next is measuring impurity using “Misclassification Error”.
