Machine Learning Interview Questions: Decision Tree (Set-2)

Decision Trees are powerful Machine Learning models that can be used for both classification and regression tasks. This makes them one of the most important topics for a Machine Learning interview, and a good grasp of Decision Trees is essential for anyone aspiring to the role of Machine Learning Engineer or Data Scientist. In a previous article, I discussed some of the Decision Tree questions asked during Machine Learning interviews. This article is a follow-up to that one, and I will cover some more Machine Learning interview questions related to Decision Trees.

Machine Learning Interview Questions on Decision Tree. Image for representation purpose only.

When creating a Decision Tree from the Training Data, how is the attribute decided for splitting a non-leaf node?

When growing a Decision Tree, the usefulness of splitting on each attribute is calculated, and the attribute that gives the most useful split is selected for splitting the non-leaf node. Various quantitative measures exist to determine the usefulness of a split. The most commonly used ones are the decrease in Gini Impurity and Information Gain (i.e., the decrease in Entropy).
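The selection step can be sketched in a few lines of Python. This is only a minimal illustration, not how any particular library does it: the function names (`gini`, `impurity_decrease`, `best_attribute`) and the toy data are hypothetical, the attributes are assumed to be categorical, and each attribute value gets its own child node (practical implementations such as CART typically use binary splits instead). The usefulness measure here is the decrease in Gini Impurity, which is defined in the next question.

```python
from collections import Counter

def gini(labels):
    """Gini Impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())

def impurity_decrease(rows, labels, attribute):
    """Parent impurity minus the size-weighted impurity of the child nodes
    obtained by grouping the samples on each value of `attribute`."""
    groups = {}
    for row, label in zip(rows, labels):
        groups.setdefault(row[attribute], []).append(label)
    n = len(labels)
    weighted_child_impurity = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(labels) - weighted_child_impurity

def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split gives the largest impurity decrease."""
    return max(attributes, key=lambda a: impurity_decrease(rows, labels, a))

# Hypothetical toy data: 'outlook' separates the classes perfectly, 'windy' does not.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_attribute(rows, labels, ["outlook", "windy"]))  # outlook
```

Splitting on `outlook` produces two pure child nodes (a decrease of 0.5), while splitting on `windy` leaves both children as mixed as the parent (a decrease of 0), so `outlook` is chosen.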

How is Gini Impurity for a Node calculated?

Gini Impurity for a node is calculated by subtracting from 1 the sum of the squares of the class proportions at that node, where a class proportion is the fraction of the node's samples that belong to that class.

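Mathematically, the Gini Impurity of a node is calculated with the help of the following formula:

$$\text{Gini} = 1 - \sum_{i=1}^{C} p_i^2$$

where $C$ is the number of classes present at the node and $p_i$ is the fraction of the node's samples that belong to class $i$.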

For example, if a node has 50 samples, of which 10 belong to one class (say class A) and 40 belong to another class (say class B), then the Gini Impurity of the node is calculated as:

$$\text{Gini} = 1 - \left[\left(\tfrac{10}{50}\right)^2 + \left(\tfrac{40}{50}\right)^2\right] = 1 - (0.04 + 0.64) = 0.32$$
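The same calculation can be checked with a short Python sketch. The function name `gini_impurity` and its input format (a list of per-class sample counts) are my own choices for illustration, not part of any library.

```python
def gini_impurity(class_counts):
    """Gini Impurity of a node, given the number of samples in each class."""
    total = sum(class_counts)
    # 1 minus the sum of squared class proportions
    return 1.0 - sum((count / total) ** 2 for count in class_counts)

# The example node: 10 samples of class A and 40 samples of class B.
print(round(gini_impurity([10, 40]), 4))  # 0.32
```

A node containing a single class has a Gini Impurity of 0, and a two-class node split evenly (for example 25 and 25) has the two-class maximum of 0.5.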

How is Entropy calculated?

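Mathematically, Entropy for a node is calculated using the following formula:

$$\text{Entropy} = -\sum_{i=1}^{C} p_i \log_2 p_i$$

where $C$ is the number of classes present at the node and $p_i$ is the fraction of the node's samples that belong to class $i$. The logarithm is conventionally taken to base 2, which gives the Entropy in bits.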

For example, if a node has 50 samples, of which 10 belong to one class (say class A) and 40 belong to another class (say class B), then the Entropy of the node, using base-2 logarithms, is calculated as:

$$\text{Entropy} = -\left(\tfrac{10}{50}\log_2\tfrac{10}{50} + \tfrac{40}{50}\log_2\tfrac{40}{50}\right) \approx 0.464 + 0.258 \approx 0.722$$
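Here is a minimal Python sketch of the Entropy calculation for the same node. The function name `entropy` and the `base` parameter are illustrative assumptions. The sketch defaults to base-2 logarithms, which is the usual convention for Decision Trees, but a different base only rescales the result.

```python
import math

def entropy(class_counts, base=2):
    """Entropy of a node, given the number of samples in each class."""
    total = sum(class_counts)
    result = 0.0
    for count in class_counts:
        if count > 0:  # an empty class contributes nothing (0 * log 0 is taken as 0)
            p = count / total
            result -= p * math.log(p, base)
    return result

# The example node: 10 samples of class A and 40 samples of class B.
print(round(entropy([10, 40]), 4))  # 0.7219
```

With natural logarithms (`base=math.e`) the same node gives approximately 0.5004, so it is worth stating the base when quoting an Entropy value in an interview.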