Machine Learning Interview Questions: Decision Tree (Set 2)

Decision Trees are powerful Machine Learning models that can be used for both classification and regression tasks. This makes them one of the most important topics in a Machine Learning interview, and a good grasp of Decision Trees is essential for anyone aspiring to the role of Machine Learning Engineer or Data Scientist. In a previous article, I discussed some interview questions that focussed solely on Decision Trees. This article is a follow-up to that one and covers some further Machine Learning interview questions related to Decision Trees.


When creating a Decision Tree from the Training Data, how is the attribute decided for splitting a non-leaf node?

When growing a Decision Tree, the usefulness of a split is evaluated for every candidate attribute, and the attribute that gives the most useful split is selected for splitting the non-leaf node. Several quantitative measures of usefulness exist. The most commonly used are the decrease in Gini Impurity and Information Gain (i.e., the decrease in Entropy).
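The selection step described above can be sketched in a few lines of Python. This is a minimal illustration, not a full tree-growing algorithm: it assumes categorical attributes stored as dictionaries, uses the decrease in Gini Impurity as the usefulness measure, and the names `gini`, `gini_decrease`, `best_attribute` and the toy data are hypothetical.

```python
from collections import Counter


def gini(labels):
    """Gini Impurity of a list of class labels."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((count / n) ** 2 for count in Counter(labels).values())


def gini_decrease(rows, labels, attribute):
    """Parent impurity minus the size-weighted impurity of the child nodes."""
    groups = {}
    for row, label in zip(rows, labels):
        # One child node per distinct value of the attribute.
        groups.setdefault(row[attribute], []).append(label)
    n = len(labels)
    weighted_children = sum(len(g) / n * gini(g) for g in groups.values())
    return gini(labels) - weighted_children


def best_attribute(rows, labels, attributes):
    """Pick the attribute whose split reduces Gini Impurity the most."""
    return max(attributes, key=lambda a: gini_decrease(rows, labels, a))


# Toy data (hypothetical): "outlook" separates the classes perfectly.
rows = [
    {"outlook": "sunny", "windy": "no"},
    {"outlook": "sunny", "windy": "yes"},
    {"outlook": "rainy", "windy": "no"},
    {"outlook": "rainy", "windy": "yes"},
]
labels = ["play", "play", "stay", "stay"]

print(best_attribute(rows, labels, ["outlook", "windy"]))  # outlook
```

Here splitting on `outlook` gives two pure child nodes (a decrease of 0.5), while splitting on `windy` leaves both children as mixed as the parent (a decrease of 0), so `outlook` is chosen.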

How is Gini Impurity for a Node calculated?

The Gini Impurity of a node is 1 minus the sum of the squared proportions of each class present at that node.

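Mathematically, the Gini Impurity of a node is given by the following formula:

$$\text{Gini} = 1 - \sum_{i=1}^{C} p_i^2$$

where $C$ is the number of classes and $p_i$ is the fraction of samples at the node that belong to class $i$.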

For example, if a node contains 50 samples, of which 10 belong to class A and 40 belong to class B, then the Gini Impurity of the node is:

$$\text{Gini} = 1 - \left[\left(\tfrac{10}{50}\right)^2 + \left(\tfrac{40}{50}\right)^2\right] = 1 - (0.04 + 0.64) = 0.32$$
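The same calculation can be checked with a short Python sketch. The function name `gini_impurity` and its list-of-counts input are assumptions made for illustration only.

```python
def gini_impurity(class_counts):
    """Gini Impurity of a node, given the number of samples in each class."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("a node must contain at least one sample")
    # 1 minus the sum of squared class proportions.
    return 1.0 - sum((count / total) ** 2 for count in class_counts)


# 10 samples of class A and 40 samples of class B, as in the example above.
print(round(gini_impurity([10, 40]), 4))  # 0.32
```

A pure node (all samples in one class) gives 0, and a two-class node split 50/50 gives the two-class maximum of 0.5.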

How is Entropy calculated?

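Mathematically, the Entropy of a node is calculated using the following formula (written here with base-2 logarithms, the usual convention, which measures Entropy in bits):

$$H = -\sum_{i=1}^{C} p_i \log_2 p_i$$

where $C$ is the number of classes and $p_i$ is the fraction of samples at the node that belong to class $i$. Classes with $p_i = 0$ contribute nothing to the sum.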

For example, if a node contains 50 samples, of which 10 belong to class A and 40 belong to class B, then the Entropy of the node (using base-2 logarithms) is:

$$H = -\left(\tfrac{10}{50}\log_2\tfrac{10}{50} + \tfrac{40}{50}\log_2\tfrac{40}{50}\right) = -\left(0.2 \times (-2.3219) + 0.8 \times (-0.3219)\right) \approx 0.722$$
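As a quick check of this arithmetic, here is a small Python sketch. It assumes base-2 logarithms, and the function name `entropy` and its list-of-counts input are hypothetical choices for illustration.

```python
import math


def entropy(class_counts):
    """Entropy (in bits) of a node, given the number of samples in each class."""
    total = sum(class_counts)
    if total == 0:
        raise ValueError("a node must contain at least one sample")
    # Skip empty classes: they contribute nothing, and log2(0) is undefined.
    proportions = [count / total for count in class_counts if count > 0]
    return -sum(p * math.log2(p) for p in proportions)


# 10 samples of class A and 40 samples of class B, as in the example above.
print(round(entropy([10, 40]), 4))  # 0.7219
```

With two classes, Entropy ranges from 0 for a pure node to 1 bit for a 50/50 split, so 0.722 indicates a node that is still fairly mixed.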