0. Introduction

Decision Trees are among the most intuitive machine learning models. They work by splitting data step by step according to feature values until the model reaches a final prediction. In MATLAB, the main functions are fitctree for classification trees and fitrtree for regression trees. MATLAB also supports tree workflows in Classification Learner and Regression Learner.

This tutorial explains what decision trees are, how they work, when to use them, and how to implement them in MATLAB with clear examples for both classification and regression. The MATLAB-specific parts below follow the current MathWorks documentation.

1. What is a Decision Tree?

A decision tree is a model that predicts an output by following a sequence of decisions from the root of the tree down to a leaf. In a classification tree, the leaf contains a class label. In a regression tree, the leaf contains a numeric prediction. MathWorks describes decision trees this way and notes that classification trees return nominal responses while regression trees return numeric responses.
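Conceptually, a trained tree is nothing more than a nested sequence of threshold tests. The fragment below is purely illustrative, with made-up thresholds, but a tree learned by fitctree encodes the same kind of logic:

% Purely illustrative: a two-level decision path written out by hand.
% The thresholds 4.5 and 4.0 are invented; fitctree learns its own from data.
x1 = 6.5;
x2 = 7.0;

if x1 < 4.5
    label = 1;        % leaf: class 1
elseif x2 < 4.0
    label = 1;        % leaf: class 1
else
    label = 2;        % leaf: class 2
end

fprintf('Predicted class = %d\n', label);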

In simple terms: the tree asks a series of questions about the feature values, and each answer narrows the possibilities until a single prediction remains at a leaf.

2. Why use Decision Trees?

Decision Trees are popular because they are easy to interpret, fast to train and predict with, able to handle both numeric and categorical predictors, and usable without feature scaling or other heavy preprocessing.

They are especially attractive when interpretability matters, because you can inspect the sequence of splits that leads to a prediction. MATLAB provides direct tree-viewing functions, including text and graph views.

3. Main concepts behind Decision Trees

3.1 Root node

The root node is the first split of the tree. It divides the data into two branches based on one predictor.

3.2 Branch nodes

These are internal decision points where another split happens.

3.3 Leaf nodes

Leaves are terminal nodes. In classification, they contain the class decision. In regression, they contain a numeric response.

3.4 Splitting

By default, fitctree and fitrtree use the standard CART algorithm. MathWorks explains that CART starts with all the input data, examines possible binary splits on every predictor, and chooses the split that best satisfies the optimization criterion.
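To build intuition for the optimization criterion, the sketch below scores one candidate binary split by its weighted Gini impurity. It illustrates the idea behind CART; it is not MATLAB's internal implementation:

% Illustrative sketch: score the candidate split "x < 4.5" by weighted Gini impurity.
x = [1; 2; 2; 3; 6; 7; 8; 7];
Y = [1; 1; 1; 1; 2; 2; 2; 2];

gini = @(labels) 1 - sum((histcounts(labels, [0.5 1.5 2.5]) ./ numel(labels)).^2);

left  = Y(x <  4.5);
right = Y(x >= 4.5);

% Weighted impurity of the two children; lower is better.
score = numel(left)/numel(Y) * gini(left) + numel(right)/numel(Y) * gini(right);
fprintf('Weighted Gini impurity = %.4f\n', score);  % 0 here: the split is pure

CART would repeat this scoring for every candidate threshold on every predictor and keep the best split.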

3.5 Tree depth and complexity

Tree complexity can be controlled with options such as MaxNumSplits (the maximum number of branch-node splits), MinLeafSize (the minimum number of observations in each leaf), and MinParentSize (the minimum number of observations a node must contain before it can be split).

MathWorks documents these as key parameters for controlling depth: larger MaxNumSplits or smaller MinLeafSize generally produce deeper trees.

4. MATLAB functions and tools you need to know

The main MATLAB functions and tools for decision trees are fitctree (train a classification tree), fitrtree (train a regression tree), predict (generate predictions), loss (evaluate classification error), view (inspect the tree as text or as a graph), and the Classification Learner and Regression Learner apps.

Part I — Classification Trees in MATLAB

5. First classification tree example

Let us begin with a simple binary classification dataset.

clc;
clear;
close all;

% Example data
X = [1 2;
     2 3;
     2 1;
     3 2;
     6 7;
     7 8;
     8 7;
     7 6];

Y = [1; 1; 1; 1; 2; 2; 2; 2];

% Train classification tree
Mdl = fitctree(X, Y);

% Predict on training data
YPred = predict(Mdl, X);

disp('Predicted labels:');
disp(YPred);

fitctree(X,Y) returns a fitted binary classification decision tree based on the predictors in X and the class labels in Y.

6. Visualizing the tree

One of the strengths of decision trees is that you can view their structure.

view(Mdl)
view(Mdl, 'Mode', 'graph')

MathWorks documents two viewing modes: text output with view(tree) and a graphical tree with view(tree,'mode','graph').

7. Visualizing the training data

gscatter(X(:,1), X(:,2), Y, 'rb', 'ox');
xlabel('Feature 1');
ylabel('Feature 2');
title('Training Data');
grid on;

This helps you see how the classes are distributed before training the tree.

8. Train/test split for classification

A machine learning model should be evaluated on unseen data.

clc;
clear;
close all;

% Dataset
X = [1 2;
     2 3;
     2 1;
     3 2;
     6 7;
     7 8;
     8 7;
     7 6;
     1.5 2.5;
     6.5 7.5];

Y = [1;1;1;1;2;2;2;2;1;2];

% Split
rng(1);
cv = cvpartition(Y, 'HoldOut', 0.3);

XTrain = X(training(cv), :);
YTrain = Y(training(cv), :);

XTest = X(test(cv), :);
YTest = Y(test(cv), :);

% Train tree
Mdl = fitctree(XTrain, YTrain);

% Predict on test set
YPred = predict(Mdl, XTest);

% Accuracy
accuracy = mean(YPred == YTest) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);

This is a standard classification workflow.

9. Confusion matrix

A confusion matrix is a useful way to inspect classification performance.

cm = confusionmat(YTest, YPred);
disp('Confusion Matrix:');
disp(cm);

confusionchart(YTest, YPred);
title('Confusion Matrix');

This shows how many examples were correctly or incorrectly classified.

10. Classification loss

MATLAB provides loss for classification trees.

clc;
clear;
close all;

X = [1 2;
     2 3;
     2 1;
     3 2;
     6 7;
     7 8;
     8 7;
     7 6];

Y = [1;1;1;1;2;2;2;2];

Mdl = fitctree(X, Y);

L = loss(Mdl, X, Y);

fprintf('Classification Loss = %.4f\n', L);
fprintf('Approximate Accuracy = %.2f%%\n', (1 - L) * 100);

MathWorks documents loss(tree,...) for trained classification tree models and notes that better classifiers generally yield smaller loss values.

Part II — Controlling Tree Complexity

11. Why tree complexity matters

A tree that is too deep can memorize the training data and overfit. A tree that is too shallow can underfit and miss important patterns.

MathWorks documents MaxNumSplits and MinLeafSize as important arguments that control depth and leaf size.

12. Example with MaxNumSplits

clc;
clear;
close all;

load fisheriris

X = meas;
Y = species;

% Smaller tree
Mdl1 = fitctree(X, Y, 'MaxNumSplits', 2);

% Larger tree
Mdl2 = fitctree(X, Y, 'MaxNumSplits', 20);

disp('Small tree:');
view(Mdl1);

disp('Large tree:');
view(Mdl2);

This illustrates how limiting the number of splits changes tree size.
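Tree size alone does not tell you which setting generalizes better. A minimal sketch that compares the two settings with 10-fold cross-validated loss:

% Compare the two complexity settings with cross-validation.
load fisheriris
rng(1);

Mdl1 = fitctree(meas, species, 'MaxNumSplits', 2);
Mdl2 = fitctree(meas, species, 'MaxNumSplits', 20);

loss1 = kfoldLoss(crossval(Mdl1));
loss2 = kfoldLoss(crossval(Mdl2));

fprintf('CV loss, MaxNumSplits =  2: %.4f\n', loss1);
fprintf('CV loss, MaxNumSplits = 20: %.4f\n', loss2);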

13. Example with MinLeafSize

clc;
clear;
close all;

load fisheriris

X = meas;
Y = species;

Mdl = fitctree(X, Y, 'MinLeafSize', 10);

YPred = predict(Mdl, X);

accuracy = mean(strcmp(YPred, Y)) * 100;
fprintf('Training Accuracy = %.2f%%\n', accuracy);

Larger leaf sizes tend to produce simpler trees.

Part III — Multiclass Classification with Decision Trees

14. Decision trees can do multiclass classification

In MATLAB, fitctree returns a binary decision tree structure that supports multiclass classification. The class object itself is described as a “binary decision tree for multiclass classification.”

15. Example with the iris dataset

clc;
clear;
close all;

load fisheriris

X = meas;
Y = species;

% Train classification tree
Mdl = fitctree(X, Y);

% Predict
YPred = predict(Mdl, X);

% Accuracy
accuracy = mean(strcmp(YPred, Y)) * 100;
fprintf('Training Accuracy = %.2f%%\n', accuracy);

% Confusion chart
confusionchart(Y, YPred);
title('Iris Classification Tree');

This is a classic multiclass example.

16. Train/test split with iris

clc;
clear;
close all;

load fisheriris

X = meas;
Y = species;

rng(2);
cv = cvpartition(Y, 'HoldOut', 0.3);

XTrain = X(training(cv), :);
YTrain = Y(training(cv), :);

XTest = X(test(cv), :);
YTest = Y(test(cv), :);

Mdl = fitctree(XTrain, YTrain);

YPred = predict(Mdl, XTest);

accuracy = mean(strcmp(YPred, YTest)) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);

confusionchart(YTest, YPred);
title('Iris Test Results');

This gives a better estimate of real-world performance than evaluating only on the training set.

Part IV — Regression Trees in MATLAB

17. What is a regression tree?

A regression tree predicts a numeric output instead of a class label. In MATLAB, you use fitrtree to train it. MathWorks states that fitrtree returns a regression tree with binary splits and that the resulting RegressionTree object predicts numeric responses.

18. Simple regression tree example

clc;
clear;
close all;

% Example regression data
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];

% Train regression tree
Mdl = fitrtree(X, Y);

% Predict on training data
YPred = predict(Mdl, X);

disp(table(X, Y, YPred));

fitrtree(X,Y) trains a regression tree from predictors X and numeric response Y.

19. Plotting regression tree predictions

clc;
clear;
close all;

X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];

Mdl = fitrtree(X, Y);
YPred = predict(Mdl, X);

plot(X, Y, 'o', 'MarkerSize', 8, 'LineWidth', 1.5);
hold on;
plot(X, YPred, '-s', 'LineWidth', 1.5);
xlabel('X');
ylabel('Y');
title('Regression Tree');
legend('Original Data', 'Tree Predictions');
grid on;

Regression tree predictions are piecewise constant, because each leaf outputs a single value.
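Predicting on a dense grid makes the step structure easy to see, because every input that falls into the same leaf receives exactly the same prediction:

% Evaluate the tree on a fine grid to expose the piecewise-constant steps.
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];

Mdl = fitrtree(X, Y);

xGrid = (0.5:0.05:10.5)';
yGrid = predict(Mdl, xGrid);

plot(X, Y, 'o', xGrid, yGrid, '-');
xlabel('X');
ylabel('Prediction');
title('Piecewise-Constant Tree Predictions');
grid on;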

20. Regression tree evaluation

For regression, common metrics include the mean absolute error (MAE), the mean squared error (MSE), and the root mean squared error (RMSE):

clc;
clear;
close all;

X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];

Mdl = fitrtree(X, Y);
YPred = predict(Mdl, X);

MAE = mean(abs(Y - YPred));
MSE = mean((Y - YPred).^2);
RMSE = sqrt(MSE);

fprintf('MAE  = %.4f\n', MAE);
fprintf('MSE  = %.4f\n', MSE);
fprintf('RMSE = %.4f\n', RMSE);

21. Real regression tree example

clc;
clear;
close all;

load carsmall

tbl = table(Horsepower, Weight, MPG);
tbl = rmmissing(tbl);

% Train regression tree
Mdl = fitrtree(tbl, 'MPG ~ Horsepower + Weight');

% Predict
YPred = predict(Mdl, tbl(:, {'Horsepower','Weight'}));

% RMSE
RMSE = sqrt(mean((tbl.MPG - YPred).^2));
fprintf('RMSE = %.4f\n', RMSE);

% Plot actual vs predicted
plot(tbl.MPG, YPred, 'o');
xlabel('Actual MPG');
ylabel('Predicted MPG');
title('Regression Tree on carsmall');
grid on;

MATLAB documents formula-based usage for fitrtree with tables.

Part V — Viewing and Interpreting Trees

22. Why trees are easy to interpret

A big advantage of decision trees is interpretability. You can inspect the exact split conditions and follow the path to a prediction. MATLAB supports both text and graph views of trees.

Example:

load fisheriris
ctree = fitctree(meas, species);

view(ctree)
view(ctree, 'Mode', 'graph')


This makes decision trees attractive for teaching and for applications where explanation matters.

Part VI — Classification Learner and Regression Learner

23. Classification Learner for decision trees

MATLAB’s Classification Learner app lets you create and compare classification trees, then export trained models to predict new data. MathWorks has a dedicated example titled “Train Decision Trees Using Classification Learner App.”

Open it with:

classificationLearner

Typical workflow:

  1. import your data,
  2. choose the response variable,
  3. select decision tree models,
  4. train and compare them,
  5. export the best model.

MathWorks also notes that the Models gallery includes several preset decision tree options that are starting points for different problems.

24. Regression Learner for regression trees

MATLAB’s Regression Learner app supports regression tree workflows. MathWorks states that to interactively grow a regression tree, you can use Regression Learner, while fitrtree gives more command-line flexibility.

Open it with:

regressionLearner

Typical workflow:

  1. import your data,
  2. choose the numeric target,
  3. train regression trees,
  4. compare performance,
  5. export the best model.

Part VII — End-to-End Mini Project

25. Classification project: predict pass or fail

clc;
clear;
close all;

% Example student data
StudyHours = [1;2;2;3;4;5;5;6;7;8];
Attendance = [50;55;60;65;70;75;80;85;90;95];
Pass = categorical([0;0;0;0;0;1;1;1;1;1]);

tbl = table(StudyHours, Attendance, Pass);

% Train/test split
rng(2);
cv = cvpartition(tbl.Pass, 'HoldOut', 0.3);

trainTbl = tbl(training(cv), :);
testTbl  = tbl(test(cv), :);

% Train classification tree
Mdl = fitctree(trainTbl, 'Pass ~ StudyHours + Attendance');

% Predict
YPred = predict(Mdl, testTbl(:, {'StudyHours','Attendance'}));

% Accuracy
accuracy = mean(YPred == testTbl.Pass) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);

% Confusion chart
confusionchart(testTbl.Pass, YPred);
title('Pass/Fail Classification Tree');

% Predict a new student
newStudent = table(6, 82, 'VariableNames', {'StudyHours','Attendance'});
newClass = predict(Mdl, newStudent);

disp('Predicted class for new student:');
disp(newClass);

26. Regression project: predict a house price

clc;
clear;
close all;

% Example house dataset
Size = [50; 60; 70; 80; 90; 100; 110; 120];
Rooms = [2; 3; 3; 4; 4; 5; 5; 6];
Age = [20; 18; 15; 12; 10; 8; 5; 3];
Price = [100; 120; 135; 150; 170; 190; 210; 230];

tbl = table(Size, Rooms, Age, Price);

% Train/test split
rng(3);
idx = randperm(height(tbl));
trainTbl = tbl(idx(1:6), :);
testTbl  = tbl(idx(7:8), :);

% Train regression tree
Mdl = fitrtree(trainTbl, 'Price ~ Size + Rooms + Age');

% Predict
YPred = predict(Mdl, testTbl(:, {'Size','Rooms','Age'}));

% Evaluate
RMSE = sqrt(mean((testTbl.Price - YPred).^2));
fprintf('Test RMSE = %.4f\n', RMSE);

disp(table(testTbl.Price, YPred, 'VariableNames', {'ActualPrice','PredictedPrice'}));

These two mini-projects show how decision trees work for both kinds of supervised learning.

Part VIII — Common mistakes beginners make

27. Training a tree that is too deep

Very deep trees can overfit the training data. MATLAB documents parameters like MaxNumSplits and MinLeafSize specifically to control this.

28. Evaluating only on the training data

A tree can look excellent on training data but perform poorly on new data.
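One quick way to detect this is to compare the resubstitution (training) loss with a cross-validated loss; a large gap between the two is a classic sign of overfitting:

% Training loss versus 10-fold cross-validated loss.
load fisheriris
rng(1);

Mdl = fitctree(meas, species);

fprintf('Resubstitution loss = %.4f\n', resubLoss(Mdl));
fprintf('10-fold CV loss     = %.4f\n', kfoldLoss(crossval(Mdl)));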

29. Forgetting that classification and regression use different functions

Use fitctree when the response is a class label and fitrtree when the response is numeric.

30. Ignoring tree visualization

Because trees are easy to inspect, failing to view them wastes one of their biggest benefits. MATLAB supports text and graphical viewing.

Part IX — When should you use Decision Trees?

Decision Trees are a good choice when interpretability matters, when predictors mix numeric and categorical types, when little preprocessing is desired, or when you need a quick and explainable baseline.

You may avoid a single decision tree when you need the highest possible accuracy, when the data are noisy, or when the model must be stable, since small changes in the training set can produce a very different tree.

In those cases, ensembles are often worth exploring, but a single tree is still excellent for learning and for interpretable baselines.
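If you do move to an ensemble, a minimal starting point is fitcensemble with bagging, which trains many trees on bootstrap samples and lets them vote:

% Minimal bagged-tree ensemble (random-forest style) on the iris data.
load fisheriris
rng(1);

Ens = fitcensemble(meas, species, 'Method', 'Bag');

fprintf('10-fold CV loss of the ensemble = %.4f\n', kfoldLoss(crossval(Ens)));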

Part X — Summary

Decision Trees are one of the most understandable machine learning methods. In MATLAB, fitctree trains classification trees, fitrtree trains regression trees, predict returns predictions, loss helps evaluate classification trees, and view helps inspect the model visually. MATLAB also provides Classification Learner and Regression Learner for app-based tree workflows.

A practical workflow is:

  1. prepare the data,
  2. choose classification or regression,
  3. train the tree,
  4. inspect the tree structure,
  5. evaluate on unseen data,
  6. control depth if needed,
  7. use the model for prediction.

Part XI — MATLAB cheat sheet

Classification tree

Mdl = fitctree(X, Y);
YPred = predict(Mdl, Xnew);

Regression tree

Mdl = fitrtree(X, Y);
YPred = predict(Mdl, Xnew);

Classification loss

L = loss(Mdl, X, Y);

View tree

view(Mdl)
view(Mdl, 'Mode', 'graph')

Open Classification Learner

classificationLearner

Open Regression Learner

regressionLearner

These commands match the current MATLAB decision tree workflow.

Practice exercises

Exercise 1

Train a classification tree on a small binary dataset and compute test accuracy.

Exercise 2

Use the fisheriris dataset to build a multiclass classification tree.

Exercise 3

Train two classification trees with different MaxNumSplits values and compare them.

Exercise 4

Train a regression tree on a simple numeric dataset and compute RMSE.

Exercise 5

Build a small end-to-end project using a decision tree, including train/test split, evaluation, and prediction for a new observation.
