Decision Trees are among the most intuitive machine learning models. They work by splitting data step by step according to feature values until the model reaches a final prediction. In MATLAB, the main functions are fitctree for classification trees and fitrtree for regression trees. MATLAB also supports tree workflows in Classification Learner and Regression Learner.
This tutorial explains what decision trees are, how they work, when to use them, and how to implement them in MATLAB with clear examples for both classification and regression. The MATLAB-specific parts below follow the current MathWorks documentation.
A decision tree is a model that predicts an output by following a sequence of decisions from the root of the tree down to a leaf. In a classification tree, the leaf contains a class label. In a regression tree, the leaf contains a numeric prediction. MathWorks describes decision trees this way and notes that classification trees return nominal responses while regression trees return numeric responses.
In simple terms, the tree asks a sequence of yes/no questions such as "Feature1 < 5.2?" and follows the matching branch until it reaches a leaf.
Decision Trees are popular because they are:
- Easy to interpret
- Fast to train and apply
- Usable for both classification and regression
They are especially attractive when interpretability matters, because you can inspect the sequence of splits that leads to a prediction. MATLAB provides direct tree-viewing functions, including text and graph views.
The root node is the first split of the tree. It divides the data into two branches based on one predictor.
Branch nodes are internal decision points where another split happens.
Leaves are terminal nodes. In classification, they contain the class decision. In regression, they contain a numeric response.
By default, fitctree and fitrtree use the standard CART algorithm. MathWorks explains that CART starts with all the input data, examines possible binary splits on every predictor, and chooses the split that best satisfies the optimization criterion.
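To make the split criterion concrete, here is a minimal sketch of a Gini-based split evaluation. The threshold 4.5 is an arbitrary assumption, and this mirrors the idea behind CART rather than MATLAB's internal implementation:

```matlab
% Sketch of the Gini criterion behind CART splits (illustration only,
% not MATLAB's internal code). The threshold 4.5 is an arbitrary choice.
x = [1; 2; 2; 3; 6; 7; 8; 7];          % one predictor
y = [1; 1; 1; 1; 2; 2; 2; 2];          % two class labels (1 and 2)
threshold = 4.5;                        % candidate split point

% Gini impurity: 1 minus the sum of squared class proportions
gini = @(v) 1 - sum((histcounts(v, [1 2 3]) / numel(v)).^2);

left  = y(x <  threshold);
right = y(x >= threshold);

% CART favors the split with the smallest weighted impurity
wImp = (numel(left)/numel(y))*gini(left) + (numel(right)/numel(y))*gini(right);
fprintf('Weighted Gini impurity = %.4f\n', wImp);   % a perfectly pure split gives 0
```

CART evaluates such a score for every candidate binary split on every predictor and keeps the best one.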
Tree complexity can be controlled with options such as:
- MaxNumSplits
- MinLeafSize
MathWorks documents these as key parameters for controlling depth: a larger MaxNumSplits or a smaller MinLeafSize generally produces a deeper tree.
The main MATLAB functions and tools for decision trees are:
- fitctree → classification tree
- fitrtree → regression tree
- predict → predictions for new data
- loss → classification loss for a classification tree
- view → view the tree in text or graphical form
- Classification Learner → GUI for classification trees
- Regression Learner → GUI for regression trees
Let us begin with a simple binary classification dataset.
clc;
clear;
close all;
% Example data
X = [1 2;
2 3;
2 1;
3 2;
6 7;
7 8;
8 7;
7 6];
Y = [1; 1; 1; 1; 2; 2; 2; 2];
% Train classification tree
Mdl = fitctree(X, Y);
% Predict on training data
YPred = predict(Mdl, X);
disp('Predicted labels:');
disp(YPred);
fitctree(X,Y) returns a fitted binary classification decision tree based on the predictors in X and the class labels in Y.
One of the strengths of decision trees is that you can view their structure.
view(Mdl)
view(Mdl, 'Mode', 'graph')
MathWorks documents two viewing modes: text output with view(tree) and a graphical tree with view(tree,'mode','graph').
gscatter(X(:,1), X(:,2), Y, 'rb', 'ox');
xlabel('Feature 1');
ylabel('Feature 2');
title('Training Data');
grid on;
This helps you see how the classes are distributed before training the tree.
A machine learning model should be evaluated on unseen data.
clc;
clear;
close all;
% Dataset
X = [1 2;
2 3;
2 1;
3 2;
6 7;
7 8;
8 7;
7 6;
1.5 2.5;
6.5 7.5];
Y = [1;1;1;1;2;2;2;2;1;2];
% Split
rng(1);
cv = cvpartition(Y, 'HoldOut', 0.3);
XTrain = X(training(cv), :);
YTrain = Y(training(cv), :);
XTest = X(test(cv), :);
YTest = Y(test(cv), :);
% Train tree
Mdl = fitctree(XTrain, YTrain);
% Predict on test set
YPred = predict(Mdl, XTest);
% Accuracy
accuracy = mean(YPred == YTest) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);
This is a standard classification workflow: split the data, train on the training set, and evaluate on the held-out test set.
A confusion matrix is a useful way to inspect classification performance.
cm = confusionmat(YTest, YPred);
disp('Confusion Matrix:');
disp(cm);
confusionchart(YTest, YPred);
title('Confusion Matrix');
This shows how many examples were correctly or incorrectly classified.
MATLAB provides loss for classification trees.
clc;
clear;
close all;
X = [1 2;
2 3;
2 1;
3 2;
6 7;
7 8;
8 7;
7 6];
Y = [1;1;1;1;2;2;2;2];
Mdl = fitctree(X, Y);
L = loss(Mdl, X, Y);
fprintf('Classification Loss = %.4f\n', L);
fprintf('Approximate Accuracy = %.2f%%\n', (1 - L) * 100);
MathWorks documents loss(tree,...) for trained classification tree models and notes that better classifiers generally yield smaller loss values.
A tree that is too deep can memorize the training data and overfit. A tree that is too shallow can underfit and miss important patterns.
MathWorks documents MaxNumSplits and MinLeafSize as important arguments that control depth and leaf size.
MaxNumSplits
clc;
clear;
close all;
load fisheriris
X = meas;
Y = species;
% Smaller tree
Mdl1 = fitctree(X, Y, 'MaxNumSplits', 2);
% Larger tree
Mdl2 = fitctree(X, Y, 'MaxNumSplits', 20);
disp('Small tree:');
view(Mdl1);
disp('Large tree:');
view(Mdl2);
This illustrates how limiting the number of splits changes tree size.
MinLeafSize
clc;
clear;
close all;
load fisheriris
X = meas;
Y = species;
Mdl = fitctree(X, Y, 'MinLeafSize', 10);
YPred = predict(Mdl, X);
accuracy = mean(strcmp(YPred, Y)) * 100;
fprintf('Training Accuracy = %.2f%%\n', accuracy);
Larger leaf sizes tend to produce simpler trees.
In MATLAB, fitctree returns a binary decision tree structure that supports multiclass classification. The class object itself is described as a “binary decision tree for multiclass classification.”
clc;
clear;
close all;
load fisheriris
X = meas;
Y = species;
% Train classification tree
Mdl = fitctree(X, Y);
% Predict
YPred = predict(Mdl, X);
% Accuracy
accuracy = mean(strcmp(YPred, Y)) * 100;
fprintf('Training Accuracy = %.2f%%\n', accuracy);
% Confusion chart
confusionchart(Y, YPred);
title('Iris Classification Tree');
This is a classic multiclass example.
clc;
clear;
close all;
load fisheriris
X = meas;
Y = species;
rng(2);
cv = cvpartition(Y, 'HoldOut', 0.3);
XTrain = X(training(cv), :);
YTrain = Y(training(cv), :);
XTest = X(test(cv), :);
YTest = Y(test(cv), :);
Mdl = fitctree(XTrain, YTrain);
YPred = predict(Mdl, XTest);
accuracy = mean(strcmp(YPred, YTest)) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);
confusionchart(YTest, YPred);
title('Iris Test Results');
This gives a better estimate of real-world performance than evaluating only on the training set.
A regression tree predicts a numeric output instead of a class label. In MATLAB, you use fitrtree to train it. MathWorks states that fitrtree returns a regression tree with binary splits and that the resulting RegressionTree object predicts numeric responses.
clc;
clear;
close all;
% Example regression data
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];
% Train regression tree
Mdl = fitrtree(X, Y);
% Predict on training data
YPred = predict(Mdl, X);
disp(table(X, Y, YPred));
fitrtree(X,Y) trains a regression tree from predictors X and numeric response Y.
clc;
clear;
close all;
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];
Mdl = fitrtree(X, Y);
YPred = predict(Mdl, X);
plot(X, Y, 'o', 'MarkerSize', 8, 'LineWidth', 1.5);
hold on;
plot(X, YPred, '-s', 'LineWidth', 1.5);
xlabel('X');
ylabel('Y');
title('Regression Tree');
legend('Original Data', 'Tree Predictions');
grid on;
Regression tree predictions often look piecewise constant because each leaf outputs one value.
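The stepped shape becomes obvious if you query the fitted tree on a dense grid (the 200-point grid below is an arbitrary choice for illustration):

```matlab
% Query the trained tree on a dense grid to expose its piecewise-constant output
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];
Mdl = fitrtree(X, Y);
Xq = linspace(1, 10, 200)';        % dense query grid (arbitrary resolution)
Yq = predict(Mdl, Xq);
plot(X, Y, 'o', 'MarkerSize', 8);
hold on;
stairs(Xq, Yq, 'LineWidth', 1.5);  % each flat step is one leaf's value
xlabel('X'); ylabel('Prediction');
legend('Training data', 'Tree output');
grid on;
```

Every input that falls into the same leaf receives the same predicted value, which is why the curve is flat between split points.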
For regression, common metrics include:
- Mean Absolute Error (MAE)
- Mean Squared Error (MSE)
- Root Mean Squared Error (RMSE)
clc;
clear;
close all;
X = (1:10)';
Y = [1.2; 1.8; 2.5; 3.7; 4.1; 5.3; 6.0; 7.1; 8.0; 9.2];
Mdl = fitrtree(X, Y);
YPred = predict(Mdl, X);
MAE = mean(abs(Y - YPred));
MSE = mean((Y - YPred).^2);
RMSE = sqrt(MSE);
fprintf('MAE = %.4f\n', MAE);
fprintf('MSE = %.4f\n', MSE);
fprintf('RMSE = %.4f\n', RMSE);
The next example uses the built-in carsmall dataset for a more realistic regression task.
clc;
clear;
close all;
load carsmall
tbl = table(Horsepower, Weight, MPG);
tbl = rmmissing(tbl);
% Train regression tree
Mdl = fitrtree(tbl, 'MPG ~ Horsepower + Weight');
% Predict
YPred = predict(Mdl, tbl(:, {'Horsepower','Weight'}));
% RMSE
RMSE = sqrt(mean((tbl.MPG - YPred).^2));
fprintf('RMSE = %.4f\n', RMSE);
% Plot actual vs predicted
plot(tbl.MPG, YPred, 'o');
xlabel('Actual MPG');
ylabel('Predicted MPG');
title('Regression Tree on carsmall');
grid on;
MATLAB documents formula-based usage for fitrtree with tables.
A big advantage of decision trees is interpretability. You can inspect the exact split conditions and follow the path to a prediction. MATLAB supports both text and graph views of trees.
Example:
load fisheriris
ctree = fitctree(meas, species);
view(ctree)
view(ctree, 'Mode', 'graph')
This makes decision trees attractive for teaching and for applications where explanation matters.
MATLAB’s Classification Learner app lets you create and compare classification trees, then export trained models to predict new data. MathWorks has a dedicated example titled “Train Decision Trees Using Classification Learner App.”
Open it with:
classificationLearner
Typical workflow:
1. Import your data and choose the response variable.
2. Select one or more decision tree presets from the Models gallery.
3. Train the models and compare their validation accuracy.
4. Export the best model to the workspace to predict new data.
MathWorks also notes that the Models gallery includes several preset decision tree options that are starting points for different problems.
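After exporting a model from the app, you can call its prediction function from the command line. This is a sketch: trainedModel is the app's default export name, and the predictor names below are assumptions that must match your training data.

```matlab
% Using a model exported from Classification Learner.
% 'trainedModel' is the app's default export name; the predictor
% names x1..x4 are assumptions and must match your training variables.
newData = table(5.1, 3.5, 1.4, 0.2, ...
    'VariableNames', {'x1','x2','x3','x4'});
yfit = trainedModel.predictFcn(newData);
disp(yfit);
```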
MATLAB’s Regression Learner app supports regression tree workflows. MathWorks states that to interactively grow a regression tree, you can use Regression Learner, while fitrtree gives more command-line flexibility.
Open it with:
regressionLearner
Typical workflow:
1. Import your data and choose the numeric response.
2. Select one or more regression tree presets.
3. Train the models and compare their validation RMSE.
4. Export the best model to the workspace for prediction.
clc;
clear;
close all;
% Example student data
StudyHours = [1;2;2;3;4;5;5;6;7;8];
Attendance = [50;55;60;65;70;75;80;85;90;95];
Pass = categorical([0;0;0;0;0;1;1;1;1;1]);
tbl = table(StudyHours, Attendance, Pass);
% Train/test split
rng(2);
cv = cvpartition(tbl.Pass, 'HoldOut', 0.3);
trainTbl = tbl(training(cv), :);
testTbl = tbl(test(cv), :);
% Train classification tree
Mdl = fitctree(trainTbl, 'Pass ~ StudyHours + Attendance');
% Predict
YPred = predict(Mdl, testTbl(:, {'StudyHours','Attendance'}));
% Accuracy
accuracy = mean(YPred == testTbl.Pass) * 100;
fprintf('Test Accuracy = %.2f%%\n', accuracy);
% Confusion chart
confusionchart(testTbl.Pass, YPred);
title('Pass/Fail Classification Tree');
% Predict a new student
newStudent = table(6, 82, 'VariableNames', {'StudyHours','Attendance'});
newClass = predict(Mdl, newStudent);
disp('Predicted class for new student:');
disp(newClass);
The second mini-project applies a regression tree to a small house-price dataset.
clc;
clear;
close all;
% Example house dataset
Size = [50; 60; 70; 80; 90; 100; 110; 120];
Rooms = [2; 3; 3; 4; 4; 5; 5; 6];
Age = [20; 18; 15; 12; 10; 8; 5; 3];
Price = [100; 120; 135; 150; 170; 190; 210; 230];
tbl = table(Size, Rooms, Age, Price);
% Train/test split
rng(3);
idx = randperm(height(tbl));
trainTbl = tbl(idx(1:6), :);
testTbl = tbl(idx(7:8), :);
% Train regression tree
Mdl = fitrtree(trainTbl, 'Price ~ Size + Rooms + Age');
% Predict
YPred = predict(Mdl, testTbl(:, {'Size','Rooms','Age'}));
% Evaluate
RMSE = sqrt(mean((testTbl.Price - YPred).^2));
fprintf('Test RMSE = %.4f\n', RMSE);
disp(table(testTbl.Price, YPred, 'VariableNames', {'ActualPrice','PredictedPrice'}));
These two mini-projects show how decision trees work for both kinds of supervised learning.
Very deep trees can overfit the training data. MATLAB documents parameters like MaxNumSplits and MinLeafSize specifically to control this.
A tree can look excellent on training data but perform poorly on new data.
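One way to catch this early, sketched below with MATLAB's cross-validation tools, is to compare the resubstitution loss with a k-fold estimate:

```matlab
% Compare training (resubstitution) loss with a 10-fold cross-validated loss.
% A large gap between the two suggests the tree is overfitting.
load fisheriris
rng(1);                              % reproducible fold assignment
Mdl    = fitctree(meas, species);
resubL = resubLoss(Mdl);             % error measured on the training data itself
cvMdl  = crossval(Mdl);              % 10-fold cross-validation by default
cvL    = kfoldLoss(cvMdl);
fprintf('Resubstitution loss = %.4f\n', resubL);
fprintf('10-fold CV loss     = %.4f\n', cvL);
```

If the cross-validated loss is much larger than the resubstitution loss, consider tightening MaxNumSplits or raising MinLeafSize.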
Use:
fitctree for classification,
fitrtree for regression.
Because trees are easy to inspect, failing to view them wastes one of their biggest benefits. MATLAB supports text and graphical viewing.
Decision Trees are a good choice when:
- Interpretability matters and you need to explain each prediction
- You want a fast baseline without feature scaling or heavy preprocessing
- The relationship between features and the response is nonlinear
You may avoid a single decision tree when:
- Maximum predictive accuracy is the priority
- The data are noisy, so a single deep tree overfits easily
In those cases, ensembles are often worth exploring, but a single tree is still excellent for learning and for interpretable baselines.
Decision Trees are one of the most understandable machine learning methods. In MATLAB, fitctree trains classification trees, fitrtree trains regression trees, predict returns predictions, loss helps evaluate classification trees, and view helps inspect the model visually. MATLAB also provides Classification Learner and Regression Learner for app-based tree workflows.
A practical workflow is:
Mdl = fitctree(X, Y);
YPred = predict(Mdl, Xnew);
Mdl = fitrtree(X, Y);
YPred = predict(Mdl, Xnew);
L = loss(Mdl, X, Y);
view(Mdl)
view(Mdl, 'Mode', 'graph')
classificationLearner
regressionLearner
These commands match the current MATLAB decision tree workflow.
Use the fisheriris dataset to build a multiclass classification tree.
Try several MaxNumSplits values and compare the resulting trees.
Train a binary classification tree and compute test accuracy.
Build a multiclass classification tree with the iris dataset.
Compare two trees with different complexity levels.
Train a regression tree and compute RMSE.
Create a complete decision tree mini-project with train/test split and prediction.