# Logistic Regression

Logistic regression is a regression model used to predict a categorical outcome. It is widely used in biostatistics, where binary responses occur frequently, such as whether a patient has cancer or not.

To keep the predicted outcome between 0 and 1, we apply the logistic function (also called the sigmoid function):

[latex]g(z) = \frac{1}{1 + \exp(-z)}[/latex]

The logistic regression hypothesis is then defined as:

[latex]h_\beta(x) = g(\beta^T x)[/latex]
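As a quick illustration (the numbers here are made up for the example), the predicted probability for a single observation x, written as a column vector whose leading 1 multiplies the intercept, is simply the sigmoid of the linear combination:

[sourcecode language="matlab"]
% Hypothetical coefficients: intercept -1, one feature with weight 2
beta = [-1; 2];
x    = [1; 0.8];                  % leading 1 is the intercept term
h    = 1 / (1 + exp(-beta' * x)); % predicted probability, about 0.65
[/sourcecode]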

Logistic regressions are usually fit by maximum likelihood. The cost function we want to minimize is the negative of the log-likelihood, averaged over the m observations:

[latex]J(\beta) = \frac{1}{m} \sum_{i=1}^m \left[ -y_i \log(h_\beta(x_i)) - (1 - y_i) \log(1 - h_\beta(x_i)) \right][/latex]
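It can be useful to evaluate this cost directly, for instance to check that it decreases across gradient descent iterations. A minimal MATLAB sketch, assuming a design matrix `X`, a 0/1 response vector `Y`, and coefficients `beta` (the function name `logisticCost` is mine, not a built-in):

[sourcecode language="matlab"]
function J = logisticCost(X, Y, beta)
% Negative average log-likelihood of a logistic regression.
h = 1 ./ (1 + exp(-X * beta));               % h_beta(x_i) for every row of X
J = mean( -Y .* log(h) - (1 - Y) .* log(1 - h) );
end
[/sourcecode]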

Minimizing this cost amounts to solving the following equation:

[latex]\frac{\partial J(\beta)}{\partial \beta} = \frac{1}{m} \sum_{i=1}^m x_i (h_\beta(x_i) - y_i) = 0[/latex]

## MATLAB – Logistic Regression Function

[sourcecode language="matlab"]
function g = logit(z)
% LOGIT Logistic (sigmoid) function, applied element-wise.
g = 1 ./ (1 + exp(-z));
end
[/sourcecode]

## MATLAB – Gradient Descent Method

[sourcecode language="matlab"]
% Initialize (myInputs_Normalized and myOutput hold your data)
X = myInputs_Normalized;
Y = myOutput;

% Add a column of ones for the linear intercept
X = [ones(length(Y), 1) X];

% Initialize gradient descent parameters
alpha = 0.03;        % learning rate
iterations = 100;    % number of gradient descent steps
beta = zeros(size(X, 2), 1);

% Run gradient descent
for iteration = 1:iterations
    oldBeta = beta;
    for dim = 1:length(beta)
        beta(dim) = oldBeta(dim) - alpha / length(Y) * sum( (logit(X * oldBeta) - Y) .* X(:,dim) );
    end
end

% Estimate the output based on inputs
EY = logit(X * beta);
[/sourcecode]
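If the Statistics and Machine Learning Toolbox is available, the gradient descent estimate can be cross-checked against MATLAB's built-in `glmfit`. Note that `glmfit` prepends the intercept column itself, so it is given the raw inputs:

[sourcecode language="matlab"]
% Cross-check with the maximum likelihood fit (requires the
% Statistics and Machine Learning Toolbox). glmfit adds the
% intercept automatically, so pass the inputs without the
% column of ones.
betaML = glmfit(myInputs_Normalized, myOutput, 'binomial', 'link', 'logit');
[/sourcecode]

Once gradient descent has run long enough to converge, `beta` should be close to `betaML`; with only 100 iterations and a small learning rate, some gap is to be expected.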