Mathematical justification on the origin of the sigmoid in logistic regression
Keywords:
Binary logistic regression; Maximum likelihood estimator; Odds ratio; Predictive model; Data mining.
Abstract
Logistic regression is a commonly used classification algorithm in machine learning. It assigns data to discrete classes by learning the relationship between features and labels from a given set of labeled data. The model first learns a linear relationship from the data set, then applies a nonlinear activation function to the result; the resulting decision boundary is a hyperplane that separates the training points into two classes. In logistic regression, the sigmoid is the most widely used activation function for binary classification. This choice is usually justified by the sigmoid's ability to map any real number to a value between 0 and 1, which can be interpreted as a probability. This study provides, through two different approaches, a rigorous mathematical answer to a fundamental question: where does the logistic function, on which most neural network algorithms are based, come from?
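As a minimal illustration of the model described above, the following Python sketch computes a linear score and passes it through the sigmoid. The function names (sigmoid, logit, predict_proba) and the example weights are hypothetical and chosen only for illustration; they are not taken from the study. The sketch also shows the link with the odds ratio listed in the keywords: the sigmoid is the inverse of the log-odds (logit) function.

```python
import math


def sigmoid(z: float) -> float:
    """Map any real number z to a value strictly between 0 and 1."""
    # Two branches avoid overflow in math.exp for large |z|.
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def logit(p: float) -> float:
    """Log-odds of a probability p in (0, 1); the inverse of sigmoid."""
    return math.log(p / (1.0 - p))


def predict_proba(weights, bias, x):
    """Linear score followed by the sigmoid (illustrative names only)."""
    z = bias + sum(w * xi for w, xi in zip(weights, x))
    return sigmoid(z)


# Hypothetical weights: the decision boundary is the hyperplane x1 - x2 = 0.
weights, bias = [1.0, -1.0], 0.0
p_on_boundary = predict_proba(weights, bias, [2.0, 2.0])  # score z = 0
p_positive = predict_proba(weights, bias, [3.0, 1.0])     # score z = 2
```

A point on the hyperplane has score 0 and therefore probability 0.5, while points on either side are pushed toward 0 or 1. Applying logit to the predicted probability recovers the linear score, which is one way to read the model as linear in the log-odds.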