πŸ’‘Logistic Regression

Overview

Logistic regression still uses the real-valued features to first predict a real-valued output. It then transforms that output into a probability in [0, 1] and gives a binary output based on the probability (i.e. if p > 0.5 the prediction is true, otherwise it is false).

  1. Linear combination: $z = \mathbf{w}^T \mathbf{x} + b$

  2. Logistic function transforms it into a probability: $p = \frac{1}{1 + \exp(-z)}$

  3. Decision using a threshold: $\hat{y} = 1$ if $p \geq t$, otherwise $\hat{y} = 0$ (a NumPy sketch of these three steps follows this list).
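The three steps above can be sketched directly in NumPy. The function names below (logistic_proba, logistic_predict) are illustrative, not part of any library:

import numpy as np

def logistic_proba(X, w, b):
    z = X @ w + b                      # 1. linear combination z = w^T x + b
    return 1.0 / (1.0 + np.exp(-z))    # 2. logistic function maps z to (0, 1)

def logistic_predict(X, w, b, t=0.5):
    return (logistic_proba(X, w, b) >= t).astype(int)   # 3. threshold at t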

  • Cost Function: In logistic regression, we use cross entropy as the cost function: $J(\mathbf{w}, b) = -\frac{1}{n}\sum_{i=1}^{n}\left[y_i \log(\hat{y}_i) + (1 - y_i)\log(1 - \hat{y}_i)\right]$

  • Partial Derivatives:

    • $\frac{\partial J(\mathbf{w}, b)}{\partial \mathbf{w}} = \frac{-\mathbf{X}^T (\mathbf{y} - \hat{\mathbf{y}})}{n}$

    • $\frac{\partial J(\mathbf{w}, b)}{\partial b} = \frac{-\sum_{i=1}^{n} (y_i - \hat{y}_i)}{n}$

  • Stopping Criteria: At each iteration, we compute the gradient at the current position, scale it by a learning rate, and subtract the result from the current position (i.e. take a step). We subtract because we want to minimise the cost function. We stop when the gradient approaches zero (at a global or local minimum), or after a fixed number of iterations (see the training-loop sketch below).
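Combining the cost function, its gradients, and the update rule gives a minimal from-scratch training loop. This is a sketch only; the learning rate, iteration count, and tolerance below are illustrative defaults, not tuned values:

import numpy as np

def fit_logistic_regression(X, y, lr=0.1, n_iters=1000, tol=1e-6):
    n, d = X.shape
    w, b = np.zeros(d), 0.0
    for _ in range(n_iters):
        y_hat = 1.0 / (1.0 + np.exp(-(X @ w + b)))   # predicted probabilities
        grad_w = -(X.T @ (y - y_hat)) / n            # dJ/dw
        grad_b = -np.sum(y - y_hat) / n              # dJ/db
        w -= lr * grad_w                             # step against the gradient
        b -= lr * grad_b
        if np.linalg.norm(grad_w) < tol and abs(grad_b) < tol:
            break                                    # gradient is effectively zero
    return w, b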

Code Implementation

# Logistic regression with Scikit-Learn
from sklearn.linear_model import LogisticRegression
import numpy as np

X = np.array([3.58, 2.34, 2.09, 1.14, 0.22, 1.65, 4.92, 2.35, 3.01, 5.23, 8.69, 4.85]).reshape(-1,1)
y = np.array([0, 0, 0, 0, 0, 0, 1, 1, 1, 1, 1, 1])

logr = LogisticRegression()
logr.fit(X,y)

predicted = logr.predict(np.array([3.46]).reshape(-1,1))   # predicted class label for x = 3.46
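To inspect the probability from step 2 rather than only the thresholded label, scikit-learn's LogisticRegression also provides predict_proba:

# Class probabilities for the same input; the second column is P(y = 1)
probability = logr.predict_proba(np.array([3.46]).reshape(-1,1))
print(predicted, probability)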
