Notes: Session 5: Regression 2

2026-04-14: 65 min for the lecture + 25 min for the discussion of group projects

Time (min)   Duration (min)   Topic                           Additional materials
0–30         30               Refining the model
30–60        30               Explanation ↔︎ prediction
60–90        30               Regression → Machine learning

TODO

Classification Approaches

  • Black box models → Directly map input → class (low interpretability)

  • Score + threshold → Compute a score, then classify via cutoff (separates prediction from decision)

  • Regions / rules → Partition input space into decision areas (e.g., trees, rule systems)

Logistic Regression Choice

We use the score approach, where the score is: \(p(x)=P(Y=1\mid x)\)

Interpretable probability + threshold ⇒ classification
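
A minimal sketch of the score + threshold idea, using only the Python standard library. The coefficients `beta0` and `beta1`, the inputs `xs`, and the helper names are hypothetical and chosen for illustration only; they are not fitted to any course data.

```python
import math


def sigmoid(z: float) -> float:
    """Map a real-valued score z to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-z))


def predict_proba(x: float, beta0: float, beta1: float) -> float:
    """Score step: p(x) = P(Y=1 | x) for a one-feature logistic model."""
    return sigmoid(beta0 + beta1 * x)


def classify(p: float, threshold: float = 0.5) -> int:
    """Decision step: turn a probability into a class label via a cutoff."""
    return 1 if p >= threshold else 0


# Hypothetical coefficients and inputs (assumptions, not estimates).
beta0, beta1 = -3.0, 0.5
xs = [2.0, 6.0, 10.0]

probs = [predict_proba(x, beta0, beta1) for x in xs]

# Same probabilities, two different decisions.
labels_default = [classify(p) for p in probs]
labels_strict = [classify(p, threshold=0.8) for p in probs]

print(probs)
print(labels_default, labels_strict)
```

The probabilities are computed once; only the cutoff changes between the two label lists, which is what "separates prediction from decision" means in practice.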

From Logit to Sigmoid (whiteboard)

Start with the logit formulation, writing \(y\) for the probability \(p(x)\), and define the linear score \(z\): \[\log\left(\frac{y}{1-y}\right)=\beta_0 + \beta_1 x = z\]

Solve for \(y\):

Exponentiate: \[ \frac{y}{1-y} = e^{z} \]

Rearrange: \[ y = (1-y)\cdot e^{z} \]

\[ y = e^{z} - y e^{z} \]

\[ y + y e^{z} = e^{z} \]

\[ y(1 + e^{z}) = e^{z} \]

\[ y = \frac{e^{z}}{1 + e^{z}} \]

Final step (multiply numerator and denominator by \(e^{-z}\)):

\[ y = \frac{e^{z}}{1 + e^{z}} \cdot \frac{e^{-z}}{e^{-z}} = \frac{1}{1 + e^{-z}} \]
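
A quick numeric check of the derivation above, as a sketch: the two forms of the sigmoid should agree, and applying the logit to the sigmoid should return the original \(z\). The function names and the test values in `zs` are arbitrary choices for illustration.

```python
import math


def logit(y: float) -> float:
    """Log-odds of a probability y in (0, 1)."""
    return math.log(y / (1.0 - y))


def sigmoid_exp_form(z: float) -> float:
    """Form before the final step: e^z / (1 + e^z)."""
    return math.exp(z) / (1.0 + math.exp(z))


def sigmoid(z: float) -> float:
    """Form after the final step: 1 / (1 + e^(-z))."""
    return 1.0 / (1.0 + math.exp(-z))


zs = [-4.0, -1.0, 0.0, 0.5, 3.0]  # arbitrary test scores

# Largest difference between the two algebraic forms.
max_form_gap = max(abs(sigmoid_exp_form(z) - sigmoid(z)) for z in zs)

# Largest error when going z -> sigmoid -> logit -> z.
max_roundtrip_gap = max(abs(logit(sigmoid(z)) - z) for z in zs)

print(max_form_gap, max_roundtrip_gap)
```

Both gaps should be at the level of floating-point rounding, which confirms that the sigmoid is the inverse of the logit.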

Exercise

2026-04-14: 90 min, did not start with Part 5 (students indicated that we could discuss solutions earlier)

Time (min)   Duration (min)   Topic                   Additional materials
0–30         30               Group work assignment
30–90        60               Logistic regression

Part 1.3: TODO: extract confidence intervals and interpret practical vs. statistical significance? (The small coefficient for income could still be practically significant: income is not standardized and takes large values, so a realistic change in income spans many units.)
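
A small worked example for the point in Part 1.3, as a sketch: a coefficient that looks negligible per unit of income can correspond to a noticeable odds ratio over a realistic income difference. The value of `beta_income` and the 10,000-unit step are hypothetical and not taken from the exercise data.

```python
import math

# Hypothetical logistic regression coefficient for income,
# measured per single currency unit (assumption for illustration).
beta_income = 0.00002

# Odds ratio for a one-unit increase in income: barely above 1.
or_per_unit = math.exp(beta_income)

# Odds ratio for a 10,000-unit increase: the coefficient is multiplied
# by the size of the change before exponentiating.
income_step = 10_000
or_per_step = math.exp(beta_income * income_step)

print(f"odds ratio per 1 unit:      {or_per_unit:.6f}")
print(f"odds ratio per {income_step} units: {or_per_step:.4f}")
```

With these assumed numbers the per-unit odds ratio is almost exactly 1, while the odds ratio for a 10,000-unit difference equals \(e^{0.2}\), roughly a 22% increase in the odds. Rescaling or standardizing income before fitting would make the coefficient directly comparable to the others.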

Materials

  • good slides: https://harvard-iacs.github.io/2019-CS109A/lectures/lecture10/ (https://harvard-iacs.github.io/2019-CS109A/lectures/lecture11/presentation/Lecture11_LogReg2.pdf)
  • good (simple; geographical) explanation: https://www.youtube.com/watch?v=yIYKR4sgzI8
  • Lecture, Ng: https://www.youtube.com/watch?v=4u81xU7BIOc
  • https://web.stanford.edu/class/cme250/files/cme250_lecture2.pdf
  • https://slds-lmu.github.io/i2ml/chapters/03_supervised_classification/03-04-classification-logistic/
  • https://ubc-cs.github.io/cpsc330-2023W1/README.html
  • Optional/extension: changing the data: https://ubc-cs.github.io/cpsc330-2023W1/lectures/09_classification-metrics.html#optional-changing-the-data
  • https://harvard-iacs.github.io/2019-CS109A/lectures/lecture-11/notebook/
  • https://github.com/UBC-CS/cpsc330-2023W1
  • https://slds-lmu.github.io/i2ml/
  • https://www2.stat.duke.edu/courses/Spring20/sta210.001/labs/lab-08-logistic.html