<?xml version="1.0" encoding="utf-8" standalone="yes"?><rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:content="http://purl.org/rss/1.0/modules/content/"><channel><title>Machine Learning on Mike's Blog</title><link>https://mikeogilvy.github.io/blog/tags/machine-learning/</link><description>Recent content in Machine Learning on Mike's Blog</description><generator>Hugo</generator><language>en-us</language><lastBuildDate>Sat, 28 Mar 2026 02:24:29 +0800</lastBuildDate><atom:link href="https://mikeogilvy.github.io/blog/tags/machine-learning/index.xml" rel="self" type="application/rss+xml"/><item><title>ML-2 Note: Linear Models for Classification and GLMs</title><link>https://mikeogilvy.github.io/blog/posts/ml/ml---2/</link><pubDate>Sat, 28 Mar 2026 02:24:29 +0800</pubDate><guid>https://mikeogilvy.github.io/blog/posts/ml/ml---2/</guid><description>&lt;h1 id="ml-2-note-linear-models-for-classification-and-glms"&gt;ML-2 Note: Linear Models for Classification and GLMs&lt;/h1&gt;
&lt;h2 id="1-why-classification-uses-logistic-regression"&gt;1. Why Classification Uses Logistic Regression&lt;/h2&gt;
&lt;p&gt;In classification problems, the target variable is discrete.&lt;/p&gt;
&lt;p&gt;Binary classification example:&lt;/p&gt;
&lt;p&gt;$$y \in \left\{ 0,1 \right\}$$&lt;/p&gt;
&lt;p&gt;Given input features&lt;/p&gt;
&lt;p&gt;$$x \in \mathbb{R}^d$$&lt;/p&gt;
&lt;p&gt;we want to model&lt;/p&gt;
&lt;p&gt;$$P(y=1|x)$$&lt;/p&gt;
&lt;h3 id="problem-with-linear-regression"&gt;Problem with Linear Regression&lt;/h3&gt;
&lt;p&gt;A linear model predicts&lt;/p&gt;
&lt;p&gt;$$f(x) = w^T x$$&lt;/p&gt;
&lt;p&gt;but&lt;/p&gt;
&lt;p&gt;$$w^T x \in (-\infty, \infty)$$&lt;/p&gt;
&lt;p&gt;while probabilities must satisfy&lt;/p&gt;
&lt;p&gt;$$P(y=1|x) \in [0,1]$$&lt;/p&gt;
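&lt;p&gt;A quick numeric sketch of the mismatch (the weights and input below are made-up illustrative values): a raw linear score can land well outside [0,1], while a squashing function such as the sigmoid maps it back into a valid probability.&lt;/p&gt;

```python
import math

def sigmoid(z):
    # maps any real score into the open interval (0, 1)
    return 1.0 / (1.0 + math.exp(-z))

# hypothetical weights and input, chosen only for illustration
w = [2.0, -1.0]
x = [3.0, 0.5]

score = sum(wi * xi for wi, xi in zip(w, x))  # w^T x = 5.5, not a valid probability
p = sigmoid(score)                            # ~0.996, a valid probability
print(score, p)
```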
&lt;p&gt;Thus we need a function that maps all real-valued scores into the interval [0,1].&lt;/p&gt;</description></item><item><title>ML-1 Note: Supervised Learning; Linear Regression</title><link>https://mikeogilvy.github.io/blog/posts/ml/ml---1/</link><pubDate>Sun, 22 Mar 2026 03:32:29 +0800</pubDate><guid>https://mikeogilvy.github.io/blog/posts/ml/ml---1/</guid><description>&lt;h1 id="ml-1-note-supervised-learning-linear-regression"&gt;ML-1 Note: Supervised Learning; Linear Regression&lt;/h1&gt;
&lt;h2 id="1-basic-model-of-linear-regression"&gt;1. Basic Model of Linear Regression&lt;/h2&gt;
&lt;p&gt;Linear Regression is one of the simplest and most fundamental models in &lt;strong&gt;supervised learning&lt;/strong&gt;.&lt;br&gt;
The goal is to model the relationship between input features and a continuous target variable.&lt;/p&gt;
&lt;h3 id="model-form"&gt;Model Form&lt;/h3&gt;
&lt;p&gt;Given an input feature vector \(x\), the model predicts:&lt;/p&gt;
&lt;p&gt;\[
\hat{y} = w^T x + b
\]&lt;/p&gt;
&lt;p&gt;or equivalently&lt;/p&gt;
&lt;p&gt;\[
\hat{y} = \theta^T x
\]&lt;/p&gt;
&lt;p&gt;where:&lt;/p&gt;
&lt;ul&gt;
&lt;li&gt;\(x\): input feature vector&lt;/li&gt;
&lt;li&gt;\(w\): weight vector&lt;/li&gt;
&lt;li&gt;\(b\): bias term&lt;/li&gt;
&lt;li&gt;\(\theta\): parameter vector with the bias absorbed (append a constant feature \(1\) to \(x\))&lt;/li&gt;
&lt;li&gt;\(\hat{y}\): predicted value&lt;/li&gt;
&lt;/ul&gt;
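&lt;p&gt;A minimal sketch of the two equivalent forms above, using made-up parameter values: absorbing the bias \(b\) into \(\theta\) by appending a constant \(1\) to \(x\) gives the same prediction.&lt;/p&gt;

```python
# hypothetical parameter values, for illustration only
w = [1.5, -0.5]   # weight vector
b = 2.0           # bias term
x = [4.0, 2.0]    # input feature vector

# first form: y_hat = w^T x + b
y_hat = sum(wi * xi for wi, xi in zip(w, x)) + b

# equivalent form: fold the bias into theta, append 1 to x
theta = w + [b]       # parameter vector including bias
x_aug = x + [1.0]     # augmented feature vector
y_hat2 = sum(ti * xi for ti, xi in zip(theta, x_aug))

print(y_hat, y_hat2)  # both give the same prediction, 7.0
```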
&lt;hr&gt;
&lt;h3 id="loss-function"&gt;Loss Function&lt;/h3&gt;
&lt;p&gt;The most common loss function for linear regression is &lt;strong&gt;Mean Squared Error (MSE)&lt;/strong&gt;.&lt;/p&gt;</description></item></channel></rss>