#machine learning

わかりやすいパターン認識 (Easy-to-Understand Pattern Recognition)

Chapter 3: Learning Based on Error Evaluation

3.1 The Widrow-Hoff Learning Rule

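Following on from "[2] Closed-form solution", this note works the same example with "[3] Solution by successive approximation" (p. 36), that is, with the Widrow-Hoff learning rule.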

The training patterns and teacher vectors are the same as in the closed-form solution, but here they are written as column vectors. Each pattern is augmented with a leading 1:

$\begin{eqnarray*} \mathbf{x}_1 &=& \begin{pmatrix}1 \\ 1.2\end{pmatrix}, \\ \mathbf{x}_2 &=& \begin{pmatrix}1 \\ 0.2\end{pmatrix}, \\ \mathbf{x}_3 &=& \begin{pmatrix}1 \\ -0.2\end{pmatrix}, \\ \mathbf{x}_4 &=& \begin{pmatrix}1 \\ -0.5\end{pmatrix}, \\ \mathbf{x}_5 &=& \begin{pmatrix}1 \\ -1.0\end{pmatrix}, \\ \mathbf{x}_6 &=& \begin{pmatrix}1 \\ -1.5\end{pmatrix}, \\ \mathbf{b}_1 = \mathbf{b}_2 = \mathbf{b}_3 &=& \begin{pmatrix}1 \\ 0\end{pmatrix}, \\ \mathbf{b}_4 = \mathbf{b}_5 = \mathbf{b}_6 &=& \begin{pmatrix}0 \\ 1\end{pmatrix}. \end{eqnarray*}$

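For each pattern $p$, $\mathbf{W}$ is updated according to the Widrow-Hoff learning rule:

$\begin{eqnarray*} \mathbf{W}^\prime &=& \mathbf{W} - \rho \mathbf{x}_p \mathbf{\epsilon}_p^t \tag{3.31} \\ \mathbf{\epsilon}_p &=& g(\mathbf{x}_p) - \mathbf{b}_p \tag{3.3} \\ &=& \mathbf{W}^t \mathbf{x}_p - \mathbf{b}_p \end{eqnarray*}$

Here $\rho$ is the learning rate and $\mathbf{\epsilon}_p$ is the error between the discriminant-function output and the teacher vector for pattern $p$.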
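The update (3.31) can be read as a single gradient-descent step on the squared error of one pattern. The following is a short sketch of that reasoning, assuming the per-pattern error is defined as $J_p = \frac{1}{2} \| \mathbf{\epsilon}_p \|^2$ (the symbol $J_p$ and the factor $\frac{1}{2}$ are conventions assumed here), with $\mathbf{x}_p$ the $p$-th pattern vector:

$\begin{eqnarray*} J_p &=& \frac{1}{2} \left\| \mathbf{W}^t \mathbf{x}_p - \mathbf{b}_p \right\|^2 \\ \frac{\partial J_p}{\partial \mathbf{W}} &=& \mathbf{x}_p \left( \mathbf{W}^t \mathbf{x}_p - \mathbf{b}_p \right)^t = \mathbf{x}_p \mathbf{\epsilon}_p^t \\ \mathbf{W}^\prime &=& \mathbf{W} - \rho \frac{\partial J_p}{\partial \mathbf{W}} = \mathbf{W} - \rho \mathbf{x}_p \mathbf{\epsilon}_p^t \end{eqnarray*}$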

As an arbitrary initial value, take $\mathbf{W} = \begin{bmatrix} \begin{pmatrix}1 \\ 1\end{pmatrix} & \begin{pmatrix}1 \\ 1\end{pmatrix} \end{bmatrix}$

and work through the calculation with learning rate $\rho = 0.4$.

(1) Applying the rule to pattern 1

$\begin{eqnarray*} \mathbf{W}^\prime &=& \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} - 0.4 \begin{pmatrix}1 \\ 1.2\end{pmatrix} \left( \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix}^t \begin{pmatrix}1 \\ 1.2\end{pmatrix} - \begin{pmatrix}1 \\ 0\end{pmatrix} \right)^t \\ &=& \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} - 0.4 \begin{pmatrix}1 \\ 1.2\end{pmatrix} \begin{pmatrix}1.2 & 2.2\end{pmatrix} \\ &=& \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} - 0.4 \begin{bmatrix} 1.2 & 2.2 \\ 1.44 & 2.64 \end{bmatrix} \\ &=& \begin{bmatrix} 1 & 1 \\ 1 & 1 \end{bmatrix} - \begin{bmatrix} 0.48 & 0.88 \\ 0.576 & 1.056 \end{bmatrix} \\ &=& \begin{bmatrix} 0.52 & 0.12 \\ 0.424 & -0.056 \end{bmatrix} \end{eqnarray*}$
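The hand calculation for pattern 1 can be checked numerically. This is a minimal sketch using plain NumPy arrays; the variable names (`x1`, `b1`, `eps`, `W_new`) are chosen here for illustration, and the numbers are the ones used in the calculation above.

```python
import numpy as np

# Values from the worked example: all-ones initial weights,
# pattern 1 with its teacher vector, and learning rate 0.4.
W = np.ones((2, 2))
x1 = np.array([1.0, 1.2])
b1 = np.array([1.0, 0.0])
rho = 0.4

eps = W.T @ x1 - b1                  # error vector, eq. (3.3)
W_new = W - rho * np.outer(x1, eps)  # one update step, eq. (3.31)

print(eps)
print(W_new)
```

Up to floating-point rounding, `eps` should come out as (1.2, 2.2) and `W_new` as the final matrix of the hand calculation.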

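Processing patterns 2 through 6 in the same way completes one pass over the training set and gives

$\mathbf{W}^\prime = \begin{bmatrix} 0.68368057 & 0.50155838 \\ 0.40088957 & -0.30792911 \end{bmatrix}$

which places the decision boundary at $x = -0.256937622954$.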
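The boundary value is the point where the two discriminant functions are equal. As a sketch, write $w_{ji}$ for the entry of $\mathbf{W}$ in row $j$ and column $i$ (an index notation assumed here for illustration, with rows numbered from 0), so that $g_i(x) = w_{0i} + w_{1i} x$:

$\begin{eqnarray*} g_1(x) &=& g_2(x) \\ w_{01} + w_{11} x &=& w_{02} + w_{12} x \\ x &=& - \frac{w_{01} - w_{02}}{w_{11} - w_{12}} \end{eqnarray*}$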

This is close to the value obtained with the closed-form solution.
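The full pass over the six patterns can be reproduced with a few lines of NumPy. This is a sketch only: it starts from the all-ones weights of the hand calculation rather than from random values, and the helper names `widrow_hoff_pass` and `boundary` are introduced here for illustration.

```python
import numpy as np

def widrow_hoff_pass(W, X, B, rho):
    """Apply eq. (3.31) once to every pattern in order; returns a new W."""
    W = W.copy()
    for x_p, b_p in zip(X, B):
        eps = W.T @ x_p - b_p          # error for this pattern
        W -= rho * np.outer(x_p, eps)  # sequential update
    return W

def boundary(W):
    """x at which g_1(x) = g_2(x) (two classes, 1-D augmented patterns)."""
    return -(W[0, 0] - W[0, 1]) / (W[1, 0] - W[1, 1])

# Augmented patterns (one per row) and their teacher vectors.
X = np.array([[1, 1.2], [1, 0.2], [1, -0.2], [1, -0.5], [1, -1.0], [1, -1.5]])
B = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

W1 = widrow_hoff_pass(np.ones((2, 2)), X, B, rho=0.4)
print(W1)
print("x =", boundary(W1))
```

With these inputs the printed weights and boundary should agree with the values quoted above, up to floating-point rounding.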

Code

# learn_widrow_hoff.py
import numpy as np

def main():
  # Pattern matrix: one augmented pattern (1, x) per row
  X = np.matrix([[1., 1.2],
                 [1., 0.2],
                 [1., -0.2],
                 [1., -0.5],
                 [1., -1.0],
                 [1., -1.5]])
  # Teacher matrix: row p marks which class i pattern p belongs to
  b = np.matrix([[1., 0.],
                 [1., 0.],
                 [1., 0.],
                 [0., 1.],
                 [0., 1.],
                 [0., 1.]])

  # Initialize the discriminant-function weights randomly
  # (the worked example in the text starts from all ones instead).
  # Shape is (pattern dimension, number of classes) so that W.T * xp is defined.
  W = np.matrix(np.random.random((X.shape[1], b.shape[1])))

  rho = 0.4  # learning rate
  for k in range(1):  # TODO: repeat until convergence
    for i in range(X.shape[0]):
      learn_widrow_hoff(W, X[i, :].T, b[i, :].T, rho)
  print(W)
  # Decision boundary: the x at which g_1(x) = g_2(x)
  x = -(W[0, 0] - W[0, 1]) / (W[1, 0] - W[1, 1])
  print('x=', x, 'error=', calc_error(W, X, b))

# Train on a single pattern with the Widrow-Hoff learning rule (updates W in place)
def learn_widrow_hoff(W, xp, bp, rho):
  e = W.T * xp - bp
  W -= rho * (xp * e.T)

# Sum of squared errors over all patterns
def calc_error(W, X, b):
  # Equivalent per-pattern loop:
  #J = 0.
  #for i in range(X.shape[0]):
  #  xp = X[i, :].T
  #  bp = b[i, :].T
  #  e = W.T * xp - bp
  #  J += e.T.dot(e).sum()
  e = X * W - b
  J = np.square(e).sum()
  return J

if __name__ == '__main__':
  main()

# Example output (W starts from random values, so the numbers vary from run to run):
# [[ 0.75759575  0.49664788]
#  [ 0.4370989  -0.31125436]]
# x= -0.348696099486 error= 1.29145201213
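The listing above leaves "repeat until convergence" as a TODO. One possible way to fill it in is sketched below: repeat full passes until a pass changes no weight by more than a small tolerance, then compare the result with a least-squares fit. The stopping rule, `tol`, `max_epochs`, the smaller learning rate `rho=0.01`, and the function names are all assumptions made for this sketch, and it further assumes that the closed-form solution referred to in the text is the ordinary least-squares solution, computed here with `np.linalg.lstsq`.

```python
import numpy as np

X = np.array([[1, 1.2], [1, 0.2], [1, -0.2], [1, -0.5], [1, -1.0], [1, -1.5]])
B = np.array([[1, 0], [1, 0], [1, 0], [0, 1], [0, 1], [0, 1]], dtype=float)

def train_until_converged(X, B, rho, tol=1e-10, max_epochs=100_000):
    """Repeat sequential Widrow-Hoff passes until one pass changes W by less than tol."""
    W = np.ones((X.shape[1], B.shape[1]))  # all-ones start, as in the hand calculation
    for epoch in range(1, max_epochs + 1):
        W_prev = W.copy()
        for x_p, b_p in zip(X, B):
            W -= rho * np.outer(x_p, W.T @ x_p - b_p)
        if np.abs(W - W_prev).max() < tol:
            return W, epoch
    return W, max_epochs

def boundary(W):
    """x at which g_1(x) = g_2(x) (two classes, 1-D augmented patterns)."""
    return -(W[0, 0] - W[0, 1]) / (W[1, 0] - W[1, 1])

W_iter, epochs = train_until_converged(X, B, rho=0.01)
W_ls, *_ = np.linalg.lstsq(X, B, rcond=None)  # least-squares weights for comparison

print("epochs:", epochs)
print("iterative boundary:", boundary(W_iter))
print("least-squares boundary:", boundary(W_ls))
```

With a fixed learning rate the sequential rule settles near the least-squares weights rather than exactly on them, so the two boundaries should agree closely but not to full precision; a smaller `rho` narrows the gap at the cost of more passes.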
