cs224n_2019/Assignment_1_intro_word_vectors/python review.ipynb

{
 "cells": [
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Python Numpy Review\n",
    "\n",
    "主要复习numpy\n",
    "\n",
    "tutor:  `chongjiujin # gmail.com`\n",
    "\n",
    "```\n",
    "if you have any question in python or pytorch:\n",
    "\n",
    "    print(add personal weichat:flypython)\n",
    "   ```"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# List Slicing\n",
    "\n",
    "List elements can be accessed in convenient ways.\n",
    "\n",
    "Basic format: some_list[start_index:end_index]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 1,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6]"
      ]
     },
     "execution_count": 1,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers = [0, 1, 2, 3, 4, 5, 6]\n",
    "numbers"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 2,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2]"
      ]
     },
     "execution_count": 2,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[0:3]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 4,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3]"
      ]
     },
     "execution_count": 4,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[:4]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 8,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[5, 6]"
      ]
     },
     "execution_count": 8,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[5:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 7,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[0, 1, 2, 3, 4, 5, 6]"
      ]
     },
     "execution_count": 7,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 9,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "6"
      ]
     },
     "execution_count": 9,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Negative index wraps around\n",
    "numbers[-1]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 10,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[4, 5, 6]"
      ]
     },
     "execution_count": 10,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "numbers[-3:]"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 14,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "[]"
      ]
     },
     "execution_count": 14,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "# Can mix and match\n",
    "numbers[1:-10]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Numpy python矩阵计算库\n",
    "\n",
    "\n",
    "Optimized library for matrix and vector computation.\n",
    "\n",
    "用于矩阵和向量\n",
    "\n",
    "\n",
    "\n",
    "Makes use of C/C++ subroutines and memory-efficient data structures.\n",
    "\n",
    "底层是C/C++编译的，效率更高\n",
    "\n",
    "(Lots of computation can be efficiently represented as vectors.)\n",
    "\n",
    "**Main data type: `np.ndarray`**\n",
    "\n",
    "This is the data type that you will use to represent matrix/vector computations.\n",
    "这个数据结构是用来放矩阵/向量的\n",
    "\n",
    "Note: constructor function is `np.array()`\n",
    "\n",
    " `np.array()`初始化函数\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 15,
   "metadata": {},
   "outputs": [],
   "source": [
    "import numpy as np#导入库"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 18,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3,)"
      ]
     },
     "execution_count": 18,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([1,2,3])#一维向量\n",
    "x\n",
    "x.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 20,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(2, 3)"
      ]
     },
     "execution_count": 20,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = np.array([[3,4,5],[6,7,8]])#二维矩阵\n",
    "y.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 21,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 1)"
      ]
     },
     "execution_count": 21,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = np.array([[1],[2],[3]])#每个框是增加一个维度\n",
    "y.shape"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# np.ndarray Operations 操作函数\n",
    "\n",
    "Reductions: `np.max`, `np.min`, `np.argmax`, `np.sum`, `np.mean`, …\n",
    "\n",
    "Always reduces along an axis! (Or will reduce along all axes if not specified.)\n",
    "\n",
    "(You can think of this as “collapsing” this axis into the function’s output.)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 22,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "3"
      ]
     },
     "execution_count": 22,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([1,2,3])#一维向量\n",
    "x.max()#np.max(x)\n",
    "#x.min()\n",
    "#x.sum()\n",
    "#x.mean()\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 25,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[5],\n",
       "       [8]])"
      ]
     },
     "execution_count": 25,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y = np.array([[3,4,5],[6,7,8]])#按维度取最大值\n",
    "#np.max(y,axis = 1)\n",
    "np.max(y, axis = 1, keepdims = True)\n",
    "#https://docs.scipy.org/doc/numpy/reference/generated/numpy.amax.html#numpy.amax"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 基本矩阵运算\n",
    "\n",
    "\n",
    "`np.dot`矩阵点乘\n",
    "$$ np.dot(v,w)=v^T w $$\n",
    "https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html?highlight=dot#numpy.dot\n",
    "\n",
    "`np.multiply` 在 np.array 中重载为元素乘法，在 np.matrix 中重载为矩阵乘法\n",
    "\n",
    "https://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html\n",
    "\n",
    "\n",
    "我们这里只讨论一维向量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 26,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "14"
      ]
     },
     "execution_count": 26,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#np.dot点乘\n",
    "\n",
    "x=np.array([1,2,3])#一维向量\n",
    "y=np.array([1,2,3])#一维向量\n",
    "np.dot(x,y)\n",
    "#"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 27,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "14"
      ]
     },
     "execution_count": 27,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "sum(x.T*y)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 28,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1, 4, 9])"
      ]
     },
     "execution_count": 28,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x=np.array([1,2,3])#一维向量\n",
    "np.multiply(x,x)"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Indexing  索引"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 29,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3])"
      ]
     },
     "execution_count": 29,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "#基本同list\n",
    "x = np.array([1,2,3])#一维向量\n",
    "x[x > 2]\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 30,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([3, 2, 1])"
      ]
     },
     "execution_count": 30,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "index=[2,1,0]#按索引排序\n",
    "x[index]"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 矩阵遍历\n",
    "\n",
    "有时候需要遍历矩阵里所有的向量"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 31,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[[3 4 5]\n",
      " [6 7 8]]\n"
     ]
    }
   ],
   "source": [
    "y = np.array([[3,4,5],[6,7,8]])#二维矩阵\n",
    "print(y)\n"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 32,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "[3 4 5]\n",
      "-----\n",
      "[6 7 8]\n",
      "-----\n"
     ]
    }
   ],
   "source": [
    "#默认按第1维度遍历\n",
    "for y1 in y:\n",
    "    print(y1)\n",
    "    print(\"-----\")"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 33,
   "metadata": {
    "scrolled": true
   },
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "2 3\n"
     ]
    }
   ],
   "source": [
    "#按指定维度遍历\n",
    "d1,d2= y.shape\n",
    "print(d1,d2)"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 34,
   "metadata": {},
   "outputs": [
    {
     "name": "stdout",
     "output_type": "stream",
     "text": [
      "0 [3 6]\n",
      "1 [4 7]\n",
      "2 [5 8]\n"
     ]
    }
   ],
   "source": [
    "for d in range(d2):\n",
    "    print(d,y[:,d])"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# Efficient Numpy Code\n",
    "尽量用Numpy的特性提升效率"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 36,
   "metadata": {},
   "outputs": [],
   "source": [
    "x = np.array([[3,4,5],[6,7,8]])#二维矩阵\n",
    "y = np.array([[1,2,3],[9,0,10]])#二维矩阵"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 37,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 9, 16, 25],\n",
       "       [36, 49, 64]])"
      ]
     },
     "execution_count": 37,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "for i in range(x.shape[0]):\n",
    "    for j in range(x.shape[1]):\n",
    "        x[i,j] **= 2\n",
    "x"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 38,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[  81,  256,  625],\n",
       "       [1296, 2401, 4096]])"
      ]
     },
     "execution_count": 38,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x **= 2\n",
    "x"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 全0 和全 1 矩阵"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 40,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([1.30950800e+06, 1.82888704e+08])"
      ]
     },
     "execution_count": 40,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z=np.zeros((2,))\n",
    "for i in range(x.shape[0]):\n",
    "    x1=x[i]\n",
    "    y1=y[i]\n",
    "    z[i]=np.dot(x1,y1)\n",
    "z"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 41,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[1., 1., 1.],\n",
       "       [1., 1., 1.]])"
      ]
     },
     "execution_count": 41,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z=np.ones((2,3))\n",
    "z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 矩阵和常数计算以及 Broadcasting广播"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 42,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 3)"
      ]
     },
     "execution_count": 42,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x = np.array([[3,4,5],[6,7,8],[1,2,3]])#二维矩阵\n",
    "x.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 43,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 5,  6,  7],\n",
       "       [ 8,  9, 10],\n",
       "       [ 3,  4,  5]])"
      ]
     },
     "execution_count": 43,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x+2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 44,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 6,  8, 10],\n",
       "       [12, 14, 16],\n",
       "       [ 2,  4,  6]])"
      ]
     },
     "execution_count": 44,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x*2"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 52,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 1)"
      ]
     },
     "execution_count": 52,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "y=np.array([[2],[4],[8]])\n",
    "y.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 46,
   "metadata": {
    "scrolled": false
   },
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 5,  6,  7],\n",
       "       [10, 11, 12],\n",
       "       [ 9, 10, 11]])"
      ]
     },
     "execution_count": 46,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x+y"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "## 矩阵变换"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 48,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, 3)"
      ]
     },
     "execution_count": 48,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z=np.array([[2, 4, 8]])\n",
    "z.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 50,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(3, 1)"
      ]
     },
     "execution_count": 50,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z=y.reshape(-1,1)\n",
    "z.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 53,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "(1, 3)"
      ]
     },
     "execution_count": 53,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "z=y.T\n",
    "z.shape"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": 54,
   "metadata": {},
   "outputs": [
    {
     "data": {
      "text/plain": [
       "array([[ 6, 16, 40],\n",
       "       [12, 28, 64],\n",
       "       [ 2,  8, 24]])"
      ]
     },
     "execution_count": 54,
     "metadata": {},
     "output_type": "execute_result"
    }
   ],
   "source": [
    "x*z"
   ]
  },
  {
   "cell_type": "markdown",
   "metadata": {},
   "source": [
    "# 思考题\n",
    "y=np.array([[2],[4],[8]])\n",
    "\n",
    "(y + y.T)是什么\n",
    "\n",
    "\n",
    "# 如果对操作有不确定，开一个jupyter notebook，测试后使用"
   ]
  },
  {
   "cell_type": "code",
   "execution_count": null,
   "metadata": {},
   "outputs": [],
   "source": []
  }
 ],
 "metadata": {
  "kernelspec": {
   "display_name": "Python 3",
   "language": "python",
   "name": "python3"
  },
  "language_info": {
   "codemirror_mode": {
    "name": "ipython",
    "version": 3
   },
   "file_extension": ".py",
   "mimetype": "text/x-python",
   "name": "python",
   "nbconvert_exporter": "python",
   "pygments_lexer": "ipython3",
   "version": "3.7.3"
  }
 },
 "nbformat": 4,
 "nbformat_minor": 2
}