cs224n_2019/Assignment_1_intro_word_vectors/python review.ipynb


2019-10-21 18:05:16 +08:00
{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python Numpy Review\n",
"\n",
"Mainly a review of NumPy.\n",
"\n",
"tutor: `chongjiujin # gmail.com`\n",
"\n",
"```\n",
"if you have any questions about Python or PyTorch:\n",
"\n",
" print(add personal WeChat: flypython)\n",
" ```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# List Slicing\n",
"\n",
"List elements can be accessed in convenient ways.\n",
"\n",
"Basic format: `some_list[start_index:end_index]` (the element at `start_index` is included, the element at `end_index` is not)."
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3, 4, 5, 6]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers = [0, 1, 2, 3, 4, 5, 6]\n",
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[0:3]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[:4]"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[5, 6]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[5:]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3, 4, 5, 6]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[:]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# A negative index counts from the end of the list\n",
"numbers[-1]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[4, 5, 6]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[-3:]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
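],
"source": [
"# Positive and negative indices can be mixed; -10 is clamped to the start of this 7-element list, so the slice is empty\n",
"numbers[1:-10]"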
]
},
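{
"cell_type": "markdown",
"metadata": {},
"source": [
"Why the last slice is empty can be seen in a small sketch. It reuses the `numbers` list from the cells above; the variable names `empty` and `middle` are only illustrative.\n",
"\n",
"```python\n",
"numbers = [0, 1, 2, 3, 4, 5, 6]\n",
"\n",
"# -10 points before the start of a 7-element list, so the stop index is clamped to 0.\n",
"empty = numbers[1:-10]   # start 1, stop 0: nothing is selected\n",
"middle = numbers[1:-1]   # from index 1 up to, but not including, the last element\n",
"\n",
"print(empty, middle)\n",
"```\n",
"\n",
"An out-of-range index inside a slice is clamped rather than raising an error, unlike `numbers[-10]`, which raises `IndexError`."
]
},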
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy: a Python library for matrix computation\n",
"\n",
"\n",
"Optimized library for matrix and vector computation.\n",
"\n",
"Makes use of C/C++ subroutines and memory-efficient data structures. The core is compiled C/C++, which makes it more efficient.\n",
"\n",
"(Lots of computation can be efficiently represented as vectors.)\n",
"\n",
"**Main data type: `np.ndarray`**\n",
"\n",
"This is the data type that you will use to represent matrix/vector computations.\n",
"\n",
"Note: the constructor function is `np.array()`.\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np  # import the library"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3,)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1, 2, 3])  # 1-D vector\n",
"x\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(2, 3)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.array([[3, 4, 5], [6, 7, 8]])  # 2-D matrix\n",
"y.shape"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.array([[1], [2], [3]])  # each level of brackets adds one dimension\n",
"y.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# np.ndarray Operations\n",
"\n",
"Reductions: `np.max`, `np.min`, `np.argmax`, `np.sum`, `np.mean`, …\n",
"\n",
"A reduction always operates along an axis. (If no axis is specified, it reduces along all axes.)\n",
"\n",
"(You can think of this as “collapsing” that axis into the function's output.)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1, 2, 3])  # 1-D vector\n",
"x.max()  # same as np.max(x)\n",
"#x.min()\n",
"#x.sum()\n",
"#x.mean()\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"array([[5],\n",
" [8]])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.array([[3, 4, 5], [6, 7, 8]])  # take the maximum along an axis\n",
"#np.max(y, axis=1)\n",
"np.max(y, axis=1, keepdims=True)  # keepdims keeps the reduced axis with size 1\n",
"#https://docs.scipy.org/doc/numpy/reference/generated/numpy.amax.html#numpy.amax"
]
},
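{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the role of `axis` and `keepdims` concrete, here is a minimal sketch that reduces the same array in several ways. The variable names are only illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"y = np.array([[3, 4, 5], [6, 7, 8]])  # shape (2, 3)\n",
"\n",
"col_max = np.max(y, axis=0)                    # collapse axis 0: one value per column\n",
"row_max = np.max(y, axis=1)                    # collapse axis 1: one value per row\n",
"row_max_2d = np.max(y, axis=1, keepdims=True)  # same values, reduced axis kept with size 1\n",
"total_max = np.max(y)                          # no axis given: reduce over all axes\n",
"\n",
"print(col_max.shape, row_max.shape, row_max_2d.shape, total_max)\n",
"```\n",
"\n",
"Keeping the reduced axis is convenient when the result is later combined with the original array through broadcasting."
]
},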
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Basic matrix operations\n",
"\n",
"\n",
"`np.dot` computes the dot product:\n",
"$$ np.dot(v,w)=v^T w $$\n",
"https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html?highlight=dot#numpy.dot\n",
"\n",
"`np.multiply` is element-wise multiplication, for both `np.array` and `np.matrix`. It is the `*` operator that differs: element-wise for `np.array`, matrix multiplication for `np.matrix`.\n",
"\n",
"https://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html\n",
"\n",
"\n",
"Here we only discuss 1-D vectors."
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# np.dot: dot product\n",
"\n",
"x = np.array([1, 2, 3])  # 1-D vector\n",
"y = np.array([1, 2, 3])  # 1-D vector\n",
"np.dot(x, y)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# For a 1-D array .T is a no-op, so this sums the element-wise product\n",
"sum(x.T*y)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 4, 9])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1, 2, 3])  # 1-D vector\n",
"np.multiply(x, x)  # element-wise product"
]
},
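{
"cell_type": "markdown",
"metadata": {},
"source": [
"As a quick cross-check of the two operations on 1-D arrays, the sketch below uses the operator forms `*` and `@` next to the function forms. The variable names are only illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"x = np.array([1, 2, 3])\n",
"y = np.array([1, 2, 3])\n",
"\n",
"elementwise = x * y  # same result as np.multiply(x, y)\n",
"inner = x @ y        # same result as np.dot(x, y) for 1-D arrays\n",
"\n",
"print(elementwise, inner)\n",
"```\n",
"\n",
"For plain `np.ndarray` objects the operators and the functions agree, so the choice is mostly a matter of readability."
]
},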
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Indexing"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Basic indexing works like a list; arrays also accept a boolean mask\n",
"x = np.array([1, 2, 3])  # 1-D vector\n",
"x[x > 2]\n"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 2, 1])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"index = [2, 1, 0]  # reorder the elements with a list of indices\n",
"x[index]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Iterating over a matrix\n",
"\n",
"Sometimes you need to iterate over all the vectors in a matrix."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[3 4 5]\n",
" [6 7 8]]\n"
]
}
],
"source": [
"y = np.array([[3, 4, 5], [6, 7, 8]])  # 2-D matrix\n",
"print(y)\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3 4 5]\n",
"-----\n",
"[6 7 8]\n",
"-----\n"
]
}
],
"source": [
"# By default, iteration goes along the first axis (row by row)\n",
"for y1 in y:\n",
"    print(y1)\n",
"    print(\"-----\")"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 3\n"
]
}
],
"source": [
"# To iterate along a chosen axis, first read the size of each axis\n",
"d1, d2 = y.shape\n",
"print(d1, d2)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 [3 6]\n",
"1 [4 7]\n",
"2 [5 8]\n"
]
}
],
"source": [
"# Iterate along the second axis (column by column)\n",
"for d in range(d2):\n",
"    print(d, y[:, d])"
]
},
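{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Efficient Numpy Code\n",
"Use NumPy's built-in features to improve efficiency wherever possible, for example vectorized operations instead of Python loops."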
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"x = np.array([[3, 4, 5], [6, 7, 8]])  # 2-D matrix\n",
"y = np.array([[1, 2, 3], [9, 0, 10]])  # 2-D matrix"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 9, 16, 25],\n",
" [36, 49, 64]])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Slow: square every element with explicit Python loops\n",
"for i in range(x.shape[0]):\n",
"    for j in range(x.shape[1]):\n",
"        x[i, j] **= 2\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 81, 256, 625],\n",
" [1296, 2401, 4096]])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Fast: square the whole array in one vectorized operation\n",
"x **= 2\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# All-zeros and all-ones arrays"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1.30950800e+06, 1.82888704e+08])"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
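],
"source": [
"# np.zeros allocates the result array; each entry is then filled with one row-wise dot product\n",
"z = np.zeros((2,))\n",
"for i in range(x.shape[0]):\n",
"    x1 = x[i]\n",
"    y1 = y[i]\n",
"    z[i] = np.dot(x1, y1)\n",
"z"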
]
},
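{
"cell_type": "markdown",
"metadata": {},
"source": [
"The row-by-row loop above can usually be replaced by a single vectorized expression. The sketch below compares both on freshly defined arrays (so the numbers differ from the output above, where `x` had already been squared); the names `z_loop` and `z_vec` are only illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"x = np.array([[3, 4, 5], [6, 7, 8]])\n",
"y = np.array([[1, 2, 3], [9, 0, 10]])\n",
"\n",
"# Loop version: one np.dot call per row.\n",
"z_loop = np.zeros((x.shape[0],))\n",
"for i in range(x.shape[0]):\n",
"    z_loop[i] = np.dot(x[i], y[i])\n",
"\n",
"# Vectorized version: multiply element-wise, then sum along each row.\n",
"z_vec = np.sum(x * y, axis=1)\n",
"\n",
"print(z_loop, z_vec)\n",
"```\n",
"\n",
"Both versions compute the same values; the vectorized one keeps the integer dtype of the inputs, while the loop writes into a float array created by `np.zeros`."
]
},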
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 1., 1.],\n",
" [1., 1., 1.]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=np.ones((2,3))\n",
"z"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Matrix-scalar arithmetic and broadcasting"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 3)"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([[3, 4, 5], [6, 7, 8], [1, 2, 3]])  # 2-D matrix\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5, 6, 7],\n",
" [ 8, 9, 10],\n",
" [ 3, 4, 5]])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x+2"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6, 8, 10],\n",
" [12, 14, 16],\n",
" [ 2, 4, 6]])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x*2"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y=np.array([[2],[4],[8]])\n",
"y.shape"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5, 6, 7],\n",
" [10, 11, 12],\n",
" [ 9, 10, 11]])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
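],
"source": [
"# y has shape (3, 1), so it is broadcast across the columns of x: y[i] is added to every entry of row i\n",
"x+y"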
]
},
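{
"cell_type": "markdown",
"metadata": {},
"source": [
"What broadcasting does to `y` in `x + y` can be made visible with `np.broadcast_to`. This is a minimal sketch; the name `y_stretched` is only illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"x = np.array([[3, 4, 5], [6, 7, 8], [1, 2, 3]])  # shape (3, 3)\n",
"y = np.array([[2], [4], [8]])                    # shape (3, 1)\n",
"\n",
"# The size-1 axis of y is stretched to match x; this is a read-only view, not a copy.\n",
"y_stretched = np.broadcast_to(y, x.shape)\n",
"\n",
"print(y_stretched)\n",
"print(np.array_equal(x + y, x + y_stretched))\n",
"```\n",
"\n",
"In everyday code the explicit `np.broadcast_to` call is not needed, since `x + y` applies the same rule automatically."
]
},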
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reshaping and transposing"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 3)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=np.array([[2, 4, 8]])\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z = y.reshape(-1, 1)  # -1 lets NumPy infer that dimension from the total size\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 3)"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z = y.T  # transpose: (3, 1) becomes (1, 3)\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6, 16, 40],\n",
" [12, 28, 64],\n",
" [ 2, 8, 24]])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# z has shape (1, 3), so it is broadcast across the rows of x: column j is scaled by z[0, j]\n",
"x*z"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise\n",
"y=np.array([[2],[4],[8]])\n",
"\n",
"What is `(y + y.T)`?\n",
"\n",
"\n",
"# If you are unsure about an operation, open a Jupyter notebook and test it before using it"
]
},
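{
"cell_type": "markdown",
"metadata": {},
"source": [
"One way to check your answer to the exercise, after working it out by hand, is the short sketch below. The name `result` is only illustrative.\n",
"\n",
"```python\n",
"import numpy as np\n",
"\n",
"y = np.array([[2], [4], [8]])  # shape (3, 1)\n",
"\n",
"# (3, 1) + (1, 3) broadcasts to (3, 3): entry [i, j] is y[i, 0] + y[j, 0].\n",
"result = y + y.T\n",
"\n",
"print(result.shape)\n",
"print(result)\n",
"```\n",
"\n",
"Both operands are stretched along their size-1 axis, so the result is a full table of pairwise sums."
]
},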
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}