cs224n_2019/Assignment_1_intro_word_vectors/python review.ipynb
chongjiu.jin 75b33e19fa a1
2019-10-21 18:05:16 +08:00

{
"cells": [
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Python NumPy Review\n",
"\n",
"A review focused mainly on NumPy.\n",
"\n",
"tutor: `chongjiujin # gmail.com`\n",
"\n",
"```\n",
"if you have any questions about Python or PyTorch:\n",
"\n",
"    print('add personal WeChat: flypython')\n",
"```"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# List Slicing\n",
"\n",
"List elements can be accessed in convenient ways.\n",
"\n",
"Basic format: `some_list[start_index:end_index]` (the element at `start_index` is included, the element at `end_index` is not)"
]
},
{
"cell_type": "code",
"execution_count": 1,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3, 4, 5, 6]"
]
},
"execution_count": 1,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers = [0, 1, 2, 3, 4, 5, 6]\n",
"numbers"
]
},
{
"cell_type": "code",
"execution_count": 2,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2]"
]
},
"execution_count": 2,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[0:3]"
]
},
{
"cell_type": "code",
"execution_count": 4,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3]"
]
},
"execution_count": 4,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[:4]"
]
},
{
"cell_type": "code",
"execution_count": 8,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[5, 6]"
]
},
"execution_count": 8,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[5:]"
]
},
{
"cell_type": "code",
"execution_count": 7,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[0, 1, 2, 3, 4, 5, 6]"
]
},
"execution_count": 7,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[:]"
]
},
{
"cell_type": "code",
"execution_count": 9,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"6"
]
},
"execution_count": 9,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# A negative index counts from the end of the list\n",
"numbers[-1]"
]
},
{
"cell_type": "code",
"execution_count": 10,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"[4, 5, 6]"
]
},
"execution_count": 10,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"numbers[-3:]"
]
},
{
"cell_type": "code",
"execution_count": 14,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"[]"
]
},
"execution_count": 14,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Positive and negative indices can be mixed; -10 is out of range, so it is clamped to 0 and the slice is empty\n",
"numbers[1:-10]"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# NumPy: Python's Matrix Computation Library\n",
"\n",
"\n",
"Optimized library for matrix and vector computation.\n",
"\n",
"Makes use of compiled C/C++ subroutines and memory-efficient data structures, which makes it much faster than pure Python loops.\n",
"\n",
"(Lots of computation can be efficiently represented as vectors.)\n",
"\n",
"**Main data type: `np.ndarray`**\n",
"\n",
"This is the data type that you will use to represent matrix/vector computations.\n",
"\n",
"Note: the constructor function is `np.array()`\n"
]
},
{
"cell_type": "code",
"execution_count": 15,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np  # import the library"
]
},
{
"cell_type": "code",
"execution_count": 18,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3,)"
]
},
"execution_count": 18,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1,2,3])  # 1-D vector\n",
"x\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 20,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(2, 3)"
]
},
"execution_count": 20,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.array([[3,4,5],[6,7,8]])  # 2-D matrix\n",
"y.shape"
]
},
{
"cell_type": "code",
"execution_count": 21,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 21,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y = np.array([[1],[2],[3]])  # each level of brackets adds one dimension\n",
"y.shape"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# np.ndarray Operations\n",
"\n",
"Reductions: `np.max`, `np.min`, `np.argmax`, `np.sum`, `np.mean`, …\n",
"\n",
"A reduction operates along the axis given by `axis`; if no axis is specified, it reduces over all axes and returns a scalar.\n",
"\n",
"(You can think of this as “collapsing” that axis into the function's output.)"
]
},
{
"cell_type": "code",
"execution_count": 22,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"3"
]
},
"execution_count": 22,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1,2,3])  # 1-D vector\n",
"x.max()  # equivalent to np.max(x)\n",
"#x.min()\n",
"#x.sum()\n",
"#x.mean()\n"
]
},
{
"cell_type": "code",
"execution_count": 25,
"metadata": {
"scrolled": true
},
"outputs": [
{
"data": {
"text/plain": [
"array([[5],\n",
"       [8]])"
]
},
"execution_count": 25,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
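"y = np.array([[3,4,5],[6,7,8]])  # take the maximum along an axis\n",
"# np.max(y, axis = 1) would return the 1-D array [5, 8];\n",
"# keepdims = True keeps the reduced axis with size 1, giving shape (2, 1)\n",
"np.max(y, axis = 1, keepdims = True)\n",
"#https://docs.scipy.org/doc/numpy/reference/generated/numpy.amax.html#numpy.amax"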
]
},
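{
"cell_type": "markdown",
"metadata": {},
"source": [
"To make the role of `axis` concrete, here is a small sketch that applies a few reductions to the same 2-D array used above. The particular calls chosen are illustrative additions, not part of the original review; the values in the comments follow directly from the array."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); same matrix as above\n",
"y = np.array([[3, 4, 5], [6, 7, 8]])  # shape (2, 3)\n",
"\n",
"print(np.sum(y))               # no axis: reduce over everything -> 33\n",
"print(np.sum(y, axis=0))       # collapse the rows, one value per column: 9, 11, 13\n",
"print(np.sum(y, axis=1))       # collapse the columns, one value per row: 12, 21\n",
"print(np.argmax(y, axis=1))    # index of the largest entry in each row: 2, 2\n",
"print(np.mean(y, axis=0, keepdims=True).shape)  # reduced axis kept with size 1 -> (1, 3)"
]
},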
{
"cell_type": "markdown",
"metadata": {},
"source": [
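"# Basic Matrix Operations\n",
"\n",
"\n",
"`np.dot` computes the dot product:\n",
"$$ \\text{np.dot}(v, w) = v^T w $$\n",
"https://docs.scipy.org/doc/numpy/reference/generated/numpy.dot.html?highlight=dot#numpy.dot\n",
"\n",
"`np.multiply` performs element-wise multiplication for both `np.array` and `np.matrix`. It is the `*` operator that differs: element-wise for `np.array`, matrix multiplication for `np.matrix`.\n",
"\n",
"https://docs.scipy.org/doc/numpy/reference/generated/numpy.multiply.html\n",
"\n",
"\n",
"Here we only discuss 1-D vectors."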
]
},
{
"cell_type": "code",
"execution_count": 26,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 26,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# np.dot: dot product\n",
"\n",
"x = np.array([1,2,3])  # 1-D vector\n",
"y = np.array([1,2,3])  # 1-D vector\n",
"np.dot(x,y)"
]
},
{
"cell_type": "code",
"execution_count": 27,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"14"
]
},
"execution_count": 27,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Same result: element-wise product, then sum (.T is a no-op on a 1-D array)\n",
"sum(x.T*y)"
]
},
{
"cell_type": "code",
"execution_count": 28,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([1, 4, 9])"
]
},
"execution_count": 28,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([1,2,3])  # 1-D vector\n",
"np.multiply(x,x)"
]
},
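{
"cell_type": "markdown",
"metadata": {},
"source": [
"The difference between the dot product and element-wise multiplication can be checked side by side. The sketch below is an illustrative addition: the matrix `A` is a made-up example, and the `@` operator (available for NumPy arrays since Python 3.5) is used as an alternative spelling of matrix multiplication."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); A is a hypothetical example matrix\n",
"x = np.array([1, 2, 3])               # shape (3,)\n",
"A = np.array([[1, 0, 0], [0, 2, 0]])  # shape (2, 3)\n",
"\n",
"print(np.dot(x, x))        # dot product of two 1-D vectors: 14\n",
"print(x @ x)               # the @ operator gives the same result: 14\n",
"print(x * x)               # element-wise, same as np.multiply(x, x): 1, 4, 9\n",
"print(A @ x)               # matrix-vector product: 1, 4\n",
"print(np.dot(A, x).shape)  # (2, 3) times (3,) gives shape (2,)"
]
},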
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Indexing"
]
},
{
"cell_type": "code",
"execution_count": 29,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3])"
]
},
"execution_count": 29,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Basic indexing works as with lists; a boolean mask selects the matching elements\n",
"x = np.array([1,2,3])  # 1-D vector\n",
"x[x > 2]"
]
},
{
"cell_type": "code",
"execution_count": 30,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([3, 2, 1])"
]
},
"execution_count": 30,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"index = [2,1,0]  # reorder the elements with a list of indices\n",
"x[index]"
]
},
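{
"cell_type": "markdown",
"metadata": {},
"source": [
"The same indexing ideas extend to 2-D arrays, where each axis takes its own index or slice. The following is a minimal sketch; the array `m` and the chosen expressions are illustrative additions rather than part of the original review."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); m is a hypothetical example\n",
"m = np.array([[3, 4, 5], [6, 7, 8]])\n",
"\n",
"print(m[0, 1])        # row 0, column 1 -> 4\n",
"print(m[:, 1])        # every row, column 1 -> 4, 7\n",
"print(m[1, ::-1])     # row 1 reversed -> 8, 7, 6\n",
"print(m[m % 2 == 0])  # boolean mask keeps the even entries (row-major order) -> 4, 6, 8"
]
},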
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Iterating over a Matrix\n",
"\n",
"Sometimes you need to iterate over all the vectors in a matrix."
]
},
{
"cell_type": "code",
"execution_count": 31,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[[3 4 5]\n",
" [6 7 8]]\n"
]
}
],
"source": [
"y = np.array([[3,4,5],[6,7,8]])  # 2-D matrix\n",
"print(y)\n"
]
},
{
"cell_type": "code",
"execution_count": 32,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"[3 4 5]\n",
"-----\n",
"[6 7 8]\n",
"-----\n"
]
}
],
"source": [
"# Iteration runs over the first dimension (rows) by default\n",
"for y1 in y:\n",
"    print(y1)\n",
"    print(\"-----\")"
]
},
{
"cell_type": "code",
"execution_count": 33,
"metadata": {
"scrolled": true
},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"2 3\n"
]
}
],
"source": [
"# To iterate along a chosen dimension, first unpack the shape\n",
"d1,d2= y.shape\n",
"print(d1,d2)"
]
},
{
"cell_type": "code",
"execution_count": 34,
"metadata": {},
"outputs": [
{
"name": "stdout",
"output_type": "stream",
"text": [
"0 [3 6]\n",
"1 [4 7]\n",
"2 [5 8]\n"
]
}
],
"source": [
"for d in range(d2):\n",
"    print(d,y[:,d])"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Efficient NumPy Code\n",
"Use NumPy's vectorized features wherever possible to improve efficiency."
]
},
{
"cell_type": "code",
"execution_count": 36,
"metadata": {},
"outputs": [],
"source": [
"x = np.array([[3,4,5],[6,7,8]])  # 2-D matrix\n",
"y = np.array([[1,2,3],[9,0,10]])  # 2-D matrix"
]
},
{
"cell_type": "code",
"execution_count": 37,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 9, 16, 25],\n",
"       [36, 49, 64]])"
]
},
"execution_count": 37,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Slow: square every element with explicit Python loops\n",
"for i in range(x.shape[0]):\n",
"    for j in range(x.shape[1]):\n",
"        x[i,j] **= 2\n",
"x"
]
},
{
"cell_type": "code",
"execution_count": 38,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[  81,  256,  625],\n",
"       [1296, 2401, 4096]])"
]
},
"execution_count": 38,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"# Vectorized: square every element of x in a single operation\n",
"x **= 2\n",
"x"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# All-Zeros and All-Ones Matrices"
]
},
{
"cell_type": "code",
"execution_count": 40,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([ 2468., 52624.])"
]
},
"execution_count": 40,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
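"z = np.zeros((2,))  # all-zeros vector that will hold one result per row\n",
"for i in range(x.shape[0]):\n",
"    x1 = x[i]\n",
"    y1 = y[i]\n",
"    z[i] = np.dot(x1, y1)  # dot product of row i of x with row i of y\n",
"z"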
]
},
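{
"cell_type": "markdown",
"metadata": {},
"source": [
"The row-by-row loop above can usually be replaced by a single vectorized expression. The sketch below compares the loop with two possible alternatives, `np.sum(x * y, axis=1)` and `np.einsum`. It is an illustrative addition that uses fresh copies of the original small matrices, so its numbers differ from the output above, where `x` had already been squared."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); fresh, unsquared inputs\n",
"x = np.array([[3, 4, 5], [6, 7, 8]])\n",
"y = np.array([[1, 2, 3], [9, 0, 10]])\n",
"\n",
"# Loop version: one dot product per row\n",
"z_loop = np.zeros((2,))\n",
"for i in range(x.shape[0]):\n",
"    z_loop[i] = np.dot(x[i], y[i])\n",
"\n",
"# Vectorized versions: multiply element-wise, then sum along each row\n",
"z_vec = np.sum(x * y, axis=1)\n",
"z_einsum = np.einsum('ij,ij->i', x, y)\n",
"\n",
"print(z_loop, z_vec, z_einsum)  # each holds the row-wise dot products 26 and 134\n",
"print(np.allclose(z_loop, z_vec) and np.allclose(z_loop, z_einsum))  # True"
]
},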
{
"cell_type": "code",
"execution_count": 41,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[1., 1., 1.],\n",
"       [1., 1., 1.]])"
]
},
"execution_count": 41,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=np.ones((2,3))\n",
"z"
]
},
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Matrix-Scalar Arithmetic and Broadcasting"
]
},
{
"cell_type": "code",
"execution_count": 42,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 3)"
]
},
"execution_count": 42,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x = np.array([[3,4,5],[6,7,8],[1,2,3]])  # 2-D matrix\n",
"x.shape"
]
},
{
"cell_type": "code",
"execution_count": 43,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5,  6,  7],\n",
"       [ 8,  9, 10],\n",
"       [ 3,  4,  5]])"
]
},
"execution_count": 43,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x+2"
]
},
{
"cell_type": "code",
"execution_count": 44,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6,  8, 10],\n",
"       [12, 14, 16],\n",
"       [ 2,  4,  6]])"
]
},
"execution_count": 44,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x*2"
]
},
{
"cell_type": "code",
"execution_count": 52,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 52,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"y=np.array([[2],[4],[8]])\n",
"y.shape"
]
},
{
"cell_type": "code",
"execution_count": 46,
"metadata": {
"scrolled": false
},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 5,  6,  7],\n",
"       [10, 11, 12],\n",
"       [ 9, 10, 11]])"
]
},
"execution_count": 46,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x+y"
]
},
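{
"cell_type": "markdown",
"metadata": {},
"source": [
"Broadcasting stretches any axis of size 1 (or a missing leading axis) to match the other operand. The sketch below contrasts adding a column vector with adding a 1-D row vector to the same matrix; the names `col` and `row` are illustrative additions, and `np.broadcast(...).shape` is used only to display the resulting shape."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); col and row are hypothetical names\n",
"x = np.array([[3, 4, 5], [6, 7, 8], [1, 2, 3]])  # shape (3, 3)\n",
"col = np.array([[2], [4], [8]])                  # shape (3, 1): one value per row\n",
"row = np.array([2, 4, 8])                        # shape (3,): one value per column\n",
"\n",
"print(np.broadcast(x, col).shape)  # (3, 3)\n",
"print((x + col)[1])                # row 1 gets 4 added everywhere: 10, 11, 12\n",
"print((x + row)[1])                # row 1 gets 2, 4, 8 added column by column: 8, 11, 16"
]
},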
{
"cell_type": "markdown",
"metadata": {},
"source": [
"## Reshaping and Transposing Matrices"
]
},
{
"cell_type": "code",
"execution_count": 48,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 3)"
]
},
"execution_count": 48,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=np.array([[2, 4, 8]])\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 50,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(3, 1)"
]
},
"execution_count": 50,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=y.reshape(-1,1)\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 53,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"(1, 3)"
]
},
"execution_count": 53,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"z=y.T\n",
"z.shape"
]
},
{
"cell_type": "code",
"execution_count": 54,
"metadata": {},
"outputs": [
{
"data": {
"text/plain": [
"array([[ 6, 16, 40],\n",
"       [12, 28, 64],\n",
"       [ 2,  8, 24]])"
]
},
"execution_count": 54,
"metadata": {},
"output_type": "execute_result"
}
],
"source": [
"x*z"
]
},
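{
"cell_type": "markdown",
"metadata": {},
"source": [
"In `reshape`, a `-1` asks NumPy to infer that dimension from the total number of elements. The following minimal sketch shows a few common shape changes; the vector `v` and the use of `np.newaxis` are illustrative additions that do not appear in the original review."
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": [
"import numpy as np\n",
"\n",
"# Illustrative sketch (not from the original review); v is a hypothetical 1-D vector\n",
"y = np.array([[2], [4], [8]])  # shape (3, 1)\n",
"v = np.array([2, 4, 8])        # shape (3,)\n",
"\n",
"print(y.reshape(-1).shape)     # flatten to 1-D -> (3,)\n",
"print(y.reshape(1, -1).shape)  # one row, columns inferred -> (1, 3)\n",
"print(y.T.shape)               # transpose swaps the two axes -> (1, 3)\n",
"print(v.T.shape)               # transposing a 1-D array changes nothing -> (3,)\n",
"print(v[:, np.newaxis].shape)  # add a trailing axis to get a column -> (3, 1)"
]
},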
{
"cell_type": "markdown",
"metadata": {},
"source": [
"# Exercise\n",
"`y = np.array([[2],[4],[8]])`\n",
"\n",
"What is `(y + y.T)`?\n",
"\n",
"\n",
"# If you are unsure about an operation, open a Jupyter notebook and test it before using it"
]
},
{
"cell_type": "code",
"execution_count": null,
"metadata": {},
"outputs": [],
"source": []
}
],
"metadata": {
"kernelspec": {
"display_name": "Python 3",
"language": "python",
"name": "python3"
},
"language_info": {
"codemirror_mode": {
"name": "ipython",
"version": 3
},
"file_extension": ".py",
"mimetype": "text/x-python",
"name": "python",
"nbconvert_exporter": "python",
"pygments_lexer": "ipython3",
"version": "3.7.3"
}
},
"nbformat": 4,
"nbformat_minor": 2
}