alloc_finish

This commit is contained in:
zmr961006
2017-04-18 09:39:37 +08:00
parent d591f5a191
commit 5cc326767b
10 changed files with 1114 additions and 1 deletion


@@ -1 +1,226 @@
## Memory Allocation
We will not go into architecture-specific memory management here; for the concrete structure of memory, see my blog posts on memory layout. This section focuses on the kernel's uniform memory-allocation interfaces.
### kmalloc
From `include/linux/slab.h`:
```
static __always_inline void *kmalloc(size_t size, gfp_t flags)
{
	if (__builtin_constant_p(size)) {
		if (size > KMALLOC_MAX_CACHE_SIZE)
			return kmalloc_large(size, flags);
	}
	return __kmalloc(size, flags);
}
```
kmalloc takes two arguments: size, the number of bytes to allocate, and flags, the allocation flags.
GFP_KERNEL is not always the right flag to use; sometimes kmalloc is called from outside a process's context, for example from interrupt handlers, tasklets, or kernel timers. In such cases the current process must not be put to sleep, and the driver should use GFP_ATOMIC instead. The kernel normally keeps some free pages in reserve so that atomic allocations can be satisfied; with GFP_ATOMIC, kmalloc is allowed to dig into even the last free page. If that last free page does not exist, however, the allocation fails.
Other flags can replace or augment GFP_KERNEL and GFP_ATOMIC, although those two cover most of a device driver's needs. All the flags are defined in `linux/gfp.h`; individual flags are prefixed with a double underscore, such as __GFP_DMA. In addition, there are symbols representing frequently used flag combinations; these lack the prefix and are sometimes called allocation priorities. The latter include:
GFP_ATOMIC
Used to allocate memory from interrupt handlers and other code outside of process context. Never sleeps.
GFP_KERNEL
Normal allocation of kernel memory. May sleep.
GFP_USER
Used to allocate memory for user-space pages; it may sleep.
GFP_HIGHUSER
Like GFP_USER, but allocates from high memory, if any is available. High memory is described in the next subsection.
GFP_NOIO
GFP_NOFS
These flags function like GFP_KERNEL, but they restrict what the kernel may do to satisfy the request. A GFP_NOFS allocation is not allowed to perform any filesystem calls, while GFP_NOIO may not initiate any I/O at all. They are used mostly in filesystem and virtual-memory code, where an allocation is allowed to sleep but recursive filesystem calls would be a bad idea.
Any of the allocation flags listed above can be OR-ed with the following flags, which change how the allocation is carried out:
__GFP_DMA
This flag requests allocation from the DMA-capable memory zone. The exact meaning is platform-dependent and is explained in the following section.
__GFP_HIGHMEM
This flag indicates that the allocated memory may be located in high memory.
__GFP_COLD
Normally, the allocator tries to return "cache-warm" pages, pages likely to be found in the processor cache. This flag instead requests a "cold" page, one that has not been used for some time. It is useful for pages that will be the target of DMA reads, where presence in the processor cache brings no benefit. See the section "Direct Memory Access" in Chapter 1 for a full discussion of allocating DMA buffers.
__GFP_NOWARN
This rarely used flag keeps the kernel from issuing warnings (via printk) when an allocation cannot be satisfied.
__GFP_HIGH
This flag marks a high-priority request, which is permitted to consume even the last pages of memory the kernel reserves for emergencies.
__GFP_REPEAT
__GFP_NOFAIL
__GFP_NORETRY
These flags modify how the allocator behaves when it has trouble satisfying an allocation. __GFP_REPEAT means "try a little harder" by repeating the attempt, though the allocation may still fail. __GFP_NOFAIL tells the allocator never to fail; it tries as hard as it can to satisfy the request. Using __GFP_NOFAIL is strongly discouraged; there is probably never a valid reason to use it in a device driver. Finally, __GFP_NORETRY tells the allocator to give up immediately if the requested memory is not available.
There is an upper limit to the size of the memory block kmalloc can allocate. That limit varies with the architecture and with kernel configuration options. Code that is meant to be fully portable should not count on being able to allocate anything larger than 128 KB. If you need more than a few kilobytes, however, there are better ways than kmalloc to obtain memory, described later in this chapter.
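To make the choice of flags concrete, here is a minimal sketch (the grab_buffer* helper names are made up for illustration) of which flag fits which context:
```
#include <linux/slab.h>

/* Process context, sleeping allowed: GFP_KERNEL is the normal choice. */
static void *grab_buffer(size_t len)
{
	return kmalloc(len, GFP_KERNEL);
}

/* Interrupt handler, tasklet, or timer: must not sleep, use GFP_ATOMIC. */
static void *grab_buffer_atomic(size_t len)
{
	return kmalloc(len, GFP_ATOMIC);
}

/* OR-in a zone modifier for a DMA-capable buffer; the exact meaning of
 * __GFP_DMA is platform-dependent, as noted above. */
static void *grab_dma_buffer(size_t len)
{
	return kmalloc(len, GFP_KERNEL | __GFP_DMA);
}
```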
### Memory Zones
The kernel divides memory into roughly three zones.
![memory zones](./image/mem_seg.png)
### Lookaside Caches (slab)
A device driver typically reuses memory objects of a fixed size over and over, with no need to allocate and free them constantly. The kernel's slab mechanism addresses exactly this case: it pre-allocates a cache of objects of the same type, which we can then reuse repeatedly, directly reducing the number of real allocations and frees.
With the buddy system we can obtain physically contiguous memory in units of pages, that is, 4 KB at a time. But if we need to frequently get and release small amounts of contiguous physical memory, tens or hundreds of bytes at a time, buddy is a poor fit; this is what slab is for.
Say I need 100 bytes of contiguous physical memory: the kernel's slab allocator hands me a unit of the matching size class, 128 bytes (never exactly 100 bytes; the request is rounded up to an aligned size class, so 100 bytes maps to 128, 30 bytes to 32, 60 bytes to 64). This memory still ultimately comes from physical pages obtained from the buddy system. When I no longer need the object I release it, but not back to the buddy system: it goes back to the slab allocator, so the next request can be satisfied without another trip to buddy. This is also why the slab allocator tends to hand the most recently freed ("hot") memory to the next requester, which is good for efficiency.
With this in place, the code in the scullc module works just as the original module did; nothing changes in behavior, but efficiency improves considerably.
![scullc](./image/ssc.png)
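The slab API itself is small. Below is a minimal sketch of the create/alloc/free/destroy cycle that scullc also follows, using a hypothetical object type my_obj:
```
#include <linux/slab.h>
#include <linux/errno.h>

struct my_obj {				/* hypothetical fixed-size object */
	int id;
	char payload[100];		/* slab rounds the total up to a size class */
};

static struct kmem_cache *my_obj_cache;

static int my_obj_cache_setup(void)
{
	/* One cache per object type; the allocator handles alignment. */
	my_obj_cache = kmem_cache_create("my_obj_cache", sizeof(struct my_obj),
					 0, SLAB_HWCACHE_ALIGN, NULL);
	return my_obj_cache ? 0 : -ENOMEM;
}

static struct my_obj *my_obj_get(void)
{
	return kmem_cache_alloc(my_obj_cache, GFP_KERNEL);
}

static void my_obj_put(struct my_obj *obj)
{
	kmem_cache_free(my_obj_cache, obj);	/* back to the cache, not to buddy */
}

static void my_obj_cache_teardown(void)
{
	kmem_cache_destroy(my_obj_cache);
}
```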
### Kernel Memory Pools
The memory-pool technique allocates a number of (usually equal-sized) memory blocks in advance, before they are actually needed, and keeps them in reserve. When a new memory request arrives, a block is handed out from the pool, and more memory is requested only if the pool runs short. The notable benefits are that memory fragmentation is largely avoided and allocation efficiency improves.
Memory pools are widely used not only in user-space applications but also inside the Linux kernel, where in quite a few places a memory allocation is not allowed to fail. As a way of guaranteeing allocation in those situations, kernel developers created an abstraction known as the memory pool (or "mempool"). A kernel mempool is really just a form of lookaside cache: it tries to keep a list of free memory on hand for emergencies, while under normal circumstances requests are still satisfied directly from the general allocator. This arguably hogs memory, but it fundamentally guarantees that a critical allocation can still succeed when memory is tight.
Now let us look at the mempool source. The implementation, in `mm/mempool.c`, is very concise. The structure describing a pool, mempool_t, is defined in the header `linux/mempool.h`:
```
typedef struct mempool_s {
	spinlock_t lock;          /* spinlock protecting the pool */
	int min_nr;               /* minimum number of elements kept in the pool */
	int curr_nr;              /* number of elements currently available */
	void **elements;          /* array of pointers to the pooled elements */
	void *pool_data;          /* the memory source the elements really come from */
	mempool_alloc_t *alloc;   /* method for allocating an element */
	mempool_free_t *free;     /* method for releasing an element */
	wait_queue_head_t wait;   /* wait queue for blocked allocators */
} mempool_t;
```
The pool creation function, mempool_create, is defined as follows:
```
mempool_t *mempool_create(int min_nr, mempool_alloc_t *alloc_fn,
			  mempool_free_t *free_fn, void *pool_data)
{
	return mempool_create_node(min_nr, alloc_fn, free_fn, pool_data, -1);
}
```
The caller specifies how many elements the pool can hold, the methods for allocating and freeing elements, and an optional memory source (typically a kmem_cache). Once the pool object has been created, the alloc method is invoked automatically to allocate min_nr elements from pool_data and fill the pool.
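As a sketch of this calling convention (my_cache, my_pool, MY_MIN_NR, and struct my_req are all hypothetical names), a pool backed by a slab cache can be built from the stock helpers mempool_alloc_slab and mempool_free_slab, which simply forward to kmem_cache_alloc and kmem_cache_free on pool_data:
```
#include <linux/mempool.h>
#include <linux/slab.h>
#include <linux/errno.h>

#define MY_MIN_NR 4			/* hypothetical reserve size */

struct my_req {				/* hypothetical pooled object */
	int data;
};

static struct kmem_cache *my_cache;	/* pool_data: the memory source */
static mempool_t *my_pool;

static int my_pool_setup(void)
{
	my_cache = kmem_cache_create("my_req_cache", sizeof(struct my_req),
				     0, 0, NULL);
	if (!my_cache)
		return -ENOMEM;

	/* The kernel also offers mempool_create_slab_pool() as a
	 * shorthand for exactly this pairing. */
	my_pool = mempool_create(MY_MIN_NR, mempool_alloc_slab,
				 mempool_free_slab, my_cache);
	if (!my_pool) {
		kmem_cache_destroy(my_cache);
		return -ENOMEM;
	}
	return 0;
}
```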
The pool destruction function, mempool_destroy, is simple, and you can probably guess what it does: it removes the elements from the pool one by one, returns each to pool_data, and finally frees the pool object itself:
```
void mempool_destroy(mempool_t *pool)
{
	while (pool->curr_nr) {
		void *element = remove_element(pool);
		pool->free(element, pool->pool_data);
	}
	kfree(pool->elements);
	kfree(pool);
}
```
Worth a closer look are the functions that allocate and recycle pool objects, mempool_alloc and mempool_free. mempool_alloc obtains one object from the given pool; abridged, it reads:
```
void *mempool_alloc(mempool_t *pool, gfp_t gfp_mask)
{
	......
	element = pool->alloc(gfp_temp, pool->pool_data);
	if (likely(element != NULL))
		return element;

	spin_lock_irqsave(&pool->lock, flags);
	if (likely(pool->curr_nr)) {
		element = remove_element(pool);	/* take one object out of the pool */
		spin_unlock_irqrestore(&pool->lock, flags);
		/* paired with rmb in mempool_free(), read comment there */
		smp_wmb();
		return element;
	}
	......
}
```
The function first tries to allocate an element from pool_data; only when that fails does it take an object out of the pool. The kernel mempool is thus really a reserve pool: objects are drawn from the reserve only when memory is tight, which is what keeps the success rate high in extreme conditions. Even so, success is not guaranteed, because the pool is finite in size: if its objects are exhausted too, the process can only go to sleep, that is, it is added to the pool->wait wait queue and is woken up to retry once an object becomes available again:
```
init_wait(&wait);
prepare_to_wait(&pool->wait, &wait, TASK_UNINTERRUPTIBLE);
spin_unlock_irqrestore(&pool->lock, flags);
io_schedule_timeout(5*HZ);
finish_wait(&pool->wait, &wait);
```
mempool_free, which recycles an object back into the pool, reads:
```
void mempool_free(void *element, mempool_t *pool)
{
	if (pool->curr_nr < pool->min_nr) {
		spin_lock_irqsave(&pool->lock, flags);
		if (pool->curr_nr < pool->min_nr) {
			add_element(pool, element);
			spin_unlock_irqrestore(&pool->lock, flags);
			wake_up(&pool->wait);
			return;
		}
		spin_unlock_irqrestore(&pool->lock, flags);
	}
	pool->free(element, pool->pool_data);
}
```
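For completeness, a hypothetical alloc/free pair built on the my_pool object from the sketch above; GFP_NOIO is just an example mask for an I/O path that must not recurse into I/O:
```
static int submit_request(void)
{
	struct my_req *req;

	/* With a sleepable mask, mempool_alloc falls back to the reserve
	 * and retries rather than failing outright. */
	req = mempool_alloc(my_pool, GFP_NOIO);
	if (!req)
		return -ENOMEM;
	/* ... fill in and submit the request ... */
	mempool_free(req, my_pool);	/* tops the reserve back up first */
	return 0;
}
```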
In device drivers it is best not to use mempools: they waste a great deal of memory, and you must also guarantee consistency yourself; the recycling is not easy to get right, and mistakes lead straight to an oops.
### VM Allocation
This is essentially still physical allocation underneath, with scattered pages remapped into a contiguous virtual range; the details are left for the reader to study.
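A minimal sketch of the interface, with an arbitrary buffer size and hypothetical helper names:
```
#include <linux/vmalloc.h>
#include <linux/errno.h>

static void *big_buf;			/* hypothetical large buffer */

static int big_buf_setup(void)
{
	/* Virtually contiguous, physically scattered; may sleep, so
	 * process context only, and the result is unsuitable for DMA. */
	big_buf = vmalloc(1024 * 1024);
	return big_buf ? 0 : -ENOMEM;
}

static void big_buf_teardown(void)
{
	vfree(big_buf);
}
```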
### per-CPU Variables
A new feature of the 2.6 kernel is per-CPU variables: as the name suggests, each processor gets its own copy of the variable.
Their biggest advantage is that access requires almost no locking, because every CPU works on its own copy.
Mechanisms such as tasklets and timer_lists use the per-CPU technique internally.
API usage:
Note that the 2.6 kernel is preemptible, so per-CPU variables must be accessed through the dedicated APIs, which avoid preemption, that is, they prevent the task from being migrated to another CPU in the middle of an access.
Per-CPU variables can be declared at compile time or created dynamically at run time.
Example 1:
Creating a per-CPU variable at compile time:
```
DEFINE_PER_CPU(int, my_percpu);          // define a per-CPU variable
DEFINE_PER_CPU(int[3], my_percpu_array); // define a per-CPU array
```
Using a compile-time per-CPU variable:
```
get_cpu_var(my_percpu)++; // disables preemption, then works on this CPU's copy
put_cpu_var(my_percpu);   // re-enables preemption
```
当然也可以使用下列宏来访问特定CPU上的per-CPU变量
```
per_cpu(my_percpu, cpu_id); //
per-CPU变量导出供模块使用
EXPORT_PER_CPU_SYMBOL(per_cpu_var);
EXPORT_PER_CPU_SYMBOL_GPL(per_cpu_var);
```
Example 2:
Allocating per-CPU variables dynamically:
```
void *alloc_percpu(type);
void *__alloc_percpu(size_t size, size_t align);
```
Using a dynamically allocated per-CPU variable:
```
int cpu;
cpu = get_cpu();                   // disable preemption, get the current CPU id
ptr = per_cpu_ptr(my_percpu, cpu); // pointer to this CPU's copy
// ... use ptr ...
put_cpu();                         // re-enable preemption
```
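Putting the dynamic API together, here is a sketch of a hypothetical per-CPU hit counter: writers touch only their local copy with preemption disabled, and a reader folds all copies into an approximate total:
```
#include <linux/percpu.h>
#include <linux/smp.h>
#include <linux/errno.h>

static int __percpu *hit_count;		/* hypothetical per-CPU counter */

static int hit_count_setup(void)
{
	hit_count = alloc_percpu(int);
	return hit_count ? 0 : -ENOMEM;
}

/* Fast path: touch only this CPU's copy, no lock needed. */
static void hit_count_inc(void)
{
	int cpu = get_cpu();			/* disables preemption */
	(*per_cpu_ptr(hit_count, cpu))++;
	put_cpu();				/* re-enables preemption */
}

/* Slow path: fold every CPU's copy into one (approximate) total. */
static long hit_count_total(void)
{
	long sum = 0;
	int cpu;

	for_each_possible_cpu(cpu)
		sum += *per_cpu_ptr(hit_count, cpu);
	return sum;
}

static void hit_count_teardown(void)
{
	free_percpu(hit_count);
}
```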


@@ -0,0 +1,2 @@
/home/hacker/git/Linux_Scull/alloc_mem/code/scullslab/scull.ko
/home/hacker/git/Linux_Scull/alloc_mem/code/scullslab/main.o /home/hacker/git/Linux_Scull/alloc_mem/code/scullslab/mmap.o


@@ -0,0 +1,12 @@
scull-objs := main.o mmap.o
obj-m := scull.o
CURRENT_PATH := ${shell pwd}
CURRENT_KERNEL_PATH := ${shell uname -r}
LINUX_KERNEL_PATH := /usr/src/kernels/$(CURRENT_KERNEL_PATH)
all:
	make -C $(LINUX_KERNEL_PATH) M=$(CURRENT_PATH) modules
clean:
	rm -f *.o *.order *.symvers *.mod.c *.ko


@@ -0,0 +1,582 @@
#include <linux/module.h>
#include <linux/moduleparam.h>
#include <linux/init.h>
#include <linux/kernel.h> /* printk() */
#include <linux/slab.h> /* kmalloc() */
#include <linux/fs.h> /* everything... */
#include <linux/errno.h> /* error codes */
#include <linux/types.h> /* size_t */
#include <linux/proc_fs.h>
#include <linux/fcntl.h> /* O_ACCMODE */
#include <linux/aio.h>
#include <asm/uaccess.h>
#include <linux/mm.h>
#include "scullc.h" /* local definitions */
int scullc_major = SCULLC_MAJOR;
int scullc_devs = SCULLC_DEVS; /* number of bare scullc devices */
int scullc_qset = SCULLC_QSET;
int scullc_quantum = SCULLC_QUANTUM;
module_param(scullc_major, int, 0);
module_param(scullc_devs, int, 0);
module_param(scullc_qset, int, 0);
module_param(scullc_quantum, int, 0);
MODULE_LICENSE("Dual BSD/GPL");
struct scullc_dev *scullc_devices; /* allocated in scullc_init */
int scullc_trim(struct scullc_dev *dev);
void scullc_cleanup(void);
/* declare one cache pointer: use it for all devices */
struct kmem_cache *scullc_cache;
#ifdef SCULLC_USE_PROC /* don't waste space if unused */
/*
* The proc filesystem: function to read and entry
*/
void scullc_proc_offset(char *buf, char **start, off_t *offset, int *len)
{
if (*offset == 0)
return;
if (*offset >= *len) {
/* Not there yet */
*offset -= *len;
*len = 0;
} else {
/* We're into the interesting stuff now */
*start = buf + *offset;
*offset = 0;
}
}
/* FIXME: Do we need this here?? It be ugly */
int scullc_read_procmem(char *buf, char **start, off_t offset,
int count, int *eof, void *data)
{
int i, j, quantum, qset, len = 0;
int limit = count - 80; /* Don't print more than this */
struct scullc_dev *d;
*start = buf;
for(i = 0; i < scullc_devs; i++) {
d = &scullc_devices[i];
if (down_interruptible (&d->sem))
return -ERESTARTSYS;
qset = d->qset; /* retrieve the features of each device */
quantum=d->quantum;
len += sprintf(buf+len,"\nDevice %i: qset %i, quantum %i, sz %li\n",
i, qset, quantum, (long)(d->size));
for (; d; d = d->next) { /* scan the list */
len += sprintf(buf+len," item at %p, qset at %p\n",d,d->data);
scullc_proc_offset (buf, start, &offset, &len);
if (len > limit)
goto out;
if (d->data && !d->next) /* dump only the last item - save space */
for (j = 0; j < qset; j++) {
if (d->data[j])
len += sprintf(buf+len," % 4i:%8p\n",j,d->data[j]);
scullc_proc_offset (buf, start, &offset, &len);
if (len > limit)
goto out;
}
}
out:
up (&scullc_devices[i].sem);
if (len > limit)
break;
}
*eof = 1;
return len;
}
#endif /* SCULLC_USE_PROC */
/*
* Open and close
*/
int scullc_open (struct inode *inode, struct file *filp)
{
struct scullc_dev *dev; /* device information */
/* Find the device */
dev = container_of(inode->i_cdev, struct scullc_dev, cdev);
/* now trim to 0 the length of the device if open was write-only */
if ( (filp->f_flags & O_ACCMODE) == O_WRONLY) {
if (down_interruptible (&dev->sem))
return -ERESTARTSYS;
scullc_trim(dev); /* ignore errors */
up (&dev->sem);
}
/* and use filp->private_data to point to the device data */
filp->private_data = dev;
return 0; /* success */
}
int scullc_release (struct inode *inode, struct file *filp)
{
return 0;
}
/*
* Follow the list
*/
struct scullc_dev *scullc_follow(struct scullc_dev *dev, int n)
{
while (n--) {
if (!dev->next) {
dev->next = kmalloc(sizeof(struct scullc_dev), GFP_KERNEL);
memset(dev->next, 0, sizeof(struct scullc_dev));
}
dev = dev->next;
continue;
}
return dev;
}
/*
* Data management: read and write
*/
ssize_t scullc_read (struct file *filp, char __user *buf, size_t count,
loff_t *f_pos)
{
struct scullc_dev *dev = filp->private_data; /* the first listitem */
struct scullc_dev *dptr;
int quantum = dev->quantum;
int qset = dev->qset;
int itemsize = quantum * qset; /* how many bytes in the listitem */
int item, s_pos, q_pos, rest;
ssize_t retval = 0;
if (down_interruptible (&dev->sem))
return -ERESTARTSYS;
if (*f_pos > dev->size)
goto nothing;
if (*f_pos + count > dev->size)
count = dev->size - *f_pos;
/* find listitem, qset index, and offset in the quantum */
item = ((long) *f_pos) / itemsize;
rest = ((long) *f_pos) % itemsize;
s_pos = rest / quantum; q_pos = rest % quantum;
/* follow the list up to the right position (defined elsewhere) */
dptr = scullc_follow(dev, item);
if (!dptr->data)
goto nothing; /* don't fill holes */
if (!dptr->data[s_pos])
goto nothing;
if (count > quantum - q_pos)
count = quantum - q_pos; /* read only up to the end of this quantum */
if (copy_to_user (buf, dptr->data[s_pos]+q_pos, count)) {
retval = -EFAULT;
goto nothing;
}
up (&dev->sem);
*f_pos += count;
return count;
nothing:
up (&dev->sem);
return retval;
}
ssize_t scullc_write (struct file *filp, const char __user *buf, size_t count,
loff_t *f_pos)
{
struct scullc_dev *dev = filp->private_data;
struct scullc_dev *dptr;
int quantum = dev->quantum;
int qset = dev->qset;
int itemsize = quantum * qset;
int item, s_pos, q_pos, rest;
ssize_t retval = -ENOMEM; /* our most likely error */
if (down_interruptible (&dev->sem))
return -ERESTARTSYS;
/* find listitem, qset index and offset in the quantum */
item = ((long) *f_pos) / itemsize;
rest = ((long) *f_pos) % itemsize;
s_pos = rest / quantum; q_pos = rest % quantum;
/* follow the list up to the right position */
dptr = scullc_follow(dev, item);
if (!dptr->data) {
dptr->data = kmalloc(qset * sizeof(void *), GFP_KERNEL);
if (!dptr->data)
goto nomem;
memset(dptr->data, 0, qset * sizeof(char *));
}
/* Allocate a quantum using the memory cache */
if (!dptr->data[s_pos]) {
dptr->data[s_pos] = kmem_cache_alloc(scullc_cache, GFP_KERNEL);
if (!dptr->data[s_pos])
goto nomem;
memset(dptr->data[s_pos], 0, scullc_quantum);
}
if (count > quantum - q_pos)
count = quantum - q_pos; /* write only up to the end of this quantum */
if (copy_from_user (dptr->data[s_pos]+q_pos, buf, count)) {
retval = -EFAULT;
goto nomem;
}
*f_pos += count;
/* update the size */
if (dev->size < *f_pos)
dev->size = *f_pos;
up (&dev->sem);
return count;
nomem:
up (&dev->sem);
return retval;
}
/*
* The ioctl() implementation
*/
int scullc_ioctl (struct inode *inode, struct file *filp,
unsigned int cmd, unsigned long arg)
{
int err = 0, ret = 0, tmp;
/* don't even decode wrong cmds: better returning ENOTTY than EFAULT */
if (_IOC_TYPE(cmd) != SCULLC_IOC_MAGIC) return -ENOTTY;
if (_IOC_NR(cmd) > SCULLC_IOC_MAXNR) return -ENOTTY;
/*
* the type is a bitmask, and VERIFY_WRITE catches R/W
* transfers. Note that the type is user-oriented, while
* verify_area is kernel-oriented, so the concept of "read" and
* "write" is reversed
*/
if (_IOC_DIR(cmd) & _IOC_READ)
err = !access_ok(VERIFY_WRITE, (void __user *)arg, _IOC_SIZE(cmd));
else if (_IOC_DIR(cmd) & _IOC_WRITE)
err = !access_ok(VERIFY_READ, (void __user *)arg, _IOC_SIZE(cmd));
if (err)
return -EFAULT;
switch(cmd) {
case SCULLC_IOCRESET:
scullc_qset = SCULLC_QSET;
scullc_quantum = SCULLC_QUANTUM;
break;
case SCULLC_IOCSQUANTUM: /* Set: arg points to the value */
ret = __get_user(scullc_quantum, (int __user *) arg);
break;
case SCULLC_IOCTQUANTUM: /* Tell: arg is the value */
scullc_quantum = arg;
break;
case SCULLC_IOCGQUANTUM: /* Get: arg is pointer to result */
ret = __put_user (scullc_quantum, (int __user *) arg);
break;
case SCULLC_IOCQQUANTUM: /* Query: return it (it's positive) */
return scullc_quantum;
case SCULLC_IOCXQUANTUM: /* eXchange: use arg as pointer */
tmp = scullc_quantum;
ret = __get_user(scullc_quantum, (int __user *) arg);
if (ret == 0)
ret = __put_user(tmp, (int __user *) arg);
break;
case SCULLC_IOCHQUANTUM: /* sHift: like Tell + Query */
tmp = scullc_quantum;
scullc_quantum = arg;
return tmp;
case SCULLC_IOCSQSET:
ret = __get_user(scullc_qset, (int __user *) arg);
break;
case SCULLC_IOCTQSET:
scullc_qset = arg;
break;
case SCULLC_IOCGQSET:
ret = __put_user(scullc_qset, (int __user *)arg);
break;
case SCULLC_IOCQQSET:
return scullc_qset;
case SCULLC_IOCXQSET:
tmp = scullc_qset;
ret = __get_user(scullc_qset, (int __user *)arg);
if (ret == 0)
ret = __put_user(tmp, (int __user *)arg);
break;
case SCULLC_IOCHQSET:
tmp = scullc_qset;
scullc_qset = arg;
return tmp;
default: /* redundant, as cmd was checked against MAXNR */
return -ENOTTY;
}
return ret;
}
/*
* The "extended" operations
*/
loff_t scullc_llseek (struct file *filp, loff_t off, int whence)
{
struct scullc_dev *dev = filp->private_data;
long newpos;
switch(whence) {
case 0: /* SEEK_SET */
newpos = off;
break;
case 1: /* SEEK_CUR */
newpos = filp->f_pos + off;
break;
case 2: /* SEEK_END */
newpos = dev->size + off;
break;
default: /* can't happen */
return -EINVAL;
}
if (newpos<0) return -EINVAL;
filp->f_pos = newpos;
return newpos;
}
/*
* A simple asynchronous I/O implementation.
*/
struct async_work {
struct kiocb *iocb;
int result;
struct work_struct work;
};
/*
* "Complete" an asynchronous operation.
*/
static void scullc_do_deferred_op(void *p)
{
struct async_work *stuff = (struct async_work *) p;
aio_complete(stuff->iocb, stuff->result, 0);
kfree(stuff);
}
static int scullc_defer_op(int write, struct kiocb *iocb, char __user *buf,
size_t count, loff_t pos)
{
struct async_work *stuff;
int result;
/* Copy now while we can access the buffer */
if (write)
result = scullc_write(iocb->ki_filp, buf, count, &pos);
else
result = scullc_read(iocb->ki_filp, buf, count, &pos);
/* If this is a synchronous IOCB, we return our status now. */
if (is_sync_kiocb(iocb))
return result;
/* Otherwise defer the completion for a few milliseconds. */
stuff = kmalloc (sizeof (*stuff), GFP_KERNEL);
if (stuff == NULL)
return result; /* No memory, just complete now */
stuff->iocb = iocb;
stuff->result = result;
INIT_WORK(&stuff->work, scullc_do_deferred_op);
schedule_delayed_work(&stuff->work, HZ/100);
return -EIOCBQUEUED;
}
static ssize_t scullc_aio_read(struct kiocb *iocb, char __user *buf, size_t count,
loff_t pos)
{
return scullc_defer_op(0, iocb, buf, count, pos);
}
static ssize_t scullc_aio_write(struct kiocb *iocb, const char __user *buf,
size_t count, loff_t pos)
{
return scullc_defer_op(1, iocb, (char __user *) buf, count, pos);
}
/*
* The fops
*/
struct file_operations scullc_fops = {
.owner = THIS_MODULE,
.llseek = scullc_llseek,
.read = scullc_read,
.write = scullc_write,
//.ioctl = scullc_ioctl,
.open = scullc_open,
.release = scullc_release,
.aio_read = scullc_aio_read,
.aio_write = scullc_aio_write,
};
int scullc_trim(struct scullc_dev *dev)
{
struct scullc_dev *next, *dptr;
int qset = dev->qset; /* "dev" is not-null */
int i;
if (dev->vmas) /* don't trim: there are active mappings */
return -EBUSY;
for (dptr = dev; dptr; dptr = next) { /* all the list items */
if (dptr->data) {
for (i = 0; i < qset; i++)
if (dptr->data[i])
kmem_cache_free(scullc_cache, dptr->data[i]);
kfree(dptr->data);
dptr->data=NULL;
}
next=dptr->next;
if (dptr != dev) kfree(dptr); /* all of them but the first */
}
dev->size = 0;
dev->qset = scullc_qset;
dev->quantum = scullc_quantum;
dev->next = NULL;
return 0;
}
static void scullc_setup_cdev(struct scullc_dev *dev, int index)
{
int err, devno = MKDEV(scullc_major, index);
cdev_init(&dev->cdev, &scullc_fops);
dev->cdev.owner = THIS_MODULE;
dev->cdev.ops = &scullc_fops;
err = cdev_add (&dev->cdev, devno, 1);
/* Fail gracefully if need be */
if (err)
printk(KERN_NOTICE "Error %d adding scull%d", err, index);
}
/*
* Finally, the module stuff
*/
int scullc_init(void)
{
int result, i;
dev_t dev = MKDEV(scullc_major, 0);
/*
* Register your major, and accept a dynamic number.
*/
if (scullc_major)
result = register_chrdev_region(dev, scullc_devs, "scullc");
else {
result = alloc_chrdev_region(&dev, 0, scullc_devs, "scullc");
scullc_major = MAJOR(dev);
}
if (result < 0)
return result;
/*
* allocate the devices -- we can't have them static, as the number
* can be specified at load time
*/
scullc_devices = kmalloc(scullc_devs*sizeof (struct scullc_dev), GFP_KERNEL);
if (!scullc_devices) {
result = -ENOMEM;
goto fail_malloc;
}
memset(scullc_devices, 0, scullc_devs*sizeof (struct scullc_dev));
for (i = 0; i < scullc_devs; i++) {
scullc_devices[i].quantum = scullc_quantum;
scullc_devices[i].qset = scullc_qset;
sema_init (&scullc_devices[i].sem, 1);
scullc_setup_cdev(scullc_devices + i, i);
}
scullc_cache = kmem_cache_create("scullc", scullc_quantum,
0, SLAB_HWCACHE_ALIGN, NULL); /* no ctor/dtor */
if (!scullc_cache) {
scullc_cleanup();
return -ENOMEM;
}
#ifdef SCULLC_USE_PROC /* only when available */
create_proc_read_entry("scullcmem", 0, NULL, scullc_read_procmem, NULL);
#endif
return 0; /* succeed */
fail_malloc:
unregister_chrdev_region(dev, scullc_devs);
return result;
}
void scullc_cleanup(void)
{
int i;
#ifdef SCULLC_USE_PROC
remove_proc_entry("scullcmem", NULL);
#endif
for (i = 0; i < scullc_devs; i++) {
cdev_del(&scullc_devices[i].cdev);
scullc_trim(scullc_devices + i);
}
kfree(scullc_devices);
if (scullc_cache)
kmem_cache_destroy(scullc_cache);
unregister_chrdev_region(MKDEV (scullc_major, 0), scullc_devs);
}
module_init(scullc_init);
module_exit(scullc_cleanup);


@@ -0,0 +1,129 @@
/* -*- C -*-
* mmap.c -- memory mapping for the scullc char module
*
* Copyright (C) 2001 Alessandro Rubini and Jonathan Corbet
* Copyright (C) 2001 O'Reilly & Associates
*
* The source code in this file can be freely used, adapted,
* and redistributed in source or binary form, so long as an
* acknowledgment appears in derived source files. The citation
* should list that the code comes from the book "Linux Device
* Drivers" by Alessandro Rubini and Jonathan Corbet, published
* by O'Reilly & Associates. No warranty is attached;
* we cannot take responsibility for errors or fitness for use.
*
* $Id: _mmap.c.in,v 1.13 2004/10/18 18:07:36 corbet Exp $
*/
//#include <linux/config.h>
#include <linux/module.h>
#include <linux/mm.h> /* everything */
#include <linux/errno.h> /* error codes */
#include <asm/pgtable.h>
#include <linux/moduleparam.h>
#include <linux/init.h>
#include <linux/kernel.h> /* printk() */
#include <linux/slab.h> /* kmalloc() */
#include <linux/fs.h> /* everything... */
#include <linux/errno.h> /* error codes */
#include <linux/types.h> /* size_t */
#include <linux/proc_fs.h>
#include <linux/fcntl.h> /* O_ACCMODE */
#include <linux/aio.h>
#include <asm/uaccess.h>
#include "scullc.h"
/*
* open and close: just keep track of how many times the device is
* mapped, to avoid releasing it.
*/
void scullc_vma_open(struct vm_area_struct *vma)
{
struct scullc_dev *dev = vma->vm_private_data;
dev->vmas++;
}
void scullc_vma_close(struct vm_area_struct *vma)
{
struct scullc_dev *dev = vma->vm_private_data;
dev->vmas--;
}
/*
* The nopage method: the core of the file. It retrieves the
* page required from the scullc device and returns it to the
* user. The count for the page must be incremented, because
* it is automatically decremented at page unmap.
*
* For this reason, "order" must be zero. Otherwise, only the first
* page has its count incremented, and the allocating module must
* release it as a whole block. Therefore, it isn't possible to map
* pages from a multipage block: when they are unmapped, their count
* is individually decreased, and would drop to 0.
*/
struct page *scullc_vma_nopage(struct vm_area_struct *vma,
unsigned long address, int *type)
{
unsigned long offset;
struct scullc_dev *ptr, *dev = vma->vm_private_data;
struct page *page = NULL;
void *pageptr = NULL; /* default to "missing" */
down(&dev->sem);
offset = (address - vma->vm_start) + (vma->vm_pgoff << PAGE_SHIFT);
if (offset >= dev->size) goto out; /* out of range */
/*
* Now retrieve the scullc device from the list,then the page.
* If the device has holes, the process receives a SIGBUS when
* accessing the hole.
*/
offset >>= PAGE_SHIFT; /* offset is a number of pages */
for (ptr = dev; ptr && offset >= dev->qset;) {
ptr = ptr->next;
offset -= dev->qset;
}
if (ptr && ptr->data) pageptr = ptr->data[offset];
if (!pageptr) goto out; /* hole or end-of-file */
page = virt_to_page(pageptr); /* translate the quantum's address into its struct page */
/* got it, now increment the count */
get_page(page);
if (type)
*type = VM_FAULT_MINOR;
out:
up(&dev->sem);
return page;
}
struct vm_operations_struct scullc_vm_ops = {
.open = scullc_vma_open,
.close = scullc_vma_close,
//.pmd_fault = scullc_vma_nopage, // changed in kernel 4.0.4
};
int scullc_mmap(struct file *filp, struct vm_area_struct *vma)
{
struct inode *inode = filp->f_inode;
/* refuse to map if order is not 0 */
/*if (scullc_devices[iminor(inode)].order)
return -ENODEV;
*/
/* don't do anything here: "nopage" will set up page table entries */
vma->vm_ops = &scullc_vm_ops;
vma->vm_flags |= 0x00080000; /* VM_RESERVED */
vma->vm_private_data = filp->private_data;
scullc_vma_open(vma);
return 0;
}


@@ -0,0 +1,122 @@
/* -*- C -*-
* scullc.h -- definitions for the scullc char module
*
* Copyright (C) 2001 Alessandro Rubini and Jonathan Corbet
* Copyright (C) 2001 O'Reilly & Associates
*
* The source code in this file can be freely used, adapted,
* and redistributed in source or binary form, so long as an
* acknowledgment appears in derived source files. The citation
* should list that the code comes from the book "Linux Device
* Drivers" by Alessandro Rubini and Jonathan Corbet, published
* by O'Reilly & Associates. No warranty is attached;
* we cannot take responsibility for errors or fitness for use.
*/
#include <linux/ioctl.h>
#include <linux/cdev.h>
#include <linux/mm.h>
/*
* Macros to help debugging
*/
#undef PDEBUG /* undef it, just in case */
#ifdef SCULLC_DEBUG
# ifdef __KERNEL__
/* This one if debugging is on, and kernel space */
# define PDEBUG(fmt, args...) printk( KERN_DEBUG "scullc: " fmt, ## args)
# else
/* This one for user space */
# define PDEBUG(fmt, args...) fprintf(stderr, fmt, ## args)
# endif
#else
# define PDEBUG(fmt, args...) /* not debugging: nothing */
#endif
#undef PDEBUGG
#define PDEBUGG(fmt, args...) /* nothing: it's a placeholder */
#define SCULLC_MAJOR 0 /* dynamic major by default */
#define SCULLC_DEVS 4 /* scullc0 through scullc3 */
/*
* The bare device is a variable-length region of memory.
* Use a linked list of indirect blocks.
*
* "scullc_dev->data" points to an array of pointers, each
* pointer refers to a memory page.
*
* The array (quantum-set) is SCULLC_QSET long.
*/
#define SCULLC_QUANTUM 4000 /* use a quantum size like scull */
#define SCULLC_QSET 500
struct scullc_dev {
void **data;
struct scullc_dev *next; /* next listitem */
int vmas; /* active mappings */
int quantum; /* the current allocation size */
int qset; /* the current array size */
size_t size; /* 32-bit will suffice */
struct semaphore sem; /* Mutual exclusion */
struct cdev cdev;
};
extern struct scullc_dev *scullc_devices;
extern struct file_operations scullc_fops;
/*
* The different configurable parameters
*/
extern int scullc_major; /* main.c */
extern int scullc_devs;
extern int scullc_order;
extern int scullc_qset;
/*
* Prototypes for shared functions
*/
int scullc_trim(struct scullc_dev *dev);
struct scullc_dev *scullc_follow(struct scullc_dev *dev, int n);
#ifdef SCULLC_DEBUG
# define SCULLC_USE_PROC
#endif
/*
* Ioctl definitions
*/
/* Use 'K' as magic number */
#define SCULLC_IOC_MAGIC 'K'
#define SCULLC_IOCRESET _IO(SCULLC_IOC_MAGIC, 0)
/*
* S means "Set" through a ptr,
* T means "Tell" directly
* G means "Get" (to a pointed var)
* Q means "Query", response is on the return value
* X means "eXchange": G and S atomically
* H means "sHift": T and Q atomically
*/
#define SCULLC_IOCSQUANTUM _IOW(SCULLC_IOC_MAGIC, 1, int)
#define SCULLC_IOCTQUANTUM _IO(SCULLC_IOC_MAGIC, 2)
#define SCULLC_IOCGQUANTUM _IOR(SCULLC_IOC_MAGIC, 3, int)
#define SCULLC_IOCQQUANTUM _IO(SCULLC_IOC_MAGIC, 4)
#define SCULLC_IOCXQUANTUM _IOWR(SCULLC_IOC_MAGIC, 5, int)
#define SCULLC_IOCHQUANTUM _IO(SCULLC_IOC_MAGIC, 6)
#define SCULLC_IOCSQSET _IOW(SCULLC_IOC_MAGIC, 7, int)
#define SCULLC_IOCTQSET _IO(SCULLC_IOC_MAGIC, 8)
#define SCULLC_IOCGQSET _IOR(SCULLC_IOC_MAGIC, 9, int)
#define SCULLC_IOCQQSET _IO(SCULLC_IOC_MAGIC, 10)
#define SCULLC_IOCXQSET _IOWR(SCULLC_IOC_MAGIC,11, int)
#define SCULLC_IOCHQSET _IO(SCULLC_IOC_MAGIC, 12)
#define SCULLC_IOC_MAXNR 12


@@ -0,0 +1,30 @@
#!/bin/sh
module="scullc"
device="scullc"
mode="664"
# Group: since distributions do it differently, look for wheel or use staff
if grep '^staff:' /etc/group > /dev/null; then
group="staff"
else
group="wheel"
fi
# remove stale nodes
rm -f /dev/${device}?
# invoke insmod with all arguments we got
# and use a pathname, as newer modutils don't look in . by default
/sbin/insmod -f ./scull.ko $* || exit 1
major=`cat /proc/devices | awk "\\$2==\"$module\" {print \\$1}"`
mknod /dev/${device}0 c $major 0
mknod /dev/${device}1 c $major 1
mknod /dev/${device}2 c $major 2
mknod /dev/${device}3 c $major 3
ln -sf ${device}0 /dev/${device}
# give appropriate group/permissions
chgrp $group /dev/${device}[0-3]
chmod $mode /dev/${device}[0-3]


@@ -0,0 +1,11 @@
#!/bin/sh
module="scullc"
device="scullc"
# invoke rmmod with all arguments we got
/sbin/rmmod $module $* || exit 1
# remove nodes
rm -f /dev/${device}[0-3] /dev/${device}
exit 0

BIN
alloc_mem/image/mem_seg.png (new binary file, 6.4 KiB; contents not shown)

BIN
alloc_mem/image/ssc.png (new binary file, 88 KiB; contents not shown)