
This post is not another data-structure walkthrough; the main data structures have mostly been covered already. Instead it fills in the gaps: important, basic concepts that are easy to overlook. Many Redis article series skip them, and it's easy to skip them when studying Redis yourself, on the theory that reading the source is just about the main data structures and "algorithms".
Redis is mainly used as a high-performance cache, so what does a cache need to get right? First, access speed: if reads were as slow as the database, the cache would have no reason to exist. Second, caching in the literal sense: the data is only there to make reads faster, and in most scenarios it eventually has to be updated and expired. Which brings me to the first topic (phew).

Redis expiration strategy

How does Redis expire keys? Take a guess first. The most naive approach is to attach a timer to every key that has a TTL and delete the key the moment it fires. That expires keys the most promptly, but the overhead is clearly prohibitive. What Redis actually uses is a combination of lazy deletion and periodic deletion.
The lazy strategy checks expiration at access time: on a GET, for example, Redis first checks whether the key has expired; if so, it deletes the key and returns nil, otherwise it returns the value normally.
The main code:

```C
/* This function is called when we are going to perform some operation
 * in a given key, but such key may be already logically expired even if
 * it still exists in the database. The main way this function is called
 * is via lookupKey*() family of functions.
 *
 * The behavior of the function depends on the replication role of the
 * instance, because slave instances do not expire keys, they wait
 * for DELs from the master for consistency matters. However even
 * slaves will try to have a coherent return value for the function,
 * so that read commands executed in the slave side will be able to
 * behave like if the key is expired even if still present (because the
 * master has yet to propagate the DEL).
 *
 * In masters as a side effect of finding a key which is expired, such
 * key will be evicted from the database. Also this may trigger the
 * propagation of a DEL/UNLINK command in AOF / replication stream.
 *
 * The return value of the function is 0 if the key is still valid,
 * otherwise the function returns 1 if the key is expired. */
int expireIfNeeded(redisDb *db, robj *key) {
    if (!keyIsExpired(db,key)) return 0;

    /* If we are running in the context of a slave, instead of
     * evicting the expired key from the database, we return ASAP:
     * the slave key expiration is controlled by the master that will
     * send us synthesized DEL operations for expired keys.
     *
     * Still we try to return the right information to the caller,
     * that is, 0 if we think the key should be still valid, 1 if
     * we think the key is expired at this time. */
    if (server.masterhost != NULL) return 1;

    /* Delete the key */
    server.stat_expiredkeys++;
    propagateExpire(db,key,server.lazyfree_lazy_expire);
    notifyKeyspaceEvent(NOTIFY_EXPIRED,
        "expired",key,db->id);
    return server.lazyfree_lazy_expire ? dbAsyncDelete(db,key) :
                                         dbSyncDelete(db,key);
}

/* Check if the key is expired. */
int keyIsExpired(redisDb *db, robj *key) {
    mstime_t when = getExpire(db,key);
    mstime_t now;

    if (when < 0) return 0; /* No expire for this key */

    /* Don't expire anything while loading. It will be done later. */
    if (server.loading) return 0;

    /* If we are in the context of a Lua script, we pretend that time is
     * blocked to when the Lua script started. This way a key can expire
     * only the first time it is accessed and not in the middle of the
     * script execution, making propagation to slaves / AOF consistent.
     * See issue #1525 on Github for more information. */
    if (server.lua_caller) {
        now = server.lua_time_start;
    }
    /* If we are in the middle of a command execution, we still want to use
     * a reference time that does not change: in that case we just use the
     * cached time, that we update before each call in the call() function.
     * This way we avoid that commands such as RPOPLPUSH or similar, that
     * may re-open the same key multiple times, can invalidate an already
     * open object in a next call, if the next call will see the key expired,
     * while the first did not. */
    else if (server.fixed_time_expire > 0) {
        now = server.mstime;
    }
    /* For the other cases, we want to use the most fresh time we have. */
    else {
        now = mstime();
    }

    /* The key expired if the current (virtual or real) time is greater
     * than the expire time of the key. */
    return now > when;
}
/* Return the expire time of the specified key, or -1 if no expire
 * is associated with this key (i.e. the key is non volatile) */
long long getExpire(redisDb *db, robj *key) {
    dictEntry *de;

    /* No expire? return ASAP */
    if (dictSize(db->expires) == 0 ||
       (de = dictFind(db->expires,key->ptr)) == NULL) return -1;

    /* The entry was found in the expire dict, this means it should also
     * be present in the main dict (safety check). */
    serverAssertWithInfo(NULL,key,dictFind(db->dict,key->ptr) != NULL);
    return dictGetSignedIntegerVal(de);
}
```

A few things to note here. First, when a key is found expired, the lazyfree_lazy_expire setting decides whether the deletion is synchronous (dbSyncDelete) or asynchronous (dbAsyncDelete). Second, a slave does not perform the deletion itself: for consistency, the master propagates a DEL to its slaves when the key expires.
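That deletion mode is controlled by a standard redis.conf directive (part of the lazy-freeing options added in Redis 4.0); a minimal sketch:

```
# redis.conf: delete expired keys in a background lazy-free thread
# instead of blocking the main thread on large objects
lazyfree-lazy-expire yes
```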
What's the problem with the lazy strategy alone? If some keys are never accessed again, they never get expired and keep occupying memory. So Redis pairs lazy expiration with a periodic expiration strategy. How does the periodic strategy run?

```C
/* This function handles 'background' operations we are required to do
 * incrementally in Redis databases, such as active key expiring, resizing,
 * rehashing. */
void databasesCron(void) {
    /* Expire keys by random sampling. Not required for slaves
     * as master will synthesize DELs for us. */
    if (server.active_expire_enabled) {
        if (server.masterhost == NULL) {
            activeExpireCycle(ACTIVE_EXPIRE_CYCLE_SLOW);
        } else {
            expireSlaveKeys();
        }
    }

    /* Defrag keys gradually. */
    activeDefragCycle();

    /* Perform hash tables rehashing if needed, but only if there are no
     * other processes saving the DB on disk. Otherwise rehashing is bad
     * as will cause a lot of copy-on-write of memory pages. */
    if (!hasActiveChildProcess()) {
        /* We use global counters so if we stop the computation at a given
         * DB we'll be able to start from the successive in the next
         * cron loop iteration. */
        static unsigned int resize_db = 0;
        static unsigned int rehash_db = 0;
        int dbs_per_call = CRON_DBS_PER_CALL;
        int j;

        /* Don't test more DBs than we have. */
        if (dbs_per_call > server.dbnum) dbs_per_call = server.dbnum;

        /* Resize */
        for (j = 0; j < dbs_per_call; j++) {
            tryResizeHashTables(resize_db % server.dbnum);
            resize_db++;
        }

        /* Rehash */
        if (server.activerehashing) {
            for (j = 0; j < dbs_per_call; j++) {
                int work_done = incrementallyRehash(rehash_db);
                if (work_done) {
                    /* If the function did some work, stop here, we'll do
                     * more at the next cron loop. */
                    break;
                } else {
                    /* If this db didn't need rehash, we'll try the next one. */
                    rehash_db++;
                    rehash_db %= server.dbnum;
                }
            }
        }
    }
}
/* Try to expire a few timed out keys. The algorithm used is adaptive and
 * will use few CPU cycles if there are few expiring keys, otherwise
 * it will get more aggressive to avoid that too much memory is used by
 * keys that can be removed from the keyspace.
 *
 * Every expire cycle tests multiple databases: the next call will start
 * again from the next db, with the exception of exists for time limit: in that
 * case we restart again from the last database we were processing. Anyway
 * no more than CRON_DBS_PER_CALL databases are tested at every iteration.
 *
 * The function can perform more or less work, depending on the "type"
 * argument. It can execute a "fast cycle" or a "slow cycle". The slow
 * cycle is the main way we collect expired cycles: this happens with
 * the "server.hz" frequency (usually 10 hertz).
 *
 * However the slow cycle can exit for timeout, since it used too much time.
 * For this reason the function is also invoked to perform a fast cycle
 * at every event loop cycle, in the beforeSleep() function. The fast cycle
 * will try to perform less work, but will do it much more often.
 *
 * The following are the details of the two expire cycles and their stop
 * conditions:
 *
 * If type is ACTIVE_EXPIRE_CYCLE_FAST the function will try to run a
 * "fast" expire cycle that takes no longer than EXPIRE_FAST_CYCLE_DURATION
 * microseconds, and is not repeated again before the same amount of time.
 * The cycle will also refuse to run at all if the latest slow cycle did not
 * terminate because of a time limit condition.
 *
 * If type is ACTIVE_EXPIRE_CYCLE_SLOW, that normal expire cycle is
 * executed, where the time limit is a percentage of the REDIS_HZ period
 * as specified by the ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC define. In the
 * fast cycle, the check of every database is interrupted once the number
 * of already expired keys in the database is estimated to be lower than
 * a given percentage, in order to avoid doing too much work to gain too
 * little memory.
 *
 * The configured expire "effort" will modify the baseline parameters in
 * order to do more work in both the fast and slow expire cycles.
 */

#define ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP 20 /* Keys for each DB loop. */
#define ACTIVE_EXPIRE_CYCLE_FAST_DURATION 1000 /* Microseconds. */
#define ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC 25 /* Max % of CPU to use. */
#define ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE 10 /* % of stale keys after which
                                                   we do extra efforts. */
void activeExpireCycle(int type) {
    /* Adjust the running parameters according to the configured expire
     * effort. The default effort is 1, and the maximum configurable effort
     * is 10. */
    unsigned long
    effort = server.active_expire_effort-1, /* Rescale from 0 to 9. */
    config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                           ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort,
    config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                                 ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort,
    config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                                  2*effort,
    config_cycle_acceptable_stale = ACTIVE_EXPIRE_CYCLE_ACCEPTABLE_STALE-
                                    effort;

    /* This function has some global state in order to continue the work
     * incrementally across calls. */
    static unsigned int current_db = 0; /* Last DB tested. */
    static int timelimit_exit = 0;      /* Time limit hit in previous call? */
    static long long last_fast_cycle = 0; /* When last fast cycle ran. */

    int j, iteration = 0;
    int dbs_per_call = CRON_DBS_PER_CALL;
    long long start = ustime(), timelimit, elapsed;

    /* When clients are paused the dataset should be static not just from the
     * POV of clients not being able to write, but also from the POV of
     * expires and evictions of keys not being performed. */
    if (clientsArePaused()) return;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST) {
        /* Don't start a fast cycle if the previous cycle did not exit
         * for time limit, unless the percentage of estimated stale keys is
         * too high. Also never repeat a fast cycle for the same period
         * as the fast cycle total duration itself. */
        if (!timelimit_exit &&
            server.stat_expired_stale_perc < config_cycle_acceptable_stale)
            return;

        if (start < last_fast_cycle + (long long)config_cycle_fast_duration*2)
            return;

        last_fast_cycle = start;
    }

    /* We usually should test CRON_DBS_PER_CALL per iteration, with
     * two exceptions:
     *
     * 1) Don't test more DBs than we have.
     * 2) If last time we hit the time limit, we want to scan all DBs
     * in this iteration, as there is work to do in some DB and we don't want
     * expired keys to use memory for too much time. */
    if (dbs_per_call > server.dbnum || timelimit_exit)
        dbs_per_call = server.dbnum;

    /* We can use at max 'config_cycle_slow_time_perc' percentage of CPU
     * time per iteration. Since this function gets called with a frequency of
     * server.hz times per second, the following is the max amount of
     * microseconds we can spend in this function. */
    timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
    timelimit_exit = 0;
    if (timelimit <= 0) timelimit = 1;

    if (type == ACTIVE_EXPIRE_CYCLE_FAST)
        timelimit = config_cycle_fast_duration; /* in microseconds. */

    /* Accumulate some global stats as we expire keys, to have some idea
     * about the number of keys that are already logically expired, but still
     * existing inside the database. */
    long total_sampled = 0;
    long total_expired = 0;

    for (j = 0; j < dbs_per_call && timelimit_exit == 0; j++) {
        /* Expired and checked in a single loop. */
        unsigned long expired, sampled;

        redisDb *db = server.db+(current_db % server.dbnum);

        /* Increment the DB now so we are sure if we run out of time
         * in the current DB we'll restart from the next. This allows to
         * distribute the time evenly across DBs. */
        current_db++;

        /* Continue to expire if at the end of the cycle more than 25%
         * of the keys were expired. */
        do {
            unsigned long num, slots;
            long long now, ttl_sum;
            int ttl_samples;
            iteration++;

            /* If there is nothing to expire try next DB ASAP. */
            if ((num = dictSize(db->expires)) == 0) {
                db->avg_ttl = 0;
                break;
            }
            slots = dictSlots(db->expires);
            now = mstime();

            /* When there are less than 1% filled slots, sampling the key
             * space is expensive, so stop here waiting for better times...
             * The dictionary will be resized asap. */
            if (num && slots > DICT_HT_INITIAL_SIZE &&
                (num*100/slots < 1)) break;

            /* The main collection cycle. Sample random keys among keys
             * with an expire set, checking for expired ones. */
            expired = 0;
            sampled = 0;
            ttl_sum = 0;
            ttl_samples = 0;

            if (num > config_keys_per_loop)
                num = config_keys_per_loop;

            /* Here we access the low level representation of the hash table
             * for speed concerns: this makes this code coupled with dict.c,
             * but it hardly changed in ten years.
             *
             * Note that certain places of the hash table may be empty,
             * so we want also a stop condition about the number of
             * buckets that we scanned. However scanning for free buckets
             * is very fast: we are in the cache line scanning a sequential
             * array of NULL pointers, so we can scan a lot more buckets
             * than keys in the same time. */
            long max_buckets = num*20;
            long checked_buckets = 0;

            while (sampled < num && checked_buckets < max_buckets) {
                for (int table = 0; table < 2; table++) {
                    if (table == 1 && !dictIsRehashing(db->expires)) break;

                    unsigned long idx = db->expires_cursor;
                    idx &= db->expires->ht[table].sizemask;
                    dictEntry *de = db->expires->ht[table].table[idx];
                    long long ttl;

                    /* Scan the current bucket of the current table. */
                    checked_buckets++;
                    while(de) {
                        /* Get the next entry now since this entry may get
                         * deleted. */
                        dictEntry *e = de;
                        de = de->next;

                        ttl = dictGetSignedIntegerVal(e)-now;
                        if (activeExpireCycleTryExpire(db,e,now)) expired++;
                        if (ttl > 0) {
                            /* We want the average TTL of keys yet
                             * not expired. */
                            ttl_sum += ttl;
                            ttl_samples++;
                        }
                        sampled++;
                    }
                }
                db->expires_cursor++;
            }
            total_expired += expired;
            total_sampled += sampled;

            /* Update the average TTL stats for this database. */
            if (ttl_samples) {
                long long avg_ttl = ttl_sum/ttl_samples;

                /* Do a simple running average with a few samples.
                 * We just use the current estimate with a weight of 2%
                 * and the previous estimate with a weight of 98%. */
                if (db->avg_ttl == 0) db->avg_ttl = avg_ttl;
                db->avg_ttl = (db->avg_ttl/50)*49 + (avg_ttl/50);
            }

            /* We can't block forever here even if there are many keys to
             * expire. So after a given amount of milliseconds return to the
             * caller waiting for the other active expire cycle. */
            if ((iteration & 0xf) == 0) { /* check once every 16 iterations. */
                elapsed = ustime()-start;
                if (elapsed > timelimit) {
                    timelimit_exit = 1;
                    server.stat_expired_time_cap_reached_count++;
                    break;
                }
            }
            /* We don't repeat the cycle for the current database if there are
             * an acceptable amount of stale keys (logically expired but yet
             * not reclained). */
        } while ((expired*100/sampled) > config_cycle_acceptable_stale);
    }

    elapsed = ustime()-start;
    server.stat_expire_cycle_time_used += elapsed;
    latencyAddSampleIfNeeded("expire-cycle",elapsed/1000);

    /* Update our estimate of keys existing but yet to be expired.
     * Running average with this sample accounting for 5%. */
    double current_perc;
    if (total_sampled) {
        current_perc = (double)total_expired/total_sampled;
    } else
        current_perc = 0;
    server.stat_expired_stale_perc = (current_perc*0.05)+
                                     (server.stat_expired_stale_perc*0.95);
}
```

Periodic expiration runs in two flavors, fast and slow, invoked from beforeSleep() and databasesCron() respectively. The fast cycle has two limits: its duration is capped by ACTIVE_EXPIRE_CYCLE_FAST_DURATION, and it will not run again within twice ACTIVE_EXPIRE_CYCLE_FAST_DURATION of the previous fast cycle. Both are further scaled by the configured server.active_expire_effort, which defaults to 1 with a maximum of 10:

```C
config_cycle_fast_duration = ACTIVE_EXPIRE_CYCLE_FAST_DURATION +
                             ACTIVE_EXPIRE_CYCLE_FAST_DURATION/4*effort
```
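The effort knob itself is an ordinary configuration item (exposed in redis.conf since Redis 6):

```
# redis.conf: 1 (default, conservative) .. 10 (more CPU, fresher expiration)
active-expire-effort 1
```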

It then samples, from a limited number of DBs per call, a limited number of keys that carry a TTL (these live in the separate expires dict); that per-loop count is controlled by

```C
config_keys_per_loop = ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP +
                       ACTIVE_EXPIRE_CYCLE_KEYS_PER_LOOP/4*effort
```
The slow cycle's time budget is computed as
```C
config_cycle_slow_time_perc = ACTIVE_EXPIRE_CYCLE_SLOW_TIME_PERC +
                              2*effort
timelimit = config_cycle_slow_time_perc*1000000/server.hz/100;
```
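Plugging in the defaults makes the budget concrete: with effort = 1 (so config_cycle_slow_time_perc = 25) and server.hz = 10, timelimit = 25 * 1000000 / 10 / 100 = 25000 microseconds, i.e. each slow cycle may spend at most 25 ms, ten times per second, which is exactly the "25% of CPU" the defines advertise.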

There is one more exit condition: once the sample for the current database contains no more than the acceptable percentage of expired keys (config_cycle_acceptable_stale), the loop stops repeating on that DB and moves on to the next one.

Before Java 8 streams, sorting objects usually meant having the class implement Comparable, or writing a Comparator yourself.

Like this, for example. My object is Entity:

```java
public class Entity {

    private Long id;

    private Long sortValue;

    public Long getId() {
        return id;
    }

    public void setId(Long id) {
        this.id = id;
    }

    public Long getSortValue() {
        return sortValue;
    }

    public void setSortValue(Long sortValue) {
        this.sortValue = sortValue;
    }
}
```

The Comparator:

```java
public class MyComparator implements Comparator {
    @Override
    public int compare(Object o1, Object o2) {
        Entity e1 = (Entity) o1;
        Entity e2 = (Entity) o2;
        if (e1.getSortValue() < e2.getSortValue()) {
            return -1;
        } else if (e1.getSortValue().equals(e2.getSortValue())) {
            return 0;
        } else {
            return 1;
        }
    }
}
```

And the sorting code:

```java
private static MyComparator myComparator = new MyComparator();

public static void main(String[] args) {
    List<Entity> list = new ArrayList<Entity>();
    Entity e1 = new Entity();
    e1.setId(1L);
    e1.setSortValue(1L);
    list.add(e1);
    Entity e2 = new Entity();
    e2.setId(2L);
    e2.setSortValue(null);
    list.add(e2);
    Collections.sort(list, myComparator);
}
```

Note that e2's sort value is null. For MyComparator to survive that, it would need explicit null checks: as written, the unboxing in e1.getSortValue() < e2.getSortValue() throws a NullPointerException. Two things bother me here: I don't want to write MyComparator at all, and there is no clean way to weed the null sort values out of the list first. So what solves this? Java's standard library turns out to be genuinely strong here: Comparator.nullsFirst.
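For contrast, here is a minimal sketch of the Java 8 way, reusing the Entity and list from above (it needs java.util.Comparator imported): sort by sortValue, nulls grouped first, no hand-written comparator class at all:

```java
// nullsFirst wraps the key comparator so a null sortValue sorts before
// any non-null one instead of blowing up with an NPE
list.sort(Comparator.comparing(
        Entity::getSortValue,
        Comparator.nullsFirst(Comparator.naturalOrder())));
```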

Let's look at how nullsFirst is implemented:

```java
final static class NullComparator<T> implements Comparator<T>, Serializable {
        private static final long serialVersionUID = -7569533591570686392L;
        private final boolean nullFirst;
        // if null, non-null Ts are considered equal
        private final Comparator<T> real;

        @SuppressWarnings("unchecked")
        NullComparator(boolean nullFirst, Comparator<? super T> real) {
            this.nullFirst = nullFirst;
            this.real = (Comparator<T>) real;
        }

        @Override
        public int compare(T a, T b) {
            if (a == null) {
                return (b == null) ? 0 : (nullFirst ? -1 : 1);
            } else if (b == null) {
                return nullFirst ? 1: -1;
            } else {
                return (real == null) ? 0 : real.compare(a, b);
            }
        }
        // ... remaining methods of NullComparator omitted ...
}
```

The core is the compare method above: it does exactly the null-check dance we would otherwise write by hand, treating two nulls as equal and ordering a lone null before (or after) everything else. Quite convenient; just a small note to self.

Practical echo tips

While working on the Docker series I often need to get inside containers. As the previous post showed, these images usually sit on a Linux base like Ubuntu or Alpine. If the image was built without swapping the package mirror, then thanks to our special network conditions, installing even a small editor like vim from inside the container is painfully slow ("two thousand years later", as in the anime). And to change the mirror configuration inside the container you need... an editor. A classic chicken-and-egg deadlock. A Linux wizard surely has ten thousand ways around this; this humble user only came up with the crude one: first cp sources.list to a backup, then echo "xxx" > sources.list. That runs into a problem: sources.list is normally more than one line, and a plain echo won't interpret line breaks. But echo can handle "\n" escapes if you pass -e. To look up the explanation and examples I used tldr (installable with npm install -g tldr); plain man or --help works just as well.
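A minimal sketch of the trick; the mirror URL and suite names are illustrative, adjust them to your release:

```
# keep a backup so the original list stays recoverable
cp /etc/apt/sources.list /etc/apt/sources.list.bak
# -e makes echo interpret \n, so one command can write a multi-line file
echo -e "deb http://mirrors.aliyun.com/ubuntu/ focal main\ndeb http://mirrors.aliyun.com/ubuntu/ focal-updates main" > /etc/apt/sources.list
```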

Checking the image's base distro

Related to this: before installing vim and the like, you need to know what base image you're on. The uname command won't help, because inside a container it reports the host's kernel; use cat /etc/issue instead.
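For example, in an Ubuntu-based container (output varies by image; /etc/os-release is another place to look):

```
cat /etc/issue
# e.g.  Ubuntu 20.04 LTS \n \l
```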


Just jotting this down here.

Finding a system mirror

The most widely used mirror in China right now is probably Aliyun's, but the Tsinghua, USTC, and Zhejiang University mirrors are also worth recommending. Shameless plug for my alma mater's mirror there; it's not fully fleshed out yet, so stay tuned.

Running the first Dockerfile

The previous post stopped at building the Dockerfile; now let's actually run it:

```
docker run -d -p 80 --name static_web nicksxs/static_web \
    nginx -g "daemon off;"
```

Here -d runs the container detached, -p 80 publishes the container's port 80 to the host, --name calls the container static_web, and it uses the static_web image we built last time; the trailing nginx -g "daemon off;" keeps nginx running in the foreground.

The command prints a container id, but nothing else shows up and we're not attached. To figure out how to reach the static page we wrote into the Dockerfile, check the running containers with docker ps.

It turns out Docker mapped a random host port for us, 32768. Open that port in the server's firewall and see whether things behave as expected.
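Two quick ways to see which host port was picked (the container name is the --name from above):

```
docker ps                    # PORTS column shows e.g. 0.0.0.0:32768->80/tcp
docker port static_web 80    # prints just the mapping for container port 80
```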

Hmm, not quite right: the default document root of Ubuntu's nginx package is evidently not where we put the page. Let's go inside the container and look, practising the command once more: docker exec -it 4792455ca2ed /bin/bash (substitute your own container id). Once inside, hunt for the nginx configuration, usually under directories like /etc/nginx or /usr/local/etc, and there is our actual document root.
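A no-editor way to locate the document root from inside the container (a sketch; on Ubuntu's nginx package it turns out to be /var/www/html):

```
grep -R "root" /etc/nginx    # the default vhost's root directive shows the served directory
```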

So copy the page over to that directory and try again.

Goal achieved, give me five ✌️

The second Dockerfile

Next I wanted something more dynamic. Having written PHP before, PHP it is.
Create another directory called dynamic_web with a src directory inside, containing an index.php with this content:

```php
<?php
echo "Hello World!";

Then create the Dockerfile in the dynamic_web directory:

```Dockerfile
FROM trafex/alpine-nginx-php7:latest
COPY src/ /var/www/html
EXPOSE 80
```

The Dockerfile is only three lines, but it deserves a note: this base image is not an official Docker one, for two reasons. First, the official PHP images are essentially all php+apache; second, there's Alpine. Quoting a Chinese introduction (translated):

Alpine is a security-oriented, lightweight Linux distribution. Unlike mainstream distributions, it is built on musl libc and busybox to minimize system size and runtime resource consumption, while being considerably more featureful than bare busybox, which has earned it growing favor in the open-source community. Despite its slimness, Alpine ships its own package manager, apk; packages can be browsed at https://pkgs.alpinelinux.org/packages, or queried and installed directly with the apk command.
Alpine is maintained by a non-commercial organization, supports a broad range of scenarios, is optimized for experienced/heavy Linux users, and focuses on security, performance, and resource efficiency. It fits most common use cases and makes an excellent production-grade base system/environment.

The Alpine Docker image inherits these strengths of the distribution: compared with other Docker images it is tiny, only around 5 MB (versus nearly 200 MB for Ubuntu-family images), and it has a very friendly package-management story. The official image comes from the docker-alpine project.

Docker now officially recommends Alpine over the previous Ubuntu default as the base image environment, which brings several benefits: faster image downloads, better image security, easier switching between hosts, and less disk usage.

In short: without a registry mirror, pulling large Docker images is a slog, and small images also save disk space, which is why the majority of Docker images nowadays use alpine as their base.
Now build it; see the sketch below.
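A sketch of the build, run from the dynamic_web directory; dynamic_web is the tag the run commands below assume:

```
docker build -t dynamic_web .
```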

One more detail: the previous image also just EXPOSEd port 80 and let the host map a random port to it; to save ourselves the lookup this time, we pin the host port directly.
Running docker run -d -p 80:80 dynamic_web and opening the browser... nothing. What's going on?
We didn't read the notes for trafex/alpine-nginx-php7:latest: its internal service listens on port 8080, so that is the port we should map. Start it with docker run -d -p 80:8080 dynamic_web instead.
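A quick check from the host once it's running with the correct mapping:

```
curl http://localhost/    # expected output: Hello World!
```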

Limiting a container's CPU usage

Now for something more fun. Install vim and gcc inside the container and write this little piece of C:

```C
#include <stdio.h>
int main(void)
{
    int i = 0;
    for(;;) i++;
    return 0;
}
```

The simplest possible busy loop. Compile and run it inside the container:

```
$ gcc 1.c
$ ./a.out
```

Then look at the system's CPU usage. Inside the container, top shows the CPU already pegged at 100%. On the host, one core is likewise fully occupied; the machine is dual-core and the program single-threaded, so that's expected.
Now for the point of the exercise. Since vim and gcc are already installed in this ubuntu container, and given domestic network conditions, commit the container first so that work isn't lost:

```
docker commit -a "nick" -m "my ubuntu" f63c5607df06 my_ubuntu:v1
```

Then run it again, this time with a CPU cap:

```
docker run -it --cpus=0.1 my_ubuntu:v1 bash
```


Our source file and the compiled binary are still there, which is exactly what we wanted. Run a.out again and the result is rather magical: instead of saturating a core, the process is throttled to roughly a tenth of one. The key is the --cpus=0.1 flag on docker run; under the hood it is the cgroup mechanism from my previous post, which isolates CPU, memory, and other resources between processes.
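For the curious: --cpus maps onto the CFS bandwidth knobs of the container's cgroup. A sketch of how to peek at them from the host; the paths assume cgroup v1 with the cgroupfs layout (they differ under cgroup v2), and <container-id> is a placeholder:

```
# 0.1 CPU ~= a quota of 10% of the scheduling period
cat /sys/fs/cgroup/cpu/docker/<container-id>/cpu.cfs_period_us   # e.g. 100000
cat /sys/fs/cgroup/cpu/docker/<container-id>/cpu.cfs_quota_us    # e.g. 10000
```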

Starting on the first Dockerfile

Earlier, to reuse the container where I'd installed vim and gcc, I committed it locally with docker commit, which feels a bit like git's commit. But that is not a great way to work, since it needs manual intervention; building images from a Dockerfile is the recommended approach:

```Dockerfile
FROM ubuntu:latest
MAINTAINER Nicksxs "nicksxs@hotmail.com"
RUN sed -i s@/archive.ubuntu.com/@/mirrors.aliyun.com/@g /etc/apt/sources.list
RUN apt-get clean
RUN apt-get update && apt install -y nginx
RUN echo 'Hi, i am in container' \
    > /usr/share/nginx/html/index.html
EXPOSE 80
```

What's happening here: FROM ubuntu:latest bases the image on the latest Ubuntu; the second line records the maintainer; lines three and four, well, as someone inside the wall you understand: swap Ubuntu's mirror for Aliyun's, or you'll be waiting forever. Then install nginx, write a greeting into nginx's default index.html, and expose port 80.
Then build our own image from this Dockerfile with sudo docker build -t="nicksxs/static_web" .


As the build output shows, my Dockerfile has 7 instructions and the build ran exactly 7 steps, each emitting a layer id that looks much like a container id. This is the important bit: Docker builds images in layers, and each line of the Dockerfile stacks a new layer on top. Also note the trailing . on the command: it tells Docker to use the current directory as the build context and look for the Dockerfile there. Once the build finishes, check the local images.
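You can inspect those layers afterwards; docker history lists them newest-first:

```
docker history nicksxs/static_web
```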

There's our own image.
Now, a question: what if the build fails midway? Let's try it. Change nginx to a bogus package name, nginxx (who knows, with luck it might actually exist), and build again.
The nginxx package can't be found, so is the image now completely unusable? Not at all. As said above, Docker builds layer by layer, and the first 4 steps succeeded, so we can start a container from the last successful layer and look around,
with sudo docker run -t -i bd26f991b6c8 /bin/bash
The answer is yes, it works; it just doesn't have nginx installed.

One more thing you may have noticed: the earlier steps each print Using cache. Docker caches layers during builds, which again shows that images are built from layers: the same base plus the same steps can be reused. This is Docker's build cache; you can disable it by passing --no-cache to the build.
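For example, to force every step to run again:

```
sudo docker build --no-cache -t="nicksxs/static_web" .
```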
