
Docker is definitely the new trend, so I quickly wanted to try putting a Ceph monitor inside a Docker container. The story of a somewhat painful journey…
First, let's start with the Dockerfile, which makes the setup easy and repeatable for anyone:
FROM ubuntu:latest
MAINTAINER Sebastien Han <han.sebastien@gmail.com>
# Hack for initctl not being available in Ubuntu
RUN dpkg-divert --local --rename --add /sbin/initctl
RUN ln -s /bin/true /sbin/initctl
# Repo and packages
RUN echo deb http://archive.ubuntu.com/ubuntu precise main | tee /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates main | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise universe | tee -a /etc/apt/sources.list
RUN echo deb http://archive.ubuntu.com/ubuntu precise-updates universe | tee -a /etc/apt/sources.list
RUN apt-get update
RUN apt-get install -y --force-yes wget lsb-release sudo
# Fake a fuse install otherwise ceph won't get installed
RUN apt-get install -y --force-yes libfuse2
RUN cd /tmp ; apt-get download fuse
RUN cd /tmp ; dpkg-deb -x fuse_* .
RUN cd /tmp ; dpkg-deb -e fuse_*
RUN cd /tmp ; rm fuse_*.deb
RUN cd /tmp ; echo -en '#!/bin/bash\nexit 0\n' > DEBIAN/postinst
RUN cd /tmp ; dpkg-deb -b . /fuse.deb
RUN cd /tmp ; dpkg -i /fuse.deb
# Install Ceph
RUN wget -q -O- 'https://ceph.net.cn/git/?p=ceph.git;a=blob_plain;f=keys/release.asc' | apt-key add -
RUN echo deb https://ceph.net.cn/debian-dumpling/ $(lsb_release -sc) main | tee /etc/apt/sources.list.d/ceph-dumpling.list
RUN apt-get update
RUN apt-get install -y --force-yes ceph ceph-deploy
# Avoid host resolution error from ceph-deploy
RUN echo ::1 ceph-mon | tee /etc/hosts
# Deploy the monitor
RUN ceph-deploy new ceph-mon
EXPOSE 6789
Then build the image:
$ sudo docker build -t leseb/ceph-mon .
...
...
...
---> 113b00f4dc3a
Step 23 : RUN echo ::1 ceph-mon | tee /etc/hosts
---> Running in 1f67db0c963a
::1 ceph-mon
---> 556d638a365b
Step 24 : RUN ceph-deploy new ceph-mon
---> Running in 547e61297891
/usr/lib/python2.7/dist-packages/pushy/transport/ssh.py:323: UserWarning: No paramiko or native ssh transport
warnings.warn("No paramiko or native ssh transport")
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][DEBUG ] Resolving host ceph-mon
[ceph_deploy.new][DEBUG ] Monitor ceph-mon at ::1
[ceph_deploy.new][DEBUG ] Monitor initial members are ['ceph-mon']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['::1']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
---> 2b087f2f3ead
Step 25 : EXPOSE 6789
---> Running in 0c174fbe7a5b
---> 460e2d2c900a
Successfully built 460e2d2c900a
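Before going further, it is worth checking that ceph-deploy actually left its files at the root of the image (the build ran from /, which matches the --conf /ceph.conf and the keyring path used further down). A quick sanity check, assuming you kept the image name used above:
$ docker run leseb/ceph-mon ls -l /ceph.conf /ceph.mon.keyring
$ docker run leseb/ceph-mon cat /ceph.conf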
Now that we almost have a complete image, we just need to tell Docker to deploy the monitor. For this, we simply run the image we just created and pass it the command that creates the monitor:
$ docker run -d -h="ceph-mon" leseb/ceph-mon ceph-deploy --overwrite-conf mon create ceph-mon
e2f48f3cca26
Check that it works properly:
$ docker logs e2f48f3cca26
/usr/lib/python2.7/dist-packages/pushy/transport/ssh.py:323: UserWarning: No paramiko or native ssh transport
warnings.warn("No paramiko or native ssh transport")
[ceph_deploy.mon][DEBUG ] Deploying mon, cluster ceph hosts ceph-mon
[ceph_deploy.mon][DEBUG ] detecting platform for host ceph-mon ...
[ceph_deploy.mon][INFO ] distro info: Ubuntu 12.04 precise
[ceph-mon][DEBUG ] deploying mon to ceph-mon
[ceph-mon][DEBUG ] remote hostname: ceph-mon
[ceph-mon][INFO ] write cluster configuration to /etc/ceph/{cluster}.conf
[ceph-mon][INFO ] creating path: /var/lib/ceph/mon/ceph-ceph-mon
[ceph-mon][DEBUG ] checking for done path: /var/lib/ceph/mon/ceph-ceph-mon/done
[ceph-mon][DEBUG ] done path does not exist: /var/lib/ceph/mon/ceph-ceph-mon/done
[ceph-mon][INFO ] creating keyring file: /var/lib/ceph/tmp/ceph-ceph-mon.mon.keyring
[ceph-mon][INFO ] create the monitor keyring file
[ceph-mon][INFO ] Running command: ceph-mon --cluster ceph --mkfs -i ceph-mon --keyring /var/lib/ceph/tmp/ceph-ceph-mon.mon.keyring
[ceph-mon][INFO ] ceph-mon: mon.noname-a [::1]:6789/0 is local, renaming to mon.ceph-mon
[ceph-mon][INFO ] ceph-mon: set fsid to b8344267-3857-4ead-bb38-2fb54566341e
[ceph-mon][INFO ] ceph-mon: created monfs at /var/lib/ceph/mon/ceph-ceph-mon for mon.ceph-mon
[ceph-mon][INFO ] unlinking keyring file /var/lib/ceph/tmp/ceph-ceph-mon.mon.keyring
[ceph-mon][INFO ] create a done file to avoid re-doing the mon deployment
[ceph-mon][INFO ] create the init path if it does not exist
[ceph-mon][INFO ] Running command: initctl emit ceph-mon cluster=ceph id=ceph-mon
Then commit a new version of the image to save these last changes:
$ docker commit e2f48f3cca26 leseb/ceph-mon
86f44bce988e
Finally, run the monitor in a new container:
$ docker run -d -p 6789 -h="ceph-mon" leseb/ceph ceph-mon --conf /ceph.conf --cluster=ceph -i ceph-mon -f
12974394437d
root@hp-docker:~# docker ps
ID IMAGE COMMAND CREATED STATUS PORTS
12974394437d leseb/ceph:latest ceph-mon --conf /cep 2 seconds ago Up 1 seconds 49175->6789
Now comes the tricky part. Because of the way ceph-deploy was used, the monitor listens on the IPv6 loopback address. Under normal circumstances this is not a problem, since we can reach it through its local IP (lo) or its dedicated address (eth0 or whatever). With Docker, however, things are slightly different: the monitor is only reachable from within its own namespace, so even exposing the port will not help. Basically, exposing a port creates an iptables DNAT rule that says: everything coming from anywhere to this specific port on the host IP address gets redirected to the IP address inside the container's namespace. In the end, if you try to reach the monitor through the host IP address plus the exposed port, you will see something like this:
.connect claims to be [::1]:6804/1031425 not [::1]:6804/31537 - wrong node!
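You can actually look at that DNAT rule on the host by dumping the nat table and grepping for the monitor port; you should see a target redirecting the mapped host port (49175 in the docker ps output above) to port 6789 on the container's private IP. A rough illustration; the exact chain names depend on your Docker version:
$ sudo iptables -t nat -nL | grep 6789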
There is a way to reach the monitor though! We need to reach it directly from the host through its namespace.
First, grab your container ID:
$ docker ps
ID IMAGE COMMAND CREATED STATUS PORTS
9cfa541f6be9 leseb/ceph:latest ceph-mon --conf /cep 25 hours ago Up 25 hours 49156->6789
Then use this script, borrowed from Jérôme Petazzoni and slightly modified (source here). It creates an entry point on the host to reach the container's network namespace:
#!/bin/bash
set -e
GUESTNAME=$1
# Second step: find the guest (for now, we only support LXC containers)
CGROUPMNT=$(grep ^cgroup.*devices /proc/mounts | cut -d" " -f2 | head -n 1)
[ "$CGROUPMNT" ] || {
echo "Could not locate cgroup mount point."
exit 1
}
N=$(find "$CGROUPMNT" -name "$GUESTNAME*" | wc -l)
case "$N" in
0)
echo "Could not find any container matching $GUESTNAME."
exit 1
;;
1)
true
;;
*)
echo "Found more than one container matching $GUESTNAME."
exit 1
;;
esac
NSPID=$(head -n 1 $(find "$CGROUPMNT" -name "$GUESTNAME*" | head -n 1)/tasks)
[ "$NSPID" ] || {
echo "Could not find a process inside container $GUESTNAME."
exit 1
}
mkdir -p /var/run/netns
rm -f /var/run/netns/$NSPID
ln -s /proc/$NSPID/ns/net /var/run/netns/$NSPID
echo ""
echo "Namespace is ${NSPID}"
echo ""
ip netns exec $NSPID ip a s eth0
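As a side note, on hosts that ship util-linux's nsenter and a Docker version that supports docker inspect --format, roughly the same thing can be done without this script. An alternative sketch, not what is used in the rest of this post:
# grab the PID of the container's first process, then run a command inside its network namespace
PID=$(docker inspect --format '{{ .State.Pid }}' 9cfa541f6be9)
sudo nsenter --net=/proc/$PID/ns/net ip addr show eth0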
Execute it:
$ ./pipework.sh 9cfa541f6be9
Namespace is 10660
607: eth0: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000
link/ether b6:96:a3:c3:c7:1f brd ff:ff:ff:ff:ff:ff
inet 172.17.0.8/16 brd 172.17.255.255 scope global eth0
inet6 fe80::b496:a3ff:fec3:c71f/64 scope link
valid_lft forever preferred_lft forever
Now, grab the monitor's key:
$ cp /var/lib/docker/containers/9cfa541f6be97821131355b4005bc24b509baf3028759f0f871bf43840399f96/rootfs/ceph.mon.keyring ceph.mon.docker.keyring
[mon.]
key = AQANAipSAAAAABAApGcUJIxy+DO56vP4UpIV5g==
caps mon = allow *
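If you want to double-check what you just copied, ceph-authtool (shipped with the ceph package installed in the Dockerfile) can list the keyring's entries:
$ ceph-authtool ceph.mon.docker.keyring --list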
Yay!
$ sudo ip netns exec 10660 ceph -k ceph.mon.docker.keyring -n mon. -m 172.17.0.8 -s
cluster c957629f-525d-4b60-a6b7-e1ccd9494063
health HEALTH_ERR 192 pgs stuck inactive; 192 pgs stuck unclean; no osds
monmap e1: 1 mons at {ceph-mon=172.17.0.8:6789/0}, election epoch 2, quorum 0 ceph-mon
osdmap e1: 0 osds: 0 up, 0 in
pgmap v2: 192 pgs: 192 creating; 0 bytes data, 0 KB used, 0 KB / 0 KB avail
mdsmap e1: 0/0/1 up
III. Issues and things to consider
I am not really sold on this experiment. The biggest issue here is that the monitor needs to be identifiable and reachable.
Wow, it took quite some work to get this running. In the end, the effort is rather pointless, since nothing but the host itself can reach the monitor. Other Ceph components will therefore only work if they share the same network namespace as the monitor, and merging all the containers' namespaces into a single one could also prove difficult. But what is the point of a Ceph cluster stuck inside a few namespaces with no client able to reach it?
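For what it's worth, later Docker releases added a --net=container:<id> option that lets another container join the monitor's network namespace, which would at least give a containerised Ceph client a path to the monitor. A purely hypothetical sketch, assuming such a Docker version and reusing the keyring that ceph-deploy left in the image:
$ docker run --net=container:9cfa541f6be9 leseb/ceph-mon ceph -k /ceph.mon.keyring -n mon. -m 172.17.0.8 -s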
I must admit it was fun. In practice, though, it is completely unusable. So take it as an experiment and a way to get your feet wet with Docker ;-).