v13.2.3 Mimic released
TheAnalyst
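This is the third bugfix release of the Mimic v13.2.x long term stable release series. It contains many fixes across all components of Ceph.
If you have not yet upgraded to v13.2.3, consider upgrading directly to v13.2.4, which has already been released and adds a few security fixes on top of this release.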
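- The default memory utilization of the mons has increased. Rocksdb now uses 512 MB of RAM by default, which should be sufficient for small to medium-sized clusters; large clusters should tune this up. In addition, mon_osd_cache_size has been increased from 10 OSDMaps to 500, which adds 500 MB to 1 GB of RAM on large clusters and much less on small clusters.
- Ceph v13.2.2 includes an incorrect backport that may cause the mds to enter the “damaged” state when a Ceph cluster is upgraded from a previous version. The bug is fixed in v13.2.3. If you are already running v13.2.2, upgrading to v13.2.3 requires no special action.
- The bluestore_cache_* options are no longer needed. They are replaced by osd_memory_target, which defaults to 4GB. BlueStore expands and contracts its cache to try to stay within this limit. Users who upgrade should note that this default is higher than the previous bluestore_cache_size default of 1GB, so OSDs using BlueStore will use more memory by default. For more details, see the BlueStore documentation.
- This release contains an upgrade bug, http://tracker.ceph.com/issues/36686, because of which upgrading during recovery/backfill can cause OSDs to fail. There are two workarounds: either restart all OSDs after the upgrade, or upgrade while all PGs are in the “active+clean” state. If you have already upgraded successfully to 13.2.2, this issue will not affect you. Going forward, we are working on a clean upgrade path for this feature.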
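The two operational points above (the higher BlueStore memory default and the “active+clean” precondition for upgrading) can be checked before an upgrade with a small script. The following is a minimal sketch, not part of Ceph: the helper names are hypothetical, the 4GB default is treated as 4 GiB, and the status dictionary is assumed to be the parsed output of `ceph status --format json`, with a `pgmap` section containing `num_pgs` and `pgs_by_state`. Verify that shape against your own cluster before relying on it.

```python
# Illustrative pre-upgrade helpers. These names are hypothetical and are
# not provided by Ceph itself.

GIB = 1024 ** 3


def all_pgs_active_clean(status):
    """Return True only if every PG is reported as exactly 'active+clean'.

    `status` is assumed to be the parsed JSON of `ceph status --format json`,
    with `pgmap.num_pgs` and `pgmap.pgs_by_state` (a list of entries that
    each carry `state_name` and `count`).
    """
    pgmap = status.get("pgmap", {})
    total = pgmap.get("num_pgs", 0)
    clean = sum(
        entry.get("count", 0)
        for entry in pgmap.get("pgs_by_state", [])
        if entry.get("state_name") == "active+clean"
    )
    # An empty or missing pgmap is treated as "not safe to upgrade".
    return total > 0 and clean == total


def osd_memory_budget_gib(num_osds, osd_memory_target=4 * GIB):
    """Memory (in GiB) that the OSDs on one host will aim for in total.

    osd_memory_target is a target that BlueStore tries to stay within,
    so this is a planning figure rather than a hard limit.
    """
    return num_osds * osd_memory_target / GIB


if __name__ == "__main__":
    # Hand-written sample data, not captured from a real cluster.
    sample = {
        "pgmap": {
            "num_pgs": 128,
            "pgs_by_state": [
                {"state_name": "active+clean", "count": 120},
                {"state_name": "active+recovering+degraded", "count": 8},
            ],
        }
    }
    print("safe to upgrade:", all_pgs_active_clean(sample))
    print("OSD memory budget for 12 OSDs (GiB):", osd_memory_budget_gib(12))
```

The check is deliberately strict: any state other than plain “active+clean” (including states that merely add a scrubbing flag) counts as not ready, which errs on the side of postponing the upgrade.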
Changelog
- build/ops: Can’t compile Ceph on Fedora 29 as it doesn’t recognize python3-tox as an install Tox (issue#18163, issue#37301, issue#37422, pr#25294, Nathan Cutler, Brad Hubbard)
- build/ops: debian: correct ceph-common relationship with older radosgw package (pr#25115, Matthew Vernon)
- ceph-bluestore-tool: fix set label functionality for specific keys (pr#24352, Igor Fedotov)
- ceph fs add_data_pool applies pool application metadata incorrectly (issue#36203, issue#36028, pr#24470, John Spray)
- cephfs: client: explicitly show blacklisted state via asok status command (issue#36457, issue#36352, pr#24993, Jonathan Brielmaier, Zhi Zhang)
- cephfs: client: request next osdmap for blacklisted client (issue#36668, issue#36690, pr#24987, Zhi Zhang)
- cephfs-journal-tool: wrong layout info used (issue#24933, issue#24644, pr#24583, Gu Zhongyan)
- cephfs: some tool commands silently operate on only rank 0, even if multiple ranks exist (issue#36218, pr#25036, Venky Shankar)
- ceph-fuse: add to selinux profile (issue#36103, issue#36197, pr#24439, Patrick Donnelly)
- ceph-volume: activate option --auto-detect-objectstore respects --no-systemd (issue#36249, pr#24357, Alfredo Deza)
- ceph-volume add device_id to inventory listing (pr#25349, Jan Fajerski)
- ceph-volume: add inventory command (issue#24972, pr#25013, Jan Fajerski)
- ceph-volume Additional work on ceph-volume to add some choose_disk capabilities (issue#36446, pr#24782, Erwan Velu)
- ceph-volume add new ceph-handlers role from ceph-ansible (issue#36251, pr#24337, Alfredo Deza)
- ceph-volume: adds a --prepare flag to lvm batch (issue#36363, pr#24760, Andrew Schoen)
- ceph-volume: allow to specify --cluster-fsid instead of reading from ceph.conf (issue#26953, pr#25116, Alfredo Deza)
- ceph_volume_client: py3 compatible (issue#26850, issue#17230, pr#24443, Rishabh Dave, Patrick Donnelly)
- ceph-volume custom cluster names fail on filestore trigger (issue#27210, pr#24279, Alfredo Deza)
- ceph-volume: do not send (lvm) stderr/stdout to the terminal, use the logfile (issue#36492, pr#24740, Alfredo Deza)
- ceph-volume enable --no-systemd flag for simple sub-command (issue#36470, pr#25011, Alfredo Deza)
- ceph-volume: fix journal and filestore data size in lvm batch --report (issue#36242, pr#24306, Andrew Schoen)
- ceph-volume: lsblk can fail to find PARTLABEL, must fallback to blkid (issue#36098, pr#24334, Alfredo Deza)
- ceph-volume lvm.prepare update help to indicate partitions are needed, not devices (issue#24795, pr#24449, Alfredo Deza)
- ceph-volume: make lvm batch idempotent (pr#24588, Andrew Schoen)
- ceph-volume: patch Device when testing (issue#36768, pr#25066, Alfredo Deza)
- ceph-volume: reject devices that have existing GPT headers (issue#27062, pr#25103, Andrew Schoen)
- ceph-volume: remove LVs when using zap --destroy (pr#25100, Alfredo Deza)
- ceph-volume remove version reporting from help menu (issue#36386, pr#24753, Alfredo Deza)
- ceph-volume: rename Device property valid to available (issue#36701, pr#25133, Jan Fajerski)
- ceph-volume: skip processing devices that don’t exist when scanning system disks (issue#36247, pr#24381, Alfredo Deza)
- ceph-volume systemd import main so console_scripts work for executable (issue#36648, pr#24852, Alfredo Deza)
- ceph-volume tests install ceph-ansible’s requirements.txt dependencies (issue#36672, pr#24959, Alfredo Deza)
- ceph-volume tests.systemd update imports for systemd module (issue#36704, pr#24957, Alfredo Deza)
- ceph-volume: use console_scripts (issue#36601, pr#24838, Mehdi Abaakouk)
- ceph-volume util.encryption don’t push stderr to terminal (issue#36246, pr#24826, Alfredo Deza)
- ceph-volume util.encryption robust blkid+lsblk detection of lockbox (pr#24980, Alfredo Deza)
- client: fix use-after-free in Client::link() (issue#35841, issue#24557, pr#24187, “Yan, Zheng”)
- client: statfs inode count odd (issue#35940, issue#24849, pr#24377, Rishabh Dave)
- client: two ceph-fuse client, one can not list out files created by an… (issue#27051, issue#35934, pr#24295, Peng Xie)
- client: update ctime when modifying file content (issue#35945, issue#36134, pr#24385, “Yan, Zheng”)
- common: get real hostname from container/pod environment (pr#23916, Sage Weil)
- core: _aio_log_start inflight overlap of 0x10000~1000 with [65536~4096] (issue#36754, issue#36625, pr#25062, Jonathan Brielmaier, Yang Honggang)
- core: FAILED assert(osdmap_manifest.pinned.empty()) in OSDMonitor::prune_init() (issue#24612, issue#35071, pr#24918, Joao Eduardo Luis)
- core: Interactive mode CLI prints no output since Mimic (issue#36358, issue#36432, pr#24971, John Spray, Mohamad Gebai)
- core: mgr crash on scrub of unconnected osd (issue#36110, issue#36465, pr#25029, Sage Weil)
- core: mon osdmap cache too small during upgrade to mimic (issue#36505, pr#25019, Sage Weil)
- core: monstore tool rebuild does not generate creating_pgs (issue#36306, issue#36433, pr#25016, Sage Weil)
- core: Objecter: add ignore cache flag if got redirect reply (issue#36658, pr#25075, Iain Buclaw, Jonathan Brielmaier)
- core: objecter cannot resend split-dropped op when racing with con reset (issue#22544, issue#35843, pr#24970, Sage Weil)
- core: os/bluestore: cache autotuning and memory limit (issue#37340, pr#25283, Josh Durgin, Mark Nelson)
- core: rados rm --force-full is blocked when cluster is in full status (issue#36435, pr#25017, Yang Honggang)
- crush/CrushWrapper: fix crush tree json dumper (issue#36150, pr#24481, Oshyn Song)
- debian/control: require fuse for ceph-fuse (issue#21057, pr#24037, Thomas Serlin)
- doc: add ceph-volume inventory sections (pr#25130, Jan Fajerski)
- doc: fix broken fstab url in cephfs/fuse (issue#36286, issue#36313, pr#24441, Jos Collin)
- doc: Put command template into literal block (pr#25000, Alexey Stupnikov)
- doc: remove deprecated ‘scrubq’ from ceph(8) (issue#35813, issue#35855, pr#24210, Ruben Kerkhof)
- docs: backport edit on github changes (pr#25362, Neha Ojha, Noah Watkins)
- doc: Typo error on cephfs/fuse/ (issue#36180, issue#36308, pr#24420, Karun Josy)
- ec: src/common/interval_map.h: 161: FAILED assert(len > 0) (issue#21931, issue#22330, pr#24581, Neha Ojha)
- fsck: cid is improperly matched to oid (issue#36146, issue#36551, issue#36099, issue#32731, pr#24480, Kefu Chai, Sage Weil)
- kernel_untar_build.sh: bison: command not found (issue#36121, pr#24241, Neha Ojha)
- libcephfs: expose CEPH_SETATTR_MTIME_NOW and CEPH_SETATTR_ATIME_NOW (issue#36205, issue#35961, pr#24464, Zhu Shangzhong)
- librados application’s symbol could conflict with the libceph-common (issue#26839, issue#25154, pr#24708, Kefu Chai)
- librbd: blacklisted client might not notice it lost the lock (issue#34534, pr#24401, Jason Dillaman)
- librbd: ensure exclusive lock acquired when removing sync point snaps… (issue#35714, issue#24898, pr#24137, Mykola Golub)
- librbd: fixed assert when flattening clone with zero overlap (issue#35957, issue#35702, pr#24356, Jason Dillaman)
- librbd: journaling unable request can not be sent to remote lock owner (issue#26939, issue#35712, pr#24122, Mykola Golub)
- librbd: object map improperly flagged as invalidated (issue#24516, issue#36225, pr#24413, Jason Dillaman)
- librgw: crashes in multisite configuration (issue#36302, issue#36415, pr#24908, Casey Bodley)
- mds: allows client to create .. and . dirents (issue#32104, pr#24384, Venky Shankar)
- mds: curate priority of perf counters sent to mgr (issue#35938, issue#26991, issue#32090, issue#35837, pr#24467, Patrick Donnelly, Venky Shankar)
- mds: evict cap revoke non-responding clients (pr#24661, Venky Shankar)
- mimic:mds: fix mds damaged due to unexpected journal length (issue#36199, pr#24463, Zhi Zhang)
- mds: internal op missing events time ‘throttled’, ‘all_read’, ‘dispatched’ (issue#36114, issue#36195, pr#24411, Yanhu Cao)
- mds: migrate strays part by part when shutdown mds (issue#26926, issue#32092, pr#24435, “Yan, Zheng”)
- mds: optimize the way how max export size is enforced (issue#25131, pr#23952, “Yan, Zheng”)
- mds: print is_laggy message once (issue#35250, issue#35719, pr#24161, Patrick Donnelly)
- mds: rctime may go back (issue#35916, issue#36136, pr#24379, “Yan, Zheng”)
- mds: rctime not set on system inode (root) at startup (issue#36221, issue#36461, pr#25042, Patrick Donnelly)
- mds: reset heartbeat map at potential time-consuming places (issue#26858, pr#23506, “Yan, Zheng”)
- mds: src/mds/MDLog.cc: 281: FAILED ceph_assert(!capped) during max_mds thrashing (issue#36350, issue#37093, pr#25095, “Yan, Zheng”, Jonathan Brielmaier)
- mgr/DaemonServer: fix Session leak (pr#24233, Sage Weil)
- mgr/dashboard: Add http support to dashboard (issue#36069, pr#24734, Boris Ranto, Wido den Hollander)
- mgr/dashboard: Add support for URI encode (issue#24621, issue#26856, issue#24907, pr#24488, Tiago Melo)
- mgr/dashboard: Progress bar does not stop in TableKeyValueComponent (issue#35925, pr#24258, Volker Theile)
- mgr/dashboard: Remove fieldsets when using CdTable (issue#27851, issue#26999, pr#24478, Tiago Melo)
- mgr: hold lock while accessing the request list and submitting request (pr#25113, Jerry Lee)
- mgr: [restful] deep_scrub is not a valid OSD command (issue#36720, issue#36749, pr#25040, Boris Ranto)
- mon: mgr options not parse properly (issue#35076, issue#35836, pr#24176, Sage Weil)
- mon/OSDMonitor: invalidate max_failed_since on cancel_report (issue#35930, issue#35860, pr#24281, xie xingguo)
- mon: test if gid exists in pending for prepare_beacon (issue#35848, pr#24272, Patrick Donnelly)
- msg/async: clean up local buffers on dispatch (issue#36127, issue#35987, pr#24386, Greg Farnum)
- msg: ceph_abort() when there are enough accepter errors in msg server (issue#36219, pr#25045, penglaiyxy@gmail.com)
- msg: challenging authorizer messages appear at debug_ms=0 (issue#35251, issue#35717, pr#24113, Patrick Donnelly)
- multisite: data full sync does not limit concurrent bucket sync (issue#26897, issue#36216, pr#24536, Casey Bodley)
- multisite: data sync error repo processing does not back off on empty (issue#35979, issue#26938, pr#24319, Casey Bodley)
- multisite: incremental data sync makes unnecessary call to RGWReadRemoteDataLogShardInfoCR (issue#35977, issue#26952, pr#24710, Casey Bodley)
- multisite: intermittent test_bucket_index_log_trim failures (issue#36201, issue#36034, pr#24400, Casey Bodley)
- multisite: invalid read in RGWCloneMetaLogCoroutine (issue#36208, issue#35851, pr#24414, Casey Bodley)
- multisite: segfault on shutdown/realm reload (issue#35857, issue#35543, pr#24235, Casey Bodley)
- os/bluestore: fix bloom filter num entry miscalculation in repairer (issue#25001, pr#24339, Igor Fedotov)
- os/bluestore: handle spurious read errors (issue#22464, pr#24647, Paul Emmerich)
- osd: add creating to pg_string_state (issue#36174, issue#36298, pr#24601, Dan van der Ster)
- osd: backport recent upmap fixes (pr#25419, ningtao, xie xingguo)
- osdc/Objecter: possible race condition with connection reset (issue#36183, issue#36296, pr#24600, Jason Dillaman)
- osd: crash in OpTracker::unregister_inflight_op via OSD::get_health_metrics (issue#24889, pr#23026, Radoslaw Zarzynski)
- osdc: reduce ObjectCacher’s memory fragments (issue#36192, issue#36643, pr#24873, “Yan, Zheng”)
- osd/ECBackend: don’t get result code of subchunk-read overwritten (issue#35959, issue#21769, pr#24298, songweibin)
- OSDMapMapping does not handle active.size() > pool size (issue#26866, issue#35936, pr#24431, Sage Weil)
- osd/PG: avoid choose_acting picking want with > pool size items (issue#35963, issue#35924, pr#24344, Sage Weil)
- osd/PrimaryLogPG: fix potential pg-log overtrimming (pr#24309, xie xingguo)
- osd: race condition opening heartbeat connection (issue#36637, issue#36602, pr#25026, Sage Weil)
- osd: RBD client IOPS pool stats are incorrect (2x higher; includes IO hints as an op) (issue#24909, issue#36557, pr#25024, Jason Dillaman)
- osd: Remove old bft= which has been superseded by backfill (issue#36292, issue#36170, pr#24573, David Zafman)
- qa: add test that builds example librados programs (issue#36228, issue#15100, pr#24537, Nathan Cutler)
- qa/ceph-ansible: Specify stable-3.2 branch (pr#25191, Brad Hubbard)
- qa: extend timeout for SessionMap flush (issue#36156, pr#24438, Patrick Donnelly)
- qa: fsstress workunit does not execute in parallel on same host without clobbering files (issue#36278, issue#24177, issue#36323, issue#36184, issue#36165, issue#36153, pr#24408, Patrick Donnelly)
- qa: increase rm timeout for workunit cleanup (issue#36501, issue#36365, pr#24684, Patrick Donnelly)
- qa: install dependencies for rbd_workunit_kernel_untar_build (issue#35074, issue#35077, pr#24240, Ilya Dryomov)
- qa: remove knfs site from future releases (issue#36075, issue#36102, pr#24269, Yuri Weinstein)
- qa/suites/rados/thrash-old-clients: exclude packages for hammer, jewel (pr#25193, Neha Ojha)
- qa/suites/rgw/verify/tasks/cls_rgw: test cls_rgw (issue#25024, pr#23197, Casey Bodley, Sage Weil)
- qa/tasks/qemu: use unique clone directory to avoid race with workunit (issue#36542, issue#36569, pr#24811, Jason Dillaman)
- qa: test_recovery_pool tries asok on wrong node (issue#24928, issue#24858, pr#23087, Patrick Donnelly)
- qa: tolerate failed rank while waiting for state (issue#36280, issue#35828, pr#24572, Patrick Donnelly)
- qa/workunits: replace ‘realpath’ with ‘readlink -f’ in fsstress.sh (issue#36409, issue#36430, issue#35538, pr#24622, Ilya Dryomov, Jason Dillaman)
- RADOS: probably missing clone location for async_recovery_targets (issue#35964, issue#35546, pr#24345, xie xingguo)
- mimic:rbd: fix error import when the input is a pipe (issue#35705, issue#34536, pr#24002, songweibin)
- [rbd-mirror] failed assertion when updating mirror status (issue#36084, issue#36120, pr#24321, Jason Dillaman)
- rbd: [rbd-mirror] forced promotion after killing remote cluster results in stuck state (issue#36659, issue#36693, pr#24952, Jonathan Brielmaier, Jason Dillaman)
- rbd: [rbd-mirror] periodic mirror status timer might fail to be scheduled (issue#36500, issue#36555, pr#24916, Jason Dillaman)
- rbd: rbd-nbd: do not ceph_abort() after print the usages (issue#36660, issue#36713, pr#24988, Shiyang Ruan)
- rbd: TokenBucketThrottle: use reference to m_blockers.front() and then update it (issue#36529, issue#36475, pr#24915, Dongsheng Yang)
- Revert “mimic: cephfs-journal-tool: enable purge_queue journal’s event commands” (issue#36346, issue#24604, pr#24485, Xuehan Xu, “Yan, Zheng”)
- rgw: abort_bucket_multiparts() ignores individual NoSuchUpload errors (issue#36129, issue#35986, pr#24388, Casey Bodley)
- rgw-admin: reshard add can add a non-existent bucket (issue#36449, issue#36756, pr#25087, Jonathan Brielmaier, Abhishek Lekshmanan)
- rgw: async sync_object and remove_object does not access coroutine me… (issue#36138, issue#35905, pr#24417, Tianshan Qu)
- rgw/beast: drop privileges after binding ports (issue#36041, pr#24436, Paul Emmerich)
- rgw: beast frontend fails to parse ipv6 endpoints (issue#36662, issue#36734, pr#25079, Jonathan Brielmaier, Casey Bodley)
- rgw: cls_user_remove_bucket does not write the modified cls_user_stats (issue#36496, issue#36533, pr#24910, Casey Bodley)
- rgw: default quota not set in radosgw for Openstack users (issue#24595, issue#36223, pr#24907, Casey Bodley)
- mimic:rgw: fix chunked-encoding for chunks >1MiB (issue#36125, issue#35990, pr#24363, Robin H. Johnson)
- rgw: fix deadlock on RGWIndexCompletionManager::stop (issue#26949, issue#35710, pr#24101, Yao Zongyou)
- mimic:rgw: fix leak of curl handle on shutdown (issue#35715, issue#36213, pr#24518, Casey Bodley)
- mimic:rgw: list bucket can not show the object uploaded by RGWPostObj when enable bucket versioning (pr#24571, yuliyang)
- rgw: radosgw-admin user stats are incorrect when dynamic re-sharding is enabled (issue#36535, pr#24911, Casey Bodley)
- rgw: raise debug level on redundant data sync error messages (issue#35830, issue#36140, pr#24418, Casey Bodley)
- rgw: raise default rgw_curl_low_speed_time to 300 seconds (issue#35708, issue#27989, pr#24071, Casey Bodley)
- rgw: renew resharding locks to prevent expiration (issue#36687, issue#27219, issue#34307, pr#24899, Orit Wasserman, J. Eric Ivancich)
- rgw: resharding produces invalid values of bucket stats (issue#36290, issue#36381, pr#24526, Abhishek Lekshmanan)
- mimic:rgw: return x-amz-version-id: null when delete obj in versioning (issue#35814, pr#24189, yuliyang)
- rgw: RGWAsyncGetBucketInstanceInfo does not access coroutine memory (issue#36211, issue#35812, pr#24516, Casey Bodley)
- rgw: set default objecter_inflight_ops = 24576 (issue#36571, issue#25109, pr#24860, Jonathan Brielmaier, Matt Benjamin)
- rgw: support server-side encryption when SSL is terminated in a proxy (issue#36645, issue#27221, pr#24931, Jonathan Brielmaier, Casey Bodley)
- rgw: use-after-free from RGWRadosGetOmapKeysCR::~RGWRadosGetOmapKeysCR (issue#21154, issue#36537, issue#36539, pr#24912, Casey Bodley, Sage Weil)
- rpm: use updated gperftools (issue#36508, issue#35969, pr#24260, Brad Hubbard, Kefu Chai)
- segv in BlueStore::OldExtent::create (issue#36592, issue#36526, pr#24745, Sage Weil)
- test/librbd: not valid to have different parents between image snapshots (issue#36117, pr#24244, Jason Dillaman)
- [test] periodic seg faults within unittest_librbd (issue#36220, issue#36238, pr#24711, Jason Dillaman)
- test/rbd_mirror: race in WaitingOnLeaderReleaseLeader (issue#36236, issue#36276, pr#24551, Mykola Golub)
- tests: ceph-admin-commands.sh workunit does not log what it’s doing (issue#37153, issue#37089, pr#25085, Nathan Cutler)
- tests: librados api aio tests race condition (issue#24587, issue#36647, pr#25027, Josh Durgin)
- tests: make readable.sh fail if it doesn’t run anything (pr#25050, Greg Farnum)
- tests: rbd: move OpenStack devstack test to rocky release (issue#36410, issue#36428, pr#24913, Jason Dillaman)
- tests: unittest_rbd_mirror: TestMockImageMap.AddInstancePingPongImageTest: Value of: it != peer_ack_ctxs->end() (issue#36683, issue#36689, pr#24946, Mykola Golub, Jonathan Brielmaier)
- tests: use timeout for fs asok operations (issue#36335, issue#36503, pr#25332, Patrick Donnelly)
- tests: /usr/bin/ld: cannot find -lradospp in rados mimic (issue#37396, pr#25285, Nathan Cutler)
- test: Use a grep pattern that works across releases (issue#35845, issue#35909, pr#24017, David Zafman)
- tools: ceph-objectstore-tool: Allow target level as first positional … (issue#35846, issue#35992, pr#24116, David Zafman)