20. Member Status

First let's peek under the hood.

$ riak-admin member_status
================================= Membership ==================================
Status     Ring    Pending    Node
-------------------------------------------------------------------------------
valid       4.3%     16.8%    riak@aston
valid      18.8%     16.8%    riak@esb
valid      19.1%     16.8%    riak@framboise
valid      19.5%     16.8%    riak@gin
valid      19.1%     16.4%    riak@highball
valid      19.1%     16.4%    riak@ipa
-------------------------------------------------------------------------------
Valid:6 / Leaving:0 / Exiting:0 / Joining:0 / Down:0
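As a rough cross-check of those Ring percentages, counting the partitions each node owns from riak attach gives the same picture. This is a sketch using the standard riak_core ring calls, not a command from the original session:

{ok, Ring} = riak_core_ring_manager:get_my_ring().
Owners = riak_core_ring:all_owners(Ring).
%% one {Node, PartitionCount} pair per cluster member
[{Node, length([P || {P, N} <- Owners, N =:= Node])}
 || Node <- lists:usort([N || {_, N} <- Owners])].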
26. Hot Patch

We patched their live production system while it was still under load.

(on all nodes)

riak attach

l(riak_kv_bitcask_backend).
m(riak_kv_bitcask_backend).

Module riak_kv_bitcask_backend compiled: Date: November 12 2011, Time: 04.18
Compiler options:  [{outdir,"ebin"},
                    debug_info, warnings_as_errors,
                    {parse_transform,lager_transform},
                    {i,"include"}]
Object file: /usr/lib/riak/lib/riak_kv-1.0.1/ebin/riak_kv_bitcask_backend.beam
Exports:
         api_version/0          is_empty/1
         callback/3             key_counts/0
         delete/4               key_counts/1
         drop/1                 module_info/0
         fold_buckets/4         module_info/1
         fold_keys/4            put/5
         fold_objects/4         start/2
         get/3                  status/1
...
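For context, that is the standard Erlang hot-load sequence. A sketch of the full recipe, assuming the patched .beam has already been copied over the one on disk:

%% inside riak attach, on each node
code:which(riak_kv_bitcask_backend).   %% confirm which .beam file will be loaded
l(riak_kv_bitcask_backend).            %% purge the old code and load the new module
m(riak_kv_bitcask_backend).            %% check the compile date/options to verify it took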
27. Bingo!

And the new code did what we expected.

{ok, R} = riak_core_ring_manager:get_my_ring().
[riak_core_vnode_master:get_vnode_pid(Partition, riak_kv_vnode)
 || {Partition,_} <- riak_core_ring:all_owners(R)].

(riak@gin)19> [riak_core_vnode_master:get_vnode_pid(Partition, riak_kv_vnode) || {Partition,_} <- riak_core_ring:all_owners(R)].
22:48:07.423 [notice] Unused data directories exist for partition "11417981541647679048466287755595961091061972992": "/data/riak/bitcask/11417981541647679048466287755595961091061972992"
22:48:07.785 [notice] Unused data directories exist for partition "582317058624031631471780675535394015644160622592": "/data/riak/bitcask/582317058624031631471780675535394015644160622592"
22:48:07.829 [notice] Unused data directories exist for partition "782131735602866014819940711258323334737745149952": "/data/riak/bitcask/782131735602866014819940711258323334737745149952"
[{ok,<0.30093.11>},
 ...
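To double-check that one of those logged partitions really was an orphan on Gin, looking up its owner in the ring is enough. A sketch, not from the original session:

{ok, Ring} = riak_core_ring_manager:get_my_ring().
Partition = 11417981541647679048466287755595961091061972992.
%% every partition appears exactly once in all_owners/1
{Partition, Owner} = lists:keyfind(Partition, 1, riak_core_ring:all_owners(Ring)).
Owner =:= node().   %% false here means the local data directory is left over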
28. Manual Cleanup

So we backed up those vnodes with unused data on Gin to another system and manually removed them.

gin:/data/riak/bitcask$ ls manual_cleanup/
11417981541647679048466287755595961091061972992    782131735602866014819940711258323334737745149952
582317058624031631471780675535394015644160622592

gin:/data/riak/bitcask$ rm -rf manual_cleanup
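The same list can be derived rather than copied out of the log: bitcask names each data directory after its partition index, so the difference between what is on disk and what the node owns is exactly the set to back up and remove. A sketch, assuming the /data/riak/bitcask data root shown above:

{ok, Ring} = riak_core_ring_manager:get_my_ring().
Owned = [integer_to_list(P) || {P, N} <- riak_core_ring:all_owners(Ring), N =:= node()].
{ok, OnDisk} = file:list_dir("/data/riak/bitcask").
OnDisk -- Owned.   %% directories for partitions this node no longer owns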
34. Highball's Turn

Highball was the next lowest now that Gin was handing data off, so it was time to restart it too.

on highball

application:unset_env(riak_core, forced_ownership_handoff).
application:set_env(riak_core, vnode_inactivity_timeout, 60000).
application:set_env(riak_core, handoff_concurrency, 1).

on gin

application:set_env(riak_core, handoff_concurrency, 4). % the default setting
riak_core_vnode_manager:force_handoffs().
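Before flipping settings like these, it's worth recording the values currently in effect so the temporary tuning can be reverted exactly. A sketch using the standard application env calls, not from the original session:

application:get_env(riak_core, handoff_concurrency).        %% {ok, Value} or undefined
application:get_env(riak_core, vnode_inactivity_timeout).
application:get_env(riak_core, forced_ownership_handoff).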
Thank you to Greg Burd for most of these slides. He was going to give the presentation, but did not feel well enough to be here tonight.
Gin had not removed that vnode's data directory after sending it to Aston. We had confirmation: data was not being removed after transfers finished. This would have eventually eaten all the space on all nodes and halted the cluster.
We already had a solution ready for 1.0.2 that would properly identify any orphaned vnodes, so why not simply use that? We tested it on our laptops, creating a close approximation of the customer's environment.
At this point it was late at night, the cluster was servicing requests as always, and customers had no idea anything was wrong. We all went to bed and didn't reconvene for 12 hours.
On Gin only, we reset things we’d changed to default values and then re-enabled handoffs.