Repair broken etcd node

Claus-Theodor Riegg, Software engineer at makandra GmbH
January 11, 2018

If an etcd node is no longer a member of the cluster, or fails to connect to it, remove the node from the cluster and then add it again:

  1. Stop etcd on the broken node: sudo service etcd stop
  2. Delete the data on the broken node: sudo rm -r /var/lib/etcd/data/*
  3. Delete the WAL data on the broken node: sudo rm -r /var/lib/etcd/wal/*
  4. Follow the instructions for etcd runtime-configuration: remove the broken node from the cluster, re-add it, and update the etcd config on the broken node with the parameters printed by the add command.
  5. Start etcd again on the broken node: sudo service etcd start
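
The steps above can be sketched as a small shell script. This is only a sketch under stated assumptions: the function names, the member ID, the node name and the peer URL are hypothetical placeholders, and the etcdctl calls assume the v3 API (with the v2 API the arguments of member add differ). By default the script only prints the commands (DRY_RUN=1) so you can review them first.

```shell
#!/bin/sh
# Sketch only. Prints commands unless DRY_RUN=0 is set.
run() {
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "+ $*"
  else
    "$@"
  fi
}

# Steps 1-3: run on the broken node.
# The globs are quoted so that they are expanded by the root shell.
wipe_broken_node() {
  run sudo service etcd stop
  run sudo sh -c 'rm -r /var/lib/etcd/data/*'
  run sudo sh -c 'rm -r /var/lib/etcd/wal/*'
}

# Step 4: run on a healthy member of the cluster.
# Arguments (all placeholders): member ID, node name, peer URL.
readd_member() {
  member_id=$1
  name=$2
  peer_url=$3
  run env ETCDCTL_API=3 etcdctl member list
  run env ETCDCTL_API=3 etcdctl member remove "$member_id"
  run env ETCDCTL_API=3 etcdctl member add "$name" --peer-urls="$peer_url"
}

# Step 5: run on the broken node after updating its etcd config.
start_broken_node() {
  run sudo service etcd start
}
```

For example, `readd_member 1a2b3c node3 http://10.0.0.3:2380` prints the three etcdctl commands for a made-up member. The output of the real member add command contains the values (such as ETCD_INITIAL_CLUSTER and ETCD_INITIAL_CLUSTER_STATE) that go into the config of the broken node before it is started again.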
Even if etcd logging is configured to write to /var/log/etcd/etcd.log, on new hosts (focal) the StandardOutput of the service may end up only in the journal (see systemctl status etcd).
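
A minimal sketch for finding the logs in either place. It assumes the systemd unit is called etcd (as in the systemctl command above); the function names are made up for illustration.

```shell
#!/bin/sh
# Prints "file" if the log file exists and is non-empty, otherwise "journal".
etcd_log_source() {
  log_file=${1:-/var/log/etcd/etcd.log}
  if [ -s "$log_file" ]; then
    echo "file"
  else
    echo "journal"
  fi
}

# Shows the last 50 log lines from whichever source has them.
show_etcd_logs() {
  log_file=${1:-/var/log/etcd/etcd.log}
  case "$(etcd_log_source "$log_file")" in
    file)    sudo tail -n 50 "$log_file" ;;
    journal) sudo journalctl -u etcd -n 50 --no-pager ;;
  esac
}
```

Calling `show_etcd_logs` without arguments checks the default path first and falls back to the journal.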

Posted by Claus-Theodor Riegg to makandra Operations (2018-01-11 16:04)