Adding nodes:
You might want to consider adding a new node if you have:
- Reached data capacity: your data has outgrown the node's hardware capacity.
- Reached traffic capacity: your application needs faster responses with lower latency.
- Run out of operational headroom: you need more resources for node repair, compaction, and other resource-intensive operations.
Adding Nodes: Best Practices
Vnodes: For clusters using vnodes, you can grow the cluster incrementally, adding nodes as more capacity is needed.
Two Minute Rule:
- Wait a period of time before adding each additional node (applies to both single-token and vnode clusters).
- Follow the 'two minute rule': leave at least two minutes between nodes.
- This ensures the range announcement is known to all nodes before the next one begins entering the cluster.
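As a rough sketch, a staggered start from an operator workstation could look like the following (node4 and node5 are hypothetical host names; adjust the start command to your install):
$ ssh node4 'sudo service dse start'
$ sleep 120   # two minute rule: let the ring learn the new node before adding the next
$ ssh node5 'sudo service dse start'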
Bootstrapping: Adding capacity to the cluster without downtime
- The node announces itself to the ring using the seed nodes.
- Can be a long-running process.
- Time depends on the size of the data.
Bootstrapping process:
- Calculate the ranges of the new node and notify the ring of these pending ranges.
- Calculate the nodes that currently own these ranges and will no longer own them once the bootstrap completes.
- Stream the data from these nodes to the bootstrapping node.
- We can monitor the bootstrapping process with nodetool netstats (see the example after this list).
- Join the new node to the ring so that it can serve traffic
- Length of time it takes to join depends on the amount of data to be streamed.
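While the node is joining, streaming progress can be watched from any node in the ring, for example:
$ nodetool netstats   # shows the streams flowing to the bootstrapping node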
[Figure: Bootstrapping Process]
Bootstrapping Steps: Adding a node to the cluster
Step 1: Install Cassandra on the new nodes, but do not start Cassandra.
Step 2: Depending on the snitch used in the cluster, set the properties in either the cassandra-topology.properties or the cassandra-rackdc.properties file (see the examples below):
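For example, with GossipingPropertyFileSnitch the new node's location goes in its own cassandra-rackdc.properties (DC1 and RAC1 are placeholder names):
dc=DC1
rack=RAC1
With PropertyFileSnitch, the new node is instead added to cassandra-topology.properties on every node, using the IP=DC:RACK format.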
Step 3: Set the following properties in the cassandra.yaml file (a sample excerpt follows this list):
cluster_name: same cluster name as on the other nodes
listen_address: the new node's IP address
seed_provider: same seed node details as on the other nodes
auto_bootstrap: should be set to true
endpoint_snitch: same as specified on the other nodes
other non-default settings: data, commitlog, cdc, and similar directories in cassandra.yaml and the snitch configuration files
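A minimal cassandra.yaml excerpt might look like this (cluster name, IP addresses, and snitch are placeholder values; use the ones already in use in your cluster):
cluster_name: 'MyCluster'
listen_address: 10.0.0.5            # this node's IP
seed_provider:
  - class_name: org.apache.cassandra.locator.SimpleSeedProvider
    parameters:
      - seeds: "10.0.0.1,10.0.0.2"  # same seed list as the rest of the cluster
auto_bootstrap: true
endpoint_snitch: GossipingPropertyFileSnitch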
Step 4: Start the DSE Cassandra service:
$ sudo service dse start
Step 5: Use nodetool status to verify that the node is fully bootstrapped and all other nodes are up (UN) and not in any other state.
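For example:
$ nodetool status   # the new node should show UN; UJ means it is still joining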
Step 6: After all new nodes are running, run nodetool cleanup on each of the previously existing nodes to remove the keys that no longer belong to those nodes.
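For example, on each pre-existing node (ks1 is a hypothetical keyspace name):
$ nodetool cleanup        # clean all keyspaces
$ nodetool cleanup ks1    # or limit cleanup to a single keyspace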
What if Bootstrap fails?
Two scenarios:
1. Bootstrap node couldn't even connect to the cluster.
- Check the log file for errors.
- Change the configuration and try again.
2. Streaming portion fails:
- The node exists in the cluster in a joining state.
- Run nodetool rebuild to re-bootstrap the data (see the example after this list).
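For example, on the stuck node (DC1 is a placeholder source datacenter name):
$ nodetool rebuild DC1    # stream the data again from the DC1 datacenter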
Nodetool cleanup:
- Perform cleanup after a bootstrap on the other nodes
- Reads all SSTables to make sure there are no out-of-range tokens for that particular node.
- If you don't run cleanup, the out-of-range data will get picked up through compaction over time.
How the cleanup process works:
- It creates a new SSTable and copies only the data belonging to that node from the old SSTable.
- It ignores the irrelevant data in the old SSTable.
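Cleanup progress can be watched like any other compaction, for example:
$ nodetool compactionstats   # cleanup operations show up alongside compactions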