Saturday, 20 June 2015

Installing Cassandra on Linux - Single Node Cluster

Install Cassandra on Linux

Prerequisites
Before installing Cassandra make sure the following prerequisites are met:
·   Root or sudo access to the install machine.
·   Oracle Java SE Runtime Environment (JRE).
·   Java Native Access (JNA), which is required for production installations.

Check which version of Java is installed, and where JAVA_HOME points, by running the following commands in a terminal window:

java -version
echo $JAVA_HOME
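
The check above can also be scripted. The following is a minimal sketch, not an official tool: parse_java_major is a hypothetical helper name, and it assumes the first line of the java -version output contains the usual version "x.y.z" string.

```shell
#!/bin/sh
# Extract the major Java version from the first line of `java -version`.
# Handles the legacy "1.x" scheme (1.7.0_80 -> 7) and the newer one (11.0.2 -> 11).
# parse_java_major is a hypothetical helper name used for illustration.
parse_java_major() {
    # $1: a line such as: java version "1.7.0_80"
    ver=$(printf '%s\n' "$1" | sed -n 's/.*version "\([^"]*\)".*/\1/p')
    case "$ver" in
        "")  return 1 ;;          # no version string found
        1.*) ver=${ver#1.} ;;     # drop the legacy "1." prefix
    esac
    # Keep everything before the first ".", "_" or "-".
    printf '%s\n' "${ver%%[._-]*}"
}

# `java -version` writes to stderr, hence the 2>&1 redirect.
if command -v java >/dev/null 2>&1; then
    first_line=$(java -version 2>&1 | head -n 1)
    echo "Java major version: $(parse_java_major "$first_line")"
else
    echo "java not found on PATH"
fi
echo "JAVA_HOME=${JAVA_HOME:-<not set>}"
```

If JAVA_HOME is reported as not set, Cassandra falls back to whichever java it finds on the PATH.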

To install Cassandra, download the binary tarball from the Apache Cassandra website, unpack it, and move the unpacked directory to your home directory:

tar -xvzf apache-cassandra-1.2.16-bin.tar.gz
mv apache-cassandra-1.2.16 ~/cassandra
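
If you repeat this step for other versions, the two commands can be wrapped in a small function. This is only a sketch under stated assumptions: tarball_dir and install_tarball are hypothetical names, and the tarball is assumed to follow the <name>-bin.tar.gz naming used above and to unpack into a directory called <name>.

```shell
#!/bin/sh
# Derive the unpacked directory name from a Cassandra binary tarball name,
# e.g. apache-cassandra-1.2.16-bin.tar.gz -> apache-cassandra-1.2.16
tarball_dir() {
    name=${1##*/}                        # strip any leading path
    printf '%s\n' "${name%-bin.tar.gz}"  # strip the -bin.tar.gz suffix
}

# Unpack a tarball into the current directory and move the result to a
# target directory, refusing to overwrite an existing installation.
install_tarball() {
    tarball=$1
    target=${2:-"$HOME/cassandra"}
    [ -f "$tarball" ] || { echo "no such file: $tarball" >&2; return 1; }
    [ ! -e "$target" ] || { echo "already exists: $target" >&2; return 1; }
    tar -xzf "$tarball" && mv "$(tarball_dir "$tarball")" "$target"
}

# Usage (not run automatically):
#   install_tarball apache-cassandra-1.2.16-bin.tar.gz
```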


Configure Cassandra

Open the cassandra.yaml file, which is located in the
 <installation-directory>/conf directory of Cassandra.

$ vi cassandra.yaml
The above command opens the cassandra.yaml file.
Verify the following configurations.

By default, these settings point to the following directories:
·   data_file_directories: /var/lib/cassandra/data
·   commitlog_directory: /var/lib/cassandra/commitlog
·   saved_caches_directory: /var/lib/cassandra/saved_caches
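
Instead of scrolling through the file in an editor, the three settings can be printed directly. The sketch below uses a hypothetical helper name, show_cassandra_dirs, and assumes the layout of the stock cassandra.yaml: data_file_directories is a YAML list with one "- path" entry per line, while the other two settings are single key: value lines.

```shell
#!/bin/sh
# Print the directory settings from a cassandra.yaml file.
# Assumes data_file_directories is a YAML list and the other two settings
# are plain "key: value" lines, as in the stock configuration file.
show_cassandra_dirs() {
    awk '
        /^data_file_directories:/   { in_list = 1; print; next }
        in_list && /^[ \t]*-/       { print; next }   # list entries
                                    { in_list = 0 }   # any other line ends the list
        /^(commitlog_directory|saved_caches_directory):/ { print }
    ' "$1"
}

# Usage (not run automatically):
#   show_cassandra_dirs "$CASSANDRA_HOME/conf/cassandra.yaml"
```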
Note: To run a single-node test cluster of Cassandra, you do not
 need to change anything in the cassandra.yaml file.
Simply run:
./bin/cassandra
Next, make sure that the directories Cassandra writes to, such as the
log directory, exist and that the user running Cassandra can write to them:
sudo mkdir -p /var/lib/cassandra
sudo mkdir -p /var/log/cassandra
sudo chown -R "$USER":"$(id -gn)" /var/lib/cassandra
sudo chown -R "$USER":"$(id -gn)" /var/log/cassandra
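
To confirm that the permissions took effect, a quick check along these lines may help. It is a sketch only: check_dir is a hypothetical helper, and it tests the two directories named above from the point of view of the current user.

```shell
#!/bin/sh
# Report whether a directory exists and is writable by the current user.
check_dir() {
    if [ ! -d "$1" ]; then
        echo "missing: $1"
        return 1
    elif [ ! -w "$1" ]; then
        echo "not writable: $1"
        return 1
    fi
    echo "ok: $1"
}

# status ends up as 1 if any directory failed the check.
status=0
for dir in /var/lib/cassandra /var/log/cassandra; do
    check_dir "$dir" || status=1
done
```

Run it as the same user that will start Cassandra, since write permission depends on who is asking.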
Now set Cassandra’s environment variables by running:
export CASSANDRA_HOME=~/cassandra
export PATH=$PATH:$CASSANDRA_HOME/bin
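
These exports last only for the current shell session. One option is to put them in a shell startup file such as ~/.bashrc; in that case it helps to avoid appending the same directory to PATH every time the file is read. A minimal sketch, where append_path is a hypothetical helper name:

```shell
#!/bin/sh
# Append a directory to a PATH-style string unless it is already present.
# $1: the current colon-separated list, $2: the directory to add.
append_path() {
    case ":$1:" in
        *":$2:"*) printf '%s\n' "$1" ;;      # already there, leave unchanged
        *)        printf '%s\n' "$1:$2" ;;   # not there yet, append
    esac
}

# Keep an existing CASSANDRA_HOME, otherwise default to ~/cassandra.
CASSANDRA_HOME=${CASSANDRA_HOME:-"$HOME/cassandra"}
PATH=$(append_path "$PATH" "$CASSANDRA_HOME/bin")
export CASSANDRA_HOME PATH
```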

Running Cassandra
To start Cassandra, open a terminal window, navigate to the Cassandra
home directory (where you unpacked Cassandra), and run
the following commands to start your Cassandra server.
$ cd $CASSANDRA_HOME
$ ./bin/cassandra -f
The -f option tells Cassandra to stay in the foreground instead of
running as a background process.

If everything goes well, you will see the Cassandra server starting up.

Then, in a second terminal (the first one is occupied by the foreground
process), run:
./bin/cassandra-cli
If it says "Connected to: 'Test Cluster'", your single-node cluster is
running.

Starting and Stopping Cassandra

Starting and stopping Cassandra as a service (requires root or sudo access):
                sudo service cassandra start
                sudo service cassandra stop


Cassandra operations in case of tarball installations:

                Starting Cassandra:        
                                $ cd <install_location>                  
                                $ bin/cassandra

                                - where <install_location> is the directory where you installed Cassandra
                                - bin/cassandra starts Cassandra as a background process
                                - bin/cassandra -f starts Cassandra in the foreground


                Stopping Cassandra:
                - As it is only a process and not a service, find its process ID and kill it,
                - where <pid> is the process ID reported by ps auwx | grep cassandra
                                $ ps auwx | grep cassandra
                                $ sudo kill <pid>
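
The manual steps above can be sketched as a small function that sends the default TERM signal and waits for the process to exit. stop_pid is a hypothetical helper name, and the usage line assumes pgrep is available and that the Cassandra JVM shows its main class, CassandraDaemon, on its command line.

```shell
#!/bin/sh
# Send TERM to a process and wait up to a number of seconds for it to exit.
# Returns 0 once the process is gone, 1 if the signal could not be sent,
# and 2 if the process is still running when the timeout expires.
stop_pid() {
    pid=$1
    timeout=${2:-30}
    kill "$pid" 2>/dev/null || return 1
    while [ "$timeout" -gt 0 ]; do
        # kill -0 sends no signal; it only checks that the process exists.
        kill -0 "$pid" 2>/dev/null || return 0
        sleep 1
        timeout=$((timeout - 1))
    done
    return 2
}

# Usage (not run automatically; prefix with sudo if Cassandra runs as another user):
#   stop_pid "$(pgrep -f CassandraDaemon)"
```

If pgrep prints more than one process ID, pick the right one by hand rather than passing them all.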

Best Practices:

                - You can check whether Cassandra is running by checking its port; if you get a result back,
                   it is running
                                $ lsof -i :9160
                - If you want to force-kill it, run kill -9 <pid>. This skips a clean shutdown, so prefer
                   the procedure below.

                - The procedure below is suggested because it minimizes the risk of something going wrong.

                - Another advantage of a clean Cassandra restart procedure is that it saves some startup time.

                - For a two-node cluster with nodes known as node01.hostserver.net and node02.hostserver.net,
                   run the following against the node being shut down (here node01):
                                 # nodetool -h node01.hostserver.net disablegossip         
                                 # nodetool -h node01.hostserver.net disablethrift
                                 # nodetool -h node01.hostserver.net drain


                disablegossip :  Stops gossip, which makes the node look "dead" to the other nodes.
                disablethrift :  Disables Cassandra's RPC server, so the node can no longer accept
                                 client requests.
                drain         :  Flushes column families. In other words, it converts Memtables into
                                 immutable SSTables, emptying the Commit Log in the process.
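
The three nodetool steps can be sketched as one function that runs them in order and stops at the first failure. prepare_shutdown and the NODETOOL variable are hypothetical names introduced for illustration; only the nodetool subcommands themselves come from the procedure above.

```shell
#!/bin/sh
# NODETOOL can be overridden, for example with the full path to bin/nodetool.
NODETOOL=${NODETOOL:-nodetool}

# Run the three shutdown-preparation steps against one node, in order,
# stopping at the first step that fails.
prepare_shutdown() {
    host=$1
    for step in disablegossip disablethrift drain; do
        "$NODETOOL" -h "$host" "$step" || {
            echo "nodetool $step failed on $host" >&2
            return 1
        }
    done
}

# Usage (not run automatically):
#   prepare_shutdown node01.hostserver.net
```

Once the function returns successfully, the Cassandra process on that node can be stopped and restarted.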