Apache Cassandra Quick Start

This article shows how to get started quickly with Cassandra development:
how to start and stop the Cassandra server, how to start the Cassandra Query Language (CQL) shell,
how to create your own schema, how to create an index on a table,
and where to find the server log files and how to set your own log file location.

What is Cassandra?

Cassandra is a NoSQL database that provides scalability, high availability, and fault tolerance
on commodity hardware, virtual systems, or cloud infrastructure. It is a distributed database designed for
high read and write throughput with no single point of failure, so a cluster can stay available when
individual nodes fail and can handle large enterprise workloads.

NoSQL technologies address the scalability problems of Big Data.
Many large-scale organizations have chosen Cassandra because of features such as column indexing,
log-structured updates, denormalization and materialized views, and built-in caching.

Cassandra's features include automatic replication to multiple nodes for fault tolerance, no single point
of failure because all cluster nodes are identical, a choice of synchronous or asynchronous replication for each update,
and read and write throughput that scales as nodes are added, without downtime or interruption.

Third-party contract support services for Apache Cassandra are also available.

How does Cassandra achieve fault tolerance?

Data is automatically replicated to multiple nodes for fault-tolerance.
Replication across multiple data centers is supported.
Failed nodes can be replaced with no downtime.

Steps to follow

1) Download the latest stable Cassandra release from the Apache Cassandra website.

(In my case, the file name is “apache-cassandra-2.1.6-bin.tar”.) Extract it to a folder (C:\apache-cassandra-2.1.6 in my case).

2) Cassandra uses three directories, which are configured in conf/cassandra.yaml:

(a) data_file_directories – C:/apache-cassandra-2.1.6/data/
(b) commitlog_directory – C:/apache-cassandra-2.1.6/commitlog/
(c) saved_caches_directory – C:/apache-cassandra-2.1.6/saved_caches/

Create these three directories and make sure each one exists and is writable.
(You can right-click a directory and check its permissions.)
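
The directory setup above can also be scripted. The following is a minimal Python sketch, not part of the Cassandra distribution: the function name ensure_cassandra_dirs is hypothetical, and the base folder you pass in is assumed to be your own install location.

```python
import os
from pathlib import Path

# Directory names taken from the list above; the base folder is supplied by the caller.
CASSANDRA_DIRS = ("data", "commitlog", "saved_caches")


def ensure_cassandra_dirs(base):
    """Create each directory under base if missing and report whether it is writable."""
    status = {}
    for name in CASSANDRA_DIRS:
        path = Path(base) / name
        # exist_ok=True makes the call safe to repeat on an existing install.
        path.mkdir(parents=True, exist_ok=True)
        status[name] = os.access(path, os.W_OK)
    return status
```

Calling ensure_cassandra_dirs(r"C:\apache-cassandra-2.1.6") returns a dictionary with one entry per directory; any entry that is False points to a permissions problem to fix before starting the server.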

3) Start Cassandra server

The command to start the Cassandra server in the foreground is:
C:\apache-cassandra-2.1.6\bin>cassandra.bat -f

Press Ctrl+C to stop Cassandra.

4) Using cqlsh

cqlsh is an interactive command line interface for Cassandra.
cqlsh allows you to execute CQL (Cassandra Query Language) statements against Cassandra.
Using CQL, you can define a schema, insert data, execute queries.

To connect to your local Cassandra instance, run cqlsh:
C:\apache-cassandra-2.1.6\bin>cqlsh.bat

How to create a user-defined schema with tables?

1) Create a KEYSPACE (similar to a schema or namespace in a relational database).
Because the name is not quoted, Cassandra stores it in lowercase as devjavasource.

cqlsh> CREATE KEYSPACE devJavaSource
WITH REPLICATION = { 'class' : 'SimpleStrategy', 'replication_factor' : 1 };

2) Use the keyspace you created (devJavaSource).

cqlsh> USE devJavaSource;

3) Create a table named USERS in the keyspace devJavaSource, with the columns ID, NAME, and ADDRESS.

cqlsh> CREATE TABLE USERS (
ID int PRIMARY KEY,
NAME text,
ADDRESS text
);

4) Insert data into the USERS table.

cqlsh> INSERT INTO USERS (ID, NAME, ADDRESS) VALUES (11101, 'john', 'Oakland');
cqlsh> INSERT INTO USERS (ID, NAME, ADDRESS) VALUES (11102, 'smith', 'California');
cqlsh> INSERT INTO USERS (ID, NAME, ADDRESS) VALUES (11103, 'Joe', 'Nederland');

5) Retrieve the data from the USERS table:

cqlsh> SELECT * FROM USERS;

 id    | address    | name
-------+------------+-------
 11103 |  Nederland |   Joe
 11102 | California | smith
 11101 |    Oakland |  john 

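How to create an index on a table?

The syntax to create an index is shown below. Parts in square brackets are optional.

CREATE [CUSTOM] INDEX [index_name] ON keyspace_name.table_name ( column_name )
[USING class_name] [WITH OPTIONS = map]

For example, to index the name column of the users table:

cqlsh:devjavasource> create index on users(name);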

Cassandra default logging :

By default, Cassandra writes its log to C:\apache-cassandra-2.1.6\logs\system.log.

If required, you can change this by editing the logging configuration and setting your own log file path.
Cassandra 2.1 uses Logback, configured in “C:\apache-cassandra-2.1.6\conf\logback.xml”;
releases before 2.1 used “conf\log4j-server.properties” instead.

Basic cqlsh commands :
1) cqlsh – Starts the CQL interactive terminal.
The syntax of this command is:

$ cqlsh [options] [host [port]]
$ python cqlsh [options] [host [port]]

Make sure your Cassandra server is running before you start the CQL interactive terminal.

If you start the CQL interactive terminal without starting the Cassandra server,
you will get the error “Unable to connect to any servers”.
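
One way to check whether the server is accepting connections before launching cqlsh is to try a plain TCP connection. The sketch below is only an illustration: the function name cassandra_is_reachable is hypothetical, and it assumes the server listens on 127.0.0.1 with the default native transport port 9042. Adjust both if your cassandra.yaml differs.

```python
import socket


def cassandra_is_reachable(host="127.0.0.1", port=9042, timeout=2.0):
    """Return True if a TCP connection to host:port can be opened.

    This only shows that something is listening on the port; it does not
    verify that the listener is actually Cassandra.
    """
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, timed out, or host unreachable.
        return False
```

If the function returns False, start the server with cassandra.bat -f first and try again.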

2) ASSUME – Treats a column name or value as a specified type, even if that type information
is not specified in the table’s metadata.

3) CAPTURE – Captures command output and appends it to a file.

 CAPTURE ('<file>' | OFF ) 

To start capturing, run CAPTURE '<file>'; to stop, run CAPTURE OFF.
[Screenshots: a CAPTURE session in cqlsh and the resulting output file]

4) CONSISTENCY – Shows the current consistency level, or given a level, sets it.
Syntax is,

 CONSISTENCY level 

5) COPY – Imports and exports CSV (comma-separated values) data to and from Cassandra 1.1.3
and higher. The syntax for importing is:

 COPY table_name ( column, ...)
 FROM ( 'file_name' | STDIN )
 WITH option = 'value' AND ...

To export, use TO ( 'file_name' | STDOUT ) in place of the FROM clause.
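
To try an import, you need a CSV file whose columns match the column list in the COPY statement. The following Python sketch prepares such a file and builds the matching statement as a string to paste into cqlsh. The helper names and the sample row are hypothetical, and it assumes the COPY defaults of a comma delimiter and no header row.

```python
import csv


def write_users_csv(path, rows):
    """Write (id, name, address) rows as comma-separated lines with no header row."""
    with open(path, "w", newline="") as handle:
        csv.writer(handle).writerows(rows)


def copy_from_statement(table, columns, path):
    """Build the cqlsh COPY ... FROM statement for the given table, columns, and file."""
    return "COPY {} ({}) FROM '{}';".format(table, ", ".join(columns), path)
```

For example, after write_users_csv("users.csv", [(11104, "anna", "Austin")]), the statement from copy_from_statement("users", ["id", "name", "address"], "users.csv") can be run at the cqlsh:devjavasource> prompt.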

[Screenshots: a COPY session in cqlsh and the file exported from Cassandra]

6) DESCRIBE – Provides information about the connected Cassandra cluster, or about the data objects stored
in the cluster.

DESCRIBE ( CLUSTER | SCHEMA ) 
| KEYSPACES
| ( KEYSPACE keyspace_name )
| TABLES
| ( TABLE table_name )

7) EXIT – Terminates cqlsh.
Syntax is,

 EXIT | QUIT 

8) SHOW – Shows the Cassandra version, host, or data type assumptions for the current cqlsh client session.

 SHOW VERSION | HOST | ASSUMPTIONS

9) SOURCE – Executes a file containing CQL statements.
Syntax is,

 SOURCE 'file' 

[Screenshots: the source file and its use with SOURCE]

10) TRACING – Enables or disables request tracing.
Syntax is,

 TRACING ( ON | OFF ) 

For more cqlsh commands, refer to the cqlsh Commands reference.

Troubleshooting :

When you run C:\apache-cassandra-2.1.6\bin>cqlsh.bat,
you may sometimes get the error “can’t detect python version cqlsh”.

This error indicates that Python is not installed on your machine.
To fix it, follow these steps:

1) Download & Install Python 2.7.x

2) Add the “C:\Python27” directory to the Path environment variable if it is not already there.

3) Run the command “python setup.py install” in the “C:\apache-cassandra-2.1.6\pylib” directory to install the cqlsh Python library.

4) Done.

If cqlsh now starts, you are done. Otherwise, create an environment variable CASSANDRA_HOME
with the value “C:\apache-cassandra-2.1.6”.
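
The two environment settings mentioned above can be checked with a short script. This is a rough Python sketch under stated assumptions: the function name check_cqlsh_prerequisites is hypothetical, and it only looks for a python executable on the Path and reads CASSANDRA_HOME; it does not check which Python version is installed.

```python
import os
import shutil


def check_cqlsh_prerequisites(environ=None):
    """Report whether python is on the Path and what CASSANDRA_HOME is set to.

    environ defaults to the current process environment; passing a dict
    lets you inspect a different set of values.
    """
    environ = os.environ if environ is None else environ
    python_path = shutil.which("python", path=environ.get("PATH"))
    return {
        "python_on_path": python_path is not None,
        # None means the variable is not set.
        "cassandra_home": environ.get("CASSANDRA_HOME"),
    }
```

A result with python_on_path set to False points to step 2 above, and a cassandra_home of None means the CASSANDRA_HOME variable still needs to be created.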

*** Venkat – Happy learning ****