Retrieve Data from Cassandra Using the Hector API

In this article, I will show how to retrieve data from Cassandra with the Hector API.
A simple use case introduces the basic Cassandra terminology: “Column Family” and “Row Key”.
We then write a simple Java program that retrieves data from a Cassandra database.

Tools Used:

1) Apache-cassandra-2.1.6
2) Eclipse Luna 4.4.1
3) Maven 3.3.3
4) JDK 1.6 or above

Simple Use Case:

The use case is to search for a property based on different search criteria:
City, State, Owner and Description.

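The available properties are:

Row Key | City | State | Owner | Description
P-1     | C1   | S1    | O1    | 2BHK, 1200 sq ft
P-2     | C2   | S2    | O2    | 2BHK, 1000 sq ft
P-3     | C3   | S1    | O3    | 1BHK, 800 sq ft
P-4     | C4   | S3    | O4    | 1BHK, 750 sq ft
P-5     | C3   | S4    | O1    | 3BHK, 1450 sq ft
P-6     | C3   | S1    | O5    | 2BHK, 950 sq ft
P-7     | C1   | S2    | O5    | 3BHK, 1750 sq ft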

Here, Property is a column family. For understanding, we can think of a
column family as something like a table in a traditional RDBMS.

In Cassandra, the data in a column family is stored as key-value pairs.
Here, the keys of the individual properties are P-1, P-2, P-3, P-4, P-5, P-6
and P-7 (we call each of these keys a “Row Key”). Each key maps to a value, which is
a list of columns (City, State, Owner and Description).

Each column is in turn a key-value pair: the column name is the key and it maps to the column value.
For row P-1, the keys are City, State, Owner and Description, and the corresponding values are
C1, S1, O1 and “2BHK, 1200 sq ft”.
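
The two levels of key-value pairs described above can be pictured as a map of maps. The following is a minimal plain-Java sketch of that idea only. It does not talk to Cassandra, the class name ColumnFamilyModel is made up for illustration, and it assumes Java 8 or later.

```java
import java.util.LinkedHashMap;
import java.util.Map;

// Illustrative in-memory model of a column family:
// row key -> (column name -> column value). Not a Cassandra client.
class ColumnFamilyModel {
    // Outer key is the row key, inner key is the column name.
    private final Map<String, Map<String, String>> rows = new LinkedHashMap<>();

    // Stores one column value under the given row key.
    void put(String rowKey, String columnName, String value) {
        rows.computeIfAbsent(rowKey, k -> new LinkedHashMap<>()).put(columnName, value);
    }

    // Returns the column value, or null if the row or the column does not exist.
    String get(String rowKey, String columnName) {
        Map<String, String> columns = rows.get(rowKey);
        return columns == null ? null : columns.get(columnName);
    }

    // Builds the first two rows of the sample "property" data from this article.
    static ColumnFamilyModel sampleProperty() {
        ColumnFamilyModel property = new ColumnFamilyModel();
        property.put("P-1", "City", "C1");
        property.put("P-1", "State", "S1");
        property.put("P-1", "Owner", "O1");
        property.put("P-1", "Description", "2BHK, 1200 sq ft");
        property.put("P-2", "City", "C2");
        property.put("P-2", "State", "S2");
        property.put("P-2", "Owner", "O2");
        property.put("P-2", "Description", "2BHK, 1000 sq ft");
        return property;
    }

    public static void main(String[] args) {
        ColumnFamilyModel property = sampleProperty();
        // Look up one column of one row: row key first, then column name.
        System.out.println(property.get("P-1", "City")); // prints C1
    }
}
```

Reading a single value therefore always needs two pieces of information, the row key and the column name, which is what the column query later in this article supplies.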

Steps to write a Java program that retrieves data from a Cassandra database:

1) Create a simple maven project.

2) Add the dependencies

3) Write a simple program to retrieve data from Cassandra Database.

4) Start the Cassandra server.

5) Run the program and verify the retrieved data.

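Add the following dependency for the Hector API to pom.xml:

	<dependency>
		<groupId>me.prettyprint</groupId>
		<artifactId>hector-core</artifactId>
		<version>0.8.0-2</version>
	</dependency>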

Write a simple program to retrieve data from Cassandra:

This program will do the following:

1) Create a Cluster object:

Cluster cluster = HFactory.getOrCreateCluster( "Cassandra DB Operations Cluster", "localhost:9160" );

HFactory is Hector's convenience class with a bunch of static methods.
getOrCreateCluster() is a static method that returns a Cluster instance representing an
existing Cassandra cluster.

If another class has already called getOrCreateCluster() with the same name, the factory returns the cached instance.
If the instance doesn’t exist in memory, a new ThriftCluster is created and cached.

This method expects two parameters:

(a) clusterName: This should be a unique name (we should not have two clusters with the same name),
because it is used as the key under which the Cluster object is stored in the map of clusters.

(b) hostIp: The host and port to connect to. Internally, the method creates a CassandraHostConfigurator
instance from this value and passes it on when the cluster is created.
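
The caching behaviour described above is a get-or-create lookup keyed by the cluster name. The sketch below shows that pattern in plain Java. It is not Hector's actual implementation: ClusterHandle and ClusterRegistry are hypothetical names, and the code assumes Java 8 or later.

```java
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical stand-in for a cluster object; not Hector's Cluster type.
class ClusterHandle {
    final String name;
    final String hosts;

    ClusterHandle(String name, String hosts) {
        this.name = name;
        this.hosts = hosts;
    }
}

// Illustrative registry that caches one handle per cluster name.
class ClusterRegistry {
    private static final Map<String, ClusterHandle> CLUSTERS = new ConcurrentHashMap<>();

    // Returns the cached handle for clusterName, creating it on first use.
    // The hosts argument is only used when the handle is first created.
    static ClusterHandle getOrCreate(String clusterName, String hosts) {
        return CLUSTERS.computeIfAbsent(clusterName, n -> new ClusterHandle(n, hosts));
    }

    public static void main(String[] args) {
        ClusterHandle first = getOrCreate("demo", "localhost:9160");
        ClusterHandle second = getOrCreate("demo", "localhost:9160");
        System.out.println(first == second); // prints true
    }
}
```

This also shows why the name should be unique: in this sketch, a second call with the same name but a different host string silently gets the first handle back.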

2) Create a Keyspace object for an existing keyspace:

    Keyspace keySpace = HFactory.createKeyspace("devjavasource", cluster);

Here “devjavasource” is a keyspace that already exists, and I am reusing it.
Despite its name, the createKeyspace() static method in the HFactory class does not create a keyspace
on the server. It returns a client-side Keyspace object that uses Hector's default consistency level
policy and the default failover policy (ON_FAIL_TRY_ALL_AVAILABLE).

The failover policy answers the following question:

What should the client do if a call to a Cassandra node fails
and we suspect that the node is down
(i.e. it is a communication error, not an application error)?

There are three failover policies:

(a) FAIL_FAST : On communication failure, just return the error to the client and don’t retry.

(b) ON_FAIL_TRY_ONE_NEXT_AVAILABLE : On communication error, try one more server before giving up.
If that second node also fails, the error is returned to the client.

(c) ON_FAIL_TRY_ALL_AVAILABLE : On communication error, try all known servers before giving up.
With this policy the client sees a communication failure only if all known nodes are unreachable,
so a request can still succeed while individual nodes are down.
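
To make the difference between the three policies concrete, here is a small plain-Java simulation. It is a sketch under stated assumptions, not Hector's code: the enum below only borrows the three policy names, the maxRetries values are my own modelling of "no retry", "one more server" and "all servers", and the host names are invented. It assumes Java 8 or later.

```java
import java.util.Arrays;
import java.util.List;
import java.util.function.Predicate;

class FailoverSketch {
    // Hypothetical model: how many additional hosts may be tried after the first failure.
    enum Policy {
        FAIL_FAST(0),
        ON_FAIL_TRY_ONE_NEXT_AVAILABLE(1),
        ON_FAIL_TRY_ALL_AVAILABLE(Integer.MAX_VALUE);

        final int maxRetries;

        Policy(int maxRetries) {
            this.maxRetries = maxRetries;
        }
    }

    // Tries hosts in order and returns the first one that is up,
    // or null once the policy's attempts (or the host list) are exhausted.
    static String firstReachable(List<String> hosts, Predicate<String> isUp, Policy policy) {
        int attempts = 0;
        for (String host : hosts) {
            if (attempts > policy.maxRetries) {
                break; // the policy does not allow another attempt
            }
            attempts++;
            if (isUp.test(host)) {
                return host;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("node1", "node2", "node3");
        Predicate<String> onlyNode3Up = h -> h.equals("node3");
        for (Policy policy : Policy.values()) {
            System.out.println(policy + " -> " + firstReachable(hosts, onlyNode3Up, policy));
        }
    }
}
```

With only the third node up, the first two policies give up (null) and only ON_FAIL_TRY_ALL_AVAILABLE reaches node3.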

3) Create a column query:

We can obtain a ColumnQuery object by calling createStringColumnQuery(), a static method
in the HFactory class that expects the Keyspace instance as its parameter. The query then needs
the column family, the row key (setKey) and the column name (setName):

final ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keySpace);
columnQuery.setColumnFamily("property");
columnQuery.setKey("P-1");    // row key
columnQuery.setName("City");  // column name
QueryResult<HColumn<String, String>> result = columnQuery.execute();
System.out.println("Column Key: " + result.get().getName()
			+ " , Column Value: " + result.get().getValue());

What is a ColumnQuery?

A ColumnQuery is used for querying the value of a single standard column.

Two related query types are worth knowing:

(a) SuperColumnQuery: used for querying a single, entire
super column from a super column family.

(b) SubColumnQuery: used to get the value of a sub-column within a super column.

The generic signature of ColumnQuery is:

ColumnQuery<K, N, V>

Here, K is the type of the row key (in my use case the keys are P-1, P-2, P-3 … P-7),
N is the type of the column name (in my use case City, State, Owner and Description) and
V is the type of the corresponding column value.

In my use case all three are Strings, which is why I created the query as:

final ColumnQuery<String, String, String> columnQuery =
              HFactory.createStringColumnQuery(keySpace);

Another way of creating a column query is to pass the serializers for the key, name and value explicitly:

static StringSerializer SE = StringSerializer.get();

final ColumnQuery<String, String, String> columnQuery =
                    HFactory.createColumnQuery(keySpace, SE, SE, SE);
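
Cassandra stores keys, column names and values as bytes, so a serializer is needed to convert between Java types and byte buffers in both directions. The plain-Java sketch below illustrates that round trip for Strings. It is not Hector's StringSerializer: the class name Utf8StringCodec is hypothetical and the UTF-8 encoding is an assumption of this sketch.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

// Illustrative String <-> ByteBuffer conversion, assuming UTF-8 encoding.
class Utf8StringCodec {
    // Encodes a String into a ByteBuffer positioned at the start of the data.
    static ByteBuffer toByteBuffer(String value) {
        return ByteBuffer.wrap(value.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes the remaining bytes of the buffer back into a String.
    // duplicate() is used so the caller's buffer position is left untouched.
    static String fromByteBuffer(ByteBuffer buffer) {
        return StandardCharsets.UTF_8.decode(buffer.duplicate()).toString();
    }

    public static void main(String[] args) {
        ByteBuffer raw = toByteBuffer("P-1");
        System.out.println(raw.remaining());      // prints 3
        System.out.println(fromByteBuffer(raw));  // prints P-1
    }
}
```

The three serializer arguments to createColumnQuery correspond to the three type parameters K, N and V, one converter for each.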

The complete source code is here:
App.java

package com.devjavasource.cassandra.CassandraDbService;

import java.util.Arrays;
import java.util.List;

import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.ColumnQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class App {
	// Row keys and column names to read from the "property" column family
	private static final List<String> KEYS = Arrays.asList("P-1", "P-2", "P-3", "P-4", "P-5", "P-6", "P-7");
	private static final List<String> COLUMNS = Arrays.asList("City", "State", "Owner", "Description");

	public static void main(String[] args) {
		Cluster cluster = null;
		try {
			cluster = HFactory.getOrCreateCluster("production", "localhost:9160");
			// The keyspace "devjavasource" must already exist in Cassandra;
			// if it does not, create it first.
			Keyspace keySpace = HFactory.createKeyspace("devjavasource", cluster);

			// Create a column query for String keys, names and values
			final ColumnQuery<String, String, String> columnQuery = HFactory.createStringColumnQuery(keySpace);
			columnQuery.setColumnFamily("property");

			System.out.println("Retrive Data from Cassandra Database with Hector Api ...");
			System.out.println("========================================================");

			// For every row key, read each column with a separate query
			for (String key : KEYS) {
				System.out.println("KEY Value is: " + key);
				columnQuery.setKey(key);
				for (String col : COLUMNS) {
					columnQuery.setName(col);
					QueryResult<HColumn<String, String>> result = columnQuery.execute();
					HColumn<String, String> column = result.get();
					if (column == null) {
						// Guard against a missing column instead of failing with a NullPointerException
						System.out.println("Column Key: " + col + " , Column Value: (not found)");
						continue;
					}
					System.out.println("Column Key: " + column.getName()
							+ " , Column Value: " + column.getValue());
				}
				System.out.println("========================================================");
			}
		} catch (Exception exp) {
			exp.printStackTrace();
		} finally {
			// cluster is still null if getOrCreateCluster() itself failed
			if (cluster != null) {
				cluster.getConnectionManager().shutdown();
			}
		}
	}
}

4) Start the Cassandra server:

The Cassandra server should be up and running.
If the server is not running, start it using the following command.

The command to start the Cassandra server is:
C:\apache-cassandra-2.1.6\bin>cassandra.bat -f

5) Run the Maven project:

Select App.java and choose Run As -> Java Application.

Output:

Retrive Data from Cassandra Database with Hector Api ...
========================================================
KEY Value is: P-1
Column Key: City , Column Value: C1
Column Key: State , Column Value: S1
Column Key: Owner , Column Value: O1
Column Key: Description , Column Value: 2BHK, 1200 sq ft
========================================================
KEY Value is: P-2
Column Key: City , Column Value: C2
Column Key: State , Column Value: S2
Column Key: Owner , Column Value: O2
Column Key: Description , Column Value: 2BHK, 1000 sq ft
========================================================
KEY Value is: P-3
Column Key: City , Column Value: C3
Column Key: State , Column Value: S1
Column Key: Owner , Column Value: O3
Column Key: Description , Column Value: 1BHK, 800 sq ft
========================================================
KEY Value is: P-4
Column Key: City , Column Value: C4
Column Key: State , Column Value: S3
Column Key: Owner , Column Value: O4
Column Key: Description , Column Value: 1BHK, 750 sq ft
========================================================
KEY Value is: P-5
Column Key: City , Column Value: C3
Column Key: State , Column Value: S4
Column Key: Owner , Column Value: O1
Column Key: Description , Column Value: 3BHK, 1450 sq ft
========================================================
KEY Value is: P-6
Column Key: City , Column Value: C3
Column Key: State , Column Value: S1
Column Key: Owner , Column Value: O5
Column Key: Description , Column Value: 2BHK, 950 sq ft
========================================================
KEY Value is: P-7
Column Key: City , Column Value: C1
Column Key: State , Column Value: S2
Column Key: Owner , Column Value: O5
Column Key: Description , Column Value: 3BHK, 1750 sq ft
========================================================
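
The use case in this article is searching properties by criteria such as City, State or Owner. Once the rows have been read as shown in the output above, one simple option is to filter them on the client. The sketch below does this in plain Java over an in-memory copy of the sample data. It does not call Cassandra or Hector, the class and method names are hypothetical, and client-side filtering is an assumption of this sketch rather than the approach the article prescribes.

```java
import java.util.ArrayList;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Illustrative client-side search over rows that were already retrieved.
class PropertySearch {
    // Builds one row as column name -> column value.
    private static Map<String, String> row(String city, String state, String owner) {
        Map<String, String> columns = new LinkedHashMap<>();
        columns.put("City", city);
        columns.put("State", state);
        columns.put("Owner", owner);
        return columns;
    }

    // In-memory copy of the City, State and Owner columns from the output above.
    static Map<String, Map<String, String>> sampleRows() {
        Map<String, Map<String, String>> rows = new LinkedHashMap<>();
        rows.put("P-1", row("C1", "S1", "O1"));
        rows.put("P-2", row("C2", "S2", "O2"));
        rows.put("P-3", row("C3", "S1", "O3"));
        rows.put("P-4", row("C4", "S3", "O4"));
        rows.put("P-5", row("C3", "S4", "O1"));
        rows.put("P-6", row("C3", "S1", "O5"));
        rows.put("P-7", row("C1", "S2", "O5"));
        return rows;
    }

    // Returns the row keys whose given column has exactly the given value.
    static List<String> findKeys(Map<String, Map<String, String>> rows, String column, String value) {
        List<String> matches = new ArrayList<>();
        for (Map.Entry<String, Map<String, String>> entry : rows.entrySet()) {
            if (value.equals(entry.getValue().get(column))) {
                matches.add(entry.getKey());
            }
        }
        return matches;
    }

    public static void main(String[] args) {
        System.out.println(findKeys(sampleRows(), "City", "C3")); // prints [P-3, P-5, P-6]
    }
}
```

Filtering like this reads every row first, so it is only reasonable for small data sets such as this example.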

You can download the complete project here:

CassandraDbService

*** Venkat - Happy learning ***