SliceQuery and MultigetSliceQuery in Hector API

In this article, I will show how to use the SliceQuery and MultigetSliceQuery interfaces to retrieve
data from Cassandra. The article covers:

1) A simple use case that helps to understand the basic Cassandra terms “Column Family” and “Row Key”.
2) When and how to use SliceQuery and MultigetSliceQuery.
3) A simple Java program that retrieves data with the SliceQuery and MultigetSliceQuery interfaces.
4) The importance of the setRange() method of the MultigetSliceQuery interface.

Tools Used :

1) Apache Cassandra 2.1.6
2) Eclipse Luna 4.4.1
3) Maven 3.3.3
4) JDK 1.6 or above

Simple Use Case :

The use case is searching for a property by different search criteria:
City, State, Owner and Description.

The list of available properties is:

[Table of available properties; image not included]

Here Property is a column family. For understanding, we can think of a
column family as a table in a traditional RDBMS.

In a NoSQL store like Cassandra, the data is stored as key-value pairs.
Here the keys of the individual properties are P-1, P-2, P-3, P-4, P-5, P-6
and P-7 (we call these keys “Row Keys”). Each key maps to a value, which is a list
of columns (City, State, Owner and Description).

[Figure: a row key mapped to its columns; image not included]

Each column is again a key-value pair: the column name is the key and it maps to a value.
Here the keys are City, State, Owner and Description, and the corresponding values for P-1 are
C1, S1, O1 and “2BHK, 1200 sq ft”.
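
The row-key-to-columns structure described above can be pictured as a map of maps. The following is only an in-memory illustration in plain Java, not how Cassandra stores data internally; the class name PropertyModel and its method are made up for this sketch, and the P-1 values are the ones used in this article.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Hypothetical in-memory picture of a column family:
// row key -> (column name -> column value).
final class PropertyModel {

    // Builds a "property" column family holding only the P-1 row.
    static Map<String, SortedMap<String, String>> sampleColumnFamily() {
        SortedMap<String, String> p1 = new TreeMap<String, String>();
        p1.put("City", "C1");
        p1.put("State", "S1");
        p1.put("Owner", "O1");
        p1.put("Description", "2BHK, 1200 sq ft");

        Map<String, SortedMap<String, String>> columnFamily =
                new TreeMap<String, SortedMap<String, String>>();
        columnFamily.put("P-1", p1);
        return columnFamily;
    }

    public static void main(String[] args) {
        Map<String, SortedMap<String, String>> property = sampleColumnFamily();
        // Two lookups: first the row key, then the column name.
        System.out.println(property.get("P-1").get("City"));
        System.out.println(property.get("P-1").get("Description"));
    }
}
```

Reading a value is always a two-step lookup: the row key selects the row, and the column name selects the value inside that row.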

When and How to Use SliceQuery to retrieve data from Cassandra?

A user can search the Cassandra database in two ways:

1) Get all details (“City”, “State”, “Owner”, “Description”) of any one property.
2) Get limited details of any one property.

In both cases, we can use SliceQuery to pull the data from the database.

Use Case 1:
The user queries the property with key P-1 and wants all the details:
City, State, Owner and Description.

// Create the SliceQuery (SE is StringSerializer.get()) and set the column family
final SliceQuery<String, String, String> sliceQuery = HFactory
		.createSliceQuery(keySpace, SE, SE, SE);
sliceQuery.setColumnFamily("property");

// Select one property by its row key and request all four columns.
sliceQuery.setKey("P-1");
sliceQuery.setColumnNames("City", "State", "Owner", "Description");

QueryResult<ColumnSlice<String, String>> result = sliceQuery.execute();

Use Case 2:

The user queries the property with key P-1 and wants any combination of one or more
details such as City, State, Owner and Description.

In the example below, the user queries only the City and State of the property with key P-1.

// Create the SliceQuery (SE is StringSerializer.get()) and set the column family
final SliceQuery<String, String, String> sliceQuery = HFactory
		.createSliceQuery(keySpace, SE, SE, SE);
sliceQuery.setColumnFamily("property");

// Select one property by its row key and request only two of its columns.
sliceQuery.setKey("P-1");
sliceQuery.setColumnNames("City", "State");

QueryResult<ColumnSlice<String, String>> result = sliceQuery.execute();

The result of a SliceQuery is a QueryResult that wraps a ColumnSlice object.
A ColumnSlice represents a set of columns.
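
One detail that is easy to miss: in the program output later in this article, the columns come back as City, Description, Owner, State, not in the order they were passed to setColumnNames(). Here is a plain-Java sketch of that behaviour, assuming column names compare as ordinary strings. NamedSliceSketch and sliceByNames are illustrative names, not part of the Hector API.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: a row is a sorted map, a slice is a filtered view of it.
final class NamedSliceSketch {

    // Returns the names of the requested columns that exist in the row,
    // in the row's sorted order rather than the order of the arguments.
    static List<String> sliceByNames(SortedMap<String, String> row, String... names) {
        List<String> wanted = Arrays.asList(names);
        List<String> slice = new ArrayList<String>();
        for (Map.Entry<String, String> column : row.entrySet()) {
            if (wanted.contains(column.getKey())) {
                slice.add(column.getKey());
            }
        }
        return slice;
    }

    // The P-1 row with the values used in this article.
    static SortedMap<String, String> sampleRow() {
        SortedMap<String, String> row = new TreeMap<String, String>();
        row.put("City", "C1");
        row.put("State", "S1");
        row.put("Owner", "O1");
        row.put("Description", "2BHK, 1200 sq ft");
        return row;
    }

    public static void main(String[] args) {
        SortedMap<String, String> p1 = sampleRow();
        System.out.println(sliceByNames(p1, "City", "State", "Owner", "Description"));
        System.out.println(sliceByNames(p1, "State", "City"));
    }
}
```

Code that reads a ColumnSlice should therefore look columns up by name rather than rely on the position they were requested in.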

When and How to Use MultigetSliceQuery to retrieve data from Cassandra?

With a multiget slice query, we can set multiple keys on a single query.

      multigetSliceQuery.setKeys("P-1", "P-2","P-3", "P-4","P-5", "P-6","P-7");

We can consider each key as a single slice.

There are many real-world examples.
1) Take a shopping cart application in which the item key represents the price of the item:
P-1 is the item priced at $1 … P-10 is the item priced at $10.

Suppose the user asks for the descriptions of the items priced between $3 and $7.
In this case we need to set only the keys P-3…P-7 and only the column “Description”,
which holds the item description.

// Get only the Description column of the items with keys P-3 to P-7.
MultigetSliceQuery<String, String, String> multigetSliceQuery = 
	HFactory.createMultigetSliceQuery(keySpace, SE, SE, SE);
multigetSliceQuery.setColumnFamily("Items");
multigetSliceQuery.setKeys("P-3", "P-4", "P-5", "P-6", "P-7");
multigetSliceQuery.setColumnNames("Description");

2) In the same shopping cart example,
suppose the user searches by feature.
In that case the keys are F-1, F-2, F-3 … F-10.

Suppose the user asks for the price and description
of the items that have features F-2, F-4, F-6 and F-10.

// Get the Price and Description columns for the feature keys F-2, F-4, F-6 and F-10.
MultigetSliceQuery<String, String, String> multigetSliceQuery = 
	HFactory.createMultigetSliceQuery(keySpace, SE, SE, SE);
multigetSliceQuery.setColumnFamily("Features");
multigetSliceQuery.setKeys("F-2", "F-4", "F-6", "F-10");
multigetSliceQuery.setColumnNames("Price", "Description");
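
Conceptually, a multiget slice applies the same column selection to every requested row key. Here is a rough in-memory sketch of that idea in plain Java. MultigetSketch and its methods are hypothetical names, the sample rows reuse City and Description values printed later in this article, and returning an empty column set for an unknown key is a choice made for this sketch.

```java
import java.util.Arrays;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: one column selection applied to several row keys.
final class MultigetSketch {

    // For each requested key, keeps only the requested columns of that row.
    static Map<String, SortedMap<String, String>> multiget(
            Map<String, SortedMap<String, String>> columnFamily,
            List<String> keys, List<String> columnNames) {
        Map<String, SortedMap<String, String>> rows =
                new LinkedHashMap<String, SortedMap<String, String>>();
        for (String key : keys) {
            SortedMap<String, String> slice = new TreeMap<String, String>();
            SortedMap<String, String> row = columnFamily.get(key);
            if (row != null) {
                for (String name : columnNames) {
                    if (row.containsKey(name)) {
                        slice.put(name, row.get(name));
                    }
                }
            }
            rows.put(key, slice); // unknown keys get an empty slice in this sketch
        }
        return rows;
    }

    private static SortedMap<String, String> row(String city, String description) {
        SortedMap<String, String> columns = new TreeMap<String, String>();
        columns.put("City", city);
        columns.put("Description", description);
        return columns;
    }

    // Three sample rows of the "property" column family.
    static Map<String, SortedMap<String, String>> sampleColumnFamily() {
        Map<String, SortedMap<String, String>> cf =
                new TreeMap<String, SortedMap<String, String>>();
        cf.put("P-1", row("C1", "2BHK, 1200 sq ft"));
        cf.put("P-2", row("C2", "2BHK, 1000 sq ft"));
        cf.put("P-3", row("C3", "1BHK, 800 sq ft"));
        return cf;
    }

    public static void main(String[] args) {
        System.out.println(multiget(sampleColumnFamily(),
                Arrays.asList("P-1", "P-3"), Arrays.asList("Description")));
    }
}
```

The sketch keeps rows in the order the keys were requested only because it uses a LinkedHashMap; real results should not be assumed to arrive in any particular key order.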

Simple program to retrieve data using SliceQuery and MultigetSliceQuery :

The steps to follow are:

1) Create a Cluster object.

Cluster cluster = null;
cluster = HFactory.getOrCreateCluster( "Cassandra DB Operations Cluster", "localHost:9160" );

HFactory is Hector's convenience class with a bunch of static methods.
getOrCreateCluster() is a static method that returns a Cluster instance for an
existing Cassandra cluster.

If another class has already called getOrCreateCluster(), the factory returns the cached instance.
If the instance doesn't exist in memory, a new ThriftCluster is created and cached.

This method expects two parameters:
(a) clusterName – This should be a unique name (we should not have two clusters with the same name).
The name is used as the key under which the Cluster object is stored in the map of clusters.
(b) hostIp – From the given hostIp value, the method internally creates a CassandraHostConfigurator
instance and passes that on as the second parameter.
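
The get-or-create behaviour described above is a common caching pattern. Here is a minimal sketch of it in plain Java. This is not Hector's actual implementation: ClusterCacheSketch and FakeCluster are stand-in names invented for illustration, and ignoring the host argument on a cache hit is an assumption of the sketch.

```java
import java.util.HashMap;
import java.util.Map;

// Illustrative only: a name-keyed cache of cluster handles.
final class ClusterCacheSketch {

    // Stand-in for a real cluster handle.
    static final class FakeCluster {
        final String name;
        final String hosts;

        FakeCluster(String name, String hosts) {
            this.name = name;
            this.hosts = hosts;
        }
    }

    private static final Map<String, FakeCluster> CLUSTERS =
            new HashMap<String, FakeCluster>();

    // Returns the cached handle for clusterName, creating it on first use.
    static synchronized FakeCluster getOrCreate(String clusterName, String hosts) {
        FakeCluster cluster = CLUSTERS.get(clusterName);
        if (cluster == null) {
            cluster = new FakeCluster(clusterName, hosts);
            CLUSTERS.put(clusterName, cluster);
        }
        return cluster;
    }

    public static void main(String[] args) {
        FakeCluster first = getOrCreate("production", "localhost:9160");
        FakeCluster second = getOrCreate("production", "localhost:9160");
        // Both calls hand back the same object.
        System.out.println(first == second);
    }
}
```

Because the name is the cache key, it is worth picking one cluster name per Cassandra cluster and using it consistently across the application.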

2) Create or use an existing keyspace

    Keyspace keySpace = HFactory.createKeyspace("devjavasource", cluster);

Here “devjavasource” is an existing keyspace, which I am reusing.
The static createKeyspace() method of the HFactory class creates a Keyspace object with the default
consistency level policy and the default failover policy (ON_FAIL_TRY_ALL_AVAILABLE).

The failover policy (not to be confused with a Cassandra consistency level) answers this question:

What should the client do if a call to a Cassandra node fails
and we suspect that the node is down
(e.g. it’s a communication error, not an application error)?

There are three failover policies:

(a) FAIL_FAST – On communication failure, just return the error to the client and don’t retry.

(b) ON_FAIL_TRY_ONE_NEXT_AVAILABLE – On communication error, try one more server before giving up.

(c) ON_FAIL_TRY_ALL_AVAILABLE – On communication error, try all known servers before giving up.
With this policy the client gets a communication failure only if all nodes are down.
This lets the application keep working while individual Cassandra nodes are unavailable.
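
The three policies listed above differ only in how many further servers are tried after the first failure. Here is a simplified sketch of that idea in plain Java. It is not Hector's code: FailoverSketch, Policy and firstResponsiveHost are hypothetical names, and a "down" host is simulated with a set of names.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

// Illustrative only: retry budgets for the three policies.
final class FailoverSketch {

    enum Policy {
        FAIL_FAST(0),
        ON_FAIL_TRY_ONE_NEXT_AVAILABLE(1),
        ON_FAIL_TRY_ALL_AVAILABLE(Integer.MAX_VALUE);

        final int retries;

        Policy(int retries) {
            this.retries = retries;
        }
    }

    // Returns the first host that answers, trying at most 1 + policy.retries hosts.
    static String firstResponsiveHost(List<String> hosts, Set<String> downHosts,
            Policy policy) {
        int attempts = 0;
        for (String host : hosts) {
            if (attempts > policy.retries) {
                break; // retry budget used up
            }
            attempts++;
            if (!downHosts.contains(host)) {
                return host;
            }
        }
        throw new IllegalStateException(
                "no host answered after " + attempts + " attempt(s)");
    }

    public static void main(String[] args) {
        List<String> hosts = Arrays.asList("h1", "h2", "h3");
        Set<String> down = new HashSet<String>(Arrays.asList("h1"));
        System.out.println(firstResponsiveHost(hosts, down,
                Policy.ON_FAIL_TRY_ONE_NEXT_AVAILABLE));
    }
}
```

With h1 and h2 both down, only ON_FAIL_TRY_ALL_AVAILABLE would reach h3 in this sketch; the other two policies give up earlier.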

3) Create a SliceQuery or MultigetSliceQuery query object:

The code to create a SliceQuery is:

// Create the SliceQuery and set the required key and columns.
final SliceQuery<String, String, String> sliceQuery = HFactory
		.createSliceQuery(keySpace, SE, SE, SE);
sliceQuery.setColumnFamily("property");
// Any one of the keys "P-1", "P-2", "P-3", "P-4", "P-5", "P-6", "P-7"
sliceQuery.setKey("any_key");
// One or more columns, depending on the requirement
sliceQuery.setColumnNames("City", "State", "Owner", "Description");
result = sliceQuery.execute();

The code to create a MultigetSliceQuery is:

// Get the first two columns in the range City..State for all seven properties.
MultigetSliceQuery<String, String, String> multigetSliceQuery = 
	HFactory.createMultigetSliceQuery(keySpace, SE, SE, SE);
multigetSliceQuery.setColumnFamily("property");            
multigetSliceQuery.setKeys("P-1", "P-2","P-3", "P-4","P-5", "P-6","P-7");
multigetSliceQuery.setRange("City", "State", false, 2);
//multigetSliceQuery.setColumnNames("City", "State");

The columns in each row of a MultigetSliceQuery result are always ordered by column name.
In our case the column names are City, State, Owner and Description.
After ordering, the columns are City, Description, Owner and State.
The rows themselves do not come back in key order, as the program output shows.

The complete source code is here:
App.java

package com.devjavasource.cassandra.CassandraDbService;

import java.util.List;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.ColumnSlice;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.beans.Rows;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.MultigetSliceQuery;
import me.prettyprint.hector.api.query.QueryResult;
import me.prettyprint.hector.api.query.SliceQuery;

public class App {
	public static void main(String[] args) {
		Cluster cluster = null;
		Keyspace keySpace = null;
		QueryResult<ColumnSlice<String, String>> result = null;
		try {
			cluster = HFactory.getOrCreateCluster("production",
					"localHost:9160");
			// If the keyspace does not exist, you have to create one named
			// "devjavasource"
			keySpace = HFactory.createKeyspace("devjavasource", cluster);

			// Create SliceQuery and set required key and columns.			
			final SliceQuery<String, String, String> sliceQuery = HFactory
					.createSliceQuery(keySpace, SE, SE, SE);
			sliceQuery.setColumnFamily("property");
			sliceQuery.setKey("P-1");
			
			System.out.println("Retrive Data from Cassandra Database SingleSlice Query");
			System.out.println("========================================================");			
			
			//Get any one of the property details with specified all columns.
			System.out.println("Use Case1: ");			
			sliceQuery.setColumnNames("City", "State", "Owner", "Description");
			result = sliceQuery.execute();
			
			printDetails(result);
			
			//Get any one of the property details with specified limited columns like only City and State.
			System.out.println("\nUse Case2: ");						
			sliceQuery.setColumnNames("City", "State");
			result = sliceQuery.execute();
			
			printDetails(result);
			
			System.out.println("========================================================\n\n");
			
			System.out.println("Retrive Data from Cassandra Database MultigetSliceQuery Query");
			System.out.println("========================================================");			
			
			// Get details of all seven properties with one query.
			MultigetSliceQuery<String, String, String> multigetSliceQuery = 
	            HFactory.createMultigetSliceQuery(keySpace, SE, SE, SE);
	        multigetSliceQuery.setColumnFamily("property");            
	        multigetSliceQuery.setKeys("P-1", "P-2","P-3", "P-4","P-5", "P-6","P-7");
	        // Take at most 2 columns, reading forward from City up to State
	        multigetSliceQuery.setRange("City", "State", false, 2);
	        //multigetSliceQuery.setColumnNames("City", "State");

	        QueryResult<Rows<String, String, String>> multiSliceResult = multigetSliceQuery.execute();
	        Rows<String, String, String> orderedRows = multiSliceResult.get();
	                               
	        for (Row<String, String, String> r : orderedRows) {
	        	System.out.println("key is: " + r.getKey());
	            printColumnDetails(r.getColumnSlice().getColumns());
	        }
	        System.out.println("========================================================");

		} catch (Exception exp) {
			exp.printStackTrace();
		} finally {
			// cluster is still null if getOrCreateCluster() itself failed
			if (cluster != null) {
				cluster.getConnectionManager().shutdown();
			}
		}
	}

	private static void printDetails(
			final QueryResult<ColumnSlice<String, String>> inResult) {
		List<HColumn<String, String>> hColumnList = inResult.get().getColumns();
		printColumnDetails(hColumnList);
	}

	private static void printColumnDetails(
			final List<HColumn<String, String>> inHColumnList) {
		for (HColumn<String, String> col : inHColumnList) {
			System.out.println("Col_Name: " + col.getName() + "  Col_Val: "
					+ col.getValue());
		}
	}

	static StringSerializer SE = StringSerializer.get();
}

Select App.java and Run As -> Java Application.
Output :

Retrive Data from Cassandra Database SingleSlice Query
========================================================
Use Case1: 
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: Owner  Col_Val: O1
Col_Name: State  Col_Val: S1

Use Case2: 
Col_Name: City  Col_Val: C1
Col_Name: State  Col_Val: S1
========================================================


Retrive Data from Cassandra Database MultigetSliceQuery Query
========================================================
key is: P-2
Col_Name: City  Col_Val: C2
Col_Name: Description  Col_Val: 2BHK, 1000 sq ft
key is: P-5
Col_Name: City  Col_Val: C3
Col_Name: Description  Col_Val: 3BHK, 1450 sq ft
key is: P-1
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
key is: P-7
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 3BHK, 1750 sq ft
key is: P-6
Col_Name: City  Col_Val: C3
Col_Name: Description  Col_Val: 2BHK, 950 sq ft
key is: P-4
Col_Name: City  Col_Val: C4
Col_Name: Description  Col_Val: 1BHK, 750 sq ft
key is: P-3
Col_Name: City  Col_Val: C3
Col_Name: Description  Col_Val: 1BHK, 800 sq ft
========================================================

Importance of setRange() method of MultigetSliceQuery interface :

Using the setRange() method, we can set a start/finish column predicate to retrieve the
list of columns in that range.

The signature is:

MultigetSliceQuery<K, N, V> setRange(N start, N finish, boolean reversed, int count);

start – the column name to start from
finish – the column name to stop at
reversed – whether the columns are read in reverse order
count – the maximum number of columns to include in the result

Given the ordered columns (City, Description, Owner and State):
1) If reversed is false, the query takes the first parameter as the base and reads forward,
stopping at the finish column or after count columns, whichever comes first.

2) If reversed is true, the query takes the first parameter as the base and reads backward,
stopping at the finish column or after count columns, whichever comes first.
In this case start must come after finish in the column order.
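
To make the four parameters concrete, here is a small in-memory model of the range selection in plain Java. It only imitates the results shown in this article for string column names and is not Hector or Cassandra code; RangeSketch and range are hypothetical names, and rejecting a start/finish pair that is in the wrong order for the direction is modelled with an IllegalArgumentException.

```java
import java.util.ArrayList;
import java.util.Collections;
import java.util.List;
import java.util.SortedMap;
import java.util.TreeMap;

// Illustrative only: models setRange(start, finish, reversed, count) on one row.
final class RangeSketch {

    // Returns the column names selected by the range, in the order they are read.
    static List<String> range(SortedMap<String, String> row, String start,
            String finish, boolean reversed, int count) {
        if (start != null && finish != null) {
            int order = start.compareTo(finish);
            if ((!reversed && order > 0) || (reversed && order < 0)) {
                throw new IllegalArgumentException(
                        "start and finish are in the wrong order for this direction");
            }
        }
        List<String> names = new ArrayList<String>(row.keySet());
        if (reversed) {
            Collections.reverse(names); // read from the last column backward
        }
        List<String> selected = new ArrayList<String>();
        for (String name : names) {
            if (selected.size() >= count) {
                break; // count is an upper bound on the number of columns
            }
            boolean pastStart = start == null
                    || (reversed ? name.compareTo(start) <= 0 : name.compareTo(start) >= 0);
            boolean beforeFinish = finish == null
                    || (reversed ? name.compareTo(finish) >= 0 : name.compareTo(finish) <= 0);
            if (pastStart && beforeFinish) {
                selected.add(name);
            }
        }
        return selected;
    }

    // The P-1 row with the values used in this article.
    static SortedMap<String, String> sampleRow() {
        SortedMap<String, String> row = new TreeMap<String, String>();
        row.put("City", "C1");
        row.put("State", "S1");
        row.put("Owner", "O1");
        row.put("Description", "2BHK, 1200 sq ft");
        return row;
    }

    public static void main(String[] args) {
        SortedMap<String, String> p1 = sampleRow();
        System.out.println(range(p1, "City", "State", false, 2));
        System.out.println(range(p1, "Owner", "City", true, 4));
    }
}
```

The two calls in main correspond to the forward range used in App.java and to one of the reversed ranges discussed in this article.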

Some usage examples,

1) Pass null for both start and finish to leave the range open (an empty byte[] is set on the underlying predicate)

     multigetSliceQuery.setRange(null, null, false, 4);
     multigetSliceQuery.setRange(null, null, true, 4);

With a null range, true and false for reversed return the same columns.
Only the order of the columns in the output differs.
All columns of each row are in the result set, up to count.

2) Provide only the start column name, with reversed as false.

// returns all four columns: City, Description, Owner and State.
multigetSliceQuery.setRange("City", null, false, 4); 

// returns two columns: City and Description.
multigetSliceQuery.setRange("City", null, false, 2);

// returns no columns.
multigetSliceQuery.setRange("City", null, false, 0);

// returns all four columns: City, Description, Owner and State.
multigetSliceQuery.setRange("City", null, false, 10);

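Note: The second parameter does bound the range. It makes no difference only when count is
reached first: setRange("City", "Owner", false, 2) still returns City and Description, but
setRange("City", "Owner", false, 10) returns only City, Description and Owner.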

3) Reversed as true

// Error: with reversed as true, start must not come before finish in the column order
multigetSliceQuery.setRange("City", "Owner", true, 4);

// returns the columns Owner, Description and City
multigetSliceQuery.setRange("Owner", "City", true, 4);

// returns the columns Owner, Description and City
multigetSliceQuery.setRange("Owner", null, true, 4);

// returns all four columns: State, Owner, Description and City
multigetSliceQuery.setRange("State", "City", true, 4);

The complete source code for trying out the setRange() method:

SetRangeMethodExample.java

package com.devjavasource.cassandra.CassandraDbService;

import java.util.List;

import me.prettyprint.cassandra.serializers.StringSerializer;
import me.prettyprint.hector.api.Cluster;
import me.prettyprint.hector.api.Keyspace;
import me.prettyprint.hector.api.beans.HColumn;
import me.prettyprint.hector.api.beans.Row;
import me.prettyprint.hector.api.beans.Rows;
import me.prettyprint.hector.api.factory.HFactory;
import me.prettyprint.hector.api.query.MultigetSliceQuery;
import me.prettyprint.hector.api.query.QueryResult;

public class SetRangeMethodExample {
	public static void main(String[] args) {
		Cluster cluster = null;
		Keyspace keySpace = null;
		QueryResult<Rows<String, String, String>> multiSliceResult = null;
		Rows<String, String, String> orderedRows = null;
		
		try {
			cluster = HFactory.getOrCreateCluster("production",
					"localHost:9160");
			// If the keyspace does not exist, you have to create one named
			// "devjavasource"
			keySpace = HFactory.createKeyspace("devjavasource", cluster);			
			
			MultigetSliceQuery<String, String, String> multigetSliceQuery = 
	            HFactory.createMultigetSliceQuery(keySpace, SE, SE, SE);
	        multigetSliceQuery.setColumnFamily("property");            
	        multigetSliceQuery.setKeys("P-1");
	        
	        // null start and finish: open range, all columns up to count
	        System.out.println("Use Case 1: ");
	        multigetSliceQuery.setRange(null, null, false, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);	        
	        System.out.println("Use Case 2: ");
	        multigetSliceQuery.setRange(null, null, true, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        // reverse as false ( City, Description, Owner and State )
	        System.out.println("Use Case 3: ");
	        multigetSliceQuery.setRange("City", null, false, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        System.out.println("Use Case 4: ");
	        multigetSliceQuery.setRange("City", "Owner", false, 2);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        System.out.println("Use Case 5: ");
	        multigetSliceQuery.setRange("City", null, false, 0);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        System.out.println("Use Case 6: ");
	        multigetSliceQuery.setRange("City", "Owner", false, 10);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);

	        // reverse as true ( City, Description, Owner and State )
	        System.out.println("Use Case 7: ");
	        multigetSliceQuery.setRange("Owner", "City", true, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        System.out.println("Use Case 8: ");
	        multigetSliceQuery.setRange("Owner", null, true, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        
	        System.out.println("Use Case 8: ");
	        multigetSliceQuery.setRange("State", "City", true, 4);	        	        
	        multiSliceResult = multigetSliceQuery.execute();
	        orderedRows = multiSliceResult.get();	        
	        printDetails(orderedRows);
	        	                               
		} catch (Exception exp) {
			exp.printStackTrace();
		} finally {
			// cluster is still null if getOrCreateCluster() itself failed
			if (cluster != null) {
				cluster.getConnectionManager().shutdown();
			}
		}
	}

	private static void printDetails( final Rows<String, String, String> inOrderedRows) {		
		System.out.println("========================================================");
        for (Row<String, String, String> r : inOrderedRows) {
        	System.out.println("key is: " + r.getKey());
            printColumnDetails(r.getColumnSlice().getColumns());
        }
        System.out.println("========================================================");
	}
	
	private static void printColumnDetails(
			final List<HColumn<String, String>> inHColumnList) {
		for (HColumn<String, String> col : inHColumnList) {
			System.out.println("Col_Name: " + col.getName() + "  Col_Val: "
					+ col.getValue());
		}
	}

	static StringSerializer SE = StringSerializer.get();
}

Select Run As -> Java Application.
Output :

Use Case 1: 
========================================================
key is: P-1
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: Owner  Col_Val: O1
Col_Name: State  Col_Val: S1
========================================================
Use Case 2: 
========================================================
key is: P-1
Col_Name: State  Col_Val: S1
Col_Name: Owner  Col_Val: O1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: City  Col_Val: C1
========================================================
Use Case 3: 
========================================================
key is: P-1
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: Owner  Col_Val: O1
Col_Name: State  Col_Val: S1
========================================================
Use Case 4: 
========================================================
key is: P-1
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
========================================================
Use Case 5: 
========================================================
key is: P-1
========================================================
Use Case 6: 
========================================================
key is: P-1
Col_Name: City  Col_Val: C1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: Owner  Col_Val: O1
========================================================
Use Case 7: 
========================================================
key is: P-1
Col_Name: Owner  Col_Val: O1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: City  Col_Val: C1
========================================================
Use Case 8: 
========================================================
key is: P-1
Col_Name: Owner  Col_Val: O1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: City  Col_Val: C1
========================================================
Use Case 8: 
========================================================
key is: P-1
Col_Name: State  Col_Val: S1
Col_Name: Owner  Col_Val: O1
Col_Name: Description  Col_Val: 2BHK, 1200 sq ft
Col_Name: City  Col_Val: C1
========================================================

You can download the complete project here:

CassandraDbService

*** Venkat – Happy learning ***