Jumping from MySQL to Cassandra, a success story…

Posted: January 12, 2012 in Cassandra

Today I’m going to share my experience from when I started with Apache Cassandra. One of the hardest steps in learning any NoSQL technology is letting go of normalization principles and relational DB structures. Relational databases are designed to persist normalized data, without duplication. One of the main changes here is that you need to design for your queries: work out what your reports or finder methods need, and build the persistent structure to match.

Hundreds of web pages, books, and papers explain what Cassandra is, what Hazelcast is, what Hadoop, MemcacheDB, MongoDB, etc. are. But none of them explains HOW TO migrate my data from a relational DB to one of them.

We wanted to migrate the persistent data of two of our modules, Turmeric SOA Monitoring and Turmeric SOA Rate Limiting. In Turmeric we use MySQL as the relational database. After a week of reading about and analyzing several NoSQL options, we decided on Cassandra. (I hope to write another post about the whys.) By the way, I highly recommend this book: Cassandra: The Definitive Guide

From Relational tables to Keyspaces

The big question now is how to migrate them. Well, this is what we did:
Following an Agile best practice, if something is too hard or complex, just break it into small challenges. After all, we still had a good margin for an MMF (“Minimal Marketable Feature”; refer to Software by Numbers). So:

Step 1: Move our relational DB tables to Cassandra Column Families.
Step 2: Customize the new Column Families in your Keyspace so that they hold all the needed data without JOIN operators.
Step 3: Split those Column Families up as your finder and query methods need. Typically a finder or query method should use one Column Family.
Step 4: Adapt the creator and updater methods to the previous changes. Don’t be scared if you are saving duplicated data. Keep in mind: “think for your queries, forget the normalization rules.”
Step 5: while (!pleased) -> repeat steps 3 and 4
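
To make steps 2 to 4 a bit more concrete, here is a minimal sketch that uses plain Java collections as stand-ins for column families. It is not Cassandra or Hector code, and every name in it (DenormalizedStore, saveMetric, findByService, findByDay) is hypothetical. The point is only the shape: the creator writes the same value into every structure, and each finder reads exactly one of them.

```java
import java.util.Map;
import java.util.SortedMap;
import java.util.TreeMap;

// In-memory stand-in for two column families; an illustration, not real Cassandra code.
class DenormalizedStore {
    // "Column family" 1: row key = service name, column name = timestamp.
    private final Map<String, SortedMap<Long, String>> metricsByService = new TreeMap<>();
    // "Column family" 2: row key = day, column name = service name + ":" + timestamp.
    private final Map<String, SortedMap<String, String>> metricsByDay = new TreeMap<>();

    // Step 4: the creator stores the same value once per structure (duplicated on purpose).
    void saveMetric(String service, String day, long timestamp, String value) {
        metricsByService.computeIfAbsent(service, k -> new TreeMap<>()).put(timestamp, value);
        metricsByDay.computeIfAbsent(day, k -> new TreeMap<>()).put(service + ":" + timestamp, value);
    }

    // Step 3: each finder touches exactly one structure, so no JOIN is needed.
    SortedMap<Long, String> findByService(String service, long from, long to) {
        SortedMap<Long, String> row = metricsByService.get(service);
        if (row == null) {
            return new TreeMap<>();
        }
        return row.subMap(from, to); // from is inclusive, to is exclusive
    }

    SortedMap<String, String> findByDay(String day) {
        return metricsByDay.getOrDefault(day, new TreeMap<>());
    }
}
```

If a new report shows up later, the loop in step 5 would mean adding a third map and one more write in saveMetric, rather than trying to join the existing two.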

A Cassandra DAO

Now, the hardest step is #1. Don’t panic: we developed a generic (it really does use Java Generics) Cassandra DAO for your migration. Since this work was needed for the project I’m currently working on, you will find it as a submodule of TurmericSOA, but under the Apache License you can use it by adding it to your Maven dependency file.


<dependency>
    <groupId>org.ebayopensource.turmeric.utils</groupId>
    <artifactId>turmeric-utils-cassandra</artifactId>
    <version>1.2.0.0-SNAPSHOT</version>
    <type>jar</type>
</dependency>

Features

  • 100% Java code
  • It can run an embedded Cassandra service or just talk to your external Cassandra service
  • Uses the Hector library as the Java Cassandra client
  • Dynamic [Super] Column Family creation
  • Key Types and Data Types defined at runtime with the use of Generics
  • Main CRUD methods supported:
boolean containsKey(KeyType key);

void delete(KeyType key);

T find(KeyType key);

Map> findItems(final List keys, final Long rangeFrom, final Long rangeTo);

Set findItems(final List keys, final String rangeFrom, final String rangeTo);

Set getKeys();

void save(KeyType key, T model);
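
To show how these CRUD methods fit together, here is a small in-memory stand-in with the same method names. This is only a sketch: the class name InMemoryDao is hypothetical, the real DAO talks to Cassandra through Hector, and both the Set<KeyType> return type of getKeys and the null result for a missing key are assumptions of this example. The findItems range methods are left out.

```java
import java.util.Collections;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;

// Hypothetical in-memory stand-in mirroring the CRUD method names listed above.
class InMemoryDao<KeyType, T> {
    private final Map<KeyType, T> rows = new HashMap<>();

    boolean containsKey(KeyType key) {
        return rows.containsKey(key);
    }

    void delete(KeyType key) {
        rows.remove(key);
    }

    // Returns null when the key is absent (an assumption of this sketch).
    T find(KeyType key) {
        return rows.get(key);
    }

    // Read-only view of the stored row keys.
    Set<KeyType> getKeys() {
        return Collections.unmodifiableSet(rows.keySet());
    }

    void save(KeyType key, T model) {
        rows.put(key, model);
    }
}
```

A stand-in like this can also be handy in unit tests of code that only needs the DAO contract and not a running Cassandra node.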

Main Classes
This util module contains the following packages and classes:

org.ebayopensource.turmeric.utils.cassandra.service

  • CassandraManager: initializes a static EmbeddedCassandraService instance based on a YAML configuration file

org.ebayopensource.turmeric.utils.cassandra.hector

  • HectorManager: manages keyspace and column family creation and reading. It uses the Hector API
  • HectorHelper: includes some utility methods based on Java Reflection and Java Generics, e.g. retrieving the field names of a POJO, which are used as column names in Cassandra keyspaces
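
The reflection idea behind HectorHelper can be sketched with the standard java.lang.reflect API. This is not the actual HectorHelper code; the names FieldNames, columnNamesOf, and SamplePojo are hypothetical, and the choice to skip static and synthetic fields is an assumption of this example.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.ArrayList;
import java.util.List;

class FieldNames {
    // Collects the names of the instance fields declared directly on the given class.
    // The JVM does not guarantee any particular order for getDeclaredFields().
    static List<String> columnNamesOf(Class<?> pojoClass) {
        List<String> names = new ArrayList<>();
        for (Field field : pojoClass.getDeclaredFields()) {
            // Skip constants and compiler-generated fields: they are not row data.
            if (Modifier.isStatic(field.getModifiers()) || field.isSynthetic()) {
                continue;
            }
            names.add(field.getName());
        }
        return names;
    }
}

// Hypothetical POJO used only to exercise the helper.
class SamplePojo {
    static final String TABLE = "sample"; // static, so it is not treated as a column
    private String name;
    private long createdAt;
}
```

Calling FieldNames.columnNamesOf(SamplePojo.class) would give the two names "name" and "createdAt", which could then serve as column names.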

org.ebayopensource.turmeric.utils.cassandra.dao

  • AbstractColumnFamilyDao: as the name suggests, this is a base class that every DAO should extend. It defines and implements the basic DAO operations using the Hector API.

Configuration files

Here is the directory structure of the configuration files:

META-INF/
         security/
                  config/
                         cassandra/
                                   cassandra.properties

An example of this property file:

cassandra-cluster-name=TurmericCluster
cassandra-host-ip=127.0.0.1
cassandra-rpc-port=9160
cassandra-my-keyspace=My-keyspace

#column families
cassandra-foo-column-family=foo
cassandra-bar-column-family=bar
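
As a rough idea of how such a file can be consumed, here is a sketch that reads the keys shown above with java.util.Properties. It is not how the Turmeric module itself loads its configuration; the class CassandraSettings and its parse method are hypothetical, and falling back to port 9160 when the key is missing is an assumption taken from the sample values.

```java
import java.io.IOException;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical holder for the connection settings found in cassandra.properties.
class CassandraSettings {
    final String clusterName;
    final String host;
    final int rpcPort;

    CassandraSettings(Properties props) {
        clusterName = props.getProperty("cassandra-cluster-name");
        host = props.getProperty("cassandra-host-ip");
        // Default port is an assumption of this sketch, based on the sample file.
        rpcPort = Integer.parseInt(props.getProperty("cassandra-rpc-port", "9160"));
    }

    // Parses properties from text so the example needs no file on disk.
    static CassandraSettings parse(String text) {
        Properties props = new Properties();
        try {
            props.load(new StringReader(text));
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return new CassandraSettings(props);
    }
}
```

In a real application the properties would more likely be read from the classpath location shown in the directory structure, for example through ClassLoader.getResourceAsStream, instead of from a string.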

How to use it…
It is very intuitive. Let’s suppose we have a Foo table in our relational DB, e.g. MySQL.
So:

Create the BaseDao interface

import java.util.Set;

public interface BaseDao {
    public void delete(String key);
    public Set<String> getKeys();
    public boolean containsKey(String key);
    public void save(String key, FooPojo fooPojo);
    public FooPojo find(String key);
}

Create the FooDao interface

public interface FooDao extends BaseDao  {
}

Create the FooDao implementation


public class FooDaoImpl extends AbstractColumnFamilyDao<String, FooPojo>
        implements FooDao {
    public FooDaoImpl(final String clusterName, final String host, final String keySpace, final String cf, final Class<String> kTypeClass) {
        super(clusterName, host, keySpace, kTypeClass, FooPojo.class, cf);
    }
}

… in your code

// starts an embedded Cassandra service
CassandraManager.initialize();

// creates our Foo Column Family
FooDao fooDao = new FooDaoImpl("myCluster", "127.0.0.1", "myKeyspace",
        "myColumnFamilyName", String.class);

and voilà, you have your relational table migrated as a Cassandra column family!!!

Anyway, you can browse the unit test classes to see how they are implemented…

enjoy it!!!
