Sunday, May 22, 2011

Cassandra - Part 1

As usual, I will discuss why we moved from Amazon RDS (MySQL) to Cassandra (NoSQL), our experiences with Cassandra, and how we implemented it.

The main reason we moved away from RDS (MySQL) was latency. Read/write operations took 80-100 ms on average, which wasn't acceptable to us, and we had to depend heavily on client cookies to store information. We had started development on RDS as a short-term choice to bring our solution to market quickly.

The second reason was cross data center replication. RDS supports read replicas and Multi-AZ deployment, but both were limited to a single region. So we had to write a lot of batch jobs and on-the-fly tasks to sync information between regions.

These two issues pushed us to look for new solutions. We researched and shortlisted Cassandra, HBase and MongoDB. We decided to go with Cassandra for these reasons: fast writes, P2P architecture (no master/slave) and built-in cross data center support. We did use HBase for OLAP and are in the process of migrating to Brisk.

Response Times:
Average response times for writes are around 8 ms (roughly 10X better than the 80-100 ms we saw on RDS) and reads are around 10-15 ms. We are in the process of removing memcached from our PHP front-end. With memcached gone, our architecture will be simple and clean.

P2P/Gossip:
There is no single point of failure, and if a node fails the impact is minimal. The application can talk to any node in the cluster to read or write data, which increases DB/app throughput tremendously.
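To illustrate the "any node can serve the request" idea, here is a minimal sketch of client-side node selection with failover. It is not our production code and not a real Cassandra client API: `make_client`, `fake_send`, `NodeUnavailable` and the IP addresses are all made up for the example.

```python
import itertools


class NodeUnavailable(Exception):
    """Raised by the (hypothetical) send function when a node cannot be reached."""


def make_client(nodes, send):
    """Return a function that tries nodes in round-robin order until one answers."""
    ring = itertools.cycle(nodes)

    def execute(request):
        last_error = None
        # Try each node at most once per request.
        for _ in range(len(nodes)):
            node = next(ring)
            try:
                return send(node, request)
            except NodeUnavailable as err:
                last_error = err
        raise NodeUnavailable("no node reachable") from last_error

    return execute


def fake_send(node, request):
    """Stand-in for a network call; pretends that 10.0.0.1 is down."""
    if node == "10.0.0.1":
        raise NodeUnavailable(node)
    return f"{node} handled {request}"


execute = make_client(["10.0.0.1", "10.0.0.2", "10.0.0.3"], fake_send)
print(execute("read user:42"))  # 10.0.0.2 handled read user:42
```

The request simply moves on to the next node when one is down, which is why a single node failure has so little impact on the application.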

Sharding:
Cassandra takes care of sharding. It is seamless, and the application doesn't have to manage it.
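For readers who have not seen how this kind of automatic sharding works, here is a small token-ring sketch. It is a simplification under stated assumptions: keys are hashed with MD5 (as Cassandra's RandomPartitioner does), but node tokens are derived here by hashing made-up node names, which is only a shortcut for the example. `TokenRing` and the node names are hypothetical.

```python
import bisect
import hashlib


def token(key):
    """Map a string to a position on the ring using MD5."""
    return int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16)


class TokenRing:
    def __init__(self, nodes):
        # Assumption for the sketch: each node's token is the hash of its name.
        self._ring = sorted((token(n), n) for n in nodes)
        self._tokens = [t for t, _ in self._ring]

    def owner(self, key):
        """Return the first node whose token is >= the key's token, wrapping around."""
        i = bisect.bisect_left(self._tokens, token(key)) % len(self._ring)
        return self._ring[i][1]


ring = TokenRing(["node-a", "node-b", "node-c"])
for key in ["user:1", "user:2", "user:3"]:
    print(key, "->", ring.owner(key))
```

The application only supplies the key; which node holds it falls out of the hash, so there is no shard map to maintain in application code.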

Cross Data Center:
We have 2 DCs and are in the process of setting up a 3rd. We use a replication factor of 2 in each DC, so at any time we have 4 copies of any given piece of data.
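The copy count is just the sum of the per-DC replication factors. A tiny sketch of the arithmetic follows; the DC names are placeholders, and the replication factor of 2 for the third DC is an assumption for illustration only.

```python
# Replication factor per data center; the DC names are placeholders.
replication = {"DC1": 2, "DC2": 2}


def total_copies(rf_by_dc):
    """Total copies of each row across all data centers."""
    return sum(rf_by_dc.values())


print(total_copies(replication))  # 4

# Assumption for illustration: the third DC also uses a replication factor of 2.
planned = dict(replication, DC3=2)
print(total_copies(planned))  # 6
```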

Hadoop Support:
We extract data in real time using streaming and move it to our data warehouse for analytics. It has really simplified our ETL process, and we are able to keep OLAP data near real time.

We did face quite a lot of operational challenges, which I will discuss in Part 2.

Thanks for reading my blog and have a nice time!