A Look at Apache Cassandra
Post By Sakshi Wagh on 18-June-2015
Apache Cassandra is an open source NoSQL (Not Only SQL) distributed database management system. It is designed to handle large amounts of data across many servers, data centres and clouds. It can store hundreds of terabytes of data, and differs considerably from relational database management systems. Cassandra was initially developed at Facebook to power its inbox search feature, was released as open source in 2008 and later became an Apache Software Foundation project. It is written in Java and runs on multiple operating systems. It is now used by many other organisations, including Twitter, Reddit, Rackspace, Cisco and Cloudkick.
Cassandra’s architecture differs from the usual master-slave design. A cluster consists of many nodes spread across the network, and there is no master node: all nodes are peers, and any node can serve any request.
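To make the "no master" idea concrete, here is a minimal sketch in plain Python. It is a toy model, not Cassandra's actual implementation: the class ToyRing, its methods, the node names and the use of MD5 for tokens are all assumptions made for illustration. Each node gets a position (token) on a ring, a key belongs to the first node at or after the key's token, and any node can act as coordinator for a request.

```python
import hashlib
from bisect import bisect_left


class ToyRing:
    """Toy masterless ring (hypothetical, for illustration only)."""

    def __init__(self, node_names):
        # One token per node, derived from its name. Real clusters are more
        # elaborate; a single token keeps the sketch short.
        self.ring = sorted((self._token(name), name) for name in node_names)
        self.tokens = [token for token, _ in self.ring]
        # Each node has its own local storage.
        self.storage = {name: {} for name in node_names}

    @staticmethod
    def _token(value):
        # Assumption: MD5 is used here only as a convenient, stable hash.
        return int(hashlib.md5(value.encode("utf-8")).hexdigest(), 16)

    def owner(self, key):
        # The first node whose token is >= the key's token owns the key;
        # past the last token we wrap around to the first node.
        idx = bisect_left(self.tokens, self._token(key)) % len(self.ring)
        return self.ring[idx][1]

    def write(self, coordinator, key, value):
        # Any known node may coordinate; it forwards to the owning node.
        if coordinator not in self.storage:
            raise KeyError(f"unknown node: {coordinator}")
        self.storage[self.owner(key)][key] = value

    def read(self, coordinator, key):
        if coordinator not in self.storage:
            raise KeyError(f"unknown node: {coordinator}")
        return self.storage[self.owner(key)].get(key)


ring = ToyRing(["node-a", "node-b", "node-c", "node-d"])
ring.write("node-a", "user:42", "Sakshi")   # sent to node-a
print(ring.read("node-c", "user:42"))       # asked via node-c, same answer
```

Because every node can compute the owner of a key from the same token map, no single machine has to be consulted first, which is the property the paragraph above describes.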
Cassandra provides features such as near-linear scalability (if two nodes can manage 100,000 operations per second, four nodes will be able to manage about 200,000 operations per second and eight nodes about 400,000), durability, and replication, which stores duplicate copies of data on multiple nodes in the network. This means that if any node goes down, copies of that node’s data are still available on other machines, so the data remains accessible.
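The scaling arithmetic and the effect of replication can be sketched in a few lines of Python. This is an idealised illustration under stated assumptions, not Cassandra's real placement logic: the function names, the node names, the MD5-based placement and the replication factor of 3 are all hypothetical, and real clusters rarely scale perfectly linearly.

```python
import hashlib


def projected_ops_per_second(nodes, baseline_nodes=2, baseline_ops=100_000):
    """Idealised linear scaling: throughput grows in proportion to node count."""
    return baseline_ops * nodes // baseline_nodes


def replicas_for(key, nodes, replication_factor):
    """Pick distinct nodes to hold copies of a key (simplified placement)."""
    ordered = sorted(nodes)
    # Assumption: hash the key to choose a starting node, then take the
    # next nodes in order, wrapping around the list.
    start = int(hashlib.md5(key.encode("utf-8")).hexdigest(), 16) % len(ordered)
    count = min(replication_factor, len(ordered))
    return [ordered[(start + i) % len(ordered)] for i in range(count)]


def is_readable(key, nodes, replication_factor, down_nodes):
    """A key stays readable while at least one of its replicas is up."""
    replicas = replicas_for(key, nodes, replication_factor)
    return any(node not in down_nodes for node in replicas)


nodes = ["node-a", "node-b", "node-c", "node-d"]
print(projected_ops_per_second(4))   # 200000
print(projected_ops_per_second(8))   # 400000
print(is_readable("user:42", nodes, 3, down_nodes={"node-b"}))
```

With three copies of each key, losing any single node still leaves at least two copies reachable, which is why a node failure does not make the data unavailable.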
Cassandra suits workloads where data must be ingested at very high speed without giving up reliability and redundancy. It is used to store large volumes of data feeds from customers, to distribute data geographically, and for real-time querying.