A term invented by Carlo Strozzi in 1998, NoSQL has been hard to pin down from the beginning. For one thing, while most people now take the term to mean ‘Not Only SQL,’ there are other accepted variations. More importantly, the term refers to a broad, emerging class of non-relational database solutions. NoSQL technologies have evolved to address specific business needs that row technologies couldn’t scale to meet and that column technologies were unsuited to address. Currently, there are over 112 products or open-source projects in the NoSQL space, each matching a specific business need. For example:
• Real-time data logging such as in finance or web analytics
• Web apps, or any app that needs better performance without having to define columns in an RDBMS
• Storing frequently requested data for a web app
Let’s go a bit deeper into each of the three main NoSQL subvariants: key-value stores, document stores and column stores.
A key-value store does what it sounds like it does: values are stored and indexed by a key, with the index usually built on a hash or tree data structure.
Key-value pairs are widely used in tables and configuration files. Key-value stores allow the application to store its data without predefining a schema—there is no need for a fixed data-model.
In a key-value store, for example, a record may look like:
12345 => “img456.jpg,checkout.js,20”
Companies turn to key-value stores when they need key-based lookups but not the technology overhead of a traditional RDBMS, either because they require more efficient, cost-effective scalability or because they are working with unstructured or semi-structured data. Key-value stores are great for unstructured data centered on a single object, and for data that is held in memory with some persistent backup. Consequently, they are typically used as a cache for data frequently requested by web applications such as online shopping carts or social-media sites. Because these web pages are created on the fly, the static components can be retrieved quickly and served up to the user.
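The behavior described above can be sketched in a few lines of Python. This is only an illustration, not how any particular product works: the class name KeyValueStore and its put/get/delete methods are hypothetical, and a plain in-memory dictionary stands in for the hash-based index. It reuses the sample record shown earlier.

```python
class KeyValueStore:
    """Toy in-memory key-value store backed by a hash table (a Python dict).

    Hypothetical illustration only: no persistence, no networking.
    """

    def __init__(self):
        self._data = {}

    def put(self, key, value):
        # No schema is declared anywhere; any value can be stored under any key.
        self._data[key] = value

    def get(self, key, default=None):
        # The only supported query is a lookup by key.
        return self._data.get(key, default)

    def delete(self, key):
        # Removing a missing key is silently ignored.
        self._data.pop(key, None)


store = KeyValueStore()
store.put(12345, "img456.jpg,checkout.js,20")

# The store treats the value as opaque; the application decides how to parse it.
image, script, count = store.get(12345).split(",")
missing = store.get(99999)  # unknown key falls back to the default (None)
```

Note that the store cannot answer a question such as "which records mention checkout.js?" without the application reading and parsing every value itself. That limitation is the price of the simple, fast lookup.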
As with a key-value store, companies turn to NoSQL document stores when they are dealing with huge volumes of data and transactions requiring massive horizontal scaling or sharding. And, similarly, there is no need for a pre-set schema. However, the data in document stores can contain several keys, so queries aren’t as limited as they are in key-value stores. For example, a record in a document store could read:
“id” => 12345,
“name” => “Robert”,
“age” => 22,
“email” => “firstname.lastname@example.org”
While multiple keys increase the types of possible queries, the fields stored in these ‘documents’ do not need to be predefined and can change from document to document. The tradeoff for the more complex query options is speed: queries against a key-value store are much simpler and often faster.
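To make the contrast with key-value lookups concrete, here is a minimal Python sketch, assuming documents are held as plain dictionaries in a list. The find helper and the second document are hypothetical, added only for illustration; a real document store would normally use indexes rather than scanning every document.

```python
# Each document is a free-form dictionary; the two documents below do not
# share the same set of fields, and nothing was declared up front.
documents = [
    {"id": 12345, "name": "Robert", "age": 22,
     "email": "firstname.lastname@example.org"},
    {"id": 12346, "name": "Alice", "city": "Lisbon"},  # hypothetical record
]


def find(docs, **criteria):
    """Return the documents whose fields match every given key/value pair.

    Documents that lack a requested field simply do not match.
    """
    return [
        doc for doc in docs
        if all(key in doc and doc[key] == value
               for key, value in criteria.items())
    ]


by_name = find(documents, name="Robert")          # query on one field
by_name_and_age = find(documents, name="Robert", age=22)  # query on two fields
by_city = find(documents, city="Lisbon")          # field only some documents have
```

The linear scan in find also hints at the speed tradeoff: matching on arbitrary fields takes more work than fetching a single value by its key.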
Document stores are often deployed for web-traffic analysis, user-behavior/action analysis, or log-file analysis in real time. However, while document stores allow more query capabilities than key-value stores, there are still limitations given the non-relational basis of the document-store database.
Column stores are an emerging NoSQL option, created in response to very specific database problems involving beyond-massive amounts of data across a hugely distributed system. Think Google. Think Facebook. Imagine the colossal amount of data that Google stores in its data farms. And then imagine how many permutations of data sets need to be compiled to respond to all possible Google searches. Clearly, this task could never be accomplished in any reasonable time frame with a traditional relational database. It requires the ability to handle massive amounts of data but with more query complexity than either key-value stores or document stores would deliver.
Most column stores also use MapReduce, a fault-tolerant framework for processing huge data sets on certain kinds of distributable problems using a large number of computers. This technology is still emerging—and use cases may eventually overlap with document stores as both technologies mature.
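The MapReduce idea can be illustrated with a small single-process Python sketch. It is an assumption-laden toy: the function names map_phase, shuffle, reduce_phase and map_reduce are hypothetical, the input is two made-up text records, and everything runs on one machine, whereas a real framework would spread the map and reduce steps across many computers and handle failures.

```python
from collections import defaultdict


def map_phase(record):
    """Map step: emit a (word, 1) pair for every word in one input record."""
    for word in record.split():
        yield word.lower(), 1


def shuffle(pairs):
    """Group all emitted values by key, as a framework does between phases."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)


def map_reduce(records):
    # Each record could be mapped independently, which is what makes the
    # problem distributable; here the records are processed one after another.
    pairs = (pair for record in records for pair in map_phase(record))
    return dict(reduce_phase(key, values)
                for key, values in shuffle(pairs).items())


counts = map_reduce(["think google", "think facebook"])
```

Because each record is mapped on its own and each key is reduced on its own, the work can be split across machines without the pieces needing to coordinate until the grouping step.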