Wednesday, October 1, 2014

Blog 2: NoSQL vs. Relational Databases

In the world of databases, the trend toward NoSQL is growing. There are strong opinions on both sides, with each side claiming that its database solution is superior and citing different reasons. One example is the comparison of DataStax to other DBMS solutions shown in the table below. The question is: which of these database camps is right, or are they all wrong?

[3]

The Pros and Cons of NoSQL vs. Relational Databases

A good debate on which database type to use, and when to use it, is illustrated in the following video of Craig Steadman's interview with William McKnight, president of McKnight Consulting Group. [4]


The two main advantages of a NoSQL database over a relational database:
  1. The NoSQL database scales very well.
With NoSQL, when the database becomes too slow or too big, you can add more servers to the cluster and split the data across them as shards. A relational database is generally harder to scale out this way.
“NoSQL databases usually support auto-sharding, meaning that they natively and automatically spread data across an arbitrary number of servers, without requiring the application to even be aware of the composition of the server pool. Data and query load are automatically balanced across servers, and when a server goes down, it can be quickly and transparently replaced with no application disruption.” [1]
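To make the idea of spreading data across servers concrete, here is a minimal Python sketch of hash-based routing. It is not how any particular NoSQL product implements auto-sharding; the shard names and the `shard_for` and `distribute` helpers are hypothetical, and a real system would also handle rebalancing and replication.

```python
import hashlib

# Hypothetical server pool; a real cluster would discover its own members.
SHARDS = ["shard-a", "shard-b", "shard-c"]


def shard_for(key: str, shards: list[str]) -> str:
    """Pick a shard by hashing the key, so the same key always maps to the same server."""
    digest = hashlib.sha256(key.encode("utf-8")).hexdigest()
    return shards[int(digest, 16) % len(shards)]


def distribute(keys: list[str], shards: list[str]) -> dict[str, list[str]]:
    """Group keys by the shard that would store them."""
    placement: dict[str, list[str]] = {name: [] for name in shards}
    for key in keys:
        placement[shard_for(key, shards)].append(key)
    return placement


placement = distribute([f"user:{i}" for i in range(1000)], SHARDS)
for name, keys in placement.items():
    print(name, len(keys))
```

The application only ever asks the router where a key lives, which is the sense in which it does not need to know the composition of the server pool. Note that plain modulo hashing moves most keys when a server is added, so production systems tend to use something more careful, such as consistent hashing or range-based chunks.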
  2. The NoSQL database allows for heterogeneous data.
In a computer hardware store, all of the products have a price and a vendor. However, different components of the computer have different properties. For example, CPUs have a clock rate, hard drives and RAM chips have a capacity, and monitors have a resolution. In a relational database, there are two ways to deal with this real-world problem. The first option is to create a very long productID-property-value table. The second option is to create a very wide and sparse product table with a column for every property, but then most of the values in the table would be NULL. In a NoSQL database, this problem is more easily avoided because each document in a collection can have a different set of properties.
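The hardware store example can be sketched in a few lines of Python, using plain dictionaries as stand-ins for documents. The product names, prices, vendors, and field names below are made up for illustration. The sketch builds the same data in the two relational shapes described above and counts how many cells of the wide table end up empty.

```python
# Every product shares "price" and "vendor"; the remaining fields vary by product type.
products = [
    {"name": "CPU", "price": 199.0, "vendor": "VendorA", "clock_rate_ghz": 3.5},
    {"name": "Hard drive", "price": 89.0, "vendor": "VendorB", "capacity_gb": 1000},
    {"name": "Monitor", "price": 149.0, "vendor": "VendorC", "resolution": "1920x1080"},
]

# Relational option 1: a long productID-property-value table, one row per property.
long_rows = [
    (product_id, prop, value)
    for product_id, product in enumerate(products, start=1)
    for prop, value in product.items()
]

# Relational option 2: one wide table with a column for every property seen anywhere.
columns = sorted({prop for product in products for prop in product})
wide_rows = [[product.get(col) for col in columns] for product in products]
null_count = sum(value is None for row in wide_rows for value in row)

print(len(long_rows), "rows in the long table")
print(null_count, "of", len(columns) * len(products), "cells in the wide table are NULL")
```

With only three product types the sparsity is modest, but every new product type adds columns that are empty for all the existing rows.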
The two main disadvantages of a NoSQL database compared with a relational database:
  1. Denormalization
When your data is highly relational and can be normalized, a relational database would be your best choice, because NoSQL databases generally do not support JOINs.

For normalizing data in a relational database, there are well-established rules (the normal forms). NoSQL, however, is a rather new technology and lacks comparable rules for denormalization.
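The trade-off is easier to see side by side. The sketch below uses Python's built-in `sqlite3` module as a stand-in for any relational database and plain dictionaries as stand-ins for documents; the table names, field names, and sample values are hypothetical.

```python
import sqlite3

# Normalized relational form: each customer's name is stored exactly once.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        item TEXT NOT NULL
    );
    INSERT INTO customers VALUES (1, 'Ada');
    INSERT INTO orders VALUES (10, 1, 'CPU'), (11, 1, 'Monitor');
""")

# A rename touches one row; the JOIN picks it up for every order.
conn.execute("UPDATE customers SET name = 'Ada L.' WHERE id = 1")
joined = conn.execute("""
    SELECT orders.id, customers.name, orders.item
    FROM orders JOIN customers ON customers.id = orders.customer_id
    ORDER BY orders.id
""").fetchall()
print(joined)

# Denormalized document form: the name is copied into every order document.
order_docs = [
    {"_id": 10, "customer_name": "Ada", "item": "CPU"},
    {"_id": 11, "customer_name": "Ada", "item": "Monitor"},
]

# Without JOINs, the application has to find and rewrite every copy itself.
for doc in order_docs:
    if doc["customer_name"] == "Ada":
        doc["customer_name"] = "Ada L."
print(order_docs)
```

Reading a denormalized order needs no JOIN, which is the appeal. The cost is that deciding which fields to copy, and keeping the copies in sync, is left to the application.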
  2. Complex transactions
NoSQL databases do not handle complex transactions as well as relational databases. When an action affects more than one document, many NoSQL databases cannot guarantee consistency between the documents.
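As a rough illustration of what a relational transaction buys you, the sketch below uses Python's built-in `sqlite3` module as a stand-in for any RDBMS. The `accounts` table, the account names, and the `transfer` helper are hypothetical. The transfer updates two rows, and if the second update fails, the first one is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("alice", 100), ("bob", 50)])
conn.commit()


def transfer(conn, src, dst, amount):
    """Move money between two rows; both updates commit together or not at all."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = ?", (amount, dst)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = ?", (amount, src)
            )
        return True
    except sqlite3.IntegrityError:
        return False


# This would overdraw alice, so the CHECK constraint rejects the second update.
ok = transfer(conn, "alice", "bob", 500)
balances = dict(conn.execute("SELECT name, balance FROM accounts"))
print(ok, balances)
```

In a store without multi-document transactions, the credit to the second account would remain applied after the debit failed, and the application would have to detect and undo it.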

It's about your data
Jnan Dash, a longtime database professional, points out several things to consider about the data before choosing a database solution.
Tabular vs Complex
If the data has a simple tabular structure, like an accounting spreadsheet, then the relational model could be adequate. Data such as geo-spatial, engineering parts, or molecular modeling data, on the other hand, tends to be very complex. It may have multiple levels of nesting and the complete data model can be complicated. Such data has, in the past, been modeled into relational tables, but has not fit into that two-dimensional row-column structure naturally. [2]
Historical vs Dynamic
“What is the volatility of the data model?” Is the data model likely to change and evolve, or is it most likely going to stay the same? Generally speaking, not all the facts about the data model are known at design time, so some flexibility is needed. This presents many issues for users of relational database management systems (RDBMS). [2]
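A small sketch of what a changing data model looks like in practice, again using Python's built-in `sqlite3` module as a stand-in for an RDBMS and dictionaries as stand-ins for documents. The `parts` table and the `material` attribute are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE parts (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute("INSERT INTO parts VALUES (1, 'bracket')")

# Relational: a new attribute is a schema change that applies to every row.
conn.execute("ALTER TABLE parts ADD COLUMN material TEXT")
conn.execute("INSERT INTO parts VALUES (2, 'hinge', 'steel')")
rows = conn.execute("SELECT id, name, material FROM parts ORDER BY id").fetchall()
print(rows)

# Document style: a new attribute appears only on the documents that carry it.
parts = [{"_id": 1, "name": "bracket"}]
parts.append({"_id": 2, "name": "hinge", "material": "steel"})
print([sorted(doc) for doc in parts])
```

Adding one nullable column is cheap here. The issues grow with table size, with the number of applications that depend on the schema, and with how often the model changes.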
Conclusion:
Each type of database serves a different purpose. NoSQL is good for complex, unstructured data that is constantly changing and gaining new attributes. An RDBMS is good for structured data whose attributes are well defined and understood. The RDBMS has a proven record spanning decades and is trusted, while NoSQL is a relatively new technology that has not yet earned the same trust. So the best database is the one that best fits your data and goals. But if you still can't make up your mind, don't worry: some database vendors are developing solutions that allow NoSQL and RDBMS to coexist. [2]

References APA Format:
  1. NoSQL Databases Explained. (2014, October 1). Retrieved October 1, 2014, from http://www.mongodb.com/nosql-explained

  2. Dash, J. (2013, September 18). RDBMS vs. NoSQL: How do you pick? | ZDNet. Retrieved October 1, 2014, from http://www.zdnet.com/rdbms-vs-nosql-how-do-you-pick-7000020803/

  3. Relational Database to NoSQL. (2014, October 1). Retrieved October 1, 2014, from http://www.datastax.com/relational-database-to-nosql

  4. Preslar, E. (2013, September 16). McKnight: Relational vs. NoSQL databases not a winner-take-all game. Retrieved October 1, 2014, from http://searchdatamanagement.techtarget.com/video/McKnight-Relational-vs-NoSQL-databases-not-a-winner-take-all-game
