Data is growing exponentially in the Internet age, and studying and analyzing data has become a necessity for most professions. Data is stored in databases, and the way it is stored has evolved over time. Oracle and Microsoft SQL Server are examples of popular commercial databases that have long been used to store data. Things slowly changed with the explosion of data, and open source databases were born. We will look at a few open source databases in this post.
MongoDB:
MongoDB is an open source database developed by MongoDB Inc. and written in C++. Data was traditionally only stored in tables as rows and columns. For any database developer in the 1990s, it was difficult to think of data in any other way.
But all this changed in the late 2000s, when MongoDB arrived with the “NoSQL” way of handling data. MongoDB uses dynamic schemas, so the structure of the database, such as the type of each field, does not have to be defined first. This is one of the highlights of MongoDB: fields can be added and deleted dynamically, which changes the structure of the database as we go.
In MongoDB, a table is a “collection”, a row is a “document” and a column is a “field”. MongoDB has its own query language known as the MongoDB query language. Its chief advantages are high scalability and high availability.
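The terminology and the dynamic schema can be sketched in plain Python. This is only an illustration: no MongoDB server or driver is involved, a list of dicts stands in for a collection, and the `users` data and the `find` helper are hypothetical names made up for this example.

```python
# Plain-Python stand-in for a MongoDB collection: a list of dict "documents".
# No MongoDB server or driver is used; every name here is illustrative.
users = []  # the "collection" (a table in relational terms)

# Each dict is a "document" (a row); each key is a "field" (a column).
users.append({"_id": 1, "name": "Asha", "city": "Pune"})

# Dynamic schema: a later document may carry a field the first one lacks.
users.append({"_id": 2, "name": "Ben", "city": "Austin", "phone": "555-0100"})

# Fields can also be added to or removed from an existing document.
users[0]["email"] = "asha@example.com"
del users[0]["city"]


def find(collection, **criteria):
    """Return the documents whose fields equal every given criterion."""
    return [doc for doc in collection
            if all(doc.get(key) == value for key, value in criteria.items())]


austin_users = find(users, city="Austin")
field_sets = [sorted(doc) for doc in users]  # field names per document
```

Here the two documents end up with different sets of fields, which a fixed table of rows and columns would not allow without altering the table first.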
MetLife, Expedia.com, Facebook and Cisco are examples of a few organizations that have embraced MongoDB. (Flexible enough to fit any industry)
MySQL:
MySQL is a traditional open source RDBMS that was first released in 1995. It was eventually acquired by Oracle in 2010. It features the traditional terminology and concepts of an RDBMS, namely tables, primary keys, foreign keys, relationships and more.
It is written in ‘C’ and ‘C++’ and uses SQL, or ‘Structured Query Language’, for querying, inserting and updating records. It can handle tables with tens of millions of records. While MySQL does not have the dynamic schema or the rich data model of NoSQL databases like MongoDB, it is still used by legacy systems and cannot be entirely replaced by NoSQL databases.
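The relational concepts above (tables, primary keys, foreign keys, and SQL for inserting, updating and querying) can be tried out with a short sketch. As an assumption made so that the example runs without a server, Python's built-in SQLite stands in for MySQL; the `customers` and `orders` tables are hypothetical, and MySQL's own SQL dialect and data types differ in the details.

```python
import sqlite3

# SQLite (in memory) stands in for MySQL so the sketch runs without a server.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when asked

# Two related tables: each order points at a customer through a foreign key.
conn.execute("""
    CREATE TABLE customers (
        id   INTEGER PRIMARY KEY,
        name TEXT NOT NULL
    )""")
conn.execute("""
    CREATE TABLE orders (
        id          INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(id),
        total       REAL NOT NULL
    )""")

# Inserting and updating records with SQL.
conn.execute("INSERT INTO customers (id, name) VALUES (?, ?)", (1, "Asha"))
conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (?, ?, ?)",
             (10, 1, 25.0))
conn.execute("UPDATE orders SET total = ? WHERE id = ?", (30.0, 10))

# Querying across the relationship with a join.
rows = conn.execute("""
    SELECT c.name, o.total
    FROM orders AS o
    JOIN customers AS c ON c.id = o.customer_id
""").fetchall()

# An order for a customer that does not exist violates the foreign key.
try:
    conn.execute("INSERT INTO orders (id, customer_id, total) VALUES (11, 99, 5.0)")
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True
```

Unlike the schema-free approach, the structure is declared up front, and the database itself refuses rows that break the declared relationships.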
Some of the organizations that use MySQL are Alcatel-Lucent, Pinterest, Sears and Walmart.
An organization may sometimes need to use both NoSQL and SQL databases to satisfy all of its business requirements.
PostgreSQL:
PostgreSQL, or just Postgres, was initially released in 1996 and is an open source object-relational database management system. It is written in the ‘C’ programming language and is one of the most popular open source databases for startups. (Oracle’s biggest database foe: Could it be Postgres?)
It runs on all major operating systems. Some of its stated limits are as follows:
Maximum database size in Postgres: unlimited
Maximum table size in Postgres: 32TB
Maximum row size in Postgres: 1.6TB
Maximum rows/table in Postgres: unlimited
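Postgres also supports MVCC or ‘Multiversion Concurrency Control’. With MVCC, each transaction works against its own consistent snapshot of the data, which isolates it from changes made by other sessions. Readers do not have to wait for writers, so “read locks” are largely avoided, and this gives better performance in multi-user environments.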
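The idea behind MVCC can be shown with a toy sketch. This is not how Postgres is implemented internally; `VersionedStore` and `Snapshot` are hypothetical classes that only illustrate the principle of keeping several versions of a value so that a reader keeps its own consistent view while another session writes.

```python
class VersionedStore:
    """Toy multiversion store: every write adds a new version, never overwrites."""

    def __init__(self):
        self._versions = {}    # key -> list of (commit_id, value), oldest first
        self._last_commit = 0

    def write(self, key, value):
        """Commit a new version of key and return its commit id."""
        self._last_commit += 1
        self._versions.setdefault(key, []).append((self._last_commit, value))
        return self._last_commit

    def snapshot(self):
        """Return a reader pinned to the data as of the latest commit."""
        return Snapshot(self, self._last_commit)


class Snapshot:
    """A read-only view that ignores versions committed after it was taken."""

    def __init__(self, store, as_of):
        self._store = store
        self._as_of = as_of

    def read(self, key):
        """Return the newest version committed at or before this snapshot."""
        for commit_id, value in reversed(self._store._versions.get(key, [])):
            if commit_id <= self._as_of:
                return value
        return None


store = VersionedStore()
store.write("balance", 100)
reader = store.snapshot()        # one session starts reading here
store.write("balance", 250)      # another session commits; nobody waits on a lock
old_view = reader.read("balance")
new_view = store.snapshot().read("balance")
```

The first reader still sees the value from when its snapshot was taken, while a snapshot taken after the second write sees the new value. Neither of them had to block the writer.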
Apple; the University of Alabama, Birmingham; and the University of California, Berkeley are some of the organizations and universities that have built products and solutions using the Postgres open source database.
Cassandra:
Apache Cassandra, initially developed at Facebook and released as open source in 2008, is a distributed database management system built to handle today’s requirement of huge amounts of data that constantly stream in from different sources. It is a column-oriented database.
Some of the important features of Cassandra include constant up-time, which ensures 24/7 access to information, and fault tolerance along with scalability. This is achieved by using a “ring” design rather than the classic master-slave design. All nodes in a cluster have the same role, and nodes can be added with no major issues. Data can be replicated across multiple data centers, which helps keep latency low.
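A rough sketch of the “ring” idea follows. It is a simplified illustration, not Cassandra's actual partitioner or replication strategy: the `Ring` class, the node names and the use of MD5 to place items on the ring are all assumptions made for this example. Every node plays the same role, each key is owned by the first node found clockwise from the key's position, and copies go to the nodes that follow.

```python
import bisect
import hashlib


def ring_position(text):
    """Map a string to a position on the ring (an integer derived from MD5)."""
    return int(hashlib.md5(text.encode("utf-8")).hexdigest(), 16)


class Ring:
    """Peer nodes placed on a hash ring; there is no master node."""

    def __init__(self, nodes, replicas=2):
        self.replicas = replicas
        self._entries = sorted((ring_position(node), node) for node in nodes)

    def add_node(self, node):
        """Place a new node on the ring without touching the existing ones."""
        bisect.insort(self._entries, (ring_position(node), node))

    def owners(self, key):
        """Return the nodes that hold key: the first node clockwise from the
        key's position, followed by the next nodes as replicas."""
        positions = [position for position, _ in self._entries]
        start = bisect.bisect_right(positions, ring_position(key))
        count = min(self.replicas, len(self._entries))
        return [self._entries[(start + i) % len(self._entries)][1]
                for i in range(count)]


ring = Ring(["node-a", "node-b", "node-c"])
owners = ring.owners("user:42")          # two distinct nodes hold this key
ring.add_node("node-d")                  # the cluster grows by one peer
owners_after = ring.owners("user:42")
```

Because any node can work out which peers own a key, there is no single master to fail, and adding a node only changes ownership for the keys that fall next to it on the ring.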
Facebook’s Instagram, Cisco’s Webex, Netflix are examples of organizations that use Cassandra.
We saw some interesting and popular open source databases in this post. We will explore some more technical topics in the next post.
Bibliography
Flexible enough to fit any industry. (n.d.). Retrieved from mongodb.com: https://www.mongodb.com/who-uses-mongodb
Oracle's biggest database foe: Could it be Postgres? (n.d.). Retrieved from techrepublic.com: http://www.techrepublic.com/article/oracles-biggest-database-foe-could-it-be-postgres/