How To Choose Data Models and Databases For Your Use Case

Selecting a data model and database for your system

Chandan Kumar
3 min read · Jan 14, 2022
[Figure: data models for different use cases]

Why is the data model so important? The answer is all around us. From the applications on our phones to the communication between two systems, data models shape how we represent the real world and how we think about the problem we are solving.

Data can be represented in many ways, but the major models are relational, document, graph, key-value, column-family, and time series. Which model fits depends on the use case. A LinkedIn profile or resume, for example, can be stored in relational tables, as a document, or as a graph. The other consideration is how much scalability, performance and availability the system requires. This is part 1 of a 5-article series.

1. Relational

Data is organised into relations (tables), where each relation is an unordered collection of tuples (rows). This data model is most prevalent where the data has one-to-one, one-to-many and many-to-many relationships and where transaction processing matters, as in banking, sales and airline reservations.
It supports strong consistency, and it protects data integrity through normalisation and constraints. Relational databases typically provide ACID transactions. Data is queried using SQL, and query optimisers are mature. Examples: PostgreSQL, MySQL
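As a rough illustration of constraints and atomic transactions, here is a minimal sketch using Python's built-in sqlite3 module. The accounts table, its columns and the transfer helper are hypothetical names chosen for this example, and SQLite stands in for a server database such as PostgreSQL or MySQL.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts ("
    " id INTEGER PRIMARY KEY,"
    " owner TEXT NOT NULL,"
    " balance INTEGER NOT NULL CHECK (balance >= 0))"
)
conn.executemany(
    "INSERT INTO accounts (id, owner, balance) VALUES (?, ?, ?)",
    [(1, "alice", 100), (2, "bob", 50)],
)
conn.commit()


def transfer(conn, src, dst, amount):
    """Move `amount` from src to dst atomically; return True on success."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE id = ?",
                (amount, dst),
            )
            # The CHECK constraint rejects an overdraft, undoing the credit above.
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE id = ?",
                (amount, src),
            )
        return True
    except sqlite3.IntegrityError:
        return False


def balances(conn):
    """Return {owner: balance} for all accounts."""
    return dict(conn.execute("SELECT owner, balance FROM accounts ORDER BY id"))
```

Calling transfer(conn, 1, 2, 500) fails the balance check, so neither update is kept. That all-or-nothing behaviour is the atomicity part of ACID.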

2. Document

This model targets data that comes as self-contained documents, where relationships between documents are rare. The data is semi-structured and stored in a JSON-like format. Some use cases are real-time feeds, live sports apps, user comments and product catalogues.
It offers better read performance through data locality (when the whole document is needed at once), along with schema flexibility and nested records. Examples: MongoDB, CouchDB, DocumentDB
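To make the idea of a self-contained document concrete, here is a small sketch that models a profile as a nested, JSON-like structure in plain Python. The field names, the collection dictionary and the job_titles helper are assumptions for illustration, not the API of any particular document database.

```python
import json

# A hypothetical profile stored as one self-contained document:
# positions and education are nested instead of living in separate tables.
profile = {
    "_id": "user-251",
    "name": "Ada",
    "positions": [
        {"title": "Data Engineer", "company": "Acme", "start": 2019},
        {"title": "Analyst", "company": "Initech", "start": 2016},
    ],
    "education": [{"school": "State University", "end": 2016}],
}

# Schema flexibility: another document in the same collection
# may carry a different set of fields.
other = {"_id": "user-252", "name": "Lin", "skills": ["SQL", "Python"]}

# A stand-in for a collection, keyed by document id.
collection = {doc["_id"]: doc for doc in (profile, other)}


def job_titles(doc):
    """Read nested records from one document, tolerating missing fields."""
    return [position["title"] for position in doc.get("positions", [])]


# The whole document serialises to a single JSON string (data locality).
serialized = json.dumps(profile)
```

Everything needed to render the profile comes back in one read, which is where the locality benefit comes from. The trade-off is that the reading code has to tolerate missing fields, as job_titles does.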

3. Graph

This data model targets data with many many-to-many relationships between entities, which becomes complex to express in the relational model. The data is represented as vertices and edges, which are the entities and their relationships respectively.
Big tech companies such as Facebook and Google use graph models for social graphs and web graphs. The data can be heterogeneous: vertices can be anything from people to locations to events. Graphs make relationships easier to visualise and keep traversal latency low. Example: Neo4j. Cypher and Datalog are query languages for graph data rather than databases.
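The vertex-and-edge idea can be sketched with a plain adjacency map and a breadth-first traversal. The sample edges and the build_adjacency and hops helpers are made up for illustration, and a real graph database would expose this through a query language instead.

```python
from collections import deque

# Edges as (source, label, target). Vertices are heterogeneous:
# people, a location and an event share one graph. Sample data is made up.
edges = [
    ("alice", "knows", "bob"),
    ("bob", "knows", "carol"),
    ("alice", "lives_in", "london"),
    ("carol", "attended", "pycon"),
]


def build_adjacency(edges):
    """Build an undirected adjacency map: vertex -> set of neighbours."""
    adjacency = {}
    for source, _label, target in edges:
        adjacency.setdefault(source, set()).add(target)
        adjacency.setdefault(target, set()).add(source)
    return adjacency


def hops(adjacency, start, goal):
    """Breadth-first search: edges on a shortest path, or None if unreachable."""
    if start not in adjacency:
        return None
    seen = {start}
    queue = deque([(start, 0)])
    while queue:
        vertex, distance = queue.popleft()
        if vertex == goal:
            return distance
        for neighbour in adjacency[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                queue.append((neighbour, distance + 1))
    return None


adjacency = build_adjacency(edges)
```

A question such as "how many hops separate alice from pycon?" is a single traversal here. In a relational schema it would usually take a chain of self-joins.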

4. Key-Value

This data model is used where data must be fetched by key with very low latency. Caching, queues and pub/sub systems are some use cases. Examples: Redis, Memcached
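The caching use case can be sketched as a tiny in-process key-value store with per-key expiry. TTLCache and its methods are hypothetical names for this example. It only mimics the set-with-expiry and get pattern that stores like Redis and Memcached offer, and it is not their client API.

```python
import time


class TTLCache:
    """A minimal in-memory key-value cache with per-key expiry (a sketch)."""

    def __init__(self, clock=time.monotonic):
        self._data = {}      # key -> (value, expiry time)
        self._clock = clock  # injectable so expiry can be tested without sleeping

    def set(self, key, value, ttl_seconds):
        """Store value under key; it expires ttl_seconds from now."""
        self._data[key] = (value, self._clock() + ttl_seconds)

    def get(self, key, default=None):
        """Return the value for key, or default if missing or expired."""
        entry = self._data.get(key)
        if entry is None:
            return default
        value, expires_at = entry
        if self._clock() >= expires_at:
            del self._data[key]  # evict lazily on read
            return default
        return value
```

Both operations are a single hash lookup by key, which is why this model can serve reads with very low latency. The cost is that it offers no way to query by value.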

5. Wide-Column (Column Family)

This data model organises data so that the column names and formats can differ from row to row. It is used when scalability, performance, availability and analytics over large datasets are required.
Examples: Cassandra, HBase, Google Bigtable
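The "columns can differ per row" idea can be pictured as a nested map from row key to column family to column. The sketch below uses that simplified picture. The put and get_row helpers and the sample rows are hypothetical, and it ignores the partitioning and on-disk layout that real wide-column stores depend on.

```python
from collections import defaultdict

# row key -> column family -> column name -> value
# Rows are sparse: each row stores only the columns it actually has.
table = defaultdict(lambda: defaultdict(dict))


def put(row_key, family, column, value):
    """Write one cell; the column does not need to exist on other rows."""
    table[row_key][family][column] = value


def get_row(row_key, family):
    """Return a copy of one column family for a row, or {} if absent."""
    if row_key not in table:
        return {}
    return dict(table[row_key].get(family, {}))


put("user1", "profile", "name", "Ada")
put("user1", "profile", "email", "ada@example.com")
put("user2", "profile", "name", "Lin")
put("user2", "activity", "last_login", "2022-01-14")
```

Here user1 has an email column and user2 does not, and no schema change was needed for either row.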

6. Time-Series

This data model targets time-stamped data used to track and analyse the behaviour of a system over time. IoT devices, sensors and stock market data are some of the use cases. Examples: InfluxDB, TimescaleDB, Prometheus
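A typical time-series operation is downsampling raw readings into fixed time windows. The sketch below averages made-up sensor readings per window. The readings list and the downsample_mean helper are assumptions for illustration and do not reflect how any of the databases above implement it.

```python
from collections import defaultdict

# (timestamp in seconds, value) pairs from a hypothetical temperature sensor.
readings = [(0, 20.0), (30, 22.0), (60, 21.0), (90, 23.0), (150, 25.0)]


def downsample_mean(points, window_seconds):
    """Group points into fixed windows and average each window.

    Returns {window start timestamp: mean value}, ordered by time.
    """
    buckets = defaultdict(list)
    for timestamp, value in points:
        window_start = timestamp - timestamp % window_seconds
        buckets[window_start].append(value)
    return {
        start: sum(values) / len(values)
        for start, values in sorted(buckets.items())
    }


per_minute = downsample_mean(readings, 60)
```

Time-series databases usually provide this kind of windowed aggregation as a built-in query feature, together with retention policies for discarding old raw data.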

Data models and databases are a vast subject, and only a short overview has been given here. Different models serve different purposes, and there is no one-size-fits-all solution.

Chandan Kumar

Data engineer and architect, taking leaps into building data products. I love reading books, fitness, sports and travel.