Data modeling is a practice that defines the logic of queries and the structure of the data in storage. A well-designed model is the key to leveraging the strengths of a graph database as it improves query performance, supports flexible queries, and optimizes storage.
To organize data into a data model, first think about what questions you want to answer.
For example, assume that you work for a retail company and want to learn what products customers are buying. To answer that, you need to:
- Identify the entities involved, in this case the products sold and the customers who bought them. This process is known as "entity extraction".
- Understand how these entities relate to each other.
- Think about what other details need to be provided, i.e. which properties should be added to these entities (e.g. customer name).
- Optionally, visualize the model before you create it using no-code data modeling tools.
- If satisfied, you can start writing the data into a database.
In this fictional scenario, you can start by adding this information to the graph:
CREATE (c:Customer {name: "John"})
CREATE (p:Product {name: "Camera"})
CREATE (c)-[:BUYS]->(p)
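Note that the three `CREATE` clauses above share the variables `c` and `p`, so they need to run together as a single query. Running that query twice would also create a second John and a second Camera. One possible way to avoid duplicates while experimenting is to use `MERGE` instead, as in this minimal sketch (it assumes that a customer or product is uniquely identified by its `name`, which may not hold in a real dataset):

```cypher
// MERGE matches an existing node or relationship, or creates it if none exists,
// so this query can be re-run without duplicating data.
// Assumption: `name` uniquely identifies a customer or a product.
MERGE (c:Customer {name: "John"})
MERGE (p:Product {name: "Camera"})
MERGE (c)-[:BUYS]->(p)
```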
Then, you can test this model with a query (e.g. what did John buy):
MATCH (c:Customer {name: "John"})-[b:BUYS]->(p)
RETURN p
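The same model can usually answer more than one question. As an illustration, the sketch below asks the reverse question (who bought a camera) by anchoring the pattern on the product instead of the customer. The `customer` alias is only a hypothetical column name chosen for readability.

```cypher
// Start from the product and follow BUYS relationships back to customers.
MATCH (c:Customer)-[:BUYS]->(p:Product {name: "Camera"})
RETURN c.name AS customer
```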
Keep in mind that graph data modeling is an iterative process. Your initial graph data model is only a starting point. As you learn more about your use cases or if they change, the model needs to adapt.
Additionally, you may find that, especially when the graph scales, you need to refactor your model to ensure it is aligned with your business needs as they evolve.
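As an example of what a refactoring might look like, suppose a new use case requires tracking individual orders. One option is to replace each direct `BUYS` relationship with an intermediate node. The sketch below is hypothetical: the `Order` label and the `PLACED` and `CONTAINS` relationship types are assumptions made for illustration and are not part of the model above.

```cypher
// For every existing purchase, insert an Order node between customer and product,
// then remove the original BUYS relationship.
// Assumption: Order, PLACED, and CONTAINS are illustrative names only.
MATCH (c:Customer)-[b:BUYS]->(p:Product)
CREATE (c)-[:PLACED]->(o:Order)-[:CONTAINS]->(p)
DELETE b
```

After a change like this, queries written against `BUYS` need to be updated to traverse the new path, which is one reason to re-test the use cases after each refactoring.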
In summary, the process of creating a data model includes the following:
- Understand the domain and define specific use cases (questions) for the application.
- Develop an initial graph data model by extracting entities and deciding how they relate to each other.
- Test the use cases against the initial data model.
- Create the graph with test data using Cypher.
- Test the use cases, including performance, against the graph.
- Refactor the graph data model due to changes in the key use cases or for performance reasons.
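To make the performance-testing step more concrete, here is a minimal sketch that reuses the customer and product example from earlier. It creates an index on the property used for lookups and then profiles the query to inspect its execution plan. The index name `customer_name` is a hypothetical choice, and whether an index helps depends on the size and shape of your data.

```cypher
// Assumption: lookups by customer name are frequent enough to justify an index.
CREATE INDEX customer_name IF NOT EXISTS
FOR (c:Customer) ON (c.name);

// PROFILE runs the query and reports the operators used and the rows they processed.
PROFILE
MATCH (c:Customer {name: "John"})-[:BUYS]->(p:Product)
RETURN p.name
```

Comparing the profile before and after a model or index change gives a rough indication of whether the change helped.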
For a more hands-on approach to data modeling, try the following resources:
- GraphAcademy: Data Modeling Fundamentals: enroll in an interactive course.
- From relational to graph: learn how to adapt data from a relational to a graph data model.
- Data modeling tools: see a list of tools you can use to create your data model.
- Data modeling tips: check tips on how to improve your data modeling skills.
- Modeling designs: see examples of data modeling designs that can be used as a strategy for your project.
- Neo4j GraphGists: find examples of graph data modeling shared by the Neo4j community.