- Learning Neo4j 3.x(Second Edition)
- Jérôme Baton, Rik Van Bruggen
- 2021-07-08 09:37:46
Granulate nodes
The typical graph modeling pattern that we discuss in this section is the granulate pattern. In graph database modeling, we tend to build data models that are much more fine-grained, with a higher level of granularity, than we would be used to in a relational model.
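In a relational model, we use a process called database normalization to come up with the granularity of our model. Wikipedia describes this process, in essence, as organizing the columns and tables of a relational database to minimize data redundancy, typically by dividing large tables into smaller ones and defining relationships between them.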
The reality of this process is that we will create smaller and smaller table structures until we reach the third normal form. This is a convention that the IT industry seems to have agreed on: a database is considered to have been normalized as soon as it achieves the third normal form. Visit http://en.wikipedia.org/wiki/Database_normalization#Normal_forms for more details.
As we discussed before, a normalized model can be quite expensive, as it effectively introduces the need for join tables and join operations at query time. Database administrators tend to denormalize the data for this very reason, which introduces data duplication, another very tricky problem to manage.
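To make the cost of normalization concrete, here is a minimal sketch using Python's built-in `sqlite3` module. The schema (`beer_brand`, `beer_type`, `brand_type`), the helper names, and the sample rows are assumptions made for illustration, not taken from the book. It shows that a simple question ("which brands belong to this type?") already needs a join table and two join operations once the data is normalized.

```python
import sqlite3


def build_demo_db():
    """Create an in-memory, normalized schema (hypothetical, for illustration)."""
    conn = sqlite3.connect(":memory:")
    conn.executescript(
        """
        CREATE TABLE beer_brand (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        CREATE TABLE beer_type  (id INTEGER PRIMARY KEY, name TEXT NOT NULL);
        -- Join table: exists only to connect brands to types.
        CREATE TABLE brand_type (
            brand_id INTEGER NOT NULL REFERENCES beer_brand(id),
            type_id  INTEGER NOT NULL REFERENCES beer_type(id),
            PRIMARY KEY (brand_id, type_id)
        );
        INSERT INTO beer_brand VALUES (1, 'Orval'), (2, 'Westmalle'), (3, 'Duvel');
        INSERT INTO beer_type  VALUES (1, 'Trappist'), (2, 'Blond');
        INSERT INTO brand_type VALUES (1, 1), (2, 1), (3, 2);
        """
    )
    return conn


def brands_of_type(conn, type_name):
    """Return the brand names for one beer type, sorted by name.

    Two joins are needed: brand -> join table -> type.
    """
    rows = conn.execute(
        """
        SELECT b.name
        FROM beer_brand AS b
        JOIN brand_type AS bt ON bt.brand_id = b.id
        JOIN beer_type  AS t  ON t.id = bt.type_id
        WHERE t.name = ?
        ORDER BY b.name
        """,
        (type_name,),
    ).fetchall()
    return [name for (name,) in rows]
```

Calling `brands_of_type(build_demo_db(), "Trappist")` walks both joins. Denormalizing would mean copying the type name onto every brand row, which removes the joins but duplicates the data.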
In graph database modeling, however, normalization is much cheaper, for the simple reason that the equivalent of these infamous join operations, following a relationship from one node to the next, is much cheaper to perform. This is why we see a clear tendency in graph models to create thin nodes and relationships, that is, nodes and relationships with few properties on them. Such nodes and relationships are very granular; we say that they have been granulated.
Related to this pattern is a typical question that we ask ourselves in every modeling session: should I keep this as a property, or should the property become its own node? For example, should we model the alcohol percentage of a beer as a property on a beer brand? The following diagram shows the model with the alcohol percentage as a property:

The alternative would be to split the alcohol percentage off as a different kind of node.
The following diagram illustrates this:

Which one of these models is right? I would say both and neither. The fundamental point is that we should look at our queries to determine which version is appropriate. In general, I would present the following arguments:
- If we don't need to evaluate the alcohol percentage during the course of a graph traversal, we are probably better off keeping it as a property of the end node of the traversal. After all, we keep our model a bit simpler when doing this, and everyone appreciates simplicity.
- If we need to evaluate the alcohol percentage of a particular beer brand (or set of beer brands) during the course of our graph traversal, then splitting it off into its own node category is probably a good idea. Traversing through a node is often easier and faster than evaluating properties for each and every path.
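The trade-off in the two arguments above can be sketched in plain Python. This is not Neo4j code: the dictionaries merely stand in for nodes and relationships, and the function names, the `HAS_ALCOHOL_PERCENTAGE` relationship name, and the sample data are all assumptions made for illustration. The question asked of both models is "which other brands have the same alcohol percentage as this one?"

```python
from collections import defaultdict

# Illustrative sample data: (brand name, alcohol percentage).
BEERS = [
    ("Orval", 6.2),
    ("Duvel", 8.5),
    ("Delirium Tremens", 8.5),
    ("Westmalle Dubbel", 7.0),
]


def build_property_model(beers):
    """Model 1: one node per brand, alcohol percentage kept as a property."""
    return {name: {"label": "BeerBrand", "abv": abv} for name, abv in beers}


def build_granulated_model(beers):
    """Model 2: alcohol percentage split off into its own node.

    The two maps stand in for a relationship (hypothetically named
    HAS_ALCOHOL_PERCENTAGE) followed in either direction.
    """
    brand_to_abv = {}
    abv_to_brands = defaultdict(set)
    for name, abv in beers:
        brand_to_abv[name] = abv
        abv_to_brands[abv].add(name)
    return brand_to_abv, abv_to_brands


def same_strength_by_property(nodes, brand):
    """Visit every brand node and evaluate its property."""
    target = nodes[brand]["abv"]
    return sorted(
        name
        for name, props in nodes.items()
        if name != brand and props["abv"] == target
    )


def same_strength_by_traversal(brand_to_abv, abv_to_brands, brand):
    """Hop brand -> percentage node -> brands, with no property checks."""
    abv_node = brand_to_abv[brand]
    return sorted(abv_to_brands[abv_node] - {brand})
```

Both functions return the same answer, but the property version has to inspect every brand, while the granulated version only follows the relationships attached to one percentage node. If the percentage is only ever read at the end of a traversal, the simpler property model is the more natural fit.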
As we will see in the next paragraph, many people actually take this approach a step further by working with in-graph indexes.