- Applied Unsupervised Learning with Python
- Benjamin Johnston Aaron Jones Christopher Kruger
- 405字
- 2021-06-11 13:23:56
The Organization of Hierarchy
Both the natural and human-made world contain many examples of organizing systems into hierarchies and why, for the most part, it makes a lot of sense. A common representation that is developed from these hierarchies can be seen in tree-based data structures. Imagine that you had a parent node with any number of child nodes that could subsequently be parent nodes themselves. By organizing concepts into a tree structure, you can build an information-dense diagram that clearly shows how things are related to their peers and their larger abstract concepts.
An example from the natural world to help illustrate this concept can be seen in how we view the hierarchy of animals, which goes from parent classes to individual species:

Figure 2.2: Navigating the relationships of animal species in a hierarchical tree structure
In Figure 2.2, you can see an example of how relational information between varieties of animals can be easily mapped out in a way that both saves space and still transmits a large amount of information. This example can be seen as both a tree of its own (showing how cats and dogs are different but both domesticated animals), and as a potential piece of a larger tree that shows a breakdown of domesticated versus non-domesticated animals.
In the event that most of you are not biologists, let's move back toward the concept of a web store selling products. If you sold a large variety of products, then you would likely want to create a hierarchical system of navigation for your customers. By withholding all of the information in your product catalog, customers will only be exposed to the path down the tree that matches their interests. An example of the hierarchical benefits of navigation can be seen in Figure 2.3:

Figure 2.3: Navigating product categories in a hierarchical tree structure
Clearly, the benefits of a hierarchical system of navigation cannot be overstated in terms of improving your customer experience. By organizing information into a hierarchical structure, you can build an intuitive structure out of your data that demonstrates explicit nested relationships. If this sounds like another approach to finding clusters in your data, then you're definitely on the right track! Through the use of similar distance metrics such as the Euclidean distance from k-means, we can develop a tree that shows the many cuts of data that allow a user to subjectively create clusters at their discretion.