
Title: The Data Science Workshop
Authors: Anthony So, Thomas V. Joseph, Robert Thas John, Andrew Worsley, Dr. Samuel Asare
Updated: 2021-06-11 18:27:26

Introduction

The previous chapters introduced you to several popular and powerful machine learning algorithms. They all belong to the same category: supervised learning. Algorithms of this kind learn patterns from a specified outcome column (the target variable), such as sales, employee churn, or customer class.

But what if your dataset has no such variable, or you don't want to specify one? Can you still run machine learning algorithms on it and find interesting patterns? The answer is yes, by using clustering algorithms, which belong to the unsupervised learning category.

Clustering algorithms are widely used in data science for grouping similar data points and detecting outliers. For instance, banks can use them for fraud detection by identifying unusual clusters in the data. E-commerce companies can use them to identify groups of users with similar browsing behaviors, as in the following figures:

Figure 5.1: Example of data on customers with similar browsing behaviors without clustering analysis performed

Clustering analysis on this data would uncover natural patterns by grouping similar data points, giving a result such as the following:

Figure 5.2: Clustering analysis performed on the data on customers with similar browsing behaviors

The data is now segmented into three customer groups based on their recurring visits and the time they spend on the website. A different marketing plan can then be used for each group to maximize sales.

In this chapter, you will learn how to perform this kind of analysis using a well-known clustering algorithm called k-means.
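As a rough illustration of the customer segmentation described above, here is a minimal pure-Python sketch of the standard k-means procedure (Lloyd's algorithm). It is not the chapter's own code. The customer data is synthetic, and the feature names (visits per month, minutes per visit), the group centers, and the `kmeans` and `sq_dist` helpers are all assumptions made for this example.

```python
import random


def sq_dist(p, q):
    """Squared Euclidean distance between two points of equal dimension."""
    return sum((a - b) ** 2 for a, b in zip(p, q))


def kmeans(points, k, n_iter=100, seed=0):
    """Hypothetical helper: plain Lloyd's algorithm, returns (centroids, labels)."""
    rng = random.Random(seed)
    centroids = rng.sample(points, k)  # start from k distinct data points
    labels = [0] * len(points)
    for _ in range(n_iter):
        # Assignment step: attach each point to its nearest centroid.
        labels = [
            min(range(k), key=lambda c: sq_dist(p, centroids[c]))
            for p in points
        ]
        # Update step: move each centroid to the mean of its members.
        new_centroids = []
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                mean = tuple(sum(col) / len(members) for col in zip(*members))
                new_centroids.append(mean)
            else:
                new_centroids.append(centroids[c])  # empty cluster: keep as is
        if new_centroids == centroids:  # nothing moved, so we have converged
            break
        centroids = new_centroids
    return centroids, labels


# Synthetic data (an assumption for illustration): three customer groups,
# each described by (visits per month, minutes per visit).
data_rng = random.Random(42)
group_centers = [(2.0, 3.0), (10.0, 20.0), (25.0, 8.0)]
customers = [
    (data_rng.gauss(cx, 1.0), data_rng.gauss(cy, 1.0))
    for cx, cy in group_centers
    for _ in range(30)
]

centroids, labels = kmeans(customers, k=3)

for c, (visits, minutes) in enumerate(centroids):
    size = labels.count(c)
    print(f"cluster {c}: {size} customers, "
          f"~{visits:.1f} visits/month, ~{minutes:.1f} min/visit")
```

No target column is supplied anywhere: the groups come only from the distances between points. In practice, a library implementation such as scikit-learn's `KMeans` would usually replace the hand-written loop, and features measured on very different scales generally need to be standardized first.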
