
The heuristic approach

Earlier in this chapter, we introduced the three models that we will be using to detect fraud. Now it's time to explore each of them in more detail, starting with the heuristic approach.

Let's start by defining a simple heuristic model and measuring how well it does at detecting fraudulent transactions.

Making predictions using the heuristic model

We will be making our predictions using the heuristic approach over the entire training data set in order to get an idea of how well this heuristic model does at predicting fraudulent transactions.

The following code creates a new column, Fraud_Heuristic, and assigns a value of 1 to rows where the type is TRANSFER and the amount is more than $200,000, and 0 otherwise:

df['Fraud_Heuristic '] = np.where((df['type'] == 'TRANSFER') &
                                  (df['amount'] > 200000), 1, 0)

With so little code, it's easy to see how such a simple heuristic is quick to write and quick to deploy.
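To see the rule in action, here is a minimal sketch on a toy DataFrame (the transaction values here are made up for illustration; only the TRANSFER type and the $200,000 threshold come from the rule above):

```python
import numpy as np
import pandas as pd

# Toy transactions (hypothetical values, for illustration only)
df = pd.DataFrame({
    'type':   ['TRANSFER', 'PAYMENT', 'TRANSFER', 'CASH_OUT'],
    'amount': [250000, 9000, 150000, 300000],
})

# Same rule as above: flag TRANSFERs above $200,000
df['Fraud_Heuristic '] = np.where((df['type'] == 'TRANSFER') &
                                  (df['amount'] > 200000), 1, 0)

print(df['Fraud_Heuristic '].tolist())  # [1, 0, 0, 0]
```

Only the first row satisfies both conditions; the second fails on type, the third on amount, and the fourth on both.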

The F1 score

One important thing we must consider is the need for a common metric on which we can evaluate all of our models. In Chapter 1, Neural Networks and Gradient-Based Optimization, we used accuracy as our evaluation metric. However, as we've seen, there are far fewer fraudulent transactions than there are genuine ones. Therefore, a model that classifies all transactions as genuine can still have a very high accuracy.

One such metric that is designed to deal with such a skewed distribution is the F1 score, which considers true and false positives and negatives, as you can see in this chart:

We can first compute the precision of our model, which specifies the share of predicted positives that were actually positive, using the following formula:

Precision = TP / (TP + FP)

Recall measures the share of actual positives that were correctly predicted, as seen in this formula:

Recall = TP / (TP + FN)

The F1 score is then calculated as the harmonic mean, a kind of average, of the two measures, which can be seen in the following formula:

F1 = 2 * (Precision * Recall) / (Precision + Recall)
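The three formulas above can be sketched directly in Python. The counts here are hypothetical, chosen only to make the arithmetic easy to follow:

```python
# Hypothetical counts, for illustration only
tp, fp, fn = 80, 20, 120  # true positives, false positives, false negatives

precision = tp / (tp + fp)  # 0.8: share of flagged cases that were real positives
recall = tp / (tp + fn)     # 0.4: share of real positives that were caught
f1 = 2 * precision * recall / (precision + recall)

print(precision, recall, round(f1, 4))
```

Note that the F1 score (about 0.53) sits closer to the weaker of the two measures than an arithmetic average (0.6) would; that is the harmonic mean at work.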

To compute this metric in Python, we can use the metrics module of scikit-learn, or sklearn for short:

from sklearn.metrics import f1_score

Given the predictions we've made, we can now easily compute the F1 score using the following command:

f1_score(y_pred=df['Fraud_Heuristic '], y_true=df['isFraud'])

out: 0.013131315551742895

You'll see that the preceding command outputs a number starting with 0.013131315… What this number means is that our heuristic model is not doing too well, as the best possible F1 score is 1 and the worst is 0. In our case, this number represents the harmonic mean of the share of correctly caught frauds over everything labeled as fraud and the share of correctly caught frauds over all frauds.
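The reason such a low score arises is that the harmonic mean is dominated by the smaller of its two inputs. A short sketch with hypothetical precision/recall pairs makes this visible:

```python
# The harmonic mean is dragged toward the smaller of the two values
def f1(precision, recall):
    return 2 * precision * recall / (precision + recall)

print(f1(0.5, 0.5))    # 0.5: balanced inputs give the same value back
print(f1(0.99, 0.01))  # ~0.0198: one near-useless measure sinks the whole score
```

So a model cannot compensate for terrible recall with excellent precision, or vice versa, which is exactly the behavior we want on a skewed fraud dataset.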

Evaluating with a confusion matrix

A more qualitative and interpretable way of evaluating a model is with a confusion matrix. As the name suggests, the matrix shows how our classifier confuses classes.

Firstly, let's compute the confusion matrix and pass it to the plot_confusion_matrix function, which you can find in the code appendix:
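The plot_confusion_matrix function itself is not reproduced here; a minimal matplotlib sketch of such a function might look like the following (this is an assumption on my part — the appendix version may differ in styling details):

```python
import numpy as np
import matplotlib.pyplot as plt

def plot_confusion_matrix(cm, classes, cmap=plt.cm.Blues):
    """Render a confusion matrix as a heatmap with count annotations."""
    fig, ax = plt.subplots()
    ax.imshow(cm, interpolation='nearest', cmap=cmap)
    ax.set_xticks(range(len(classes)))
    ax.set_yticks(range(len(classes)))
    ax.set_xticklabels(classes)
    ax.set_yticklabels(classes)
    # Annotate each cell with its count, switching text color for contrast
    thresh = cm.max() / 2
    for i in range(cm.shape[0]):
        for j in range(cm.shape[1]):
            ax.text(j, i, format(cm[i, j], ','), ha='center',
                    color='white' if cm[i, j] > thresh else 'black')
    ax.set_ylabel('True label')
    ax.set_xlabel('Predicted label')
    fig.tight_layout()
    return fig
```

Rows correspond to the true classes and columns to the predicted classes, so off-diagonal cells show exactly where the classifier confuses one class for the other.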

from sklearn.metrics import confusion_matrix

cm = confusion_matrix(y_pred=df['Fraud_Heuristic '], y_true=df['isFraud'])
plot_confusion_matrix(cm, ['Genuine', 'Fraud'])

Running this code produces the following graphic:

A confusion matrix for a heuristic model

So, just how accurate was that model? As you can see in our confusion matrix, out of the 2,770,409 examples in our dataset, 2,355,826 were correctly classified as genuine, while 406,370 genuine transactions were falsely flagged as fraud. In fact, only 2,740 examples were correctly classified as fraud.

When our heuristic model classified a transaction as fraudulent, the transaction was actually genuine in 99.3% of those cases, and only 34.2% of the total frauds got caught. All of this information is incorporated into the F1 score we formulated; however, as we saw, it is easier to read from the generated confusion matrix graphic. The reason we use both the confusion matrix and the F1 score is that it is good practice to have a single number that tells us which model is better, as well as a more graphical insight into how that model is better.
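We can check where the 99.3% figure comes from directly from the confusion-matrix counts quoted above: it is the complement of the model's precision.

```python
# Counts taken from the confusion matrix in the text
tp = 2_740    # frauds correctly flagged as fraud
fp = 406_370  # genuine transactions falsely flagged as fraud

precision = tp / (tp + fp)
print(round(1 - precision, 3))  # 0.993 -> 99.3% of flagged transactions were genuine
```

In other words, precision here is well under 1%, which is why the F1 score collapsed even before considering the low recall.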

To put it frankly, our heuristic model has performed quite poorly, detecting only 34.2% of fraud, which is not good enough. So, using the other two methods in the following sections, we're going to see whether we can do better.
