- Statistics for Machine Learning
- Pratap Dangeti
- 2021-07-02 19:05:59
Example of multilinear regression - step-by-step methodology of model building
This section walks through the approach that industry practitioners typically follow when building a linear regression model, using sample wine quality data. The statsmodels.api package is used for the multiple linear regression demonstration instead of scikit-learn, because the former reports diagnostics on individual variables, whereas the latter mainly reports final accuracy measures:
>>> import numpy as np
>>> import pandas as pd
>>> import statsmodels.api as sm
>>> import matplotlib.pyplot as plt
>>> import seaborn as sns
>>> from sklearn.model_selection import train_test_split
>>> from sklearn.metrics import r2_score
>>> wine_quality = pd.read_csv("winequality-red.csv", sep=';')
>>> # Step for converting white space in columns to _ value for better handling
>>> wine_quality.rename(columns=lambda x: x.replace(" ", "_"), inplace=True)
>>> eda_colnms = ['volatile_acidity', 'chlorides', 'sulphates', 'alcohol', 'quality']
>>> # Plots - pair plots
>>> sns.set(style='whitegrid', context='notebook')
Pair plots for a sample of five variables are shown as follows; however, we encourage you to try other combinations to inspect the relationships between the remaining variables visually:
>>> # height replaces the older size argument (renamed in seaborn 0.9)
>>> sns.pairplot(wine_quality[eda_colnms], height=2.5,
...              x_vars=eda_colnms, y_vars=eda_colnms)
>>> plt.show()

In addition to visual plots, correlation coefficients are calculated to express the strength of each relationship numerically; when there are many variables to start with, these values are used to drop some of them at the initial stage:
>>> # Correlation coefficients
>>> corr_mat = np.corrcoef(wine_quality[eda_colnms].values.T)
>>> sns.set(font_scale=1)
>>> full_mat = sns.heatmap(corr_mat, cbar=True, annot=True, square=True,
...                        fmt='.2f', annot_kws={'size': 15},
...                        yticklabels=eda_colnms, xticklabels=eda_colnms)
>>> plt.show()
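One way the correlation matrix might be used to screen variables is sketched below on synthetic data. The column names (`feat_a`, `feat_a_copy`, `feat_b`, `target`) and the 0.9 cutoff are hypothetical choices for illustration, not values taken from the wine example. The sketch ranks predictors by absolute correlation with the target and flags predictor pairs that are nearly duplicates of each other.

```python
import numpy as np
import pandas as pd

# Synthetic data (illustrative assumption, not the wine data)
rng = np.random.default_rng(0)
n = 500
a = rng.normal(size=n)
df = pd.DataFrame({
    "feat_a": a,
    "feat_a_copy": a + rng.normal(scale=0.05, size=n),  # near-duplicate of feat_a
    "feat_b": rng.normal(size=n),                       # unrelated noise
})
df["target"] = 1.5 * df["feat_a"] + rng.normal(scale=1.0, size=n)

# Pearson correlation matrix; equivalent to np.corrcoef(df.values.T)
corr = df.corr()

# Rank predictors by absolute correlation with the target
target_corr = corr["target"].drop("target").abs().sort_values(ascending=False)

# Flag predictor pairs whose mutual correlation exceeds a hypothetical cutoff
threshold = 0.9
predictors = [c for c in df.columns if c != "target"]
redundant_pairs = [
    (p, q)
    for i, p in enumerate(predictors)
    for q in predictors[i + 1:]
    if abs(corr.loc[p, q]) > threshold
]

print(target_corr)
print(redundant_pairs)
```

Predictors with weak correlation to the target, or one member of each flagged pair, are candidates for removal, although correlation alone captures only linear, pairwise relationships, so this is a first-pass filter rather than a final decision.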
