- Keras 2.x Projects
- Giuseppe Ciaburro
- 2021-07-02 14:36:19
Pattern recognition using a Keras neural network
Heart disease is often underestimated, but it is in fact the leading cause of death in the world. Coronary artery disease (CAD) alone accounts for about a third of all deaths worldwide in people over 35 years of age. CAD is the result of arteriosclerosis, which consists of the narrowing of the blood vessels and the hardening of their walls. In some cases, CAD can completely block the flow of oxygen-rich blood to the heart muscle, causing a heart attack.
CAD is caused by an accumulation of waxy, fatty deposits on the inner walls of the arteries. These deposits consist of cholesterol, calcium, and other substances that travel in the blood; the product of their accumulation is called atherosclerotic plaque. This plaque can clog the coronary arteries and make them rigid and irregular, causing the so-called hardening of the arteries, or atherosclerosis. The obstructions can be single or multiple, and they vary in severity and location. Gradually, the deposits restrict the lumen of the coronary arteries, reducing the supply of blood and oxygen to the heart muscle. This reduction in blood flow can cause chest pain (angina), difficulty in breathing (dyspnoea), and other symptoms, while complete obstruction can induce a heart attack.
Coronary angiography is used to diagnose CAD. Angiography is the diagnostic imaging of the blood or lymphatic vessels of the human body: a water-soluble contrast agent is infused into the vessels, and medical images are then generated with one of several biomedical imaging techniques.
In this example, we will try to predict a condition of heart disease through a classification algorithm based on neural networks. To do this, we will use the Heart Disease Data Set, which is available in the UCI Machine Learning Repository.
The data set comprises several databases of heart disease records, provided by the following four clinical institutions: Cleveland Clinic Foundation (CCF), Hungarian Institute of Cardiology (HIC), Long Beach Medical Center (LBMC), and University Hospital in Switzerland (SUH).
More specifically, we will refer to the data that was made available by the CCF (edited by Robert Detrano, MD, PhD). This database contains 76 attributes, but all published experiments use a subset of 14 of them. The goal is to predict the presence of heart disease in the patient. The target is an integer value from 0 (no presence) to 4. Experiments with the Cleveland database have concentrated on simply attempting to distinguish presence (values 1, 2, 3, 4) from absence (value 0).
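The reduction from five target levels to a presence/absence label can be sketched as follows. This is a minimal illustration, assuming the raw target is held in a pandas Series; the names raw_target and binary_target and the sample values are made up for the example and are not taken from the dataset.

```python
import pandas as pd

# Hypothetical raw target values on the original 0-4 scale (illustrative only).
raw_target = pd.Series([0, 2, 1, 0, 4, 3], name="HeartDisease")

# Any value greater than 0 means disease is present, so collapse 1-4 into 1.
binary_target = (raw_target > 0).astype(int)

print(binary_target.tolist())
```

The comparison yields a Boolean Series, and astype(int) turns it into the 0/1 coding used for binary classification.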
The dataset can be summarized as follows:
- Number of instances: 302
- Number of attributes: 14 continuous attributes (including the class attribute HeartDisease)
Each attribute is detailed as follows:
- age: Age in years
- sex: Sex (1 = male; 0 = female)
- cp: Chest pain type (Value 1: typical angina; Value 2: atypical angina; Value 3: non-anginal pain; Value 4: asymptomatic)
- trestbps: Resting blood pressure (in mm Hg on admission to the hospital)
- chol: Serum cholesterol in mg/dl
- fbs: Fasting blood sugar > 120 mg/dl (1 = true; 0 = false)
- restecg: Resting electrocardiographic results (Value 0: normal; Value 1: having ST-T wave abnormality (T wave inversions and/or ST elevation or depression of > 0.05 mV); Value 2: showing probable or definite left ventricular hypertrophy by Estes' criteria)
- thalach: Maximum heart rate achieved
- exang: Exercise induced angina (1 = yes; 0 = no)
- oldpeak: ST depression induced by exercise relative to rest
- slope: The slope of the peak exercise ST segment (Value 1: upsloping; Value 2: flat; Value 3: downsloping)
- ca: Number of major vessels (0-3) colored by fluoroscopy
- thal: 3 = normal; 6 = fixed defect; 7 = reversible defect
- HeartDisease: Diagnosis of heart disease (angiographic disease status) in any major vessel (Value 0: < 50% diameter narrowing; Value 1: > 50% diameter narrowing); in the original database, attributes 59 through 68 are vessels
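Several of these attributes are integer codes rather than measurements, so it can help to translate them into labels when inspecting the data. The sketch below uses the code tables from the list above; the dictionary names, the sample DataFrame, and its two rows are hypothetical and serve only to show the mapping.

```python
import pandas as pd

# Code tables copied from the attribute descriptions above.
CP_LABELS = {1: "typical angina", 2: "atypical angina",
             3: "non-anginal pain", 4: "asymptomatic"}
SLOPE_LABELS = {1: "upsloping", 2: "flat", 3: "downsloping"}
THAL_LABELS = {3: "normal", 6: "fixed defect", 7: "reversible defect"}

# Two made-up rows, used only to demonstrate the mapping.
sample = pd.DataFrame({"cp": [1, 4], "slope": [3, 2], "thal": [6, 3]})

# Series.map looks up each code in the dictionary; unknown codes become NaN.
decoded = sample.assign(
    cp=sample["cp"].map(CP_LABELS),
    slope=sample["slope"].map(SLOPE_LABELS),
    thal=sample["thal"].map(THAL_LABELS),
)
print(decoded)
```

Because codes missing from a dictionary come back as NaN, the decoded columns also give a quick way to spot unexpected values.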
The data is available in a .xlsx file named ClevelandData.xlsx, which can be downloaded from the UCI dataset. To make our job easier, the target has been reworked to present only two values (0 and 1). To start, let's look at how we can import the data into Python. To do this, we will use the read_excel function of the pandas library, which reads an Excel table into a pandas DataFrame. The first thing to do is import the library that we will use:
import pandas as pd
The available data does not contain a header row, so the variable names have to be retrieved from another file, which is also available in the UCI archive. Let's put them in a list:
HDNames = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg', 'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'HeartDisease']
Now let's import the dataset into Python:
Data = pd.read_excel('ClevelandData.xlsx', names=HDNames)
Two arguments are passed: the filename and the list of column names to use.
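One detail worth checking is how the missing header row is handled. The pandas documentation for read_excel lists header=0 as the default and advises passing header=None explicitly when the file has no header row. The sketch below illustrates the difference on a tiny in-memory table. It uses read_csv with an explicit header argument only so that it runs without an Excel file, and the three records are made up for illustration rather than taken from the dataset.

```python
import io
import pandas as pd

HDNames = ['age', 'sex', 'cp', 'trestbps', 'chol', 'fbs', 'restecg',
           'thalach', 'exang', 'oldpeak', 'slope', 'ca', 'thal', 'HeartDisease']

# Three made-up records with no header row (illustrative values only).
raw = (
    "63,1,1,145,233,1,2,150,0,2.3,3,0,6,0\n"
    "67,1,4,160,286,0,2,108,1,1.5,2,3,3,1\n"
    "41,0,2,130,204,0,2,172,0,1.4,1,0,3,0\n"
)

# header=0: the first line is treated as a header and replaced by HDNames,
# so the first record is lost.
first_row_dropped = pd.read_csv(io.StringIO(raw), names=HDNames, header=0)

# header=None: every line is read as data and HDNames supplies the labels.
all_rows_kept = pd.read_csv(io.StringIO(raw), names=HDNames, header=None)

print(len(first_row_dropped), len(all_rows_kept))
```

If the workbook really has no header row, a call along the lines of pd.read_excel('ClevelandData.xlsx', names=HDNames, header=None) would keep the first record. Comparing the number of rows in the DataFrame with the expected number of instances is a simple way to confirm which case applies.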