readme finished
parent
26080bfada
commit
06b862b1a1
23
README.md
@@ -145,6 +145,8 @@ The colour scale represents the correlation between the two types of categorizat
- Blue (low)
- Red (high)

Notably, the 60-70 age group in SB shows a markedly higher correlation and will therefore be considered in our hypothesis analysis. The other groups mostly align as expected, with a slight increase in correlation with age.

The exact procedure for creating the matrix can be found in the notebook [demographic_plots.ipynb](notebooks/demographic_plots.ipynb).

![Correlation matrix](readme_data/Korrelationsmatrix.png)
@@ -192,17 +194,32 @@ With those Classifiers, the hypothesis can be proven, that a classifier is able
The sample size in the study conducted may also play a role in the significance of the frequency.

### Noise reduction
Noise suppression was performed on the existing ECG data. A three-stage noise reduction was performed to reduce the noise in the ECG signals. First, a Butterworth filter was applied to remove the high-frequency noise. Then a Loess filter was applied to remove the low-frequency noise. Finally, a non-local-means filter was applied to remove the remaining noise. For all data, the built-in NeuroKit2 noise reduction function `ecg_clean` was used for reasons of time performance.

How the noise reduction was performed in detail can be seen in the following notebook: [noise_reduction.ipynb](notebooks/noise_reduction.ipynb)
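
The first of the three stages can be sketched as follows. This is a minimal illustration using SciPy, not the code from the notebook: the 500 Hz sampling rate, the 40 Hz cutoff, the filter order and the synthetic test signal are all assumptions chosen for the example, and the Loess and non-local-means stages are not shown.

```python
import numpy as np
from scipy.signal import butter, sosfiltfilt


def lowpass_butterworth(signal, sampling_rate, cutoff_hz=40.0, order=4):
    """Zero-phase Butterworth low-pass filter (cutoff and order are assumed values)."""
    sos = butter(order, cutoff_hz, btype="lowpass", fs=sampling_rate, output="sos")
    # Filtering forwards and backwards avoids shifting the waveform in time.
    return sosfiltfilt(sos, signal)


sampling_rate = 500  # Hz, assumed for this example
t = np.arange(0, 4, 1 / sampling_rate)

# Synthetic stand-in for an ECG trace: a slow component plus high-frequency noise.
slow_component = np.sin(2 * np.pi * 1.2 * t)
high_freq_noise = 0.3 * np.sin(2 * np.pi * 120 * t)
noisy = slow_component + high_freq_noise

filtered = lowpass_butterworth(noisy, sampling_rate)

# Root-mean-square deviation from the noise-free component, before and after filtering.
rms_before = np.sqrt(np.mean((noisy - slow_component) ** 2))
rms_after = np.sqrt(np.mean((filtered - slow_component) ** 2))
print(f"RMS error before: {rms_before:.3f}, after: {rms_after:.3f}")
```

A real ECG contains diagnostically relevant content well above 1 Hz, so the cutoff would need to be chosen with the signal in mind rather than copied from this sketch.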

### Features
The features are detected using the NeuroKit2 library, whose ability to detect features in the ECG dataset is tested. These features are important for training the model to distinguish the different diagnostic groups.

For the training, the features considered are:
- ventricular rate
- atrial rate
- T axis
- R axis
- Q peak amplitude
- QT length
- QRS duration
- QRS count
- gender
- age

The selection of features was informed by an analysis presented in a paper (source: https://rdcu.be/dH2jI, last accessed: 15.05.2024), where various feature sets were evaluated. These features were chosen for their optimal balance between performance and significance.

The exact process can be found in the notebook: [features_detection.ipynb](notebooks/features_detection.ipynb).
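
As a rough illustration of how two of the listed features relate to detected R peaks, the sketch below derives the ventricular rate and the QRS count from R-peak sample indices. The function name, the peak indices and the 500 Hz sampling rate are hypothetical values chosen for the example. In the project the peaks are detected with NeuroKit2, and the notebook may compute these values differently.

```python
from statistics import mean


def ventricular_rate_bpm(r_peak_indices, sampling_rate):
    """Mean ventricular rate in beats per minute from R-peak sample indices."""
    if len(r_peak_indices) < 2:
        raise ValueError("at least two R peaks are needed to measure an RR interval")
    # RR intervals in seconds between consecutive R peaks.
    rr_intervals = [
        (later - earlier) / sampling_rate
        for earlier, later in zip(r_peak_indices, r_peak_indices[1:])
    ]
    return 60.0 / mean(rr_intervals)


# Hypothetical R-peak positions (in samples) for a recording sampled at 500 Hz.
r_peaks = [100, 500, 900, 1300, 1700]
sampling_rate = 500

rate = ventricular_rate_bpm(r_peaks, sampling_rate)
qrs_count = len(r_peaks)  # one QRS complex per detected R peak

print(f"ventricular rate: {rate:.1f} bpm, QRS count: {qrs_count}")
```

Here the peaks are 400 samples apart, which corresponds to an RR interval of 0.8 s and therefore a rate of about 75 beats per minute.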

### ML-models
For machine learning, the features were first tailored to the models, and a grid search was then used to identify the best hyperparameters. The highest performance was achieved by the Extreme Gradient Boosting (XGBoost) model, with an accuracy of 83%. A Gradient Boosting Tree model was evaluated using the same procedure and achieved an accuracy of 82%. The selection of these models was influenced by the team's own experience and by the performance metrics highlighted in the paper (source: https://rdcu.be/dH2jI, last accessed: 15.05.2024). The evaluation of the models also shows that some features, such as the ventricular rate, are more important than others.

<br>The detailed procedures can be found in the following notebooks:
<br>[ml_xgboost.ipynb](notebooks/ml_xgboost.ipynb)
<br>[ml_grad_boost_tree.ipynb](notebooks/ml_grad_boost_tree.ipynb)
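
The grid-search procedure can be sketched with scikit-learn as shown below. This is an illustrative example under stated assumptions, not the configuration used in the notebooks: the data is synthetic (ten random features standing in for the feature list), and the parameter grid, fold count and split ratio are made up for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.metrics import accuracy_score
from sklearn.model_selection import GridSearchCV, train_test_split

# Synthetic stand-in for the ECG feature table (assumption: 10 numeric features).
X, y = make_classification(
    n_samples=400, n_features=10, n_informative=5, random_state=0
)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=0, stratify=y
)

# Hypothetical hyperparameter grid; the real grid is defined in the notebooks.
param_grid = {
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
    "learning_rate": [0.05, 0.1],
}

# Every parameter combination is scored with 3-fold cross-validation.
search = GridSearchCV(
    GradientBoostingClassifier(random_state=0),
    param_grid,
    cv=3,
    scoring="accuracy",
)
search.fit(X_train, y_train)

# Evaluate the best configuration on data not used during the search.
best_model = search.best_estimator_
test_accuracy = accuracy_score(y_test, best_model.predict(X_test))

# Relative importance of each feature in the fitted model.
importances = best_model.feature_importances_

print("best parameters:", search.best_params_)
print(f"held-out accuracy: {test_accuracy:.2f}")
```

Since `XGBClassifier` from the xgboost package exposes a scikit-learn-compatible interface, it could in principle take the place of `GradientBoostingClassifier` here, with a grid adapted to its own parameter names.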