Training the Long Short-Term Memory (LSTM) Model




LSTM is a type of recurrent neural network (RNN) architecture that is particularly effective in processing and predicting sequences of data. LSTM networks are designed to address the limitations of traditional RNNs, which tend to struggle with capturing long-term dependencies in sequential data [1].

LSTMs were introduced by Hochreiter and Schmidhuber in 1997 as an enhancement to RNNs. The key innovation of LSTM networks is the incorporation of a memory cell, which allows them to selectively remember or forget information over extended time periods [2]. This memory cell is equipped with various gating mechanisms that regulate the flow of information, making it capable of capturing and preserving relevant information while disregarding irrelevant or redundant information.

LSTM models are well suited to regression problems on time-series data, particularly where forecasting is required. They are designed to mitigate the vanishing gradient problem and to maintain long-term dependencies in time-series data. These networks observe the input signals and, based on them, decide which information to forget, remember, or update in the memory cell.
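For reference, one common formulation of the LSTM cell update is given below, where x_t is the input at cycle t, h_t the hidden state, c_t the memory cell, and f_t, i_t, o_t the forget, input, and output gates; the weight matrices W, U and biases b are learned during training:

```latex
\begin{aligned}
f_t &= \sigma(W_f x_t + U_f h_{t-1} + b_f) && \text{forget gate}\\
i_t &= \sigma(W_i x_t + U_i h_{t-1} + b_i) && \text{input gate}\\
o_t &= \sigma(W_o x_t + U_o h_{t-1} + b_o) && \text{output gate}\\
\tilde{c}_t &= \tanh(W_c x_t + U_c h_{t-1} + b_c) && \text{candidate memory}\\
c_t &= f_t \odot c_{t-1} + i_t \odot \tilde{c}_t && \text{memory cell update}\\
h_t &= o_t \odot \tanh(c_t) && \text{hidden state (output)}
\end{aligned}
```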

The algorithm, shown in the flowchart of Figure 1 below, was configured with a learning rate of 0.001, a maximum of 400 epochs, and 200 hidden units.

Figure 1: Flowchart representing the Matlab algorithm for LSTM
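As an illustration, a minimal MATLAB (Deep Learning Toolbox) sketch of a network and training options matching these hyperparameters could look as follows; the variable names, the single input feature, and the choice of the Adam solver are assumptions rather than details taken from the original script:

```matlab
% Sequence-to-sequence LSTM regression network with 200 hidden units
% (assumed: one input feature, the capacitance value per cycle).
numFeatures    = 1;
numHiddenUnits = 200;
numResponses   = 1;

layers = [ ...
    sequenceInputLayer(numFeatures)
    lstmLayer(numHiddenUnits)
    fullyConnectedLayer(numResponses)
    regressionLayer];

% Training options using the reported learning rate and epoch count;
% the 'adam' solver and the remaining options are assumptions.
options = trainingOptions('adam', ...
    'MaxEpochs', 400, ...
    'InitialLearnRate', 0.001, ...
    'Shuffle', 'never', ...
    'Plots', 'training-progress', ...
    'Verbose', 0);
```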

Feeding and Training the Network



The model was trained on the features from the first 300 cycles, since the aim is to forecast the capacitance values for the following 300 cycles.
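A sketch of this step, under the assumption that capacitance is a 1-by-N vector of measured values per cycle and using a one-step-ahead target construction (as in MATLAB's typical time-series forecasting workflow), could look like this:

```matlab
dataTrain = capacitance(1:300);          % first 300 cycles used for training

% Standardise for stable training; keep mu and sigma to rescale forecasts later.
mu    = mean(dataTrain);
sigma = std(dataTrain);
dataTrainStd = (dataTrain - mu) / sigma;

% One-step-ahead targets: predict the capacitance at cycle t+1 from cycle t.
XTrain = dataTrainStd(1:end-1);
YTrain = dataTrainStd(2:end);

% Train using the layer array and options defined earlier.
net = trainNetwork(XTrain, YTrain, layers, options);
```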

Training results

Figure 2: LSTM Training results

Figure 3: LSTM Training Errors

As seen in the figures above, the validation RMSE was 0.007199, which is small and close to zero, as desired. The training ran for 400 epochs, and in each epoch the model was exposed to the dataset once, since the number of iterations per epoch was set to 1. The training errors of the LSTM model can be observed in Figure 3. An important point to note is that the validation RMSE keeps decreasing as the number of epochs increases, suggesting that the error could be reduced further by adjusting the number of epochs and the learning rate.

Testing the Models

Figure 4: Capacitance curve of the Batch 4 SC2 trained and SC2 tested LSTM model

The forecasted capacitance curve shown above deviates slightly from the actual capacitance plot. The difference can be observed in the close-up below, which highlights the cycle at which the threshold was met.

Figure 5: Close-up view of the forecasted capacitance curve from Figure 4

From the close-up view of the capacitance curve, it can be seen that the actual capacitance meets the threshold at the 410th cycle, whereas the forecasted capacitance drops slightly below the threshold at that same cycle.
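Continuing the earlier sketch, the forecast and the threshold-crossing cycle could be obtained with closed-loop prediction as shown below; the variable threshold is assumed to hold the end-of-life capacitance value used in the figures:

```matlab
% Warm up the network state on the training sequence, then forecast the
% next 300 cycles in closed loop (each prediction is fed back as the next input).
net = predictAndUpdateState(net, XTrain);
[net, YPred] = predictAndUpdateState(net, YTrain(end));

numForecast = 300;
for k = 2:numForecast
    [net, YPred(:,k)] = predictAndUpdateState(net, YPred(:,k-1));
end

YPred = sigma * YPred + mu;                   % undo the standardisation

% First forecasted cycle at which the capacitance drops below the threshold.
eolCycle = 300 + find(YPred < threshold, 1);
```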

Table 1: Error Values For SC2 Trained and Tested LSTM Model

Model                           | MSE        | RMSE      | MAE       | MAPE (%)
Batch 4 Trained SC2, Tested SC2 | 1.2393e-07 | 0.0011132 | 0.0009592 | 0.10447
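The MSE, RMSE, MAE, and MAPE values reported in the tables can be computed in a few lines of MATLAB; the sketch below assumes yTrue and yPred hold the actual and forecasted capacitance values over the evaluated cycles:

```matlab
err  = yTrue - yPred;                   % per-cycle forecast error
MSE  = mean(err.^2);                    % mean squared error
RMSE = sqrt(MSE);                       % root mean squared error
MAE  = mean(abs(err));                  % mean absolute error
MAPE = mean(abs(err ./ yTrue)) * 100;   % mean absolute percentage error (%)
```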

Similar to the shallow neural network model, the LSTM model was also tested with data from supercapacitors it was not trained on, in order to observe its performance and accuracy.

Figure 6: Capacitance curve of the Batch 4 SC2 trained and SC9 tested LSTM model

Table 2: Error Values for the SC2 Trained and SC9 Tested LSTM Model

Model                           | MSE       | RMSE      | MAE       | MAPE (%)
Batch 4 Trained SC2, Tested SC9 | 5.704e-06 | 0.0023883 | 0.0023544 | 0.25608


It was observed that when the model trained on supercapacitor 2 was tested with data from supercapacitor 9, the error values differed slightly but remained quite close. Comparing the RMSE values of the two models, the shallow neural network and the LSTM were very close to each other, with RMSEs of 0.0023759 and 0.0023883 respectively.

Table 3: Error Values for the Various Trained LSTM Models.

Trained Model        | Tested | MSE        | RMSE      | MAE       | MAPE (%)
Batch 4 Trained SC2  | SC3    | 0.0010952  | 0.033093  | 0.033016  | 3.4729
Batch 4 Trained SC2  | SC6    | 0.0015147  | 0.038919  | 0.038871  | 4.0644
Batch 4 Trained SC2  | SC9    | 5.704e-06  | 0.0023883 | 0.0023544 | 0.25608
Batch 4 Trained SC2  | SC12   | 0.00011899 | 0.010908  | 0.010902  | 1.1747
Batch 4 Trained SC2  | SC15   | 0.0017542  | 0.041884  | 0.041704  | 4.7679
Batch 4 Trained SC8  | SC3    | 1.6213e-06 | 0.0012733 | 0.0011092 | 0.11691
Batch 4 Trained SC8  | SC6    | 3.0146e-05 | 0.0054905 | 0.0054177 | 0.56607
Batch 4 Trained SC8  | SC9    | 0.00096642 | 0.031087  | 0.031061  | 3.377
Batch 4 Trained SC8  | SC12   | 0.00050871 | 0.022555  | 0.022521  | 2.4259
Batch 4 Trained SC8  | SC15   | 0.0056438  | 0.075125  | 0.075074  | 8.5779
Batch 4 Trained SC13 | SC3    | 0.0010122  | 0.031816  | 0.031812  | 3.3482
Batch 4 Trained SC13 | SC6    | 0.0014188  | 0.037666  | 0.03766   | 3.9396
Batch 4 Trained SC13 | SC9    | 7.8718e-06 | 0.0028057 | 0.0023687 | 0.25812
Batch 4 Trained SC13 | SC12   | 0.00010112 | 0.010056  | 0.0097455 | 1.0515
Batch 4 Trained SC13 | SC15   | 0.0018321  | 0.042804  | 0.042777  | 4.8875
Batch 4 Trained SC19 | SC3    | 0.00027459 | 0.016571  | 0.016548  | 1.7417
Batch 4 Trained SC19 | SC6    | 0.00050258 | 0.022418  | 0.022392  | 2.3426
Batch 4 Trained SC19 | SC9    | 0.00020458 | 0.014303  | 0.014055  | 1.5269
Batch 4 Trained SC19 | SC12   | 3.7337e-05 | 0.0061104 | 0.0055227 | 0.59356
Batch 4 Trained SC19 | SC15   | 0.0033707  | 0.058058  | 0.05803   | 6.6297
Batch 4 Trained SC25 | SC3    | 0.00019521 | 0.013972  | 0.013959  | 1.4688
Batch 4 Trained SC25 | SC6    | 0.00039263 | 0.019815  | 0.019808  | 2.0718
Batch 4 Trained SC25 | SC9    | 0.00028003 | 0.016734  | 0.016616  | 1.8057
Batch 4 Trained SC25 | SC12   | 6.9175e-05 | 0.0083171 | 0.0080869 | 0.87019
Batch 4 Trained SC25 | SC15   | 0.0036754  | 0.060625  | 0.060588  | 6.9226


As shown by the error values obtained for the different LSTM models in Table 3 above, the values were quite close to zero, indicating that the models were able to predict the unknown capacitance values with good accuracy.



References

[1] H. D. Nguyen, K. P. Tran, S. Thomassey, and M. Hamad, "Forecasting and Anomaly Detection approaches using LSTM and LSTM Autoencoder techniques with the applications in supply chain management," International Journal of Information Management, vol. 57, p. 102282, 2021.

[2] W. Luo, W. Liu and S. Gao, "Remembering history with convolutional LSTM for anomaly detection," 2017 IEEE International Conference on Multimedia and Expo (ICME), Hong Kong, China, 2017, pp. 439-444, doi: 10.1109/ICME.2017.8019325.




Edited by Shahil and Henal

S11172483@student.usp.ac.fj 

S11085370@student.usp.ac.fj

