Training the Long Short-Term Memory (LSTM) Model
LSTM is a type of recurrent neural network (RNN) architecture that is particularly effective in processing and predicting sequences of data. LSTM networks are designed to address the limitations of traditional RNNs, which tend to struggle with capturing long-term dependencies in sequential data [1].
LSTMs were introduced by Hochreiter and Schmidhuber in 1997
as an enhancement to RNNs. The key innovation of LSTM networks is the
incorporation of a memory cell, which allows them to selectively remember or
forget information over extended time periods [2]. This memory cell is equipped
with various gating mechanisms that regulate the flow of information, making it
capable of capturing and preserving relevant information while disregarding
irrelevant or redundant information.
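The gating behaviour described above can be sketched in code. The following is a minimal, illustrative single-step LSTM cell with scalar states, written in plain Python; the weight names (`wf`, `uf`, `bf`, etc.) are made up for this sketch and do not correspond to any particular library's API.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def lstm_step(x, h_prev, c_prev, w):
    """One LSTM time step with scalar input and state (illustrative only).

    w holds the input, recurrent, and bias weights for the forget gate f,
    input gate i, candidate value g, and output gate o.
    """
    f = sigmoid(w["wf"] * x + w["uf"] * h_prev + w["bf"])    # what to forget
    i = sigmoid(w["wi"] * x + w["ui"] * h_prev + w["bi"])    # what to write
    g = math.tanh(w["wg"] * x + w["ug"] * h_prev + w["bg"])  # candidate value
    o = sigmoid(w["wo"] * x + w["uo"] * h_prev + w["bo"])    # what to expose
    c = f * c_prev + i * g   # memory cell: selectively keep old + add new
    h = o * math.tanh(c)     # hidden state read out from the cell
    return h, c
```

The line `c = f * c_prev + i * g` is the memory cell update: the forget gate scales the old cell contents while the input gate controls how much of the new candidate is written, which is what lets an LSTM preserve relevant information and discard the rest.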
LSTM models are well suited to regression problems with time series data, especially where forecasting is involved. They are a type of RNN designed to alleviate the vanishing gradient problem and maintain long-term dependencies in time series data. These networks observe the input signals and, based on them, decide which information to forget, remember, or update in the memory cell.
The network, shown in Figure 1, was trained with a learning rate of 0.001, a maximum of 400 epochs, and 200 hidden units. The following images illustrate the training results of the LSTM model.
[Figure: Feeding and training the network]
[Figure: Training results]
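The training schedule just described (learning rate 0.001, up to 400 epochs, one iteration per epoch so the dataset is seen once per epoch) can be sketched as a plain gradient-descent loop. The model below is a deliberate stand-in, a single scalar weight predicting the next value from the previous one, since the post does not include the actual LSTM weights or dataset.

```python
def train(series, lr=0.001, max_epochs=400):
    """Gradient descent with the schedule from the post: lr=0.001,
    up to 400 epochs, one pass over the data per epoch.

    The "model" here is one scalar weight w predicting c[t+1] ~ w * c[t];
    it stands in for the real 200-hidden-unit LSTM, which is not shown."""
    w = 0.0
    for epoch in range(max_epochs):
        # One iteration per epoch: the whole sequence is seen exactly once.
        grad = 0.0
        for x, y in zip(series[:-1], series[1:]):
            grad += 2 * (w * x - y) * x  # d/dw of the squared error
        w -= lr * grad / (len(series) - 1)
    return w
```

Because the learning rate is small and each epoch is a single pass, the error keeps shrinking across the 400 epochs without fully converging, which mirrors the observation below that the validation RMSE was still decreasing when training stopped.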
As seen from the figures above, the validation RMSE came to 0.007199, which is small and close to zero, as desired. The training ran for 400 epochs, and in each epoch the model was exposed to the dataset once, since the number of iterations per epoch was set to 1. Furthermore, the training errors of the LSTM model can be observed in Figure 3. An important point to note is that the validation RMSE keeps decreasing as the epochs increase, suggesting that the error could be reduced further by adjusting the number of epochs and the learning rate.
Testing the Models
The forecasted capacitance curve shown above lies slightly away from the actual capacitance plot. The difference can be observed in the image below, which marks the cycle at which the threshold was met.
From the close-up view of the capacitance curve seen above, the threshold is met at the 410th cycle, whereas the forecasted capacitance graph shows the capacitance going slightly below the threshold at the 410th cycle.
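Locating the cycle at which the capacitance first reaches the end-of-life threshold can be expressed as a simple scan. The function name, the 1-indexed cycle numbering, and the use of a plain list are assumptions for this sketch, not the authors' actual tooling.

```python
def threshold_cycle(capacitance, threshold):
    """Return the first cycle (1-indexed) at which the capacitance
    drops to or below the threshold, or None if it never does.

    capacitance: sequence of capacitance values, one per cycle."""
    for cycle, c in enumerate(capacitance, start=1):
        if c <= threshold:
            return cycle
    return None
```

Running this on both the measured and the forecasted capacitance series, and comparing the two returned cycle numbers, gives the kind of threshold-crossing comparison described above (e.g., both curves meeting the threshold near the 410th cycle).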
Table 1: Error Values for SC2 Trained and Tested LSTM Model

| Model | MSE | RMSE | MAE | MAPE |
|---|---|---|---|---|
| Batch 4 Trained SC2 | | | | |
| Tested SC2 | 1.2393e-07 | 0.0011132 | 0.0009592 | 0.10447 |
Similar to the shallow neural network model, the LSTM model was also tested with foreign data points to observe its performance and accuracy.
Table 2: Error Values for SC2 Trained LSTM Model Tested on SC9

| Model | MSE | RMSE | MAE | MAPE |
|---|---|---|---|---|
| Batch 4 Trained SC2 | | | | |
| Tested SC9 | 5.704e-06 | 0.0023883 | 0.0023544 | 0.25608 |
It was observed that when the supercapacitor 2 trained model was tested with data from supercapacitor 9, the values differed slightly but were still quite close. Comparing the RMSE values from Tables 4 and 6, they were very close to each other, with the shallow neural network at an RMSE of 0.0023759 and the LSTM at 0.0023883.
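The four error measures reported in the tables have standard definitions, which can be computed as follows. This is a minimal sketch assuming plain Python lists of actual and predicted capacitance values; MAPE is reported as a percentage, matching the magnitudes in the tables.

```python
import math

def error_metrics(actual, predicted):
    """MSE, RMSE, MAE, and MAPE (in percent) between two sequences."""
    n = len(actual)
    errs = [p - a for a, p in zip(actual, predicted)]
    mse = sum(e * e for e in errs) / n                           # mean squared error
    rmse = math.sqrt(mse)                                        # root of MSE
    mae = sum(abs(e) for e in errs) / n                          # mean absolute error
    mape = 100.0 * sum(abs(e / a)
                       for a, e in zip(actual, errs)) / n        # mean abs. % error
    return {"MSE": mse, "RMSE": rmse, "MAE": mae, "MAPE": mape}
```

Note that RMSE is always the square root of MSE, so the paired columns in the tables (e.g., 1.2393e-07 and 0.0011132) can be cross-checked against each other.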
Table 3: Error Values for the Various Trained LSTM Models

| Model | MSE | RMSE | MAE | MAPE |
|---|---|---|---|---|
| Batch 4 Trained SC2 | | | | |
| Tested SC3 | 0.0010952 | 0.033093 | 0.033016 | 3.4729 |
| Tested SC6 | 0.0015147 | 0.038919 | 0.038871 | 4.0644 |
| Tested SC9 | 5.704e-06 | 0.0023883 | 0.0023544 | 0.25608 |
| Tested SC12 | 0.00011899 | 0.010908 | 0.010902 | 1.1747 |
| Tested SC15 | 0.0017542 | 0.041884 | 0.041704 | 4.7679 |
| Batch 4 Trained SC8 | | | | |
| Tested SC3 | 1.6213e-06 | 0.0012733 | 0.0011092 | 0.11691 |
| Tested SC6 | 3.0146e-05 | 0.0054905 | 0.0054177 | 0.56607 |
| Tested SC9 | 0.00096642 | 0.031087 | 0.031061 | 3.377 |
| Tested SC12 | 0.00050871 | 0.022555 | 0.022521 | 2.4259 |
| Tested SC15 | 0.0056438 | 0.075125 | 0.075074 | 8.5779 |
| Batch 4 Trained SC13 | | | | |
| Tested SC3 | 0.0010122 | 0.031816 | 0.031812 | 3.3482 |
| Tested SC6 | 0.0014188 | 0.037666 | 0.03766 | 3.9396 |
| Tested SC9 | 7.8718e-06 | 0.0028057 | 0.0023687 | 0.25812 |
| Tested SC12 | 0.00010112 | 0.010056 | 0.0097455 | 1.0515 |
| Tested SC15 | 0.0018321 | 0.042804 | 0.042777 | 4.8875 |
| Batch 4 Trained SC19 | | | | |
| Tested SC3 | 0.00027459 | 0.016571 | 0.016548 | 1.7417 |
| Tested SC6 | 0.00050258 | 0.022418 | 0.022392 | 2.3426 |
| Tested SC9 | 0.00020458 | 0.014303 | 0.014055 | 1.5269 |
| Tested SC12 | 3.7337e-05 | 0.0061104 | 0.0055227 | 0.59356 |
| Tested SC15 | 0.0033707 | 0.058058 | 0.05803 | 6.6297 |
| Batch 4 Trained SC25 | | | | |
| Tested SC3 | 0.00019521 | 0.013972 | 0.013959 | 1.4688 |
| Tested SC6 | 0.00039263 | 0.019815 | 0.019808 | 2.0718 |
| Tested SC9 | 0.00028003 | 0.016734 | 0.016616 | 1.8057 |
| Tested SC12 | 6.9175e-05 | 0.0083171 | 0.0080869 | 0.87019 |
| Tested SC15 | 0.0036754 | 0.060625 | 0.060588 | 6.9226 |
References
Edited by Shahil and Henal
S11172483@student.usp.ac.fj
S11085370@student.usp.ac.fj