Long Short Term Memory (LSTM) and how to implement LSTM using Python

Mrinal Walia
11 min readJul 10, 2022

What is LSTM? You might have heard this term in the last interview you gave for a Machine Learning Engineer position, or some of your friends might have mentioned using LSTM in their predictive modelling projects. So the big question that may arise here is what is LSTM, the purpose of using LSTM in your projects, what type of projects can be made using the LSTM algorithm, etc. Do not worry. This article will cover the in-depth architecture of LSTM networks, how an LSTM works, and its application in the real world.

In the coming sections of this article, you will understand:

  • The architecture of RNNs and problems involved with time series forecasting using RNNs
  • Standard LSTM architecture and how to build a network of LSTM cells

What is Time-Series Forecasting?

Time Series Forecasting is a technique of using the time series data values and then using it to make predictions about future values on our historical data points.

Time series forecasting has many applications in the field of medical health(for preventing a disease), finance(for predicting future stock prices), weather forecasting(for predicting the weather in future), etc.

Let our time series data vector be:

T = [[t1], [t2], [t3],…….., [tn]]

our task is to predict or forecast the future values [[tn+1], [tn+2],…] based on the historical data i.e our time series data vector.

Note: Time Series Forecasting is a technique in Machine Learning using which we can analyze our sequence of ordered values of time to predict the outcome in the future, but as of now, there is no algorithm using which we can achieve human-like performance using machine learning for predictions has some limitations and drawbacks as well but for now, it is out of the course of this article as I aim to show the technique/algorithm which can be potentially used for getting good accuracy for time-series predictions.

Understanding the structure of RNN

Let us assume a sequence of data containing vectors:

x = [x(1), x(2), ….., x(t)] where each element x(i) is a vector.

When we train a simple generic Neural Network on the sequence of data, we…



Mrinal Walia

Data Scientist and a Technical Writer! I will give you the best of Open-Source and AI.