Long Short-Term Memory (LSTM)

Models & Architectures

An RNN variant that uses gating to retain information and stabilize gradients over long sequences.


LSTMs use input, forget, and output gates to control what is written to, kept in, and read from an internal cell state, allowing information to persist across long sequences.
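
The gating mechanism can be sketched in a few lines. This is a minimal single-step LSTM cell in NumPy, with an assumed layout where one weight matrix `W` produces all four gate pre-activations; the function name `lstm_step` and the stacked-gate ordering are illustrative choices, not a reference to any particular library's API.

```python
import numpy as np

def lstm_step(x, h_prev, c_prev, W, b):
    """One LSTM time step.

    W maps the concatenated [h_prev; x] vector to four stacked gate
    pre-activations: input gate, forget gate, cell candidate, output gate.
    """
    H = h_prev.shape[0]
    z = W @ np.concatenate([h_prev, x]) + b
    sigmoid = lambda a: 1.0 / (1.0 + np.exp(-a))
    i = sigmoid(z[0:H])          # input gate: how much new content to write
    f = sigmoid(z[H:2 * H])      # forget gate: how much old state to keep
    g = np.tanh(z[2 * H:3 * H])  # candidate cell update
    o = sigmoid(z[3 * H:4 * H])  # output gate: how much state to expose
    c = f * c_prev + i * g       # cell state: gated blend of old and new
    h = o * np.tanh(c)           # hidden state read out from the cell
    return h, c

# Usage: run a few steps over random inputs (hypothetical sizes).
rng = np.random.default_rng(0)
H, D = 4, 3                      # hidden size, input size
W = rng.standard_normal((4 * H, H + D)) * 0.1
b = np.zeros(4 * H)
h, c = np.zeros(H), np.zeros(H)
for t in range(5):
    h, c = lstm_step(rng.standard_normal(D), h, c, W, b)
```

Because the forget gate multiplies the previous cell state rather than repeatedly squashing it through a nonlinearity, the cell can carry information across many steps with more stable gradients than a vanilla RNN.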

  • Advantages: More stable gradients than vanilla RNNs.
  • Applications: Speech, music, time series forecasting.
  • Trade-offs: More parameters and higher computational cost than GRUs.