Architecture

Recurrent Neural Network (RNN)

A neural network architecture that processes sequences one step at a time, passing a hidden state forward to remember earlier inputs.

A Recurrent Neural Network (RNN) is a neural network designed for sequential data (text, audio, time series) where the order of inputs matters. Instead of processing the whole sequence at once, an RNN reads one element at a time and maintains a hidden state that gets updated at each step, acting as a kind of memory of what came before.

RNNs were the dominant approach to language modeling, machine translation, and speech recognition through the mid-2010s. You'll still encounter them in older NLP pipelines, in some on-device speech models, and in time-series forecasting where compute is tight. But for most large-scale language tasks they've been replaced by Transformers, which can look at the whole sequence in parallel and don't suffer from the same memory bottleneck.

The classic analogy: reading a book one word at a time and only remembering a short summary in your head as you go. By chapter 10, details from chapter 1 are fuzzy. That's the "vanishing gradient" problem RNNs are famous for. Variants like LSTM and GRU were invented specifically to let the network hold onto information across longer spans by adding gates that decide what to remember and what to forget.

In practice, training RNNs is also slow because each step depends on the previous one: you can't easily parallelize across the sequence. That's a big reason Transformers won; they trade memory for parallelism and scale much better on GPUs.

Related concepts to explore next: LSTM, GRU, sequence-to-sequence models, attention mechanism, Transformer.
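The step-by-step hidden-state update described above can be sketched in a few lines of NumPy. This is a minimal illustration of a vanilla RNN cell, not a trainable implementation; the layer sizes, random weights, and function names here are illustrative choices, not taken from any particular library.

```python
import numpy as np

# Illustrative sizes, chosen for the example.
input_size, hidden_size, seq_len = 4, 8, 5

rng = np.random.default_rng(0)
W_x = rng.normal(scale=0.1, size=(hidden_size, input_size))   # input-to-hidden weights
W_h = rng.normal(scale=0.1, size=(hidden_size, hidden_size))  # hidden-to-hidden weights
b = np.zeros(hidden_size)                                     # bias

def rnn_step(x_t, h_prev):
    # Core recurrence: the new hidden state mixes the current input
    # with the previous hidden state, squashed through tanh.
    return np.tanh(W_x @ x_t + W_h @ h_prev + b)

# Process a sequence one element at a time, carrying the hidden
# state forward -- this loop is exactly why RNN training can't be
# parallelized across time steps.
h = np.zeros(hidden_size)
sequence = rng.normal(size=(seq_len, input_size))
for x_t in sequence:
    h = rnn_step(x_t, h)

print(h.shape)  # (8,) -- one fixed-size summary of the whole sequence
```

Note that the final `h` is a fixed-size vector regardless of sequence length, which is the "short summary in your head" from the book analogy, and also the memory bottleneck the article mentions.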

Last updated: 2026-04-29

