  1. Architecture and Working of Transformers in Deep Learning

Feb 27, 2025 · Understanding Transformer Architecture. The transformer model is built on an encoder-decoder architecture where both the encoder and the decoder are stacks of layers that combine self-attention mechanisms with feed-forward neural networks (a minimal encoder layer of this kind is sketched in code after this list).

  2. Understanding Transformer Architecture: A Beginner’s Guide to Encoders

    Dec 26, 2024 · In this article, we’ll explore the core components of the transformer architecture: encoders, decoders, and encoder-decoder models. Don’t worry if you’re new to these concepts — we’ll...

  3. Visualizing and Explaining Transformer Models From the Ground …

Jan 19, 2023 · In an encoder-decoder schema, the encoder takes in the entire input sequence and transforms it into a vectorized representation that accumulates knowledge of the input at every time step (the encoder sketch after this list shows this one-vector-per-position output).

  4. Transformer (deep learning architecture) - Wikipedia

An earlier seq2seq model's architecture consists of two parts. The encoder is an LSTM that takes in a sequence of tokens and turns it into a vector. The decoder is another LSTM that converts the vector into a sequence of tokens (a minimal LSTM encoder-decoder is sketched in code after this list). Similarly, another 130M-parameter model used gated recurrent units (GRU) instead of …

  5. An In-Depth Look at the Transformer Based Models - Medium

Mar 17, 2023 · Fig. 1: Transformer-based models graph. The graph illustrates models of three architectural families: encoder-only (autoencoding, AE), decoder-only (autoregressive, AR), and encoder-decoder... (the masking difference between these families is sketched in code after this list).

  6. Transformers made easy: architecture and data flow

Oct 29, 2019 · Seq2seq neural networks are composed of two main elements: an encoder and a decoder. The encoder is fed the input data and encodes it into a hidden state called the context vector (see the LSTM sketch after this list). Then...

  7. How Transformers Work: A Detailed Exploration of Transformer ...

Jan 9, 2024 · Transformers are the current state-of-the-art NLP architecture and are considered the evolution of the encoder-decoder approach. However, while earlier encoder-decoder models relied mainly on Recurrent Neural Networks (RNNs) to extract sequential information, Transformers have no recurrence at all. So how do they do it? (The attention sketch after this list shows the replacement mechanism.)

  8. Transformer-based Encoder-Decoder Models - Hugging Face

We will focus on the mathematical model defined by the architecture and how the model can be used in inference (a greedy-decoding inference loop is sketched in code after this list). Along the way, we will give some background on sequence-to-sequence models in NLP and break down the transformer-based encoder-decoder architecture into its encoder and decoder parts.

  9. A Gentle Introduction to Attention and Transformer Models

Mar 29, 2025 · The Transformer Architecture. The original transformer architecture is composed of an encoder and a decoder. Recall that the transformer model was developed for translation tasks, replacing the seq2seq architecture commonly built with RNNs; it therefore borrowed the encoder-decoder layout (a decoder layer with cross-attention is sketched in code after this list).

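Code sketches for the concepts above follow; all are illustrative Python, not code from the linked articles. First, the mechanism behind item 7's question of how Transformers work without recurrence: scaled dot-product attention relates every position to every other in a single step. A minimal single-head NumPy sketch, with shapes chosen only for illustration:

```python
import numpy as np

def scaled_dot_product_attention(Q, K, V):
    """Q, K, V: (seq_len, d_k) query/key/value matrices."""
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)       # pairwise similarity of positions
    scores -= scores.max(axis=-1, keepdims=True)    # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)  # row-wise softmax
    return weights @ V                    # each output mixes all value vectors

Q = K = V = np.random.randn(5, 8)         # 5 tokens, 8 dims per head
out = scaled_dot_product_attention(Q, K, V)  # (5, 8): one vector per token
```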
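Next, the encoder layer that items 1 and 3 describe: self-attention plus a position-wise feed-forward network, each wrapped in a residual connection and layer norm. A minimal PyTorch sketch with illustrative sizes; a real layer would also have dropout and positional encodings. Note the output keeps one contextual vector per input position, which is the "representation at every time step" of item 3:

```python
import torch
import torch.nn as nn

class EncoderLayer(nn.Module):
    def __init__(self, d_model=64, n_heads=4, d_ff=256):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.ff = nn.Sequential(nn.Linear(d_model, d_ff), nn.ReLU(),
                                nn.Linear(d_ff, d_model))
        self.norm1 = nn.LayerNorm(d_model)
        self.norm2 = nn.LayerNorm(d_model)

    def forward(self, x):                 # x: (batch, seq_len, d_model)
        a, _ = self.attn(x, x, x)         # every token attends to every token
        x = self.norm1(x + a)             # residual + layer norm
        return self.norm2(x + self.ff(x))

x = torch.randn(1, 10, 64)                # batch of 1, 10 tokens
h = EncoderLayer()(x)                     # (1, 10, 64): one contextual vector
                                          # per input position
```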
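The decoder side of the original layout in item 9 adds two things: a causal mask on its self-attention, and cross-attention into the encoder's output. A PyTorch sketch of the data flow only (residuals and norms are omitted for brevity, so this is not a complete layer):

```python
import torch
import torch.nn as nn

d_model, n_heads = 64, 4
self_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
cross_attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
ff = nn.Sequential(nn.Linear(d_model, 256), nn.ReLU(), nn.Linear(256, d_model))

enc_out = torch.randn(1, 7, d_model)      # encoder states for 7 source tokens
tgt = torch.randn(1, 5, d_model)          # embeddings of 5 target tokens
causal = torch.triu(torch.full((5, 5), float("-inf")), diagonal=1)

h, _ = self_attn(tgt, tgt, tgt, attn_mask=causal)  # target attends to its past
h, _ = cross_attn(h, enc_out, enc_out)             # ...then to the source
y = ff(h)                                          # (1, 5, 64)
```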
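For contrast, the pre-Transformer seq2seq design quoted in item 4 and described in item 6: an LSTM encoder squeezes the whole input into a fixed-size state (the context vector), and an LSTM decoder unrolls from it. A minimal PyTorch sketch, with illustrative sizes:

```python
import torch
import torch.nn as nn

enc = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)
dec = nn.LSTM(input_size=32, hidden_size=64, batch_first=True)

src = torch.randn(1, 7, 32)               # 7 source-token embeddings
_, (h, c) = enc(src)                      # (h, c): the context vector; the
                                          # entire input in a fixed-size state
tgt = torch.randn(1, 5, 32)               # 5 target-token embeddings
out, _ = dec(tgt, (h, c))                 # decoder starts from the context
```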
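The three families in item 5's graph differ mainly in how attention is masked: encoder-only (autoencoding) models see the whole sequence, decoder-only (autoregressive) models see only the past, and encoder-decoder models combine both. A PyTorch sketch of the two masks, using additive -inf scores as is conventional:

```python
import torch

seq_len = 5
bidirectional = torch.zeros(seq_len, seq_len)   # encoder-only: nothing masked
causal = torch.triu(torch.full((seq_len, seq_len), float("-inf")),
                    diagonal=1)                 # decoder-only: -inf blocks
                                                # attention to future positions
# Encoder-decoder models use the bidirectional mask in the encoder and the
# causal mask in the decoder's self-attention.
print(causal)
```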
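Finally, inference as item 8 breaks it down: encode once, then generate autoregressively, feeding each prediction back into the decoder. In this sketch `model`, `model.encode`, and `model.decode` are hypothetical stand-ins for any trained encoder-decoder module, not a real library API:

```python
import torch

def greedy_decode(model, src_ids, bos_id, eos_id, max_len=50):
    memory = model.encode(src_ids)        # hypothetical: encoder runs once
    out = [bos_id]
    for _ in range(max_len):
        # hypothetical: decoder returns logits over the vocabulary
        logits = model.decode(torch.tensor([out]), memory)
        next_id = int(logits[0, -1].argmax())  # greedy: most likely next token
        out.append(next_id)
        if next_id == eos_id:             # stop at end-of-sequence
            break
    return out
```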