  1. Architecture and Working of Transformers in Deep Learning

    Feb 27, 2025 · Understanding Transformer Architecture. The transformer model is built on an encoder-decoder architecture where both the encoder and decoder are composed of a series of layers that utilize self-attention mechanisms and feed-forward neural networks.

  2. Transformer (deep learning architecture) - Wikipedia

    Like earlier seq2seq models, the original transformer model used an encoder-decoder architecture. The encoder consists of encoding layers that process all the input tokens together one layer after another, while the decoder consists of decoding layers that iteratively process the encoder's output and the decoder's output tokens so far.
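    The encode-once / decode-iteratively split described in this snippet can be sketched with placeholder functions. The stubs below (`encode`, `decode_step`) are hypothetical stand-ins, not any library's API: a real encoder would produce contextual representations and a real decoder would attend over them; this stub decoder simply echoes the source, one token per step, to show the loop structure.

```python
def encode(src_tokens):
    # Placeholder: a real encoder processes the whole source sequence
    # in one pass and returns contextual representations ("memory").
    return src_tokens

def decode_step(memory, generated):
    # Placeholder: a real decoder attends over `memory` and the tokens
    # generated so far. This stub just echoes the source token by token.
    return memory[len(generated)] if len(generated) < len(memory) else "<eos>"

def greedy_generate(src_tokens, max_len=10):
    memory = encode(src_tokens)      # encoder runs once over all inputs
    out = []
    for _ in range(max_len):         # decoder runs step by step
        nxt = decode_step(memory, out)
        if nxt == "<eos>":
            break
        out.append(nxt)              # next step sees its own output so far
    return out
```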

  3. How Transformers Work: A Detailed Exploration of Transformer

    Jan 9, 2024 · It is the first transduction model relying entirely on self-attention to compute representations of its input and output without using sequence-aligned RNNs or convolution. The core characteristic of the Transformer architecture is …
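    The self-attention the snippet refers to is scaled dot-product attention. A minimal, dependency-free sketch (unbatched, single head; list-of-lists in place of tensors) might look like this:

```python
import math

def softmax(xs):
    # Numerically stable softmax over a list of floats.
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    s = sum(exps)
    return [e / s for e in exps]

def scaled_dot_product_attention(Q, K, V):
    """Attention(Q, K, V) = softmax(QK^T / sqrt(d)) V, one query row at a time."""
    d = len(K[0])
    out = []
    for q in Q:
        # Similarity of this query to every key, scaled by sqrt(d).
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / math.sqrt(d) for k in K]
        weights = softmax(scores)
        # Output is the attention-weighted sum of the value vectors.
        out.append([sum(w * v[j] for w, v in zip(weights, V))
                    for j in range(len(V[0]))])
    return out
```

    In self-attention, Q, K, and V are all projections of the same input sequence, which is what lets every token attend to every other token in a single layer.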

  4. The Transformer Model - MachineLearningMastery.com

    Jan 6, 2023 · In this tutorial, you discovered the network architecture of the Transformer model. Specifically, you learned: How the Transformer architecture implements an encoder-decoder structure without recurrence and convolutions; How the Transformer encoder and decoder work; How the Transformer self-attention compares to recurrent and convolutional layers

  5. Transformer-based Encoder-Decoder Models - Hugging Face

    The transformer-based encoder-decoder model was introduced by Vaswani et al. in the famous “Attention Is All You Need” paper and is today the de-facto standard encoder-decoder architecture in natural language processing (NLP).

  6. A Gentle Introduction to Attention and Transformer Models

    Mar 29, 2025 · The Transformer Architecture. The original transformer architecture is composed of an encoder and a decoder. Its layout is shown in the figure below. Recall that the transformer model was developed for translation tasks, replacing the seq2seq architecture that was commonly used with RNNs. Therefore, it borrowed the encoder-decoder architecture.

  7. Understanding Transformer Architecture: A Beginner’s Guide to Encoders

    Dec 26, 2024 · In this article, we’ll explore the core components of the transformer architecture: encoders, decoders, and encoder-decoder models. Don’t worry if you’re new to these concepts — we’ll break them...

  8. Transformer using PyTorch - GeeksforGeeks

    Mar 26, 2025 · 7. Transformer Model. This block defines the main Transformer class which combines the encoder and decoder layers. It also includes the embedding layers and the final output layer. self.encoder_embedding = nn.Embedding(src_vocab_size, d_model): Initializes the embedding layer for the source sequence, mapping tokens to continuous vectors of size ...
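    The snippet references PyTorch's `nn.Embedding`. As a dependency-free illustration of what that layer does, here is a toy lookup-table stand-in (an assumed re-implementation for clarity, not the PyTorch API): each of `vocab_size` token ids maps to a learned vector of size `d_model`.

```python
import random

class ToyEmbedding:
    """Minimal stand-in for an embedding layer: a table mapping
    integer token ids to d_model-dimensional vectors."""

    def __init__(self, vocab_size, d_model, seed=0):
        rng = random.Random(seed)
        # In a real model these weights are trained; here they are
        # just randomly initialized.
        self.weight = [[rng.gauss(0.0, 1.0) for _ in range(d_model)]
                       for _ in range(vocab_size)]

    def __call__(self, token_ids):
        # Lookup: same token id always yields the same vector.
        return [self.weight[t] for t in token_ids]
```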

  9. What is Transformer Architecture and How It Works? - Great …

    Apr 7, 2025 · The transformer architecture is a deep learning model introduced in the paper Attention Is All You Need by Vaswani et al. (2017). It eliminates the need for recurrence by using self-attention and positional encoding, making it highly effective for sequence-to-sequence tasks such as language translation and text generation.
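    The positional encoding the snippet mentions is, in the original paper, the sinusoidal scheme PE[pos, 2i] = sin(pos / 10000^(2i/d_model)), PE[pos, 2i+1] = cos(pos / 10000^(2i/d_model)). A small sketch:

```python
import math

def positional_encoding(seq_len, d_model):
    # Sinusoidal positional encoding: even dimensions get sin,
    # odd dimensions get cos, at geometrically spaced frequencies.
    pe = [[0.0] * d_model for _ in range(seq_len)]
    for pos in range(seq_len):
        for i in range(0, d_model, 2):
            angle = pos / (10000 ** (i / d_model))
            pe[pos][i] = math.sin(angle)
            if i + 1 < d_model:
                pe[pos][i + 1] = math.cos(angle)
    return pe
```

    These vectors are added to the token embeddings so the model, which otherwise treats the input as an unordered set, can use position information.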

  10. Deep Learning Series 22:- Encoder and Decoder Architecture in …

    Dec 26, 2024 · In this blog, we’ll deep dive into the inner workings of the Transformer Encoder and Decoder Architecture. At the core of the Transformer architecture lies the encoder, a sophisticated mechanism...
