  1. Why can decoder-only transformers be so good at machine translation

    Jun 8, 2023 · In my understanding, encoder-decoder transformers for translation are trained with sentence or text pairs. How can it be explained in simple (high-level) terms that decoder-only transformers (e.g. G...

  2. Mastering Decoder-Only Transformer: A Comprehensive Guide

    Apr 26, 2024 · Explore the architecture and components of the Decoder-Only Transformer model. Understand the role of attention mechanisms, including Scaled Dot-Product Attention and Masked Self-Attention, in the model. Examine the importance of positional embeddings and normalization techniques in transformer models. (A minimal sketch of masked attention follows this list.)

  3. How does the (decoder-only) transformer architecture work?

    May 30, 2023 · However, models such as GPT-3, ChatGPT, GPT-4 & LaMDA use the (decoder-only) transformer architecture. It is key to first understand the input and output of a transformer: the input is a prompt (often referred to as context) fed into the transformer as a whole; there is no recurrence. The output depends on the goal of the model. (A sketch of this recurrence-free generation loop follows this list.)

  4. Decoder-Only Transformers Explained: The Engine Behind LLMs

    Aug 31, 2024 · Large language models (LLMs) like GPT-3, LLaMA, and Gemini are revolutionizing how we interact with and generate text. At the heart of these powerful models lies a specific type of neural network...

  5. Decoder-Only Transformers: The Workhorse of Generative LLMs

    Decoder-only transformers receive a textual prompt as input. First, we use a tokenizer, based on an algorithm like Byte-Pair Encoding, to break this text into discrete tokens. Then, we map each of these tokens to a corresponding token vector stored within an embedding layer. (A sketch of this tokenize-and-embed step follows this list.)

  6. Decoder-only Streaming Transformer for Simultaneous Translation

    Apr 18, 2025 · However, directly applying the Decoder-only architecture to SiMT poses challenges in terms of training and inference. To alleviate the above problems, we propose the first Decoder-only SiMT model, named Decoder-only Streaming Transformer (DST).

  7. Scaling Laws of Decoder-Only Models on the Multilingual …

    Sep 23, 2024 · Recent studies have showcased remarkable capabilities of decoder-only models in many NLP tasks, including translation. Yet, the machine translation field has been largely dominated by encoder-decoder models based on the Transformer architecture.

  8. Exploring Decoder-Only Transformers for NLP and More

    Jan 27, 2023 · A “decoder-only transformer” is a type of neural network architecture that’s commonly used in natural language processing tasks such as machine translation and text summarization. It is a variation of the original transformer architecture, which was introduced in the 2017 paper by Google researchers, “Attention Is All You Need.”

  9. The Mechanics of Transformer Models: Decoding the Decoder-Only ...

    Encoder-only models, exemplified by BERT (Bidirectional Encoder Representations from Transformers), specialize in understanding or “encoding” language. They excel at tasks that require a deep comprehension of context, such as sentiment analysis, language understanding, and text classification.

  10. Scaling Laws of Decoder-Only Models on the Multilingual …

    We trained a collection of six decoder-only models, ranging from 70M to 7B parameters, on a sentence-level, multilingual (8 languages) and multidomain (9 domains) dataset.
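The attention mechanisms named in result 2 can be illustrated in a few lines. Below is a minimal sketch of scaled dot-product attention with a causal (masked) variant, using NumPy only; the single-head setup, shapes, and toy input are assumptions for illustration, not any particular library's API.

```python
# A minimal sketch, assuming a single attention head and NumPy only.
import numpy as np

def scaled_dot_product_attention(Q, K, V, causal=True):
    """Scaled dot-product attention with an optional causal mask.

    Q, K, V: arrays of shape (seq_len, d_k). The causal mask prevents
    each position from attending to later positions, which is what
    makes decoder-only models autoregressive.
    """
    d_k = Q.shape[-1]
    scores = Q @ K.T / np.sqrt(d_k)          # (seq_len, seq_len)
    if causal:
        # Upper-triangular positions (future tokens) get -inf before softmax.
        mask = np.triu(np.ones_like(scores, dtype=bool), k=1)
        scores = np.where(mask, -np.inf, scores)
    # Numerically stable softmax over the key dimension.
    weights = np.exp(scores - scores.max(axis=-1, keepdims=True))
    weights /= weights.sum(axis=-1, keepdims=True)
    return weights @ V

# Toy usage: 4 tokens, 8-dimensional queries/keys/values (self-attention).
rng = np.random.default_rng(0)
x = rng.normal(size=(4, 8))
out = scaled_dot_product_attention(x, x, x)
print(out.shape)  # (4, 8)
```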
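Result 3's point that there is no recurrence, and that the whole prompt is fed to the model as context, is easiest to see in the generation loop. The sketch below assumes a hypothetical `model` callable returning next-token logits; the toy stand-in model is invented purely to make the example runnable.

```python
# A minimal sketch of the autoregressive loop. `model` is a hypothetical
# callable mapping a list of token ids to logits for the NEXT token.
import numpy as np

def generate(model, prompt_ids, max_new_tokens=10, eos_id=None):
    """Greedy decoding: the entire context is re-fed at every step;
    no recurrent hidden state is carried between steps."""
    ids = list(prompt_ids)
    for _ in range(max_new_tokens):
        logits = model(ids)               # (vocab_size,) next-token logits
        next_id = int(np.argmax(logits))  # greedy choice; sampling is also common
        ids.append(next_id)
        if eos_id is not None and next_id == eos_id:
            break
    return ids

# Toy stand-in model: always predicts (last_token + 1) mod vocab_size.
vocab_size = 50
toy_model = lambda ids: np.eye(vocab_size)[(ids[-1] + 1) % vocab_size]
print(generate(toy_model, [3], max_new_tokens=5))  # [3, 4, 5, 6, 7, 8]
```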
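Result 5 describes the tokenize-then-embed pipeline at the model's input. In the sketch below, a whitespace tokenizer stands in for a real BPE tokenizer, and the vocabulary and embedding matrix are made up for illustration.

```python
# A minimal sketch of the token -> vector step. A whitespace tokenizer
# stands in for a real BPE tokenizer; the vocabulary and embedding
# matrix here are invented for illustration.
import numpy as np

vocab = {"<unk>": 0, "decoder": 1, "only": 2, "transformers": 3, "are": 4, "fun": 5}
d_model = 8
rng = np.random.default_rng(42)
embedding = rng.normal(size=(len(vocab), d_model))  # one row per token id

def encode(text):
    """Map text to token ids (real systems use BPE, not whitespace splitting)."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in text.lower().split()]

ids = encode("Decoder only transformers are fun")
vectors = embedding[ids]      # embedding lookup: (num_tokens, d_model)
print(ids)                    # [1, 2, 3, 4, 5]
print(vectors.shape)          # (5, 8)
```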
