Autoencoder and Vision Transformer

sciencedirect.com
https://www.sciencedirect.com › science › article › pii
ViTAE-SL: A vision transformer-based autoencoder and spatial ...
Mar 1, 2025 · In this paper, we present a new deep learning model called Vision Transformer-based Autoencoder (ViTAE) for reconstructing large-scale and complex fields. The proposed …
arxiv.org
https://arxiv.org › abs
[2301.07382] ViT-AE++: Improving Vision Transformer Autoencoder …
Jan 18, 2023 · Vision transformer-based autoencoder (ViT-AE) by He et al. (2021) is a recent self-supervised learning technique that employs a patch-masking strategy to learn a meaningful …
nature.com
https://www.nature.com › articles
Leveraging two-dimensional pre-trained vision transformers for …
Jan 25, 2025 · We employ the adept 2D information to direct a 3D masking-based autoencoder, which uses an encoder-decoder architecture to rebuild the masked point tokens through self …
springer.com
link.springer.com › International Journal of Computer Vision
Rethinking Vision Transformer and Masked Autoencoder in
Jun 5, 2024 · In this paper, we investigate three key factors (i.e., inputs, pre-training, and finetuning) in ViT for multimodal FAS with RGB, Infrared (IR), and Depth. First, in terms of the …
sciencedirect.com
https://www.sciencedirect.com › science › article › pii
SMAE-Fusion: Integrating saliency-aware masked autoencoder …
May 1, 2025 · Emerging as a powerful self-supervised training paradigm, masked image modeling enables the learning of robust feature representations applicable to various downstream tasks. …
sciencedirect.com
https://www.sciencedirect.com › science › article › pii
Adaptive Masked Autoencoder Transformer for image …
Oct 1, 2024 · In order to address these challenges, we propose the Adaptive Masked Autoencoder Transformer (AMAT), a masked image modeling-based method. AMAT …
imperial.ac.uk
https://spiral.imperial.ac.uk › server › api › core › bitstreams
[PDF]
ViTAE-SL: a vision transformer-based autoencoder and
In this paper, we present a new deep learning model called Vision Transformer-based Autoencoder (ViTAE) for reconstructing large-scale and complex fields. The proposed …
ieee.org
https://ieeexplore.ieee.org › document
Medical Image Synthesis Using Autoencoder with Vision Transformer
This paper proposes a novel architecture for synthesizing CMR images from TTE inputs using an integrated autoencoder and vision transformer. The autoencoder captures TTE patterns and …
mdpi.com
https://www.mdpi.com
Spatial–Temporal Heatmap Masked Autoencoder for Skeleton …
2 days ago · During pre-training, a Vision Transformer-based autoencoder equipped with a lightweight prediction head reconstructs the masked regions, fostering the extraction of robust …
ieee.org
https://ieeexplore.ieee.org › document
Generalized Concordant Vision Transformer with Masked Image …
3 days ago · Abstract: The vision transformer (ViT) architecture offers significant advantages in object detection tasks. However, some limitations affect improving task performance. Firstly, …