
MDSTA: masked diffusion spatio-temporal autoencoder for …
Apr 23, 2025 · To address this, we propose the Masked Diffusion Spatio-Temporal Autoencoder (MDSTA) network for the joint classification of remote sensing data under arbitrary modalities. MDSTA consists of three main components: a Conditional Masked Diffusion Process (CMDP), a Reverse Diffusion Reconstruction Process (RDRP), and an Attention Multi-Layer ...
[2304.03283] Diffusion Models as Masked Autoencoders - arXiv.org
Apr 6, 2023 · While directly pre-training with diffusion models does not produce strong representations, we condition diffusion models on masked input and formulate diffusion models as masked autoencoders (DiffMAE).
... learning to regress pixels of masked patches given the other visible patches. Inspired by MAE, we incorporate masking into diffusion models and cast Diffusion Models as Masked Autoencoders (DiffMAE). We formulate the masked prediction task as a conditional generative objective, i.e., to approximate the pixel distribution of the masked region con-
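The DiffMAE snippet above describes conditioning a diffusion model on visible patches and treating the masked patches as the generation target. A minimal sketch of that input preparation follows, using only the Python standard library. It is not the paper's implementation: the function name `mask_and_noise`, the parameters `mask_ratio` and `alpha_bar`, and the flat list-of-floats patch layout are all assumptions made for illustration.

```python
import math
import random


def mask_and_noise(patches, mask_ratio, alpha_bar, rng):
    """Split patches into a visible set and a noised masked set.

    patches:    list of patches, each a list of floats (hypothetical layout).
    mask_ratio: fraction of patches to hide (hypothetical hyperparameter).
    alpha_bar:  cumulative signal level in (0, 1] for one diffusion step.
    rng:        a random.Random instance, so results are reproducible.
    """
    n = len(patches)
    n_masked = int(round(n * mask_ratio))

    # Random split of patch indices, as in MAE-style masking.
    order = list(range(n))
    rng.shuffle(order)
    masked_idx = sorted(order[:n_masked])
    visible_idx = sorted(order[n_masked:])

    # Standard forward-diffusion mixing: sqrt(a) * x + sqrt(1 - a) * noise.
    signal = math.sqrt(alpha_bar)
    noise = math.sqrt(1.0 - alpha_bar)
    noisy_masked = {
        i: [signal * x + noise * rng.gauss(0.0, 1.0) for x in patches[i]]
        for i in masked_idx
    }

    # Visible patches stay clean; they are the conditioning signal.
    visible = {i: patches[i] for i in visible_idx}
    return visible, noisy_masked


rng = random.Random(0)
patches = [[float(i)] * 4 for i in range(8)]  # 8 toy patches of 4 values each
visible, noisy_masked = mask_and_noise(patches, 0.75, 0.5, rng)
```

In this sketch a model would receive `visible` as clean context and `noisy_masked` as the input to denoise, with the original masked patches as the reconstruction target. Noise is applied only to the masked region, so the visible patches are passed through unchanged.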
Feature Guided Masked Autoencoder for Self-Supervised …
Nov 25, 2024 · In this article, we explore spectral and spatial remote sensing image features as improved MAE-reconstruction targets. We first conduct a study on reconstructing various image features, all performing comparably well or better than raw pixels. Based on such observations, we propose Feature Guided Masked Autoencoder (FG-MAE): reconstructing a combination of Histograms of Oriented Gradients (HOG) and Normalized Differen ...
DEMAE: Diffusion Enhanced Masked Autoencoder for Hyperspectral ... - GitHub
Pre-train a Masked Autoencoder with the idea of Diffusion Models for Hyperspectral Image Classification.
Fus-MAE: A cross-attention-based data fusion approach for Masked ...
In this paper, we introduce Fus-MAE, a self-supervised learning framework based on masked autoencoders that uses cross-attention to perform early and feature-level data fusion between synthetic aperture radar and multispectral optical data - two modalities with a …
Saliency supervised masked autoencoder pretrained salient …
We introduce a novel pretraining framework called Saliency Supervised Masked Autoencoder (SSMAE), which improves the traditional masked autoencoder by incorporating saliency supervision. This innovation enables more targeted feature …
Enhancing Remote Sensing Representations Through Mixed-Modality Masked ...
Feb 28, 2025 · Using a novel variation on the masked autoencoder (MAE) framework, our model incorporates a dual-task setup: reconstructing masked Sentinel-2 images and predicting corresponding Sentinel-1 images. This multitask design enables the encoder to capture both spectral and structural features across diverse environmental conditions.
LDS2AE: Local Diffusion Shared-Specific Autoencoder for …
Mar 24, 2024 · In addition, we incorporate masked training into the diffusion autoencoder to achieve local diffusion, which significantly reduces the training cost of the model. The approach is tested on widely used multimodal remote sensing datasets, demonstrating the effectiveness of the proposed LDS2AE in addressing the classification of arbitrary missing ...