  1. Image Captioning using Encoder-Attention-Decoder

    Apr 25, 2021 · Show, Attend and Tell: Neural Image Caption Generation with Visual Attention; Building Encoder and Decoder with Deep Neural Networks: On the Way to Reality

  2. Image Captioning - Papers With Code

    Most image captioning systems use an encoder-decoder framework, where an input image is encoded into an intermediate representation of the information in the image, and then decoded into a descriptive text sequence. The most popular benchmarks are nocaps and COCO, and models are typically evaluated according to a BLEU or CIDEr metric.

  3. Image Captioning using CNN+RNN Encoder-Decoder

    Image Captioning using CNN+RNN Encoder-Decoder Architecture in PyTorch. Image Captioning Model. In this project, you will create a neural network architecture to automatically generate captions from images. After using the Microsoft Common Objects in COntext (MS COCO) dataset to train your network, you will test your network on novel images!

  4. Parallel encoder–decoder framework for image captioning

    Dec 20, 2023 · We presented a parallel encoder–decoder framework for image captioning, which consists of two parallel blocks to take advantage of multi-type encoders and decoders simultaneously and integrates their results in order to model the prior knowledge.

  5. Transformer based Encoder-Decoder models for image-captioning

    Dec 3, 2024 · In this blog we provide you with hands-on tutorials on implementing three different Transformer-based encoder-decoder image captioning models using ROCm running on AMD GPUs. The three models we will cover in this blog, ViT-GPT2, BLIP, and Alpha-CLIP, are presented in ascending order of complexity.

  6. Image Captioning based on Encoder Decoder Architecture

    In this research work, EfficientNetV2B0 is utilized in the encoder part for extracting objects from an image. A Long Short-Term Memory (LSTM) network, a type of recurrent neural network, then serves as the decoder, generating descriptions from the encoder output.

  7. Compressed Image Captioning using CNN-based Encoder-Decoder

    Apr 28, 2024 · Our project is focused on addressing these challenges by developing an automatic image captioning architecture that combines the strengths of convolutional neural networks (CNNs) and encoder-decoder models.

  8. Image Captioning Using VGG-16 and VGG-19: A Mixed Model …

    Apr 18, 2025 · Our paper aims to establish an encoder-decoder (E-D) based hybrid image captioning model that utilizes VGG-16, VGG-19, ResNet50, and YOLO. ResNet50 and VGG-16 are feature extraction models pre-trained on a large number (more than 100,000) of images.

  9. A comprehensive construction of deep neural network‐based encoder

    Nov 25, 2024 · By addressing spatial relationships in images and producing logical, contextually relevant captions, the paper advances image captioning technology. Insightful ideas for future study directions are generated by the discussion of the difficulties faced during the experimentation phase.

  10. Integrating Region Proposals with Recurrent Neural Networks for Image

    In this work, we present an improved framework for the generative creation of paragraph captions for images. This research briefly describes a method that exploits Region Proposal Networks (RPNs) and Convolutional Neural Networks (CNNs), with an encoder and a multilevel RNN decoder, ensuring a more effective paragraph caption generation process ...
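
Result 2 above mentions that captioning models are typically scored with a BLEU-style metric. As a rough illustration only, the sketch below computes the clipped (modified) n-gram precision that BLEU is built on. It is not the full BLEU score: the brevity penalty and the geometric mean over several n-gram orders are left out. The function names `ngrams` and `clipped_precision` and the whitespace tokenization are assumptions made for this example, not taken from any of the listed pages.

```python
from collections import Counter


def ngrams(tokens, n):
    """Return a Counter of all contiguous n-grams in a token list."""
    return Counter(tuple(tokens[i:i + n]) for i in range(len(tokens) - n + 1))


def clipped_precision(candidate, references, n=1):
    """Clipped n-gram precision of one candidate caption (hypothetical helper).

    Each candidate n-gram is credited at most as many times as it occurs
    in the single reference caption where it is most frequent.
    """
    # Assumption: lowercase + whitespace split stands in for real tokenization.
    cand_counts = ngrams(candidate.lower().split(), n)
    total = sum(cand_counts.values())
    if total == 0:
        return 0.0

    # For every n-gram, record its maximum count over all references.
    max_ref = Counter()
    for ref in references:
        for gram, count in ngrams(ref.lower().split(), n).items():
            max_ref[gram] = max(max_ref[gram], count)

    # Clip each candidate count by the reference maximum, then normalize.
    matched = sum(min(count, max_ref[gram]) for gram, count in cand_counts.items())
    return matched / total
```

For instance, `clipped_precision("a dog on the grass", ["a dog runs on the grass"], n=2)` compares the candidate's four bigrams against the reference. The clipping step is what stops a degenerate caption that repeats one common word from scoring well.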
