Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-training

Talking Papers Podcast with Yicong Hong - VLN BERT

Grounded Entity-Landmark Adaptive Pre-Training for Vision-and-Language Navigation

(CVPR 2023) Improving Vision-and-Language Navigation by Generating Future-View Image Semantics

Vision-Language Pre-training Survey Paper

Learning Vision-and-Language Navigation from YouTube Videos

[ICCV2021] Airbert: In-domain Pretraining for Vision-and-Language Navigation

Vision-based navigation with language-based assistance (CVPR 2019)

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training

[CVPR 2021 VQA2VLN Tutorial] Introduction to Vision Language Navigation

Speaker-Follower Model for Vision-and-Language Navigation || Paper Presentation

Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks

Cordelia Schmid: Transformers for Vision-Language Navigation and Manipulation

Counterfactual Vision and Language Learning

SASRA: Semantically-aware Spatio-temporal Reasoning Agent for Vision-and-Language Navigation

Hybrid Learning for Vision-and-Language Navigation Agents

Vision-Dialog Navigation by Exploring Cross-Modal Memory

History Enhanced and Order Aware Pre Training for Vision and Language Navigation

[ECCV 2022] ASSISTER: Assistive Navigation via Conditional Instruction Generation