Tag: Multimodal | 小嗷犬

Multimodal

2024

[Paper Notes] CDFSL-V: Cross-Domain Few-Shot Learning for Videos

Paper Notes, Multimodal, Few-Shot

2024-12-29

[Paper Notes] CoSign: Exploring Co-occurrence Signals in Skeleton-based Continuous Sign Language Recognition

Paper Notes, Sign Language Translation, Multimodal

2024-12-22

[Paper Notes] Visual Alignment Pre-training for Sign Language Translation

Paper Notes, Sign Language Translation, Multimodal

2024-12-22

[Paper Notes] CiCo: Domain-Aware Sign Language Retrieval via Cross-Lingual Contrastive Learning

Paper Notes, Sign Language Translation, Multimodal

2024-12-22

[Paper Notes] Multimodal Cross-Domain Few-Shot Learning for Egocentric Action Recognition

Paper Notes, Multimodal, Few-Shot, Action Recognition

2024-12-22

[Paper Notes] Florence-2: Advancing a Unified Representation for a Variety of Vision Tasks

Large Models, Paper Notes, Multimodal

2024-12-15

[Paper Notes] CLIP-guided Prototype Modulating for Few-shot Action Recognition

Paper Notes, Multimodal, Few-Shot, Action Recognition

2024-12-15

[Paper Notes] How Far Are We to GPT-4V? Closing the Gap to Commercial Multimodal Models with Open-Source Suites

Large Models, Paper Notes, Multimodal

2024-12-08

[Paper Notes] Florence-VL: Enhancing Vision-Language Models with Generative Vision Encoder and Depth-Breadth Fusion

Large Models, Paper Notes, Multimodal

2024-12-08

[Paper Notes] VisionZip: Longer is Better but Not Necessary in Vision Language Models

Large Models, Paper Notes, Multimodal

2024-12-08