[Paper Notes] MLSLT: Towards Multilingual Sign Language Translation
[Paper Notes] X-Former: Unifying Contrastive and Reconstruction Learning for MLLMs
[Paper Notes] VISTA: Visualized Text Embedding For Universal Multi-Modal Retrieval
[Paper Notes] MMFuser: Multimodal Multi-Layer Feature Fuser for Fine-Grained Vision-Language Understanding
[Paper Notes] Sign2GPT: Leveraging Large Language Models for Gloss-Free Sign Language Translation
[Paper Notes] Fine-tuned CLIP Models are Efficient Video Learners
[Paper Notes] Factorized Learning Assisted with Large Language Model for Gloss-free Sign Language Translation
[Paper Notes] CLIP4Clip: An empirical study of CLIP for end to end video clip retrieval and captioning
[Paper Notes] VideoLLaMA 2: Advancing Spatial-Temporal Modeling and Audio Understanding in Video-LLMs
[Paper Notes] Flamingo: a Visual Language Model for Few-Shot Learning
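This blog covers questions and opinions on languages, algorithms, and AI, along with translated articles.
You should be able to find knowledge and tutorials here that are useful to you.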