[Paper Notes] VCoder: Versatile Vision Encoders for Multimodal Large Language Models
Tmux Tutorial
[Paper Notes] Dense Connector for MLLMs
[Paper Notes] Attention Prompting on Image for Large Vision-Language Models
[Paper Notes] Token Turing Machines
[Paper Notes] Gloss-free Sign Language Translation: Improving from Visual-Language Pretraining
[Paper Notes] C$^2$RL: Content and Context Representation Learning for Gloss-free Sign Language Translation and Retrieval
[Paper Notes] Perceiver: General Perception with Iterative Attention
[Paper Notes] xGen-MM (BLIP-3): A Family of Open Large Multimodal Models
[Paper Notes] xGen-MM-Vid (BLIP-3-Video): You Only Need 32 Tokens to Represent a Video Even in VLMs
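Here you will find questions and opinions on languages, algorithms, and AI, along with translated articles.
I hope you can find knowledge and tutorials here that are useful to you.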