Miquel Farré

mfarre

AI & ML interests

I like everything video

Recent Activity

new activity 18 days ago

HuggingFaceFV/finevideo:Cleanup TTS

liked a Space 21 days ago

HuggingFaceH4/blogpost-scaling-test-time-compute

upvoted a paper 22 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

mfarre's activity

upvoted a paper 22 days ago

Apollo: An Exploration of Video Understanding in Large Multimodal Models

Paper • 2412.10360 • Published 24 days ago • 136

upvoted a paper 2 months ago

LongVU: Spatiotemporal Adaptive Compression for Long Video-Language Understanding

Paper • 2410.17434 • Published Oct 22, 2024 • 25

upvoted 3 articles 4 months ago

FineVideo: behind the scenes

Article • Sep 23, 2024 • 27

Docmatix - a huge dataset for Document Visual Question Answering

Article • Jul 18, 2024 • 72

Scaling robotics datasets with video encoding

Article • Aug 27, 2024 • 34

upvoted a paper 4 months ago

Building and better understanding vision-language models: insights and future directions

Paper • 2408.12637 • Published Aug 22, 2024 • 124

Articles

SmolVLM - small yet mighty Vision Language Model

CinePile 2.0 - making stronger datasets with adversarial refinement

FineVideo: behind the scenes

Scaling robotics datasets with video encoding
