taesiri PRO

https://taesiri.ai/

AI & ML interests

AGI ... one linear layer at a time

Recent Activity

liked a model about 2 hours ago

kudzueye/boreal-flux-dev-v2

updated a dataset about 8 hours ago

taesiri/PhotoshopRequest-DailyDump-January-2025

updated a dataset about 9 hours ago

taesiri/BugsBunny-InternVL2_5-78B-MPO-Extensive-Captioning

taesiri's activity

upvoted 4 papers about 9 hours ago

VisionReward: Fine-Grained Multi-Dimensional Human Preference Learning for Image and Video Generation

Paper • 2412.21059 • Published 7 days ago • 11

upvoted a collection about 24 hours ago

Dolphin 3.0

Collection

Dolphin 3.0 is the next generation of the Dolphin series of instruct-tuned models, designed to be the ultimate general-purpose local model. • 7 items • Updated 1 day ago • 28

upvoted 6 papers 3 days ago

MapEval: A Map-Based Evaluation of Geo-Spatial Reasoning in Foundation Models

Paper • 2501.00316 • Published 7 days ago • 21

MapQaTor: A System for Efficient Annotation of Map Query Datasets

Paper • 2412.21015 • Published 7 days ago • 8

VideoRefer Suite: Advancing Spatial-Temporal Object Understanding with Video LLM

Paper • 2501.00599 • Published 6 days ago • 38

CodeElo: Benchmarking Competition-level Code Generation of LLMs with Human-comparable Elo Ratings

Paper • 2501.01257 • Published 4 days ago • 41

VideoAnydoor: High-fidelity Video Object Insertion with Precise Motion Control

Paper • 2501.01427 • Published 4 days ago • 42

2.5 Years in Class: A Multimodal Textbook for Vision-Language Pretraining

Paper • 2501.00958 • Published 5 days ago • 82

upvoted 4 papers 4 days ago

MLLM-as-a-Judge for Image Safety without Human Labeling

Paper • 2501.00192 • Published 7 days ago • 22

A3: Android Agent Arena for Mobile GUI Agents

Paper • 2501.01149 • Published 5 days ago • 20

ProgCo: Program Helps Self-Correction of Large Language Models

Paper • 2501.01264 • Published 4 days ago • 23

OS-Genesis: Automating GUI Agent Trajectory Construction via Reverse Task Synthesis

Paper • 2412.19723 • Published 10 days ago • 70

upvoted 3 papers 6 days ago

Edicho: Consistent Image Editing in the Wild

Paper • 2412.21079 • Published 7 days ago • 20

On the Compositional Generalization of Multimodal LLMs for Medical Imaging

Paper • 2412.20070 • Published 10 days ago • 40

Explanatory Instructions: Towards Unified Vision Tasks Understanding and Zero-shot Generalization

Paper • 2412.18525 • Published 13 days ago • 63

upvoted 2 papers 7 days ago

1.58-bit FLUX

Paper • 2412.18653 • Published 13 days ago • 67

HuatuoGPT-o1, Towards Medical Complex Reasoning with LLMs

Paper • 2412.18925 • Published 12 days ago • 86