UnifiedCrawl: Aggregated Common Crawl for Affordable Adaptation of LLMs on Low-Resource Languages Paper • 2411.14343 • Published Nov 21, 2024 • 7
nvidia/Llama-3.1-Nemotron-70B-Instruct-HF Text Generation • Updated Oct 25, 2024 • 284k • 1.97k
Interesting, but how does this approach generalize to arbitrary user query / document domains? Would you need to train a separate network for each domain / dataset?
Article GaLore: Advancing Large Model Training on Consumer-grade Hardware Mar 20, 2024 • 26
Qwen2-VL Collection Vision-language model series based on Qwen2 • 16 items • Updated Dec 6, 2024 • 187