Jenish-23


AI & ML interests

Personal and Study

Recent Activity

Organizations

None yet

Jenish-23's activity

replied to smangrul's post 10 months ago

Are there any examples or notebooks showing how to use AWQ when LoRA fine-tuning an LLM? Or can we just use an AWQ model from Hugging Face directly? I'm asking because neither the docs nor the release notes explain this.
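For context, here is a minimal, unofficial sketch of what LoRA fine-tuning on top of an AWQ-quantized base looks like with PEFT. This is an assumption-laden illustration, not an official example: it presumes peft >= 0.9.0 with the autoawq package installed, and the checkpoint name is just an illustrative AWQ repo.

```python
# A minimal, unofficial sketch of LoRA on an AWQ base (assumes peft >= 0.9.0
# and the autoawq package; the checkpoint name below is illustrative).
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer
from peft import LoraConfig, get_peft_model

model_id = "TheBloke/Mistral-7B-v0.1-AWQ"  # illustrative AWQ checkpoint

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id, torch_dtype=torch.float16, device_map="auto"
)

# Attach trainable LoRA adapters on top of the frozen quantized weights.
config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "v_proj"],
    lora_dropout=0.05,
    task_type="CAUSAL_LM",
)
model = get_peft_model(model, config)
model.print_trainable_parameters()

# Train with your usual Trainer / accelerate loop. Note: with AWQ (and AQLM)
# bases, the adapter cannot be merged back into the quantized base model.
```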

reacted to smangrul's post with 👍 10 months ago
🚨 New Release of 🤗PEFT!

1. New methods for merging LoRA weights. Refer to this HF post for more details: https://huggingface.co/posts/smangrul/850816632583824

2. AWQ and AQLM support for LoRA. You can now:
- Train adapters on top of 2-bit quantized models with AQLM
- Train adapters on top of powerful AWQ quantized models
Note: for inference, you can't merge the LoRA weights into the base model!

3. DoRA support: Enabling DoRA is as easy as adding use_dora=True to your LoraConfig (see the sketch after this list). Find out more about this method here: https://arxiv.org/abs/2402.09353

4. Improved documentation, particularly docs regarding PEFT LoRA+DeepSpeed and PEFT LoRA+FSDP! 📄 Check out the docs at https://huggingface.co/docs/peft/index.

5. Full Release Notes: https://github.com/huggingface/peft/releases/tag/v0.9.0
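A minimal sketch of item 3 above: the use_dora=True switch comes straight from the release notes, while the base model and hyperparameters are illustrative assumptions.

```python
# Minimal DoRA sketch: use_dora=True is the documented switch; the base
# model and hyperparameters below are illustrative, not prescriptive.
from transformers import AutoModelForCausalLM
from peft import LoraConfig, get_peft_model

base = AutoModelForCausalLM.from_pretrained("facebook/opt-350m")  # example base

config = LoraConfig(
    r=8,
    lora_alpha=16,
    target_modules=["q_proj", "v_proj"],
    use_dora=True,  # the one-line change that enables DoRA (peft >= 0.9.0)
    task_type="CAUSAL_LM",
)

model = get_peft_model(base, config)
model.print_trainable_parameters()
```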
reacted to akhaliq's post with 👍 11 months ago
MobileLLM: Optimizing Sub-billion Parameter Language Models for On-Device Use Cases (2402.14905)

This paper addresses the growing need for efficient large language models (LLMs) on mobile devices, driven by increasing cloud costs and latency concerns. We focus on designing top-quality LLMs with fewer than a billion parameters, a practical choice for mobile deployment. Contrary to the prevailing belief emphasizing the pivotal role of data and parameter quantity in determining model quality, our investigation underscores the significance of model architecture for sub-billion-scale LLMs. Leveraging deep and thin architectures, coupled with embedding sharing and grouped-query attention mechanisms, we establish a strong baseline network denoted as MobileLLM, which attains a remarkable 2.7%/4.3% accuracy boost over the preceding 125M/350M state-of-the-art models. Additionally, we propose an immediate block-wise weight-sharing approach with no increase in model size and only marginal latency overhead. The resultant models, denoted as MobileLLM-LS, demonstrate a further accuracy enhancement of 0.7%/0.8% over MobileLLM 125M/350M. Moreover, the MobileLLM model family shows significant improvements over previous sub-billion models on chat benchmarks, and demonstrates correctness close to LLaMA-v2 7B in API calling tasks, highlighting the capability of small models for common on-device use cases.
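To make the weight-sharing idea concrete, here is a toy sketch (not the authors' code) of immediate block-wise weight sharing: each block is applied twice in a row, roughly doubling effective depth with no extra parameters.

```python
# Toy sketch of immediate block-wise weight sharing (not the authors' code):
# each block is executed `repeats` times back-to-back, adding depth
# without adding parameters.
import torch
import torch.nn as nn

class SharedBlockStack(nn.Module):
    def __init__(self, blocks: nn.ModuleList, repeats: int = 2):
        super().__init__()
        self.blocks = blocks    # unique parameter blocks (e.g., transformer layers)
        self.repeats = repeats  # immediate reuse factor

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for block in self.blocks:
            for _ in range(self.repeats):  # reuse the same weights immediately
                x = block(x)
        return x

# Example: 12 unique blocks compute like a 24-layer stack parameter-wise.
layers = nn.ModuleList(nn.Linear(64, 64) for _ in range(12))  # stand-in blocks
stack = SharedBlockStack(layers)
out = stack(torch.randn(1, 64))
```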
reacted to mrfakename's post with 👍 11 months ago
Hugging Face announces Cosmo 1B, a fully open-sourced Phi competitor with an open-sourced dataset, licensed under the Apache 2.0 license. The dataset, dubbed "Cosmopedia," is published on the Hugging Face Hub under the Apache 2.0 license; it was generated using Mixtral 8x7B, with various articles and textbooks (AutoMathText, OpenStax, WikiHow, etc.) as "seed data" for the conversations.

Model: HuggingFaceTB/cosmo-1b
Dataset: HuggingFaceTB/cosmopedia
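A minimal usage sketch for the links above, assuming the standard transformers and datasets APIs; the "wikihow" subset name is an assumption based on the seed sources mentioned in the post.

```python
# Minimal usage sketch (assumes standard transformers/datasets APIs;
# the "wikihow" subset name is an assumption based on the seed sources above).
from datasets import load_dataset
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("HuggingFaceTB/cosmo-1b")
model = AutoModelForCausalLM.from_pretrained("HuggingFaceTB/cosmo-1b")

# Stream one Cosmopedia example without downloading the full dataset.
ds = load_dataset("HuggingFaceTB/cosmopedia", "wikihow", split="train", streaming=True)
print(next(iter(ds)))
```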
reacted to akhaliq's post with 👍 11 months ago
OpenCodeInterpreter: Integrating Code Generation with Execution and Refinement (2402.14658)

The introduction of large language models has significantly advanced code generation. However, open-source models often lack the execution capabilities and iterative refinement of advanced systems like the GPT-4 Code Interpreter. To address this, we introduce OpenCodeInterpreter, a family of open-source code systems designed for generating, executing, and iteratively refining code. Supported by Code-Feedback, a dataset featuring 68K multi-turn interactions, OpenCodeInterpreter integrates execution and human feedback for dynamic code refinement. Our comprehensive evaluation of OpenCodeInterpreter across key benchmarks such as HumanEval, MBPP, and their enhanced versions from EvalPlus reveals its exceptional performance. Notably, OpenCodeInterpreter-33B achieves an accuracy of 83.2 (76.4) on the average (and plus versions) of HumanEval and MBPP, closely rivaling GPT-4's 84.2 (76.2), and further elevates to 91.6 (84.6) with synthesized human feedback from GPT-4. OpenCodeInterpreter bridges the gap between open-source code generation models and proprietary systems like GPT-4 Code Interpreter.
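The execute-and-refine loop the abstract describes can be sketched generically as below. This is not OpenCodeInterpreter's actual implementation; generate stands in for any code-LLM call and is a hypothetical helper you would supply.

```python
# Generic execute-and-refine loop (a sketch, not OpenCodeInterpreter's code).
# `generate` is a hypothetical callable wrapping any code LLM.
import subprocess
import sys
import tempfile

def run_code(code: str) -> tuple[bool, str]:
    """Execute a candidate script; return (success, stderr feedback)."""
    with tempfile.NamedTemporaryFile("w", suffix=".py", delete=False) as f:
        f.write(code)
        path = f.name
    proc = subprocess.run(
        [sys.executable, path], capture_output=True, text=True, timeout=30
    )
    return proc.returncode == 0, proc.stderr

def refine(generate, task: str, max_rounds: int = 3) -> str:
    code = generate(task)
    for _ in range(max_rounds):
        ok, feedback = run_code(code)
        if ok:
            break
        # Feed execution errors back, mirroring the multi-turn Code-Feedback setup.
        code = generate(f"{task}\n\nYour last attempt failed with:\n{feedback}\nFix the code.")
    return code
```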