andthattoo
/

subquery-SmolLM

Text Generation


andthattoo committed on Aug 16, 2024

Commit

8241242

·

verified ·

1 Parent(s): 4e12e69

Create README.md

Files changed (1)

README.md +70 -0

README.md ADDED

	@@ -0,0 +1,70 @@

+---
+base_model: nisten/Biggie-SmoLlm-0.15B-Base
+license: mit
+datasets:
+- LDJnr/Capybara
+pipeline_tag: text-generation
+tags:
+- llama
+---
+### Fine-tuned [Biggie-SmoLlm-0.15B-Base](https://huggingface.co/nisten/Biggie-SmoLlm-0.15B-Base) for generating subqueries
+This dude is trained to boost the performance of your information retrieval (IR) or RAG app.
+My motivation was to tackle a core problem of IR with an extremely lightweight but capable model.
+If a query
+- requires multi-hop logic, the model breaks it into simpler subqueries, each focusing on a different step
+- is vague, it asks follow-up questions
+- contains multiple sub-questions, it generates a separate query for each of them
+Heads up: the [Ollama](https://ollama.com/andthattoo/subquery-smollm) version runs at 160 tokens per second (tps) on 1 CPU core. No GPU? No worries. This little dude’s got you.
+Use the model:
+```python
+import torch
+from transformers import AutoTokenizer, AutoModelForCausalLM
+tokenizer = AutoTokenizer.from_pretrained("andthattoo/subquery-SmolLM")
+model = AutoModelForCausalLM.from_pretrained("andthattoo/subquery-SmolLM", torch_dtype=torch.bfloat16)
+# Fall back to EOS as the padding token if the tokenizer does not define one
+if tokenizer.pad_token is None:
+    tokenizer.pad_token = tokenizer.eos_token
+    model.config.pad_token_id = model.config.eos_token_id
+# The question goes inside <question> tags, right after the instruction
+input_data = "Generate subqueries for a given question. <question>What is this?</question>"
+inputs = tokenizer(input_data, return_tensors='pt')
+output = model.generate(**inputs, max_new_tokens=100)
+decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
+print(decoded_output)
+```
+I also created a Python package for ease of use:
+```bash
+pip install subquery
+```
+```python
+from subquery import TransformersSubqueryGenerator
+# Using the Transformers backend
+generator = TransformersSubqueryGenerator()
+result = generator.generate("What is this?")
+print("Follow-up questions:", result.follow_up)
+print("Subqueries:", result.subquery)
+```
+or
+```python
+from subquery import OllamaSubqueryGenerator
+# Using the Ollama backend
+generator = OllamaSubqueryGenerator()
+result = generator.generate("Are the Indiana Harbor and Ship Canal and the Folsom South Canal in the same state?")
+print("Follow-up questions:", result.follow_up)
+print("Subqueries:", result.subquery)
+```
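The Transformers example in the README builds its prompt by hand: a fixed instruction followed by the question wrapped in `<question>` tags. A minimal sketch of how that could be wrapped in helpers is shown here. The names `build_prompt` and `strip_prompt` are hypothetical (they are not part of the `subquery` package), and the assumption that the decoded output starts with an exact echo of the prompt may not hold for every tokenizer, so the second helper falls back to returning the full text.

```python
# Hypothetical helpers, not part of the model repo or the `subquery` package.
# The instruction string is copied from the README's Transformers example.
INSTRUCTION = "Generate subqueries for a given question."


def build_prompt(question: str) -> str:
    """Wrap a question in the instruction + <question> tags used by the README."""
    return f"{INSTRUCTION} <question>{question.strip()}</question>"


def strip_prompt(decoded: str, prompt: str) -> str:
    """Return only the generated text if the decoded output echoes the prompt.

    Assumption: causal LMs usually return prompt + continuation from generate();
    if the echo does not match exactly, the full decoded text is returned.
    """
    if decoded.startswith(prompt):
        return decoded[len(prompt):].strip()
    return decoded.strip()


prompt = build_prompt("What is this?")
print(prompt)
```

With these in place, `build_prompt(question)` would replace the hard-coded `input_data` string, and `strip_prompt(decoded_output, prompt)` would be applied to the decoded text before any further parsing.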