---
base_model: nisten/Biggie-SmoLlm-0.15B-Base
license: mit
datasets:
- LDJnr/Capybara
- andthattoo/subqueries
pipeline_tag: text-generation
tags:
- llama
---

### Fine-tuned [Biggie-SmoLlm-0.15B-Base](https://huggingface.co/nisten/Biggie-SmoLlm-0.15B-Base) for generating subqueries

This dude is trained to boost the performance of your RAG-based question-answering app.

My motivation was to tackle a core problem of RAG with an extremely lightweight but capable model. If a query:

- involves multi-hop logic, it is broken into simpler subqueries, each focusing on a different step
- is vague, the model asks follow-up questions
- contains multiple sub-questions, the model generates a separate query for each of them

(A pipeline sketch at the end of this card shows how these outputs can be wired into retrieval.)

Training data was generated with [Dria](https://dria.co), a decentralized p2p network for synthetic data. Join the [Discord](https://discord.gg/dria) to help with decentralized data generation.

Heads up: the [Ollama](https://ollama.com/andthattoo/subquery-smollm) version runs at 160 tokens per second on a single CPU core. No GPU? No worries. This little dude’s got you.

Use the model:

```python
import torch
from transformers import AutoTokenizer, AutoModelForCausalLM

tokenizer = AutoTokenizer.from_pretrained("andthattoo/subquery-SmolLM")
model = AutoModelForCausalLM.from_pretrained(
    "andthattoo/subquery-SmolLM",
    torch_dtype=torch.bfloat16,
)

# The base model has no pad token; fall back to EOS.
if tokenizer.pad_token is None:
    tokenizer.pad_token = tokenizer.eos_token
    model.config.pad_token_id = model.config.eos_token_id

input_data = "Generate subqueries for a given question. What is this?"
inputs = tokenizer(input_data, return_tensors="pt")
output = model.generate(**inputs, max_new_tokens=100)
decoded_output = tokenizer.decode(output[0], skip_special_tokens=True)
print(decoded_output)
```

I also created a Python package for ease of use:

```bash
pip install subquery
```

```python
from subquery import TransformersSubqueryGenerator

# Using the Transformers backend
generator = TransformersSubqueryGenerator()
result = generator.generate("What is this?")

print("Follow-up questions:", result.follow_up)
print("Subqueries:", result.subquery)
```

or

```python
from subquery import OllamaSubqueryGenerator

# Using the Ollama backend
generator = OllamaSubqueryGenerator()
result = generator.generate(
    "Are the Indiana Harbor and Ship Canal and the Folsom South Canal in the same state?"
)

print("Follow-up questions:", result.follow_up)
print("Subqueries:", result.subquery)
```
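To make the intended use concrete, here is a minimal sketch of wiring the generator into a RAG retrieval loop. It assumes `result.follow_up` and `result.subquery` are lists of strings, as the examples above suggest; `retrieve` is a hypothetical stand-in for your own retriever (vector store, BM25, etc.), not part of the `subquery` package.

```python
from subquery import TransformersSubqueryGenerator


def retrieve(query: str) -> list[str]:
    # Hypothetical stand-in for your retriever (vector store, BM25, ...).
    # Replace with a real search over your corpus.
    return [f"passage matching: {query}"]


generator = TransformersSubqueryGenerator()
result = generator.generate(
    "Who directed the film that won the Palme d'Or the year Titanic was released?"
)

if result.follow_up:
    # The model judged the question too vague; surface the follow-ups
    # to the user instead of retrieving on a bad query.
    print("Please clarify:", result.follow_up)

# Retrieve per subquery and pool the contexts for your answer-generating LLM.
contexts = [doc for sq in result.subquery for doc in retrieve(sq)]
print(f"Retrieved {len(contexts)} passages for {len(result.subquery)} subqueries")
```

The point of retrieving per subquery is that documents relevant to an intermediate hop rarely co-occur with the final answer in a single passage, so each step gets its own retrieval pass before the contexts are pooled.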