Clelia (Astra) Bertelli's picture

Clelia (Astra) Bertelli PRO

as-cle-bert

AI & ML interests

Biology + Artificial Intelligence = โค๏ธ | AI for sustainable development, sustainable development for AI | Researching on Machine Learning Enhancement | I love automation for everyday things | Blogger | Open Source

Recent Activity

replied to their post about 3 hours ago
๐ŸŽ‰๐„๐š๐ซ๐ฅ๐ฒ ๐๐ž๐ฐ ๐˜๐ž๐š๐ซ ๐ซ๐ž๐ฅ๐ž๐š๐ฌ๐ž๐ฌ๐ŸŽ‰ Hi HuggingFacers๐Ÿค—, I decided to ship early this year, and here's what I came up with: ๐๐๐Ÿ๐ˆ๐ญ๐ƒ๐จ๐ฐ๐ง (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have all your RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs in PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft GitHub Repo ๐Ÿ‘‰ https://github.com/AstraBert/PdfItDown PyPi Package ๐Ÿ‘‰ https://pypi.org/project/pdfitdown/ ๐’๐ž๐ง๐“๐ซ๐„๐ฏ ๐ฏ๐Ÿ.๐ŸŽ.๐ŸŽ (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the ๐—ฟ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฒ๐˜ƒ๐—ฎ๐—น performance of your ๐˜๐—ฒ๐˜…๐˜ ๐—ฒ๐—บ๐—ฏ๐—ฒ๐—ฑ๐—ฑ๐—ถ๐—ป๐—ด models, I have good news for you๐Ÿฅณ๐Ÿฅณ The new release for ๐’๐ž๐ง๐“๐ซ๐„๐ฏ now supports ๐—ฑ๐—ฒ๐—ป๐˜€๐—ฒ and ๐˜€๐—ฝ๐—ฎ๐—ฟ๐˜€๐—ฒ retrieval (thanks to FastEmbed by Qdrant) with ๐˜๐—ฒ๐˜…๐˜-๐—ฏ๐—ฎ๐˜€๐—ฒ๐—ฑ ๐—ณ๐—ถ๐—น๐—ฒ ๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐˜€ (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new ๐—ฟ๐—ฒ๐—น๐—ฒ๐˜ƒ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—บ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฐ๐˜€! GitHub repo ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv Release Notes ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0 PyPi Package ๐Ÿ‘‰ https://pypi.org/project/sentrev/ Happy New Year and have fun!๐Ÿฅ‚
posted an update about 9 hours ago
๐ŸŽ‰๐„๐š๐ซ๐ฅ๐ฒ ๐๐ž๐ฐ ๐˜๐ž๐š๐ซ ๐ซ๐ž๐ฅ๐ž๐š๐ฌ๐ž๐ฌ๐ŸŽ‰ Hi HuggingFacers๐Ÿค—, I decided to ship early this year, and here's what I came up with: ๐๐๐Ÿ๐ˆ๐ญ๐ƒ๐จ๐ฐ๐ง (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have all your RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs in PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft GitHub Repo ๐Ÿ‘‰ https://github.com/AstraBert/PdfItDown PyPi Package ๐Ÿ‘‰ https://pypi.org/project/pdfitdown/ ๐’๐ž๐ง๐“๐ซ๐„๐ฏ ๐ฏ๐Ÿ.๐ŸŽ.๐ŸŽ (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the ๐—ฟ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฒ๐˜ƒ๐—ฎ๐—น performance of your ๐˜๐—ฒ๐˜…๐˜ ๐—ฒ๐—บ๐—ฏ๐—ฒ๐—ฑ๐—ฑ๐—ถ๐—ป๐—ด models, I have good news for you๐Ÿฅณ๐Ÿฅณ The new release for ๐’๐ž๐ง๐“๐ซ๐„๐ฏ now supports ๐—ฑ๐—ฒ๐—ป๐˜€๐—ฒ and ๐˜€๐—ฝ๐—ฎ๐—ฟ๐˜€๐—ฒ retrieval (thanks to FastEmbed by Qdrant) with ๐˜๐—ฒ๐˜…๐˜-๐—ฏ๐—ฎ๐˜€๐—ฒ๐—ฑ ๐—ณ๐—ถ๐—น๐—ฒ ๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐˜€ (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new ๐—ฟ๐—ฒ๐—น๐—ฒ๐˜ƒ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—บ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฐ๐˜€! GitHub repo ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv Release Notes ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0 PyPi Package ๐Ÿ‘‰ https://pypi.org/project/sentrev/ Happy New Year and have fun!๐Ÿฅ‚
View all activity

Articles

Organizations

Social Post Explorers's profile picture Hugging Face Discord Community's profile picture GreenFit AI's profile picture

Posts 29

view post
Post
412
๐ŸŽ‰๐„๐š๐ซ๐ฅ๐ฒ ๐๐ž๐ฐ ๐˜๐ž๐š๐ซ ๐ซ๐ž๐ฅ๐ž๐š๐ฌ๐ž๐ฌ๐ŸŽ‰

Hi HuggingFacers๐Ÿค—, I decided to ship early this year, and here's what I came up with:

๐๐๐Ÿ๐ˆ๐ญ๐ƒ๐จ๐ฐ๐ง (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have all your RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs in PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft
GitHub Repo ๐Ÿ‘‰ https://github.com/AstraBert/PdfItDown
PyPi Package ๐Ÿ‘‰ https://pypi.org/project/pdfitdown/

๐’๐ž๐ง๐“๐ซ๐„๐ฏ ๐ฏ๐Ÿ.๐ŸŽ.๐ŸŽ (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the ๐—ฟ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฒ๐˜ƒ๐—ฎ๐—น performance of your ๐˜๐—ฒ๐˜…๐˜ ๐—ฒ๐—บ๐—ฏ๐—ฒ๐—ฑ๐—ฑ๐—ถ๐—ป๐—ด models, I have good news for you๐Ÿฅณ๐Ÿฅณ
The new release for ๐’๐ž๐ง๐“๐ซ๐„๐ฏ now supports ๐—ฑ๐—ฒ๐—ป๐˜€๐—ฒ and ๐˜€๐—ฝ๐—ฎ๐—ฟ๐˜€๐—ฒ retrieval (thanks to FastEmbed by Qdrant) with ๐˜๐—ฒ๐˜…๐˜-๐—ฏ๐—ฎ๐˜€๐—ฒ๐—ฑ ๐—ณ๐—ถ๐—น๐—ฒ ๐—ณ๐—ผ๐—ฟ๐—บ๐—ฎ๐˜๐˜€ (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new ๐—ฟ๐—ฒ๐—น๐—ฒ๐˜ƒ๐—ฎ๐—ป๐—ฐ๐—ฒ ๐—บ๐—ฒ๐˜๐—ฟ๐—ถ๐—ฐ๐˜€!
GitHub repo ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv
Release Notes ๐Ÿ‘‰ https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPi Package ๐Ÿ‘‰ https://pypi.org/project/sentrev/

Happy New Year and have fun!๐Ÿฅ‚
view post
Post
458
Hi HF Community!๐Ÿค—

As my last 2024 contribution, I decided to write an article about a Competitive Debate Championship simulation I ran with 5 LLMs as competitors and 2 as judges:

https://huggingface.co/blog/as-cle-bert/debate-championship-for-llms

The article covers code, analyses and results, and you can find everything to reproduce this tournament in the GitHub repo ๐Ÿ‘‰ https://github.com/AstraBert/DebateLLM-Championship

I also released a dataset related to the data (motions, arguments, topics, winners...) collected during the tournament ๐Ÿ‘‰ as-cle-bert/DebateLLMs

Happy reading and happy new yeAIr!๐ŸŽ‰