Clelia (Astra) Bertelli PRO

as-cle-bert

https://www.cleliasportfolio.xyz

AI & ML interests

as-cle-bert's activity

posted an update about 12 hours ago

Post

193

Are you using Obsidian to write your notes?
If the answer is yes, then this post might be for you!✅
I recently created 𝐨𝐛𝐬𝐢𝐝𝐢𝐚𝐧-𝐝𝐢𝐠𝐞𝐬𝐭, a Google Gemini-powered application that gives you feedback on the style and content of the documents you have been working on🧠

Repo 👉 https://github.com/AstraBert/obsidian-digest
PyPi Package 👉 https://pypi.org/project/obsidian-digest/

The app is available as:
- 𝐜𝐨𝐦𝐦𝐚𝐧𝐝-𝐥𝐢𝐧𝐞 𝐭𝐨𝐨𝐥: install it as a Python package with 𝗽𝗶𝗽, and run it from the terminal anytime!📦
- 𝐃𝐢𝐬𝐜𝐨𝐫𝐝 𝐁𝐨𝐭 𝐛𝐮𝐢𝐥𝐭 𝐟𝐫𝐨𝐦 𝐬𝐨𝐮𝐫𝐜𝐞 𝐜𝐨𝐝𝐞: clone the GitHub repo, install the needed dependencies through 𝗰𝗼𝗻𝗱𝗮, and run the bot. You will get hourly messages with suggestions and considerations about your activity on Obsidian in the previous hour🤖
- 𝐃𝐢𝐬𝐜𝐨𝐫𝐝 𝐁𝐨𝐭 𝐝𝐞𝐩𝐥𝐨𝐲𝐞𝐝 𝐥𝐨𝐜𝐚𝐥𝐥𝐲 𝐰𝐢𝐭𝐡 𝐝𝐨𝐜𝐤𝐞𝐫 𝐜𝐨𝐦𝐩𝐨𝐬𝐞: clone the GitHub repo and launch 𝗱𝗼𝗰𝗸𝗲𝗿 𝗰𝗼𝗺𝗽𝗼𝘀𝗲 𝘂𝗽. Docker builds an image on the fly with all the needed dependencies and scripts, and runs them. You'll get the same functionality as when running from source code, but with a much easier deployment process🐋
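
To give an idea of what "your activity on Obsidian in the previous hour" could mean in practice, here is a minimal sketch that lists the Markdown notes in a vault modified during the last hour. This is only an illustration, not the code from the obsidian-digest repo: the function name `recently_modified_notes`, its parameters, and the choice of file modification time as the activity signal are all assumptions.

```python
import time
from pathlib import Path


def recently_modified_notes(vault_dir, window_seconds=3600, now=None):
    """Return Markdown files under vault_dir modified in the last window_seconds.

    Hypothetical helper for illustration only; not part of obsidian-digest.
    """
    now = time.time() if now is None else now
    cutoff = now - window_seconds
    notes = []
    # Obsidian stores notes as plain .md files, so a recursive glob is enough.
    for path in Path(vault_dir).rglob("*.md"):
        if path.is_file() and path.stat().st_mtime >= cutoff:
            notes.append(path)
    return sorted(notes)
```

The returned paths could then be read and passed to whatever model produces the feedback; that step is left out here because it depends on the actual application.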

Go check out the GitHub repo for more info 👉 https://github.com/AstraBert/obsidian-digest

Have fun!✨

replied to their post 2 days ago

Hi and thanks a lot for the clarification!🥰

Just as a note from my side, in the article I specify that there is a difference between "open weights" and "open source" models, and I link this blog post: https://www.agora.software/en/llm-open-source-open-weight-or-proprietary/ for a deeper explanation of the difference. I never claimed (and never would claim) that Llama is open source, let alone free software (see the introduction of this article of mine on privacy and data "stealing" risks: https://huggingface.co/blog/as-cle-bert/build-an-ai-powered-search-engine-from-scratch).

I would also have gladly used DeepSeek, if it had been available on HuggingChat! :)

I nevertheless highly appreciate your comment and I'll for sure be more cautious in using the words "open" and "open source" in the future. Thanks!✨

replied to their post 2 days ago

Both PdfItDown and SenTrEv only work with text for now: in future releases, support for images will be added :)
For text extraction, I use PyPDF + LangChain

posted an update 3 days ago

Post

1918

🎉𝐄𝐚𝐫𝐥𝐲 𝐍𝐞𝐰 𝐘𝐞𝐚𝐫 𝐫𝐞𝐥𝐞𝐚𝐬𝐞𝐬🎉

Hi HuggingFacers🤗, I decided to ship early this year, and here's what I came up with:

𝐏𝐝𝐟𝐈𝐭𝐃𝐨𝐰𝐧 (https://github.com/AstraBert/PdfItDown) - If you're like me, and you have your whole RAG pipeline optimized for PDFs, but not for other data formats, here is your solution! With PdfItDown, you can convert Word documents, presentations, HTML pages, markdown sheets and (why not?) CSVs and XMLs to PDF format, for seamless integration with your RAG pipelines. Built upon MarkItDown by Microsoft.
GitHub Repo 👉 https://github.com/AstraBert/PdfItDown
PyPi Package 👉 https://pypi.org/project/pdfitdown/

𝐒𝐞𝐧𝐓𝐫𝐄𝐯 𝐯𝟏.𝟎.𝟎 (https://github.com/AstraBert/SenTrEv/tree/v1.0.0) - If you need to evaluate the 𝗿𝗲𝘁𝗿𝗶𝗲𝘃𝗮𝗹 performance of your 𝘁𝗲𝘅𝘁 𝗲𝗺𝗯𝗲𝗱𝗱𝗶𝗻𝗴 models, I have good news for you🥳🥳
The new release for 𝐒𝐞𝐧𝐓𝐫𝐄𝐯 now supports 𝗱𝗲𝗻𝘀𝗲 and 𝘀𝗽𝗮𝗿𝘀𝗲 retrieval (thanks to FastEmbed by Qdrant) with 𝘁𝗲𝘅𝘁-𝗯𝗮𝘀𝗲𝗱 𝗳𝗶𝗹𝗲 𝗳𝗼𝗿𝗺𝗮𝘁𝘀 (.docx, .pptx, .csv, .html, .xml, .md, .pdf) and new 𝗿𝗲𝗹𝗲𝘃𝗮𝗻𝗰𝗲 𝗺𝗲𝘁𝗿𝗶𝗰𝘀!
GitHub repo 👉 https://github.com/AstraBert/SenTrEv
Release Notes 👉 https://github.com/AstraBert/SenTrEv/releases/tag/v1.0.0
PyPi Package 👉 https://pypi.org/project/sentrev/
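
For readers new to retrieval evaluation, here is a small sketch of two widely used relevance metrics, hit rate and mean reciprocal rank (MRR), computed over the top-k retrieved documents. The post does not say which metrics SenTrEv reports, so treat this as a generic illustration: the function names and the input format (pairs of retrieved IDs and one relevant ID) are assumptions, not the SenTrEv API.

```python
def reciprocal_rank(retrieved_ids, relevant_id):
    """Return 1/rank of the relevant document, or 0.0 if it was not retrieved."""
    for rank, doc_id in enumerate(retrieved_ids, start=1):
        if doc_id == relevant_id:
            return 1.0 / rank
    return 0.0


def evaluate_retrieval(results, k=3):
    """Compute hit rate and MRR over the top-k results of each query.

    results is a list of (retrieved_ids, relevant_id) pairs, one per query.
    Hypothetical helper for illustration only.
    """
    if not results:
        return {"hit_rate": 0.0, "mrr": 0.0}
    hits = sum(1 for retrieved, relevant in results if relevant in retrieved[:k])
    rr_total = sum(reciprocal_rank(retrieved[:k], relevant) for retrieved, relevant in results)
    return {"hit_rate": hits / len(results), "mrr": rr_total / len(results)}
```

Hit rate only asks whether the right document shows up in the top k, while MRR also rewards ranking it closer to the top.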

Happy New Year and have fun!🥂

2 replies

reacted to nroggendorff's post with ➕ 3 days ago

Post

5078

hey nvidia, can you send me a gpu?
comment or react if you want ~~me~~ to get one too. 👉👈

22 replies

posted an update 5 days ago

Post

509

Hi HF Community!🤗

As my last 2024 contribution, I decided to write an article about a Competitive Debate Championship simulation I ran with 5 LLMs as competitors and 2 as judges:

https://huggingface.co/blog/as-cle-bert/debate-championship-for-llms

The article covers code, analyses and results, and you can find everything to reproduce this tournament in the GitHub repo 👉 https://github.com/AstraBert/DebateLLM-Championship

I also released a dataset related to the data (motions, arguments, topics, winners...) collected during the tournament 👉 as-cle-bert/DebateLLMs

Happy reading and happy new yeAIr!🎉

3 replies

upvoted an article 5 days ago

Article

Debate Championship for LLMs • 5 days ago • 4

published an article 5 days ago

Article

Debate Championship for LLMs • 5 days ago • 4

updated a dataset 5 days ago

as-cle-bert/DebateLLMs

Viewer • Updated 5 days ago • 20 • 13 • 2

liked a dataset 6 days ago

as-cle-bert/DebateLLMs

Viewer • Updated 5 days ago • 20 • 13 • 2

liked 2 models 6 days ago

microsoft/Phi-3.5-mini-instruct

Text Generation • Updated Sep 18, 2024 • 478k • 736

google/gemma-2-2b-it

Text Generation • Updated Aug 27, 2024 • 381k • 842

posted an update 9 days ago

Post

2149

I got my GitHub Wrapped for 2024 today!🥂

Get yours here on HuggingFace 👉 as-cle-bert/what-a-git-year

GitHub repo with the code to reproduce it 👉 https://github.com/AstraBert/what-a-git-year

Hope that everybody had a Git year!🎉

1 reply

updated 2 Spaces 10 days ago

Running

🐨

Pokemon Bot

A bot that knows a lot about Pokemons

Running

🐢

What A Git Year

Showcase your GitHub achievements in the past year!

posted an update 11 days ago

Post

1700

Hi HuggingFacers!🤶🏼

As my last 2024 project, I've dropped a Discord Bot that knows a lot about Pokemons🦋

GitHub 👉 https://github.com/AstraBert/Pokemon-Bot
Demo Space 👉 as-cle-bert/pokemon-bot

The bot integrates:
- Chat features (Cohere's Command-R) with RAG functionalities (hybrid search and reranking with Qdrant) and chat memory (managed through PostgreSQL) to produce information about Pokemons
- Image-based search to identify Pokemons from their images (via Qdrant)
- Card package random extraction and description
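
As a rough illustration of the hybrid search idea mentioned above, here is a sketch of reciprocal rank fusion (RRF), one common way to merge a dense ranking and a sparse ranking into a single list. The post does not state which fusion strategy the bot uses, so this is an assumption made for the example; the function name and the constant `k=60` (a conventional default for RRF) are likewise illustrative and not taken from the Pokemon-Bot code.

```python
from collections import defaultdict


def reciprocal_rank_fusion(rankings, k=60):
    """Fuse several ranked lists of document IDs into one ranking.

    Each document scores 1 / (k + rank) in every list it appears in;
    the scores are summed and documents are sorted by total score.
    Hypothetical helper for illustration only.
    """
    scores = defaultdict(float)
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] += 1.0 / (k + rank)
    # Sort by descending score, breaking ties alphabetically for determinism.
    return sorted(scores, key=lambda doc_id: (-scores[doc_id], doc_id))


dense_hits = ["pikachu", "raichu", "eevee"]    # e.g. from a dense embedding model
sparse_hits = ["raichu", "pichu", "pikachu"]   # e.g. from a sparse model
fused = reciprocal_rank_fusion([dense_hits, sparse_hits])
```

Documents that rank well in both lists rise to the top, and a reranker can then reorder the fused candidates.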

HuggingFace🤗, as usual, plays the most important role in the application stack, with the following models:

- sentence-transformers/LaBSE
- prithivida/Splade_PP_en_v1
- facebook/dinov2-large

And datasets:

- Karbo31881/Pokemon_images
- wanghaofan/pokemon-wiki-captions
- TheFusion21/PokemonCards

Have fun!🍕

liked a Space 11 days ago

Running

🐨

Pokemon Bot

A bot that knows a lot about Pokemons

liked a dataset 12 days ago

wanghaofan/pokemon-wiki-captions

Viewer • Updated Dec 9, 2022 • 898 • 107 • 6

liked a dataset 16 days ago

Karbo31881/Pokemon_images

Viewer • Updated Dec 29, 2023 • 1.81k • 95 • 1

posted an update 24 days ago

Post

598

Hi HF Community!

I just published a blog article on building PrAIvateSearch (https://github.com/AstraBert/PrAIvateSearch), a user-owned, local and open-source AI-powered search engine🔍:

https://huggingface.co/blog/as-cle-bert/build-an-ai-powered-search-engine-from-scratch

"Own your AI, search the web with it🌐😎"

Feel free to try it out and contribute to it on GitHub: let's make OSS AI grow and thrive!🚀

Articles

Debate Championship for LLMs

Building an AI-powered search engine from scratch

streamlit_supabase_auth_ui

AI is turning nuclear: a review

Is AI carbon footprint worrisome?

_Repetita iuvant_: how to improve AI code generation

BrAIn: next generation neurons?

What is going on with AlphaFold3?
