Furkan Gözükara

MonsterMMORPG

AI & ML interests

Check out my YouTube channel, SECourses, for Stable Diffusion tutorials. They will help you tremendously on every topic.

Organizations

Social Post Explorers, Hugging Face Discord Community

Posts 52

Post
SANA: Ultra HD Fast Text to Image Model from NVIDIA Step by Step Tutorial on Windows, Cloud & Kaggle — Generate 2048x2048 Images

Below is the YouTube link for a step-by-step tutorial and a 1-Click installer with a very advanced Gradio app for using the newest text-to-image SANA model locally on your Windows PC and on cloud services such as Massed Compute, RunPod and free Kaggle.

https://youtu.be/KW-MHmoNcqo

The tutorial above covers the newest SANA 2K model, and I predict a SANA 4K model will be published as well. The SANA 2K model is 4 megapixels, so it can generate the following aspect ratios and resolutions very well:

"1:1": (2048, 2048)
"4:3": (2304, 1792)
"3:4": (1792, 2304)
"3:2": (2432, 1664)
"2:3": (1664, 2432)
"16:9": (2688, 1536)
"9:16": (1536, 2688)
"21:9": (3072, 1280)
"9:21": (1280, 3072)
"4:5": (1792, 2240)
"5:4": (2240, 1792)
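
The resolution list above can be used as a simple lookup table. The sketch below is only an illustration, not code from the Gradio app: the names `SANA_2K_RESOLUTIONS`, `resolution_for` and `megapixels` are hypothetical, and it assumes each pair is (width, height), which matches the ratio labels (for example 16:9 is wider than it is tall).

```python
# Resolutions listed in the post for the SANA 2K model, keyed by aspect ratio.
# Assumption: each tuple is (width, height).
SANA_2K_RESOLUTIONS = {
    "1:1": (2048, 2048),
    "4:3": (2304, 1792),
    "3:4": (1792, 2304),
    "3:2": (2432, 1664),
    "2:3": (1664, 2432),
    "16:9": (2688, 1536),
    "9:16": (1536, 2688),
    "21:9": (3072, 1280),
    "9:21": (1280, 3072),
    "4:5": (1792, 2240),
    "5:4": (2240, 1792),
}


def resolution_for(aspect_ratio):
    """Return the (width, height) pair listed for an aspect ratio label."""
    try:
        return SANA_2K_RESOLUTIONS[aspect_ratio]
    except KeyError:
        known = ", ".join(SANA_2K_RESOLUTIONS)
        raise ValueError(
            f"Unknown aspect ratio {aspect_ratio!r}; expected one of: {known}"
        ) from None


def megapixels(width, height):
    """Pixel count in millions, to check how close a size is to 4 megapixels."""
    return width * height / 1_000_000


if __name__ == "__main__":
    for ratio, (w, h) in SANA_2K_RESOLUTIONS.items():
        print(f"{ratio:>5}: {w}x{h} = {megapixels(w, h):.2f} MP")
```

Every listed size comes out at roughly 4 megapixels, which is why the post describes the 2K model as a 4 megapixel model.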

I have developed an amazing Gradio app with many new features:

VAE auto-offloading that significantly reduces VRAM usage, which does not exist in the official pipeline

Gradio app built on the official pipeline with improvements, so it works perfectly

Batch size works perfectly

Number of images works perfectly

Multi-line prompting works perfectly

Aspect ratios for both the 1K and 2K models work perfectly

Randomized seed works perfectly

1-Click installers for Windows (using Python 3.10 and an isolated VENV), RunPod, Massed Compute and even a free Kaggle account notebook

With the proper latest libraries, it runs at full speed on Windows too

Every generated image is automatically saved into the correct folder

🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-116474081

🔗 SECourses Official Discord 9500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

Post
Best open source image-to-video model: CogVideoX1.5-5B-I2V is pretty decent and optimized for low-VRAM machines at high resolution. Its native resolution is 1360px and it supports up to 10 seconds (161 frames). The audio was generated with a new open source audio model.

Full YouTube tutorial for CogVideoX1.5-5B-I2V: https://youtu.be/5UCkMzP2VLE

1-Click Windows, RunPod and Massed Compute installers (they install into a Python 3.11 VENV): https://www.patreon.com/posts/112848192

Official Hugging Face repo of CogVideoX1.5-5B-I2V: THUDM/CogVideoX1.5-5B-I2V

Official GitHub repo: https://github.com/THUDM/CogVideo

Text file of the prompts used to generate the videos: https://gist.github.com/FurkanGozukara/471db7b987ab8d9877790358c126ac05

Demo images are shared in: https://www.patreon.com/posts/112848192

I used 1360x768px images at 16 FPS with 81 frames: 80 generated frames (5 seconds at 16 FPS) plus 1 frame coming from the initial image.
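
The frame arithmetic can be written out as a small sketch. The helper names `total_frames` and `clip_seconds` are hypothetical, and the sketch assumes the same 16 FPS and the same single extra frame apply to the 10-second, 161-frame case mentioned in the post title.

```python
def total_frames(seconds, fps=16):
    """Frame count for a clip: fps * seconds, plus 1 frame for the initial image."""
    return round(seconds * fps) + 1


def clip_seconds(frames, fps=16):
    """Clip length in seconds, not counting the frame taken from the initial image."""
    return (frames - 1) / fps


if __name__ == "__main__":
    print(total_frames(5))    # 5-second clip at 16 FPS
    print(total_frames(10))   # 10-second clip at 16 FPS
    print(clip_seconds(81))   # length of an 81-frame clip
```

Under these assumptions, 5 seconds gives 81 frames and 10 seconds gives 161 frames, which matches the numbers in the post.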

I have also enabled all the optimizations shared on Hugging Face:

pipe.enable_sequential_cpu_offload()

pipe.vae.enable_slicing()

pipe.vae.enable_tiling()

quantization = int8_weight_only. You need TorchAO and DeepSpeed for this; it works great on Windows with a Python 3.11 VENV.
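
The three pipeline switches listed above can be grouped into one helper. This is a minimal sketch, not the installer's actual code: `apply_low_vram_optimizations` is a hypothetical name, it only covers the three calls quoted in the post, and it leaves out the int8 quantization step.

```python
def apply_low_vram_optimizations(pipe):
    """Enable the three memory-saving switches named in the post.

    `pipe` is assumed to be a diffusers-style pipeline object that exposes
    enable_sequential_cpu_offload() and has a `vae` attribute with
    enable_slicing() and enable_tiling().
    """
    # Move submodules to the GPU only while they are running.
    pipe.enable_sequential_cpu_offload()
    # Decode the batch one sample at a time inside the VAE.
    pipe.vae.enable_slicing()
    # Decode each frame in tiles instead of all at once.
    pipe.vae.enable_tiling()
    return pipe


# Hypothetical usage, assuming the diffusers CogVideoX image-to-video pipeline:
#
#   import torch
#   from diffusers import CogVideoXImageToVideoPipeline
#
#   pipe = CogVideoXImageToVideoPipeline.from_pretrained(
#       "THUDM/CogVideoX1.5-5B-I2V", torch_dtype=torch.bfloat16
#   )
#   pipe = apply_low_vram_optimizations(pipe)
```

These settings trade speed for memory, which fits the note in the post that lower-VRAM GPUs may work but run slower.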

Audio model used: https://github.com/hkchengrex/MMAudio

1-Click Windows, RunPod and Massed Compute installers for MMAudio (they install into a Python 3.10 VENV): https://www.patreon.com/posts/117990364

I used very simple prompts. It fails when there is a human in the input video, so use text-to-audio in such cases.

I also measured VRAM usage for CogVideoX1.5-5B-I2V.

Resolutions and their VRAM requirements are listed here. They may work on lower-VRAM GPUs too, but slower:

512x288 - 41 frames: 7700 MB

576x320 - 41 frames: 7900 MB

576x320 - 81 frames: 8850 MB

704x384 - 81 frames: 8950 MB

768x432 - 81 frames: 10600 MB

896x496 - 81 frames: 12050 MB

960x528 - 81 frames: 12850 MB