Furkan Gözükara

MonsterMMORPG

https://www.youtube.com/@SECourses

AI & ML interests

Check out my YouTube channel, SECourses, for Stable Diffusion tutorials. They will help you tremendously on every topic.

Articles

Full Training Tutorial and Guide and Research For a FLUX Style

Sep 8, 2024

• 5

20 New SDXL Fine Tuning Tests and Their Results (Better Workflow Obtained and Published)

Aug 13, 2024

• 1

Batch size 30 AdamW vs Batch Size 1 Adafactor SDXL Training Comparison

Aug 8, 2024

• 2

Expert-Level Tutorials on Stable Diffusion & SDXL: Master Advanced Techniques and Strategies

Jun 3, 2024

• 3

Posts 52

Post

SANA: Ultra HD Fast Text to Image Model from NVIDIA Step by Step Tutorial on Windows, Cloud & Kaggle — Generate 2048x2048 Images

Below is the YouTube link for a step-by-step tutorial and a 1-click installer with a very advanced Gradio app for running the newest text-to-image SANA model locally on your Windows PC and on cloud services such as Massed Compute, RunPod and free Kaggle.

https://youtu.be/KW-MHmoNcqo

The tutorial above covers the newest SANA 2K model, and I predict a SANA 4K model will be published as well. The SANA 2K model is 4 megapixels, so it can generate the following aspect ratios and resolutions very well:

"1:1": (2048, 2048), "4:3": (2304, 1792), "3:4": (1792, 2304),
"3:2": (2432, 1664), "2:3": (1664, 2432), "16:9": (2688, 1536),
"9:16": (1536, 2688), "21:9": (3072, 1280), "9:21": (1280, 3072),
"4:5": (1792, 2240), "5:4": (2240, 1792)
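As a rough illustration of how the list above can be used, here is a minimal Python sketch that checks each listed resolution is close to 4 megapixels and picks the listed resolution nearest to an arbitrary input aspect ratio. The names `SANA_2K_RESOLUTIONS`, `megapixels` and `closest_resolution` are hypothetical helpers written for this example; they are not taken from the Gradio app or the SANA code.

```python
import math

# Resolutions copied from the list above, keyed by aspect-ratio label.
# Each value is (width, height) in pixels.
SANA_2K_RESOLUTIONS = {
    "1:1": (2048, 2048), "4:3": (2304, 1792), "3:4": (1792, 2304),
    "3:2": (2432, 1664), "2:3": (1664, 2432), "16:9": (2688, 1536),
    "9:16": (1536, 2688), "21:9": (3072, 1280), "9:21": (1280, 3072),
    "4:5": (1792, 2240), "5:4": (2240, 1792),
}


def megapixels(width, height):
    """Pixel count in millions."""
    return width * height / 1_000_000


def closest_resolution(width, height):
    """Return (label, (w, h)) of the listed resolution with the nearest aspect ratio.

    Ratios are compared on a log scale so that wide and tall shapes
    are treated symmetrically.
    """
    target = math.log(width / height)
    return min(
        SANA_2K_RESOLUTIONS.items(),
        key=lambda item: abs(math.log(item[1][0] / item[1][1]) - target),
    )


if __name__ == "__main__":
    for label, (w, h) in SANA_2K_RESOLUTIONS.items():
        print(f"{label:>5}: {w}x{h} = {megapixels(w, h):.2f} MP")
    # A 1920x1080 source image maps to the 16:9 entry.
    print(closest_resolution(1920, 1080))
```

Every entry lands between roughly 3.9 and 4.2 megapixels, which matches the "4 megapixel" description of the 2K model.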

I have developed an amazing Gradio app with many new features:

VAE auto offloading that reduces VRAM usage significantly, which does not exist in the official pipeline

Gradio app built upon the official pipeline with improvements, so it works perfectly

Batch size works perfectly

Number of images works perfectly

Multi-line prompting works perfectly

Aspect ratios for both the 1K and 2K models work perfectly

Randomized seed works perfectly

1-Click installers for Windows (using Python 3.10 and an isolated VENV), RunPod, Massed Compute and even a free Kaggle account notebook

With the proper latest libraries, it runs at full speed on Windows too

Automatically saves every generated image into the correct folder

🔗 Full Instructions, Configs, Installers, Information and Links Shared Post (the one used in the tutorial) ⤵️
▶️ https://www.patreon.com/posts/click-to-open-post-used-in-tutorial-116474081

🔗 SECourses Official Discord 9500+ Members ⤵️
▶️ https://discord.com/servers/software-engineering-courses-secourses-772774097734074388

Post

Best open-source image-to-video model: CogVideoX1.5-5B-I2V is pretty decent and optimized for low-VRAM machines at high resolution. Native resolution is 1360px, with up to 10 seconds (161 frames). The audio was generated with a new open-source audio model.

Full YouTube tutorial for CogVideoX1.5-5B-I2V : https://youtu.be/5UCkMzP2VLE

1-Click Windows, RunPod and Massed Compute installers (install into a Python 3.11 VENV) : https://www.patreon.com/posts/112848192

Official Hugging Face repo of CogVideoX1.5-5B-I2V : THUDM/CogVideoX1.5-5B-I2V

Official github repo : https://github.com/THUDM/CogVideo

Text file of the prompts used to generate the videos : https://gist.github.com/FurkanGozukara/471db7b987ab8d9877790358c126ac05

Demo images are shared at : https://www.patreon.com/posts/112848192

I used 1360x768px images at 16 FPS with 81 frames: 80 generated frames = 5 seconds, plus 1 frame coming from the initial image.
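The frame arithmetic above can be written out as a short Python sketch. The function names are hypothetical, and it assumes the 161-frame maximum follows the same rule (one frame from the input image, the rest generated at 16 FPS).

```python
def clip_seconds(total_frames, fps=16):
    """Duration of a clip whose first frame is the input image."""
    generated_frames = total_frames - 1  # drop the initial image frame
    return generated_frames / fps


def frames_for_seconds(seconds, fps=16):
    """Total frame count to request for a clip of the given length."""
    return seconds * fps + 1  # +1 for the initial image frame


if __name__ == "__main__":
    print(clip_seconds(81))       # 5.0
    print(clip_seconds(161))      # 10.0
    print(frames_for_seconds(5))  # 81
```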

I also enabled all the optimizations shared on Hugging Face:

pipe.enable_sequential_cpu_offload()

pipe.vae.enable_slicing()

pipe.vae.enable_tiling()

quantization = int8_weight_only - you need TorchAO and DeepSpeed works great on Windows with Python 3.11 VENV
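A minimal sketch of how the three calls above might be grouped, assuming `pipe` is an already loaded diffusers-style pipeline (for example one loaded from THUDM/CogVideoX1.5-5B-I2V) that exposes these methods. The helper name `apply_low_vram_settings` is hypothetical, and the int8 weight-only quantization step is not shown here.

```python
def apply_low_vram_settings(pipe):
    """Apply the three memory optimizations listed above to a loaded pipeline.

    `pipe` is assumed to expose enable_sequential_cpu_offload() and a `vae`
    attribute with enable_slicing() / enable_tiling().
    """
    # Keep submodules on the CPU and move each one to the GPU only while it runs.
    # This trades speed for a much lower VRAM footprint.
    pipe.enable_sequential_cpu_offload()
    # Let the VAE process its input in slices instead of all at once.
    pipe.vae.enable_slicing()
    # Let the VAE process each input in tiles to cap peak memory.
    pipe.vae.enable_tiling()
    return pipe
```

Returning `pipe` lets the helper be chained directly after loading the model.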

Used audio model : https://github.com/hkchengrex/MMAudio

1-Click Windows, RunPod and Massed Compute installers for MMAudio (install into a Python 3.10 VENV) : https://www.patreon.com/posts/117990364

I used very simple prompts. It fails when there is a human in the input video, so use text-to-audio in such cases.

I also measured VRAM usage for CogVideoX1.5-5B-I2V.

Resolutions and their VRAM requirements (may also work on lower-VRAM GPUs, but slower):

512x288 - 41 frames : 7700 MB
576x320 - 41 frames : 7900 MB
576x320 - 81 frames : 8850 MB
704x384 - 81 frames : 8950 MB
768x432 - 81 frames : 10600 MB
896x496 - 81 frames : 12050 MB
960x528 - 81 frames : 12850 MB
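To turn these measurements into a quick lookup, here is a small Python sketch that returns the most demanding measured setting that still fits a given VRAM budget. The table values are copied from the figures above; the names `MEASURED_VRAM_MB` and `largest_setting_that_fits` are hypothetical. Since lower-VRAM GPUs may still work (just slower), treat the result as a comfortable starting point rather than a hard limit.

```python
# (width, height, frames) -> measured VRAM in MB, copied from the list above.
MEASURED_VRAM_MB = {
    (512, 288, 41): 7700,
    (576, 320, 41): 7900,
    (576, 320, 81): 8850,
    (704, 384, 81): 8950,
    (768, 432, 81): 10600,
    (896, 496, 81): 12050,
    (960, 528, 81): 12850,
}


def largest_setting_that_fits(vram_mb):
    """Return the (width, height, frames) with the highest measured usage <= vram_mb.

    Returns None when even the smallest measured setting exceeds the budget.
    """
    fitting = [(mb, cfg) for cfg, mb in MEASURED_VRAM_MB.items() if mb <= vram_mb]
    if not fitting:
        return None
    # Tuples sort by measured MB first, so max() picks the most demanding fit.
    return max(fitting)[1]


if __name__ == "__main__":
    print(largest_setting_that_fits(12000))  # (768, 432, 81)
    print(largest_setting_that_fits(8000))   # (576, 320, 41)
```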
