FirefliesAudio

Quantizing LLMs - How & Why (8-Bit, 4-Bit, GGUF & More)

⏱ 26:26 | 👁 25K views | 🗓 1 year ago

Related videos

Language Model Merging - Techniques, Tools, and Implementations

4.2k • 1 year ago

How LLMs survive in low precision | Quantization Fundamentals

56k • 1 year ago

Improving RAG Retrieval by 60% with Fine-Tuned Embeddings

25k • 1 year ago

Knowledge Graph or Vector Database… Which is Better?

117k • 1 year ago

400x Faster Embeddings! - Static & Distilled Embedding Models

7.7k • 1 year ago

The End of the GPU Era? 1-Bit LLMs Are Here.

132k • 2 months ago

Why Inference is hard..

158k • 1 month ago

Quantization in vLLM: From Zero to Hero

1.5k • 10 months ago

Teach LLM Something New 💡 LoRA Fine Tuning on Custom Data

109k • 10 months ago

Do Reranking Models Actually Improve RAG?

14k • 11 months ago

1-Bit LLM: The Most Efficient LLM Possible?

379k • 11 months ago

Which Quantization Method is Right for You? (GPTQ vs. GGUF vs. AWQ)

39k • 2 years ago