FirefliesAudio

Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code.

⏱ 2:15:13 | 👁 71K views | 🗓 2 years ago

Related videos

Direct Preference Optimization (DPO) explained: Bradley-Terry model, log probabilities, math

36k • 2 years ago

[Full Workshop] Reinforcement Learning, Kernels, Reasoning, Quantization & Agents — Daniel Han

116k • 10 months ago

Distributed Training with PyTorch: complete tutorial with cloud infrastructure and code

38k • 2 years ago

Coding Stable Diffusion from scratch in PyTorch

217k • 2 years ago

PPO Implementation from Scratch | Reinforcement Learning

17k • 1 year ago

RLHF in 90 min

5.8k • 8 months ago

Yann LeCun | Self-Supervised Learning, JEPA, World Models, and the future of AI

125k • 8 months ago

Terence Tao: Nobody Understands Why AI Actually Works

248k • 5 months ago

Reinforcement Learning: A (practical) introduction

8.2k • 4 months ago

[GRPO Explained] DeepSeekMath: Pushing the Limits of Mathematical Reasoning in Open Language Models

171k • 1 year ago

Proximal Policy Optimization (PPO) for LLMs Explained Intuitively

56k • 1 year ago

Reinforcement Learning from Human Feedback: From Zero to chatGPT

188k • Streamed 3 years ago