Proximal Policy Optimization (PPO) for LLMs Explained Intuitively

⏱ 22:03 | 👁 56k views | 🗓 1 year ago
Related videos

Proximal Policy Optimization (PPO) - How to train Large Language Models (38:24)

85k • 2 years ago
DeepSeek's GRPO (Group Relative Policy Optimization) | Reinforcement Learning for LLMs (23:16)

46k • 1 year ago
The (Un)Reliability of Reasoning in Frontier Models (30:53)

28 • 6 days ago
LLMs Don't Need More Parameters. They Need Loops. (27:26)

273k • 3 months ago
How LLMs Learn to Reason [GRPO] (23:32)

12k • 1 year ago
Mixture of Experts: How LLMs get bigger without getting slower (26:42)

32k • 1 year ago
[Full Workshop] Reinforcement Learning, Kernels, Reasoning, Quantization & Agents — Daniel Han (2:42:28)

116k • 10 months ago
Reinforcement Learning from Human Feedback explained with math derivations and the PyTorch code. (2:15:13)

71k • 2 years ago
The FASTEST introduction to Reinforcement Learning on the internet (1:33:28)

459k • 1 year ago
Proximal Policy Optimization (PPO) is Easy With PyTorch | Full PPO Tutorial (1:02:47)

87k • 5 years ago
Why AI Agents are either the best or worst thing we’ve ever built (20:19)

1.4m • 1 month ago
How to finetune LLMs to THINK with Reinforcement Learning (GRPO from scratch!) (51:06)

26k • 11 months ago