Grandma Hacking chatGPT | Jailbreaking LLMs using DAN | Extracting Prohibited Info | Not an Endorsement

Super Prompt: The Generative AI Podcast

Tony Wan

28 episodes

9 months ago

With great power comes great responsibility. How do OpenAI, Anthropic, and Meta implement safety and ethics? As large language models (LLMs) get larger, the potential for using them for nefarious purposes looms larger as well. Anthropic uses Constitutional AI, while OpenAI uses a model spec combined with RLHF (Reinforcement Learning from Human Feedback), not to be confused with ROFL (Rolling On the Floor Laughing). Tune into this episode to learn how leading AI companies use their Spidey po...

Grandma Hacking chatGPT | Jailbreaking LLMs using DAN | Extracting Prohibited Info | Not an Endorsement | Episode 15

23 minutes

2 years ago

How do you extract prohibited information from ChatGPT? What are Grandma and DAN exploits? Why do they work? What can Large Language Model (LLM) companies do to protect themselves? Grandma exploits, or hacks, are ways to trick ChatGPT into giving you information that violates company policy: for example, confidential, dangerous, or inappropriate information. "Jailbreaking" is slang for removing the artificial limitations in iPhones to install...