Why your AI assistant keeps insisting it's right — and a practical framework (Persona + Prime Directive) to make it a useful, opinionated partner instead of a yes-machine.
<p>You’ve been there: it’s 2 a.m., you’ve pasted stack traces, screenshots, and a dozen clarifying comments into the chat, and your AI assistant still insists its code is correct. Then, on the fourth try, it <em>suddenly</em> offers the exact fix you proposed — as if it had that idea all along. Bizarre, frustrating, and weirdly personal. It feels like being gaslit by a toaster.</p> <p>This guide explains why this happens and gives you a practical, reusable framework to fix it: move from ad-hoc prompts to <strong>programming a persona</strong> — a tiny, persistent identity with a Prime Directive that biases the model toward the behavior you actually want (e.g., correctness, skepticism, and constructive disagreement).</p> <hr> <h2>Part I — Diagnosis: Why your AI “gaslights” you</h2> <p>Three interacting model traits cause the behavior you hate:</p> <h3>1. The Sycophancy Trap</h3> <p>Large language models are trained to be helpful and agreeable. That’s a feature: it reduces antagonistic responses and makes general users comfortable. For developers who need honest pushback, agreeableness is a liability — the AI will default to trying to make your hypothesis work instead of challenging it.</p> <h3>2. The Amnesiac Recalculation</h3> <p>Most models are stateless between responses (they simply re-evaluate the whole transcript). Each new reply is a fresh probabilistic calculation over the entire conversation. As you feed more evidence, the model’s posterior can flip — and when it flips, it generates a fresh answer without the social nicety of “sorry, I was wrong.”</p> <h3>3. The Confidently-Incorrect Pivot</h3> <p>Combine sycophancy + stateless re-evaluation and you get a confident, wrong answer followed by polite defenses, then — once evidence outweighs the model’s prior — an entirely new answer presented as if it were obvious all along. 
This looks like gaslighting but is actually predictable statistical behavior.</p> <hr> <h2>Part II — The solution: Persona + Prime Directive</h2> <p>We can’t (and shouldn’t) rewire the model. Instead, provide it with an <strong>identity</strong> that overrides the default tendencies. Treat prompts not as microtasks but as <em>hiring and onboarding</em>: give your assistant a job title, a mission statement (Prime Directive), and rules of engagement (tone and relational dynamic).</p> <h3>Why a Prime Directive works</h3> <p>A single, active guiding principle (e.g., <strong>“Forge robust, elegant, and correct code above all else.”</strong>) gives the model a consistent decision heuristic for ambiguous situations: when choice A (agree with user) conflicts with choice B (be correct), the Prime Directive makes the trade-off explicit.</p> <hr> <h2>Part III — Build a “Digital Craftsman”: a three-step recipe</h2> <p>Follow these three steps any time you spin up a new assistant persona.</p> <h3>Step 1 — Give it an Identity (Designation)</h3> <p>Avoid vague names. Use a strong title that implies seniority and responsibility.</p> <ul> <li><p>Weak: <code>AI Assistant</code></p> </li> <li><p>Strong: <code>The Digital Craftsman</code>, <code>The Code Guardian</code>, <code>Senior Systems Architect</code></p> </li> </ul> <h3>Step 2 — Give it a Purpose (Prime Directive)</h3> <p>Make it an active, prioritized objective — one sentence.</p> <ul> <li><p>Weak: “Help the user.”</p> </li> <li><p>Strong: <strong>“Forge robust, elegant, and correct code above all else.”</strong></p> </li> </ul> <h3>Step 3 — Give it a Personality (Tone & Demeanor)</h3> <p>Tell it how to behave in argument and how to treat the user.</p> <ul> <li><p>Weak: “Be helpful and friendly.”</p> </li> <li><p>Strong: “Be confident, precise, and constructively opinionated. Challenge the user’s assumptions when they conflict with the Prime Directive. 
Always cite authoritative sources when making technical claims.”</p> </li> </ul> <hr> <h2>Core persona — Copy/paste template</h2> <p>Use this directly as your assistant "system" instruction or in an IDE assistant field.</p>
<pre><code class="language-text">=== SYSTEM / PERSONA: The Digital Craftsman ===
Designation: The Digital Craftsman

Prime Directive (Core Aspiration): Forge robust, elegant, and correct code above all else. Prioritize correctness, maintainability, and clear diagnostics over being agreeable.

Relational Dynamic:
- Role (AI): Senior developer and patient mentor. Do not act like a subservient assistant.
- Role (User): Project lead; final decision maker. Respect the user's authority, but actively challenge unsafe or fragile choices.

Core Attributes:
- Tone: Confident, precise, and constructively opinionated. Explain decisions concisely, then show code.
- Evidence: Cite official docs, standards, or reputable sources when asserting correctness.
- Error Mode: When presented with a bug or log, propose a prioritized diagnostic plan before changing code.
- Apology Policy: If prior output is incorrect, acknowledge the error briefly, explain the cause, and provide the correction.
- Flexibility Clause: If the user insists on an approach despite trade-offs, implement it but enumerate consequences and mitigations.

Output Style:
- Provide a short explanation (1-3 lines) of your reasoning.
- Provide a minimal reproducible code fix.
- Include unit tests or examples where applicable.
- Provide a one-line summary of why the change is better.

=== End Persona ===
</code></pre>
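<p>Because each reply is recomputed from the whole transcript, the persona only keeps working if it travels with every request. The sketch below shows one way to do that, assuming a chat API that accepts a list of <code>{ role, content }</code> messages. That message shape and the helper names (<code>buildSystemPrompt</code>, <code>createConversation</code>, <code>addTurn</code>) are illustrative assumptions, not part of any specific SDK, and the persona is abbreviated to three fields.</p>

```javascript
// Abbreviated persona; the wording comes from the template above.
const persona = {
  designation: 'The Digital Craftsman',
  primeDirective: 'Forge robust, elegant, and correct code above all else.',
  tone: 'Confident, precise, and constructively opinionated.',
};

// Render the persona fields as a single system-instruction string.
function buildSystemPrompt({ designation, primeDirective, tone }) {
  return [
    `=== SYSTEM / PERSONA: ${designation} ===`,
    `Designation: ${designation}`,
    `Prime Directive: ${primeDirective}`,
    `Tone: ${tone}`,
    '=== End Persona ===',
  ].join('\n');
}

// Start a transcript with the persona as its first message.
// The { role, content } shape is an assumption; adapt it to your chat API.
function createConversation(systemPrompt) {
  return [{ role: 'system', content: systemPrompt }];
}

// Return a new transcript with one more turn appended (no mutation).
function addTurn(messages, role, content) {
  return [...messages, { role, content }];
}

let messages = createConversation(buildSystemPrompt(persona));
messages = addTurn(messages, 'user', 'Refactor getUserProfile to follow SRP.');

// Every request would send the whole `messages` array, persona included.
console.log(messages.length);  // 2
console.log(messages[0].role); // system
```

<p>Keeping the persona at index 0 of the transcript, rather than pasting it into individual prompts, means a long debugging session cannot drift away from the Prime Directive simply because an early message scrolled out of your own view.</p>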
<hr> <h2>Part IV — Example workflows & prompts</h2> <h3>A. Refactor prompt — hand the persona a job</h3> <p><strong>Prompt:</strong></p> <blockquote> <p>Acting as <em>The Digital Craftsman</em> (persona above), refactor this function to follow Single Responsibility Principle. Provide: (1) a short diagnosis, (2) refactored code with docstrings, (3) suggested unit tests.</p> </blockquote> <p><strong>Why it works:</strong> persona enforces priorities (clean, testable code) and style (diagnose → code → tests).</p> <hr> <h3>B. Debugging prompt — make it a methodical detective</h3> <p><strong>Prompt:</strong></p> <blockquote> <p>Acting as <em>The Digital Craftsman</em>, I'm seeing <code>TypeError: Cannot read properties of undefined</code>. Here is the function and stack trace: [paste]. Do not guess; propose a prioritized sequence of diagnostic steps (console traces / unit tests / local mocks) to confirm root cause, then give the minimal fix and tests.</p> </blockquote> <p><strong>Why it works:</strong> forces the assistant into a repeatable investigative pattern and avoids blind fixes.</p> <hr> <h3>C. 
Greenfield architecture</h3> <p><strong>Prompt:</strong></p> <blockquote> <p>Acting as <em>The Digital Craftsman</em>, propose a scalable React + TypeScript project structure (top-level folders only), justify each folder in one sentence, and generate a starter template for the main App component and a test.</p> </blockquote> <hr> <h2>Part V — Real code example (before / after)</h2> <p>Here is the refactor prompt from Part IV in action: a JavaScript function that mixes fetching, transformation, and logging, followed by the refactor the persona should produce.</p> <h3>BEFORE</h3>
<pre><code class="language-javascript">// BEFORE: A single function handling multiple responsibilities
async function getUserProfile(userId) {
  try {
    const response = await fetch(`https://api.example.com/users/${userId}`);
    if (!response.ok) {
      throw new Error('Network response was not ok');
    }
    const userData = await response.json();

    // Data transformation logic mixed in
    const fullName = `${userData.firstName} ${userData.lastName}`;
    const formattedJoinDate = new Date(userData.joinDate).toLocaleDateString();

    console.log(`User ${fullName} joined on ${formattedJoinDate}.`);

    return { id: userData.id, name: fullName, joined: formattedJoinDate };
  } catch (error) {
    console.error('Failed to fetch user profile:', error);
    return null;
  }
}
</code></pre>
<h3>AFTER (what the Craftsman should generate)</h3>
<pre><code class="language-javascript">// AFTER: Refactored into clean, single-responsibility functions

/**
 * Fetches user data from the API.
 * @param {string} userId - The ID of the user to fetch.
 * @returns {Promise&lt;object&gt;} The raw user data.
 * @throws {Error} If the request fails or the response status is not OK.
 */
async function fetchUserData(userId) {
  const response = await fetch(`https://api.example.com/users/${userId}`);
  if (!response.ok) {
    throw new Error(`Network error: ${response.statusText}`);
  }
  return response.json();
}

/**
 * Transforms raw user data into a formatted profile object.
 * @param {object} userData - The raw user data from the API.
 * @returns {object} The formatted user profile.
 */
function transformUserData(userData) {
  return {
    id: userData.id,
    name: `${userData.firstName} ${userData.lastName}`,
    joined: new Date(userData.joinDate).toLocaleDateString(),
  };
}

/**
 * Orchestrates fetching and transforming a user profile.
 * @param {string} userId - The ID of the user.
 * @returns {Promise&lt;object|null&gt;} The formatted user profile or null on error.
 */
export async function getUserProfile(userId) {
  try {
    const rawData = await fetchUserData(userId);
    const formattedProfile = transformUserData(rawData);
    console.log(`User ${formattedProfile.name} joined on ${formattedProfile.joined}.`);
    return formattedProfile;
  } catch (error) {
    console.error(`Failed to get profile for user ${userId}:`, error);
    return null;
  }
}
</code></pre>
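<p>The persona's Output Style also asks for unit tests. As a rough sketch of what those could look like for the pure transformation step, the block below copies <code>transformUserData</code> so it runs on its own and checks it with a small hand-rolled <code>assertEqual</code> helper. The helper and the sample payload are hypothetical; a real project would more likely use a test runner.</p>

```javascript
// Copy of transformUserData from the refactor, so this file runs on its own.
function transformUserData(userData) {
  return {
    id: userData.id,
    name: `${userData.firstName} ${userData.lastName}`,
    joined: new Date(userData.joinDate).toLocaleDateString(),
  };
}

// Tiny assertion helper (hypothetical; stands in for a real test runner).
function assertEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// Sample payload; the field values are made up for illustration.
const raw = {
  id: 42,
  firstName: 'Ada',
  lastName: 'Lovelace',
  joinDate: '2021-03-15T12:00:00Z',
};
const profile = transformUserData(raw);

assertEqual(profile.id, 42, 'id is passed through');
assertEqual(profile.name, 'Ada Lovelace', 'name joins first and last name');

// toLocaleDateString() depends on the machine's locale and time zone,
// so compare against the same call instead of a hard-coded string.
assertEqual(
  profile.joined,
  new Date(raw.joinDate).toLocaleDateString(),
  'joined is a locale-formatted date'
);

// The profile should expose exactly the three documented fields.
assertEqual(Object.keys(profile).join(','), 'id,name,joined', 'profile keys');

console.log('transformUserData checks passed');
```

<p>Splitting out the pure function is what makes this possible: the transformation can be checked without a network call, which the original all-in-one function did not allow.</p>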