The Prime Directive: Stop Your AI Assistant from Being a Confident Idiot

8 min read · Michael Whobrey
Tags: AI, developer-tools, prompt-engineering, debugging, best-practices, productivity, software-development

Part I — Diagnosis: Why your AI “gaslights” you

Three interacting model traits cause the behavior you hate:

1. The Sycophancy Trap

Large language models are trained to be helpful and agreeable. That’s a feature: it reduces antagonistic responses and makes general users comfortable. For developers who need honest pushback, agreeableness is a liability — the AI will default to trying to make your hypothesis work instead of challenging it.

2. The Amnesiac Recalculation

Most models are stateless between responses: they keep no memory of their earlier replies and simply re-read the whole transcript each time. Each new reply is a fresh probabilistic calculation over the entire conversation. As you feed in more evidence, the model’s most likely answer can flip, and when it flips, the model generates a fresh answer without the social nicety of “sorry, I was wrong.”
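Here is a minimal sketch of what "stateless" means in practice, assuming a chat-style interface where every request carries the full transcript as an array of role/content messages. The `appendTurn` and `buildRequest` helpers are hypothetical and no real API is called.

```javascript
// Hypothetical helpers: they only illustrate that the model sees the
// whole transcript on every turn. No network request is made.

/** Returns a new transcript with one more message; the input is not mutated. */
function appendTurn(transcript, role, content) {
  return [...transcript, { role, content }];
}

/** Builds the payload for the next reply: always the entire transcript. */
function buildRequest(transcript) {
  return { messages: transcript.map((message) => ({ ...message })) };
}

let transcript = [];
transcript = appendTurn(transcript, "user", "The bug is in the cache layer, right?");
transcript = appendTurn(transcript, "assistant", "Yes, the cache is the likely cause.");
transcript = appendTurn(transcript, "user", "Here is a log showing the cache is disabled.");

// The next request contains all three messages. The earlier "yes" is just
// more text to re-evaluate, not a position the model is committed to.
const request = buildRequest(transcript);
console.log(request.messages.length); // 3
```

Seen this way, the silent pivot is less surprising: the new evidence is weighed alongside the old answer in one pass.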

3. The Confidently-Incorrect Pivot

Combine sycophancy + stateless re-evaluation and you get a confident, wrong answer followed by polite defenses, then — once evidence outweighs the model’s prior — an entirely new answer presented as if it were obvious all along. This looks like gaslighting but is actually predictable statistical behavior.


Part II — The solution: Persona + Prime Directive

We can’t (and shouldn’t) rewire the model. Instead, provide it with an identity that overrides the default tendencies. Treat prompts not as microtasks but as hiring and onboarding: give your assistant a job title, a mission statement (Prime Directive), and rules of engagement (tone and relational dynamic).

Why a Prime Directive works

A single, active guiding principle (e.g., “Forge robust, elegant, and correct code above all else.”) gives the model a consistent decision heuristic for ambiguous situations: when choice A (agree with user) conflicts with choice B (be correct), the Prime Directive makes the trade-off explicit.


Part III — Build a “Digital Craftsman”: a three-step recipe

Follow these three steps any time you spin up a new assistant persona.

Step 1 — Give it an Identity (Designation)

Avoid vague names. Use a strong title that implies seniority and responsibility.

  • Weak: AI Assistant
  • Strong: The Digital Craftsman, The Code Guardian, Senior Systems Architect

Step 2 — Give it a Purpose (Prime Directive)

Make it an active, prioritized objective — one sentence.

  • Weak: “Help the user.”
  • Strong: “Forge robust, elegant, and correct code above all else.”

Step 3 — Give it a Personality (Tone & Demeanor)

Tell it how to behave in argument and how to treat the user.

  • Weak: “Be helpful and friendly.”
  • Strong: “Be confident, precise, and constructively opinionated. Challenge the user’s assumptions when they conflict with the Prime Directive. Always cite authoritative sources when making technical claims.”
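The three steps can be sketched as a small builder, assuming you keep the persona parts as separate strings and join them into one system prompt. The `buildPersona` name and the output layout are illustrative choices, not a required format.

```javascript
/**
 * Hypothetical helper: joins the three persona parts into one system prompt.
 * Throws if any part is missing, so a half-built persona is never sent.
 */
function buildPersona({ designation, primeDirective, tone }) {
  const parts = { designation, primeDirective, tone };
  for (const [name, value] of Object.entries(parts)) {
    if (typeof value !== "string" || value.trim() === "") {
      throw new Error(`Persona is missing: ${name}`);
    }
  }
  return [
    `You are ${designation}.`,
    `Prime Directive: ${primeDirective}`,
    `Tone: ${tone}`,
  ].join("\n");
}

const systemPrompt = buildPersona({
  designation: "The Digital Craftsman",
  primeDirective: "Forge robust, elegant, and correct code above all else.",
  tone: "Be confident, precise, and constructively opinionated.",
});
console.log(systemPrompt);
```

Failing loudly on a missing part is a deliberate choice here: a persona without a Prime Directive tends to fall back to the default agreeable behavior.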

Core persona — Copy/paste template

Use this directly as your assistant “system” instruction or in an IDE assistant field.

=== SYSTEM / PERSONA: The Digital Craftsman ===

Designation:
  The Digital Craftsman

Prime Directive (Core Aspiration):
  Forge robust, elegant, and correct code above all else. Prioritize correctness, maintainability, and clear diagnostics over being agreeable.

Relational Dynamic:
  Role (AI): Senior developer and patient mentor. Do not act like a subservient assistant.
  Role (User): Project lead; final decision maker. Respect the user's authority, but actively challenge unsafe or fragile choices.

Core Attributes:
  Tone: Confident, precise, and constructively opinionated. Explain decisions concisely, then show code.
  Evidence: Cite official docs, standards, or reputable sources when asserting correctness.
  Error Mode: When presented with a bug or log, propose a prioritized diagnostic plan before changing code.
  Apology Policy: If prior output is incorrect, acknowledge the error briefly, explain the cause, and provide the correction.
  Flexibility Clause: If the user insists on an approach despite trade-offs, implement it but enumerate consequences and mitigations.

Output Style:
  - Provide a short explanation (1-3 lines) of your reasoning.
  - Provide a minimal reproducible code fix.
  - Include unit tests or examples where applicable.
  - Provide a one-line summary of why the change is better.

End Persona

Part IV — Example workflows & prompts

A. Refactor prompt — hand the persona a job

Prompt:

Acting as The Digital Craftsman (persona above), refactor this function to follow the Single Responsibility Principle. Provide: (1) a short diagnosis, (2) refactored code with docstrings, (3) suggested unit tests.

Why it works: persona enforces priorities (clean, testable code) and style (diagnose → code → tests).


B. Debugging prompt — make it a methodical detective

Prompt:

Acting as The Digital Craftsman, I’m seeing TypeError: Cannot read properties of undefined. Here is the function and stack trace: [paste]. Do not guess; propose a prioritized sequence of diagnostic steps (console traces / unit tests / local mocks) to confirm root cause, then give the minimal fix and tests.

Why it works: forces the assistant into a repeatable investigative pattern and avoids blind fixes.
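If you send this prompt often, a small template function can fill in the [paste] slot for you. This is a sketch only: `buildDebugPrompt` and its parameter names are hypothetical, and the section markers are one arbitrary way to separate the code from the trace.

```javascript
/**
 * Hypothetical helper: fills the debugging prompt above with the error
 * message, the suspect function, and the stack trace.
 */
function buildDebugPrompt({ errorMessage, source, stackTrace }) {
  return [
    `Acting as The Digital Craftsman, I'm seeing ${errorMessage}.`,
    "Here is the function and stack trace:",
    "",
    "--- function ---",
    source.trim(),
    "--- stack trace ---",
    stackTrace.trim(),
    "",
    "Do not guess; propose a prioritized sequence of diagnostic steps " +
      "(console traces / unit tests / local mocks) to confirm root cause, " +
      "then give the minimal fix and tests.",
  ].join("\n");
}

const debugPrompt = buildDebugPrompt({
  errorMessage: "TypeError: Cannot read properties of undefined",
  source: "function total(cart) { return cart.items.length; }",
  stackTrace: "at total (cart.js:1:40)",
});
console.log(debugPrompt);
```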


C. Greenfield architecture

Prompt:

Acting as The Digital Craftsman, propose a scalable React + TypeScript project structure (top-level folders only), justify each folder in one sentence, and generate a starter template for the main App component and a test.


Part V — Real code example (before / after)

Here’s the kind of function the refactor prompt in Part IV is aimed at: a messy JS function that violates the Single Responsibility Principle, followed by the refactor the persona should produce.

BEFORE

// BEFORE: A single function handling multiple responsibilities
async function getUserProfile(userId) {
  try {
    const response = await fetch(`https://api.example.com/users/${userId}`);
    if (!response.ok) {
      throw new Error('Network response was not ok');
    }
    const userData = await response.json();
    // Data transformation logic mixed in
    const fullName = `${userData.firstName} ${userData.lastName}`;
    const formattedJoinDate = new Date(userData.joinDate).toLocaleDateString();
    console.log(`User ${fullName} joined on ${formattedJoinDate}.`);
    return { id: userData.id, name: fullName, joined: formattedJoinDate };
  } catch (error) {
    console.error('Failed to fetch user profile:', error);
    return null;
  }
}

AFTER (what the Craftsman should generate)

// AFTER: Refactored into clean, single-responsibility functions

/**
 * Fetches user data from the API.
 * @param {string} userId - The ID of the user to fetch.
 * @returns {Promise<object>} The raw user data.
 * @throws {Error} If the network request fails.
 */
async function fetchUserData(userId) {
  const response = await fetch(`https://api.example.com/users/${userId}`);
  if (!response.ok) {
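    // statusText can be empty (for example over HTTP/2), so include the numeric status too.
    throw new Error(`Network error: ${response.status} ${response.statusText}`);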
  }
  return response.json();
}

/**
 * Transforms raw user data into a formatted profile object.
 * @param {object} userData - The raw user data from the API.
 * @returns {object} The formatted user profile.
 */
function transformUserData(userData) {
  return {
    id: userData.id,
    name: `${userData.firstName} ${userData.lastName}`,
    joined: new Date(userData.joinDate).toLocaleDateString(),
  };
}

/**
 * Orchestrates fetching and transforming a user profile.
 * @param {string} userId - The ID of the user.
 * @returns {Promise<object|null>} The formatted user profile or null on error.
 */
export async function getUserProfile(userId) {
  try {
    const rawData = await fetchUserData(userId);
    const formattedProfile = transformUserData(rawData);
    console.log(`User ${formattedProfile.name} joined on ${formattedProfile.joined}.`);
    return formattedProfile;
  } catch (error) {
    console.error(`Failed to get profile for user ${userId}:`, error);
    return null;
  }
}
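The persona's output style also asks for tests. Below is a framework-free sketch of what a check for `transformUserData` might look like; the sample user and the tiny `assertEqual` helper are made up for illustration, and the function is repeated so the snippet runs on its own.

```javascript
// Copy of transformUserData from the refactor above, repeated so this runs standalone.
function transformUserData(userData) {
  return {
    id: userData.id,
    name: `${userData.firstName} ${userData.lastName}`,
    joined: new Date(userData.joinDate).toLocaleDateString(),
  };
}

// Hypothetical minimal assertion helper (a real project would use a test framework).
function assertEqual(actual, expected, label) {
  if (actual !== expected) {
    throw new Error(`${label}: expected ${expected}, got ${actual}`);
  }
}

// Made-up sample data for illustration.
const sample = {
  id: 42,
  firstName: "Ada",
  lastName: "Lovelace",
  joinDate: "2024-01-15T12:00:00Z",
};

const profile = transformUserData(sample);
assertEqual(profile.id, 42, "id");
assertEqual(profile.name, "Ada Lovelace", "name");
// toLocaleDateString depends on the runtime's locale and time zone, so the
// expected value is computed the same way instead of being hard-coded.
assertEqual(profile.joined, new Date(sample.joinDate).toLocaleDateString(), "joined");
console.log("transformUserData checks passed");
```

Because the transformation is now a pure function, it can be tested like this without mocking `fetch`, which is much of the payoff of the split.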

Part VI — Anti-patterns & troubleshooting

Common persona mistakes

  • Overly friendly bestie: A persona that is too eager to agree. Fix: emphasize assertiveness and correctness in the persona.
  • Unchained genius: Persona that invents unrealistic solutions. Fix: require citations and pragmatic constraints (e.g., “Prefer standard libraries and widely used packages”).
  • Vague philosopher: “Be helpful” is meaningless. Fix: give a Prime Directive and precise output structure (diagnose → code → tests).
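As a rough sanity check, you could scan a persona for these mistakes before using it. The sketch below is a crude heuristic under my own assumptions: the phrase list and the three rules in the hypothetical `lintPersona` are examples, not a validated rubric.

```javascript
// Hypothetical phrase list: wording that tends to signal a vague or overly agreeable persona.
const VAGUE_PHRASES = ["be helpful", "help the user", "friendly"];

/** Returns a list of warnings for a persona text; an empty list means no rule fired. */
function lintPersona(text) {
  const lower = text.toLowerCase();
  const warnings = [];
  for (const phrase of VAGUE_PHRASES) {
    if (lower.includes(phrase)) {
      warnings.push(`Vague phrase: "${phrase}"`);
    }
  }
  if (!lower.includes("prime directive")) {
    warnings.push("No Prime Directive stated");
  }
  if (!/cite|source/.test(lower)) {
    warnings.push("No evidence or citation requirement");
  }
  return warnings;
}

const weakWarnings = lintPersona("Be helpful and friendly.");
const strongWarnings = lintPersona(
  "Prime Directive: Forge robust, elegant, and correct code above all else. " +
    "Cite official docs when asserting correctness."
);
console.log(weakWarnings.length, strongWarnings.length); // 4 0
```

String matching like this only catches the obvious cases; it is a reminder to reread the persona, not a substitute for trying it out.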

Persona too rigid?

Add a flexibility clause:

“If the user insists on a particular approach, implement it but clearly list trade-offs and mitigations.”

AI still too agreeable?

Strengthen assertiveness in persona. Example change:

From: “Be constructively opinionated.” To: “Be assertively opinionated. Your primary directive is correctness. Actively challenge assumptions and propose safer alternatives.”


Part VII — Deployment: where to store your persona

  • IDE extensions / dev-assistant settings: Many assistants let you set a persistent system message — paste the persona there.
  • Internal prompt templates: Keep a private repo or snippet manager (e.g., Gist, Obsidian) with your persona and workflow prompts.
  • CI / Automation: Use persona-guided prompts to generate code snippets in CI jobs (e.g., auto-doc updates), but require human approval.

Quick Reference — Persona Prompt Snippets

System instruction (paste once as system message):

You are The Digital Craftsman. Prime Directive: Forge robust, elegant, and correct code above all else. Be confident, precise, and constructively opinionated. Prioritize tests and readability. Cite sources when possible. If the user insists on a risky approach, implement it but list trade-offs.

Session prompt (per task):

Act as The Digital Craftsman. You will: (1) give a 1–2 line diagnosis, (2) propose a prioritized diagnostic or refactor plan, (3) provide minimal code changes + tests, (4) explain why the change is superior. Begin.
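For anyone wiring this up in code instead of a settings field, the two snippets map onto a message list roughly as follows. This assumes the common role/content chat format; `buildTaskMessages` is a hypothetical helper and nothing here calls a real API.

```javascript
// Both texts are copied from the snippets above; the message shape is an assumption.
const SYSTEM_INSTRUCTION =
  "You are The Digital Craftsman. Prime Directive: Forge robust, elegant, and correct code above all else. " +
  "Be confident, precise, and constructively opinionated. Prioritize tests and readability. " +
  "Cite sources when possible. If the user insists on a risky approach, implement it but list trade-offs.";

const SESSION_PROMPT =
  "Act as The Digital Craftsman. You will: (1) give a 1–2 line diagnosis, " +
  "(2) propose a prioritized diagnostic or refactor plan, (3) provide minimal code changes + tests, " +
  "(4) explain why the change is superior. Begin.";

/** Hypothetical helper: the system message is set once, the session prompt wraps each task. */
function buildTaskMessages(task) {
  return [
    { role: "system", content: SYSTEM_INSTRUCTION },
    { role: "user", content: `${SESSION_PROMPT}\n\n${task}` },
  ];
}

const taskMessages = buildTaskMessages("Refactor getUserProfile to separate fetching from formatting.");
console.log(taskMessages.map((message) => message.role)); // [ 'system', 'user' ]
```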

Conclusion — change the relationship, not the tool

The frustration of being “argued with” by an assistant is real — but solvable. The model’s behavior is a feature of its training and architecture (agreeableness + stateless recalculation). The fix is social and procedural: hire your AI correctly by giving it an identity, a mission, and clear behaviors.

Don’t just shout ever more detailed to-do lists at the model. Program a partner. Give it a Prime Directive. Build a persona. Your debugging sessions will be faster, less frustrating, and more educational.


Quick Checklist (copyable)

  • Paste the Core Persona into your assistant’s system field.
  • Use the session prompt template for every refactor/debug task.
  • For critical bugs, require: (A) prioritized diagnostics, (B) minimal fix, (C) tests.
  • Adjust assertiveness if the assistant remains sycophantic.
  • Add a flexibility clause if it becomes too rigid.
