Self-RAG Validator: Second LLM Call Design
Part of phase-1-eval. Today introduces the second LLM call as a verbatim-quoting validator, so the pipeline can reject weak extractions instead of silently accepting them. It builds on the retry-with-rejection contract already present in the codebase and moves it toward reliable evaluation.
Resources
- 35 min · reading · arXiv · Self-RAG: Learning to Retrieve, Generate, and Critique through Self-Reflection
Sections 3 and 4 (reflection tokens and critique generation)
Codebase anchors
The Tribunal code that demonstrates today's concept.
The existing retry-with-rejection logic in extractPainWithRetry provides the baseline pattern that the new Self-RAG style validator (second LLM call requiring verbatim quotes) will extend or replace.
      /** The URL that finally produced a usable extraction. */
      usedUrl: string;
      /** How many URLs we tried (1 = first one worked, 3 = third one worked). */
      attempts: number;
    }

    /**
     * Try the primary URL, then each fallback in order, until one returns a
     * usable extraction or we exhaust MAX_PAIN_EXTRACTION_ATTEMPTS.
     *
     * @param urls    Ordered list — index 0 is the top-ranked signal, [1..N] are
     *                fallbacks selected from the same ranked pool. The function
     *                only consumes up to MAX_PAIN_EXTRACTION_ATTEMPTS of them.
     * @param extract Pure async function that runs the extraction for one URL.
     *                Throws on low-confidence (caught + retried) and on hard
     *                errors (propagated immediately).
     *
     * Throws the LAST low-confidence error if every attempt rejected. Throws
     * other errors immediately (no retry).
     */
    export async function extractPainWithRetry<T>(
      urls: string[],
      extract: (url: string) => Promise<T>,
    ): Promise<PainExtractionAttempt<T>> {
      if (urls.length === 0) {
        throw new Error('extractPainWithRetry: urls list is empty');
      }

      const candidates = urls.slice(0, MAX_PAIN_EXTRACTION_ATTEMPTS);
      let lastError: Error | null = null;

      for (let i = 0; i < candidates.length; i++) {
        const url = candidates[i];
        const attemptNum = i + 1;
        try {
          console.log(
            `[PAIN_RETRY] Attempt ${attemptNum}/${candidates.length}: ${url.substring(0, 80)}...`,
          );
          const extraction = await extract(url);
          if (attemptNum > 1) {
            console.log(`[PAIN_RETRY] ✅ Succeeded on attempt ${attemptNum} after ${i} low-confidence rejection(s)`);

The orchestrator's call site to extractPainWithRetry is the integration point where the second-LLM validator will be inserted after the initial extraction.
        extractPainFromUrl,
        enrichWithV3,
        runTribunalV2,
        generateRichIdeas,
        adaptV3PricingToPricerOutput,
      } = await import('./orchestrator-v2');

      // STEP 1: Pain Extraction with multi-shot retry (4-12s).
      //
      // Vercel-side pain discovery ranks dozens of candidate signals and
      // sends the orchestrator the top one + optional fallbacks. If Grok
      // rejects the primary as "low confidence" we walk the fallbacks
      // before aborting — wasteful otherwise, since the ranked pool
      // almost always has at least one usable runner-up.
      //
      // Cap of 3 total attempts (primary + 2 fallbacks). See
      // lib/spawnforge/extract-pain-with-retry.ts for the contract.
      console.log('\n📝 [COMPLETE] Step 1: Extracting pain point...');
      await emitSessionProgress(sessionId, 'extracting');

      const { extractPainWithRetry } = await import('./extract-pain-with-retry');
      const candidateUrls = [input.url, ...(input.fallbackUrls ?? [])];
      const retryResult = await extractPainWithRetry(
        candidateUrls,
        async (url) => {
          const result = await extractPainFromUrl(url, grok);
          // Stash content alongside extraction so the closure caller
          // can pull both — extract-pain-with-retry only cares about
          // extraction shape.
          return result;
        },
      );
      const { extraction, content } = retryResult.extraction;
      if (retryResult.attempts > 1) {
        console.log(
          `[COMPLETE] Pain extraction succeeded after ${retryResult.attempts} attempts (fell back to ranked signal #${retryResult.attempts}).`,
        );
      }

      // When the retry fell back past the primary URL, re-validate the
      // WINNING content against the slot's target country. The painDeliverable
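One way the validator could slot into this call site is to wrap the closure passed to extractPainWithRetry, so that a validator rejection surfaces as a low-confidence failure and the retry loop moves on to the next fallback URL. The sketch below is hedged: `withValidation`, `Verdict`, and `LowConfidenceError` are hypothetical names, and the excerpt does not show how the real code distinguishes low-confidence errors from hard errors, so the error type in particular is an assumption.

```typescript
// Hypothetical error type. The real low-confidence signal used by
// extractPainWithRetry is not visible in the excerpt above.
class LowConfidenceError extends Error {
  constructor(message: string) {
    super(message);
    this.name = 'LowConfidenceError';
    // Keeps `instanceof` working when compiling to older targets.
    Object.setPrototypeOf(this, new.target.prototype);
  }
}

// Assumed shape of the validator's answer.
type Verdict = { supported: boolean; reason: string };

/**
 * Wrap an extract function so every result must also pass a validator.
 * A rejected result is thrown as a low-confidence error, which a retry
 * loop can catch and treat like any other weak extraction.
 */
function withValidation<T>(
  extract: (url: string) => Promise<T>,
  validate: (result: T) => Promise<Verdict>,
): (url: string) => Promise<T> {
  return async (url: string): Promise<T> => {
    const result = await extract(url);
    const verdict = await validate(result);
    if (!verdict.supported) {
      throw new LowConfidenceError(`validator rejected ${url}: ${verdict.reason}`);
    }
    return result;
  };
}
```

Under these assumptions, the call site would pass `withValidation(extractFn, validateFn)` in place of the bare closure. The retry contract itself stays unchanged; only the definition of a "usable" extraction becomes stricter.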
Draft a rag-validator.ts module that exports a verifyExtraction function, plus one passing unit test in __tests__/lib/spawnforge/rag-validator.test.ts
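A minimal sketch of what such a module might look like, assuming the validator model is asked to reply with a JSON array of quotes and that the second LLM call is injected as a function. Only the name `verifyExtraction` comes from the task above. `AskValidator`, `VerificationResult`, `buildValidatorPrompt`, `checkQuotes`, the prompt wording, and the 20-character minimum are illustrative assumptions, not the codebase's actual design.

```typescript
// Sketch of rag-validator.ts. Every name except verifyExtraction is assumed.

export interface VerificationResult {
  accepted: boolean;
  verifiedQuotes: string[];
  rejectedQuotes: string[];
}

/** The second LLM call, injected so a unit test can stub it. */
export type AskValidator = (prompt: string) => Promise<string>;

/** Collapse whitespace runs so line wrapping does not defeat the match. */
const normalize = (s: string): string => s.replace(/\s+/g, ' ').trim();

/** Illustrative prompt: the model must answer with quotes, not opinions. */
export function buildValidatorPrompt(claim: string, source: string): string {
  return [
    'Decide whether the SOURCE supports the CLAIM.',
    'Reply with a JSON array of strings copied verbatim from the SOURCE.',
    'Reply with [] if no passage supports the claim.',
    `CLAIM: ${claim}`,
    `SOURCE: ${source}`,
  ].join('\n');
}

/** Accept only if at least one quote was given and every quote is in the source. */
export function checkQuotes(
  quotes: string[],
  source: string,
  minLength = 20, // assumed threshold to rule out trivially short quotes
): VerificationResult {
  const haystack = normalize(source);
  const verifiedQuotes: string[] = [];
  const rejectedQuotes: string[] = [];
  for (const quote of quotes) {
    const needle = normalize(quote);
    if (needle.length >= minLength && haystack.includes(needle)) {
      verifiedQuotes.push(quote);
    } else {
      rejectedQuotes.push(quote);
    }
  }
  const accepted = verifiedQuotes.length > 0 && rejectedQuotes.length === 0;
  return { accepted, verifiedQuotes, rejectedQuotes };
}

export async function verifyExtraction(
  claim: string,
  source: string,
  ask: AskValidator,
): Promise<VerificationResult> {
  // Errors from the call itself propagate; only a bad reply counts as rejection.
  const reply = await ask(buildValidatorPrompt(claim, source));
  let quotes: string[] = [];
  try {
    const parsed: unknown = JSON.parse(reply);
    if (Array.isArray(parsed)) {
      quotes = parsed.filter((q): q is string => typeof q === 'string');
    }
  } catch {
    // Unparseable reply: treat as "no evidence" instead of accepting.
  }
  return checkQuotes(quotes, source);
}
```

The acceptance decision is made by a plain substring check rather than by the validator model's own verdict, so a fabricated quote fails regardless of how confident the model sounds. A unit test can stub `ask` with a function that returns a fixed JSON string and assert on `accepted`, which avoids any network call.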
Quiz · 2 questions
1. In Self-RAG the second LLM call is primarily used to
2. Why does requiring the validator to quote verbatim text reduce hallucinated acceptance?