Technical Deep Dive: Why Cook With Gasoline MCP?
Most LLM browser tools fail because they feed the model one of two things:
- Raw HTML: This is token-expensive and full of noise (<div class="wrapper-v2-flex...">). The LLM gets lost in the “soup” of utility classes and nesting.
- document.body.innerText: This flattens the page, losing all structure. A “Submit” button becomes just the word “Submit” floating in a void; the LLM has no idea it’s clickable or which form it belongs to.
How Gasoline MCP (CWG) is Different
CWG is an MCP server that acts as a “Vision Processing Unit” for the LLM.
1. The “Accessibility Tree” Strategy
Instead of scraping HTML, CWG serializes the browser's accessibility tree, the structure described by the Accessibility Object Model (AOM). This is the same tree screen readers use to navigate the web.
- Signal, Not Noise: We strip away thousands of <div> and <span> wrappers, exposing only semantic elements: buttons, inputs, headings, and landmarks.
- The Result: A 50,000-token HTML page becomes a clean, 2,000-token JSON structure that preserves hierarchy and interactivity.
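The pruning idea can be sketched in a few lines. This is a minimal illustration, not CWG's actual serializer: the `DomNode` and `SemanticNode` shapes, the `roleForTag` mapping, and the `toSemanticTree` function are all hypothetical names, and the input is a plain object standing in for real DOM elements.

```typescript
// Hypothetical, simplified stand-in for a DOM element (not CWG's real types).
interface DomNode {
  tag: string;
  text?: string;
  children?: DomNode[];
}

// Hypothetical output node: only semantic elements survive.
interface SemanticNode {
  role: string;
  name: string;
  children: SemanticNode[];
}

// Assumed, deliberately tiny tag-to-role mapping.
function roleForTag(tag: string): string | undefined {
  switch (tag) {
    case "button": return "button";
    case "input": return "textbox";
    case "h1": case "h2": case "h3": return "heading";
    case "nav": return "navigation";
    case "main": return "main";
    case "form": return "form";
    default: return undefined; // treated as a presentational wrapper
  }
}

// Returns the semantic nodes found at or below `node`.
// Wrappers contribute no node of their own; their semantic descendants are hoisted.
function toSemanticTree(node: DomNode): SemanticNode[] {
  const kids: SemanticNode[] = [];
  for (const child of node.children ?? []) {
    kids.push(...toSemanticTree(child));
  }
  const role = roleForTag(node.tag);
  if (role === undefined) return kids;
  return [{ role, name: node.text ?? "", children: kids }];
}

// Three wrappers around a form that contains one button.
const samplePage: DomNode = {
  tag: "div",
  children: [{
    tag: "div",
    children: [{
      tag: "form",
      children: [{ tag: "span", children: [{ tag: "button", text: "Submit" }] }],
    }],
  }],
};
const semanticTree = toSemanticTree(samplePage);
```

In this sketch the wrappers disappear, but the button stays nested under its form, so the hierarchy the LLM needs is preserved.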
2. Shadow DOM Piercing
Modern enterprise apps (Salesforce, Adobe, Google Cloud) use Web Components and Shadow DOM to encapsulate styles.
- The Problem: Standard scrapers (and innerText) hit a “shadow root” and stop. They literally cannot see inside your complex UI components.
- The CWG Fix: Our serializer recursively pierces open Shadow Roots, flattening the component tree into a single, logical view for the AI.
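A rough sketch of the recursive walk follows. It assumes a hypothetical `ShadowAwareNode` shape in place of real `Element` objects, and `flattenWithShadow` is an illustrative name, not CWG's API. In a real browser, `element.shadowRoot` is `null` for closed roots, so only open roots can be traversed this way; slot distribution is ignored here for brevity.

```typescript
// Hypothetical minimal stand-in for a DOM element; real code would use Element and ShadowRoot.
interface ShadowAwareNode {
  tag: string;
  children: ShadowAwareNode[];
  // Present only for open shadow roots; closed roots are not reachable from script.
  shadowRoot?: { mode: "open"; children: ShadowAwareNode[] };
}

// Depth-first walk that descends into open shadow roots before light-DOM children,
// collecting tag names into one flat, ordered list.
function flattenWithShadow(node: ShadowAwareNode, out: string[] = []): string[] {
  out.push(node.tag);
  if (node.shadowRoot) {
    for (const child of node.shadowRoot.children) {
      flattenWithShadow(child, out);
    }
  }
  for (const child of node.children) {
    flattenWithShadow(child, out);
  }
  return out;
}
```

A walk that only follows `children` would stop at the custom element itself; following `shadowRoot` as well is what exposes the inner inputs and buttons.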
3. “Stable ID” Interaction
When Claude or ChatGPT wants to click a button, it usually guesses a CSS selector (e.g., button[class*="blue"]). This is brittle; if you change a class name, the agent breaks.
- Our Approach: CWG injects ephemeral, stable IDs (e.g., [cwg-id="12"]) into the DOM map it sends to the LLM.
- The Loop:
  - LLM reads: Button “Save” [id="12"]
  - LLM commands: click("12")
  - CWG executes the click exactly on that element, regardless of CSS changes.
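The read/command/execute loop can be modeled as a small registry that maps IDs to element references. This is a sketch under assumptions: `Clickable` and `StableIdRegistry` are hypothetical names, IDs are handed out as a simple counter, and the real CWG ID scheme may differ.

```typescript
// Hypothetical handle for anything clickable; in a browser this would be an HTMLElement.
interface Clickable {
  label: string;
  click(): void;
}

class StableIdRegistry {
  private nextId = 1;
  // Prototype-free object so that IDs like "toString" never resolve to built-ins.
  private byId: Record<string, Clickable | undefined> = Object.create(null);

  // Assign an ID for the current snapshot and return the line shown to the LLM.
  register(el: Clickable): string {
    const id = String(this.nextId++);
    this.byId[id] = el;
    return `Button "${el.label}" [id="${id}"]`;
  }

  // Resolve the ID back to the stored element reference; class names play no part.
  click(id: string): boolean {
    const el = this.byId[id];
    if (!el) return false; // unknown or stale ID
    el.click();
    return true;
  }
}
```

Because the lookup goes through a stored reference rather than a selector, renaming a CSS class between the read and the click does not change which element is clicked.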
4. The “Console Vision” Pipeline
Frontend errors are often invisible in the UI. If a button click fails silently, the LLM hallucinates that it worked.
- CWG hooks into the browser’s Console and Network streams.
- If a 500 API Error occurs after a click, CWG feeds that error log back into the LLM’s context window immediately.
- Result: The LLM sees “Click failed: 500 Internal Server Error” and self-corrects (e.g., “I will try reloading the page”).
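The feedback step can be sketched as a pure function over captured events. The `CapturedEvent` shape, the `feedbackForAction` name, and the rule "any 5xx response or console error after the action counts as a failure" are assumptions for illustration; the actual CWG log format is not shown here.

```typescript
// Hypothetical event shapes for captured console and network activity.
type CapturedEvent =
  | { kind: "console"; level: "error" | "warn" | "log"; message: string; at: number }
  | { kind: "network"; status: number; statusText: string; url: string; at: number };

// Build the feedback line for an action from events captured at or after it started.
function feedbackForAction(
  action: string,
  startedAt: number,
  events: CapturedEvent[],
): string {
  for (const ev of events) {
    if (ev.at < startedAt) continue; // ignore events that predate the action
    if (ev.kind === "network" && ev.status >= 500) {
      return `${action} failed: ${ev.status} ${ev.statusText}`;
    }
    if (ev.kind === "console" && ev.level === "error") {
      return `${action} failed: ${ev.message}`;
    }
  }
  return `${action} succeeded with no captured errors`;
}
```

Filtering by timestamp keeps stale errors from earlier in the session from being blamed on the current click.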
Comparison Table
| Feature | Raw HTML Scraping | Vision (Screenshots) | Gasoline MCP |
|---|---|---|---|
| Token Cost | 🔴 Very High | 🟡 High | 🟢 Low (Optimized JSON) |
| Speed | 🟢 Fast | 🔴 Slow (Image Processing) | 🟢 Instant (Text) |
| Shadow DOM | 🔴 Invisible | 🟢 Visible | 🟢 Visible & Interactive |
| Dynamic Content | 🔴 Misses updates | 🟡 Can see updates | 🟢 Live MutationObserver |
| Click Reliability | 🟡 CSS Selectors (Brittle) | 🟡 Coordinate Guessing | 🟢 Stable ID System |