Technical Deep Dive: Why Cook With Gasoline MCP?

Most LLM browser tools fail because they feed the model one of two things:

Raw HTML: This is token-expensive and full of noise (<div class="wrapper-v2-flex...">). The LLM gets lost in the “soup” of utility classes and nesting.

document.body.innerText: This flattens the page, losing all structure. A “Submit” button becomes just the word “Submit” floating in a void, so the LLM cannot tell that it is clickable or which form it belongs to.


Cook With Gasoline (CWG) is an MCP server that acts as a “Vision Processing Unit” for the LLM.

Instead of scraping HTML, CWG serializes the Accessibility Object Model (AOM), the semantic tree that browsers expose to assistive technology. This is the same information screen readers use to navigate the web.

  • Signal, Not Noise: We strip away thousands of <div> and <span> wrappers, exposing only semantic elements: buttons, inputs, headings, and landmarks.
  • The Result: A 50,000-token HTML page becomes a clean, 2,000-token JSON structure that preserves hierarchy and interactivity.
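
The pruning idea can be sketched in a few lines. This is a minimal illustration over a toy in-memory tree, not CWG's actual serializer: the `Node` class, the `SEMANTIC_ROLES` set, and the `prune` function are all hypothetical names chosen for the example, and the post does not say which roles CWG keeps.

```python
from __future__ import annotations

import json
from dataclasses import dataclass, field

# Assumption: roles counted as "semantic" in this sketch only.
SEMANTIC_ROLES = {"button", "textbox", "heading", "link", "main", "navigation", "form"}


@dataclass
class Node:
    role: str                       # "generic" stands in for a plain <div>/<span>
    name: str = ""                  # accessible name, e.g. a button label
    children: list[Node] = field(default_factory=list)


def prune(node: Node) -> list[dict]:
    """Return the semantic descendants of `node`, hoisting them past wrappers."""
    kept_children = [item for child in node.children for item in prune(child)]
    if node.role in SEMANTIC_ROLES:
        entry: dict = {"role": node.role, "name": node.name}
        if kept_children:
            entry["children"] = kept_children
        return [entry]
    # Non-semantic wrapper: drop the node itself, keep what is inside it.
    return kept_children


# A small page: three levels of wrapper <div>s around a heading and a form.
page = Node("generic", children=[
    Node("generic", children=[
        Node("heading", "Sign in"),
        Node("form", "Login", children=[
            Node("generic", children=[Node("textbox", "Email")]),
            Node("generic", children=[Node("button", "Submit")]),
        ]),
    ]),
])

tree = prune(page)
print(json.dumps(tree, indent=2))
```

The wrappers disappear, but the “Submit” button is still nested under its form, which is the hierarchy that innerText throws away.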

Modern enterprise apps (Salesforce, Adobe, Google Cloud) use Web Components and Shadow DOM to encapsulate styles.

  • The Problem: Standard scrapers (and innerText) hit a “shadow root” and stop. They literally cannot see inside your complex UI components.
  • The CWG Fix: Our serializer recursively pierces open Shadow Roots, flattening the component tree into a single, logical view for the AI.
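
A rough sketch of the recursive piercing step, again over a toy model rather than a real DOM. `Element`, `ShadowRoot`, and `flatten` are hypothetical names for this example. The `mode` field mirrors the "open"/"closed" option of the DOM's `attachShadow`, and the sketch ignores slotting for brevity.

```python
from __future__ import annotations

from dataclasses import dataclass, field
from typing import Optional


@dataclass
class ShadowRoot:
    mode: str                                   # "open" or "closed"
    children: list[Element] = field(default_factory=list)


@dataclass
class Element:
    tag: str
    text: str = ""
    children: list[Element] = field(default_factory=list)
    shadow_root: Optional[ShadowRoot] = None


def flatten(el: Element, depth: int = 0) -> list[str]:
    """Produce one indented outline, descending into open shadow roots."""
    label = f"{el.tag}: {el.text}" if el.text else el.tag
    lines = ["  " * depth + label]
    # Open shadow roots are walked as if their content were ordinary children.
    # Closed roots are skipped: page scripts cannot reach them either.
    if el.shadow_root is not None and el.shadow_root.mode == "open":
        for child in el.shadow_root.children:
            lines += flatten(child, depth + 1)
    for child in el.children:
        lines += flatten(child, depth + 1)
    return lines


app = Element(
    "my-app",
    children=[Element("p", "Light DOM text")],
    shadow_root=ShadowRoot("open", [
        Element("save-button", shadow_root=ShadowRoot("open", [Element("button", "Save")])),
        Element("secret-widget", shadow_root=ShadowRoot("closed", [Element("button", "Hidden")])),
    ]),
)

outline = flatten(app)
print("\n".join(outline))
```

The nested “Save” button, two shadow roots deep, ends up in the same outline as the light-DOM paragraph, while the content of the closed root stays out of view.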

When Claude or ChatGPT wants to click a button, it usually guesses a CSS selector (e.g., button[class*="blue"]). This is brittle; if you change a class name, the agent breaks.

  • Our Approach: CWG injects ephemeral IDs (e.g., [cwg-id="12"]) into the DOM map it sends to the LLM. Each ID stays stable for the life of that map, independent of class names or markup.
  • The Loop:
    • LLM reads: Button “Save” [cwg-id="12"]
    • LLM commands: click("12")
    • CWG executes the click exactly on that element, regardless of CSS changes.
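
The loop above can be sketched as a small registry that hands out IDs when a snapshot is taken and resolves them when a command comes back. This is an illustration under assumptions, not CWG's implementation: `ElementRegistry`, `snapshot`, and `click` are hypothetical names, and the `on_click` callback stands in for dispatching a real browser click.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Callable


@dataclass
class Element:
    role: str
    name: str
    css_class: str
    on_click: Callable[[], str]     # stand-in for a real browser click


class ElementRegistry:
    """Maps short IDs to element handles for one snapshot."""

    def __init__(self) -> None:
        self._by_id: dict[str, Element] = {}

    def snapshot(self, elements: list[Element]) -> list[str]:
        # IDs are ephemeral: every snapshot starts from a clean table.
        self._by_id.clear()
        lines = []
        for index, el in enumerate(elements, start=1):
            cwg_id = str(index)
            self._by_id[cwg_id] = el
            lines.append(f'{el.role.capitalize()} "{el.name}" [cwg-id="{cwg_id}"]')
        return lines

    def click(self, cwg_id: str) -> str:
        el = self._by_id.get(cwg_id)
        if el is None:
            return f"Unknown id {cwg_id}; request a fresh snapshot"
        return el.on_click()


cancel = Element("button", "Cancel", "btn-grey", lambda: "cancelled")
save = Element("button", "Save", "btn-blue", lambda: "saved")

registry = ElementRegistry()
view = registry.snapshot([cancel, save])   # what the LLM reads
save.css_class = "btn-green"               # a restyle that would break button[class*="blue"]
result = registry.click("2")               # what the LLM commands

print(view)
print(result)
```

Because the lookup goes through the registry rather than a selector, the class rename between snapshot and click has no effect on which element receives the click.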

Frontend errors are often invisible in the UI. If a button click fails silently, the LLM assumes that it worked.

  • CWG hooks into the browser’s Console and Network streams.
  • If a 500 API Error occurs after a click, CWG feeds that error log back into the LLM’s context window immediately.
  • Result: The LLM sees “Click failed: 500 Internal Server Error” and self-corrects (e.g., “I will try reloading the page”).
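
One way to picture the feedback step: after each action, gather whatever the console and network produced and fold it into the result string returned to the model. The event classes and the `summarize_action` function below are hypothetical and only sketch the idea. How CWG actually captures and formats these events is not described in the post.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass
class NetworkEvent:
    url: str
    status: int
    status_text: str


@dataclass
class ConsoleEvent:
    level: str      # "log", "warn", or "error"
    message: str


def summarize_action(
    action: str,
    network: list[NetworkEvent],
    console: list[ConsoleEvent],
) -> str:
    """Turn events observed after an action into one line for the LLM."""
    failures = [f"{e.status} {e.status_text} ({e.url})" for e in network if e.status >= 400]
    errors = [f"console error: {e.message}" for e in console if e.level == "error"]
    if not failures and not errors:
        return f"{action} succeeded: no network or console errors observed"
    return f"{action} failed: " + "; ".join(failures + errors)


report = summarize_action(
    "Click",
    network=[NetworkEvent("/api/save", 500, "Internal Server Error")],
    console=[ConsoleEvent("log", "save clicked")],
)
print(report)
```

The model then receives an explicit failure message in place of silence and can decide to retry or reload.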
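
| Feature           | Raw HTML Scraping          | Vision (Screenshots)       | Gasoline MCP              |
|-------------------|----------------------------|----------------------------|---------------------------|
| Token Cost        | 🔴 Very High               | 🟡 High                    | 🟢 Low (Optimized JSON)   |
| Speed             | 🟢 Fast                    | 🔴 Slow (Image Processing) | 🟢 Instant (Text)         |
| Shadow DOM        | 🔴 Invisible               | 🟢 Visible                 | 🟢 Visible & Interactive  |
| Dynamic Content   | 🔴 Misses updates          | 🟡 Can see updates         | 🟢 Live MutationObserver  |
| Click Reliability | 🟡 CSS Selectors (Brittle) | 🟡 Coordinate Guessing     | 🟢 Stable ID System       |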