ai-development

10 posts with the tag “ai-development”

AI-Powered QA: How to Test Your Web App Without Writing Test Code

What if you could test your web application by describing what should happen — in plain English — and have an AI actually run the tests?

No Playwright scripts. No Selenium WebDriver setup. No npm install or pip install. No learning CSS selectors, XPath, or assertion libraries. Just tell the AI what to test, and it tests it.

This isn’t a future vision. It works today with Gasoline MCP.

Writing automated tests is expensive:

  • Setup cost: Install Node.js, install Playwright, configure the test runner, set up CI/CD
  • Writing cost: Learn the API, figure out selectors, handle async operations, manage test data
  • Maintenance cost: Every UI change breaks selectors. Every flow change breaks sequences. Tests that took 2 hours to write take 4 hours to maintain.

The result? Most teams have either:

  1. No automated tests — manual QA only
  2. Fragile tests — break on every deploy, ignored by the team
  3. Expensive tests — dedicated QA engineers maintaining a test suite that’s always behind

With Gasoline, testing looks like this:

"Go to the login page. Enter 'test@example.com' as the email and 'password123'
as the password. Click Sign In. Verify that you land on the dashboard and there
are no console errors."

The AI:

  1. Navigates to the login page
  2. Finds the email field (using semantic selectors — label=Email, not #email-input-field-v2)
  3. Types the email
  4. Finds the password field
  5. Types the password
  6. Clicks the Sign In button (by text, not by CSS selector)
  7. Waits for navigation
  8. Checks the URL contains /dashboard
  9. Checks for console errors

If anything fails, the AI reports exactly what happened: “The Sign In button was found and clicked, but the page navigated to /error instead of /dashboard. The API returned a 401 with {"error": "invalid credentials"}.”

Selenium/Playwright test:

await page.goto('https://myapp.com/login');
await page.locator('#email-input').fill('test@example.com');
await page.locator('#password-input').fill('password123');
await page.locator('button[type="submit"]').click();
await expect(page).toHaveURL(/.*dashboard/);

Gasoline natural language:

Log in with test@example.com / password123.
Verify you reach the dashboard.

The Selenium test breaks when:

  • The email field ID changes from #email-input to #email-field
  • The submit button gets a new class or is replaced with a different component
  • The form structure changes (inputs wrapped in a new div)

The natural language test survives all of these because the AI uses meaning-based selectors: “the email field” → label=Email, “the sign in button” → text=Sign In.
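As a toy illustration (not Gasoline's actual implementation), meaning-based selection can be thought of as mapping a plain-English description to a semantic locator strategy before ever touching CSS. Everything here, the function name, the regexes, the output shape, is a sketch:

```javascript
// Toy sketch of meaning-based selector resolution (illustrative only).
// Maps a natural-language description to a Playwright-style locator,
// preferring semantic strategies (label, text) over brittle CSS ids.
function semanticLocator(description) {
  const d = description.toLowerCase();
  // "the email field" -> match the input by its visible label
  let m = d.match(/the (.+) field/);
  if (m) return { strategy: 'label', value: titleCase(m[1]) };
  // "the sign in button" -> match the button by its visible text
  m = d.match(/the (.+) button/);
  if (m) return { strategy: 'text', value: titleCase(m[1]) };
  // fall back to matching any element by its visible text
  return { strategy: 'text', value: description };
}

function titleCase(s) {
  return s.replace(/\b\w/g, c => c.toUpperCase());
}

console.log(semanticLocator('the email field'));    // { strategy: 'label', value: 'Email' }
console.log(semanticLocator('the sign in button')); // { strategy: 'text', value: 'Sign In' }
```

Because the lookup is by label and text rather than by id or class, a renamed `#email-input` or a restyled button does not change what the description resolves to.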

"Sign up with a new account, verify the welcome email prompt appears,
dismiss it, navigate to settings, change the display name, and verify
the change is reflected in the header."

"Submit the contact form with an empty email. Verify an error message
appears. Then enter a valid email and submit. Verify it succeeds."

"Navigate to a product page that doesn't exist (/products/99999).
Verify a 404 page is shown and there are no console errors."

"Navigate to the homepage. Check that LCP is under 2.5 seconds and
there are no layout shifts above 0.1."

"Run an accessibility audit on the checkout page. Report any critical
or serious violations."

"Submit an order. Verify the API returns a 201 status and the response
includes an order ID."

Natural language tests are great for exploratory testing and quick validation. But for CI/CD, you need repeatable tests.

After running a natural language test session:

generate({format: "test", test_name: "guest-checkout",
  assert_network: true, assert_no_errors: true})

Gasoline generates a complete Playwright test from the session — every action translated to Playwright commands with proper selectors, network assertions, and error checking. The AI ran the test in natural language; Gasoline converts it to code for CI.

This is the best of both worlds:

  1. Write tests in English — fast, no setup
  2. Export to Playwright — repeatable, CI-ready
  3. Re-run in English — if the generated test breaks, describe the flow again and regenerate

You know the user flows better than anyone. You shouldn’t need to write JavaScript to verify them. Describe the flow, the AI tests it, and you see the results.

You don’t have dedicated QA engineers, and your developers are building features, not writing tests. Natural language testing gives you test coverage without the headcount.

You already know how to test. Natural language testing lets you work faster — describe 10 test cases in the time it takes to code 1. Generate Playwright tests from the ones that should be permanent.

You just shipped a feature and want to verify the happy path before the PR review. A 30-second natural language test is faster than writing a proper test and faster than manual testing.

Resilience: Why AI Tests Survive UI Changes

Traditional tests are tightly coupled to the UI implementation:

// Breaks when the button text changes from "Submit" to "Place Order"
await page.locator('button:has-text("Submit")').click();
// Breaks when the ID changes
await page.locator('#checkout-submit-btn').click();
// Breaks when the class changes
await page.locator('.btn-primary.submit').click();

The AI uses semantic selectors that adapt:

  • text=Submit → If the button now says “Place Order”, the AI reads the page and finds the new text
  • label=Email → Works regardless of whether it’s an <input>, a Material UI <TextField>, or a custom component
  • role=button → Works regardless of styling or class names

And if a selector doesn’t match, the AI doesn’t just fail — it calls interact({action: "list_interactive"}) to discover what’s actually on the page and adapts.

For tests you run regularly:

"Save this test flow as 'checkout-happy-path'."

configure({action: "store", store_action: "save",
  namespace: "tests", key: "checkout-happy-path",
  data: {steps: ["navigate to /checkout", "fill in shipping...", ...]}})

"Load and run the 'checkout-happy-path' test."

configure({action: "store", store_action: "load",
  namespace: "tests", key: "checkout-happy-path"})

Save browser state at key points:

interact({action: "save_state", snapshot_name: "logged-in"})

Later, restore that state instead of repeating the login flow:

interact({action: "load_state", snapshot_name: "logged-in", include_url: true})

To try it yourself:

  1. Install Gasoline (Quick Start)
  2. Open your web app
  3. Tell your AI: “Test the login flow — go to the login page, enter test credentials, sign in, and verify you reach the dashboard.”

No setup. No dependencies. No test code. Just describe what should happen.

Best MCP Servers for Web Development in 2026

MCP (Model Context Protocol) lets AI coding assistants plug into external tools — browsers, databases, APIs, and more. The right combination of MCP servers turns your AI assistant from a code-only tool into a full-stack development partner.

Here are the most useful MCP servers for web developers, what they do, and how they work together.

A good MCP server:

  1. Gives the AI information it can’t get otherwise — runtime data, live state, external services
  2. Reduces copy-paste — the AI reads data directly instead of you pasting it in
  3. Enables actions — the AI can do things, not just observe
  4. Works locally — your data stays on your machine

With that in mind, here are the servers worth setting up.

What it does: Streams real-time browser telemetry to your AI — console logs, network errors, WebSocket events, Web Vitals, accessibility audits, user actions — and gives the AI browser control.

Why it matters: Without browser observability, your AI can read code but can’t see what happens when it runs. Every debugging session requires you to manually describe the problem. With Gasoline, the AI observes the bug directly.

Key capabilities:

  • 4 tools: observe (23 modes), generate (7 formats), configure (12 actions), interact (24 actions)
  • Real-time: Console errors, network failures, WebSocket traffic as they happen
  • Browser control: Navigate, click, type, run JavaScript, take screenshots
  • Artifact generation: Playwright tests, reproduction scripts, HAR exports, CSP headers, SARIF reports
  • Security auditing: Credential detection, PII scanning, third-party script analysis
  • Performance: Web Vitals with before/after comparison on every navigation

Setup: Chrome extension + npx gasoline-mcp

Zero dependencies: Single Go binary, no Node.js runtime. Localhost only.

Get started with Gasoline →

Most AI coding tools (Claude Code, Cursor, Windsurf) have built-in filesystem access. If yours doesn’t, the reference filesystem MCP server handles it:

What it does: Read, write, search, and navigate files.

Why it matters: The foundation. Everything else builds on the AI being able to read and edit your code.

Key capabilities: Read files, write files, search by name or content, directory listing.

What it does: Lets the AI query your database directly — read schemas, run SELECT queries, inspect data.

Why it matters: When debugging a “wrong data” bug, the AI can check the database instead of you running psql and pasting results. It can also verify that migrations ran correctly.

Key capabilities: Schema inspection, read queries, data exploration. Most implementations are read-only by default (safe for production databases).

Use case: “Why is the user’s email wrong on the profile page?” → AI checks the database, finds the email was never updated after the migration, identifies the migration bug.

What it does: Create PRs, read issues, check CI status, review code, manage releases.

Why it matters: The AI can close the loop — fix a bug, create a PR, link it to the issue, and check if CI passes. Without GitHub access, you’re the intermediary for every PR and issue interaction.

Key capabilities: Create/update PRs, read/comment on issues, check workflow runs, view PR reviews.

Use case: “Fix this bug and open a PR” → AI fixes the code, commits, pushes, creates the PR with a summary, and links it to the issue.

What it does: Searches the web and fetches page content.

Why it matters: When your AI encounters an unfamiliar error or needs documentation for a third-party library, it can search instead of guessing. This is especially useful for new APIs, recent library versions, and obscure error messages.

Key capabilities: Web search, URL fetching, content extraction.

Use case: “I’m getting a ERR_OSSL_EVP_UNSUPPORTED error” → AI searches, finds it’s a Node.js 17+ OpenSSL 3.0 issue, applies the fix.

What it does: List containers, read logs, start/stop services, check health.

Why it matters: If your backend runs in Docker, the AI can check container logs when the API returns 500s. No more “can you check the Docker logs?” copy-paste cycles.

Key capabilities: Container listing, log reading, service management, health checks.

Use case: “The API is returning 500s” → AI checks Gasoline for the error response, then checks Docker logs for the backend container, finds the database container is down, restarts it.

What it does: Check build status, read test results, manage tickets.

Why it matters: The AI can check if CI is green after pushing a fix, read test failure logs, and update tickets with results — closing the loop without tab-switching.

The real power is composition. Here’s a debugging workflow using multiple MCP servers:

  1. Gasoline: observe({what: "error_bundles"}) — sees a TypeError correlated with a 500 from /api/orders
  2. Gasoline: observe({what: "network_bodies", url: "/api/orders"}) — the 500 response says "column 'discount_code' does not exist"
  3. Filesystem: Reads the migration files — finds the discount_code column was added in a migration that hasn’t run
  4. Docker: Checks the database container logs — confirms the migration wasn’t applied
  5. Filesystem: Reads the deployment script — finds migrations don’t auto-run
  6. Filesystem: Fixes the deployment script to run migrations
  7. Gasoline: interact({action: "refresh"}) — refreshes the page, verifies the error is gone
  8. GitHub: Creates a PR with the fix

Six MCP servers. One conversation. No copy-paste. No tab-switching. The AI moved from symptom to root cause to fix to PR in a single flow.

For a typical web development workflow:

Priority     Server                         Why
Essential    Filesystem (usually built-in)  Read and edit code
Essential    Gasoline (browser)             See runtime errors, debug, test
High value   GitHub                         PRs, issues, CI status
High value   Database                       Data inspection, schema verification
Useful       Search                         Documentation, error lookup
Useful       Docker                         Container log access

Start with Gasoline and your built-in filesystem access. Add GitHub and database when you find yourself copy-pasting between those tools and your AI. Add the rest as needed.

Most AI tools support multiple MCP servers in their config. Example for Claude Code (.mcp.json):

{
  "mcpServers": {
    "gasoline": {
      "command": "npx",
      "args": ["-y", "gasoline-mcp"]
    }
  }
}

Each server gets its own entry. The AI discovers all available tools on startup and uses them as needed.

MCP adoption is accelerating. Every major AI coding tool now supports MCP, and new servers appear weekly. The pattern is clear: AI assistants are becoming environment-aware, connecting to every data source and tool a developer uses.

The developers who set up the right MCP servers today work significantly faster — not because the AI is smarter, but because the AI can see more of the picture.

Gasoline MCP vs Playwright: When to Use Which

Gasoline and Playwright aren’t competitors — they’re complementary. Playwright is a browser automation library for writing repeatable test scripts. Gasoline is an AI-powered browser observation and control layer. Gasoline can even generate Playwright tests.

But they serve different purposes, and knowing when to use each saves significant time.

                 Gasoline MCP                               Playwright
Interface        Natural language via AI                    JavaScript/TypeScript/Python API
Who uses it      Developers, PMs, QA — anyone               Developers and QA engineers
Setup            Install extension + npx gasoline-mcp       npm init playwright@latest
Selectors        Semantic (text=Submit, label=Email)        CSS, XPath, role, text, test-id
Test creation    Describe in English                        Write code
Execution        AI runs it interactively                   CLI or CI/CD pipeline
Debugging        Real-time browser observation              Trace viewer, screenshots
Maintenance      AI adapts to UI changes                    Manual selector updates
CI/CD            Generate Playwright tests → run in CI      Native CI/CD support
Observability    Console, network, WebSocket, vitals, a11y  Limited (what you assert)
Performance      Built-in Web Vitals + perf_diff            Manual performance assertions
Cost             Free, open source                          Free, open source

You’re checking if a feature works. You don’t want to write a script — you want to try it.

Playwright: Write a script, run it, read the output, modify, repeat.

Gasoline: “Go to the checkout page, add two items, and complete the purchase. Tell me if anything breaks.”

For one-off verification, natural language is 10x faster.

Your test failed. Now what?

Playwright: Open the trace viewer. Scrub through screenshots. Check the assertion error message. Maybe add console.log statements to the test and re-run.

Gasoline: The AI already sees everything — console errors, network responses, WebSocket state, performance metrics. It can diagnose while testing.

observe({what: "error_bundles"})

One call returns the error with its correlated network requests and user actions. No trace viewer needed.

A designer renamed “Submit” to “Place Order” and restructured the form.

Playwright: Tests fail. You update selectors manually across 15 test files. You hope you caught them all.

Gasoline: The AI reads the page, finds the new button text, and continues. No manual updates.

A product manager wants to verify the user flow before release.

Playwright: Not an option without JavaScript knowledge.

Gasoline: “Walk through the signup flow and make sure it works.” The PM can do this themselves.

Playwright tests only check what you explicitly assert. If you don’t assert “no console errors,” you’ll never know about them.

Gasoline observes everything passively:

  • Console errors the test didn’t check for
  • Slow API responses the test didn’t measure
  • Layout shifts the test didn’t detect
  • Third-party script failures the test couldn’t see

Playwright: You can measure timing with custom code, but there’s no built-in Web Vitals collection or before/after comparison.

Gasoline: Web Vitals are captured automatically. Navigate or refresh, and you get a perf_diff with deltas, ratings, and a verdict. No custom code.
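The before/after comparison can be sketched as pure logic. The metric names, the output shape, and the verdict rule below are assumptions for illustration, not Gasoline's actual perf_diff format; the thresholds are the standard "good" limits the article cites (LCP 2.5 s, CLS 0.1):

```javascript
// Sketch of a before/after Web Vitals diff (illustrative shape, not
// Gasoline's real output). Thresholds are the web.dev "good" limits.
const THRESHOLDS = { lcp: 2500, cls: 0.1 };

function perfDiff(before, after) {
  const diff = {};
  for (const metric of Object.keys(THRESHOLDS)) {
    const delta = after[metric] - before[metric];
    diff[metric] = {
      before: before[metric],
      after: after[metric],
      delta,
      rating: after[metric] <= THRESHOLDS[metric] ? 'good' : 'needs-improvement',
    };
  }
  // Simple verdict rule: regression if any metric got worse AND left "good" territory
  diff.verdict = Object.values(diff).some(
    d => d.delta > 0 && d.rating !== 'good'
  ) ? 'regression' : 'ok';
  return diff;
}

const result = perfDiff({ lcp: 2100, cls: 0.02 }, { lcp: 3200, cls: 0.02 });
console.log(result.lcp.delta, result.verdict); // 1100 'regression'
```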

Playwright tests run headlessly in GitHub Actions, GitLab CI, or any CI system. They’re deterministic, repeatable, and fast.

Gasoline generates Playwright tests, but the actual CI execution is Playwright’s domain. Gasoline runs interactively with an AI assistant — it’s not designed to be a CI test runner.

Playwright can shard tests across multiple workers and run them in parallel. For a suite of 500 tests, this means finishing in minutes instead of hours.

Gasoline is single-session — one AI, one browser, one tab at a time.

Playwright supports Chromium, Firefox, and WebKit out of the box.

Gasoline’s extension currently runs in Chrome/Chromium only.

When you need a test that passes or fails the exact same way every time, Playwright’s explicit assertions are the right tool:

await expect(page.getByRole('heading')).toHaveText('Welcome back');
await expect(response.status()).toBe(200);

AI-driven testing is intelligent but non-deterministic — the AI might take different paths or interpret “verify it works” differently across runs.

Playwright can intercept and mock network requests, letting you test error states, slow responses, and edge cases without a real backend.

Gasoline observes real traffic — it doesn’t mock it.

The Best of Both: Generate Playwright from Gasoline

The power move: use Gasoline for exploration and Playwright for CI.

"Walk through the checkout flow — add an item, go to cart, enter
shipping info, and complete the purchase."

The AI runs the flow interactively, handling UI variations and reporting issues in real time.

"Generate a Playwright test from this session."

generate({format: "test", test_name: "checkout-flow",
  base_url: "http://localhost:3000",
  assert_network: true,
  assert_no_errors: true,
  assert_response_shape: true})

Gasoline produces a complete Playwright test:

import { test, expect } from '@playwright/test';

test('checkout-flow', async ({ page }) => {
  const consoleErrors = [];
  page.on('console', msg => {
    if (msg.type() === 'error') consoleErrors.push(msg.text());
  });

  await page.goto('http://localhost:3000/products');
  await page.getByRole('button', { name: 'Add to Cart' }).click();
  await page.getByRole('link', { name: 'Cart' }).click();
  await page.getByLabel('Address').fill('123 Main St');
  // ...
  expect(consoleErrors).toHaveLength(0);
});

The generated test runs in your CI pipeline like any other Playwright test. Deterministic, repeatable, fast.

The UI changed and the Playwright test fails. Instead of manually updating selectors:

"The checkout test is failing because the form changed.
Walk through the checkout flow again and generate a new test."

The AI adapts to the new UI, generates a fresh Playwright test, and you’re back in CI.

Scenario                       Use
Quick feature verification     Gasoline
CI/CD regression suite         Playwright (generated by Gasoline)
Debugging a test failure       Gasoline (better observability)
Non-developer testing          Gasoline
Cross-browser testing          Playwright
Performance monitoring         Gasoline (built-in vitals)
Network mocking                Playwright
Accessibility auditing         Gasoline (built-in axe-core)
Exploratory testing            Gasoline
500+ test parallel execution   Playwright
Test maintenance               Gasoline (regenerate broken tests)

The combined workflow:

  1. Develop — use Gasoline for real-time debugging and quick validation
  2. Generate — convert validated flows to Playwright tests
  3. CI — run Playwright tests on every push
  4. Maintain — when tests break, re-explore with Gasoline and regenerate

Gasoline doesn’t replace Playwright. It makes Playwright tests easier to create, easier to maintain, and easier to debug when they fail.

How Gasoline MCP Improves Your Application Security

Most developers discover security issues in production. A penetration test finds exposed credentials in an API response. A security review flags missing headers. A breach notification reveals that a third-party script was exfiltrating form data.

Gasoline MCP flips the timeline. Your AI assistant audits security while you develop, catching issues before they ship.

In the typical development cycle, security checks happen late:

  1. Development — features built, tested, deployed
  2. Security review — weeks later, if at all
  3. Penetration test — quarterly, expensive, findings arrive after context is lost
  4. Incident — the worst time to learn about a vulnerability

Every step between writing the code and finding the issue adds cost. A missing HttpOnly flag caught during development takes 30 seconds to fix. The same flag caught in a pen test takes a meeting, a ticket, a sprint, and a deploy.

Real-Time Security Auditing During Development

Gasoline gives your AI assistant six categories of security checks that run against live browser traffic:

Your AI can scan every network request and response for exposed secrets:

observe({what: "security_audit", checks: ["credentials"]})

This catches:

  • AWS Access Keys (AKIA...) in API responses
  • GitHub PATs (ghp_..., ghs_...) in console logs
  • Stripe keys (sk_test_..., sk_live_...) in client-side code
  • JWTs in URL parameters (a common mistake)
  • Bearer tokens in responses that shouldn’t contain them
  • Private keys accidentally bundled in source maps

Every detection runs regex plus validation (Luhn algorithm for credit cards, structure checks for JWTs) to minimize false positives.
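The Luhn step mentioned above is simple to sketch. This is the standard algorithm, not Gasoline's exact code, and the length bounds are a common assumption for card numbers:

```javascript
// Luhn check: confirms a digit string is a plausible card number before
// it gets flagged, cutting false positives from random digit runs.
function luhnValid(digits) {
  const s = digits.replace(/\D/g, ''); // strip spaces and dashes
  if (s.length < 13 || s.length > 19) return false;
  let sum = 0;
  let double = false;
  // Walk right-to-left, doubling every second digit
  for (let i = s.length - 1; i >= 0; i--) {
    let d = s.charCodeAt(i) - 48;
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

console.log(luhnValid('4242 4242 4242 4242')); // true  (well-known test number)
console.log(luhnValid('4242 4242 4242 4241')); // false (last digit corrupted)
```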

observe({what: "security_audit", checks: ["pii"]})

Finds personal data flowing through your application:

  • Social Security Numbers
  • Credit card numbers (with Luhn validation — not just pattern matching)
  • Email addresses in unexpected API responses
  • Phone numbers in contexts where they shouldn’t appear

This matters for GDPR, CCPA, and HIPAA compliance. If your user list API is returning full SSNs when the frontend only needs names, your AI catches it during development.

observe({what: "security_audit", checks: ["headers"]})

Validates that your responses include critical security headers:

Header                      What It Prevents
Strict-Transport-Security   Downgrade attacks, cookie hijacking
X-Content-Type-Options      MIME sniffing attacks
X-Frame-Options             Clickjacking
Content-Security-Policy     XSS, injection attacks
Referrer-Policy             Referrer leakage to third parties
Permissions-Policy          Unauthorized browser feature access

Missing any of these? Your AI knows immediately — and can fix it.
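The core of such a check is a set difference. A minimal sketch, with the header list taken from the table above and the function name and output shape invented for illustration:

```javascript
// Sketch of a security-header presence check. Returns the required
// headers that are absent from a response.
const REQUIRED_HEADERS = [
  'strict-transport-security',
  'x-content-type-options',
  'x-frame-options',
  'content-security-policy',
  'referrer-policy',
  'permissions-policy',
];

function auditHeaders(responseHeaders) {
  // HTTP header names are case-insensitive, so normalize before comparing
  const present = new Set(
    Object.keys(responseHeaders).map(h => h.toLowerCase())
  );
  return REQUIRED_HEADERS.filter(h => !present.has(h));
}

const missing = auditHeaders({
  'Content-Type': 'text/html',
  'Strict-Transport-Security': 'max-age=63072000',
});
console.log(missing); // every required header except HSTS
```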

observe({what: "security_audit", checks: ["cookies"]})

Session cookies without HttpOnly are accessible to XSS attacks. Cookies without Secure can be intercepted over HTTP. Missing SameSite enables CSRF. Gasoline checks every cookie against every flag and rates severity based on whether it’s a session cookie.
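A sketch of what a per-cookie flag audit looks like. The severity heuristic here (session-looking names rate higher) mirrors the idea described above but is an assumption, as are the function name and output shape:

```javascript
// Sketch of a Set-Cookie flag audit. Parses one Set-Cookie header and
// reports missing flags, rating session cookies more severely.
function auditCookie(setCookieHeader) {
  const [pair, ...attrs] = setCookieHeader.split(';').map(p => p.trim());
  const name = pair.split('=')[0];
  const flags = new Set(attrs.map(a => a.split('=')[0].toLowerCase()));
  const findings = [];
  // Heuristic: names like "sessionid" or "authtoken" guard a session
  const isSession = /sess|sid|auth|token/i.test(name);
  if (!flags.has('httponly'))
    findings.push({ issue: 'missing HttpOnly', severity: isSession ? 'high' : 'medium' });
  if (!flags.has('secure'))
    findings.push({ issue: 'missing Secure', severity: isSession ? 'high' : 'medium' });
  if (!flags.has('samesite'))
    findings.push({ issue: 'missing SameSite', severity: 'medium' });
  return { name, findings };
}

const audit = auditCookie('sessionid=abc123; Path=/; Secure');
console.log(audit.findings.map(f => f.issue));
// [ 'missing HttpOnly', 'missing SameSite' ]
```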

observe({what: "security_audit", checks: ["transport"]})

Detects:

  • HTTP usage on non-localhost origins (unencrypted traffic)
  • Mixed content (HTTPS page loading HTTP resources)
  • HTTPS downgrade patterns

observe({what: "security_audit", checks: ["auth"]})

Identifies API endpoints that return PII without requiring authentication. If /api/users/123 returns a full user profile without an Authorization header, that’s a finding.

Third-party scripts are one of the largest attack surfaces in modern web applications. Every <script src="..."> from an external CDN is a trust decision.

observe({what: "third_party_audit"})

Gasoline classifies every third-party origin by risk:

  • Critical risk — scripts from suspicious domains, data exfiltration patterns
  • High risk — scripts from unknown origins, data sent to third parties with POST requests
  • Medium risk — non-essential third-party resources, suspicious TLDs (.xyz, .top, .click)
  • Low risk — fonts and images from known CDNs

It detects domain generation algorithm (DGA) patterns — high-entropy hostnames that indicate malware communication. It flags when your application sends PII-containing form data to third-party origins.
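The entropy part of DGA detection is easy to sketch. The real detector likely combines more signals; the 3.5 bits/char cutoff and the length gate below are illustrative assumptions:

```javascript
// Shannon-entropy heuristic for DGA-like hostnames (sketch only).
function entropy(str) {
  const counts = {};
  for (const ch of str) counts[ch] = (counts[ch] || 0) + 1;
  return Object.values(counts).reduce((h, n) => {
    const p = n / str.length;
    return h - p * Math.log2(p); // sum of -p*log2(p) over characters
  }, 0);
}

function looksGenerated(hostname) {
  const label = hostname.split('.')[0]; // leftmost DNS label
  // Long, high-entropy labels are characteristic of algorithmically
  // generated domains; real words repeat letters and score lower.
  return label.length >= 12 && entropy(label) > 3.5;
}

console.log(looksGenerated('cdn.jquery.com'));       // false
console.log(looksGenerated('x7k2qv9mzt4w1b.click')); // true (14 distinct chars)
```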

And it’s configurable. Specify your first-party origins and custom allow/block lists:

observe({what: "third_party_audit",
  first_party_origins: ["https://api.myapp.com"],
  custom_lists: {
    allowed: ["https://cdn.mycompany.com"],
    blocked: ["https://suspicious-tracker.xyz"]
  }})

Security isn’t just about finding issues — it’s about making sure fixes stay fixed.

// Before your deploy
configure({action: "diff_sessions", session_action: "capture", name: "before-deploy"})

// After
configure({action: "diff_sessions", session_action: "capture", name: "after-deploy"})

// Compare
configure({action: "diff_sessions",
  session_action: "compare",
  compare_a: "before-deploy",
  compare_b: "after-deploy"})

The security_diff mode specifically tracks:

  • Headers removed — did someone drop the CSP header?
  • Cookie flags removed — did HttpOnly get lost in a refactor?
  • Authentication removed — did an endpoint become public?
  • Transport downgrades — did something switch from HTTPS to HTTP?

Each change is severity-rated. A removed CSP header is high severity. A transport downgrade is critical.
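The header portion of such a diff boils down to a set difference plus a severity lookup. A sketch, with the severity table reduced to the two examples the article gives and everything else defaulting to medium (an assumption):

```javascript
// Sketch: which security headers disappeared between two captures?
// Severity ratings follow the article: removed CSP = high, losing
// HSTS (a transport downgrade vector) = critical.
const SEVERITY = {
  'content-security-policy': 'high',
  'strict-transport-security': 'critical',
};

function removedHeaders(before, after) {
  const afterSet = new Set(after.map(h => h.toLowerCase()));
  return before
    .map(h => h.toLowerCase())
    .filter(h => !afterSet.has(h))
    .map(h => ({ header: h, severity: SEVERITY[h] || 'medium' }));
}

const removed = removedHeaders(
  ['Content-Security-Policy', 'X-Frame-Options', 'Strict-Transport-Security'],
  ['X-Frame-Options']
);
console.log(removed);
// [ { header: 'content-security-policy', severity: 'high' },
//   { header: 'strict-transport-security', severity: 'critical' } ]
```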

Gasoline doesn’t just find problems — it generates the artifacts you need to fix and prevent them.

generate({format: "csp", mode: "strict"})

Gasoline observes which origins your page actually loads resources from during development and generates a CSP that allows exactly those origins — nothing more. It uses a confidence scoring system (3+ observations from 2+ pages = high confidence) to filter out extension noise and ad injection.
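The confidence rule above (3+ observations from 2+ pages) can be sketched as a filter over observed origins. The observation record shape and the single-directive output are assumptions for illustration:

```javascript
// Sketch of confidence-scored CSP generation: only origins seen at
// least 3 times across at least 2 pages make it into the policy,
// which filters out one-off extension noise and ad injection.
function buildCsp(observations) {
  // observations: [{ origin, page }, ...] collected during browsing
  const stats = new Map();
  for (const { origin, page } of observations) {
    const s = stats.get(origin) || { count: 0, pages: new Set() };
    s.count++;
    s.pages.add(page);
    stats.set(origin, s);
  }
  const trusted = [...stats]
    .filter(([, s]) => s.count >= 3 && s.pages.size >= 2)
    .map(([origin]) => origin);
  return `script-src 'self' ${trusted.join(' ')};`;
}

const csp = buildCsp([
  { origin: 'https://cdn.example.com', page: '/home' },
  { origin: 'https://cdn.example.com', page: '/home' },
  { origin: 'https://cdn.example.com', page: '/about' },
  // seen once on one page: likely extension noise, filtered out
  { origin: 'https://ads.injected.xyz', page: '/home' },
]);
console.log(csp); // script-src 'self' https://cdn.example.com;
```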

generate({format: "sri"})

Every third-party script and stylesheet gets a SHA-384 hash. If a CDN is compromised and serves modified JavaScript, the browser refuses to execute it.

The output includes ready-to-paste HTML tags:

<script src="https://cdn.example.com/lib.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8w"
crossorigin="anonymous"></script>

Even before auditing, Gasoline protects against accidental data exposure. The redaction engine automatically scrubs sensitive data from all MCP tool responses before they reach the AI:

  • AWS keys become [REDACTED:aws-key]
  • Bearer tokens become [REDACTED:bearer-token]
  • Credit card numbers become [REDACTED:credit-card]
  • SSNs become [REDACTED:ssn]

This is a double safety net. The extension strips auth headers before data reaches the server. The server’s redaction engine catches anything else before it reaches the AI. Two layers, zero configuration.
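A minimal sketch of the redaction pass described above. The patterns are deliberately simplified and the structure is illustrative; the real engine validates matches (Luhn, JWT structure) before redacting:

```javascript
// Minimal regex redaction pass (simplified patterns, sketch only).
const PATTERNS = [
  { label: 'aws-key', re: /AKIA[0-9A-Z]{16}/g },
  { label: 'bearer-token', re: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g },
  { label: 'ssn', re: /\b\d{3}-\d{2}-\d{4}\b/g },
];

function redact(text) {
  // Apply each pattern in turn, replacing matches with a typed marker
  return PATTERNS.reduce(
    (out, { label, re }) => out.replace(re, `[REDACTED:${label}]`),
    text
  );
}

console.log(redact('Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.x.y'));
// Authorization: [REDACTED:bearer-token]
console.log(redact('key=AKIAIOSFODNN7EXAMPLE ssn=123-45-6789'));
// key=[REDACTED:aws-key] ssn=[REDACTED:ssn]
```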

Here’s the workflow that makes Gasoline transformative for security:

  1. Develop normally — write code, test features
  2. AI audits continuously — security checks run against live traffic
  3. Issues found immediately — in the same terminal where you’re coding
  4. Fix in context — the AI has the code open and the finding in hand
  5. Verify the fix — re-run the audit, confirm the finding is gone
  6. Prevent regression — capture a security snapshot, compare after future changes

The entire cycle takes minutes, not months. No separate tool. No context switch. No ticket in a backlog that nobody reads.

For developers: Security becomes part of your flow, not an interruption to it. The AI catches what you’d need a security expert to find — and you fix it while the code is still fresh in your mind.

For security teams: Shift-left isn’t a buzzword anymore. Developers arrive at security review with most issues already caught and fixed. Reviews focus on architecture and design, not missing headers.

For compliance: Every audit finding is captured with timestamp, severity, and evidence. SARIF export integrates directly with GitHub Code Scanning. The audit log records every security check the AI performed.

For enterprises: Zero data egress. All security scanning happens on the developer’s machine. No credentials sent to cloud services. No browser traffic leaving the network. Localhost only, zero dependencies, open source.

Install Gasoline, open your application, and ask your AI:

“Run a full security audit of this page and tell me what you find.”

You might be surprised what’s been hiding in plain sight.

How to Debug CORS Errors with AI Using Gasoline MCP

CORS errors are the most misleading errors in web development. The browser tells you “access has been blocked” — but the actual problem could be a missing header, a wrong origin, a preflight failure, a credentials mismatch, or a server that’s simply crashing and returning a 500 without CORS headers.

Here’s how to use Gasoline MCP to let your AI assistant see the full picture and fix CORS issues in minutes instead of hours.

The browser console shows you something like:

Access to fetch at 'https://api.example.com/users' from origin 'http://localhost:3000'
has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present
on the requested resource.

This tells you what happened but not why. Common causes:

  1. The server doesn’t send CORS headers at all — needs configuration
  2. The server sends the wrong origin — the Access-Control-Allow-Origin value doesn’t match your origin, or is * when credentials are in use
  3. The preflight OPTIONS request fails — the server doesn’t handle OPTIONS
  4. The server errors out — a 500 response won’t have CORS headers either
  5. A proxy strips headers — nginx, Cloudflare, or your reverse proxy eats the headers
  6. Credentials mode mismatch — withCredentials: true requires an explicit origin, not *

Chrome DevTools shows the failed request in the Network tab, but the response body is hidden for CORS-blocked requests. You can’t see what the server actually returned. You’re debugging blind.

With Gasoline connected, your AI can see the error, the network request details, and the response headers — everything needed to diagnose the root cause.

observe({what: "errors"})

The AI sees the CORS error message with the exact URL, origin, and which header is missing.

observe({what: "network_bodies", url: "/api/users"})

This shows the full request/response pair:

  • Request headers — the Origin header the browser sent
  • Response headers — whether Access-Control-Allow-Origin is present, and what value it has
  • Response status — is it a 200 with missing headers, or a 500 that also lacks headers?
  • Response body — the actual error payload (which Chrome hides for CORS failures)

observe({what: "network_waterfall", url: "/api/users"})

The waterfall shows if there are two requests — the preflight OPTIONS and the actual request. If the OPTIONS request fails or returns the wrong status, the browser never sends the real request.

observe({what: "timeline", include: ["network", "errors"]})

The timeline shows the sequence: did the preflight succeed? Did the main request fire? When did the error appear relative to the request? This catches timing-related CORS issues like the server sending headers on GET but not POST.

What the AI sees: Request to api.example.com, response status 200, no Access-Control-Allow-Origin header.

The fix: Add CORS headers to the server. The AI can look at your server code and add the appropriate middleware:

// Express
app.use(cors({ origin: 'http://localhost:3000' }));
// Go
w.Header().Set("Access-Control-Allow-Origin", "http://localhost:3000")
// Nginx
add_header 'Access-Control-Allow-Origin' 'http://localhost:3000';

What the AI sees: Request to /api/users, response status 500, body contains {"error": "database connection failed"}, no CORS headers.

The real problem: The server is crashing, and crash responses don’t go through the CORS middleware. The CORS error is a red herring.

This is why seeing the response body matters. Without Gasoline, you’d spend an hour debugging CORS headers when the actual issue is a database connection string.
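
A minimal sketch of that fix for an Express-style server: an error handler with the `(err, req, res, next)` arity that re-applies the CORS header before sending the 500, so the browser can actually read the error body. The handler name and allowed origin are assumptions:

```javascript
const ALLOWED_ORIGIN = 'http://localhost:3000'; // assumption: your dev origin

// Error handlers run outside the normal middleware chain, so the CORS
// header set earlier is lost — re-apply it before sending the error.
function corsAwareErrorHandler(err, req, res, next) {
  res.setHeader('Access-Control-Allow-Origin', ALLOWED_ORIGIN);
  res.status(500).json({ error: err.message });
}

// app.use(corsAwareErrorHandler); // register LAST, after all routes
```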

What the AI sees: Two requests in the waterfall — an OPTIONS request returning 404, and no follow-up request.

The fix: The server doesn’t handle OPTIONS requests for that route. Add an OPTIONS handler or configure your framework’s CORS middleware to handle preflight requests.
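
A sketch of such a preflight handler in the Node-style `(req, res, next)` signature — the origin, methods, and headers listed are placeholders for your app's real values:

```javascript
// Answer preflight OPTIONS requests before route matching can 404 them.
function handlePreflight(req, res, next) {
  if (req.method !== 'OPTIONS') return next();
  res.setHeader('Access-Control-Allow-Origin', 'http://localhost:3000');
  res.setHeader('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE');
  res.setHeader('Access-Control-Allow-Headers', 'Content-Type, Authorization');
  res.statusCode = 204; // No Content — the preflight needs headers, not a body
  res.end();
}
```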

What the AI sees: Response has Access-Control-Allow-Origin: *, request has credentials: include. Error says “wildcard cannot be used with credentials.”

The fix: Replace * with the specific origin. The AI can read the Origin header from the request and configure the server to echo it back (with a whitelist).
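
A minimal whitelist sketch (the origin list is illustrative):

```javascript
const ORIGIN_WHITELIST = ['https://app.example.com', 'http://localhost:3000'];

// Echo the request's Origin back only if it's whitelisted; with credentials,
// the browser rejects '*', so the header must name a specific origin.
function allowOrigin(requestOrigin) {
  return ORIGIN_WHITELIST.includes(requestOrigin) ? requestOrigin : null;
}
```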

What the AI sees: Server code sends CORS headers (the AI can read the source), but the response in the browser doesn’t have them.

Diagnosis: Something between the server and browser is stripping headers. The AI checks nginx configs, Cloudflare settings, or reverse proxy configuration.

Here’s what it looks like end-to-end:

You: “I’m getting a CORS error when calling the API.”

The AI:

  1. Calls observe({what: "errors"}) — sees the CORS error with URL and origin
  2. Calls observe({what: "network_bodies", url: "/api"}) — sees the actual response (a 500 with a database error)
  3. Reads the server code — finds the missing error handler that skips CORS middleware
  4. Fixes the error handler to pass through CORS middleware even on errors
  5. Calls interact({action: "refresh"}) — reloads the page
  6. Calls observe({what: "errors"}) — confirms the CORS error is gone

Total time: 2 minutes. No manual DevTools inspection. No guessing about headers. No Stack Overflow rabbit holes.

Chrome DevTools has a fundamental limitation for CORS debugging: it hides the response body for CORS-blocked requests. The Network tab shows the request was blocked, but you can’t see what the server actually returned.

This means you can’t tell the difference between:

  • A correctly configured server that’s missing one header
  • A server that’s completely crashing and returning a 500

Gasoline captures the response at the network level before CORS enforcement, so the AI sees everything — headers, body, status code. The diagnosis goes from “something is wrong with CORS” to “the server returned a 500 because the database is down, and the error handler doesn’t set CORS headers.”

Check the timeline, not just the error. CORS errors sometimes cascade — one failed preflight blocks ten subsequent requests. The timeline shows the cascade pattern so you fix the root cause, not the symptoms.

Look at both staging and production headers. CORS works in staging with * but breaks in production with credentials? The network bodies show exactly which headers each environment returns.

Watch for mixed HTTP/HTTPS. http://localhost:3000 and https://localhost:3000 are different origins. The AI’s transport security check (observe({what: "security_audit", checks: ["transport"]})) catches this mismatch.

Use error_bundles for context. observe({what: "error_bundles"}) returns the CORS error along with the correlated network request and recent actions — everything in one call instead of three.

How to Debug React and Next.js Apps with AI Using Gasoline MCP

React and Next.js applications have a unique set of debugging challenges — hydration mismatches, stale closures, useEffect dependency bugs, SSR/client divergence, and API route failures. Your AI coding assistant can fix all of these faster if it can actually see your browser.

Here’s how Gasoline MCP gives your AI the runtime context it needs to debug React and Next.js apps effectively.

What Makes React/Next.js Debugging Different


React errors are notoriously unhelpful:

Uncaught Error: Minified React error #418

Even in development mode, React errors like “Cannot update a component while rendering a different component” don’t tell you which component or what triggered the update. And Next.js adds its own layer of complexity:

  • Hydration mismatches — server HTML differs from client render
  • SSR errors — server-side code fails but the page looks fine on the client
  • API route failures — /api/* routes return 500s that the client silently swallows
  • Middleware issues — redirects and rewrites that happen before the page loads
  • Client/server boundary confusion — "use client" and "use server" scope mistakes

Your AI assistant can read your source code, but without browser data it can’t see what’s actually happening at runtime.

observe({what: "errors"})

Your AI sees every console error with the full message, stack trace, and source file location. For minified builds, Gasoline resolves source maps — so even in production, the AI sees the original component name and line number.

Most React bugs involve data:

observe({what: "error_bundles"})

Error bundles return each error with its correlated context — the network requests that happened around the same time, the user actions that preceded it, and relevant console logs. One call gives the AI the complete picture:

  • The error: TypeError: Cannot read properties of undefined (reading 'map')
  • The API call: GET /api/products → 200, but the response body was { products: null } instead of { products: [] }
  • The user action: Clicked “Load More” button

The AI immediately knows: the API returned null where the component expected an array.

For race conditions and ordering issues:

observe({what: "timeline"})

The timeline shows actions, network requests, and errors in chronological order. This reveals:

  • Components that fetch data before mounting
  • Effects that fire in unexpected order
  • Network requests that resolve after the component unmounts

Symptom: “Text content does not match server-rendered HTML” or “Hydration failed because the initial UI does not match.”

observe({what: "errors"})

The AI sees the hydration warning with the mismatched content. Common causes:

  • Using Date.now() or Math.random() during render (different on server vs client)
  • Checking window or localStorage during initial render
  • Conditional rendering based on typeof window !== 'undefined'

The AI can find the component, identify the non-deterministic code, and move it into a useEffect or behind a suppressHydrationWarning.

Symptom: A feature silently fails. No error in the UI, but the data is wrong.

observe({what: "network_bodies", url: "/api"})

The AI sees every API route call with the full request and response body. A 500 response from /api/checkout with {"error": "STRIPE_KEY is undefined"} tells the AI exactly what’s wrong — an environment variable isn’t set.

Symptom: The component re-renders endlessly, or an effect doesn’t fire when it should.

observe({what: "network_waterfall", url: "/api"})

If an effect with a missing dependency is refetching on every render, the waterfall shows dozens of identical API calls in rapid succession. The AI sees the pattern and checks the effect’s dependency array.

Symptom: “Can’t perform a React state update on an unmounted component.”

observe({what: "timeline", include: ["actions", "errors", "network"]})

The timeline shows: user navigates away → API call from the previous page resolves → state update on the now-unmounted component. The AI adds cleanup logic to the effect.
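
A framework-free sketch of that cleanup pattern using the standard AbortController API; in a React effect, `cancel` would be returned as the cleanup function. The `fetchImpl` parameter is only there to make the sketch easy to exercise outside a browser:

```javascript
// Cancel an in-flight request when the caller goes away, so a late response
// can't trigger a state update on an unmounted component.
function fetchWithCancel(url, fetchImpl = fetch) {
  const controller = new AbortController();
  const promise = fetchImpl(url, { signal: controller.signal });
  return { promise, cancel: () => controller.abort() };
}

// In a component (illustrative):
// useEffect(() => {
//   const { promise, cancel } = fetchWithCancel('/api/items');
//   promise.then(setItems).catch(() => {}); // AbortError is expected on unmount
//   return cancel; // runs when the component unmounts
// }, []);
```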

Symptom: Page transitions feel sluggish.

observe({what: "vitals"})
observe({what: "performance"})

The AI checks INP (responsiveness) and long tasks. If client-side navigation triggers heavy re-renders, the performance snapshot shows the blocking time. The AI can suggest React.memo, useMemo, code splitting, or moving work to a Web Worker.

Server components run on the server and stream HTML to the client. Errors in server components don’t always appear in the browser console.

observe({what: "network_bodies", url: "/"})

The response body for a Next.js page includes the serialized server component tree. If a server component throws, the error boundary HTML is visible in the response.

Next.js middleware runs before the page loads. If a redirect or rewrite misbehaves:

observe({what: "network_waterfall"})

The waterfall shows every request including redirects (301, 307, 308). The AI can see if middleware is redirecting to the wrong URL or creating redirect loops.

Next.js <Image> component can cause CLS if dimensions aren’t right:

observe({what: "vitals"}) // Check CLS
configure({action: "query_dom", selector: "img"}) // Check image dimensions

After adding a new dependency:

observe({what: "network_waterfall"})
observe({what: "performance"})

The network summary shows total JavaScript transfer size. If it jumped from 300KB to 800KB, the waterfall identifies which new bundles appeared.

You: “The product page is broken — it shows a blank screen after I click ‘Add to Cart’.”

The AI:

  1. Calls observe({what: "error_bundles"}) — sees a TypeError: Cannot read properties of undefined (reading 'quantity') correlated with POST /api/cart → 201 that returned {item: {id: 5}} (no quantity field)

  2. Reads the cart component — finds cartItem.quantity.toString() without null checking

  3. Checks the API route — finds the response omits quantity for new items (it defaults to 1 on the backend but isn’t serialized)

  4. Fixes both: adds quantity to the API response and adds a fallback in the component

  5. Calls interact({action: "refresh"}) then observe({what: "errors"}) — confirms zero errors

Total time: 3 minutes. No manual DevTools inspection. No reproducing the bug by clicking through the UI.

Use error_bundles as your first call. It returns errors with their network and action context in one shot — faster than calling errors, then network_bodies, then actions separately.

Check the waterfall after deploys. New React bundles, changed chunk names, and different loading order are all visible in the network waterfall. The AI spots unexpected changes immediately.

Profile page transitions. Use interact({action: "navigate", url: "/products"}) to trigger a client-side navigation. The perf_diff shows the performance impact of that navigation including any heavy re-renders.

For SSR issues, check response bodies. The HTML response for a Next.js page contains the server-rendered markup. If something is wrong on the server side, it’s visible in the network body before hydration even starts.

How to Debug WebSocket Connections in 2026

WebSocket debugging in Chrome DevTools is painful. You get a flat list of frames, no filtering, no search, no way to correlate messages with application state, and if you close the tab, everything is gone.

For real-time applications — chat, live dashboards, collaborative editors, trading platforms — you need better tools. Here’s the modern approach using AI-assisted debugging.

The Problem with DevTools WebSocket Debugging


Open Chrome DevTools, go to the Network tab, filter by WS, click on your connection, and look at the Messages tab. That’s the entire experience. Here’s what’s missing:

No filtering by message type. If your WebSocket sends 10 message types (chat, typing indicators, presence updates, notifications), you can’t filter to just one. You scroll through hundreds of messages hunting for the one you need.

No directional filtering. You can’t show only incoming or only outgoing messages without reading every row.

No correlation. When a WebSocket message causes an error, there’s no link between the Network tab and the Console tab. You’re manually matching timestamps.

No persistence. Navigate away or refresh, and the WebSocket data is gone. You can’t compare messages across page loads.

No AI access. Even if you find the problematic message, you can’t easily get it to your AI assistant. You’re back to copy-pasting.

With Gasoline MCP, your AI can observe WebSocket traffic directly, filter it, correlate it with errors, and diagnose issues without you touching DevTools.

observe({what: "websocket_status"})

The AI immediately knows:

  • How many WebSocket connections are open
  • Their URLs and states (connecting, open, closed, error)
  • Message rates per connection
  • Total messages sent and received
  • Inferred message schemas (if JSON)
observe({what: "websocket_events", direction: "incoming", last_n: 20})

The AI sees the actual message payloads, filtered to just what’s relevant. No scrolling through thousands of frames.

observe({what: "timeline", include: ["websocket", "errors"]})

The timeline shows WebSocket events and console errors chronologically. The AI sees: “The user_presence message arrived at 14:23:05.123, and a TypeError occurred at 14:23:05.125 — the presence handler is crashing.”

Your real-time dashboard stopped updating. No error in the console. The data just went stale.

You: “The dashboard stopped getting live updates.”

The AI calls observe({what: "websocket_status"}) and sees:

Connection ws-1: wss://api.example.com/live
State: closed
Close code: 1006 (abnormal closure)
Messages received: 3,847
Last message: 2 minutes ago

Close code 1006 means the connection dropped without a proper close handshake — likely a network interruption or server crash. The AI checks:

observe({what: "websocket_events", connection_id: "ws-1", last_n: 5})

The last messages were normal data frames, then nothing. No close frame from the server. The AI looks at the client-side reconnection logic and finds it has a bug — it tries to reconnect but uses the wrong URL after a server failover.
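
Correct reconnection logic re-resolves the endpoint on every attempt rather than reusing the URL captured at startup. A sketch with capped exponential backoff — `resolveUrl` and `openSocket` are assumed app-specific hooks, not Gasoline APIs:

```javascript
// Delay grows 500ms, 1s, 2s, ... capped at 30s.
function nextBackoffMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function reconnect(resolveUrl, openSocket, attempt = 0) {
  setTimeout(async () => {
    const url = await resolveUrl(); // ask again — a failover may have moved the server
    openSocket(url, () => reconnect(resolveUrl, openSocket, attempt + 1));
  }, nextBackoffMs(attempt));
}
```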

After a backend deploy, the chat stops working. Messages send but nothing appears.

The AI calls observe({what: "websocket_events", direction: "outgoing", last_n: 5}):

{"type": "message", "payload": {"text": "hello", "room": "general"}}

Then observe({what: "websocket_events", direction: "incoming", last_n: 5}):

{"type": "error", "code": "INVALID_PAYLOAD", "message": "missing field: channel"}

The backend renamed room to channel but the frontend still sends room. The AI finds the mismatch, updates the frontend, and the chat works again.
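
One way to catch this class of bug earlier: validate outgoing payload shapes against the backend contract in development, so a renamed field fails loudly on the client instead of silently on the server. A sketch, with field names taken from this scenario:

```javascript
const REQUIRED_FIELDS = ['text', 'channel']; // assumption: current backend contract

// Throw at send time if the payload is missing a required field.
function buildChatMessage(payload) {
  for (const field of REQUIRED_FIELDS) {
    if (!(field in payload)) throw new Error(`missing field: ${field}`);
  }
  return JSON.stringify({ type: 'message', payload });
}
```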

The page slows down when connected to the WebSocket. CPU usage spikes.

observe({what: "websocket_status"})

Connection ws-2: wss://api.example.com/stream
State: open
Incoming rate: 340 msg/sec
Total messages: 48,291

340 messages per second is flooding the client. The AI checks:

observe({what: "vitals"})

INP is 890ms — the main thread is completely blocked processing messages. The AI looks at the message handler, finds it’s updating React state on every message (triggering a re-render 340 times per second), and refactors it to batch updates with requestAnimationFrame or useDeferredValue.
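
The batching refactor can be sketched like this: coalesce incoming messages and flush them once per frame instead of once per message. The `schedule` parameter is injectable for testing; in the browser it defaults to requestAnimationFrame:

```javascript
// Coalesce high-rate messages into one flush (one state update) per frame.
function createBatcher(flush, schedule = cb => requestAnimationFrame(cb)) {
  let queue = [];
  let scheduled = false;
  return function push(message) {
    queue.push(message);
    if (!scheduled) {
      scheduled = true;
      schedule(() => {
        const batch = queue;
        queue = [];
        scheduled = false;
        flush(batch); // e.g. one setState with 340 messages, not 340 setStates
      });
    }
  };
}

// Usage (illustrative): socket.onmessage = e => push(JSON.parse(e.data));
```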

WebSocket connections fail immediately after a deploy.

observe({what: "websocket_events", last_n: 10})

Shows open followed immediately by close with code 1008 (policy violation). The AI checks the server’s WebSocket authentication — the new deploy requires a different auth token format, but the client is sending the old format.

The most powerful pattern: combining WebSocket data with error tracking.

observe({what: "error_bundles"})

Error bundles include WebSocket events in the correlation window. When a WebSocket message triggers a JavaScript error, the AI sees both together:

  • Error: TypeError: Cannot read properties of undefined (reading 'user')
  • Correlated WebSocket message: {"type": "presence_update", "data": null} (arrived 50ms before the error)
  • User action: None (this was server-pushed)

The AI knows the server sent a presence_update with null data, and the handler doesn’t check for null. One fix: add a null guard in the handler. Better fix: also fix the server so it doesn’t send null presence data.
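
The client-side guard is a one-line check. A sketch with illustrative handler and field names:

```javascript
// Server-pushed payloads can arrive with null data — bail out instead of crashing.
function handlePresenceUpdate(msg, updateUser) {
  if (!msg || !msg.data || !msg.data.user) return false;
  updateUser(msg.data.user);
  return true;
}
```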

Real-time features are everywhere in 2026:

  • AI chat interfaces with streaming responses
  • Collaborative editing (Notion, Figma, Google Docs style)
  • Live dashboards and monitoring
  • Multiplayer applications
  • Real-time notifications

These applications live and die by their WebSocket connections. A dropped connection means lost messages. A format change means silent failures. A flooding server means frozen UIs.

DevTools hasn’t evolved to match. The WebSocket debugging experience in Chrome is fundamentally the same as it was in 2018. Meanwhile, applications have moved from “we have one WebSocket for notifications” to “we have five WebSocket connections handling different data streams.”

AI-assisted debugging — where the AI can filter, correlate, and diagnose WebSocket issues programmatically — is the first real advancement in WebSocket debugging in years.

  1. Install Gasoline (Quick Start)
  2. Open your real-time application
  3. Ask your AI: “Show me all active WebSocket connections and their status.”

Your AI calls observe({what: "websocket_status"}) and you’re debugging WebSockets without opening DevTools.

How to Fix Slow Web Vitals with AI Using Gasoline MCP

Your Core Web Vitals are red. LCP is 4.2 seconds. CLS is 0.35. Google Search Console is sending angry emails. Lighthouse gives you a list of suggestions, but they’re generic — “reduce unused JavaScript” doesn’t tell you which JavaScript or why it’s slow.

Here’s how to use Gasoline MCP to give your AI assistant real-time performance data, so it can identify exactly what’s wrong and fix it.

The Problem with Traditional Performance Tools

Section titled “The Problem with Traditional Performance Tools”

Lighthouse runs a synthetic test on a throttled connection. It’s useful for benchmarking but disconnected from your actual development experience:

  • It’s a snapshot, not real-time — you fix something, re-run Lighthouse, wait 30 seconds, check the score, repeat
  • Suggestions are generic — “eliminate render-blocking resources” doesn’t tell you which stylesheet is the problem
  • No before/after — you can’t easily compare metrics across changes
  • No correlation — it doesn’t connect slow performance to specific code changes or network requests

Gasoline solves all four problems.

observe({what: "vitals"})

Your AI gets the real numbers immediately:

| Metric | Value | Rating            |
| ------ | ----- | ----------------- |
| FCP    | 2.1s  | needs_improvement |
| LCP    | 4.2s  | poor              |
| CLS    | 0.35  | poor              |
| INP    | 280ms | needs_improvement |

No waiting for Lighthouse. No throttled simulation. These are the real metrics from your real browser on your real page.

observe({what: "performance"})

This returns everything — not just vitals, but the full diagnostic picture:

Navigation timing: TTFB, DomContentLoaded, Load event — shows where time is spent during page load.

Network summary by type: How many scripts, stylesheets, images, and fonts loaded. Total transfer size and decoded size per category. Your AI can immediately see “you’re loading 2.1MB of JavaScript across 47 files.”

Slowest requests: The top resources by duration. If a single API call takes 3 seconds, it shows up here.

Long tasks: JavaScript execution that blocks the main thread for more than 50ms. The count, total blocking time, and longest task. If INP is bad, this is where you find out why.
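
For intuition, here is how total blocking time falls out of long-task entries: each task blocks for whatever it runs beyond the 50ms budget. A sketch — in the page, the entries would come from a PerformanceObserver:

```javascript
// Sum the over-budget portion of each long task (duration beyond 50ms).
function totalBlockingTime(entries) {
  return entries.reduce((sum, e) => sum + Math.max(0, e.duration - 50), 0);
}

// In the browser (illustrative):
// new PerformanceObserver(list => console.log(totalBlockingTime(list.getEntries())))
//   .observe({ type: 'longtask', buffered: true });
```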

LCP measures when the main content becomes visible. Common causes of slow LCP:

High TTFB: If time_to_first_byte is over 800ms, the server is the bottleneck. The AI checks your server code, database queries, or caching configuration.

Render-blocking resources: The network waterfall shows which scripts and stylesheets load before content paints:

observe({what: "network_waterfall"})

The AI looks for CSS and JavaScript files with early start_time and long duration. These are the render-blocking resources. The fix: defer non-critical scripts, inline critical CSS, use media attributes on non-essential stylesheets.

Large hero images: If the LCP element is an image, the performance snapshot shows its transfer size. A 2MB uncompressed PNG as the hero image? The AI suggests WebP, proper sizing, and fetchpriority="high".

Late-loading content: If FCP is fast but LCP is slow, the main content loads late — maybe behind an API call or a client-side render. The timeline shows the gap:

observe({what: "timeline", include: ["network"]})

CLS measures visual stability. Things that cause layout shifts:

Images without dimensions: An <img> without width and height causes the browser to reflow when the image loads. The AI can audit your images:

configure({action: "query_dom", selector: "img"})

Dynamic content insertion: Ads, banners, or lazy-loaded content that pushes existing content down. The timeline shows when shifts happen relative to network requests.

Font loading: Web fonts that cause text to resize. The AI checks for font-display: swap or font-display: optional in your CSS.

CSS without containment: The AI can check if your dynamic containers use contain: layout or explicit dimensions.

INP measures the worst-case responsiveness to user input. If INP is high, the main thread is busy when the user interacts.

Long tasks are the smoking gun: The performance snapshot shows total blocking time and the longest task. If you have 800ms of blocking time from 12 long tasks, the AI knows exactly what to target.

Heavy event handlers: The AI can read your click and input handlers to find expensive operations (DOM manipulation, synchronous computation, large state updates) that should be deferred or moved to a Web Worker.

Third-party scripts: The network waterfall shows which third-party scripts are loading and how long their execution takes:

observe({what: "third_party_audit"})

A third-party analytics script running 200ms of JavaScript on every page load directly impacts INP.

This is where Gasoline shines. After the AI makes a change:

interact({action: "refresh"})

Gasoline automatically captures before and after performance snapshots and computes a diff. The result includes:

  • Per-metric comparison: LCP went from 4200ms to 2800ms (-33%, improved, rating: needs_improvement)
  • Resource changes: “Removed analytics-v2.js (180KB), resized bundle.js from 450KB to 320KB”
  • Verdict: “improved” — more metrics got better than worse

The AI says: “LCP improved from 4.2s to 2.8s after removing the synchronous analytics script. CLS dropped from 0.35 to 0.08 after adding image dimensions. INP is still 250ms — let me look at the long tasks.”

No re-running Lighthouse. No waiting. Instant feedback.

If INP is the remaining problem, profile the actual interactions:

interact({action: "click", selector: "text=Load More", analyze: true})

The analyze: true parameter captures before/after performance around that specific click. The AI sees exactly how much main-thread time that button click consumes.

When you’re done optimizing:

generate({format: "pr_summary"})

This produces a before/after performance summary suitable for your pull request description — showing stakeholders exactly what improved and by how much.

Here’s a real workflow condensed:

Initial vitals: LCP 5.1s, CLS 0.42, INP 380ms

AI diagnosis:

  1. Network waterfall shows 3.2MB of JavaScript across 62 requests
  2. TTFB is 1.8s — slow API call blocks server-side rendering
  3. Five images without width/height attributes cause CLS
  4. Long tasks total 1.2s of blocking time — mostly from a charting library initializing synchronously

AI fixes:

  1. Adds loading="lazy" to below-fold charts, defers non-critical scripts → JS drops to 1.4MB initial
  2. Adds Redis caching to the slow API endpoint → TTFB drops to 200ms
  3. Adds explicit dimensions to all images → CLS drops to 0.02
  4. Wraps chart initialization in requestIdleCallback → blocking time drops to 180ms

Final vitals: LCP 1.9s (good), CLS 0.02 (good), INP 150ms (good)

Total time: One conversation, about 20 minutes. Each fix was verified immediately with perf_diff.

|             | Lighthouse                  | Gasoline                             |
| ----------- | --------------------------- | ------------------------------------ |
| Speed       | 30s synthetic run per check | Real-time, instant                   |
| Comparison  | Manual before/after         | Automatic perf_diff                  |
| Diagnosis   | Generic suggestions         | Your actual bottlenecks              |
| Fix cycle   | Run → fix → re-run → check  | Fix → refresh → see diff             |
| Context     | Score and suggestions       | Full waterfall, timeline, long tasks |
| Integration | Separate tool               | Same terminal as your AI assistant   |

Lighthouse tells you your LCP is 4.2 seconds and suggests “reduce unused JavaScript.” Gasoline tells your AI that analytics-v2.js (180KB) loads synchronously in the head, blocks FCP by 800ms, and can be deferred without breaking anything.

Set budgets in .gasoline.json to catch regressions automatically:

{
  "budgets": {
    "default": {
      "lcp_ms": 2500,
      "cls": 0.1,
      "inp_ms": 200,
      "total_transfer_kb": 500
    },
    "routes": {
      "/": { "lcp_ms": 2000 },
      "/dashboard": { "lcp_ms": 3000, "total_transfer_kb": 800 }
    }
  }
}

When any metric exceeds its budget, the AI gets an alert. Regressions are caught during development, not after deploy.

  1. Install Gasoline and connect your AI tool (Quick Start)
  2. Navigate to your slowest page
  3. Ask: “What are the Web Vitals for this page, and what’s causing the worst ones?”

Your AI sees the numbers, identifies the bottlenecks, and starts fixing. Real metrics, real fixes, real-time feedback.

One Tool Replaces Four: How Gasoline MCP Eliminates Loom, DevTools, Selenium, and Playwright

Most development teams juggle at least four tools to ship a feature: Loom for demos and bug reports, Chrome DevTools for debugging, Selenium or Playwright for automated testing, and some combination of all three for QA. Each tool has its own setup, its own learning curve, and its own context switch.

Gasoline MCP replaces all four with a single Chrome extension and one MCP server. And the result isn’t just fewer tools — it’s dramatically faster cycle times.

Loom — “Let Me Show You What’s Happening”


Product managers record Loom videos to demo features. Developers record Loom videos to show bugs. QA records Loom videos to document test failures. Everyone records Loom videos because the alternative — writing a detailed description with screenshots — takes even longer.

The problem: Loom videos are static. They can’t be replayed against a new build. They can’t be edited when the flow changes. They can’t be version-controlled. And they require $12.50/user/month.

Chrome DevTools — “Let Me Check the Console”


Every debugging session starts with opening DevTools, switching between Console, Network, and Elements tabs, copying error messages, and pasting them somewhere the AI or another developer can see them.

The problem: DevTools is manual and disconnected. The AI can’t see what’s in DevTools. You’re the human bridge between the browser and your tools.

Selenium / WebDriver — “Let Me Automate This”


Automated browser testing requires WebDriver binaries, a programming language (Java, Python, JavaScript), and coded selectors that break whenever the UI changes.

The problem: High setup cost, high maintenance cost, requires developer skills. Product managers and QA without coding experience can’t use it.

Playwright — “Let Me Write a Proper Test”


Modern browser automation that’s better than Selenium but still requires JavaScript/TypeScript, an npm project, and coded selectors.

The problem: Same fundamental issue — you need code to create tests. And when tests break (they always break), you need code to fix them.

Instead of recording a video:

"Navigate to the dashboard. Add a subtitle: 'Welcome to the Q1 report.'
Click the revenue tab. Subtitle: 'Revenue is up 23% quarter over quarter.'
Click the export button. Subtitle: 'One click to export to PDF.'"

The AI navigates the application while displaying narration text at the bottom of the viewport — like closed captions. Action toasts show what’s happening (“Click: Revenue Tab”). The audience watches a live, narrated walkthrough.

Why it’s better than Loom:

  • Replayable — run the same script against tomorrow’s build
  • Editable — change one line of text, not re-record a whole video
  • Adaptive — semantic selectors survive UI redesigns
  • Versionable — store scripts in your repo, diff them in PRs
  • Free — no per-seat subscription

Instead of opening DevTools and copy-pasting:

"What browser errors do you see?"

The AI calls observe({what: "errors"}) and sees every console error with full stack traces. Then observe({what: "network_bodies", url: "/api"}) for the API response body. Then observe({what: "websocket_status"}) for WebSocket connection state. Then observe({what: "vitals"}) for performance metrics.

Why it’s better than DevTools:

  • The AI sees it directly — no human copy-paste bridge
  • Everything in one place — errors, network, WebSocket, performance, accessibility, security
  • Correlated — error_bundles returns the error with its network context and user actions
  • Persistent — data doesn’t vanish on page refresh
  • Actionable — the AI diagnoses and fixes, not just observes

Selenium → interact() + Natural Language


Instead of writing Java with WebDriver:

"Go to the registration page. Fill in 'Jane Doe' as the name,
'jane@example.com' as the email, and 'secure123' as the password.
Click Register. Verify you see the welcome message."

The AI navigates, types, clicks, and verifies — using semantic selectors (label=Name, text=Register) that survive UI changes.

Why it’s better than Selenium:

  • No code — describe the test in English
  • No setup — no WebDriver, no JDK, no project scaffolding
  • Resilient — semantic selectors adapt to redesigns
  • Anyone can use it — PMs, QA, designers, not just developers

Playwright → generate(format: “test”)


After running a natural language test, lock it in for CI:

generate({format: "test", test_name: "registration-flow",
assert_network: true, assert_no_errors: true})

Gasoline generates a complete Playwright test from the session — real selectors, network assertions, error checking. The AI explored in English; Gasoline exports for CI/CD.

Why it’s better than writing Playwright by hand:

  • Faster — describe the flow, don’t code it
  • Accurate — generated from real browser behavior, not guessed
  • Maintainable — when the test breaks, re-run in English and regenerate

The Compound Effect: Radical Cycle Time Reduction


Replacing four tools isn’t just about having fewer subscriptions. It’s about what happens when demo, debug, test, and automate are the same workflow.

  1. PM records a Loom showing the desired feature (10 minutes)
  2. Developer watches the Loom, opens DevTools, starts building (context switch)
  3. Developer debugs in DevTools, copies errors, pastes to AI, gets suggestions (context switch)
  4. Developer writes Playwright tests for the feature (30-60 minutes)
  5. QA records a Loom of a bug they found (10 minutes)
  6. Developer watches the Loom, reproduces, opens DevTools again (context switch)
  7. Developer fixes and re-runs tests (context switch)
  8. PM records another Loom for the stakeholder demo (10 minutes)

Four tools. Six context switches. Half the time spent on ceremony instead of building.

  1. PM describes the feature to the AI: “The user should be able to export the report as PDF”
  2. AI builds the feature, debugging in real time — it sees errors as they happen, fixes them, verifies with observe({what: "errors"}), checks performance with observe({what: "vitals"})
  3. AI generates a test: generate({format: "test", test_name: "pdf-export"})
  4. AI runs the demo with subtitles for the stakeholder
  5. If QA finds a bug, the AI already has the error context — observe({what: "error_bundles"}) — and fixes it in the same session
  6. AI regenerates the test if the fix changed the flow

One tool. Zero context switches. The cycle from “PM describes feature” to “tested, demo-ready feature” happens in a single conversation.

Activity              | 4-Tool Cycle                   | Gasoline Cycle
Feature demo (PM)     | 10 min Loom recording          | 0 — AI demos with subtitles
Debugging             | 20 min (DevTools + copy-paste) | 2 min (AI observes directly)
Test creation         | 30-60 min (Playwright)         | 2 min (generate from session)
Bug report            | 10 min Loom + reproduce        | 1 min (AI already has context)
Bug fix verification  | 5 min (re-run tests)           | 30 sec (refresh + observe)
Stakeholder demo      | 10 min (new Loom)              | 1 min (replay demo script)
Total                 | 85-115 min                     | ~7 min

That’s not an incremental improvement. It’s an order of magnitude.

Product velocity isn’t about how fast you type. It’s about how fast you can go from “idea” to “shipped and verified.” Every context switch adds latency. Every tool boundary adds friction. Every manual step adds error.

When demo, debug, test, and automate collapse into a single AI conversation:

  • Feedback loops tighten — the AI sees the result of every change in real time
  • Iteration cost drops — trying a different approach is a sentence, not a sprint
  • Quality increases — tests are generated from real behavior, not written from memory
  • Everyone participates — PMs can demo, test, and file bugs without developer involvement

This is what AI-native development looks like. Not “AI helps you write code faster” — but “AI collapses the entire build-debug-test-demo cycle into minutes.”

The one remaining advantage Loom has over Gasoline is shareability — you can send a Loom link to anyone with a browser. Gasoline’s demo scripts require the AI to replay them.

The fix: tab recording. Chrome’s tabCapture API can record the active tab as video while the AI runs a demo script. Subtitles and action toasts are already rendered in the page, so they’d be captured automatically. The output: a narrated demo video, generated from a replayable script, with burned-in captions. No Loom subscription. No manual recording. No re-takes.

That feature is on the roadmap. When it ships, the Loom replacement is complete.

You don’t need four tools. You need one browser extension, one MCP server, and an AI that can see your browser.

Loom → Gasoline subtitles + demo scripts (+ tab recording, coming soon)
Chrome DevTools → Gasoline observe()
Selenium → Gasoline interact() + natural language
Playwright → Gasoline generate(format: "test")

One install. Zero subscriptions. Faster than all four combined.

Get started →

What Is MCP? The Model Context Protocol Explained for Web Developers

MCP — the Model Context Protocol — is the USB-C of AI tools. It’s a standard that lets AI assistants plug into external data sources and capabilities without custom integrations. If you’ve ever wished your AI coding assistant could see your browser, read your database, or check your CI pipeline, MCP is how that works.

Here’s what MCP means for web developers and why it changes how you build software.

AI coding assistants are powerful but blind. They can read your source code, but they can’t see:

  • The runtime error in your browser console
  • The 500 response from your API
  • The layout shift that happens after your component mounts
  • The WebSocket connection that silently drops
  • The third-party script that’s loading slowly

Without this context, every debugging session starts with you describing the problem to the AI instead of the AI observing it directly. You become a human copy-paste bridge between your browser and your terminal.

MCP eliminates that bridge.

MCP is a JSON-RPC 2.0 protocol with a simple contract:

  1. Servers expose tools (functions the AI can call) and resources (data the AI can read)
  2. Clients (AI assistants like Claude Code, Cursor, Windsurf) discover and invoke those tools
  3. Transport is flexible — stdio pipes, HTTP, or any bidirectional channel

A typical MCP server might expose tools like:

observe({what: "errors"}) → returns browser console errors
generate({format: "test"}) → generates a Playwright test
configure({action: "health"}) → returns server status
interact({action: "click", selector: "text=Submit"}) → clicks a button

The AI assistant discovers what tools are available, reads their descriptions, and calls them as needed during a conversation. No custom plugin architecture. No vendor-specific API. Just a protocol.
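On the wire, a tool invocation is a plain JSON-RPC 2.0 request. Here is a sketch of what calling Gasoline’s observe tool might look like under MCP’s tools/call convention — the message shape follows the MCP spec, but treat the exact response contents as illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "observe",
    "arguments": { "what": "errors" }
  }
}
```

A server replies with a matching JSON-RPC result, typically a content array the AI can read, e.g. `{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "..."}]}}`.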

Before MCP, debugging with AI looked like this:

You: “I’m getting an error when I submit the form.”
AI: “What error? Can you paste the console output?”
You: [switches to browser, opens DevTools, copies error, pastes]
AI: “Can you also show me the network request?”
You: [switches to Network tab, finds request, copies, pastes]

With an MCP server like Gasoline connected:

You: “I’m getting an error when I submit the form.”
AI: [calls observe({what: "errors"})] “I can see the TypeError. The API returned a 422 because the email field is missing from the request body. Let me check the form handler…”

The AI skips the back-and-forth and goes straight to diagnosing.

MCP tools compose naturally. An AI assistant with a browser MCP server and a filesystem MCP server can:

  1. Observe a runtime error in the browser
  2. Read the relevant source file
  3. Edit the code to fix the bug
  4. Refresh the browser
  5. Verify the error is gone

That’s a complete debugging loop without human intervention beyond the initial request.
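The loop above can be sketched in a few lines. The helpers here are stubs invented for illustration, not the real MCP client API:

```javascript
// Fake browser state standing in for what a browser MCP server would expose.
function makeFakeBrowser() {
  return { errors: ["TypeError: user is undefined"], fixed: false };
}

// 1. Observe runtime errors (stub for observe({what: "errors"})).
function observeErrors(browser) {
  return browser.errors;
}

// 2-3. Read the relevant source and edit the code (stubbed as a flag).
function applyFix(browser) {
  browser.fixed = true;
}

// 4. Refresh the browser so the fix takes effect.
function refresh(browser) {
  if (browser.fixed) browser.errors = [];
}

// 5. Verify the error is gone: the complete loop.
function debugLoop(browser) {
  if (observeErrors(browser).length === 0) return "no errors";
  applyFix(browser);
  refresh(browser);
  return observeErrors(browser).length === 0 ? "fixed" : "still failing";
}
```

Running `debugLoop(makeFakeBrowser())` walks all five steps and ends with the error list empty — the same shape a real assistant follows, just with actual tool calls in place of the stubs.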

Because MCP is a standard protocol, the same server works with every compatible client:

AI Tool            | MCP Support
Claude Code        | Built-in
Cursor             | Built-in
Windsurf           | Built-in
Claude Desktop     | Built-in
Zed                | Built-in
VS Code + Continue | Plugin

You configure the server once. Every AI tool that speaks MCP can use it.

MCP servers exist for many data sources:

Category   | Examples
Browser    | Gasoline (real-time telemetry, browser control)
Filesystem | Read, write, search files
Databases  | PostgreSQL, SQLite, MongoDB
APIs       | GitHub, Slack, Jira, Linear
DevOps     | Docker, Kubernetes, CI/CD
Search     | Brave Search, web fetch

The power comes from combining them. A browser MCP server plus a GitHub MCP server means your AI can observe a bug, fix it, and open a PR — all in one conversation.

Not all browser MCP servers are equal. The critical capabilities for web development:

The server should capture browser state as it happens — console logs, network errors, exceptions, WebSocket events — not just static snapshots. When you’re debugging a race condition, you need the sequence of events, not a point-in-time dump.

Observation alone isn’t enough. The AI needs to navigate, click, type, and interact with the page. Otherwise it’s reading but not testing. Semantic selectors (text=Submit, label=Email) are more resilient than CSS selectors that break with every redesign.
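A toy illustration of why semantic selectors hold up better (the objects and matchers here are invented for the example, not Gasoline’s implementation): after a redesign renames a button’s id, an id-based selector breaks while a text-based one keeps working.

```javascript
// The same Submit button before and after a redesign renames its id.
const before = { id: "submit-btn",    text: "Submit" };
const after  = { id: "submit-btn-v2", text: "Submit" };

// Brittle: tied to markup that changes with every redesign.
const matchById = (el) => el.id === "submit-btn";
// Semantic: tied to what the user sees, which rarely changes.
const matchByText = (el) => el.text === "Submit";

matchById(before);  // true
matchById(after);   // false — the test breaks on redesign
matchByText(after); // true — still finds the button
```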

Captured session data should translate into useful outputs: Playwright tests, reproduction scripts, accessibility reports, performance summaries. The AI has the data — let it produce the artifacts.

A browser MCP server sees everything — network traffic, form inputs, cookies. It must:

  • Strip credentials before storing or transmitting data
  • Bind to localhost only (no network exposure)
  • Minimize permissions (no broad host access)
  • Keep all data on the developer’s machine

Web Vitals, resource timing, long tasks, layout shifts — performance data should flow alongside error data. The AI shouldn’t need a separate tool to check if the page is fast.

If you want to add browser observability to your AI workflow:

git clone https://github.com/brennhill/gasoline-mcp-ai-devtools.git

Load the extension/ folder as an unpacked Chrome extension.

Add to your MCP config (example for Claude Code’s .mcp.json):

{
  "mcpServers": {
    "gasoline": {
      "command": "npx",
      "args": ["-y", "gasoline-mcp"]
    }
  }
}

Open your app, restart your AI tool, and ask:

“What browser errors do you see?”

The AI calls observe({what: "errors"}), gets the real-time error list, and starts diagnosing. No copy-paste. No screenshots. No description of the problem. The AI sees it directly.
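The payload that comes back might look something like this — field names here are illustrative, not Gasoline’s actual schema:

```json
{
  "errors": [
    {
      "type": "console.error",
      "message": "TypeError: Cannot read properties of undefined (reading 'map')",
      "source": "app.js:42",
      "timestamp": "2025-01-15T10:32:07Z"
    }
  ]
}
```

Whatever the exact shape, the point is the same: structured, real-time error data the AI can reason over without you relaying it.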

MCP is still early. The protocol is evolving, new servers appear weekly, and AI tools are deepening their integration. But the direction is clear: AI assistants are becoming aware of their environment, not just their context window.

For web developers, this means the feedback loop between writing code and seeing results gets tighter. The AI sees the browser. The AI sees the error. The AI sees the fix work. All in real time.

That’s what MCP enables. And it’s just getting started.