ai-development

10 posts with the tag “ai-development”

AI-Powered QA: How to Test Your Web App Without Writing Test Code

What if you could test your web application by describing what should happen — in plain English — and have an AI actually run the tests?

No Playwright scripts. No Selenium WebDriver setup. No npm install or pip install. No learning CSS selectors, XPath, or assertion libraries. Just tell the AI what to test, and it tests it.

This isn’t a future vision. It works today with Gasoline MCP.

Writing automated tests is expensive:

  • Setup cost: Install Node.js, install Playwright, configure the test runner, set up CI/CD
  • Writing cost: Learn the API, figure out selectors, handle async operations, manage test data
  • Maintenance cost: Every UI change breaks selectors. Every flow change breaks sequences. Tests that took 2 hours to write take 4 hours to maintain.

The result? Most teams have either:

  1. No automated tests — manual QA only
  2. Fragile tests — break on every deploy, ignored by the team
  3. Expensive tests — dedicated QA engineers maintaining a test suite that’s always behind

With Gasoline, testing looks like this:

"Go to the login page. Enter 'test@example.com' as the email and 'password123'
as the password. Click Sign In. Verify that you land on the dashboard and there
are no console errors."

The AI:

  1. Navigates to the login page
  2. Finds the email field (using semantic selectors — label=Email, not #email-input-field-v2)
  3. Types the email
  4. Finds the password field
  5. Types the password
  6. Clicks the Sign In button (by text, not by CSS selector)
  7. Waits for navigation
  8. Checks the URL contains /dashboard
  9. Checks for console errors

If anything fails, the AI reports exactly what happened: “The Sign In button was found and clicked, but the page navigated to /error instead of /dashboard. The API returned a 401 with {"error": "invalid credentials"}.”

Selenium/Playwright test:

await page.goto('https://myapp.com/login');
await page.locator('#email-input').fill('test@example.com');
await page.locator('#password-input').fill('password123');
await page.locator('button[type="submit"]').click();
await expect(page).toHaveURL(/.*dashboard/);

Gasoline natural language:

Log in with test@example.com / password123.
Verify you reach the dashboard.

The Selenium test breaks when:

  • The email field ID changes from #email-input to #email-field
  • The submit button gets a new class or is replaced with a different component
  • The form structure changes (inputs wrapped in a new div)

The natural language test survives all of these because the AI uses meaning-based selectors: “the email field” → label=Email, “the sign in button” → text=Sign In.
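As a toy illustration (not Gasoline's actual implementation), meaning-based selection can be thought of as mapping a plain-English description to a semantic locator strategy before ever touching CSS. Everything here, the function name, the regexes, the output shape, is a sketch:

```javascript
// Toy sketch of meaning-based selector resolution (illustrative only).
// Maps a natural-language description to a Playwright-style locator,
// preferring semantic strategies (label, text) over brittle CSS ids.
function semanticLocator(description) {
  const d = description.toLowerCase();
  // "the email field" -> match the input by its visible label
  let m = d.match(/the (.+) field/);
  if (m) return { strategy: 'label', value: titleCase(m[1]) };
  // "the sign in button" -> match the button by its visible text
  m = d.match(/the (.+) button/);
  if (m) return { strategy: 'text', value: titleCase(m[1]) };
  // fall back to matching any element by its visible text
  return { strategy: 'text', value: description };
}

function titleCase(s) {
  return s.replace(/\b\w/g, c => c.toUpperCase());
}

console.log(semanticLocator('the email field'));    // { strategy: 'label', value: 'Email' }
console.log(semanticLocator('the sign in button')); // { strategy: 'text', value: 'Sign In' }
```

Because the lookup is by label and text rather than by id or class, a renamed `#email-input` or a restyled button does not change what the description resolves to.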

"Sign up with a new account, verify the welcome email prompt appears,
dismiss it, navigate to settings, change the display name, and verify
the change is reflected in the header."

"Submit the contact form with an empty email. Verify an error message
appears. Then enter a valid email and submit. Verify it succeeds."

"Navigate to a product page that doesn't exist (/products/99999).
Verify a 404 page is shown and there are no console errors."

"Navigate to the homepage. Check that LCP is under 2.5 seconds and
there are no layout shifts above 0.1."

"Run an accessibility audit on the checkout page. Report any critical
or serious violations."

"Submit an order. Verify the API returns a 201 status and the response
includes an order ID."

Natural language tests are great for exploratory testing and quick validation. But for CI/CD, you need repeatable tests.

After running a natural language test session:

generate({format: "test", test_name: "guest-checkout",
  assert_network: true, assert_no_errors: true})

Gasoline generates a complete Playwright test from the session — every action translated to Playwright commands with proper selectors, network assertions, and error checking. The AI ran the test in natural language; Gasoline converts it to code for CI.

This is the best of both worlds:

  1. Write tests in English — fast, no setup
  2. Export to Playwright — repeatable, CI-ready
  3. Re-run in English — if the generated test breaks, describe the flow again and regenerate

You know the user flows better than anyone. You shouldn’t need to write JavaScript to verify them. Describe the flow, the AI tests it, and you see the results.

You don’t have dedicated QA engineers, and your developers are building features, not writing tests. Natural language testing gives you test coverage without the headcount.

You already know how to test. Natural language testing lets you work faster — describe 10 test cases in the time it takes to code 1. Generate Playwright tests from the ones that should be permanent.

You just shipped a feature and want to verify the happy path before the PR review. A 30-second natural language test is faster than writing a proper test and faster than manual testing.

Resilience: Why AI Tests Survive UI Changes

Traditional tests are tightly coupled to the UI implementation:

// Breaks when the button text changes from "Submit" to "Place Order"
await page.locator('button:has-text("Submit")').click();
// Breaks when the ID changes
await page.locator('#checkout-submit-btn').click();
// Breaks when the class changes
await page.locator('.btn-primary.submit').click();

The AI uses semantic selectors that adapt:

  • text=Submit → If the button now says “Place Order”, the AI reads the page and finds the new text
  • label=Email → Works regardless of whether it’s an <input>, a Material UI <TextField>, or a custom component
  • role=button → Works regardless of styling or class names

And if a selector doesn’t match, the AI doesn’t just fail — it calls interact({action: "list_interactive"}) to discover what’s actually on the page and adapts.

For tests you run regularly:

"Save this test flow as 'checkout-happy-path'."

configure({action: "store", store_action: "save",
  namespace: "tests", key: "checkout-happy-path",
  data: {steps: ["navigate to /checkout", "fill in shipping...", ...]}})

"Load and run the 'checkout-happy-path' test."

configure({action: "store", store_action: "load",
  namespace: "tests", key: "checkout-happy-path"})

Save browser state at key points:

interact({action: "save_state", snapshot_name: "logged-in"})

Later, restore that state instead of repeating the login flow:

interact({action: "load_state", snapshot_name: "logged-in", include_url: true})

To try it yourself:

  1. Install Gasoline (Quick Start)
  2. Open your web app
  3. Tell your AI: “Test the login flow — go to the login page, enter test credentials, sign in, and verify you reach the dashboard.”

No setup. No dependencies. No test code. Just describe what should happen.

Best MCP Servers for Web Development in 2026

MCP (Model Context Protocol) lets AI coding assistants plug into external tools — browsers, databases, APIs, and more. The right combination of MCP servers turns your AI assistant from a code-only tool into a full-stack development partner.

Here are the most useful MCP servers for web developers, what they do, and how they work together.

A good MCP server:

  1. Gives the AI information it can’t get otherwise — runtime data, live state, external services
  2. Reduces copy-paste — the AI reads data directly instead of you pasting it in
  3. Enables actions — the AI can do things, not just observe
  4. Works locally — your data stays on your machine

With that in mind, here are the servers worth setting up.

What it does: Streams real-time browser telemetry to your AI — console logs, network errors, WebSocket events, Web Vitals, accessibility audits, user actions — and gives the AI browser control.

Why it matters: Without browser observability, your AI can read code but can’t see what happens when it runs. Every debugging session requires you to manually describe the problem. With Gasoline, the AI observes the bug directly.

Key capabilities:

  • 4 tools: observe (23 modes), generate (7 formats), configure (12 actions), interact (24 actions)
  • Real-time: Console errors, network failures, WebSocket traffic as they happen
  • Browser control: Navigate, click, type, run JavaScript, take screenshots
  • Artifact generation: Playwright tests, reproduction scripts, HAR exports, CSP headers, SARIF reports
  • Security auditing: Credential detection, PII scanning, third-party script analysis
  • Performance: Web Vitals with before/after comparison on every navigation

Setup: Chrome extension + npx gasoline-mcp

Zero dependencies: Single Go binary, no Node.js runtime. Localhost only.

Get started with Gasoline →

Most AI coding tools (Claude Code, Cursor, Windsurf) have built-in filesystem access. If yours doesn’t, the reference filesystem MCP server handles it:

What it does: Read, write, search, and navigate files.

Why it matters: The foundation. Everything else builds on the AI being able to read and edit your code.

Key capabilities: Read files, write files, search by name or content, directory listing.

What it does: Lets the AI query your database directly — read schemas, run SELECT queries, inspect data.

Why it matters: When debugging a “wrong data” bug, the AI can check the database instead of you running psql and pasting results. It can also verify that migrations ran correctly.

Key capabilities: Schema inspection, read queries, data exploration. Most implementations are read-only by default (safe for production databases).

Use case: “Why is the user’s email wrong on the profile page?” → AI checks the database, finds the email was never updated after the migration, identifies the migration bug.

What it does: Create PRs, read issues, check CI status, review code, manage releases.

Why it matters: The AI can close the loop — fix a bug, create a PR, link it to the issue, and check if CI passes. Without GitHub access, you’re the intermediary for every PR and issue interaction.

Key capabilities: Create/update PRs, read/comment on issues, check workflow runs, view PR reviews.

Use case: “Fix this bug and open a PR” → AI fixes the code, commits, pushes, creates the PR with a summary, and links it to the issue.

What it does: Searches the web and fetches page content.

Why it matters: When your AI encounters an unfamiliar error or needs documentation for a third-party library, it can search instead of guessing. This is especially useful for new APIs, recent library versions, and obscure error messages.

Key capabilities: Web search, URL fetching, content extraction.

Use case: “I’m getting a ERR_OSSL_EVP_UNSUPPORTED error” → AI searches, finds it’s a Node.js 17+ OpenSSL 3.0 issue, applies the fix.

What it does: List containers, read logs, start/stop services, check health.

Why it matters: If your backend runs in Docker, the AI can check container logs when the API returns 500s. No more “can you check the Docker logs?” copy-paste cycles.

Key capabilities: Container listing, log reading, service management, health checks.

Use case: “The API is returning 500s” → AI checks Gasoline for the error response, then checks Docker logs for the backend container, finds the database container is down, restarts it.

What it does: Check build status, read test results, manage tickets.

Why it matters: The AI can check if CI is green after pushing a fix, read test failure logs, and update tickets with results — closing the loop without tab-switching.

The real power is composition. Here’s a debugging workflow using multiple MCP servers:

  1. Gasoline: observe({what: "error_bundles"}) — sees a TypeError correlated with a 500 from /api/orders
  2. Gasoline: observe({what: "network_bodies", url: "/api/orders"}) — the 500 response says "column 'discount_code' does not exist"
  3. Filesystem: Reads the migration files — finds the discount_code column was added in a migration that hasn’t run
  4. Docker: Checks the database container logs — confirms the migration wasn’t applied
  5. Filesystem: Reads the deployment script — finds migrations don’t auto-run
  6. Filesystem: Fixes the deployment script to run migrations
  7. Gasoline: interact({action: "refresh"}) — refreshes the page, verifies the error is gone
  8. GitHub: Creates a PR with the fix

Six MCP servers. One conversation. No copy-paste. No tab-switching. The AI moved from symptom to root cause to fix to PR in a single flow.

For a typical web development workflow:

Priority     Server                         Why
Essential    Filesystem (usually built-in)  Read and edit code
Essential    Gasoline (browser)             See runtime errors, debug, test
High value   GitHub                         PRs, issues, CI status
High value   Database                       Data inspection, schema verification
Useful       Search                         Documentation, error lookup
Useful       Docker                         Container log access

Start with Gasoline and your built-in filesystem access. Add GitHub and database when you find yourself copy-pasting between those tools and your AI. Add the rest as needed.

Most AI tools support multiple MCP servers in their config. Example for Claude Code (.mcp.json):

{
  "mcpServers": {
    "gasoline": {
      "command": "npx",
      "args": ["-y", "gasoline-mcp"]
    }
  }
}

Each server gets its own entry. The AI discovers all available tools on startup and uses them as needed.

MCP adoption is accelerating. Every major AI coding tool now supports MCP, and new servers appear weekly. The pattern is clear: AI assistants are becoming environment-aware, connecting to every data source and tool a developer uses.

The developers who set up the right MCP servers today work significantly faster — not because the AI is smarter, but because the AI can see more of the picture.

Gasoline MCP vs Playwright: When to Use Which

Gasoline and Playwright aren’t competitors — they’re complementary. Playwright is a browser automation library for writing repeatable test scripts. Gasoline is an AI-powered browser observation and control layer. Gasoline can even generate Playwright tests.

But they serve different purposes, and knowing when to use each saves significant time.

                 Gasoline MCP                               Playwright
Interface        Natural language via AI                    JavaScript/TypeScript/Python API
Who uses it      Developers, PMs, QA — anyone               Developers and QA engineers
Setup            Install extension + npx gasoline-mcp       npm init playwright@latest
Selectors        Semantic (text=Submit, label=Email)        CSS, XPath, role, text, test-id
Test creation    Describe in English                        Write code
Execution        AI runs it interactively                   CLI or CI/CD pipeline
Debugging        Real-time browser observation              Trace viewer, screenshots
Maintenance      AI adapts to UI changes                    Manual selector updates
CI/CD            Generate Playwright tests → run in CI      Native CI/CD support
Observability    Console, network, WebSocket, vitals, a11y  Limited (what you assert)
Performance      Built-in Web Vitals + perf_diff            Manual performance assertions
Cost             Free, open source                          Free, open source

You’re checking if a feature works. You don’t want to write a script — you want to try it.

Playwright: Write a script, run it, read the output, modify, repeat.

Gasoline: “Go to the checkout page, add two items, and complete the purchase. Tell me if anything breaks.”

For one-off verification, natural language is 10x faster.

Your test failed. Now what?

Playwright: Open the trace viewer. Scrub through screenshots. Check the assertion error message. Maybe add console.log statements to the test and re-run.

Gasoline: The AI already sees everything — console errors, network responses, WebSocket state, performance metrics. It can diagnose while testing.

observe({what: "error_bundles"})

One call returns the error with its correlated network requests and user actions. No trace viewer needed.

A designer renamed “Submit” to “Place Order” and restructured the form.

Playwright: Tests fail. You update selectors manually across 15 test files. You hope you caught them all.

Gasoline: The AI reads the page, finds the new button text, and continues. No manual updates.

A product manager wants to verify the user flow before release.

Playwright: Not an option without JavaScript knowledge.

Gasoline: “Walk through the signup flow and make sure it works.” The PM can do this themselves.

Playwright tests only check what you explicitly assert. If you don’t assert “no console errors,” you’ll never know about them.

Gasoline observes everything passively:

  • Console errors the test didn’t check for
  • Slow API responses the test didn’t measure
  • Layout shifts the test didn’t detect
  • Third-party script failures the test couldn’t see

Playwright: You can measure timing with custom code, but there’s no built-in Web Vitals collection or before/after comparison.

Gasoline: Web Vitals are captured automatically. Navigate or refresh, and you get a perf_diff with deltas, ratings, and a verdict. No custom code.
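The before/after comparison can be sketched as pure logic. The metric names, the output shape, and the verdict rule below are assumptions for illustration, not Gasoline's actual perf_diff format; the thresholds are the standard "good" limits the article cites (LCP 2.5 s, CLS 0.1):

```javascript
// Sketch of a before/after Web Vitals diff (illustrative shape, not
// Gasoline's real output). Thresholds are the web.dev "good" limits.
const THRESHOLDS = { lcp: 2500, cls: 0.1 };

function perfDiff(before, after) {
  const diff = {};
  for (const metric of Object.keys(THRESHOLDS)) {
    const delta = after[metric] - before[metric];
    diff[metric] = {
      before: before[metric],
      after: after[metric],
      delta,
      rating: after[metric] <= THRESHOLDS[metric] ? 'good' : 'needs-improvement',
    };
  }
  // Simple verdict rule: regression if any metric got worse AND left "good" territory
  diff.verdict = Object.values(diff).some(
    d => d.delta > 0 && d.rating !== 'good'
  ) ? 'regression' : 'ok';
  return diff;
}

const result = perfDiff({ lcp: 2100, cls: 0.02 }, { lcp: 3200, cls: 0.02 });
console.log(result.lcp.delta, result.verdict); // 1100 'regression'
```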

Playwright tests run headlessly in GitHub Actions, GitLab CI, or any CI system. They’re deterministic, repeatable, and fast.

Gasoline generates Playwright tests, but the actual CI execution is Playwright’s domain. Gasoline runs interactively with an AI assistant — it’s not designed to be a CI test runner.

Playwright can shard tests across multiple workers and run them in parallel. For a suite of 500 tests, this means finishing in minutes instead of hours.

Gasoline is single-session — one AI, one browser, one tab at a time.

Playwright supports Chromium, Firefox, and WebKit out of the box.

Gasoline’s extension currently runs in Chrome/Chromium only.

When you need a test that passes or fails the exact same way every time, Playwright’s explicit assertions are the right tool:

await expect(page.getByRole('heading')).toHaveText('Welcome back');
await expect(response.status()).toBe(200);

AI-driven testing is intelligent but non-deterministic — the AI might take different paths or interpret “verify it works” differently across runs.

Playwright can intercept and mock network requests, letting you test error states, slow responses, and edge cases without a real backend.

Gasoline observes real traffic — it doesn’t mock it.

The Best of Both: Generate Playwright from Gasoline

The power move: use Gasoline for exploration and Playwright for CI.

"Walk through the checkout flow — add an item, go to cart, enter
shipping info, and complete the purchase."

The AI runs the flow interactively, handling UI variations and reporting issues in real time.

"Generate a Playwright test from this session."

generate({format: "test", test_name: "checkout-flow",
  base_url: "http://localhost:3000",
  assert_network: true,
  assert_no_errors: true,
  assert_response_shape: true})

Gasoline produces a complete Playwright test:

import { test, expect } from '@playwright/test';

test('checkout-flow', async ({ page }) => {
  const consoleErrors = [];
  page.on('console', msg => {
    if (msg.type() === 'error') consoleErrors.push(msg.text());
  });

  await page.goto('http://localhost:3000/products');
  await page.getByRole('button', { name: 'Add to Cart' }).click();
  await page.getByRole('link', { name: 'Cart' }).click();
  await page.getByLabel('Address').fill('123 Main St');
  // ...
  expect(consoleErrors).toHaveLength(0);
});

The generated test runs in your CI pipeline like any other Playwright test. Deterministic, repeatable, fast.

The UI changed and the Playwright test fails. Instead of manually updating selectors:

"The checkout test is failing because the form changed.
Walk through the checkout flow again and generate a new test."

The AI adapts to the new UI, generates a fresh Playwright test, and you’re back in CI.

Scenario                       Use
Quick feature verification     Gasoline
CI/CD regression suite         Playwright (generated by Gasoline)
Debugging a test failure       Gasoline (better observability)
Non-developer testing          Gasoline
Cross-browser testing          Playwright
Performance monitoring         Gasoline (built-in vitals)
Network mocking                Playwright
Accessibility auditing         Gasoline (built-in axe-core)
Exploratory testing            Gasoline
500+ test parallel execution   Playwright
Test maintenance               Gasoline (regenerate broken tests)

The combined workflow:

  1. Develop — use Gasoline for real-time debugging and quick validation
  2. Generate — convert validated flows to Playwright tests
  3. CI — run Playwright tests on every push
  4. Maintain — when tests break, re-explore with Gasoline and regenerate

Gasoline doesn’t replace Playwright. It makes Playwright tests easier to create, easier to maintain, and easier to debug when they fail.

How Gasoline MCP Improves Your Application Security

Most developers discover security issues in production. A penetration test finds exposed credentials in an API response. A security review flags missing headers. A breach notification reveals that a third-party script was exfiltrating form data.

Gasoline MCP flips the timeline. Your AI assistant audits security while you develop, catching issues before they ship.

In the typical development cycle, security checks happen late:

  1. Development — features built, tested, deployed
  2. Security review — weeks later, if at all
  3. Penetration test — quarterly, expensive, findings arrive after context is lost
  4. Incident — the worst time to learn about a vulnerability

Every step between writing the code and finding the issue adds cost. A missing HttpOnly flag caught during development takes 30 seconds to fix. The same flag caught in a pen test takes a meeting, a ticket, a sprint, and a deploy.

Real-Time Security Auditing During Development

Gasoline gives your AI assistant six categories of security checks that run against live browser traffic:

Your AI can scan every network request and response for exposed secrets:

observe({what: "security_audit", checks: ["credentials"]})

This catches:

  • AWS Access Keys (AKIA...) in API responses
  • GitHub PATs (ghp_..., ghs_...) in console logs
  • Stripe keys (sk_test_..., sk_live_...) in client-side code
  • JWTs in URL parameters (a common mistake)
  • Bearer tokens in responses that shouldn’t contain them
  • Private keys accidentally bundled in source maps

Every detection runs regex plus validation (Luhn algorithm for credit cards, structure checks for JWTs) to minimize false positives.
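The Luhn step mentioned above is simple to sketch. This is the standard algorithm, not Gasoline's exact code, and the length bounds are a common assumption for card numbers:

```javascript
// Luhn check: confirms a digit string is a plausible card number before
// it gets flagged, cutting false positives from random digit runs.
function luhnValid(digits) {
  const s = digits.replace(/\D/g, ''); // strip spaces and dashes
  if (s.length < 13 || s.length > 19) return false;
  let sum = 0;
  let double = false;
  // Walk right-to-left, doubling every second digit
  for (let i = s.length - 1; i >= 0; i--) {
    let d = s.charCodeAt(i) - 48;
    if (double) {
      d *= 2;
      if (d > 9) d -= 9;
    }
    sum += d;
    double = !double;
  }
  return sum % 10 === 0;
}

console.log(luhnValid('4242 4242 4242 4242')); // true  (well-known test number)
console.log(luhnValid('4242 4242 4242 4241')); // false (last digit corrupted)
```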

observe({what: "security_audit", checks: ["pii"]})

Finds personal data flowing through your application:

  • Social Security Numbers
  • Credit card numbers (with Luhn validation — not just pattern matching)
  • Email addresses in unexpected API responses
  • Phone numbers in contexts where they shouldn’t appear

This matters for GDPR, CCPA, and HIPAA compliance. If your user list API is returning full SSNs when the frontend only needs names, your AI catches it during development.

observe({what: "security_audit", checks: ["headers"]})

Validates that your responses include critical security headers:

Header                      What It Prevents
Strict-Transport-Security   Downgrade attacks, cookie hijacking
X-Content-Type-Options      MIME sniffing attacks
X-Frame-Options             Clickjacking
Content-Security-Policy     XSS, injection attacks
Referrer-Policy             Referrer leakage to third parties
Permissions-Policy          Unauthorized browser feature access

Missing any of these? Your AI knows immediately — and can fix it.
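The core of such a check is a set difference. A minimal sketch, with the header list taken from the table above and the function name and output shape invented for illustration:

```javascript
// Sketch of a security-header presence check. Returns the required
// headers that are absent from a response.
const REQUIRED_HEADERS = [
  'strict-transport-security',
  'x-content-type-options',
  'x-frame-options',
  'content-security-policy',
  'referrer-policy',
  'permissions-policy',
];

function auditHeaders(responseHeaders) {
  // HTTP header names are case-insensitive, so normalize before comparing
  const present = new Set(
    Object.keys(responseHeaders).map(h => h.toLowerCase())
  );
  return REQUIRED_HEADERS.filter(h => !present.has(h));
}

const missing = auditHeaders({
  'Content-Type': 'text/html',
  'Strict-Transport-Security': 'max-age=63072000',
});
console.log(missing); // every required header except HSTS
```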

observe({what: "security_audit", checks: ["cookies"]})

Session cookies without HttpOnly are accessible to XSS attacks. Cookies without Secure can be intercepted over HTTP. Missing SameSite enables CSRF. Gasoline checks every cookie against every flag and rates severity based on whether it’s a session cookie.
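A sketch of what a per-cookie flag audit looks like. The severity heuristic here (session-looking names rate higher) mirrors the idea described above but is an assumption, as are the function name and output shape:

```javascript
// Sketch of a Set-Cookie flag audit. Parses one Set-Cookie header and
// reports missing flags, rating session cookies more severely.
function auditCookie(setCookieHeader) {
  const [pair, ...attrs] = setCookieHeader.split(';').map(p => p.trim());
  const name = pair.split('=')[0];
  const flags = new Set(attrs.map(a => a.split('=')[0].toLowerCase()));
  const findings = [];
  // Heuristic: names like "sessionid" or "authtoken" guard a session
  const isSession = /sess|sid|auth|token/i.test(name);
  if (!flags.has('httponly'))
    findings.push({ issue: 'missing HttpOnly', severity: isSession ? 'high' : 'medium' });
  if (!flags.has('secure'))
    findings.push({ issue: 'missing Secure', severity: isSession ? 'high' : 'medium' });
  if (!flags.has('samesite'))
    findings.push({ issue: 'missing SameSite', severity: 'medium' });
  return { name, findings };
}

const audit = auditCookie('sessionid=abc123; Path=/; Secure');
console.log(audit.findings.map(f => f.issue));
// [ 'missing HttpOnly', 'missing SameSite' ]
```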

observe({what: "security_audit", checks: ["transport"]})

Detects:

  • HTTP usage on non-localhost origins (unencrypted traffic)
  • Mixed content (HTTPS page loading HTTP resources)
  • HTTPS downgrade patterns

observe({what: "security_audit", checks: ["auth"]})

Identifies API endpoints that return PII without requiring authentication. If /api/users/123 returns a full user profile without an Authorization header, that’s a finding.

Third-party scripts are one of the largest attack surfaces in modern web applications. Every <script src="..."> from an external CDN is a trust decision.

observe({what: "third_party_audit"})

Gasoline classifies every third-party origin by risk:

  • Critical risk — scripts from suspicious domains, data exfiltration patterns
  • High risk — scripts from unknown origins, data sent to third parties with POST requests
  • Medium risk — non-essential third-party resources, suspicious TLDs (.xyz, .top, .click)
  • Low risk — fonts and images from known CDNs

It detects domain generation algorithm (DGA) patterns — high-entropy hostnames that indicate malware communication. It flags when your application sends PII-containing form data to third-party origins.
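The entropy part of DGA detection is easy to sketch. The real detector likely combines more signals; the 3.5 bits/char cutoff and the length gate below are illustrative assumptions:

```javascript
// Shannon-entropy heuristic for DGA-like hostnames (sketch only).
function entropy(str) {
  const counts = {};
  for (const ch of str) counts[ch] = (counts[ch] || 0) + 1;
  return Object.values(counts).reduce((h, n) => {
    const p = n / str.length;
    return h - p * Math.log2(p); // sum of -p*log2(p) over characters
  }, 0);
}

function looksGenerated(hostname) {
  const label = hostname.split('.')[0]; // leftmost DNS label
  // Long, high-entropy labels are characteristic of algorithmically
  // generated domains; real words repeat letters and score lower.
  return label.length >= 12 && entropy(label) > 3.5;
}

console.log(looksGenerated('cdn.jquery.com'));       // false
console.log(looksGenerated('x7k2qv9mzt4w1b.click')); // true (14 distinct chars)
```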

And it’s configurable. Specify your first-party origins and custom allow/block lists:

observe({what: "third_party_audit",
  first_party_origins: ["https://api.myapp.com"],
  custom_lists: {
    allowed: ["https://cdn.mycompany.com"],
    blocked: ["https://suspicious-tracker.xyz"]
  }})

Security isn’t just about finding issues — it’s about making sure fixes stay fixed.

// Before your deploy
configure({action: "diff_sessions", session_action: "capture", name: "before-deploy"})

// After
configure({action: "diff_sessions", session_action: "capture", name: "after-deploy"})

// Compare
configure({action: "diff_sessions",
  session_action: "compare",
  compare_a: "before-deploy",
  compare_b: "after-deploy"})

The security_diff mode specifically tracks:

  • Headers removed — did someone drop the CSP header?
  • Cookie flags removed — did HttpOnly get lost in a refactor?
  • Authentication removed — did an endpoint become public?
  • Transport downgrades — did something switch from HTTPS to HTTP?

Each change is severity-rated. A removed CSP header is high severity. A transport downgrade is critical.
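The header portion of such a diff boils down to a set difference plus a severity lookup. A sketch, with the severity table reduced to the two examples the article gives and everything else defaulting to medium (an assumption):

```javascript
// Sketch: which security headers disappeared between two captures?
// Severity ratings follow the article: removed CSP = high, losing
// HSTS (a transport downgrade vector) = critical.
const SEVERITY = {
  'content-security-policy': 'high',
  'strict-transport-security': 'critical',
};

function removedHeaders(before, after) {
  const afterSet = new Set(after.map(h => h.toLowerCase()));
  return before
    .map(h => h.toLowerCase())
    .filter(h => !afterSet.has(h))
    .map(h => ({ header: h, severity: SEVERITY[h] || 'medium' }));
}

const removed = removedHeaders(
  ['Content-Security-Policy', 'X-Frame-Options', 'Strict-Transport-Security'],
  ['X-Frame-Options']
);
console.log(removed);
// [ { header: 'content-security-policy', severity: 'high' },
//   { header: 'strict-transport-security', severity: 'critical' } ]
```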

Gasoline doesn’t just find problems — it generates the artifacts you need to fix and prevent them.

generate({format: "csp", mode: "strict"})

Gasoline observes which origins your page actually loads resources from during development and generates a CSP that allows exactly those origins — nothing more. It uses a confidence scoring system (3+ observations from 2+ pages = high confidence) to filter out extension noise and ad injection.
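The confidence rule above (3+ observations from 2+ pages) can be sketched as a filter over observed origins. The observation record shape and the single-directive output are assumptions for illustration:

```javascript
// Sketch of confidence-scored CSP generation: only origins seen at
// least 3 times across at least 2 pages make it into the policy,
// which filters out one-off extension noise and ad injection.
function buildCsp(observations) {
  // observations: [{ origin, page }, ...] collected during browsing
  const stats = new Map();
  for (const { origin, page } of observations) {
    const s = stats.get(origin) || { count: 0, pages: new Set() };
    s.count++;
    s.pages.add(page);
    stats.set(origin, s);
  }
  const trusted = [...stats]
    .filter(([, s]) => s.count >= 3 && s.pages.size >= 2)
    .map(([origin]) => origin);
  return `script-src 'self' ${trusted.join(' ')};`;
}

const csp = buildCsp([
  { origin: 'https://cdn.example.com', page: '/home' },
  { origin: 'https://cdn.example.com', page: '/home' },
  { origin: 'https://cdn.example.com', page: '/about' },
  // seen once on one page: likely extension noise, filtered out
  { origin: 'https://ads.injected.xyz', page: '/home' },
]);
console.log(csp); // script-src 'self' https://cdn.example.com;
```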

generate({format: "sri"})

Every third-party script and stylesheet gets a SHA-384 hash. If a CDN is compromised and serves modified JavaScript, the browser refuses to execute it.

The output includes ready-to-paste HTML tags:

<script src="https://cdn.example.com/lib.js"
integrity="sha384-oqVuAfXRKap7fdgcCY5uykM6+R9GqQ8K/uxy9rx7HNQlGYl1kPzQho1wx4JwY8w"
crossorigin="anonymous"></script>

Even before auditing, Gasoline protects against accidental data exposure. The redaction engine automatically scrubs sensitive data from all MCP tool responses before they reach the AI:

  • AWS keys become [REDACTED:aws-key]
  • Bearer tokens become [REDACTED:bearer-token]
  • Credit card numbers become [REDACTED:credit-card]
  • SSNs become [REDACTED:ssn]

This is a double safety net. The extension strips auth headers before data reaches the server. The server’s redaction engine catches anything else before it reaches the AI. Two layers, zero configuration.
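A minimal sketch of the redaction pass described above. The patterns are deliberately simplified and the structure is illustrative; the real engine validates matches (Luhn, JWT structure) before redacting:

```javascript
// Minimal regex redaction pass (simplified patterns, sketch only).
const PATTERNS = [
  { label: 'aws-key', re: /AKIA[0-9A-Z]{16}/g },
  { label: 'bearer-token', re: /Bearer\s+[A-Za-z0-9._~+\/-]+=*/g },
  { label: 'ssn', re: /\b\d{3}-\d{2}-\d{4}\b/g },
];

function redact(text) {
  // Apply each pattern in turn, replacing matches with a typed marker
  return PATTERNS.reduce(
    (out, { label, re }) => out.replace(re, `[REDACTED:${label}]`),
    text
  );
}

console.log(redact('Authorization: Bearer eyJhbGciOiJIUzI1NiJ9.x.y'));
// Authorization: [REDACTED:bearer-token]
console.log(redact('key=AKIAIOSFODNN7EXAMPLE ssn=123-45-6789'));
// key=[REDACTED:aws-key] ssn=[REDACTED:ssn]
```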

Here’s the workflow that makes Gasoline transformative for security:

  1. Develop normally — write code, test features
  2. AI audits continuously — security checks run against live traffic
  3. Issues found immediately — in the same terminal where you’re coding
  4. Fix in context — the AI has the code open and the finding in hand
  5. Verify the fix — re-run the audit, confirm the finding is gone
  6. Prevent regression — capture a security snapshot, compare after future changes

The entire cycle takes minutes, not months. No separate tool. No context switch. No ticket in a backlog that nobody reads.

For developers: Security becomes part of your flow, not an interruption to it. The AI catches what you’d need a security expert to find — and you fix it while the code is still fresh in your mind.

For security teams: Shift-left isn’t a buzzword anymore. Developers arrive at security review with most issues already caught and fixed. Reviews focus on architecture and design, not missing headers.

For compliance: Every audit finding is captured with timestamp, severity, and evidence. SARIF export integrates directly with GitHub Code Scanning. The audit log records every security check the AI performed.

For enterprises: Zero data egress. All security scanning happens on the developer’s machine. No credentials sent to cloud services. No browser traffic leaving the network. Localhost only, zero dependencies, open source.

Install Gasoline, open your application, and ask your AI:

“Run a full security audit of this page and tell me what you find.”

You might be surprised what’s been hiding in plain sight.

How to Debug CORS Errors with AI Using Gasoline MCP

CORS errors are the most misleading errors in web development. The browser tells you “access has been blocked” — but the actual problem could be a missing header, a wrong origin, a preflight failure, a credentials mismatch, or a server that’s simply crashing and returning a 500 without CORS headers.

Here’s how to use Gasoline MCP to let your AI assistant see the full picture and fix CORS issues in minutes instead of hours.

The browser console shows you something like:

Access to fetch at 'https://api.example.com/users' from origin 'http://localhost:3000'
has been blocked by CORS policy: No 'Access-Control-Allow-Origin' header is present
on the requested resource.

This tells you what happened but not why. Common causes:

  1. The server doesn’t send CORS headers at all — needs configuration
  2. The server sends the wrong origin — the Access-Control-Allow-Origin value doesn’t match your origin, or is * when credentials are in use
  3. The preflight OPTIONS request fails — the server doesn’t handle OPTIONS
  4. The server errors out — a 500 response won’t have CORS headers either
  5. A proxy strips headers — nginx, Cloudflare, or your reverse proxy eats the headers
  6. Credentials mode mismatch — withCredentials: true requires an explicit origin, not *

Chrome DevTools shows the failed request in the Network tab, but the response body is hidden for CORS-blocked requests. You can’t see what the server actually returned. You’re debugging blind.

With Gasoline connected, your AI can see the error, the network request details, and the response headers — everything needed to diagnose the root cause.

observe({what: "errors"})

The AI sees the CORS error message with the exact URL, origin, and which header is missing.

observe({what: "network_bodies", url: "/api/users"})

This shows the full request/response pair:

  • Request headers — the Origin header the browser sent
  • Response headers — whether Access-Control-Allow-Origin is present, and what value it has
  • Response status — is it a 200 with missing headers, or a 500 that also lacks headers?
  • Response body — the actual error payload (which Chrome hides for CORS failures)

observe({what: "network_waterfall", url: "/api/users"})

The waterfall shows if there are two requests — the preflight OPTIONS and the actual request. If the OPTIONS request fails or returns the wrong status, the browser never sends the real request.

observe({what: "timeline", include: ["network", "errors"]})

The timeline shows the sequence: did the preflight succeed? Did the main request fire? When did the error appear relative to the request? This catches timing-related CORS issues like the server sending headers on GET but not POST.

What the AI sees: Request to api.example.com, response status 200, no Access-Control-Allow-Origin header.

The fix: Add CORS headers to the server. The AI can look at your server code and add the appropriate middleware:

// Express
app.use(cors({ origin: 'http://localhost:3000' }));
// Go
w.Header().Set("Access-Control-Allow-Origin", "http://localhost:3000")
// Nginx
add_header 'Access-Control-Allow-Origin' 'http://localhost:3000';

What the AI sees: Request to /api/users, response status 500, body contains {"error": "database connection failed"}, no CORS headers.

The real problem: The server is crashing, and crash responses don’t go through the CORS middleware. The CORS error is a red herring.

This is why seeing the response body matters. Without Gasoline, you’d spend an hour debugging CORS headers when the actual issue is a database connection string.
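
A minimal sketch of that fix for an Express-style server: an error handler with the `(err, req, res, next)` arity that re-applies the CORS header before sending the 500, so the browser can actually read the error body. The handler name and allowed origin are assumptions:

```javascript
const ALLOWED_ORIGIN = 'http://localhost:3000'; // assumption: your dev origin

// Error handlers run outside the normal middleware chain, so the CORS
// header set earlier is lost — re-apply it before sending the error.
function corsAwareErrorHandler(err, req, res, next) {
  res.setHeader('Access-Control-Allow-Origin', ALLOWED_ORIGIN);
  res.status(500).json({ error: err.message });
}

// app.use(corsAwareErrorHandler); // register LAST, after all routes
```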

What the AI sees: Two requests in the waterfall — an OPTIONS request returning 404, and no follow-up request.

The fix: The server doesn’t handle OPTIONS requests for that route. Add an OPTIONS handler or configure your framework’s CORS middleware to handle preflight requests.
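
A sketch of such a preflight handler in the Node-style `(req, res, next)` signature — the origin, methods, and headers listed are placeholders for your app's real values:

```javascript
// Answer preflight OPTIONS requests before route matching can 404 them.
function handlePreflight(req, res, next) {
  if (req.method !== 'OPTIONS') return next();
  res.setHeader('Access-Control-Allow-Origin', 'http://localhost:3000');
  res.setHeader('Access-Control-Allow-Methods', 'GET, POST, PUT, DELETE');
  res.setHeader('Access-Control-Allow-Headers', 'Content-Type, Authorization');
  res.statusCode = 204; // No Content — the preflight needs headers, not a body
  res.end();
}
```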

What the AI sees: Response has Access-Control-Allow-Origin: *, request has credentials: include. Error says “wildcard cannot be used with credentials.”

The fix: Replace * with the specific origin. The AI can read the Origin header from the request and configure the server to echo it back (with a whitelist).
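
A minimal whitelist sketch (the origin list is illustrative):

```javascript
const ORIGIN_WHITELIST = ['https://app.example.com', 'http://localhost:3000'];

// Echo the request's Origin back only if it's whitelisted; with credentials,
// the browser rejects '*', so the header must name a specific origin.
function allowOrigin(requestOrigin) {
  return ORIGIN_WHITELIST.includes(requestOrigin) ? requestOrigin : null;
}
```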

What the AI sees: Server code sends CORS headers (the AI can read the source), but the response in the browser doesn’t have them.

Diagnosis: Something between the server and browser is stripping headers. The AI checks nginx configs, Cloudflare settings, or reverse proxy configuration.

Here’s what it looks like end-to-end:

You: “I’m getting a CORS error when calling the API.”

The AI:

  1. Calls observe({what: "errors"}) — sees the CORS error with URL and origin
  2. Calls observe({what: "network_bodies", url: "/api"}) — sees the actual response (a 500 with a database error)
  3. Reads the server code — finds the missing error handler that skips CORS middleware
  4. Fixes the error handler to pass through CORS middleware even on errors
  5. Calls interact({action: "refresh"}) — reloads the page
  6. Calls observe({what: "errors"}) — confirms the CORS error is gone

Total time: 2 minutes. No manual DevTools inspection. No guessing about headers. No Stack Overflow rabbit holes.

Chrome DevTools has a fundamental limitation for CORS debugging: it hides the response body for CORS-blocked requests. The Network tab shows the request was blocked, but you can’t see what the server actually returned.

This means you can’t tell the difference between:

  • A correctly configured server that’s missing one header
  • A server that’s completely crashing and returning a 500

Gasoline captures the response at the network level before CORS enforcement, so the AI sees everything — headers, body, status code. The diagnosis goes from “something is wrong with CORS” to “the server returned a 500 because the database is down, and the error handler doesn’t set CORS headers.”

Check the timeline, not just the error. CORS errors sometimes cascade — one failed preflight blocks ten subsequent requests. The timeline shows the cascade pattern so you fix the root cause, not the symptoms.

Look at both staging and production headers. CORS works in staging with * but breaks in production with credentials? The network bodies show exactly which headers each environment returns.

Watch for mixed HTTP/HTTPS. http://localhost:3000 and https://localhost:3000 are different origins. The AI’s transport security check (observe({what: "security_audit", checks: ["transport"]})) catches this mismatch.

Use error_bundles for context. observe({what: "error_bundles"}) returns the CORS error along with the correlated network request and recent actions — everything in one call instead of three.

How to Debug React and Next.js Apps with AI Using Gasoline MCP

React and Next.js applications have a unique set of debugging challenges — hydration mismatches, stale closures, useEffect dependency bugs, SSR/client divergence, and API route failures. Your AI coding assistant can fix all of these faster if it can actually see your browser.

Here’s how Gasoline MCP gives your AI the runtime context it needs to debug React and Next.js apps effectively.

What Makes React/Next.js Debugging Different


React errors are notoriously unhelpful:

Uncaught Error: Minified React error #418

Even in development mode, React errors like “Cannot update a component while rendering a different component” don’t tell you which component or what triggered the update. And Next.js adds its own layer of complexity:

  • Hydration mismatches — server HTML differs from client render
  • SSR errors — server-side code fails but the page looks fine on the client
  • API route failures — /api/* routes return 500s that the client silently swallows
  • Middleware issues — redirects and rewrites that happen before the page loads
  • Client/server boundary confusion — "use client" and "use server" scope mistakes

Your AI assistant can read your source code, but without browser data it can’t see what’s actually happening at runtime.

observe({what: "errors"})

Your AI sees every console error with the full message, stack trace, and source file location. For minified builds, Gasoline resolves source maps — so even in production, the AI sees the original component name and line number.

Most React bugs involve data:

observe({what: "error_bundles"})

Error bundles return each error with its correlated context — the network requests that happened around the same time, the user actions that preceded it, and relevant console logs. One call gives the AI the complete picture:

  • The error: TypeError: Cannot read properties of undefined (reading 'map')
  • The API call: GET /api/products → 200, but the response body was { products: null } instead of { products: [] }
  • The user action: Clicked “Load More” button

The AI immediately knows: the API returned null where the component expected an array.

For race conditions and ordering issues:

observe({what: "timeline"})

The timeline shows actions, network requests, and errors in chronological order. This reveals:

  • Components that fetch data before mounting
  • Effects that fire in unexpected order
  • Network requests that resolve after the component unmounts

Symptom: “Text content does not match server-rendered HTML” or “Hydration failed because the initial UI does not match.”

observe({what: "errors"})

The AI sees the hydration warning with the mismatched content. Common causes:

  • Using Date.now() or Math.random() during render (different on server vs client)
  • Checking window or localStorage during initial render
  • Conditional rendering based on typeof window !== 'undefined'

The AI can find the component, identify the non-deterministic code, and move it into a useEffect or behind a suppressHydrationWarning.

Symptom: A feature silently fails. No error in the UI, but the data is wrong.

observe({what: "network_bodies", url: "/api"})

The AI sees every API route call with the full request and response body. A 500 response from /api/checkout with {"error": "STRIPE_KEY is undefined"} tells the AI exactly what’s wrong — an environment variable isn’t set.

Symptom: The component re-renders endlessly, or an effect doesn’t fire when it should.

observe({what: "network_waterfall", url: "/api"})

If an effect with a missing dependency is refetching on every render, the waterfall shows dozens of identical API calls in rapid succession. The AI sees the pattern and checks the effect’s dependency array.

Symptom: “Can’t perform a React state update on an unmounted component.”

observe({what: "timeline", include: ["actions", "errors", "network"]})

The timeline shows: user navigates away → API call from the previous page resolves → state update on the now-unmounted component. The AI adds cleanup logic to the effect.
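
A framework-free sketch of that cleanup pattern using the standard AbortController API; in a React effect, `cancel` would be returned as the cleanup function. The `fetchImpl` parameter is only there to make the sketch easy to exercise outside a browser:

```javascript
// Cancel an in-flight request when the caller goes away, so a late response
// can't trigger a state update on an unmounted component.
function fetchWithCancel(url, fetchImpl = fetch) {
  const controller = new AbortController();
  const promise = fetchImpl(url, { signal: controller.signal });
  return { promise, cancel: () => controller.abort() };
}

// In a component (illustrative):
// useEffect(() => {
//   const { promise, cancel } = fetchWithCancel('/api/items');
//   promise.then(setItems).catch(() => {}); // AbortError is expected on unmount
//   return cancel; // runs when the component unmounts
// }, []);
```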

Symptom: Page transitions feel sluggish.

observe({what: "vitals"})
observe({what: "performance"})

The AI checks INP (responsiveness) and long tasks. If client-side navigation triggers heavy re-renders, the performance snapshot shows the blocking time. The AI can suggest React.memo, useMemo, code splitting, or moving work to a Web Worker.

Server components run on the server and stream HTML to the client. Errors in server components don’t always appear in the browser console.

observe({what: "network_bodies", url: "/"})

The response body for a Next.js page includes the serialized server component tree. If a server component throws, the error boundary HTML is visible in the response.

Next.js middleware runs before the page loads. If a redirect or rewrite misbehaves:

observe({what: "network_waterfall"})

The waterfall shows every request including redirects (301, 307, 308). The AI can see if middleware is redirecting to the wrong URL or creating redirect loops.

Next.js <Image> component can cause CLS if dimensions aren’t right:

observe({what: "vitals"}) // Check CLS
configure({action: "query_dom", selector: "img"}) // Check image dimensions

After adding a new dependency:

observe({what: "network_waterfall"})
observe({what: "performance"})

The network summary shows total JavaScript transfer size. If it jumped from 300KB to 800KB, the waterfall identifies which new bundles appeared.

You: “The product page is broken — it shows a blank screen after I click ‘Add to Cart’.”

The AI:

  1. Calls observe({what: "error_bundles"}) — sees a TypeError: Cannot read properties of undefined (reading 'quantity') correlated with POST /api/cart → 201 that returned {item: {id: 5}} (no quantity field)

  2. Reads the cart component — finds cartItem.quantity.toString() without null checking

  3. Checks the API route — finds the response omits quantity for new items (it defaults to 1 on the backend but isn’t serialized)

  4. Fixes both: adds quantity to the API response and adds a fallback in the component

  5. Calls interact({action: "refresh"}) then observe({what: "errors"}) — confirms zero errors

Total time: 3 minutes. No manual DevTools inspection. No reproducing the bug by clicking through the UI.

Use error_bundles as your first call. It returns errors with their network and action context in one shot — faster than calling errors, then network_bodies, then actions separately.

Check the waterfall after deploys. New React bundles, changed chunk names, and different loading order are all visible in the network waterfall. The AI spots unexpected changes immediately.

Profile page transitions. Use interact({action: "navigate", url: "/products"}) to trigger a client-side navigation. The perf_diff shows the performance impact of that navigation including any heavy re-renders.

For SSR issues, check response bodies. The HTML response for a Next.js page contains the server-rendered markup. If something is wrong on the server side, it’s visible in the network body before hydration even starts.

How to Debug WebSocket Connections in 2026

WebSocket debugging in Chrome DevTools is painful. You get a flat list of frames, no filtering, no search, no way to correlate messages with application state, and if you close the tab, everything is gone.

For real-time applications — chat, live dashboards, collaborative editors, trading platforms — you need better tools. Here’s the modern approach using AI-assisted debugging.

The Problem with DevTools WebSocket Debugging


Open Chrome DevTools, go to the Network tab, filter by WS, click on your connection, and look at the Messages tab. That’s the entire experience. Here’s what’s missing:

No filtering by message type. If your WebSocket sends 10 message types (chat, typing indicators, presence updates, notifications), you can’t filter to just one. You scroll through hundreds of messages hunting for the one you need.

No directional filtering. You can’t show only incoming or only outgoing messages without reading every row.

No correlation. When a WebSocket message causes an error, there’s no link between the Network tab and the Console tab. You’re manually matching timestamps.

No persistence. Navigate away or refresh, and the WebSocket data is gone. You can’t compare messages across page loads.

No AI access. Even if you find the problematic message, you can’t easily get it to your AI assistant. You’re back to copy-pasting.

With Gasoline MCP, your AI can observe WebSocket traffic directly, filter it, correlate it with errors, and diagnose issues without you touching DevTools.

observe({what: "websocket_status"})

The AI immediately knows:

  • How many WebSocket connections are open
  • Their URLs and states (connecting, open, closed, error)
  • Message rates per connection
  • Total messages sent and received
  • Inferred message schemas (if JSON)
observe({what: "websocket_events", direction: "incoming", last_n: 20})

The AI sees the actual message payloads, filtered to just what’s relevant. No scrolling through thousands of frames.

observe({what: "timeline", include: ["websocket", "errors"]})

The timeline shows WebSocket events and console errors chronologically. The AI sees: “The user_presence message arrived at 14:23:05.123, and a TypeError occurred at 14:23:05.125 — the presence handler is crashing.”

Your real-time dashboard stopped updating. No error in the console. The data just went stale.

You: “The dashboard stopped getting live updates.”

The AI calls observe({what: "websocket_status"}) and sees:

Connection ws-1: wss://api.example.com/live
State: closed
Close code: 1006 (abnormal closure)
Messages received: 3,847
Last message: 2 minutes ago

Close code 1006 means the connection dropped without a proper close handshake — likely a network interruption or server crash. The AI checks:

observe({what: "websocket_events", connection_id: "ws-1", last_n: 5})

The last messages were normal data frames, then nothing. No close frame from the server. The AI looks at the client-side reconnection logic and finds it has a bug — it tries to reconnect but uses the wrong URL after a server failover.
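
Correct reconnection logic re-resolves the endpoint on every attempt rather than reusing the URL captured at startup. A sketch with capped exponential backoff — `resolveUrl` and `openSocket` are assumed app-specific hooks, not Gasoline APIs:

```javascript
// Delay grows 500ms, 1s, 2s, ... capped at 30s.
function nextBackoffMs(attempt, baseMs = 500, maxMs = 30000) {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

function reconnect(resolveUrl, openSocket, attempt = 0) {
  setTimeout(async () => {
    const url = await resolveUrl(); // ask again — a failover may have moved the server
    openSocket(url, () => reconnect(resolveUrl, openSocket, attempt + 1));
  }, nextBackoffMs(attempt));
}
```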

After a backend deploy, the chat stops working. Messages send but nothing appears.

The AI calls observe({what: "websocket_events", direction: "outgoing", last_n: 5}):

{"type": "message", "payload": {"text": "hello", "room": "general"}}

Then observe({what: "websocket_events", direction: "incoming", last_n: 5}):

{"type": "error", "code": "INVALID_PAYLOAD", "message": "missing field: channel"}

The backend renamed room to channel but the frontend still sends room. The AI finds the mismatch, updates the frontend, and the chat works again.
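
One way to catch this class of bug earlier: validate outgoing payload shapes against the backend contract in development, so a renamed field fails loudly on the client instead of silently on the server. A sketch, with field names taken from this scenario:

```javascript
const REQUIRED_FIELDS = ['text', 'channel']; // assumption: current backend contract

// Throw at send time if the payload is missing a required field.
function buildChatMessage(payload) {
  for (const field of REQUIRED_FIELDS) {
    if (!(field in payload)) throw new Error(`missing field: ${field}`);
  }
  return JSON.stringify({ type: 'message', payload });
}
```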

The page slows down when connected to the WebSocket. CPU usage spikes.

observe({what: "websocket_status"})

Connection ws-2: wss://api.example.com/stream
State: open
Incoming rate: 340 msg/sec
Total messages: 48,291

340 messages per second is flooding the client. The AI checks:

observe({what: "vitals"})

INP is 890ms — the main thread is completely blocked processing messages. The AI looks at the message handler, finds it’s updating React state on every message (triggering a re-render 340 times per second), and refactors it to batch updates with requestAnimationFrame or useDeferredValue.
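
The batching refactor can be sketched like this: coalesce incoming messages and flush them once per frame instead of once per message. The `schedule` parameter is injectable for testing; in the browser it defaults to requestAnimationFrame:

```javascript
// Coalesce high-rate messages into one flush (one state update) per frame.
function createBatcher(flush, schedule = cb => requestAnimationFrame(cb)) {
  let queue = [];
  let scheduled = false;
  return function push(message) {
    queue.push(message);
    if (!scheduled) {
      scheduled = true;
      schedule(() => {
        const batch = queue;
        queue = [];
        scheduled = false;
        flush(batch); // e.g. one setState with 340 messages, not 340 setStates
      });
    }
  };
}

// Usage (illustrative): socket.onmessage = e => push(JSON.parse(e.data));
```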

WebSocket connections fail immediately after a deploy.

observe({what: "websocket_events", last_n: 10})

Shows open followed immediately by close with code 1008 (policy violation). The AI checks the server’s WebSocket authentication — the new deploy requires a different auth token format, but the client is sending the old format.

The most powerful pattern: combining WebSocket data with error tracking.

observe({what: "error_bundles"})

Error bundles include WebSocket events in the correlation window. When a WebSocket message triggers a JavaScript error, the AI sees both together:

  • Error: TypeError: Cannot read properties of undefined (reading 'user')
  • Correlated WebSocket message: {"type": "presence_update", "data": null} (arrived 50ms before the error)
  • User action: None (this was server-pushed)

The AI knows the server sent a presence_update with null data, and the handler doesn’t check for null. One fix: add a null guard in the handler. Better fix: also fix the server so it doesn’t send null presence data.
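
The client-side guard is a one-line check. A sketch with illustrative handler and field names:

```javascript
// Server-pushed payloads can arrive with null data — bail out instead of crashing.
function handlePresenceUpdate(msg, updateUser) {
  if (!msg || !msg.data || !msg.data.user) return false;
  updateUser(msg.data.user);
  return true;
}
```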

Real-time features are everywhere in 2026:

  • AI chat interfaces with streaming responses
  • Collaborative editing (Notion, Figma, Google Docs style)
  • Live dashboards and monitoring
  • Multiplayer applications
  • Real-time notifications

These applications live and die by their WebSocket connections. A dropped connection means lost messages. A format change means silent failures. A flooding server means frozen UIs.

DevTools hasn’t evolved to match. The WebSocket debugging experience in Chrome is fundamentally the same as it was in 2018. Meanwhile, applications have moved from “we have one WebSocket for notifications” to “we have five WebSocket connections handling different data streams.”

AI-assisted debugging — where the AI can filter, correlate, and diagnose WebSocket issues programmatically — is the first real advancement in WebSocket debugging in years.

  1. Install Gasoline (Quick Start)
  2. Open your real-time application
  3. Ask your AI: “Show me all active WebSocket connections and their status.”

Your AI calls observe({what: "websocket_status"}) and you’re debugging WebSockets without opening DevTools.

How to Fix Slow Web Vitals with AI Using Gasoline MCP

Your Core Web Vitals are red. LCP is 4.2 seconds. CLS is 0.35. Google Search Console is sending angry emails. Lighthouse gives you a list of suggestions, but they’re generic — “reduce unused JavaScript” doesn’t tell you which JavaScript or why it’s slow.

Here’s how to use Gasoline MCP to give your AI assistant real-time performance data, so it can identify exactly what’s wrong and fix it.

The Problem with Traditional Performance Tools

Section titled “The Problem with Traditional Performance Tools”

Lighthouse runs a synthetic test on a throttled connection. It’s useful for benchmarking but disconnected from your actual development experience:

  • It’s a snapshot, not real-time — you fix something, re-run Lighthouse, wait 30 seconds, check the score, repeat
  • Suggestions are generic — “eliminate render-blocking resources” doesn’t tell you which stylesheet is the problem
  • No before/after — you can’t easily compare metrics across changes
  • No correlation — it doesn’t connect slow performance to specific code changes or network requests

Gasoline solves all four problems.

observe({what: "vitals"})

Your AI gets the real numbers immediately:

| Metric | Value | Rating            |
| ------ | ----- | ----------------- |
| FCP    | 2.1s  | needs_improvement |
| LCP    | 4.2s  | poor              |
| CLS    | 0.35  | poor              |
| INP    | 280ms | needs_improvement |

No waiting for Lighthouse. No throttled simulation. These are the real metrics from your real browser on your real page.

observe({what: "performance"})

This returns everything — not just vitals, but the full diagnostic picture:

Navigation timing: TTFB, DomContentLoaded, Load event — shows where time is spent during page load.

Network summary by type: How many scripts, stylesheets, images, and fonts loaded. Total transfer size and decoded size per category. Your AI can immediately see “you’re loading 2.1MB of JavaScript across 47 files.”

Slowest requests: The top resources by duration. If a single API call takes 3 seconds, it shows up here.

Long tasks: JavaScript execution that blocks the main thread for more than 50ms. The count, total blocking time, and longest task. If INP is bad, this is where you find out why.
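
For intuition, here is how total blocking time falls out of long-task entries: each task blocks for whatever it runs beyond the 50ms budget. A sketch — in the page, the entries would come from a PerformanceObserver:

```javascript
// Sum the over-budget portion of each long task (duration beyond 50ms).
function totalBlockingTime(entries) {
  return entries.reduce((sum, e) => sum + Math.max(0, e.duration - 50), 0);
}

// In the browser (illustrative):
// new PerformanceObserver(list => console.log(totalBlockingTime(list.getEntries())))
//   .observe({ type: 'longtask', buffered: true });
```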

LCP measures when the main content becomes visible. Common causes of slow LCP:

High TTFB: If time_to_first_byte is over 800ms, the server is the bottleneck. The AI checks your server code, database queries, or caching configuration.

Render-blocking resources: The network waterfall shows which scripts and stylesheets load before content paints:

observe({what: "network_waterfall"})

The AI looks for CSS and JavaScript files with early start_time and long duration. These are the render-blocking resources. The fix: defer non-critical scripts, inline critical CSS, use media attributes on non-essential stylesheets.

Large hero images: If the LCP element is an image, the performance snapshot shows its transfer size. A 2MB uncompressed PNG as the hero image? The AI suggests WebP, proper sizing, and fetchpriority="high".

Late-loading content: If FCP is fast but LCP is slow, the main content loads late — maybe behind an API call or a client-side render. The timeline shows the gap:

observe({what: "timeline", include: ["network"]})

CLS measures visual stability. Things that cause layout shifts:

Images without dimensions: An <img> without width and height causes the browser to reflow when the image loads. The AI can audit your images:

configure({action: "query_dom", selector: "img"})

Dynamic content insertion: Ads, banners, or lazy-loaded content that pushes existing content down. The timeline shows when shifts happen relative to network requests.

Font loading: Web fonts that cause text to resize. The AI checks for font-display: swap or font-display: optional in your CSS.

CSS without containment: The AI can check if your dynamic containers use contain: layout or explicit dimensions.

INP measures the worst-case responsiveness to user input. If INP is high, the main thread is busy when the user interacts.

Long tasks are the smoking gun: The performance snapshot shows total blocking time and the longest task. If you have 800ms of blocking time from 12 long tasks, the AI knows exactly what to target.

Heavy event handlers: The AI can read your click and input handlers to find expensive operations (DOM manipulation, synchronous computation, large state updates) that should be deferred or moved to a Web Worker.

Third-party scripts: The network waterfall shows which third-party scripts are loading and how long their execution takes:

observe({what: "third_party_audit"})

A third-party analytics script running 200ms of JavaScript on every page load directly impacts INP.

This is where Gasoline shines. After the AI makes a change:

interact({action: "refresh"})

Gasoline automatically captures before and after performance snapshots and computes a diff. The result includes:

  • Per-metric comparison: LCP went from 4200ms to 2800ms (-33%, improved, rating: needs_improvement)
  • Resource changes: “Removed analytics-v2.js (180KB), resized bundle.js from 450KB to 320KB”
  • Verdict: “improved” — more metrics got better than worse

The AI says: “LCP improved from 4.2s to 2.8s after removing the synchronous analytics script. CLS dropped from 0.35 to 0.08 after adding image dimensions. INP is still 250ms — let me look at the long tasks.”

No re-running Lighthouse. No waiting. Instant feedback.

If INP is the remaining problem, profile the actual interactions:

interact({action: "click", selector: "text=Load More", analyze: true})

The analyze: true parameter captures before/after performance around that specific click. The AI sees exactly how much main-thread time that button click consumes.

When you’re done optimizing:

generate({format: "pr_summary"})

This produces a before/after performance summary suitable for your pull request description — showing stakeholders exactly what improved and by how much.

Here’s a real workflow condensed:

Initial vitals: LCP 5.1s, CLS 0.42, INP 380ms

AI diagnosis:

  1. Network waterfall shows 3.2MB of JavaScript across 62 requests
  2. TTFB is 1.8s — slow API call blocks server-side rendering
  3. Five images without width/height attributes cause CLS
  4. Long tasks total 1.2s of blocking time — mostly from a charting library initializing synchronously

AI fixes:

  1. Adds loading="lazy" to below-fold charts, defers non-critical scripts → JS drops to 1.4MB initial
  2. Adds Redis caching to the slow API endpoint → TTFB drops to 200ms
  3. Adds explicit dimensions to all images → CLS drops to 0.02
  4. Wraps chart initialization in requestIdleCallback → blocking time drops to 180ms

Final vitals: LCP 1.9s (good), CLS 0.02 (good), INP 150ms (good)

Total time: One conversation, about 20 minutes. Each fix was verified immediately with perf_diff.

|             | Lighthouse                  | Gasoline                             |
| ----------- | --------------------------- | ------------------------------------ |
| Speed       | 30s synthetic run per check | Real-time, instant                   |
| Comparison  | Manual before/after         | Automatic perf_diff                  |
| Diagnosis   | Generic suggestions         | Your actual bottlenecks              |
| Fix cycle   | Run → fix → re-run → check  | Fix → refresh → see diff             |
| Context     | Score and suggestions       | Full waterfall, timeline, long tasks |
| Integration | Separate tool               | Same terminal as your AI assistant   |

Lighthouse tells you your LCP is 4.2 seconds and suggests “reduce unused JavaScript.” Gasoline tells your AI that analytics-v2.js (180KB) loads synchronously in the head, blocks FCP by 800ms, and can be deferred without breaking anything.

Set budgets in .gasoline.json to catch regressions automatically:

{
  "budgets": {
    "default": {
      "lcp_ms": 2500,
      "cls": 0.1,
      "inp_ms": 200,
      "total_transfer_kb": 500
    },
    "routes": {
      "/": { "lcp_ms": 2000 },
      "/dashboard": { "lcp_ms": 3000, "total_transfer_kb": 800 }
    }
  }
}

When any metric exceeds its budget, the AI gets an alert. Regressions are caught during development, not after deploy.

  1. Install Gasoline and connect your AI tool (Quick Start)
  2. Navigate to your slowest page
  3. Ask: “What are the Web Vitals for this page, and what’s causing the worst ones?”

Your AI sees the numbers, identifies the bottlenecks, and starts fixing. Real metrics, real fixes, real-time feedback.

One Tool Replaces Four: How Gasoline MCP Eliminates Loom, DevTools, Selenium, and Playwright

Most development teams juggle at least four tools to ship a feature: Loom for demos and bug reports, Chrome DevTools for debugging, Selenium or Playwright for automated testing, and some combination of all three for QA. Each tool has its own setup, its own learning curve, and its own context switch.

Gasoline MCP replaces all four with a single Chrome extension and one MCP server. And the result isn’t just fewer tools — it’s dramatically faster cycle times.

Loom — “Let Me Show You What’s Happening”


Product managers record Loom videos to demo features. Developers record Loom videos to show bugs. QA records Loom videos to document test failures. Everyone records Loom videos because the alternative — writing a detailed description with screenshots — takes even longer.

The problem: Loom videos are static. They can’t be replayed against a new build. They can’t be edited when the flow changes. They can’t be version-controlled. And they require $12.50/user/month.

Chrome DevTools — “Let Me Check the Console”


Every debugging session starts with opening DevTools, switching between Console, Network, and Elements tabs, copying error messages, and pasting them somewhere the AI or another developer can see them.

The problem: DevTools is manual and disconnected. The AI can’t see what’s in DevTools. You’re the human bridge between the browser and your tools.

Selenium / WebDriver — “Let Me Automate This”


Automated browser testing requires WebDriver binaries, a programming language (Java, Python, JavaScript), and coded selectors that break whenever the UI changes.

The problem: High setup cost, high maintenance cost, requires developer skills. Product managers and QA without coding experience can’t use it.

Playwright — “Let Me Write a Proper Test”


Modern browser automation that’s better than Selenium but still requires JavaScript/TypeScript, an npm project, and coded selectors.

The problem: Same fundamental issue — you need code to create tests. And when tests break (they always break), you need code to fix them.

Instead of recording a video:

"Navigate to the dashboard. Add a subtitle: 'Welcome to the Q1 report.'
Click the revenue tab. Subtitle: 'Revenue is up 23% quarter over quarter.'
Click the export button. Subtitle: 'One click to export to PDF.'"

The AI navigates the application while displaying narration text at the bottom of the viewport — like closed captions. Action toasts show what’s happening (“Click: Revenue Tab”). The audience watches a live, narrated walkthrough.

Why it’s better than Loom:

  • Replayable — run the same script against tomorrow’s build
  • Editable — change one line of text, not re-record a whole video
  • Adaptive — semantic selectors survive UI redesigns
  • Versionable — store scripts in your repo, diff them in PRs
  • Free — no per-seat subscription

Instead of opening DevTools and copy-pasting:

"What browser errors do you see?"

The AI calls observe({what: "errors"}) and sees every console error with full stack traces. Then observe({what: "network_bodies", url: "/api"}) for the API response body. Then observe({what: "websocket_status"}) for WebSocket connection state. Then observe({what: "vitals"}) for performance metrics.

Why it’s better than DevTools:

  • The AI sees it directly — no human copy-paste bridge
  • Everything in one place — errors, network, WebSocket, performance, accessibility, security
  • Correlated — error_bundles returns the error with its network context and user actions
  • Persistent — data doesn’t vanish on page refresh
  • Actionable — the AI diagnoses and fixes, not just observes

Selenium → interact() + Natural Language


Instead of writing Java with WebDriver:

"Go to the registration page. Fill in 'Jane Doe' as the name,
'jane@example.com' as the email, and 'secure123' as the password.
Click Register. Verify you see the welcome message."

The AI navigates, types, clicks, and verifies — using semantic selectors (label=Name, text=Register) that survive UI changes.

Why it’s better than Selenium:

  • No code — describe the test in English
  • No setup — no WebDriver, no JDK, no project scaffolding
  • Resilient — semantic selectors adapt to redesigns
  • Anyone can use it — PMs, QA, designers, not just developers

Playwright → generate(format: “test”)


After running a natural language test, lock it in for CI:

generate({format: "test", test_name: "registration-flow",
assert_network: true, assert_no_errors: true})

Gasoline generates a complete Playwright test from the session — real selectors, network assertions, error checking. The AI explored in English; Gasoline exports for CI/CD.

Why it’s better than writing Playwright by hand:

  • Faster — describe the flow, don’t code it
  • Accurate — generated from real browser behavior, not guessed
  • Maintainable — when the test breaks, re-run in English and regenerate

The Compound Effect: Radical Cycle Time Reduction


Replacing four tools isn’t just about having fewer subscriptions. It’s about what happens when demo, debug, test, and automate are the same workflow.

  1. PM records a Loom showing the desired feature (10 minutes)
  2. Developer watches the Loom, opens DevTools, starts building (context switch)
  3. Developer debugs in DevTools, copies errors, pastes to AI, gets suggestions (context switch)
  4. Developer writes Playwright tests for the feature (30-60 minutes)
  5. QA records a Loom of a bug they found (10 minutes)
  6. Developer watches the Loom, reproduces, opens DevTools again (context switch)
  7. Developer fixes and re-runs tests (context switch)
  8. PM records another Loom for the stakeholder demo (10 minutes)

Four tools. Six context switches. Half the time spent on ceremony instead of building.

  1. PM describes the feature to the AI: “The user should be able to export the report as PDF”
  2. AI builds the feature, debugging in real time — it sees errors as they happen, fixes them, verifies with observe({what: "errors"}), checks performance with observe({what: "vitals"})
  3. AI generates a test: generate({format: "test", test_name: "pdf-export"})
  4. AI runs the demo with subtitles for the stakeholder
  5. If QA finds a bug, the AI already has the error context — observe({what: "error_bundles"}) — and fixes it in the same session
  6. AI regenerates the test if the fix changed the flow

One tool. Zero context switches. The cycle from “PM describes feature” to “tested, demo-ready feature” happens in a single conversation.

Activity              | 4-Tool Cycle                   | Gasoline Cycle
Feature demo (PM)     | 10 min Loom recording          | 0 — AI demos with subtitles
Debugging             | 20 min (DevTools + copy-paste) | 2 min (AI observes directly)
Test creation         | 30-60 min (Playwright)         | 2 min (generate from session)
Bug report            | 10 min Loom + reproduce        | 1 min (AI already has context)
Bug fix verification  | 5 min (re-run tests)           | 30 sec (refresh + observe)
Stakeholder demo      | 10 min (new Loom)              | 1 min (replay demo script)
Total                 | 85-115 min                     | ~7 min

That’s not an incremental improvement. It’s an order of magnitude.

Product velocity isn’t about how fast you type. It’s about how fast you can go from “idea” to “shipped and verified.” Every context switch adds latency. Every tool boundary adds friction. Every manual step adds error.

When demo, debug, test, and automate collapse into a single AI conversation:

  • Feedback loops tighten — the AI sees the result of every change in real time
  • Iteration cost drops — trying a different approach is a sentence, not a sprint
  • Quality increases — tests are generated from real behavior, not written from memory
  • Everyone participates — PMs can demo, test, and file bugs without developer involvement

This is what AI-native development looks like. Not “AI helps you write code faster” — but “AI collapses the entire build-debug-test-demo cycle into minutes.”

The one remaining advantage Loom has over Gasoline is shareability — you can send a Loom link to anyone with a browser. Gasoline’s demo scripts require the AI to replay them.

The fix: tab recording. Chrome’s tabCapture API can record the active tab as video while the AI runs a demo script. Subtitles and action toasts are already rendered in the page, so they’d be captured automatically. The output: a narrated demo video, generated from a replayable script, with burned-in captions. No Loom subscription. No manual recording. No re-takes.

That feature is on the roadmap. When it ships, the Loom replacement is complete.

You don’t need four tools. You need one browser extension, one MCP server, and an AI that can see your browser.

Loom → Gasoline subtitles + demo scripts (+ tab recording, coming soon)
Chrome DevTools → Gasoline observe()
Selenium → Gasoline interact() + natural language
Playwright → Gasoline generate(format: "test")

One install. Zero subscriptions. Faster than all four combined.

Get started →

What Is MCP? The Model Context Protocol Explained for Web Developers

MCP — the Model Context Protocol — is the USB-C of AI tools. It’s a standard that lets AI assistants plug into external data sources and capabilities without custom integrations. If you’ve ever wished your AI coding assistant could see your browser, read your database, or check your CI pipeline, MCP is how that works.

Here’s what MCP means for web developers and why it changes how you build software.

AI coding assistants are powerful but blind. They can read your source code, but they can’t see:

  • The runtime error in your browser console
  • The 500 response from your API
  • The layout shift that happens after your component mounts
  • The WebSocket connection that silently drops
  • The third-party script that’s loading slowly

Without this context, every debugging session starts with you describing the problem to the AI instead of the AI observing it directly. You become a human copy-paste bridge between your browser and your terminal.

MCP eliminates that bridge.

MCP is a JSON-RPC 2.0 protocol with a simple contract:

  1. Servers expose tools (functions the AI can call) and resources (data the AI can read)
  2. Clients (AI assistants like Claude Code, Cursor, Windsurf) discover and invoke those tools
  3. Transport is flexible — stdio pipes, HTTP, or any bidirectional channel

A typical MCP server might expose tools like:

observe({what: "errors"}) → returns browser console errors
generate({format: "test"}) → generates a Playwright test
configure({action: "health"}) → returns server status
interact({action: "click", selector: "text=Submit"}) → clicks a button

The AI assistant discovers what tools are available, reads their descriptions, and calls them as needed during a conversation. No custom plugin architecture. No vendor-specific API. Just a protocol.
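On the wire, a tool invocation is a plain JSON-RPC 2.0 request. Here is a sketch of what calling Gasoline’s observe tool might look like under MCP’s tools/call convention — the message shape follows the MCP spec, but treat the exact response contents as illustrative:

```json
{
  "jsonrpc": "2.0",
  "id": 1,
  "method": "tools/call",
  "params": {
    "name": "observe",
    "arguments": { "what": "errors" }
  }
}
```

A server replies with a matching JSON-RPC result, typically a content array the AI can read, e.g. `{"jsonrpc": "2.0", "id": 1, "result": {"content": [{"type": "text", "text": "..."}]}}`.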

Before MCP, debugging with AI looked like this:

You: “I’m getting an error when I submit the form.”
AI: “What error? Can you paste the console output?”
You: [switches to browser, opens DevTools, copies error, pastes]
AI: “Can you also show me the network request?”
You: [switches to Network tab, finds request, copies, pastes]

With an MCP server like Gasoline connected:

You: “I’m getting an error when I submit the form.”
AI: [calls observe({what: "errors"})] “I can see the TypeError. The API returned a 422 because the email field is missing from the request body. Let me check the form handler…”

The AI skips the back-and-forth and goes straight to diagnosing.

MCP tools compose naturally. An AI assistant with a browser MCP server and a filesystem MCP server can:

  1. Observe a runtime error in the browser
  2. Read the relevant source file
  3. Edit the code to fix the bug
  4. Refresh the browser
  5. Verify the error is gone

That’s a complete debugging loop without human intervention beyond the initial request.
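The loop above can be sketched in a few lines. The helpers here are stubs invented for illustration, not the real MCP client API:

```javascript
// Fake browser state standing in for what a browser MCP server would expose.
function makeFakeBrowser() {
  return { errors: ["TypeError: user is undefined"], fixed: false };
}

// 1. Observe runtime errors (stub for observe({what: "errors"})).
function observeErrors(browser) {
  return browser.errors;
}

// 2-3. Read the relevant source and edit the code (stubbed as a flag).
function applyFix(browser) {
  browser.fixed = true;
}

// 4. Refresh the browser so the fix takes effect.
function refresh(browser) {
  if (browser.fixed) browser.errors = [];
}

// 5. Verify the error is gone: the complete loop.
function debugLoop(browser) {
  if (observeErrors(browser).length === 0) return "no errors";
  applyFix(browser);
  refresh(browser);
  return observeErrors(browser).length === 0 ? "fixed" : "still failing";
}
```

Running `debugLoop(makeFakeBrowser())` walks all five steps and ends with the error list empty — the same shape a real assistant follows, just with actual tool calls in place of the stubs.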

Because MCP is a standard protocol, the same server works with every compatible client:

AI Tool            | MCP Support
Claude Code        | Built-in
Cursor             | Built-in
Windsurf           | Built-in
Claude Desktop     | Built-in
Zed                | Built-in
VS Code + Continue | Plugin

You configure the server once. Every AI tool that speaks MCP can use it.

MCP servers exist for many data sources:

Category   | Examples
Browser    | Gasoline (real-time telemetry, browser control)
Filesystem | Read, write, search files
Databases  | PostgreSQL, SQLite, MongoDB
APIs       | GitHub, Slack, Jira, Linear
DevOps     | Docker, Kubernetes, CI/CD
Search     | Brave Search, web fetch

The power comes from combining them. A browser MCP server plus a GitHub MCP server means your AI can observe a bug, fix it, and open a PR — all in one conversation.

Not all browser MCP servers are equal. The critical capabilities for web development:

The server should capture browser state as it happens — console logs, network errors, exceptions, WebSocket events — not just static snapshots. When you’re debugging a race condition, you need the sequence of events, not a point-in-time dump.

Observation alone isn’t enough. The AI needs to navigate, click, type, and interact with the page. Otherwise it’s reading but not testing. Semantic selectors (text=Submit, label=Email) are more resilient than CSS selectors that break with every redesign.
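A toy illustration of why semantic selectors hold up better (the objects and matchers here are invented for the example, not Gasoline’s implementation): after a redesign renames a button’s id, an id-based selector breaks while a text-based one keeps working.

```javascript
// The same Submit button before and after a redesign renames its id.
const before = { id: "submit-btn",    text: "Submit" };
const after  = { id: "submit-btn-v2", text: "Submit" };

// Brittle: tied to markup that changes with every redesign.
const matchById = (el) => el.id === "submit-btn";
// Semantic: tied to what the user sees, which rarely changes.
const matchByText = (el) => el.text === "Submit";

matchById(before);  // true
matchById(after);   // false — the test breaks on redesign
matchByText(after); // true — still finds the button
```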

Captured session data should translate into useful outputs: Playwright tests, reproduction scripts, accessibility reports, performance summaries. The AI has the data — let it produce the artifacts.

A browser MCP server sees everything — network traffic, form inputs, cookies. It must:

  • Strip credentials before storing or transmitting data
  • Bind to localhost only (no network exposure)
  • Minimize permissions (no broad host access)
  • Keep all data on the developer’s machine

Web Vitals, resource timing, long tasks, layout shifts — performance data should flow alongside error data. The AI shouldn’t need a separate tool to check if the page is fast.

If you want to add browser observability to your AI workflow:

git clone https://github.com/brennhill/gasoline-mcp-ai-devtools.git

Load the extension/ folder as an unpacked Chrome extension.

Add to your MCP config (example for Claude Code’s .mcp.json):

{
  "mcpServers": {
    "gasoline": {
      "command": "npx",
      "args": ["-y", "gasoline-mcp"]
    }
  }
}

Open your app, restart your AI tool, and ask:

“What browser errors do you see?”

The AI calls observe({what: "errors"}), gets the real-time error list, and starts diagnosing. No copy-paste. No screenshots. No description of the problem. The AI sees it directly.
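The payload that comes back might look something like this — field names here are illustrative, not Gasoline’s actual schema:

```json
{
  "errors": [
    {
      "type": "console.error",
      "message": "TypeError: Cannot read properties of undefined (reading 'map')",
      "source": "app.js:42",
      "timestamp": "2025-01-15T10:32:07Z"
    }
  ]
}
```

Whatever the exact shape, the point is the same: structured, real-time error data the AI can reason over without you relaying it.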

MCP is still early. The protocol is evolving, new servers appear weekly, and AI tools are deepening their integration. But the direction is clear: AI assistants are becoming aware of their environment, not just their context window.

For web developers, this means the feedback loop between writing code and seeing results gets tighter. The AI sees the browser. The AI sees the error. The AI sees the fix work. All in real time.

That’s what MCP enables. And it’s just getting started.