Daily: Ship to prod: v5.14 — Enterprise on-prem bullet + Open Query Agent footer link
daily, automation, ai, marketing, api
46 commits
Improvements:
- Ship to prod: v5.14 — Enterprise on-prem bullet + Open Query Agent footer link
- Ship to prod: v5.13 — homepage: Cloud or On-Premise feature card
- deps(mcp-server): Bump @types/node from 25.9.0 to 25.9.1 in /mcp-server
- deps(runner): Bump @types/node from 25.9.0 to 25.9.1 in /runner
v5.13 — Cloud or On-Premise on the homepage
feature, homepage, deployment, enterprise
A new Features card highlights that bugAgent runs on a managed cloud or installs inside your own VPC, air-gapped network, or compliance perimeter: same product, same MCP-native API, hosted wherever your security and data-sovereignty requirements need it.
## Cloud or On-Premise — now front and center
bugAgent deploys two ways:
- **Managed cloud** — sign in and start using the dashboard, MCP server, automation runner, and Chrome extension immediately. Most teams pick this.
- **On-premise** — install the full stack inside your own VPC, air-gapped network, or compliance perimeter. The QA brain stays on your side of the firewall, which matters when your security, regulatory, or data-sovereignty requirements demand it.
The product, MCP-native API surface, and agent interfaces are identical across both deployments — the choice is purely about where the data lives. Reach out to sales for on-premise sizing and a deployment-fit conversation.
v5.11 — QA Brain feature and homepage iPhone polish
feature, homepage, qa-brain, mobile-polish
New QA Brain feature card explains how bugAgent connects to every tool you already use to build a vendor-neutral knowledge base your AI agents can reuse. Also fixed two missing icons (Code Review, Exploratory AI) that showed as dark squares, and cleaned up a redundant Live label on the homepage.
## QA Brain — the tool-agnostic memory layer
The homepage Features grid now leads with a dedicated **QA Brain** card. It captures the positioning the hero already centers on: bugAgent connects to every QA-adjacent tool your team already uses — Jira, Linear, GitHub, Slack, Cypress, Playwright, Postman, Figma, Sentry, BrowserStack, your CI — and continuously extracts the QA-relevant knowledge into a unified, vendor-neutral base **you own**. AI agents reuse that state instead of rediscovering your product on every prompt, so they always know the true context of what you ship and you save tokens and time on every run.
## Homepage polish
A few visual cleanups that were most obvious on iPhone Pro Max:
- The **Code Review** and **Exploratory AI** feature cards were missing their icons (showing as dark squares). Both now render proper icons.
- Removed a redundant green "Live" label that sat above the "Real humans. Real testing." heading — the heading speaks for itself.
v5.10 — Attach Word and Excel files to bug reports
bug-fix, chrome-extension, bug-reports, attachments
You can now attach .doc/.docx/.xls/.xlsx files to bug reports from both the Chrome extension and the dashboard — previously they were either hidden in the file picker or rejected with an "Unsupported file type" error.
## Word and Excel attachments
Bug reports now accept Word (.doc, .docx) and Excel (.xls, .xlsx) attachments through both the dashboard composer and the bugAgent CoPilot Chrome extension.
Previously these were hidden in the file picker and — if you drag-dropped them — rejected at the server with `Unsupported file type: application/vnd.openxmlformats-…`. Fixed in both the upload allowlist and the file-picker UI so the surfaces agree.
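One way to keep the picker and the server in agreement is to derive both from a single table. The sketch below is an illustration only: the function and constant names are hypothetical, and the MIME types are the standard registrations for Word and Excel files, not values taken from bugAgent's code.

```typescript
// Hypothetical shared allowlist: extension -> standard MIME type.
const OFFICE_MIME_BY_EXTENSION: Record<string, string> = {
  ".doc": "application/msword",
  ".docx": "application/vnd.openxmlformats-officedocument.wordprocessingml.document",
  ".xls": "application/vnd.ms-excel",
  ".xlsx": "application/vnd.openxmlformats-officedocument.spreadsheetml.sheet",
};

// The server-side set is derived from the same table the picker uses.
const ALLOWED_MIME_TYPES = new Set(Object.values(OFFICE_MIME_BY_EXTENSION));

// Value for the file input's `accept` attribute.
function pickerAcceptAttribute(): string {
  return Object.keys(OFFICE_MIME_BY_EXTENSION).join(",");
}

// Server-side check that mirrors the picker.
function isAllowedUpload(mimeType: string): boolean {
  return ALLOWED_MIME_TYPES.has(mimeType.toLowerCase());
}
```

Because both surfaces read one table, adding a format in one place cannot leave the other behind.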
Available now on the dashboard. Chrome extension v1.7.13 brings the matching file-picker change — it'll auto-update from the Chrome Web Store within a day or two.
v5.09 — Chrome extension: Submit button now hides after a successful report
bug-fix, chrome-extension, bug-reports
Bug reports submitted from the bugAgent CoPilot Chrome extension no longer let you accidentally re-submit the same report — the form properly collapses to just the success card and the "File Another" button.
## Submit button now hides after a successful bug report
When you submit a bug from the bugAgent CoPilot Chrome extension, the form now properly collapses to just the success card — title, fields, attachments, and the Submit button all disappear, leaving only the "Bug report submitted!" confirmation with the new ticket ID and a clear "File Another" button.
Previously the form stayed open right above the success card, so it was easy to keep clicking the still-active Submit button and file the same bug 2–3 times before noticing the "File Another" button further down. That's fixed.
Available in extension v1.7.12 — it'll auto-update from the Chrome Web Store within a day or two.
Status page now updates every 3 minutes (and reports the truth)
reliability, monitoring, status-page
status.bugagent.com moved to Better Stack — checks run every 3 minutes (the old setup was throttled by its provider to roughly hourly), and previously-misconfigured Supabase monitors are correctly reporting again.
## A more honest status page
The public system-status page at https://status.bugagent.com/ is now backed by Better Stack instead of the previous open-source setup. Two things you'll notice:
**It actually updates frequently.** Checks now run every 3 minutes. The old page promised "updates every 5 minutes" but its checker was being throttled by its underlying provider, so in practice it updated closer to once an hour — which made it look stuck during real incidents (e.g. when a service had already recovered but the page still said "down").
**The Supabase monitors are honest now.** Two of the six monitors (Supabase Database, Supabase Auth) had been falsely reporting "down" since April 1 because the URLs they were probing had quietly tightened their auth requirements upstream. Both now point at endpoints they can actually reach, and both are correctly green.
The URL is the same — https://status.bugagent.com — and incident notifications continue to flow through the same channels.
Bug fixes:
- fix(extension): surface bug-report attachment upload failures (TEST-345)
Improvements:
- Ship to prod: v5.06 — Chrome extension showcase + landing page
- Ship to prod: v5.05 — homepage: The QA Brain for your AI stack
- Ship to prod: v5.04 — fix DOM-snapshot uploads (TEST-337 round 2)
- Ship to prod: v5.03 — project leakage sweep (TEST-263/264/265)
- chore: add `npm run check` (astro check) + fix the build/typecheck note
v5.04 — DOM-snapshot attachments from the Chrome extension now save
bug-fix, chrome-extension, bug-reports
Bug reports submitted via the bugAgent CoPilot extension with the DOM snapshot toggle on will now include the snapshot file, instead of dropping it silently.
## DOM-snapshot attachments now save
When you submit a bug report through the bugAgent CoPilot Chrome extension with the **Include DOM snapshot** toggle on, the sanitized HTML snapshot now lands on the report alongside your other attachments. Previously the server's upload allowlist did not accept the snapshot's file type, so the file was silently dropped — the rest of your attachments saved fine, only the DOM snapshot was lost.
No action needed — the fix is server-side and applies to all new submissions. The extension stays at v1.7.11; no reload required.
v5.03 — Project data no longer leaks across Security, Code Review, and Agent Queue
bug-fix, data-integrity, dashboard
Three project-scoped pages now properly limit their data to the project you are viewing, fixing leakage that was most visible when comparing two projects side-by-side.
## Project data no longer leaks between projects
The Security, Code Review, and Agent Queue screens now properly scope their data to the project you are viewing. Previously, opening one project in one tab and another project in a second tab — most visibly in Chrome's split view — could show the same data on both sides:
- **Security** — the "Recent Runs" section now only shows runs from the current project's scans.
- **Code Review** — recent reviews are filtered to the current project's repository.
- **Agent Queue** — for projects other than the agent's dogfood queue, the page now shows a clear empty state with a link to the bugagent queue instead of leaking that queue's counts into another project's namespace.
No action needed — the change is server-side and applies to all new page loads.
v5.02 — Bug report Source field shows the real submission channel
bug-fix, bug-reports
The Source field on a bug report now correctly reflects how the report was filed, instead of mislabelling extension and Quick Submit reports as "Human".
## Bug report Source now shows the real submission channel
The Source field on a bug report's detail page now correctly reflects how the report was filed. Reports submitted through the bugAgent CoPilot Chrome extension previously showed "Human" on the detail page even though they were filed via the extension — they now show "Chrome Extension". Quick Submit reports likewise show "Quick Submit".
The analytics "By Source" chart was already accurate; this aligns the report detail page with it. The change is display-only — no report data was affected.
v5.01 — Chrome extension attachments now save
bug-fix, chrome-extension, bug-reports
Fixed a bug where screenshots, files, and diagnostic data submitted via the CoPilot Chrome extension could be silently dropped from the bug report.
## Chrome extension attachments now save
Bug reports submitted through the bugAgent CoPilot Chrome extension now reliably keep their attachments. Previously, when you submitted a report with a screenshot, file attachment, console output, failed-request log, or DOM snapshot, the report itself was created but the attached evidence could be silently dropped — and the extension still showed a success message.
This affected reports submitted while not signed in to the dashboard in the same browser. The upload endpoint has been corrected so attachments and diagnostic data are saved regardless of how you are authenticated.
No action needed — the fix is server-side and applies to all new reports.
Bug reports and notes keep your line breaks
bug-reports, notes
Line breaks you type into a bug report or note are now preserved when it renders — no more run-on text.
## Line breaks preserved
Bug-report descriptions and session notes render as Markdown, which was collapsing single line breaks into spaces — so line-structured content (like the CoPilot extension's element details) showed up as run-on text.
Single line breaks are now kept as written. This is a render-time fix, so existing reports and notes display correctly too — nothing needed to be re-submitted.
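The entry does not say how the render-time fix works. As a rough sketch under that caveat, one common approach is to turn each single newline into a Markdown hard break (two trailing spaces) before rendering, leaving blank-line paragraph breaks alone. The function name is hypothetical, and this simple version does not special-case fenced code blocks.

```typescript
// Convert single newlines into Markdown hard breaks ("  \n").
// Blank-line paragraph separators ("\n\n") are left untouched.
function preserveSingleLineBreaks(markdown: string): string {
  return markdown.replace(/([^\n])\n(?!\n)/g, "$1  \n");
}
```

Some Markdown renderers expose an equivalent option, which would usually be preferable to preprocessing the text.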
More reliable error monitoring for the Chrome extension
chrome-extension
Failed actions in the CoPilot extension are now always captured in our monitoring, so regressions get caught and fixed faster.
## Extension error reporting
The CoPilot extension's error telemetry was filtering too aggressively — it muted routine background-sync noise (intended) but also silently dropped failures of explicit actions like submitting a bug report. Those now always surface in our monitoring, so a regression is noticed and fixed instead of going unseen.
Chrome extension submits bug reports without a dashboard login
chrome-extension, bug-fix
The CoPilot extension now authenticates on its own — you no longer need to be signed into the dashboard in the same browser.
## Standalone extension auth
The bugAgent CoPilot Chrome extension now authenticates entirely on its own token. Previously, submitting a bug report could fail with an "Authentication required" error unless you were also signed into the dashboard (app.bugagent.com) in the same browser.
Sign into the extension and it just works — no dashboard tab needed.
New MCP tool: read a bug report's comment thread
mcp, bug-reports
The MCP server can now list every comment on a bug report — author, content, and threaded replies — in one call.
## list_comments
The bugAgent MCP server now exposes a `list_comments` tool. It returns a bug report's full comment thread — oldest first, with each comment's author, content, threaded-reply links, and timestamps.
Comments are not part of `get_bug_report`, so `list_comments` is now the way for Claude, Cursor, CI agents, and other MCP clients to read a ticket's discussion before acting on it.
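For MCP clients that build requests by hand, a call to the tool might look like the sketch below. The `tools/call` envelope follows the MCP JSON-RPC convention; the `report_id` argument name is an assumption made for illustration, so check the tool's published input schema for the real parameter.

```typescript
interface ToolCallRequest {
  jsonrpc: "2.0";
  id: number;
  method: "tools/call";
  params: { name: string; arguments: Record<string, unknown> };
}

// `report_id` is a hypothetical argument name, not confirmed by the changelog.
function buildListCommentsCall(id: number, reportId: string): ToolCallRequest {
  return {
    jsonrpc: "2.0",
    id,
    method: "tools/call",
    params: { name: "list_comments", arguments: { report_id: reportId } },
  };
}
```

An agent would typically issue this right after `get_bug_report`, so it has both the ticket body and the discussion before acting.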
Daily: Ship to prod: v4.98 — list_comments MCP tool
Improvements:
- Ship to prod: v4.98 — list_comments MCP tool
- docs: compact CLAUDE.md and add review-ticket skill
- deps(mcp-server): bump tsx from 4.22.0 to 4.22.2 in /mcp-server
- deps(runner): bump @supabase/supabase-js in /runner
- deps(runner): bump @types/node from 25.8.0 to 25.9.0 in /runner
Security: dependency updates
security, dependencies
Routine security maintenance — patched a moderate-severity dependency advisory in the MCP server.
## Security maintenance
We updated dependencies in the MCP server to clear a moderate-severity advisory. No action is required from users.
- **brace-expansion** updated to 5.0.6, resolving a denial-of-service advisory (GHSA-jxxr-4gwj-5jf2).
- **@sentry/node** updated to 10.53.1.
Shipped in v4.88.
Daily: Ship to prod: v4.87 — TEST-315 fix project switch on Add Suite + sibling redirects
daily, ai
2 commits
Improvements:
- Ship to prod: v4.87 — TEST-315 fix project switch on Add Suite + sibling redirects
Project context stays put when creating suites and runs
test-cases, bug-fix
Creating a test suite, starting a run, or adding cases no longer risks switching you into a different project.
## Project context fixes
A few actions on the Test Cases page could land you in a different project than the one you were working in. Creating a test suite, starting a run, or using the assign-run flow would sometimes switch your active project if you had another browser tab open on a different project.
These flows now carry your project context explicitly, so your project stays put.
We also fixed the "Add Cases" picker on a suite, which could list test cases from other projects in your workspace — it now shows only the suite's own project.
Daily: docs: note Layer C decommission in the Supabase-clients section
Bug fixes:
- fix(extension): treat mid-body fetch drops as expected network blips
Improvements:
- docs: note Layer C decommission in the Supabase-clients section
- Ship to prod: v4.86 — decommission Singapore region (Layer C)
- chore: stop tracking chrome-extension/dist build artifact
- chore: stop tracking runner/dist build artifact
- deps(runner): bump browserstack-node-sdk in /runner (#315)
Dependency maintenance + security patch (v4.84)
security, dependencies, maintenance
Merged seven Dependabot updates and patched a high-severity transitive dependency.
## Dependency maintenance
Merged seven Dependabot updates across the workspaces:
- **astro** 6.3.2 → 6.3.3 (website) — includes an upstream reflected-XSS fix
- **isomorphic-dompurify** 3.12.0 → 3.13.0 (dashboard)
- **tsx** 4.21.0 → 4.22.0 (runner & mcp-server)
- **@types/node** 25.7.0 → 25.8.0 (runner & mcp-server)
## Security patch
Bumped **devalue** 5.6.4 → 5.8.1 in the website and dashboard, resolving a high-severity advisory (GHSA-77vg-94rm-hx3p) — a denial-of-service via sparse-array deserialization. devalue is a transitive dependency of Astro; the patched release is in-range, so this was a lockfile-only update.
All workspaces now report zero critical/high dependency vulnerabilities.
v4.81 — Layer B / Layer C interlock: compute-region guard in pickReadRegion
Fixes a latency regression from v4.76-v4.80 where SEA-user reads were routed cross-Pacific to the Singapore replica even though Railway compute was still in us-east4. The replica routing now engages only when compute and replica are colocated (Layer C state). Until Layer C lands, all reads route to primary — same effect as flag-off, but with the wiring in place to activate automatically when the Singapore pod ships.
## The latency regression
v4.76-v4.80 routed reads to the Singapore replica whenever the user was in a SEA country, without considering where the server pod itself was running. With Railway compute still in us-east4 (Virginia), every replica read became cross-Pacific from the wrong end: us-east4 → Singapore → us-east4. For a 5-query SSR page like /dashboard/reports, that added ~1.2 seconds per page load for SEA users.
## The fix
Layer B routing now requires compute and replica to be in the same region. `pickReadRegion` reads Railway's `RAILWAY_REGION` env var and returns `primary` whenever compute is NOT colocated with the replica. With Railway in us-east4 today, every read routes to primary regardless of user country — same effect as turning the feature flag off, but with the routing wired up to activate automatically when Layer C (Singapore Railway pod) ships.
## Why this design
Layer B + Layer C is a "both halves shipped" contract. The interlock expresses that in code so a future operator can't inadvertently activate Layer B alone again. When the Singapore pod ships, its `RAILWAY_REGION=ap-southeast-1` env var (Railway-provided automatically) flips on the replica routing for that pod's traffic without any other change.
## Tests
5 new cases in `pickReadRegion`'s "compute-region guard" describe block lock the truth table: us-east4 + SEA → primary (the bug); ap-southeast-1 + SEA → replica-sg (Layer C state); ap-southeast-1 + US → primary (VPN edge); null/empty compute → primary (defensive). 703/703 passing.
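The truth table above can be sketched as a small pure function. This is an approximation, not the shipped code: the `computeRegion` field name, the order of the checks, and the abbreviated country list are assumptions, while the region strings and the `replica-sg` label come from the entry itself.

```typescript
type ReadRegion = "primary" | "replica-sg";

interface ReadRegionInput {
  enabled: boolean;                          // feature flag
  country: string | null;                    // e.g. from Cloudflare's cf-ipcountry
  readAfterWritePin: boolean;                // set briefly after a write
  computeRegion: string | null | undefined;  // value of RAILWAY_REGION (assumed field name)
}

// Illustrative subset only; the real SEA list is broader.
const SEA_COUNTRIES = new Set(["SG", "MY", "ID", "TH", "VN", "PH"]);
const REPLICA_COMPUTE_REGION = "ap-southeast-1";

function pickReadRegion(input: ReadRegionInput): ReadRegion {
  if (!input.enabled) return "primary";
  if (input.readAfterWritePin) return "primary";
  // Compute-region guard: never read from the replica unless the pod is colocated with it.
  if (input.computeRegion !== REPLICA_COMPUTE_REGION) return "primary";
  if (input.country !== null && SEA_COUNTRIES.has(input.country.toUpperCase())) {
    return "replica-sg";
  }
  return "primary";
}
```

Placing the guard before the country check means a missing or empty `RAILWAY_REGION` falls through to primary, which matches the defensive row of the table.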
v4.80 — Full SEA coverage: test-cases, test-runs, and notes APIs now route through the Singapore read replica
Largest Layer B coverage push to date. Every high-volume authed list endpoint in the dashboard now routes SEA reads to the Singapore replica; every corresponding write endpoint sets the 5-second read-after-write cookie pin for write-then-list freshness.
## What changed
Four more GET endpoints now use `getReadClient(locals)` instead of `getServiceClient()`:
- `/api/test-cases` (list)
- `/api/test-runs` (list + detail)
- `/api/notes` (list)
And eleven more write endpoints set `setReadAfterWritePin(cookies)` so a user who just created, edited, or deleted data sees their write on the next list reload regardless of replica lag.
## Combined with prior ships
- **v4.68**: Cache-Control headers (Layer A code)
- **v4.75**: Layer B groundwork (flag-OFF helpers)
- **v4.76**: Layer B activation code + reports endpoints
- **v4.79**: Agent Queue page migration
- **v4.80** (this ship): test-cases / test-runs / notes API migrations
Every high-volume authed list endpoint is now on the Layer B routing path. SEA users hitting any of these get replica reads in Singapore instead of round-tripping to us-east-1.
## What's still primary-only
Detail pages, write endpoints with no list-reload pattern (result-save during run execution), attachment / link CRUD, settings, admin. All by design; mechanical to migrate if SEA usage data warrants.
v4.79 — Agent Queue page now routes through Layer B read replica
sea-expansion, read-replica, small-ship, infrastructure
The /dashboard/agent-queue page now uses getReadClient + withTimeout — joining /dashboard/reports as the second page routed through the Singapore read replica for SEA users. All three of its Supabase queries gain per-query timeouts as well.
## What changed
`/dashboard/agent-queue` migrated to the Layer B read-replica routing path. For SEA-region traffic (Cloudflare `cf-ipcountry` ∈ SEA list), reads go to the Singapore replica; everyone else stays on the US primary. Read-after-write cookie pins continue to override geo routing for ~5s after a write.
The page also gains the v4.77 per-query timeout treatment that it didn't pick up earlier. Three new bounded awaits: 8s on the main `agent_priority_queue` list, 3s each on the assignee profile + team-prefix lookups. On timeout the existing error-card UI surfaces "Query timed out — refresh to retry" instead of hanging.
## Why
Second-highest-traffic `getServiceClient` page in the dashboard (after `/dashboard/reports`). The agent loop reads this view via the MCP `pick_next_bug` tool, and human QA refreshes it during ticket triage. Routing it through the replica narrows the SEA latency gap on the queue-walking workflow.
## What's still primary-only
`/dashboard/test-cases`, `/dashboard/notes`, `/dashboard/test-cases/runs/[id]` — these use `createSupabaseServer` (JWT-bound, RLS-aware) which isn't a drop-in for `getReadClient`. Extending Layer B to those needs a new helper (`createReadSupabaseServer`) and is the natural next code ship once we have soak data on the migrated pages.
v4.78: v4.77 inadvertently broke /dashboard/reports and /dashboard/test-cases because the new withTimeout helper called .finally() on its input, which is undefined on Supabase's PostgrestFilterBuilder (a thenable, not a real Promise). The one-line fix wraps the input in Promise.resolve() so .finally works on any awaitable. Two regression tests were added.
## What happened
v4.77 added a `withTimeout(promise, ms, label)` helper to defend SSR pages against slow Supabase queries. The helper called `promise.finally(...)` for timer cleanup. My unit tests only passed real `Promise.resolve(...)` instances — none tested a *thenable*, which is what Supabase's query builder actually returns. In prod, every SSR query wrapped in `withTimeout` crashed with `TypeError: promise.finally is not a function`.
## What v4.78 does
One-line fix: `Promise.resolve(promise).finally(...)`. `Promise.resolve` adopts any thenable into a real Promise, so `.finally` is defined regardless of input type. Plus two new regression tests pinning the thenable contract.
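The difference between a thenable and a real Promise is easy to reproduce in isolation. The object below is a stand-in written for this illustration, not Supabase's query builder: it implements only `then`, so `await` works on it but `.finally` does not exist until `Promise.resolve` adopts it.

```typescript
// A minimal thenable: it has `then` and nothing else.
function makeThenable<T>(value: T): PromiseLike<T> {
  const inner = Promise.resolve(value);
  return { then: (onFulfilled, onRejected) => inner.then(onFulfilled, onRejected) };
}

const thenable = makeThenable(42);

// Calling thenable.finally(...) here would throw a TypeError at runtime.
const hasFinallyBefore = "finally" in thenable;

// Promise.resolve adopts the thenable into a real Promise.
const adopted = Promise.resolve(thenable);
const hasFinallyAfter = typeof adopted.finally === "function";
```

A regression test along these lines passes a thenable instead of a `Promise.resolve(...)` instance, which is the input shape the original tests never exercised.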
## Why it slipped through CI
All v4.77 tests used real Promise instances, which DO have `.finally`. The TypeScript signature accepted `Promise<T>` and `await` works on thenables, so call sites compiled and looked fine. The gap only showed when the runtime actually saw a thenable input. The two new regression tests reproduce the prod crash and would have caught this in v4.77 if they'd existed then.
v4.77 — Per-query SSR timeouts on hot dashboard pages
A slow Supabase query no longer hangs an SSR render until Cloudflare 524s. Each awaited query on /dashboard/reports, /dashboard/automations, /dashboard/test-cases now has a budget; on expiry the value falls through to null and the page renders gracefully degraded with a structured log line for diagnosis.
## What changed
New pure helper `withTimeout(promise, ms, label)` wraps a Promise in a budgeted race. If the promise resolves first, the value passes through. If the budget expires first, returns null and console.warns `[ssr-timeout] <label> timed out after <ms>ms`. Rejections re-throw (caller's try/catch still works); the internal setTimeout is cleared in both branches so no timer leaks across thousands of renders.
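A helper with this contract could look roughly like the sketch below. It is not the shipped implementation: it cleans up the timer with `try`/`finally` rather than a chained `.finally`, and it wraps the input in `Promise.resolve` so that thenables such as query builders are accepted alongside real Promises.

```typescript
async function withTimeout<T>(
  query: PromiseLike<T>,
  ms: number,
  label: string,
): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;

  // Resolves to null when the budget expires.
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => {
      console.warn(`[ssr-timeout] ${label} timed out after ${ms}ms`);
      resolve(null);
    }, ms);
  });

  try {
    // Whichever settles first wins; a rejection from `query` is re-thrown to the caller.
    return await Promise.race([Promise.resolve(query), timeout]);
  } finally {
    // Runs on success, timeout, and rejection, so no timer is left pending.
    clearTimeout(timer);
  }
}
```

A call site would treat `null` as "render the degraded state", for example `const projects = await withTimeout(projectsQuery, 3000, "projects")`, where the query and the budget are placeholders.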
Applied to 8 SSR queries across three pages:
- `/dashboard/reports` (5 queries — projects/teams/members/bug_reports.list/trends)
- `/dashboard/automations` (1 query)
- `/dashboard/test-cases` (2 queries)
The main bug-reports list timeout also surfaces a custom EmptyState ("Couldn't load reports right now — refresh to retry") so the user has a clear next step, not a blank page.
## Why now
Natural companion to Layer B (the SG read replica): once a replica is online, a transient replication lag spike that slows a particular read no longer takes down the whole render. The slow query times out within budget, the page degrades cleanly, and Railway logs identify exactly which query was slow.
Worst-case SSR latency on `/dashboard/reports` is now bounded at ~20s even with every query hitting max budget — well under Cloudflare's 100s ceiling that previously triggered the 524 error pages.
v4.75 — SEA expansion Layer B groundwork (flag-OFF foundation)
Lays the dormant foundation for the Singapore read-replica routing: pure region-routing helpers, read-after-write cookie helpers, new service-client API. Defaults OFF — zero production behavior change. Activation is a future env-var flip + per-page migration after Pro plan upgrade and replica provisioning.
## What changed
Foundation code for SEA expansion read-replica routing, all behind a feature flag that defaults OFF. With the flag off, every `getReadClient()` call returns the primary client — same behavior as today. Nothing in this ship is user-visible.
Three new pure helpers (TDD with 15 unit tests):
- `isSEACountry(code)` — country-code → SEA-routable boolean across ASEAN-10 + East Asia + Oceania
- `pickReadRegion({enabled, country, readAfterWritePin})` — single routing decision
- `setReadAfterWritePin(cookies)` / `hasReadAfterWritePin(cookies)` — cookie wiring for the read-after-write pin
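The cookie pair can be sketched against a minimal cookie interface. Everything here is illustrative: the `CookieJar` shape and the cookie name are hypothetical, and the 5-second lifetime is taken from the later entries that describe the pin window.

```typescript
// Minimal cookie surface for the sketch; a framework's cookie API would stand in for it.
interface CookieJar {
  set(name: string, value: string, options: { maxAge: number; path: string; httpOnly: boolean }): void;
  has(name: string): boolean;
}

const PIN_COOKIE = "read_after_write_pin"; // hypothetical cookie name
const PIN_TTL_SECONDS = 5;                 // short window covering replica lag

// Called by write endpoints after a successful mutation.
function setReadAfterWritePin(cookies: CookieJar): void {
  cookies.set(PIN_COOKIE, "1", { maxAge: PIN_TTL_SECONDS, path: "/", httpOnly: true });
}

// Called on reads; while the cookie exists, reads stay on the primary.
function hasReadAfterWritePin(cookies: CookieJar): boolean {
  return cookies.has(PIN_COOKIE);
}
```

Letting the browser expire the cookie keeps the server stateless: no per-user bookkeeping is needed to know when the pin ends.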
Three new `service-client.ts` functions:
- `getPrimaryClient()` — explicit primary; alias for the existing service client today
- `getReadClient(locals)` — picks primary or SG replica via `pickReadRegion`; today the flag is off so it always returns primary
- `isReadReplicaEnabled()` — env-var checked on every call so a Railway flip takes effect on the next request without redeploy
`CLAUDE.md` gained a "Supabase clients — primary vs read replica" section with a decision table for which client to use where.
## Why this is safe to merge dormant
The new code is purely additive — no existing call sites change. The activation ship will wire `locals.cfCountry` + `locals.readAfterWritePin` in middleware, migrate the first list page, and flip the Railway env var. Until then, the helpers can sit indefinitely without effect.
## What's gating activation
Three decisions outstanding: Pro plan upgrade approval (~$35-50/mo incremental), replica provisioning in `ap-southeast-1`, and the activation code ship. Cost-benefit framing posted on TEST-228.
Daily: Ship to prod: v4.74 — TEST-291 unique Run # suggestion on Create Test Run
daily, automation, ai, api
56 commits
Improvements:
- Ship to prod: v4.74 — TEST-291 unique Run # suggestion on Create Test Run
- Ship to prod: v4.73 — TEST-290 fix project switch + attachment race during Start Run
- Ship to prod: v4.72 — runner hotfix: revert Python playwright pin to 1.59.0 (PyPI lag)
- Ship to prod: v4.71 — bump runner Playwright to 1.60.0 (coordinated npm + Dockerfile pin)
v4.74 — Create Test Run pre-fill no longer collides on Run #6
data-integrity, test-runs, small-ship, tdd
Opening Create Test Run on a suite with ≥5 runs used to always suggest `Run #6`, causing duplicate run names to pile up. Pre-fill is now computed across all runs in the suite, not just the paged display window.
## What changed
The Create Test Run modal pre-fills the name input with `<suite> - Run #<N>`. Pre-v4.74 the suggestion used `recentRuns.length + 1`, where `recentRuns` was paged to 5 entries (for the display widget). Once any suite had ≥5 runs, every suggestion landed on `Run #6` — producing the exact duplicate-pile-up Russell reported.
v4.74 moves the calculation server-side: a new `next_run_number` field on the suite-scoped runs endpoint scans every run name in the suite and returns max+1. Renamed runs (no numeric suffix) and runs from similarly-named other suites are correctly ignored.
Existing `Run #6` duplicates are left untouched: runs are addressable by id, so the new suggestion just skips past them to the true next number.
## Why a regex-based scan rather than a numeric column
The scan handles renames gracefully (runs renamed to `Friday smoke pass` don't break numbering) and works against historic data without a migration. A separate `run_number` numeric column would be cleaner long-term and would let DB-level uniqueness constraints kick in; filing as a future enhancement.
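A regex-based scan of this kind might look like the following sketch. The function names are hypothetical and the server-side query that supplies the run names is omitted; the name pattern `<suite> - Run #<N>` is the one the modal pre-fills.

```typescript
// Escape regex metacharacters so a suite name is matched literally.
function escapeRegExp(text: string): string {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns max(N) + 1 over names shaped exactly like "<suite> - Run #<N>".
function nextRunNumber(suiteName: string, runNames: string[]): number {
  const pattern = new RegExp(`^${escapeRegExp(suiteName)} - Run #(\\d+)$`);
  let max = 0;
  for (const name of runNames) {
    const match = pattern.exec(name);
    if (match) max = Math.max(max, Number(match[1]));
  }
  return max + 1;
}
```

Anchoring the pattern at both ends is what keeps renamed runs and runs from similarly named suites out of the count.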
v4.73 — Test run project context preserved on Start Run + attachment upload race fixed
Two S1-blocking bugs that combined on the test-run flow: (1) Start Run silently flipped the active project to the team default mid-flow, and (2) attaching a screenshot to a failed case during run execution returned 400 "No files provided" due to a FileList race. Both fixed in one ship.
## What changed
**Bug 1 — Project preservation on Start Run.** When a tester clicked Start Run on a test suite, the post-create redirect was bare-URL (no `?project=`), and the middleware fell back to the team's default project. The data layer was always correct — only the UI rendered in the wrong project — but the visual flip was disorienting and dropped per-tab project context on subsequent navigations. Fixed: the `POST /api/test-runs` response now includes `project_slug`, and the client redirect carries `?project=<slug>` so the project context is preserved across the redirect.
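As a small illustration of the redirect fix, the client can build its target from the `project_slug` returned by the create call. The helper name is hypothetical and the path is assumed from the run-detail route mentioned elsewhere in this changelog.

```typescript
// Build the post-create redirect so the project travels with the URL.
function runRedirectUrl(runId: string, projectSlug: string): string {
  const id = encodeURIComponent(runId);
  const project = encodeURIComponent(projectSlug);
  return `/dashboard/test-cases/runs/${id}?project=${project}`;
}
```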
**Bug 2 — Attachment upload race during run execution.** The change handler on the per-case file input was setting `input.value = ''` synchronously, immediately after kicking off the (async) upload chain. `HTMLInputElement.files` is a live FileList — clearing the input emptied it before the upload code read it, so the POST always landed with zero files attached and the API returned 400 "No files provided". Fixed: snapshot the selected files to a plain array before clearing the input.
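The snapshot-then-clear order can be sketched without a browser by describing the input structurally. `FileInputLike` and `takeSelectedFiles` are names invented for this example; in the real handler the argument would be the `HTMLInputElement`.

```typescript
// Structural stand-in for the parts of a file input the handler touches.
interface FileInputLike<F> {
  files: ArrayLike<F> | null;
  value: string;
}

// Copy the live list into a plain array first, then reset the input.
function takeSelectedFiles<F>(input: FileInputLike<F>): F[] {
  const snapshot: F[] = input.files ? Array.from(input.files) : [];
  input.value = ""; // safe now: the snapshot no longer depends on the live list
  return snapshot;
}
```

The async upload then works from the returned array, so resetting the input (which lets the user pick the same file again) can no longer empty the payload.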
## Why one ship
The two bugs surfaced together on the same repro path so they're fixed together as one S1 hotfix. The fix files are independent — different modules, different mechanisms — but the user experience needs both to be right for the failed-case-with-evidence flow to actually work end-to-end.
## What this also does
It confirms (retroactively) that the TEST-183 attachment feature finally works end-to-end from the Start Run flow. The schema column, API endpoint, and UI components were all shipped previously, but Bug 2 meant the user-facing path was broken on every attempt. v4.73 closes that gap.
v4.68 — Explicit Cache-Control on every dashboard response (SEA expansion Layer A, code half)
performance, caching, sea-expansion, small-ship
Code-side of TEST-227. Authed dashboard pages now ship with explicit `Cache-Control: private, no-store` and a tight allow-list of public read endpoints opts in to short edge cache. Sets up Cloudflare aggressive-cache rules without any per-user cache leak risk.
## What changed
Every response from the dashboard SSR origin (`app.bugagent.com`) now carries an explicit `Cache-Control` header. Authed dashboard pages and APIs return `private, no-store`; public read endpoints (`/api/admin/changelog`, `/changelog.xml`) return `public, max-age=60, s-maxage=300, stale-while-revalidate=600`; write methods always return `private, no-store` regardless of path.
Fail-safe by default — anything not explicitly on the public allow-list is `no-store`.
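The policy above reduces to a small decision function. The sketch below uses the header values and public paths quoted in this entry; the function name and the choice to treat only GET and HEAD as reads are assumptions.

```typescript
const PRIVATE_NO_STORE = "private, no-store";
const PUBLIC_SHORT_CACHE = "public, max-age=60, s-maxage=300, stale-while-revalidate=600";

// Explicit allowlist: anything not listed here is never cached.
const PUBLIC_READ_PATHS = new Set(["/api/admin/changelog", "/changelog.xml"]);

function cacheControlFor(method: string, pathname: string): string {
  const verb = method.toUpperCase();
  const isRead = verb === "GET" || verb === "HEAD";
  // Writes are never cacheable, even on an allowlisted path.
  if (!isRead) return PRIVATE_NO_STORE;
  return PUBLIC_READ_PATHS.has(pathname) ? PUBLIC_SHORT_CACHE : PRIVATE_NO_STORE;
}
```

Defaulting to `private, no-store` means a newly added route is uncacheable until someone opts it in deliberately.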
## Why
First code-half of the **SEA expansion Layer A** ticket (TEST-227). With explicit headers in place, we can confidently configure Cloudflare Cache Rules to aggressively edge-cache the marketing site and public-read endpoints, knowing that authed dashboard pages are protected from accidental caching.
Follow-up operator work in the Cloudflare dashboard (cache rules for marketing TLDs, SEA latency probe) is documented in the TEST-227 ticket comment.
## What this doesn't change
- No user-visible behavior on authed pages — they were already not being cached, this just makes it explicit.
- Marketing site caching is unchanged until Cloudflare Cache Rules are added in the dashboard (operator step).
Daily: Ship to prod: v4.68 — TEST-227 SEA expansion Layer A (code half): explicit Cache-Control on every dashboard response
daily, automation, ai, bug-reports, api
38 commits
Improvements:
- Ship to prod: v4.68 — TEST-227 SEA expansion Layer A (code half): explicit Cache-Control on every dashboard response
- Ship to prod: v4.67 — pin ffmpeg=7:6.1.1-3ubuntu5 in runner Dockerfile
- Ship to prod: v4.66 — OAuth client_secret hash validation at /token (PR 3/3)
- Ship to prod: v4.56 — oauth_clients.client_secret_hash + dashboard hashes on mint (PR 2/3)
- Ship to prod: v4.55 — Vitest harness for mcp-server (PR 1/3 OAuth secret hashing chain)
v4.54 — Safari: text selection now works in test step fields
ui-ux, safari, test-cases, small-ship
On Safari, click-dragging inside an Action / Expected textarea on the test-case detail page used to drag the whole step row around instead of selecting text. Fix: only the .step-grip handle is draggable now; the textareas can be selected normally.
## What changed
**Browser-compatibility fix.** The test-case detail page's drag-to-reorder used to mark the entire step row as draggable. Safari's drag-source detection is stricter than Chrome's — a mousedown+drag inside a textarea inside a draggable ancestor triggers the drag operation and skips selection. Now the `draggable="true"` lives only on the six-dot grip handle, so Safari users can click-drag inside the Action / Expected fields to select text just like in Chrome. Drag-to-reorder still works — just initiate from the grip.
No functional regressions: the grip already had `cursor: grab`, so the affordance was always there. Only the actual drag source moved.
v4.53 — Archive Run button no longer appears on already-archived runs
ui-ux, test-runs, idempotency, small-ship
Opening an archived test run used to still show a clickable Archive Run button that fired a no-op write. Now hidden. The API also short-circuits archive-on-archived to an idempotent no-op so script / MCP callers can't trigger the same wasted write.
## What changed
**UI** — The Archive Run button on the test-run detail page now hides when the run is already in `archived` status. Same status that drives the **Archived** badge — they stay in sync.
**API** — `PATCH /api/test-runs/:id` with `{status: "archived"}` against an already-archived run now returns `200 { ..., already_archived: true }` and skips the DB write entirely. Idempotent PATCH semantics preserved — retry loops on flaky networks still see clean success without polluting `updated_at`.
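A minimal sketch of the idempotent guard described above, using an in-memory record instead of a real database. The `TestRun` shape, the `archiveRun` name, and the `wrote` flag are hypothetical and exist only to make the skipped write observable.

```typescript
// Hypothetical types; the real handler and schema are not shown in this changelog.
interface TestRun { id: string; status: string; updated_at: number; }

interface ArchiveResult {
  httpStatus: number;
  body: TestRun & { already_archived?: boolean };
  wrote: boolean; // illustration only: did we touch the record?
}

function archiveRun(run: TestRun, now: number): ArchiveResult {
  if (run.status === "archived") {
    // Already archived: report success, skip the write, leave updated_at alone.
    return { httpStatus: 200, body: { ...run, already_archived: true }, wrote: false };
  }
  run.status = "archived";
  run.updated_at = now;
  return { httpStatus: 200, body: { ...run }, wrote: true };
}
```

A retry loop that sends the same PATCH twice sees `200` both times, and only the first call changes `updated_at`.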
## Why the API guard too
The UI fix covers humans clicking through the dashboard. The API guard covers everything else: the MCP server, scripts, curl, and future REST consumers. Both surfaces now skip the redundant write on an already-archived run, so the two paths cannot diverge.
Bug-report assignment emails now fire from a database trigger, so every entry point — dashboard, MCP `update_bug_report`, raw SQL — hits the same email-send path. Closes the v4.42 gap where MCP-driven assignments (the agent loop's retest hand-off) silently skipped email.
## What changed
The email path for "you've been assigned a bug report" used to live in the dashboard endpoints. When the agent loop transitioned a ticket to `retesting` and reassigned it to the reporter via the MCP `update_bug_report` tool, the in-app bell notification still fired, but the email didn't — meaning the testers who care most about hand-offs were the ones missing them.
v4.52 moves the email send into a database trigger. Now any UPDATE that touches `bug_reports.assigned_to` — from any entry point — fires the same email pipeline:
```
bug_reports UPDATE → AFTER UPDATE OF assigned_to trigger
→ pg_net.http_post → bug-report-assigned-email Edge Function
→ opt-out check → profile/team lookups → render → Resend
```
The trigger reads a bearer secret from a new private RLS-locked table (`private.app_secrets`) and passes it to the Edge Function for auth.
A new `bug_reports.last_assigned_by` column is now populated by every update path so the email body can say "Jason asked you to retest …" rather than "A team member …".
## Why
One pipeline, one place to fix bugs, no divergence. Future REST clients, MCP tools, scripts, and raw SQL all get the same notification automatically.
## Behavior
- Dashboard kanban assignment → email (unchanged from v4.42)
- MCP `update_bug_report` with new `assigned_to` → email (NEW — fixes the gap)
- Raw SQL `UPDATE bug_reports SET assigned_to = ... ` → email (NEW — defense in depth)
- Re-saving the same assignee → no email (silent, as before)
- Unassigning (assigned_to → null) → no email (silent, as before)
- Per-user opt-out at `/dashboard/settings#notifications` → still respected
- In-app bell notification → unchanged, always fires regardless of email opt-out
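The email rules in the list above reduce to one predicate. The sketch below is an illustration in TypeScript rather than the actual trigger or Edge Function logic; the type and function names are hypothetical.

```typescript
// Hypothetical shape of an assignment change; not the real trigger payload.
interface AssignmentChange {
  oldAssignee: string | null;
  newAssignee: string | null;
  emailOptedOut: boolean; // per-user toggle from the notifications settings
}

function shouldSendAssignmentEmail(change: AssignmentChange): boolean {
  // Unassigning (assigned_to -> null) stays silent.
  if (change.newAssignee === null) return false;
  // Re-saving the same assignee stays silent.
  if (change.newAssignee === change.oldAssignee) return false;
  // The per-user opt-out is still respected.
  return !change.emailOptedOut;
}
```

The entry point (dashboard, MCP, raw SQL) does not appear in the predicate at all, which is the point of moving the send behind a trigger.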
v4.42 — Email-notify on bug-report assignment + per-user opt-outs
notifications, email, accessibility, tester-workflow
Bug reports now email you when they're assigned to you, with retest-handoff wording when status=retesting. Two new toggles in Account Settings → Notifications let you mute the bug-report and test-case assignment emails individually (both default on). One stale amber in the usage-warning email is gone too.
## What's new
**Bug-report assignment emails.** When a bug report is assigned to you from the dashboard (kanban, ticket detail, or any future surface that calls the dashboard assignment endpoints), you'll now receive an HTML email in addition to the in-app bell notification. Subject + body include the short ID, and the CTA button links straight to the report. Copy adapts to the report's status — `retesting` reads "please retest", anything else reads "assigned to you".
**Two new opt-out toggles in Account Settings → Notifications.**
- "Bug report assigned to me" — controls the new email above
- "Test case assigned to me" — was always sending; now mutable. Covers both per-case suite assignments and bulk test-run digest emails.
Both toggles default ON for every existing and new user. To mute, open `/dashboard/settings#notifications`, flip the toggle, save.
**Polish.** Cron usage-warning email's 90-94% progress bar tier was still using the legacy amber `#f59e0b` — now uses the brand neon-green, matching the rest of the customer-facing email palette.
## What's NOT changed
- In-app bell notifications still fire for every assignment regardless of email opt-out — email is a courtesy second channel.
- Emails stay on by default so testers don't silently miss assignments; you have to actively opt out.
- Bug-report assignments routed through the MCP `update_bug_report` tool (used by some agent flows) still only emit the bell notification — email coverage there is a documented follow-up.
## Why
Testers were missing retest-handoff signals because the only notification was a bell that they'd only see if they had the dashboard open. Email closes that gap, and the per-user toggle covers users who don't want a second channel.
When the AI assistant emitted a URL wrapped in bold+code (e.g. **`https://app.bugagent.com/dashboard/reports/TEST-259`**), the previous renderer left literal ** and backticks visible around an otherwise clickable link. v4.32 fixes the pairing so the URL renders as bold, monospaced, and clickable with no stray markdown characters.
## What changed
The AI Assist chat now renders bold-wrapped, code-wrapped, and combined bold+code URL patterns cleanly:
- ``**`URL`**`` → bold + monospaced + clickable in a new tab
- `` `URL` `` → monospaced + clickable
- `**URL**` → bold + clickable
## Why it broke before
v4.31 split each text span into segments around bare URLs, then ran the bold/italic/code regex on each segment separately. With a pattern like ``**`URL`**`` the segments looked like:
1. `` **` ``
2. `URL` (anchor-wrapped)
3. `` `** ``
The bold and code delimiters were in different segments, so they could never pair up — and the literal `**` / backticks leaked into the visible output.
## The fix
`renderTextSpan` now applies inline-formatting to the whole HTML-escaped span first (bold/code markers pair cleanly across URL boundaries), then auto-links bare URLs in the formatted output. URL strings sit untouched inside `<strong>` / `<code>` wrappers because inline-format only matches the surrounding markdown delimiters — never the characters inside them.
Markdown-link `href` values stay protected: those tokens never reach the text-span renderer.
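The ordering described above (format the whole escaped span first, auto-link second) can be sketched as follows. This is a simplified stand-in for `renderTextSpan`, not its real body: the regexes, the `escapeHtml` helper, and the function name are assumptions, italics and newlines are omitted, and URLs that themselves contain `**` or backticks are not handled.

```typescript
function escapeHtml(s: string): string {
  return s
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;");
}

// Hypothetical sketch: inline formatting runs over the whole span,
// so delimiters on either side of a URL can pair up.
function renderTextSpanSketch(text: string): string {
  let html = escapeHtml(text);
  html = html.replace(/\*\*(.+?)\*\*/g, "<strong>$1</strong>");
  html = html.replace(/`([^`]+)`/g, "<code>$1</code>");
  // Auto-link last. A URL ends at whitespace or at the next tag.
  html = html.replace(
    /https?:\/\/[^\s<]+/g,
    (url: string) => `<a href="${url}" target="_blank" rel="noopener noreferrer">${url}</a>`
  );
  return html;
}
```

Running the bold and code passes before linking means the URL is wrapped by `<strong>` and `<code>` rather than split away from its delimiters.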
## Tests
Three regression tests cover the three patterns above. Full dashboard suite is 616/616 green.
v4.31 — AI chat link hrefs are clean again — no more 404s on clickable tickets
ai-chat, ux, bugfix, regression
A v4.30 regression where markdown chars in the AI's emitted URLs leaked HTML tags into the rendered href (causing /dashboard/reports/TEST-259</code></strong> 404s on click) is fixed by tokenizing the rendering pipeline so URL strings never see the inline-formatting regex.
## What changed
Clicking a ticket link in the AI Assist chat now reliably navigates to the right page. Hovering shows a clean URL — no HTML tag soup baked into the path.
## What was broken
v4.30 added clickable links by running a Markdown-link pass first and then running the bold / italic / code regex on the whole output. That second pass operated over the entire HTML string, including the rendered `<a href="...">` attribute values. When the AI emitted Markdown URLs that happened to contain markdown characters (backticks, `**`, `*`) — for example because it wanted to format the ticket name in monospace inside the link text — the bold/code regex matched those chars inside the href and injected `<code>` / `<strong>` tags into the URL itself.
Result on click: `/dashboard/reports/TEST-259</code></strong>?project=bugagent` → URL-encoded by the browser → 404.
## How v4.31 fixes it
Rewrote `format-chat-links.ts` so all transforms happen inside a single tokenizing pipeline:
1. Tokenize the input into Markdown-link spans vs text spans.
2. For each text span: detect bare URLs, HTML-escape the text, then apply bold / italic / code / newlines — all within the text span's own scope.
3. For each link span: HTML-escape the URL once for the href, then run the visible text through the same escape + inline-format chain.
URL strings never see the inline-formatting regex. The anchor's href stays pristine even when the AI emits markdown chars inside the URL portion of a Markdown link.
## Inline formatting in link text still works
`[**bold**](/path)` still renders the visible text as `<strong>bold</strong>` while keeping the href clean. Both sides of the transform stay independent.
## Tests
Dashboard suite is now at 613 / 613 green (up from 603). The 10 new cases include the exact production reproducer Jason hit plus variants with `**` and `*` inside the URL portion — so this regression can't silently come back.
v4.30 — AI assistant links are now clickable and open in a new tab
ai-chat, ux, bugfix
The dashboard AI assistant now renders Markdown links and bare URLs as proper clickable anchors that open in a new tab. Previously the link syntax showed up as literal text, so even though v4.28/v4.29 built correct short-ID URLs, you couldn't actually click them.
## What changed
When the AI Assist chat returns a reference like `TEST-259`, the linked text is now a real clickable anchor that opens the report in a new tab. Both Markdown link form (`[TEST-259](/dashboard/reports/TEST-259)`) and bare URL form (`https://example.com`) are auto-linked.
## Why it wasn't working before
The chat panel had a hand-rolled markdown renderer that handled bold, italic, inline-code, and newlines — but not link syntax. So when the AI emitted a Markdown link, users saw the literal brackets and parentheses in the chat bubble, with no clickable surface. The v4.28 and v4.29 work made the AI generate the right short-ID URLs in the right format, but the actual click experience was still missing.
## What's in v4.30
- **New helper** `lib/format-chat-links.ts` does the link rendering. Pinned by 21 unit tests covering Markdown form, bare-URL auto-linking, multi-link strings, dangerous-scheme rejection, and HTML-escape safety.
- **Scheme allowlist**: only internal absolute paths (`/...`), `http`, `https`, `mailto`, and `tel` get rendered as clickable. Dangerous schemes (`javascript:`, `data:`, `vbscript:`) drop the anchor entirely and just leave the visible text.
- **New tab + opener hardening**: every anchor gets `target="_blank"` and `rel="noopener noreferrer"`. The chat session stays alive in the original tab; the new tab can't reach back into your chat state.
- **Bare URL auto-linking**: typing `https://supabase.com` (no Markdown brackets) gets converted to a clickable anchor too. Trailing sentence punctuation is trimmed so "see https://supabase.com." doesn't link the period.
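Two of the rules above, the scheme allowlist and the trailing-punctuation trim, are easy to show in isolation. The sketch below is illustrative and is not the content of `lib/format-chat-links.ts`; the function names are hypothetical, and rejecting protocol-relative `//host` paths is an extra assumption not stated in the changelog.

```typescript
const SAFE_SCHEMES = ["http:", "https:", "mailto:", "tel:"];

function isAllowedHref(href: string): boolean {
  const h = href.trim();
  // Internal absolute path. Assumption: "//host" is treated as external and rejected.
  if (h.startsWith("/")) return !h.startsWith("//");
  const match = /^([a-z][a-z0-9+.-]*:)/i.exec(h);
  // Anything without an allowlisted scheme (javascript:, data:, vbscript:, ...) is refused.
  return match !== null && SAFE_SCHEMES.includes(match[1].toLowerCase());
}

// Splits "https://supabase.com." into the URL and the sentence punctuation after it.
function splitTrailingPunctuation(url: string): [string, string] {
  const m = /[.,;:!?]+$/.exec(url);
  if (!m) return [url, ""];
  return [url.slice(0, m.index), m[0]];
}
```

An allowlist fails closed: a scheme nobody thought about is dropped instead of rendered.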
## End of the URL bug chain
Three-ship arc on the same root annoyance:
- v4.28 — `/api/reports` returns `ticket_prefix` so the chat panel can render short-ID URLs on any page (not just /dashboard/reports).
- v4.29 — AI's prompt context now leads with the short ID; severity values aligned with the post-migration s1-s4 vocabulary so "show me critical bugs" stops returning empty.
- v4.30 — those short-ID URLs are now actually clickable.
v4.29 — AI assistant: ask by severity (s1-s4) and get working short-ID links in prose
ai-chat, bug-reports, ux, bugfix
Two bugs caught after v4.28 testing: the AI's severity filter was using retired pre-migration-135 values (always returned zero rows), and prose-rendered bug-report URLs still used UUIDs because the prompt context only carried UUIDs.
## What changed
The dashboard's AI assistant now talks about bugs in the same vocabulary that lives in the database, and links to them with the short-ID URLs that actually resolve.
### Severity values now use s1-s4
The AI's prompt previously listed severity options as `critical, high, medium, low` — the legacy strings retired by migration 135. After the migration, the bug_reports.severity column only stores `s1, s2, s3, s4`, enforced by a BEFORE-write trigger. So any time a user asked "show me critical bugs" or "any high-severity issues open?", the AI built a query with `severity=critical`, the API passed it to Postgres, and zero rows matched — looked like "no reports found" even when the queue was full.
Fix: prompt now lists the canonical s1-s4 values and maps user-spoken severities ("critical" → s1, "high" → s2, "medium" → s3, "low" → s4) before passing.
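The spoken-to-canonical mapping is small enough to write out. In the product this mapping lives in the AI prompt; the function below is only an illustrative equivalent, and its name is hypothetical.

```typescript
type Severity = "s1" | "s2" | "s3" | "s4";

// Mapping taken from the fix description: critical/high/medium/low -> s1..s4.
const SPOKEN_TO_CANONICAL = new Map<string, Severity>([
  ["critical", "s1"],
  ["high", "s2"],
  ["medium", "s3"],
  ["low", "s4"],
]);

function normalizeSeverity(input: string): Severity | null {
  const key = input.trim().toLowerCase();
  // Canonical values pass through unchanged.
  if (key === "s1" || key === "s2" || key === "s3" || key === "s4") return key;
  // Legacy spoken values are translated; anything else is rejected.
  return SPOKEN_TO_CANONICAL.get(key) ?? null;
}
```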
### Prose-rendered URLs use short IDs
v4.28 fixed the LIST_DATA renderer (the structured tool call), but the AI could also reference reports in plain prose — and the prompt's BUG REPORTS context block only carried `${UUID}` for each row. So when the AI said "you should check the report about login failures", it built `/dashboard/reports/<32-char-uuid>` links because that's all the data it had.
Fix: the context block now leads with the short ID — `short_id|fullUUID|title|status|severity|type` — and instructs the AI to use short_id for user-facing prose links and fullUUID only for internal tool calls.
### Bonus: search by short ID is now documented
The AI's search-field description now mentions that "TEST-259" or a bare "259" both work as searches — they hit a ticket_number filter via TEST-164's search-clause builder. Closes a gap where the AI didn't know it could query a specific ticket by name and would scan its 100-row recent-reports context instead, missing anything older.
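As a rough illustration of why both "TEST-259" and a bare "259" can resolve to the same ticket-number filter, the parsing step might look like the sketch below. The real search-clause builder from TEST-164 is not shown here, so the function name and the accepted prefix format are assumptions.

```typescript
// Hypothetical parser: accepts "PREFIX-123" or "123" and returns the number part.
function parseTicketNumber(query: string): number | null {
  const m = /^(?:[A-Za-z]+-)?(\d+)$/.exec(query.trim());
  return m ? Number(m[1]) : null;
}
```

Free-text queries such as "login failures" return `null` and would fall through to an ordinary text search.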
## Why this matters
The two bugs combined meant asking the AI for any specific bug report by description or severity was unreliable. After v4.29:
- "show me critical bugs" → returns S1 tickets
- "show me TEST-259" → returns the ticket with a clickable short-ID link
- "give me a summary of recent bugs" → prose response with short-ID links inline
v4.28 — AI assistant now builds short-ID bug-report URLs from every page
ai-chat, bug-reports, ux, bugfix
Asking the dashboard AI assistant for a bug report from analytics, test cases, agent queue, or any non-reports page now gives you working short-ID links like /dashboard/reports/TEST-259. Previously the chat fell back to UUID URLs everywhere except the reports list page.
## What changed
When you opened the AI assistant from a page that wasn't `/dashboard/reports` and asked for "recent bug reports" (or any list/search), the returned links pointed at the long UUID URL form. The short-ID form (`TEST-259`, `COMP-204`, etc.) only worked on the reports list page itself.
Fix lives in two places:
- **`/api/reports`** — now returns the team's `ticket_prefix` alongside the report list on every call. The lookup was previously gated behind the search-clause branch; now it runs on every GET so list responses always carry the prefix.
- **AI chat panel** — reads `ticket_prefix` from the API response first (works on any page). Keeps the existing DOM-attribute fallback for backward compatibility on the reports list page.
## Data access was already correct
While investigating, I double-checked the access scoping. `/api/reports` already filters by `team_id`, so the AI assistant only ever returns records the user has access to. The bug was in URL formatting only; no records were exposed that shouldn't have been.
## Pages where the chat now generates short-ID URLs
Every dashboard surface where the AI chat is rendered: analytics, test cases, test runs, agent queue, mobile automations, security, code review, automations, explorations, notes. Previously only the `/dashboard/reports` list page produced short-ID URLs.
v4.24 — Indexes added for every unindexed foreign key
database, performance, infrastructure
Migration 151 adds 109 covering indexes to close the gap on every public-schema foreign key that previously lacked one. Cascading deletes and team-scoped joins should be noticeably faster.
## What changed
The Supabase database linter had been flagging **112 unindexed foreign keys** for months. Each one meant Postgres had to seq-scan the child table whenever the parent row was deleted or updated, in order to enforce ON DELETE / ON UPDATE rules. On busy tables (automation_runs, mobile_runs, test_run_results, report_comments) that added a meaningful cost to every cascade. Most of the FKs were on the `team_id` / `project_id` / `user_id` columns that the dashboard JOINs on constantly, so the missing indexes also slowed down team-scoped queries.
Migration 151 adds 109 indexes. The 3-statement gap from the advisor count (112 − 109) comes from the github / jira / slack `_connections.connected_by` columns, each of which carries two FK constraints to different parent tables: same column, same index, deduplicated with `IF NOT EXISTS`.
Verified post-deploy: the same query that previously found 112 unindexed FKs now returns **0**.
## What this means in practice
- **Cascading deletes** (e.g. deleting a project) get fast across the board. Before: each child-table cascade triggered a seq-scan. After: index lookup.
- **Team-scoped queries** that JOIN through `team_id` or `project_id` get faster paths.
- **Tiny INSERT/UPDATE overhead** added per touched table — invisible at our scale.
## What's NOT in this ship
Two other Supabase advisor categories were intentionally deferred:
- **54 "unused index" advisories**: held off because `pg_stat_user_indexes.idx_scan` (the counter the advisor reads) just reset with the nano→micro compute upgrade. Trusting "unused" right now would risk dropping indexes that haven't had time to register a single scan. Will revisit after ~30 days of post-restart usage.
- **6 "multiple permissive policies" warnings**: needs a careful read of each RLS policy's intent before any consolidation. Separate dedicated PR when time allows.
## Backward compatibility
No schema changes to table shapes. No code changes. Only new indexes — every query and every dashboard surface keeps working unchanged, just faster on the affected tables.
Daily: Ship to prod: v4.24 — add indexes for every unindexed foreign key
daily, automation, ai, notes, api
36 commits — Improvements: - Ship to prod: v4.24 — add indexes for every unindexed foreign key - Ship to prod: v4.23 — agent_priority_queue view: SECURITY INVOKER + revoke anon
Improvements:
- Ship to prod: v4.24 — add indexes for every unindexed foreign key
- Ship to prod: v4.23 — agent_priority_queue view: SECURITY INVOKER + revoke anon
- Ship to prod: v4.22 — extend .mov transcode to notes / quick-submit / AI / MCP
- Ship to prod: v4.21 — server-side ffmpeg transcode pipeline for .mov uploads
- Ship to prod: v4.20 — attachment cleanup walks transcodedUrl + fixes 3 orphan leaks
v4.23 — Tightened the agent priority queue view (no more anonymous queue snooping)
security, infrastructure, agent-loop
A Postgres view that powered the internal agent priority queue was readable by unauthenticated callers holding the public anon key. Migration 150 swaps it to security-invoker mode and revokes the anonymous role. No code changes, no breakage to legitimate callers.
## What changed
The `public.agent_priority_queue` view (used by the internal agent loop and the `/dashboard/agent-queue` page) was running in Postgres' default view mode, which means it used the view-owner's privileges rather than the caller's. Combined with Supabase's default grants, that left the view readable by `anon` — the unauthenticated role whose key ships in the dashboard JS bundle.
Migration 150 applies two changes:
```sql
ALTER VIEW public.agent_priority_queue SET (security_invoker = on);
REVOKE SELECT ON public.agent_priority_queue FROM anon;
```
The view now respects RLS on the underlying `bug_reports` and `projects` tables. Unauthenticated callers can no longer read it, and team members continue to see exactly what they could see before through their own RLS scope.
## What was actually exposed
- Ticket numbers, titles, severity (s1/s2/s3), type, and assignee UUIDs for `status='new'` rows in the bugagent dogfood project.
- ~30 bugs visible at any time.
**Not exposed** (each behind its own RLS): bug descriptions, comments, attachments, customer-project data, bugs in any non-`new` status.
## What didn't change
- The MCP `pick_next_bug` tool — uses the service-role connection, unaffected.
- The dashboard agent-queue page — TestLauncher members satisfy the `bug_reports` RLS for their own team's projects.
- No code changes shipped with this — purely a Postgres-level grant + view-option tightening.
## Why we didn't catch it earlier
Postgres views default to security-definer behavior. The original migration (137) created the view without specifying `security_invoker = on`. Supabase's default grants then handed SELECT to `anon` automatically, and the combination created the leak. The Supabase database linter flagged it, which is how we noticed.
v4.22 — Server-side .mov transcoding now covers every upload path
attachments, bug-reports, notes, video, runner, mcp
Extends the v4.21 ffmpeg pipeline to notes, quick-submit, AI-drafted reports, and the MCP server. Every .mov uploaded through bugAgent — regardless of which surface it comes from — now gets a real .mp4 produced server-side.
## What changed
The .mov → .mp4 transcoder introduced in v4.21 now fires for **every** attachment-upload entry point. Wherever you can attach a video to a bug report or a note, you get the same automatic transcoding behavior:
- **Bug reports** — manual dashboard upload (v4.21), quick-submit, AI-drafted reports, MCP `create_bug_report`.
- **Notes** — multipart upload, signed-URL direct-upload finalizer, MCP `create_note` / `update_note`.
- **Chrome extension** — already covered via v4.21 since it goes through the same `/api/reports/upload` endpoint.
## What's NOT in scope
- FAB SDK session captures — those go through `/api/sessions/capture` which only accepts `.webm` and `.mp4` (browser MediaRecorder doesn't produce `.mov`). Nothing to transcode there.
## Render behavior
The attachment download links on the notes detail and new-note pages now prefer the transcoded `.mp4` URL when present. So once the runner finishes transcoding (typically 10-30s after upload), clicking the link gives you the playable `.mp4` instead of the original `.mov`. The original `.mov` remains in storage for audit / re-download.
## Graceful-degradation chain on failure
1. If `transcodedUrl` is present → `<source type="video/mp4">` plays directly in every browser.
2. If transcode is still in flight or failed → v4.19 dual-source MOV trick demuxes the QuickTime container as MP4 (works for ~99% of macOS screen recordings).
3. If even that fails (rare HEVC `.mov` files) → inline download link inside the `<video>` element.
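The first two steps of the chain can be sketched as a function that decides which `<source>` entries to render; step 3, the download link, is plain fallback content inside the `<video>` element and needs no logic. The types and the function name are hypothetical, and the `.mov` detection rule follows the v4.19 description (MIME type `video/quicktime` or a `.mov` filename).

```typescript
// Hypothetical attachment shape; only url and transcodedUrl are named in the changelog.
interface VideoAttachment { url: string; mimeType: string; transcodedUrl?: string; }
interface VideoSource { src: string; type: string; }

function videoSourcesFor(att: VideoAttachment): VideoSource[] {
  // Step 1: a finished transcode wins and plays as a real MP4.
  if (att.transcodedUrl) return [{ src: att.transcodedUrl, type: "video/mp4" }];

  const pathOnly = att.url.split("?")[0].toLowerCase();
  const isMov = att.mimeType === "video/quicktime" || pathOnly.endsWith(".mov");
  if (isMov) {
    // Step 2: dual-source trick. MP4 label first, accurate QuickTime label second.
    return [
      { src: att.url, type: "video/mp4" },
      { src: att.url, type: "video/quicktime" },
    ];
  }
  return [{ src: att.url, type: att.mimeType }];
}
```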
The v4.20 cleanup-on-delete walks both `url` and `transcodedUrl` so neither file ever orphans in storage when an attachment, report, note, or whole project is deleted.
## Closes the .mov saga
This is the fourth ship in the v4.19/v4.20/v4.21/v4.22 series:
- v4.19 — browser-side dual-source fallback (immediate fix on existing .mov files)
- v4.20 — cleanup-on-delete walks transcodedUrl + 3 pre-existing orphan-leak fixes
- v4.21 — runner ffmpeg endpoint + dashboard helper + bug-report upload integration
- v4.22 — extend integration to every remaining upload entry point (this ship)
v4.21 — Server-side ffmpeg transcoding for .mov uploads
attachments, bug-reports, video, runner, infrastructure
Every .mov uploaded to a bug report is now automatically transcoded to a real .mp4 server-side. Playback no longer depends on browser quirks.
## What changed
When a `.mov` file is uploaded to a bug report (via the dashboard or the Chrome extension), the automation runner now transcodes it to a true `.mp4` server-side using ffmpeg. The video element on the report detail page prefers the transcoded file as its primary source, so playback works the same in every browser — no MIME spoofing, no quirks.
## How it flows
1. You upload a `.mov` to the report — same as always.
2. The dashboard saves the file to storage and returns the response immediately (no upload delay).
3. In the background, the runner downloads the `.mov`, runs `ffmpeg -c:v libx264 -c:a aac -movflags +faststart`, uploads the `.mp4` alongside the original, and patches the attachment record with the new URL.
4. About 10-30 seconds later (depending on file size), the video element on the report page automatically picks up the `.mp4` and plays it natively in any browser.
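For reference, the ffmpeg invocation in step 3 expands to an argument list like the one below. The codec and `-movflags` options come from the step above; the `-i` input flag, the path parameters, and the helper name are assumptions added to make the command complete.

```typescript
// Hypothetical helper: builds the argv for the transcode, without running it.
function ffmpegTranscodeArgs(inputPath: string, outputPath: string): string[] {
  return [
    "-i", inputPath,           // source .mov (assumed flag; not shown in the changelog)
    "-c:v", "libx264",         // re-encode video as H.264
    "-c:a", "aac",             // re-encode audio as AAC
    "-movflags", "+faststart", // move the index to the front so playback can start early
    outputPath,                // destination .mp4
  ];
}
```

Passing the arguments as an array to a process spawner, rather than interpolating them into a shell string, avoids quoting problems with unusual filenames.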
The `.mov` original is kept in storage alongside the transcoded `.mp4` so you don't lose the source.
## Failure handling
The pipeline is best-effort. If anything fails — runner offline, ffmpeg can't decode an exotic codec (HEVC etc.), upload fails — the transcoded URL is never written and the `.mov` continues to play via the v4.19 dual-source fallback. So this is pure optimization on top of the v4.19 safety net.
## Covered today
- Manual dashboard upload from the bug-report detail page
- The Chrome extension's session-replay flow (hits the same endpoint)
## Coming next (v4.22)
The `kickoffTranscode` helper is in place — the v4.22 ship extends integration to: notes upload paths, AI-drafted reports, quick-submit, FAB SDK session capture, and MCP server `create_bug_report`. Same helper, different upload entry points.
v4.20 — Attachment cleanup picks up transcoded files + fixes three orphan leaks
attachments, data-integrity, bug-reports, notes, bugfix
Foundation for the upcoming server-side ffmpeg transcode pipeline: every code path that deletes attachments now removes both the original file and its transcoded counterpart. Audit also caught and fixed three pre-existing orphan-leak bugs.
## What changed
When you delete a bug report, a note, an entire project, or even a single attachment row, the dashboard now batches up every storage file linked to that object and removes them in one shot. Previously the cleanup walked only the `url` field; with the upcoming ffmpeg transcode pipeline producing a `.mp4` alongside each uploaded `.mov`, we'd have re-orphaned every transcoded file on delete without this fix.
## Three pre-existing leaks caught during the audit
The audit turned up three places where the cleanup wasn't happening at all today — independent of the .mov work:
1. **MCP `delete_note` tool.** Deleting a note via Claude Desktop / Inspector / the MCP server dropped the row but never called `storage.remove()`. Every note attachment ever deleted through MCP was orphaned.
2. **MCP `delete_project` tool.** Same issue, scaled up — dropped every `bug_reports` and `notes` row in the project but left every attachment behind.
3. **Dashboard project DELETE.** The existing cleanup code read fields (`storage_path`, `path`) that don't exist on bug-report attachments — attachments only carry `url`. So the cleanup found zero paths and silently leaked 100% of project-delete attachments.
All three are fixed in this ship.
## Quality wins on the bug-report side
- The full-report delete from the detail page now routes through `DELETE /api/reports/[id]` instead of doing inline storage calls. One round-trip instead of N, and the server-side cleanup correctly handles signed-URL query strings (the old client-side `split('/bug-attachments/')` ate `?token=…` suffixes, silently leaking signed-URL files).
- The single-attachment "X" button delete also walks both URLs now.
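The signed-URL problem mentioned above comes from splitting the raw URL string, which keeps the `?token=…` suffix attached to the path. A sketch of a parser-based alternative is shown below; the helper name and the example URL layout are hypothetical, and the server-side cleanup may be implemented differently.

```typescript
// Hypothetical helper: returns the object path inside the bucket, or null
// for external URLs and anything that cannot be parsed.
function storagePathFromUrl(fileUrl: string, bucket: string = "bug-attachments"): string | null {
  try {
    // pathname never includes the query string, so signed-URL tokens drop away.
    const pathname = new URL(fileUrl).pathname;
    const marker = `/${bucket}/`;
    const idx = pathname.indexOf(marker);
    if (idx === -1) return null;
    return decodeURIComponent(pathname.slice(idx + marker.length));
  } catch {
    return null;
  }
}
```

Running the same helper over both `url` and `transcodedUrl`, then de-duplicating, yields the list of paths to remove in one batch.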
## Test coverage
The shared helper picks up 11 new regression cases — single-attachment, dedup, missing-url, external-URL, bulk-walks, and backward-compat-without-transcoded-url paths are all pinned.
## What's next
v4.21 brings the actual ffmpeg transcode pipeline: a new `/transcode-video` endpoint on the automation runner, a fire-and-forget helper from the dashboard upload handlers, and a render-time preference for the `.mp4` over the `.mov`. v4.22 extends the pipeline to every remaining upload entry point (FAB SDK, Chrome extension, AI-drafted reports, notes, MCP server).
v4.19 — .mov video attachments now play in Chrome / Firefox / Edge
bug-reports, attachments, bugfix, ux
macOS screen recordings (saved as .mov / video/quicktime) used to render as a blank video player outside Safari. They now play in every major browser via a dual-source declaration, with a graceful download fallback for unsupported codecs.
## What changed
Video attachments with MIME type `video/quicktime` (or filename ending in `.mov`) now play in Chrome, Firefox, and Edge — not just Safari. Same single click; no extra UI.
## Why it was broken
Only Safari plays `.mov` natively (it's the macOS native format). Chrome / Firefox / Edge refuse the `video/quicktime` MIME type even when the file underneath is a perfectly valid H.264 + AAC stream (which is what macOS screen recordings actually contain — they're structurally compatible with MP4, just wrapped in a QuickTime container).
## The fix
For `.mov` attachments, the dashboard now declares two `<source>` tags inside the `<video>` element:
1. `<source type="video/mp4">` first — tricks Chrome / Firefox / Edge into demuxing the QuickTime container as MP4. Works for ~99% of .mov files in practice.
2. `<source type="video/quicktime">` second — for Safari, which prefers the accurate MIME.
Browsers pick the first source they can decode. The rare HEVC-encoded .mov (where the trick won't work) falls through to an inline "Your browser can't play this video — Download it to view" link.
## What's next
A bigger follow-up is in progress: server-side ffmpeg transcode at upload time. Every new .mov upload will be transcoded to a real .mp4 server-side, so playback no longer depends on browser quirks. Will cover every attachment entry point (manual upload, FAB / SDK, Chrome extension) as well as Notes attachments.
v4.18 — Deleting the last test step now persists
test-cases, data-integrity, bugfix
TEST-271 fix. The "X" button on the only remaining test step in a test case now writes the deletion to the database. Previously the row cleared in the UI but reverted on reload.
## What changed
On the test-case detail page, clicking the **X** delete icon next to the only remaining test step now saves the empty state to the database. Reloading the page no longer resurrects the deleted step.
## Root cause
The delete-step handler had two branches: the multi-step path called `saveSteps()` after the in-memory splice; the single-step path cleared the row but **didn't persist**. Reloading re-fetched the unchanged database row and the deletion appeared to revert.
Fix: one-line addition so the single-step branch also persists. The existing blank-filter inside `saveSteps()` strips empty rows before sending, so the database stores `steps: []` and the UI shows one empty placeholder row to type into on next visit.
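A minimal sketch of the corrected handler, assuming a step is an action/expected pair and that persistence is an injected callback. The names `deleteStepAt` and `stepsToPersist` are hypothetical; only `saveSteps()` and its blank filter are mentioned in the notes above.

```typescript
interface TestStep { action: string; expected: string; }

const blankStep = (): TestStep => ({ action: "", expected: "" });

// Blank filter: rows with no action and no expected text are never sent.
function stepsToPersist(steps: TestStep[]): TestStep[] {
  return steps.filter((s) => s.action.trim() !== "" || s.expected.trim() !== "");
}

function deleteStepAt(
  steps: TestStep[],
  index: number,
  save: (steps: TestStep[]) => void
): TestStep[] {
  const remaining = steps.filter((_, i) => i !== index);
  // The UI always keeps one row to type into, even when nothing is saved.
  const next = remaining.length > 0 ? remaining : [blankStep()];
  // The fix: persist on every delete, including the last-step case.
  save(stepsToPersist(next));
  return next;
}
```

Deleting the only step therefore saves `[]` while the UI still shows a single empty placeholder row.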
## Backward compatibility
No schema changes. Multi-step deletes already worked and were not touched. The placeholder-row behavior on test cases with zero saved steps is unchanged — the page init logic still adds an empty row to type into.
v4.17 — Comments thread auto-collapses; newest comment shown first
ui-uxbug-reportscomments
Follow-up to v4.16. The COMMENTS / PROMPTS thread on every bug-report page now collapses by default (click heading to expand) and sorts newest-first. The composer stays visible whether collapsed or expanded.
## What changed
On every bug-report detail page:
- **Auto-collapsed by default** — the comments thread starts hidden. The COMMENTS / PROMPTS (x) heading is a click target with a chevron; click it to expand the thread. Same pattern as the Similar Issues and Change Log toggles already on the page.
- **Composer always visible** — you don't have to expand the thread to post a comment. The text input and Submit button sit below the heading whether the thread is open or closed.
- **Newest-first sort** — top-level comments now appear in newest-to-oldest order. Replies inside a thread stay chronological so conversations still read top-to-bottom within a thread.
- **New comments insert at the top** — covers all three paths: self-submit, Jira sync polling, and Supabase realtime INSERT.
- **Self-submit auto-expands** — posting from the composer reveals the thread automatically so you actually see your just-posted comment. Comments posted by teammates or imported from Jira do NOT auto-expand (the count bump in the heading is enough of a signal without yanking your viewport).
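The ordering rule (top-level newest-first, replies chronological) can be sketched as below. The `ThreadComment` shape and `orderThread` are hypothetical, and the sketch assumes replies are only one level deep.

```typescript
// Hypothetical comment shape; field names are assumptions for illustration.
interface ThreadComment {
  id: string;
  parentId: string | null; // null for top-level comments
  createdAt: string; // ISO 8601 timestamp
}

/** Orders a flat comment list: newest top-level first, replies oldest-first under their parent. */
function orderThread(comments: ThreadComment[]): ThreadComment[] {
  const oldestFirst = (a: ThreadComment, b: ThreadComment): number =>
    Date.parse(a.createdAt) - Date.parse(b.createdAt);

  const topLevel = comments
    .filter((c) => c.parentId === null)
    .sort((a, b) => oldestFirst(b, a)); // reversed comparator gives newest-first

  const ordered: ThreadComment[] = [];
  for (const root of topLevel) {
    ordered.push(root);
    // Replies stay chronological so each conversation reads top to bottom.
    ordered.push(...comments.filter((c) => c.parentId === root.id).sort(oldestFirst));
  }
  return ordered;
}
```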
## Why
v4.16 moved the thread into the main column directly under Description, which made long comments much easier to read but also took more vertical space on busy tickets. The collapse default keeps the page compact while preserving one-click access to the full discussion. Newest-first matches how people scan retest narratives and AI analyses — the latest action is the one you usually want.
## What still works
- Reactions, threaded replies, edit, delete, @mention autocomplete.
- Jira sync polling + Supabase realtime — both continue to drive inserts via the same render path; the toggle is a separate concern.
- The per-comment "Show more / Open in popup" controls — re-measured when the thread expands so they appear correctly even on comments that were inserted while the thread was collapsed.
v4.16 — Bug-report comments moved to main column, with popup + clamp controls
ui-uxbug-reportscomments
The Comments thread moved out of the right sidebar to the main column directly under Description. Long comments now auto-clamp with Show more / Show less, and a new Open in popup option opens a focused modal for full-width reading.
## What changed
On every bug-report detail page, the **COMMENTS / PROMPTS** thread (previously titled "Comments" and tucked into the right sidebar) now lives in the main column directly under Description. Wider column = much easier reading on long retest narratives, AI analyses, and multi-paragraph discussions.
### Heading rename
- **Comments** → **COMMENTS / PROMPTS (x)** — reflects that the agent flow now uses this thread for retest prompts in addition to discussion.
- The (x) count parentheses always render, even at 0, so the slot doesn't shift when the first comment arrives.
### Long-comment controls
Comments now auto-clamp at ~280px tall when content overflows, with a fade-out gradient at the bottom and a small toolbar:
- **Show more / Show less** — inline expand/collapse, keeps your scroll position. Best for skimming in line with the rest of the thread.
- **Open in popup** — opens a centered modal with the full comment content, internal scroll, close on X / backdrop click / ESC. Best for reading or copying long content without page-scroll fighting.
The modal preserves rendered formatting: mentions, links, code blocks, and lists all carry through verbatim.
### What still works
All polling and realtime paths verified intact:
- Jira sync (every 30s) — imported comments get the new clamp/popup controls automatically.
- Supabase realtime — INSERT, UPDATE, and DELETE events all play correctly. Remote edits re-measure the wrapper so a comment edited from short to long (or vice versa) gets its toolbar state refreshed.
- Reactions, replies, edit / delete, the composer, and @mention autocomplete are unchanged.
## Why
The previous right-sidebar placement worked when there were 1-2 short comments. As tickets accumulated retest notes, AI analyses, and threaded replies, the narrow column made it hard to follow long content. Moving comments into the main column under Description matches the natural reading flow of a bug report (description → discussion) and gives roughly 2× the width.
v4.15 — Mobile BrowserStack runs honor the picker's OS version
automationsmobilebrowserstackbugfix
TEST-65 re-fix. Pixel 9 (and any other modern flagship) now reaches BrowserStack with the OS version you picked in the dropdown instead of falling back to Android 14.
## What changed
Mobile Appium runs (Run Now on **Automate Mobile** and cron-triggered mobile schedules) now use the OS version you select in the device picker. Previously the Run Now dialog stored only the device id; the backend filled in a hardcoded fallback that defaulted to Android 14 for any device not in a small static list — which BrowserStack rejected for newer devices like the Pixel 9.
### Fix details
- The **Run Now** modal at *Automate Mobile → automation → Run Now* now passes the OS version that matches your dropdown selection. The run row in the dashboard also displays it correctly (instead of blank).
- The runner-side `runAppiumOnBrowserStack` now queries BrowserStack's live device catalog when the caller doesn't supply an OS version, so cron-triggered schedules also send a version BrowserStack accepts.
- The internal fallback catalog (used only when the live query times out) was expanded to the current lineup: Pixel 6→10, Samsung Galaxy S21→S26, iPhone 13→17, plus current OnePlus / Motorola models. Last-resort default bumped from `14.0` / `17` to `15.0` / `18.6`.
- Regression test added so this can't silently re-break.
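The resolution order in the bullets above might look roughly like this. It is a simplified, synchronous model: the catalogs are plain maps, a `null` live catalog stands in for a timed-out query, and `resolveOsVersion`, `DeviceRequest`, and the assignment of `15.0` to Android and `18.6` to iOS are assumptions rather than the runner's real code.

```typescript
type Platform = "android" | "ios";

interface DeviceRequest {
  device: string;
  platform: Platform;
  osVersion?: string; // set when the caller (e.g. the Run Now modal) supplies one
}

// Last-resort defaults quoted in the release note; the per-platform mapping is assumed.
const LAST_RESORT: Record<Platform, string> = { android: "15.0", ios: "18.6" };

/**
 * Picks the OS version to send to BrowserStack.
 * liveCatalog is null when the live query timed out.
 */
function resolveOsVersion(
  req: DeviceRequest,
  liveCatalog: Map<string, string> | null,
  fallbackCatalog: Map<string, string>,
): string {
  // 1. An explicit selection always wins.
  if (req.osVersion) return req.osVersion;
  // 2. Otherwise consult the live catalog, or the static one only if the live query failed.
  const catalog = liveCatalog ?? fallbackCatalog;
  // 3. Unknown device: fall back to the platform default.
  return catalog.get(req.device) ?? LAST_RESORT[req.platform];
}
```

The real `runAppiumOnBrowserStack` queries BrowserStack over the network, so its lookup would be asynchronous; the precedence is the part this sketch is meant to show.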
## Why
The May 9 retest claim that this was fixed was wrong — it referenced an update to the web-Playwright Live-run picker that doesn't cover the real-mobile Appium path. Russell pushed the ticket back this morning; this re-fix targets the right code path.
## No action needed
Existing mobile automations and schedules continue to work. Next time you pick a device in Run Now, the OS shown in the dropdown is what BrowserStack receives.
v4.14 — MCP Connector redirect URI is now required
mcpoauthdevelopersux
The Redirect URIs field on the MCP Connectors card no longer pre-fills claude.ai's callback. It's a placeholder + required field, matching the multi-tenant intent of the credentials.
## What changed
**Settings → Developers → MCP Connectors → Redirect URIs** used to silently default to `https://claude.ai/api/mcp/auth_callback` if you left it blank. That made the field look like it had a sensible default even when you were generating credentials for a different MCP host.
It's now a true placeholder, just like the "Connector name" field next to it:
- Placeholder reads `e.g. https://claude.ai/api/mcp/auth_callback`.
- Field is `required` — submitting empty returns "At least one redirect URI is required".
- Hint copy: "ask your MCP host which callback URL to use".
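A sketch of the server-side check behind the required field. The empty-input message is the one quoted above; `parseRedirectUris`, the whitespace / comma splitting, and the invalid-URL message are assumptions.

```typescript
type RedirectUriResult =
  | { ok: true; uris: string[] }
  | { ok: false; error: string };

/** Parses the Redirect URIs field; rejects empty input instead of applying a default. */
function parseRedirectUris(raw: string): RedirectUriResult {
  const uris = raw
    .split(/[\s,]+/)
    .map((s) => s.trim())
    .filter((s) => s.length > 0);

  if (uris.length === 0) {
    return { ok: false, error: "At least one redirect URI is required" };
  }
  for (const uri of uris) {
    try {
      new URL(uri); // throws on values that are not absolute URLs
    } catch {
      return { ok: false, error: `Invalid redirect URI: ${uri}` };
    }
  }
  return { ok: true, uris };
}
```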
## Why
bug**Agent** is multi-tenant. A workspace might be connecting through claude.ai, but it might also be connecting through any other OAuth-aware MCP host with its own callback URL. Pre-filling claude.ai's URL hid the choice and produced quietly-wrong credentials for non-claude.ai hosts.
## Public docs
- `bugagent.com/mcp` Option 6, `bugagent.com/docs` "Connecting to the MCP Server" bullet 6, and `bugagent.com/api-reference` Connect via MCP all updated. References to "the default redirect URI is set for Claude.ai" are gone; Claude.ai's specific callback URL is now mentioned only as a worked example.
## Backward compatibility
No behavior change for existing connectors — they keep whatever redirect URI was stored when they were generated. New connectors require an explicit URI.
OAuth client_id/client_secret for the bugAgent MCP server are now described as host-agnostic across the dashboard settings UI and the public docs. Claude.ai is shown as one example among any OAuth-aware MCP host.
## What changed
The **Settings → Developers → MCP Connectors** card (previously labeled "Claude Connectors") generates OAuth credentials that work with any MCP host implementing OAuth 2.0 Authorization Code with PKCE — not just claude.ai.
### Dashboard
- Card renamed **Claude Connectors → MCP Connectors**.
- Card description, "Confidential" auth-method blurb, new-credentials banner, and empty-state line all rephrased to lead with "any OAuth-aware MCP host" instead of singling out claude.ai.
- Removed the claude.ai-only step-by-step help block at the bottom of the card. Per-host walkthroughs now live in the public docs where they belong.
### Public docs
- **mcp.bugagent.com/mcp** — Option 6 is now "OAuth-aware hosts (Claude.ai web shown as the example)" with a generic walkthrough that uses claude.ai as the worked example.
- **mcp.bugagent.com/docs** — "Connecting to the MCP Server" intro and bullet 6 reframed.
- **mcp.bugagent.com/api-reference** — "Connect via MCP" intro and Option 6 walkthrough match.
- **Homepage "How it works"** — Step 1 now explicitly mentions `client_id` / `client_secret` and the "MCP Connectors" card.
### Sign-in clarification
Fixed misleading docs that implied Google sign-in was required for the OAuth consent step. bug**Agent** uses Supabase Auth and supports both **Google OAuth and email/password** — whichever you use for the dashboard works for connector consent.
## Why
bug**Agent** is multi-tenant. Customers connect from many MCP hosts (Claude Code / Cursor / VS Code register themselves via RFC 7591 dynamic registration; other hosts ask for static `client_id` + `client_secret` upfront). The credential card and docs shouldn't imply that the credentials are claude.ai-specific — they're a generic OAuth 2.0 client registration that any MCP host can consume.
No behavior changes to credential generation, OAuth endpoints, or token lifecycle.
Daily: Ship to prod: v4.12 — public-docs catch-up for the Claude.ai connector flow (v4.11 follow-up)
dailyaibug-reports
16 commits — Improvements: - Ship to prod: v4.12 — public-docs catch-up for the Claude.ai connector flow (v4.11 follow-up) - Ship to prod: v4.11 — generate Claude connector OAuth credentials from the dashboard
Improvements:
- Ship to prod: v4.12 — public-docs catch-up for the Claude.ai connector flow (v4.11 follow-up)
- Ship to prod: v4.11 — generate Claude connector OAuth credentials from the dashboard
- Ship to prod: v4.01 — clicking "Create Bug" in sidebar opens the composer cleanly even when saved=collapsed
- Ship to prod: v4.00 — explicit h4/h5/h6 sizing in bug-report markdown so #### isn't bigger than #
- Ship to prod: v3.99 — collapse Similar Issues by default + gate sidebar sub-items on parent activeNav
v4.12 — Public docs caught up to the v4.11 Claude.ai connector flow
docsmcpoauthwebsite
The homepage, docs page, and API reference all mentioned "six connection options" and pointed at an old settings URL. All three now show the new Claude.ai (web) walkthrough alongside the existing API-key-based clients.
## What changed
Follow-up on v4.11. The dashboard's Claude Connectors UI shipped together with the website/mcp.astro walkthrough, but three other public surfaces still talked about "six connection options" and pointed at the old `/dashboard/settings/api-keys` URL.
- **bugagent.com/docs** — "Connecting to the MCP Server" feature list grew from six items to seven. Item 6 walks through generating a connector and pasting it into claude.ai. Fixed the URL pointing at the old API-keys page.
- **bugagent.com/api-reference** — "Six ways to connect" heading bumped to "Seven". Lead paragraph rewritten to call out the API-key vs OAuth split. Full Claude.ai walkthrough inserted as Option 6.
- **bugagent.com homepage** — How It Works "Step 1: Connect" now says "all 7 connection options"; Integrations grid MCP tile broadened to mention Claude.ai (web).
## Files changed
- `website/src/pages/docs.astro`
- `website/src/pages/api-reference.astro`
- `website/src/components/sections/HowItWorks.astro`
- `website/src/components/sections/Integrations.astro`
v4.11 — Generate Claude connector credentials right from the dashboard
developersmcpoauthdashboarddocs
Need a client_id + client_secret to add bugAgent as an MCP connector in claude.ai? Settings → Developers → Claude Connectors now generates both with one click.
## What changed
Until now bugAgent's MCP server only supported RFC 7591 dynamic client registration (DCR) — Claude Code, Cursor, and VS Code drive that flow inline, but the claude.ai web app prompts for a static client_id and client_secret upfront and has no DCR path.
Workspace owners can now generate connector credentials manually from `Settings → Developers → Claude Connectors`. Same UX as API keys: name + redirect URI(s), pick public (PKCE only) or confidential (id + secret), generate, copy once, never see the secret again. The success screen shows the values plus the authorization / token URLs to paste into claude.ai's "Add MCP connector" form.
Generated credentials live in a new `oauth_clients` table (migration 149), so they survive Railway container restarts. DCR-issued clients stay in memory and remain ephemeral by design. The MCP server's `clientsStore.getClient()` now reads from the DB first, falls back to the in-memory DCR map, and stamps `last_used_at` on every lookup.
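The lookup order can be modelled in memory as follows. This is a sketch under assumptions: a `Map` stands in for the `oauth_clients` table, `OAuthClientRecord` and `ClientsStore` are hypothetical, and the stamp is applied only to persisted rows here.

```typescript
interface OAuthClientRecord {
  clientId: string;
  redirectUris: string[];
  lastUsedAt?: Date;
}

/** In-memory model of a store that prefers persisted clients over DCR-issued ones. */
class ClientsStore {
  private readonly persisted: Map<string, OAuthClientRecord>; // stand-in for the DB table
  private readonly dynamic = new Map<string, OAuthClientRecord>(); // ephemeral DCR clients

  constructor(persisted: Map<string, OAuthClientRecord>) {
    this.persisted = persisted;
  }

  registerDynamic(client: OAuthClientRecord): void {
    this.dynamic.set(client.clientId, client);
  }

  getClient(clientId: string, now: Date = new Date()): OAuthClientRecord | undefined {
    const row = this.persisted.get(clientId);
    if (row) {
      row.lastUsedAt = now; // stamp the lookup on the persisted row
      return row;
    }
    // Not in the table: fall back to the in-memory DCR map.
    return this.dynamic.get(clientId);
  }
}
```

A real implementation would query the database asynchronously; the sketch keeps the calls synchronous to isolate the precedence rule.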
Security note: client secrets are stored as plaintext (required by the MCP SDK's plaintext-equality middleware on /token, same pattern Auth0/Okta use for OAuth secrets — distinct from API keys which are hashed because they're long-lived bearer tokens). Future hardening would swap the SDK middleware for a custom interceptor doing SHA-256 compare; not blocking for v1.
## Files changed
- `supabase/migrations/149_oauth_clients.sql` (new)
- `mcp-server/src/auth.ts`
- `dashboard/src/pages/dashboard/settings/developers.astro`
- `website/src/pages/mcp.astro` (added "Option 6 — Claude.ai" walkthrough)
v4.01 — "Create Bug" sidebar shortcut now opens the composer cleanly even when collapsed by default
bug-reportssidebardashboardui-fix
If you had the bug-report composer set to collapsed and clicked "Create Bug" in the sidebar, the composer briefly flashed expanded then closed and never opened. Two stacked bugs; both fixed.
## What changed
Clicking "Create Bug" in the sidebar links to `/dashboard/reports?new=1` and is meant to force the composer open. With a saved-collapsed preference, it instead rendered:
1. Briefly expanded (server-rendered default)
2. Then collapsed (saved=false branch ran)
3. Then nothing — the explicit-open branch never fired because…
Two bugs stacked:
- **Cookie-restore replaceState dropped `?new=1`.** A separate script that restores filter state from cookie ran before the composer-toggle script. It rebuilt the URL from filter params + project + team only, stripping `new=1` before the composer-toggle IIFE could read it. Fix: preserve `?new=1` in that replaceState.
- **Saved-state branch ran before the explicit-open branch.** Even when `?new=1` survived, the page first applied the saved-collapsed state for one frame, then re-expanded — visible flicker. Fix: hoist the `?new=1` detection ABOVE the saved-state collapse so the initial paint state is correct.
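Both fixes can be sketched as pure functions. The function names and the shape of the restored filter state are hypothetical; only the `/dashboard/reports` path and the `new=1` flag come from the note.

```typescript
/**
 * Rebuilds the reports URL from cookie-restored filters,
 * carrying ?new=1 over from the current URL if it is present.
 */
function rebuildReportsUrl(currentSearch: string, restored: Record<string, string>): string {
  const current = new URLSearchParams(currentSearch);
  const next = new URLSearchParams(restored);
  if (current.get("new") === "1") next.set("new", "1");
  const qs = next.toString();
  return qs ? `/dashboard/reports?${qs}` : "/dashboard/reports";
}

/**
 * Decides the composer's initial state.
 * The explicit ?new=1 check runs before the saved preference is applied.
 */
function initialComposerOpen(search: string, savedOpen: boolean): boolean {
  if (new URLSearchParams(search).get("new") === "1") return true;
  return savedOpen;
}
```

In the page itself, the first result would be passed to `history.replaceState` and the second would set the composer's state before first paint.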
## Files changed
- `dashboard/src/pages/dashboard/reports/index.astro`
v4.00 — Heading sizes h4–h6 in bug-report descriptions now follow the descending scale
bug-reportsmarkdownui-fixdashboard
Writing #### or ##### in a bug-report description used to render bigger than #. Explicit sizing for every heading level fixes the inverted hierarchy.
## What changed
The markdown renderer (markdown-it) emits real `<h1>`–`<h6>` elements, but the description CSS only customized h1/h2/h3 down to ~17/15/14 px. h4-h6 fell through to the browser user-agent stylesheet, where h4's default ~18 px ended up LARGER than our customized h1 at 16.8 px. Writing `#### Footnote` rendered visually bigger than `# Title`.
Added explicit sizes for every level so the descending visual cascade actually descends:
| Level | Size | Weight |
|---|---|---|
| h1 (`#`) | 1.05rem (~17 px) | 700 |
| h2 (`##`) | 0.95rem (~15 px) | 600 |
| h3 (`###`) | 0.85rem (~14 px) | 600 |
| h4 (`####`) | 0.8rem (~13 px) | 600 |
| h5 (`#####`) | 0.78rem (~12 px) | 500 |
| h6 (`######`) | 0.75rem (~12 px) | 500 |
## Version note
Rolls 3.99 → 4.00. Per the dashboard's MAJOR.MM versioning, a small bump when MM hits 99 rolls MM to 00 and bumps MAJOR. The change is three CSS rules — it just happens to be the release that crosses the boundary.
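The rollover rule is easy to state in code. A minimal sketch, assuming a small bump always increments MM by one; `nextVersion` is a hypothetical helper, not part of the release tooling.

```typescript
/** Computes the next MAJOR.MM version for a small bump, rolling 99 over to the next MAJOR. */
function nextVersion(current: string): string {
  const match = /^(\d+)\.(\d{2})$/.exec(current);
  if (!match) throw new Error(`Unexpected version format: ${current}`);

  let major = Number(match[1]);
  let minor = Number(match[2]);
  if (minor === 99) {
    major += 1; // MM hit 99: bump MAJOR
    minor = 0; // and roll MM back to 00
  } else {
    minor += 1;
  }
  // MM is always rendered with two digits.
  const mm = minor < 10 ? `0${minor}` : String(minor);
  return `${major}.${mm}`;
}
```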
## Files changed
- `dashboard/src/pages/dashboard/reports/[id].astro`
v3.99 — Similar Issues collapsed by default + sidebar sub-items hidden until parent active
bug-reportssidebardashboardui-fix
The bug-report detail page's Similar Issues card is now collapsed by default (matches the Change Log toggle). The sidebar only shows sub-items like "Create Bug", "Agent Queue", and "Probes" when their parent group is the active section.
## What changed
**Similar Issues — collapsed by default**
The Similar Issues card on the bug-report detail page used to render its full match list expanded. On popular bugs with lots of duplicates that pushed the Change Log, attachments, and time-tracking sections far down. Now the section opens collapsed with a clickable header, count badge, and chevron — same toggle pattern as the Change Log card right below it. Click to expand.
**Sidebar sub-items only show when the parent group is active**
The left rail used to always render "Create Bug" and "Agent Queue" under Bug Reports, and "Probes" under Security. Now those children only appear when the active page is in that section (or a sibling child is). On every other page the rail shows just "Bug Reports" and "Security" as parent items — much less visual noise.
Logic: a child item with `parent: '<id>'` renders if `activeNav` matches the parent, the child itself, or any sibling child of the same parent. The third clause keeps the subtree expanded as you navigate between siblings (e.g. Bug Reports → Agent Queue without "Create Bug" yanking out from under you).
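The three-clause rule reads naturally as a predicate. The `NavItem` shape, `isNavItemVisible`, and the example ids are hypothetical; only the `parent` field and `activeNav` are named in the note.

```typescript
interface NavItem {
  id: string;
  parent?: string; // id of the parent group, absent for top-level items
}

/** Returns true if a sidebar item should render for the current activeNav. */
function isNavItemVisible(item: NavItem, activeNav: string, items: NavItem[]): boolean {
  // Top-level items always render.
  if (!item.parent) return true;
  // Clause 1: the parent group is active. Clause 2: the child itself is active.
  if (activeNav === item.parent || activeNav === item.id) return true;
  // Clause 3: a sibling under the same parent is active, so the subtree stays expanded.
  return items.some((other) => other.parent === item.parent && other.id === activeNav);
}
```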
## Files changed
- `dashboard/src/components/Sidebar.astro`
- `dashboard/src/pages/dashboard/reports/[id].astro`
v3.98 — Markdown lists in bug-report descriptions actually show their numbers / bullets
bug-reportsdashboardmarkdownui-fix
Hotfix on top of v3.97. The markdown-it swap rendered descriptions correctly into the DOM but the list-marker CSS got eaten by the global reset because the dynamically-injected elements didn't carry Astro's scope hash.
## What changed
v3.97 swapped the bug-report description renderer to markdown-it + DOMPurify, which produces real `<ol>` / `<li>` / `<strong>` elements. The structural DOM was correct but list markers (1. 2. 3.) and bullets weren't visible — the page's scoped `<style>` block didn't match the dynamically-injected elements (Astro hashes scoped selectors per-component; elements injected via `set:html` don't carry the hash), so the global CSS reset (`* { margin: 0; padding: 0 }`, `ul, ol { list-style: none }`) won by default.
Fix: wrapped every descendant selector inside `.report-description` with `:global()` so the rules apply to the markdown-it output. The parent `.report-description` selector stays scoped — it's the static container that Astro renders directly.
After this change, numbered lists in a bug-report description show "1. 2. 3.", bullet lists show `•`, code fences render in a styled `<pre><code>` block, and headings / strong / italic / blockquote / hr / links are all visually styled.
## Files changed
- `dashboard/src/pages/dashboard/reports/[id].astro` — CSS only
v3.97 — Bug-report Markdown rendering swapped to a real parser (markdown-it + DOMPurify)
bug-reportsdashboardmarkdowndependencies
Bug-report descriptions now go through the markdown-it CommonMark parser, sanitized by DOMPurify. Code fences, nested lists, auto-linked URLs, and more all render correctly.
## What changed
v3.96 shipped a small regex-based Markdown renderer covering the dashboard's bug-report template. It worked, but the regex couldn't handle code fences, nested lists, tables, or reference links — anything beyond the very basic cases.
Replaced with `markdown-it` (a CommonMark-compliant parser used by GitHub, GitLab, and friends) wrapped in `isomorphic-dompurify` for XSS sanitization. Same call sites; far wider coverage:
- Fenced code blocks (```` ``` ````) render as `<pre><code>` with proper styling
- Nested lists nest visually
- Bare URLs in prose auto-link
- Reference links work
- Tables and other GFM-style features available behind plugins (none enabled today)
Security:
- `markdown-it` constructed with `html: false` — inline HTML in the source is escaped, not parsed
- DOMPurify allowlist scopes the surviving tags + attributes
- URL allowlist blocks `javascript:` / `data:` / `vbscript:` / `file:` schemes at the sanitizer
- 25 unit tests cover the formatting cases plus the XSS regression set
Bundle cost: ~110 KB gzipped, on the bug-report detail page only. Other dashboard pages don't import the library, so their bundles are unchanged.
## Files changed
- `dashboard/package.json` (added `markdown-it`, `@types/markdown-it`, `isomorphic-dompurify`)
- `dashboard/src/lib/render-markdown.ts` — full rewrite
- `dashboard/src/lib/render-markdown.test.ts` — 25 cases (formatting + XSS hardening)
- `dashboard/src/pages/dashboard/reports/[id].astro` — CSS updated for real `<ul>` / `<ol>` / `<li>` / `<pre>` styling
- `dashboard/src/pages/dashboard/notes/[id].astro` — inline regex stays (page uses `define:vars` so ES imports don't bundle)
v3.96 — Bug-report descriptions now render Markdown
bug-reportsui-fixdashboardmarkdown
Headings, bold, lists, code spans, blockquotes, and links in a bug report's description now render as formatted text instead of raw `##` and `**` characters.
## What changed
For a long time the bug-report detail page rendered descriptions verbatim — even though the dashboard's "Report a bug" template uses Markdown (`## Steps to Reproduce`, `**bold**`, numbered lists), every report showed those characters as literal text.
The Session Notes Edit/Preview tab pair has had a small regex Markdown renderer for a while. Extracted it into a shared helper, locked down the URL allowlist (`javascript:` / `data:` / `vbscript:` / `file:` URLs are rewritten to `#` so dropping the output into the page is XSS-safe), added 24 unit tests, and wired it through the bug-report detail page.
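The URL rule can be sketched as a single function. This is an approximation, not the contents of `render-markdown.ts`: `safeHref` is a hypothetical name, and the sketch blocks the four listed schemes rather than reproducing the helper's real allowlist.

```typescript
// Schemes named in the release note as rewritten to "#".
const BLOCKED_SCHEMES = ["javascript:", "data:", "vbscript:", "file:"];

/** Returns the URL unchanged if it looks safe, or "#" if it uses a blocked scheme. */
function safeHref(raw: string): string {
  // Strip whitespace and control characters and lowercase before checking,
  // so inputs like " JaVa\tScript:" cannot slip past the comparison.
  const normalized = raw.replace(/[\u0000-\u0020]/g, "").toLowerCase();
  return BLOCKED_SCHEMES.some((scheme) => normalized.startsWith(scheme)) ? "#" : raw;
}
```

Relative paths and ordinary `https:` links pass through untouched; only the scheme check decides whether the link is neutralized.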
Five write paths (Save, AI Format, Jira force-sync, real-time Jira diff, manual Jira pull) all route through the new `setDescriptionView()` client helper now, so live edits keep their formatted rendering without a page reload.
## Files changed
- `dashboard/src/lib/render-markdown.ts` (new)
- `dashboard/src/lib/render-markdown.test.ts` (new — 24 cases)
- `dashboard/src/pages/dashboard/reports/[id].astro` — SSR `set:html` + per-update-path render hook + scoped CSS for the rendered elements
- `dashboard/src/pages/dashboard/notes/[id].astro` — synced its inline copy with the new escapes + URL allowlist
v3.95 — Three small queue fixes: accessibility link, session-note toolbar, AI Assistant scope
accessibilitysession-notesaidashboard
Easy-to-fix queue batch — added an Accessibility Statement link to the in-app header dropdown, fixed the session-note markdown toolbar so it appears on first load, and stopped the AI Assistant from suggesting unreachable setup files.
## What changed
**Accessibility Statement link in the app**
The marketing site has an Accessibility Statement page but the in-app shell didn't link to it (the app sits on a subdomain that doesn't render the marketing footer). Added a link to the user-profile dropdown in the dashboard header — reachable from every page, opens in a new tab.
**Session note toolbar visibility**
The markdown formatting toolbar (bold / italic / headings / lists / quote / code / hr / link) on Session Notes was hidden until you toggled the Edit/Preview tab pair. The SSR display rule only revealed it for the 'bugtemplate' format, but every button in it is a markdown formatter. Now visible from first paint when the note format is markdown — both on the create form and on the detail page.
**AI Assistant respects the sandboxed runner**
The Optimize-script and AI Assistant prompts told Claude to "use storageState pattern" without warning it the runner has no access to a Playwright project's config files. Claude was occasionally recommending changes to `playwright.config.ts` / `auth.setup.ts` / `global-setup.ts` — all unreachable in our hosted product. Both prompts now include an ENVIRONMENT CONSTRAINTS section that hard-bans references to those files and pushes auth handling inline.
## Files changed
- `dashboard/src/components/DashboardHeader.astro`
- `dashboard/src/pages/dashboard/notes/[id].astro`
- `dashboard/src/pages/dashboard/notes/new.astro`
- `dashboard/src/pages/api/automations/[id]/optimize.ts`
- `dashboard/src/pages/api/ai/chat.ts`
v3.94 — Bug-report list rows: title finally vertically centers when neighboring cells wrap
bug-reportsui-fixdashboard
Third (and decisive) patch in the row-alignment chain. With display:flex stripped from the title cell, the title now centers consistently with every other cell regardless of which one goes multi-line.
## What changed
The title cell on the Bug Reports list had `display: flex; align-items: center` left over from a delete button that was removed earlier. Setting `display` on a `<td>` takes the cell out of table-row layout, which means the global `td { vertical-align: middle }` rule (added in v3.93) was silently ignored on the title cell specifically. So when a sibling cell wrapped to multiple lines, every other cell vertically centered correctly while the title cell stayed pinned to the row's top — exactly the "small titles look offset from the row's bottom border" symptom.
The `<a>` link inside the cell doesn't need a flex container around it. Removing `display: flex` drops the title cell back into normal table-cell layout, so it inherits `vertical-align: middle` and centers consistently with the rest of the row.
Third patch in the chain:
- v3.92 prevented `.person-cell` from wrapping in JS-rendered rows
- v3.93 forced `td { vertical-align: middle }` so cells with single-line content centered with multi-line neighbors
- v3.94 (this) removes flex from `.title-cell` so it actually USES that vertical-align
## Files changed
- `dashboard/src/pages/dashboard/reports/index.astro`
- `dashboard/src/components/BugReportRow.astro`
v3.93 — Table rows stay aligned even when one cell wraps to multiple lines
ui-fixdashboardtables
Even short titles like "bug 1" used to look misaligned with the row's bottom border whenever a neighboring cell wrapped. Every table cell now vertically centers its content, so row alignment stays clean regardless of which cell goes multi-line.
## What changed
Follow-up to v3.92. The previous fix made `.person-cell { white-space: nowrap }` apply to client-side rendered rows so a freshly rapid-submitted row's "Jason Hamilton" no longer wrapped. But the underlying alignment quirk persisted: the global `td` rule never set `vertical-align`, so cells fell back to the HTML default `baseline`. When any cell in a row wrapped (a long title, a wrapped tag list, anything multi-line), baseline-aligned cells anchored to the first line's baseline while flex-centered cells stayed centered — making short content like "bug 1" look offset from the row's bottom border.
Fix: `td { vertical-align: middle }` in the global stylesheet. Every cell vertically centers its content within the row, so adding wrap to one cell doesn't desync the others.
## Files changed
- `dashboard/src/styles/global.css`
v3.92 — Rapid-submit rows on the Bug Reports list now match SSR rows
bug-reportsui-fixdashboard
A freshly rapid-submitted bug report no longer renders taller than its neighbors because of long names wrapping in the Reported By / Assigned To columns.
## What changed
The Bug Reports list builds rows two ways: server-side (via the `BugReportRow` component) on initial load, and client-side (via the `buildRowHTML` JS helper) for rapid-submit / kanban-toggle / Realtime updates. The component had its row styles inside a scoped `<style>` block, which Astro hashes per-component — so JS-injected rows were missing rules like `.person-cell { white-space: nowrap }`. That made names like "Jason Hamilton" wrap to two lines on a freshly-prepended row, while neighboring SSR rows kept them on one line.
Fix: hoisted `.report-row`, `.report-title` (+ hover), `.person-cell`, `.resolution-cell`, `.resolution-empty`, `.report-time`, `.quality-pill`, and `.badge-resolution` into the parent page's existing `<style is:global>` block. Both render paths now share identical selectors and rules — heights, fonts, colors, hover states, and column wrapping are consistent regardless of how a row got there.
## Files changed
- `dashboard/src/pages/dashboard/reports/index.astro`
Daily: Ship to prod: v3.95 — TEST-41 / TEST-56 / TEST-98 small queue fixes
dailyaibug-reportsnotesapi
51 commits — Improvements: - Ship to prod: v3.95 — TEST-41 / TEST-56 / TEST-98 small queue fixes - Ship to prod: v3.94 — drop display:flex from .title-cell so vertical-align: middle finally takes effect
Improvements:
- Ship to prod: v3.95 — TEST-41 / TEST-56 / TEST-98 small queue fixes
- Ship to prod: v3.94 — drop display:flex from .title-cell so vertical-align: middle finally takes effect
- Ship to prod: v3.93 — vertical-align: middle on all <td> so rows stay aligned when one cell wraps
- Ship to prod: v3.92 — unify bug-report row styles so rapid-submit rows match SSR rows
- Ship to prod: v3.91 — surface case attachments, URLs, and text-template content during runs and in filed bugs
v3.91 — Test case attachments, reference URLs, and text-template content flow into runs and filed bugs
test-runstest-casesattachmentsdashboardux
The full author-supplied context for a test case — attachments, URLs, and free-form text instructions — now appears during a run and carries over when you file a bug report from a failed case.
## What changed
A bunch of TEST-212 / TEST-183 wiring was already in place — case-level attachments, reference URL lists, and the `template_type=text` variant. But the run carousel and the "Create Bug Report" flow only consumed a subset of it. This release fills the gaps.
### Run carousel
- Text-variant cases now show their `text_content` under "Test instructions" (previously labelled "Steps (text)"; the new label matches the case-detail page).
- New "Reference URLs" block with clickable links.
- New "Reference attachments" block with 72×72 thumbnails for images and a labeled file icon for everything else. Read-only.
### Bug-report description (Create Bug Report on a failed case)
- For `template_type=text` cases, the `text_content` is used as the body of the "Steps to Reproduce" section, so Steps is no longer empty when the case author chose the text variant.
- "Reference URLs:" section emitted when the case has reference URLs.
- "Notes:" section emitted when the tester typed notes during the run.
- Case-level attachments are forwarded to the new bug's media gallery alongside the run-level attachments. Files keep their original storage paths; metadata tagged with `source` + `from_case_id` for future cleanup logic.
### Run exports
- CSV gains an `Attachments` column (semicolon-separated `filename (url)` pairs).
- JSON: each result entry now includes the full `attachments` array.
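The CSV `Attachments` cell might be built like this. The `RunAttachment` shape, the function names, and the exact separator (`"; "`) are assumptions; the note only specifies semicolon-separated `filename (url)` pairs.

```typescript
interface RunAttachment {
  filename: string;
  url: string;
}

/** Formats a result row's attachments as semicolon-separated "filename (url)" pairs. */
function attachmentsCell(attachments: RunAttachment[]): string {
  return attachments.map((a) => `${a.filename} (${a.url})`).join("; ");
}

/** Quotes a CSV field when it contains a comma, quote, or line break. */
function csvEscape(value: string): string {
  return /[",\n\r]/.test(value) ? `"${value.replace(/"/g, '""')}"` : value;
}
```

A semicolon separator keeps the pairs inside one column, since commas already delimit CSV fields.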
## Files changed
- `dashboard/src/pages/dashboard/test-cases/runs/[id].astro`
v3.81 — Run attachments carry over when you file a bug from a failed test
test-runs bug-reports attachments dashboard api
Click "Create Bug Report" on a failed case in a run and the screenshots you already attached come along — no re-uploading.
## What changed
Filing a bug report from a failed test case in the run carousel now forwards every attachment on that result row to the new bug's media gallery. Files stay at their original storage paths (no duplicate copies); the bug page references them directly. Each forwarded entry is tagged `source: 'test_run_attachment'` plus the originating run / case ids so future cleanup logic can keep the file alive while either record references it.
Mechanically: `POST /api/reports/quick-submit` now accepts an optional `prefilled_media` formData field — a JSON-stringified array of attachment metadata — and seeds the bug's `media[]` from it. Freshly uploaded files attached in the same submission are merged in afterwards rather than replacing.
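The seeding-then-merging behaviour can be sketched roughly as follows. The `MediaEntry` shape and both function names are hypothetical; only the `prefilled_media` field name and the "merge rather than replace" rule come from the notes above.

```typescript
// Hypothetical media entry; the real metadata carries more fields.
interface MediaEntry {
  path: string;
  source?: string;
}

// Parse the optional `prefilled_media` form field. Anything that is not a JSON
// array of objects with a string `path` is ignored rather than trusted.
function parsePrefilledMedia(raw: string | null): MediaEntry[] {
  if (!raw) return [];
  let parsed: unknown;
  try {
    parsed = JSON.parse(raw);
  } catch {
    return [];
  }
  if (!Array.isArray(parsed)) return [];
  return parsed.filter(
    (e): e is MediaEntry =>
      typeof e === "object" && e !== null && typeof (e as { path?: unknown }).path === "string",
  );
}

// Seed media[] from the prefilled entries, then append fresh uploads.
function buildMedia(raw: string | null, uploaded: MediaEntry[]): MediaEntry[] {
  return [...parsePrefilledMedia(raw), ...uploaded];
}
```

Treating malformed input as "no prefilled media" keeps a bad client payload from blocking the bug report itself; whether the real endpoint rejects such input instead is not stated here.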
## Files changed
- `dashboard/src/pages/api/reports/quick-submit.ts`
- `dashboard/src/pages/dashboard/test-cases/runs/[id].astro` (createBugReport flow)
v3.80 — Tester notes on test runs are no longer silently dropped
test-runs data-integrity dashboard api
The Notes textarea on each case in the run carousel now persists what you type — previously the column it was meant to write to didn't exist.
## What changed
Migration 148 adds a `notes text DEFAULT ''` column to `test_run_results`. The run carousel UI has been shipping a Notes textarea per case since the carousel first rolled out, but the API discarded the field and the column it was meant to land in was never added — every keystroke evaporated on blur or refresh. Surfaced as a TEST-183 follow-up while wiring the new attachments column.
Now:
- `POST /api/test-runs/:id/results` reads `notes` off the body and persists it on insert + update.
- `GET /api/test-runs/:id` returns `notes` per row (defaults to empty string for legacy rows).
- The UI required no changes — it has always sent and read the field correctly.
## Files changed
- `supabase/migrations/148_test_run_results_notes.sql` (new)
- `dashboard/src/pages/api/test-runs/[id]/results.ts`
- `dashboard/src/pages/api/test-runs/[id].ts`
- `website/src/pages/api-reference.astro`
v3.70 — Test case Name is no longer required up-front
test-cases ux dashboard
You can now click +Add test case, fill in a step or description, and click Save without typing a name first — we derive a sensible default for you.
## What changed
The Name input on `/dashboard/test-cases/new` is now optional. Most testers prefer to fill in a step or description first and name the case based on what it ended up covering — the previous Save button blocked that flow.
When you click Save with an empty Name, the form picks one from your existing input, in this priority order:
1. First non-empty step’s Action
2. First line of Description
3. First line of Preconditions
4. First line of `text_content` (text-template variant)
5. Literal “Untitled test case”
Derived names are capped at 80 characters; longer text is cut at a word boundary and ends with an ellipsis. You can rename the case from its detail page at any time afterwards.
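The priority order and truncation rule above can be sketched as below. This is an illustration under assumptions, not the contents of `derive-test-case-name.ts`: the `TestCaseDraft` field names (other than `text_content`), the choice to skip blank leading lines, and counting the ellipsis toward the 80 characters are all guesses.

```typescript
// Hypothetical form shape.
interface TestCaseDraft {
  name?: string;
  steps?: { action?: string }[];
  description?: string;
  preconditions?: string;
  text_content?: string;
}

const MAX_NAME_LENGTH = 80;

// First non-empty line of a possibly multi-line string, trimmed.
function firstLine(text: string | undefined): string {
  for (const line of (text ?? "").split(/\r?\n/)) {
    const trimmed = line.trim();
    if (trimmed) return trimmed;
  }
  return "";
}

// Cap the length, cutting at a word boundary and appending an ellipsis.
function truncateAtWord(text: string, max: number = MAX_NAME_LENGTH): string {
  if (text.length <= max) return text;
  const slice = text.slice(0, max - 1); // leave room for the ellipsis
  const lastSpace = slice.lastIndexOf(" ");
  const cut = lastSpace > 0 ? slice.slice(0, lastSpace) : slice;
  return cut.trimEnd() + "…";
}

function deriveTestCaseName(draft: TestCaseDraft): string {
  const typed = (draft.name ?? "").trim();
  if (typed) return typed; // a name the user typed always wins
  const firstAction = (draft.steps ?? [])
    .map((s) => firstLine(s.action))
    .find((a) => a !== "");
  const candidate =
    firstAction ||
    firstLine(draft.description) ||
    firstLine(draft.preconditions) ||
    firstLine(draft.text_content) ||
    "Untitled test case";
  return truncateAtWord(candidate);
}
```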
## Files changed
- `dashboard/src/lib/derive-test-case-name.ts` (new, with 11 Vitest cases)
- `dashboard/src/lib/derive-test-case-name.test.ts` (new)
- `dashboard/src/pages/dashboard/test-cases/new.astro`
v3.69 — Test case step fields wrap correctly on the detail page
test-cases ui-fix dashboard
Long Action / Expected Result text in test step rows now wraps across multiple lines instead of being clipped at ~50 characters.
## What changed
On the test case detail page, every step’s Action and Expected Result textarea now renders the full content, wrapped across as many lines as needed. Previously a 113-character string would visibly truncate at ~49 characters — you could see step 5 and step 6 looking identical even when their text diverged.
The culprit was font-load timing. The dashboard’s body font is Inter, which loads asynchronously from Google Fonts. The auto-grow logic that sizes each textarea ran synchronously at page-init time using the system-fallback font, cached the resulting height, and never re-ran once Inter swapped in. Inter’s slightly wider character metrics caused additional line wraps that the cached height didn’t account for, and the textarea’s `overflow: hidden` style hid the new bottom lines.
Fix:
- Re-run autoGrow on `document.fonts.ready` so heights settle once webfonts are on the page.
- Re-run on window-resize (debounced) so responsive layout changes don’t strand a stale height.
- Defensive CSS: `min-width: 0`, explicit `white-space: pre-wrap`, and `overflow-wrap: anywhere` on the step textareas so very long unbreakable substrings still wrap by character.
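The first two bullets of the fix can be sketched like this. It is written against a small structural type and injected hooks so it runs outside a browser; `Growable`, `wireAutoGrow`, and the 150 ms debounce are assumptions, and the page's real `autoGrow` may differ.

```typescript
// Minimal structural type; a real HTMLTextAreaElement satisfies it.
interface Growable {
  style: { height: string };
  scrollHeight: number;
}

// Reset, then size to content, so the element can shrink as well as grow.
function autoGrow(el: Growable): void {
  el.style.height = "auto";
  el.style.height = `${el.scrollHeight}px`;
}

// Trailing-edge debounce: only the last call within `ms` runs.
function debounce(fn: () => void, ms: number): () => void {
  let timer: ReturnType<typeof setTimeout> | undefined;
  return () => {
    if (timer !== undefined) clearTimeout(timer);
    timer = setTimeout(fn, ms);
  };
}

function wireAutoGrow(
  els: Growable[],
  fontsReady: Promise<unknown>,
  onResize: (handler: () => void) => void,
  debounceMs = 150,
): void {
  const growAll = () => els.forEach(autoGrow);
  growAll(); // initial pass, may still measure with the fallback font
  void fontsReady.then(growAll); // re-measure once webfonts have loaded
  onResize(debounce(growAll, debounceMs)); // re-measure after layout changes
}

// Browser usage (not executed here):
// wireAutoGrow(
//   Array.from(document.querySelectorAll<HTMLTextAreaElement>("textarea")),
//   document.fonts.ready,
//   (handler) => window.addEventListener("resize", handler),
// );
```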
## Files changed
- `dashboard/src/pages/dashboard/test-cases/[id].astro`
v3.68 — Attach screenshots to failed test cases mid-run
test-runs test-cases attachments dashboard api
Testers can now attach screenshots and recordings to a failed test case directly inside the run carousel — no more "file a bug just to capture evidence."
## What changed
When a test case fails inside a run, the carousel now shows an **Attach screenshot** button below the Actual Result textarea. Up to 8 files per upload (25 MB each, 100 MB total batch). Images render as 72×72 thumbnails you can click to open full-size; non-image files show a labeled icon. A × button on each thumbnail removes both the file and its metadata.
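As a rough illustration of the upload limits above, a batch check might look like the following. The function name, the error strings, and the use of binary megabytes (1 MB = 1024 × 1024 bytes) are assumptions; the notes do not say how the endpoint counts bytes or words its errors.

```typescript
interface UploadFile {
  name: string;
  size: number; // bytes
}

const MB = 1024 * 1024; // assumption: binary megabytes
const MAX_FILES = 8;
const MAX_FILE_BYTES = 25 * MB;
const MAX_BATCH_BYTES = 100 * MB;

// Returns null when the batch is acceptable, otherwise a human-readable reason.
function validateUploadBatch(files: UploadFile[]): string | null {
  if (files.length === 0) return "No files selected";
  if (files.length > MAX_FILES) return `At most ${MAX_FILES} files per upload`;
  const tooBig = files.find((f) => f.size > MAX_FILE_BYTES);
  if (tooBig) return `${tooBig.name} exceeds 25 MB`;
  const total = files.reduce((sum, f) => sum + f.size, 0);
  if (total > MAX_BATCH_BYTES) return "Batch exceeds 100 MB in total";
  return null;
}
```

Note that the per-file and per-batch limits interact: five files at the 25 MB maximum already exceed the 100 MB batch ceiling even though the count is under 8.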
Under the hood the existing `test_run_results.attachments` jsonb column (in the schema since the test-management feature first shipped) is finally wired up. New endpoints power the upload + delete:
- `POST /api/test-runs/:id/results/attachments` (multipart)
- `DELETE /api/test-runs/:id/results/attachments`
Files land in the `bug-attachments` bucket under a `team-id/test-runs/run-id/case-id/...` path so per-run cleanup is straightforward.
## Why
Mid-run, a tester wants to mark the failure, capture evidence, and move on — then triage bugs at the end. Forcing a bug-report-creation detour for every fail broke that flow. Now the evidence sticks to the case in the run record itself.
## Files changed
- `dashboard/src/pages/api/test-runs/[id]/results/attachments.ts` (new)
- `dashboard/src/pages/api/test-runs/[id].ts` — GET now returns `attachments`
- `dashboard/src/pages/dashboard/test-cases/runs/[id].astro` — fail-section thumbnail strip
- `website/src/pages/api-reference.astro` — new endpoint docs
v3.58 — Test case color picker no longer clipped off-screen
test-cases ui-fix dashboard
The Change Color popup on the Test Cases list now stays inside the viewport regardless of how wide your screen is.
## What changed
Clicking the Change Color icon on a test case row used to open a 130px-wide swatch grid that extended off the right edge of the viewport — only 3 of the 13 colors were visible because the popup was anchored to the rightmost column with `position: absolute` and no relative-positioned ancestor.
The popup now positions itself with viewport coordinates (`position: fixed` + `getBoundingClientRect`), right-anchored to the icon so it grows leftward into the page, and clamps itself inside the viewport with an 8px margin on either side. Resize the browser window narrow and the popup still shows all 13 colors.
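The horizontal part of that positioning reduces to a small clamp. This sketch assumes the caller already has the icon's `right` edge from `getBoundingClientRect()` and the viewport width; the function name is hypothetical and vertical placement is left out.

```typescript
const VIEWPORT_MARGIN = 8; // px, per the description above

// Compute the `left` value for a `position: fixed` popup that is right-anchored
// to its trigger icon and clamped inside the viewport.
function computePopupLeft(iconRight: number, popupWidth: number, viewportWidth: number): number {
  const rightAnchored = iconRight - popupWidth; // popup's right edge meets the icon's
  const maxLeft = viewportWidth - popupWidth - VIEWPORT_MARGIN;
  return Math.max(VIEWPORT_MARGIN, Math.min(rightAnchored, maxLeft));
}
```

With a 130px popup, an icon whose right edge sits at 100px would ask for `left = -30`; the clamp pulls it back to 8 so all swatches stay visible. Applying the lower bound last means a viewport narrower than the popup still keeps the left edge on screen.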
## Files changed
- `dashboard/src/pages/dashboard/test-cases/index.astro` — color picker positioning rewrite
17 commits — Improvements: - Ship to prod: v3.47 — security: bump hono 4.12.16→4.12.18, fast-uri 3.1.0→3.1.2 - deps(runner): bump lighthouse from 13.2.0 to 13.3.0 in /runner
Improvements:
- Ship to prod: v3.47 — security: bump hono 4.12.16→4.12.18, fast-uri 3.1.0→3.1.2
- deps(runner): bump lighthouse from 13.2.0 to 13.3.0 in /runner
- deps(runner): bump browserstack-node-sdk in /runner
- deps(runner): bump @types/node from 25.6.0 to 25.6.2 in /runner
- deps(website): bump astro from 6.3.0 to 6.3.1 in /website
The Impact Score badge on the bug-report detail page used to read "{N} similar · {M} users" with a tooltip calling that "Affected Users". Both phrasings implied "end-users of your product who hit the bug" — but the underlying SQL counts distinct team members who filed similar reports in the last 30 days, not end-user impact. Especially confusing for multi-tenant customers. Pure label change; the math is untouched.
## What changed
On `/dashboard/reports/[id]`:
- Visible breakdown: `{M} users` → `{M} reporters` (singular: `reporter`).
- Tooltip: `Affected Users: ${score}/30 (${M} users)` → `Reporter diversity: ${score}/30 (${M} distinct team members filed similar reports in the last 30 days)`.
- Internal vars renamed for clarity (`affectedUsers` → `distinctReporters`, `affectedScore` → `reporterScore`). Local consts only; no API impact.
## Why
The badge measured one thing ("how many of YOUR team members filed similar reports") and labeled it as another ("how many users were affected"). For a small dogfood team the gap was harmless, but for a customer with a 50-person QA team reading the badge in a multi-tenant rollout, the labeling actively misleads — they'd read "5 affected users" as "5 of our end-users hit this bug in production" when it actually means "5 of our QA team filed reports of this type."
A real end-users-affected metric (separate ticket if customer demand surfaces) would need an SDK `setUser({id})` API, a new `affected_end_user_id` column, and the badge math swapped to count distinct end-users instead of distinct reporters.
## Verification
436 / 436 Vitest passing, dashboard build clean, repo-wide grep confirms no stale references to the old variable names.
All three Railway services (Dashboard, MCP Server, Automation Runner) now run in us-east4 (Virginia, GCP) — same metro as Supabase us-east-1 (Virginia, AWS). Removes ~70ms cross-coast RTT from every Supabase call; an authed dashboard page used to spend 350–700ms on pure DB-network before Postgres ran a query.
## What changed
- `region = "us-east4"` pinned in the `[deploy]` block of all three `railway.toml` files (dashboard, mcp-server, runner). Pinning in toml makes the source-of-truth committable rather than a "someone clicked a thing in the dashboard 6 months ago" black box, and guarantees future rebuilds don't drift back.
- Existing blue/green config carried the cutover — Railway spun new pods in us-east4, health-checked them, switched the router, and drained the old us-west2 pods for the configured window. Zero downtime.
- CLAUDE.md "Hosting" line updated to reflect the new region.
## Cutover
```
06:27:39 UTC All three services start BUILDING merge commit 22192856
06:30:54 UTC All three services SUCCESS, /api/health and /health return 200, version 3.37
```
End-to-end ~3 minutes. No deploy errors, no error_logs spike during cutover.
## What's next
Phase 1 was the dependency for the rest of the SEA expansion. Three follow-ups now unblocked:
- Layer A — Cloudflare edge caching for marketing/docs/public reads
- Layer B — Supabase Singapore read replica + dashboard read/write split
- Layer C — Railway Singapore pod + Cloudflare geo-routing
## Verification
Dashboard `/api/health`, MCP Server `/health`, and Automation Runner `/health` all return 200 with the new version. Sentry `dashboard.middleware.team-resolve` p50 is expected to drop from ~70ms to <10ms as production traffic hits the new pods over the next hour.
v3.36 — sending a team invite no longer waits 2 seconds for the email to leave Resend
dashboard team invites performance bugfix
Clicking "Send Invite" on /dashboard/settings/team used to block the page redirect on a sequential await sendEmail() to Resend. The email send is now fire-and-forget on the same pattern documented for assignment emails — the invitation row is created on the critical path, the email lands in the background, and the user sees the redirect ~100ms later instead of after a 500ms–2s lag.
## What changed
- The page-POST critical path now does only the work the user-visible response depends on: an unconditional DELETE of any existing pending row for this email/team, then INSERT of the new row, then redirect. (Always-delete is correct because the prior code's check-then-delete-then-insert ended in INSERT anyway.)
- New `sendInviteEmail()` helper holds the inviter-profile lookup + sendEmail call. The page handler invokes it without await — Railway's 60-second blue/green drain window gives the post-flush promise plenty of time to complete.
- Errors log to Railway with `team_id` / `to` / `role` context. If a transient email failure happens (Resend outage, domain glitch, rate limit), the invitation row is already in the team list as "Pending" and the inviter can resend or cancel.
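The shape of the change is sketched below with injected dependencies so it runs standalone. `handleInvite`, `sendInviteEmailInBackground`, and the callback signatures are hypothetical stand-ins, not the real page handler or the `sendInviteEmail()` helper.

```typescript
interface InviteContext {
  team_id: string;
  to: string;
  role: string;
}

type SendEmail = (to: string) => Promise<void>;
type LogFn = (message: string, context: InviteContext, err: unknown) => void;

// Start the email without awaiting it. Failures are logged with context and never
// reach the caller, so they cannot delay or break the redirect.
function sendInviteEmailInBackground(
  sendEmail: SendEmail,
  ctx: InviteContext,
  log: LogFn = console.error,
): void {
  void Promise.resolve()
    .then(() => sendEmail(ctx.to))
    .catch((err: unknown) => log("invite email failed", ctx, err));
}

// Critical path: persist the invitation, kick off the email, redirect.
async function handleInvite(
  insertInvite: () => Promise<void>,
  sendEmail: SendEmail,
  ctx: InviteContext,
  log: LogFn = console.error,
): Promise<string> {
  await insertInvite(); // the row must exist before the user is redirected
  sendInviteEmailInBackground(sendEmail, ctx, log); // deliberately not awaited
  return "/dashboard/settings/team";
}
```

The trade-off is that the user no longer sees an inline error when the email fails; the pending row plus the resend and cancel actions described above cover that case.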
## Latency
Critical path: ~700ms–2.2s → ~100–150ms.
Matches the canonical pattern documented in CLAUDE.md ("Emails on assignment go through sendEmail() fire-and-forget. Don't await — Resend can be slow.") that had been applied to assignment emails but missed for the invite path.
## Verification
436 / 436 Vitest passing, dashboard build clean.
v3.35 — Slack alerts no longer cry "wolf" when a CoPilot user is offline or signed out
chrome-extension telemetry slack observability bugfix
The chrome extension's resource-sync poll fires every 15 seconds. Pre-fix, two recurring failures — the user being offline and the user not being signed in — surfaced through /api/errors/capture and woke #engineering several times an hour. Both are expected user-state conditions, not bugs. They're now silently dropped by a new expected-errors filter; genuine 4xx-not-401 / 5xx / unexpected throws still alert.
## What changed
- New `chrome-extension/lib/expected-errors.js` exporting `markExpectedError(err, code)`, `isExpectedExtensionError(err)`, `EXPECTED_CODES`. Same UMD pattern as `template-merge.js` — usable from the extension at runtime AND `require()`-d by Vitest. 12 cases pin the contract.
- `appRequest` tags the two known noise sources at the source: `fetch` throw → `NETWORK_OFFLINE`, 401 response → `UNAUTHENTICATED`. The `.expected` flag is the primary signal so future renames of the user-facing message don't reintroduce noise.
- `reportExtensionError` checks `isExpectedExtensionError` as the very first thing inside its try block — silently returns before the 60-second dedup map is touched, before the `/api/errors/capture` fetch fires.
- Service-worker side: `background.js` `importScripts` the helper before `extension-telemetry.js`, mirroring `sidepanel.html`'s `<script>` order.
Genuine 4xx-not-401 / 5xx / unexpected throws still surface to Slack — the filter is anchored to specific message strings + the explicit flag, not a broad blocklist.
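A TypeScript sketch of the flag-based part of this filter follows. The exported names and the two codes match the notes above, but the bodies are guesses, the shipped helper is a UMD `.js` file, and the message-string fallback is omitted because the actual strings are not given here. The `capture` parameter on `reportExtensionError` is added purely to make the sketch testable.

```typescript
const EXPECTED_CODES = ["NETWORK_OFFLINE", "UNAUTHENTICATED"] as const;
type ExpectedCode = (typeof EXPECTED_CODES)[number];

interface TaggedError extends Error {
  expected?: boolean;
  code?: string;
}

// Tag an error at the point where the cause is known (fetch throw, 401 response).
function markExpectedError(err: Error, code: ExpectedCode): TaggedError {
  const tagged = err as TaggedError;
  tagged.expected = true;
  tagged.code = code;
  return tagged;
}

// True only for errors carrying the explicit flag and a known code.
function isExpectedExtensionError(err: unknown): boolean {
  if (!(err instanceof Error)) return false;
  const { expected, code } = err as TaggedError;
  return expected === true && EXPECTED_CODES.some((c) => c === code);
}

// The filter runs first, before any dedup bookkeeping or network call.
// Returns true when the error was forwarded.
function reportExtensionError(err: unknown, capture: (err: unknown) => void): boolean {
  if (isExpectedExtensionError(err)) return false;
  capture(err);
  return true;
}
```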
## Versions
- Chrome extension manifest 1.7.3 → 1.7.4.
- Product 3.34 → 3.35.
## Verification
436 / 436 Vitest passing (12 new), dashboard build clean, packaged extension zip (~3.5 KB heavier) includes the new helper.
Daily: Ship to prod: v3.35 — drop expected user-state errors from extension telemetry / Slack (TEST-254)
daily automation bug-reports
43 commits — Improvements: - Ship to prod: v3.35 — drop expected user-state errors from extension telemetry / Slack (TEST-254) - Ship to prod: v3.34 — bound runner heal-retry evaluates so BrowserStack socket-i
Improvements:
- Ship to prod: v3.35 — drop expected user-state errors from extension telemetry / Slack (TEST-254)
- Ship to prod: v3.34 — bound runner heal-retry evaluates so BrowserStack socket-idle no longer kills tests (TEST-253)
- chore(bugagent): add login-test.spec.ts
- Ship to prod: v3.33 — chrome extension merges instead of clobbering on template click (TEST-241)
- Ship to prod: v3.32 — fix pagination styling on Bug Reports list after search/filter (TEST-240)
v3.34 — automations no longer hang on BrowserStack 'Socket idle' for 90+ seconds
runner automations browserstack heal-retry bugfix
The runner's heal-and-retry wrapper used to fire raw page.evaluate fallbacks without a per-call timeout. On BrowserStack each evaluate is double-wrapped by the BS SDK's instrumentation, and a single slow inner evaluate could stall until the WebSocket idle threshold severed the entire session. Each fallback now has a 5-second ceiling and connection-level errors fail fast instead of churning through more retries.
## What changed
- New `runner/src/lib/heal-retry-patterns.ts` exporting `HEALABLE_PATTERN`, `UNRECOVERABLE_PATTERN`, `FORCE_CLICKABLE_PATTERN`. The runner's prologue interpolates them via `.toString()` so future drift between the runtime and the unit tests breaks CI.
- `__ba_UNRECOVERABLE_ERRORS` fast-fail at the top of the heal catch: when the browser session is already dead (`Socket idle`, `Target closed`, `Browser has been closed`, `Connection closed`, `Session has been disconnected`, `crashed`), re-throw the original error immediately instead of running 5–7 more retry steps that each call page.evaluate against a dead socket.
- New `__ba_evalWithTimeout(label, ms, fn)` helper wraps step-5 (boundingBox fallback) and step-6 (JS-dispatched click) page.evaluate / locator.evaluate calls with a 5-second ceiling. Existing per-step try/catch falls through to the next fallback when the ceiling fires.
- 12 new Vitest cases in `dashboard/src/lib/heal-retry-patterns.test.ts` pin the regex matrix.
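The two mechanisms above, a per-call ceiling and a fast-fail on dead sessions, can be sketched as follows. The regex is assembled from the strings listed above and its case-insensitive flag is an assumption; `evalWithTimeout` and `runFallbacks` are illustrative stand-ins for the runner's `__ba_` helpers, not their real source.

```typescript
// Built from the error strings listed above; the real pattern may differ.
const UNRECOVERABLE_PATTERN =
  /Socket idle|Target closed|Browser has been closed|Connection closed|Session has been disconnected|crashed/i;

// Reject if `fn` has not settled within `ms` milliseconds.
function evalWithTimeout<T>(label: string, ms: number, fn: () => Promise<T>): Promise<T> {
  return new Promise<T>((resolve, reject) => {
    const timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
    Promise.resolve()
      .then(fn)
      .then(
        (value) => { clearTimeout(timer); resolve(value); },
        (err) => { clearTimeout(timer); reject(err); },
      );
  });
}

// Try each fallback in order with a ceiling; give up at once when the session is dead.
async function runFallbacks(
  original: Error,
  fallbacks: Array<() => Promise<void>>,
  ceilingMs = 5000,
): Promise<void> {
  if (UNRECOVERABLE_PATTERN.test(original.message)) throw original;
  let step = 0;
  for (const fallback of fallbacks) {
    step += 1;
    try {
      await evalWithTimeout(`fallback-${step}`, ceilingMs, fallback);
      return; // this fallback worked
    } catch (err) {
      if (err instanceof Error && UNRECOVERABLE_PATTERN.test(err.message)) throw original;
      // otherwise fall through to the next fallback
    }
  }
  throw original;
}
```

A timeout built this way stops waiting for the slow call but does not cancel it; the underlying evaluate may still be in flight when the next fallback starts.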
## Net behavior
A click that genuinely can't find its element fails in a handful of seconds with a clean error chain, instead of grinding through 90+ seconds of socket-idle wait per step. The heal chain still does its full work when individual evaluates respond quickly — only the slow / disconnected paths are short-circuited.
## Verification
424 / 424 Vitest passing (12 new), runner tsc clean, dashboard build clean.
v3.33 — Quick Submit no longer erases your typed description when you click a template
chrome-extension quick-submit templates bugfix
In the Chrome extension's Quick Submit panel, clicking any of the bug-type template buttons (Functional, UI, Performance, Security) used to wholesale-replace whatever description you had already typed. The new merger keeps every character you've written and slots structured content into matching template sections — re-clicking is idempotent.
## What changed
- New `chrome-extension/lib/template-merge.js` with `mergeIntoTemplate(existing, template)`. Splits both strings into `## level-2` sections, maps user sections onto template sections by heading (alias-aware: Steps ↔ Steps to Reproduce, Expected ↔ Expected Behavior, etc.), preserves a free-form preamble at the top, and appends any user section without a template match at the bottom — so switching templates never silently drops a typed character.
- 13 new Vitest cases in `dashboard/src/lib/template-merge-helper.test.ts` pin the contract (the chrome extension and the test load the same source file).
- Idempotent: clicking the same template twice doesn't duplicate content.
- Defensive fallback in the click handler: if the merger script ever fails to load, the panel appends with a separator instead of replacing.
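To make the merge rules concrete, here is a simplified TypeScript sketch of the behaviour described above. It is not the shipped `template-merge.js`: only the two aliases named in the notes are included, whitespace is normalized more aggressively than the real helper may do, and the decision to keep the user's body whenever it is non-empty is an assumption.

```typescript
interface Section {
  heading: string;
  body: string;
}

// Only the aliases named in the release notes; the real list is longer.
const HEADING_ALIASES = new Map<string, string>([
  ["steps", "steps to reproduce"],
  ["expected", "expected behavior"],
]);

function canonicalHeading(heading: string): string {
  const key = heading.trim().toLowerCase();
  return HEADING_ALIASES.get(key) ?? key;
}

// Split text into a free-form preamble plus `## ` level-2 sections.
function parseSections(text: string): { preamble: string; sections: Section[] } {
  const preamble: string[] = [];
  const raw: { heading: string; lines: string[] }[] = [];
  for (const line of text.split(/\r?\n/)) {
    const match = /^##\s+(.*)$/.exec(line);
    if (match) raw.push({ heading: match[1].trim(), lines: [] });
    else if (raw.length > 0) raw[raw.length - 1].lines.push(line);
    else preamble.push(line);
  }
  return {
    preamble: preamble.join("\n").trim(),
    sections: raw.map((s) => ({ heading: s.heading, body: s.lines.join("\n").trim() })),
  };
}

function mergeIntoTemplate(existing: string, template: string): string {
  const user = parseSections(existing);
  const tpl = parseSections(template);
  const used = new Set<number>();

  // Template order wins; a matching user section supplies the body.
  const merged: Section[] = tpl.sections.map((t) => {
    const idx = user.sections.findIndex(
      (u, i) => !used.has(i) && canonicalHeading(u.heading) === canonicalHeading(t.heading),
    );
    if (idx === -1) return t;
    used.add(idx);
    return { heading: t.heading, body: user.sections[idx].body || t.body };
  });

  // User sections without a template match are appended, never dropped.
  user.sections.forEach((u, i) => {
    if (!used.has(i)) merged.push(u);
  });

  const parts = [
    user.preamble || tpl.preamble,
    ...merged.map((s) => (s.body ? `## ${s.heading}\n${s.body}` : `## ${s.heading}`)),
  ];
  return parts.filter((p) => p !== "").join("\n\n");
}
```

Because the output is re-parseable into the same sections, merging the result with the same template again returns the same string, which is the idempotence property the notes call out.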
## Versions
- Chrome extension manifest 1.7.2 → 1.7.3.
- Product 3.32 → 3.33.
## Verification
412 / 412 Vitest passing, dashboard build clean, packaged extension zip includes the new helper.
v3.32 — Bug Reports pagination styling no longer breaks after search/filter
dashboard reports pagination bugfix
On the Bug Reports list, searching or changing a filter dropdown was rebuilding the pagination from JS using different class names than the SSR component. The result was an unstyled "12Next »" mess jammed to the left. Both render paths now share one markup helper, and the styles are global so they apply to whichever path emits them.
## What changed
- New `dashboard/src/lib/pagination-html.ts` — shared `buildPaginationHtml(layout, opts)` helper. Both the SSR `<Pagination>` component and the JS-rebuild path now produce the same markup (`<nav class="pagination">`, `.page-btn` with `.active`/`.disabled`, ← Prev/Next → arrows, `.page-ellipsis`).
- 11 new Vitest cases pin shape, classes, visual content, ARIA, `data-page`, and HTML escaping — future drift between the two paths fails CI rather than only surfacing as a visual bug.
- `Pagination.astro` styles are now `is:global` so the rules apply to both render paths.
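The idea of one string-producing helper shared by the SSR and client paths can be illustrated as below. This is a deliberately reduced sketch: `renderPagination` is a hypothetical name with a different signature from the real `buildPaginationHtml(layout, opts)`, it uses `<button>` elements as a guess, and it omits the `.page-ellipsis` windowing.

```typescript
function escapeHtml(text: string): string {
  return text
    .replace(/&/g, "&amp;")
    .replace(/</g, "&lt;")
    .replace(/>/g, "&gt;")
    .replace(/"/g, "&quot;")
    .replace(/'/g, "&#39;");
}

type ButtonState = "" | "active" | "disabled";

function pageButton(label: string, page: number, state: ButtonState): string {
  const cls = state ? `page-btn ${state}` : "page-btn";
  const disabled = state === "disabled" ? " disabled" : "";
  const current = state === "active" ? ' aria-current="page"' : "";
  return `<button class="${cls}" data-page="${page}"${disabled}${current}>${escapeHtml(label)}</button>`;
}

// One function produces the markup for both render paths, so class names cannot drift.
function renderPagination(current: number, total: number, ariaLabel = "Pagination"): string {
  const parts: string[] = [];
  parts.push(pageButton("← Prev", current - 1, current <= 1 ? "disabled" : ""));
  for (let page = 1; page <= total; page++) {
    parts.push(pageButton(String(page), page, page === current ? "active" : ""));
  }
  parts.push(pageButton("Next →", current + 1, current >= total ? "disabled" : ""));
  return `<nav class="pagination" aria-label="${escapeHtml(ariaLabel)}">${parts.join("")}</nav>`;
}
```

An Astro component can emit the string with `set:html`, and the client-side rebuild can assign the same string to `innerHTML`; tests on the string then cover both paths.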
## Why it broke
Two stacked issues caught by TEST-240. First, the JS rebuild emitted entirely different class names (`.pagination-link`, `.pagination-items`, `.pagination-ellipsis`) than the Astro component. Second, Astro's scoped CSS only attaches to nodes rendered by the component instance, so even correctly named JS-rebuilt nodes would not have picked up the styles.
## Verification
399/399 Vitest cases passing, dashboard build clean.
v3.31 — readActiveProjectSlug helper retires the cookie-direct read
dashboard active-project refactor
Eleven dashboard pages were reading the active_project cookie directly instead of the URL-resolved slug, so a sidebar project switch could leave them rendering data for the previous project. One shared helper now backs every page that resolves an active project.
## What changed
- New `lib/active-project.ts` exporting `readActiveProjectSlug(Astro)` — returns `locals.active_project_slug` (URL-first, computed by middleware) and falls back to the `active_project` cookie. 10 Vitest cases pin the contract.
- 11 SSR pages updated to call the helper instead of reading the cookie directly: settings/index, security/index, security/new, explorations/index, code-review/index, notes/new, notes/[id], timetracking/resources, timetracking/index, performance/new, performance/index.
- 2 already-correct pages (analytics/index, reports/index) refactored from inline `((Astro.locals as any).active_project_slug …) || cookies.get(…)` to the helper for consistency.
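The helper's contract, URL-resolved slug first and cookie as fallback, fits in a few lines. The sketch below uses a minimal structural type instead of Astro's real context type so it runs standalone; the `null` return for "no active project" is an assumption.

```typescript
// Stand-in for the parts of the Astro request context the helper reads.
interface AstroLike {
  locals: { active_project_slug?: string | null };
  cookies: { get(name: string): { value: string } | undefined };
}

function readActiveProjectSlug(astro: AstroLike): string | null {
  const fromUrl = astro.locals.active_project_slug; // computed by middleware, URL-first
  if (fromUrl) return fromUrl;
  return astro.cookies.get("active_project")?.value || null; // cookie fallback
}
```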
## Why
After shipping the analytics-page fix in v3.30, an audit found the same bug pattern across the rest of the dashboard. Future SSR pages now have one obvious place to read the active project from — no way to silently reintroduce the bug.
## Verification
388 / 388 Vitest cases passing, dashboard build clean.
v3.26 — Delete Exploration now actually deletes
explorations storage reliability ux
The Delete Exploration button in the danger zone now does what its copy has always promised: permanently removes the exploration plus all of its schedules, runs, findings, and storage. Was a soft-archive; now it lives up to the warning.
## What changed
The danger-zone button on an exploration's detail page reads *"This will permanently delete the exploration config, all schedules, and run history. This action cannot be undone."* — but until today it was a soft-archive only. The row stayed, the schedules stayed, all run history stayed, and any uploaded recon screenshots / scripts / videos remained in storage.
Now the button matches its copy. Clicking Delete cascades cleanly through every related table (FK constraints handle that automatically), removes the storage objects under `explorations/<run_id>/...`, and is genuinely unrecoverable.
No new UI surface — the existing button now works as advertised.
## Storage-cleanup project complete
This closes the storage-cleanup work that started this morning at 1.45 GB of bucket use. Every parent table in the project now has both a row-delete forward-fix and an automated cleanup pathway (daily TTL cron for session replays, weekly orphan-sweep for everything else). Bucket should drift downward by a few MB per week as natural deletions occur, instead of growing linearly.
v3.25 — Explorations now covered by the weekly orphan-sweep
storage reliability explorations ops
The weekly orphan-sweep cron now also cleans up storage left behind from deleted exploration runs. Closes the last remaining gap in the storage-cleanup project that ran today.
## What changed
The weekly orphan-sweep cron (Sunday 03:00 UTC) used to skip exploration storage paths entirely — they had no clear ownership model when the cron first shipped. Schema check confirmed the database already cascades deletes correctly from explorations down through runs and findings, so storage was the only loose thread.
This release adds an exploration-aware path detector to the sweep helper. When the cron runs, it cross-references storage objects under `explorations/<run_id>/...` against current `exploration_runs` rows, and any whose run is gone get cleaned up.
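The cross-reference step might look roughly like this. Both function names are hypothetical, and the sketch assumes the caller has already listed the storage paths and loaded the ids of live `exploration_runs` rows.

```typescript
// Extract <run_id> from `explorations/<run_id>/...`; null for any other path.
function explorationRunId(path: string): string | null {
  const match = /^explorations\/([^/]+)\/.+/.exec(path);
  return match ? match[1] : null;
}

// Keep only exploration objects whose run no longer exists.
function findExplorationOrphans(paths: string[], liveRunIds: ReadonlySet<string>): string[] {
  return paths.filter((path) => {
    const runId = explorationRunId(path);
    return runId !== null && !liveRunIds.has(runId);
  });
}
```

Paths outside `explorations/` are ignored here on purpose, so this detector cannot flag files owned by another part of the sweep.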
No leak today (all current files map to live runs), so this is forward-protection for whatever path eventually triggers an exploration hard-delete — manual SQL cleanup, a future "delete permanently" button, etc.
## Storage-cleanup project status
With this release, every storage path in `bug-attachments` and `mobile-apps` is now covered by either a forward-fix DELETE handler, the daily TTL cron (session_replays), or the weekly orphan-sweep cron (everything else). Started this morning at 1.45 GB, currently at 1.19 GB, will drop to ~1.17 GB after tomorrow's TTL fire.
v3.24 — Session-replay TTL is now actually enforced
storage reliability session-replays ttl
Session replays have always had a 30-day TTL written on every row, but no code path enforced it. A new daily cleanup cron now retires expired replays — including their screenshot and video files in storage — and the per-session delete also cascades to storage now.
## What changed
When the chrome extension or quick-submit flow captures a session replay, the row is created with an automatic 30-day expiry. Until today nothing actually deleted expired rows or their storage assets, so the table grew linearly and the bucket along with it. As of this release, rows past their expiry get cleaned up daily at 03:15 UTC, and explicitly deleting a session via the dashboard or API now also removes its screenshot and video from storage.
## Behind the scenes
The new daily cron uses storage-first ordering — it removes the storage objects before deleting the row, so any partial failure leaves the row in place for a retry on the next day's run rather than producing an orphaned file. Same belt-and-suspenders philosophy as the weekly orphan-sweep that landed earlier today.
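Storage-first ordering can be sketched with injected storage and database callbacks. The names and shapes below are hypothetical; the sketch also assumes that removing an already-removed object is harmless, which a retry after a failed row delete relies on.

```typescript
interface ExpiredReplay {
  id: string;
  paths: string[]; // storage objects owned by this replay
}

async function purgeExpiredReplays(
  replays: ExpiredReplay[],
  removeObjects: (paths: string[]) => Promise<void>,
  deleteRow: (id: string) => Promise<void>,
): Promise<{ deleted: string[]; retryLater: string[] }> {
  const deleted: string[] = [];
  const retryLater: string[] = [];
  for (const replay of replays) {
    try {
      await removeObjects(replay.paths); // storage first
      await deleteRow(replay.id); // row only after storage succeeded
      deleted.push(replay.id);
    } catch {
      // The row still exists, so the next run picks this replay up again.
      retryLater.push(replay.id);
    }
  }
  return { deleted, retryLater };
}
```

Reversing the order would turn any storage failure into an orphaned file with no row left to point at it.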
First run is tomorrow morning UTC; expected impact on the bugAgent project is reclaiming roughly 20 MB across about 90 expired files.
## What's next
A separate ticket tracks the same pattern for explorations, which currently soft-archive without ever actually removing rows or storage.
v3.23 — Mobile app deletes also clean up the simulator build
storage mobile reliability
Deleting a mobile app from the dashboard now removes both the main APK / IPA and the simulator build from storage. Pre-fix only the main file was deleted, leaving 50-100 MB simulator builds behind on every per-app delete.
## What changed
When you delete a mobile app via the dashboard, the storage cleanup now removes both files associated with that app: the main APK / IPA and the simulator build. Pre-fix only the main file was being cleaned up — every per-app delete leaked the simulator build.
The orphan-sweep cron (shipped earlier today) catches this kind of leak weekly, but the happy-path delete should clean up everything at once. Now it does.
No user-visible change beyond the storage savings — pure backend plumbing.
v3.22 — Recurring orphan-attachment sweep cron
storage reliability ops automation
A weekly cron now sweeps orphaned attachments in Supabase Storage as a belt-and-suspenders catch for any edge case the per-delete cleanup misses. Closes the storage-orphan saga that started with disk pressure earlier today.
## What changed
Deleting a bug report or session note already cleans up attachments (shipped v3.19). A new weekly job now scans for any orphaned files left behind by edge cases — manual SQL deletes, partial-failure paths, future code paths that bypass the API — and removes them.
The sweep runs every Sunday at 03:00 UTC. Each run logs a row to a new `orphan_sweep_runs` table with how many objects were scanned, how many orphans were found, how many were deleted, and any per-file failures. A clean run looks like `found = 0, deleted = 0` and is the signal the forward-fix is working.
## Safeguards
- Per-run cap of 500 deletions. If a sustained leak ever produces thousands of orphans, the cron picks them up over subsequent weeks rather than nuking everything at once. `found` exceeding 500 is the ops trigger to investigate upstream.
- Detection rule is shared with the one-shot script via a new `lib/orphan-sweep.ts` helper, with 13 unit tests pinning every path-pattern decision (notes/, automations/, explorations/, session-videos/, geo-snap/ — what counts as orphan vs. what is intentionally out of scope).
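The per-run cap amounts to slicing the orphan list and flagging the overflow. A small sketch, with a hypothetical `planSweep` name and result shape loosely modelled on the `found` and `deleted` counters mentioned above:

```typescript
const MAX_DELETIONS_PER_RUN = 500;

interface SweepPlan {
  found: number; // all orphans detected this run
  toDelete: string[]; // at most `cap` of them
  needsInvestigation: boolean; // true when the cap was exceeded
}

function planSweep(orphans: string[], cap: number = MAX_DELETIONS_PER_RUN): SweepPlan {
  return {
    found: orphans.length,
    toDelete: orphans.slice(0, cap),
    needsInvestigation: orphans.length > cap,
  };
}
```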
Nothing user-visible changes from this release; it's all behind-the-scenes maintenance plumbing.
v3.21 — Reclaimed 262 MB of orphan attachment storage (TEST-237)
storage reliability cleanup
After the v3.19 forward-fix to cascade attachment cleanup on bug-report and note deletion, a one-shot sweep cleared 28 pre-existing orphan files: 24 bug-attachment orphans (mostly duplicated audio recordings from a Zoom call uploaded three times) and 4 obsolete APK builds. Project storage dropped from 1.45 GB to 1.19 GB.
## What changed
The cascade-delete fix shipped in v3.19 prevents new orphans going forward. The one-shot sweep that ran today cleared the existing backlog:
- **20 note orphans** (~215 MB) — heavy, dominated by triplicated 70 MB audio recordings
- **4 bug-report orphans** (negligible size) — small screenshots tied to deleted reports
- **4 mobile-apps orphans** (~61 MB) — superseded APK builds
**Total reclaim: ~262 MB.** Storage usage dropped from 1.45 GB to 1.19 GB.
## Under the hood
The sweep script took three iterations to land cleanly. The first version queried `storage.objects` directly via the JS client, but PostgREST doesn't expose the storage schema. The second version walked Supabase Storage's list() API recursively, but it returned only ~24 of 941 objects because nested folder traversal was unreliable. The third version (shipped) added a SECURITY DEFINER Postgres function (`public.list_bucket_objects`) that exposes storage object metadata for ops scripts — single round trip, deterministic, source-of-truth.
A followup ticket tracks 151 session-replay-related files (~35 MB) that follow a different ownership model and need their own cleanup logic.
v3.19 — Bug report and note deletes now clean up their attachments
storage reliability bug-reports notes data-integrity
Closed a long-standing leak: deleting a bug report or note now also removes its uploaded screenshots, videos, audio recordings, and other attachments from storage. Pre-fix the row went away but the files lingered, accumulating about 334 MB of orphan storage on prod since launch.
## What changed
When you delete a bug report or session note, all of its uploaded attachments — screenshots, screen recordings, audio notes, log files — are now also removed from Supabase Storage. Pre-fix the row was deleted but the attachments stayed, so over time the storage bucket grew with files no longer referenced anywhere in the product.
The per-attachment delete (the trash icon on a single attachment within a bug report) was already cleaning up storage, but used a naive regex that could mishandle signed URLs with query-string tokens. That path now goes through the same shared helper as the cascade delete, so signed URLs, encoded filenames, and multi-bucket safety are all handled consistently.
## Under the hood
- New `attachment-storage` helper extracts storage paths from any URL shape (public, signed, authenticated) and refuses URLs pointing at other buckets so a delete can never touch the wrong files. 20 unit tests pin the contract.
- A one-shot sweep script clears existing orphans accumulated before the fix landed.
- A follow-up enhancement is queued to run the sweep on a weekly cron going forward, as a safety net for any partial-failure paths.
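A sketch of the path-extraction idea from the first bullet, using `URL` parsing instead of a regex over the whole string. The function name is hypothetical, and the URL layout (`/storage/v1/object/<public|sign|authenticated>/<bucket>/<path>`) reflects the common Supabase Storage URL shapes as an assumption rather than something stated in these notes.

```typescript
// Returns the object path inside `expectedBucket`, or null when the URL is malformed,
// has an unknown shape, or points at a different bucket.
function extractStoragePath(url: string, expectedBucket: string): string | null {
  let parsed: URL;
  try {
    parsed = new URL(url);
  } catch {
    return null;
  }
  // `pathname` excludes the query string, so signed-URL tokens never leak into the path.
  const match = /^\/storage\/v1\/object\/(public|sign|authenticated)\/([^/]+)\/(.+)$/.exec(
    parsed.pathname,
  );
  if (!match) return null;
  const [, , bucket, rawPath] = match;
  if (bucket !== expectedBucket) return null; // never touch another bucket
  try {
    return decodeURIComponent(rawPath); // handle encoded filenames
  } catch {
    return null;
  }
}
```

Returning `null` for anything unexpected keeps the delete path conservative: an unrecognized URL is skipped rather than guessed at.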
## Out of scope
Mobile app uploads (`.apk` / simulator files) follow the same pattern but live in a separate bucket. A small follow-up ticket tracks that fix.
v3.18 — CoPilot Recent Reports now scoped to the active project
The Recent Reports section in the CoPilot side panel now shows only reports for the currently-selected project, and refreshes when you switch projects. Previously it leaked reports across every project in the workspace.
## What changed
The Recent Reports list in the CoPilot side panel previously queried by workspace only — so a tester working on Project A in a multi-project workspace saw recent reports from Projects B, C, and D mixed in. Switching the project dropdown also didn't refresh the list.
Fixed both: the list is now project-scoped, and switching projects auto-refreshes the section.
No migration or backfill needed — this is a query-shape fix, applied client-side in the extension.
v3.16 — CoPilot Page Links auto-refresh on tab navigation
chrome-extension copilot data-integrity
Page Links in the CoPilot side panel now updates automatically when you navigate to a new page or switch browser tabs. Previously only the first page's links were ever loaded — every subsequent navigation kept the stale set.
## What changed
When you used the Page Links section in CoPilot to inventory the links on a page, only the very first page you opened the section on actually got its links rendered. Browsing to a new page (or switching browser tabs) left the section stuck on the original page's links until you reloaded the extension.
The section now refreshes automatically:
- On every section open (closed → opened after a navigation always shows the current page's links).
- When the active browser tab finishes loading a new URL.
- When you switch to a different browser tab.
No new permissions required — the extension already had the `tabs` grant from the manifest.
Bundled four small CoPilot quality-of-life fixes into one release: chevron disclosure indicators on Recording rows, Export buttons on the Playwright script and Action log, and backend support for .log/.md attachments that browsers report as application/octet-stream.
## What changed
**CoPilot Resources tab (CoPilot v1.7.0)**
- Recording accordion rows now have a visible chevron disclosure indicator that rotates when you expand or collapse the row. Matches the existing pattern on Page Links and Recent Reports.
- The Playwright script block has an Export button alongside the existing Copy. Downloads as `playwright-flow-<timestamp>.spec.js`. Reads from the live textarea so unsaved edits export too.
- The Action log block now has both Copy and Export buttons. Downloads as `action-log-<timestamp>.txt`.
**Bug report and session note attachments**
- `.log` and `.md` files now upload successfully. Before the fix, browsers reported them as `application/octet-stream` (because those extensions don't have a universally registered MIME type), and the upload validator rejected octet-stream outright.
- The validator now treats octet-stream as "look at the filename": it accepts `.log`, `.md`, `.json`, `.csv`, and `.txt` via extension fallback. Stored files get the canonical MIME type (`text/plain` / `text/markdown` / etc.) so previews work on re-fetch.
- `.zip` attachments remain intentionally rejected as a security boundary.
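A minimal sketch of the extension-fallback rule, assuming a lookup table keyed by file extension. The function name `resolveUploadMime` and the `.json` / `.csv` mappings are illustrative assumptions; only `text/plain` and `text/markdown` are named in the notes above, and the real `upload-mime.ts` helper may differ.

```typescript
// Hypothetical extension -> canonical MIME table. A Map avoids accidental
// hits on Object.prototype keys such as "constructor".
const EXTENSION_MIME = new Map<string, string>([
  ["log", "text/plain"],
  ["txt", "text/plain"],
  ["md", "text/markdown"],
  ["json", "application/json"], // assumed mapping
  ["csv", "text/csv"], // assumed mapping
]);

export function resolveUploadMime(
  filename: string,
  reportedMime: string,
): string | null {
  if (reportedMime !== "application/octet-stream") {
    // A real validator would still check this against its allowlist.
    return reportedMime;
  }
  // octet-stream means "look at the filename".
  const dot = filename.lastIndexOf(".");
  if (dot < 0) return null;
  const ext = filename.slice(dot + 1).toLowerCase();
  const canonical = EXTENSION_MIME.get(ext);
  return canonical === undefined ? null : canonical; // e.g. .zip stays rejected
}
```

Because unknown extensions resolve to `null`, anything not explicitly listed (including `.zip`) is still rejected when the browser reports octet-stream.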
## Under the hood
- New shared helper `dashboard/src/lib/upload-mime.ts` (20 Vitest cases) replaces three near-identical inline validators across `/api/reports/upload`, `/api/notes/upload`, and `/api/notes/upload-url`.
- chrome-extension manifest bumped to v1.7.0; product version 3.05 → 3.15.
- Web Store upload of v1.7.0 ZIP pending manual push.
v3.05 — Bug Reports composer template now matches the Chrome extension
bug-reports chrome-extension consistency ai
Aligned the dashboard's "Report a bug" template to use the same Markdown heading style and section names as CoPilot. The shared structure (Steps to Reproduce / Expected Behavior / Actual Behavior / Additional Context) is now identical across both surfaces.
## What changed
The template that fills the **Report a bug** textarea (when the Template toggle is on) used `**bold**` pseudo-headings and slightly different section names than the Chrome extension. Now both use `## h2` semantic headings and the same names:
- `## Steps to Reproduce`
- `## Expected Behavior` (was "Expected Result")
- `## Actual Behavior` (was "Actual Result")
- `## Additional Context` (was "Additional Notes")
`## Summary` and `## Environment` stay in the app template only — the extension submits a title field directly and auto-captures environment, so it doesn't need either.
## Side benefit
Pre-fix, app-filed bugs that used the template weren't detected as already-structured, so the AI formatter would re-format them and sometimes rewrite the user's headings. Post-fix it matches the canonical format and the AI step correctly skips them.
v3.04 — CoPilot Resources tab count badge no longer shows stale numbers
chrome-extension copilot ui-ux
Switching workspaces or projects in the CoPilot side panel now clears the Resources tab count badge correctly when the new scope has zero resources. Previously the badge could leak the previous scope's count.
## What changed
When you switched between workspaces or projects in the CoPilot side panel, the Resources tab count badge could keep showing the previous scope's number (e.g. "Resources 13") even when the panel correctly rendered the empty state for the new scope.
A CSS rule was overriding the standard `[hidden]` attribute, leaving the badge visible with whatever count was painted last. The CSS now respects `[hidden]` on count badges, and the JS also clears the count text when the badge hides — so even a future styling regression couldn't leak a stale number.
The fix applies to every count badge in the side panel, not just Resources.
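The JS half of the fix (clearing the text whenever the badge hides) might look something like the sketch below. The `BadgeLike` interface and the `setBadgeCount` name are hypothetical; the interface only lists the two properties that a real `HTMLElement` already provides, so the function can be exercised without a DOM.

```typescript
// Minimal shape shared by a real HTMLElement and a plain test double.
export interface BadgeLike {
  hidden: boolean;
  textContent: string | null;
}

export function setBadgeCount(badge: BadgeLike, count: number): void {
  if (count > 0) {
    badge.textContent = String(count);
    badge.hidden = false;
  } else {
    // Clear the text as well, so a CSS rule that overrides [hidden]
    // has no stale number left to reveal.
    badge.textContent = "";
    badge.hidden = true;
  }
}
```

Pairing the attribute change with a text reset is a belt-and-braces choice: either safeguard alone would have prevented the stale "Resources 13" display.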
v3.03 — Deleting a test suite now removes its runs too
test-cases test-runs data-integrity
Aligned test-suite deletion behavior with what the danger-zone copy promised: deleting a suite now permanently removes its associated test runs and per-case results, not just the suite itself.
## What changed
The Danger Zone on a test suite says "Permanently remove this suite and all associated data. Cannot be undone." Pre-fix the suite went away but the runs that originated from it stayed in place — visible on the Runs tab with no suite to navigate back to. The copy was right; the database wasn't.
The foreign keys from runs and per-case results back to suites now cascade on delete, so the suite + every run that ran against it + every per-case result inside those runs all disappear together.
## What's preserved
If you have ad-hoc test runs (not associated with any suite), they're unaffected. The cascade only fires for runs whose `suite_id` matches the suite being deleted.
If you previously deleted a suite and saw its runs linger, those orphaned runs stay where they are — they're indistinguishable from intentionally-standalone runs at the data-model level. New deletions from this point forward cascade cleanly.
v3.02 — Test case + suite title fields now strip formatting on paste
test-cases accessibility ui-ux reliability
Pasting text from external sources (Gmail, Gemini chat, Google search results) into a test case or suite title no longer carries the source dark color, font, link styling, or transforms with it. The dashboard renders the pasted text in its normal title color.
## What changed
When you copy a phrase from another app and paste it into the **test case** or **test suite** title field, the dashboard previously preserved the source formatting — most painfully, dark text color from rich-text sources rendered invisible against the dashboard's dark background. In Safari, some Google search-result paste sources also produced upside-down or mirrored text.
## Fix
The title fields now read only the plain-text payload of any paste, dropping the rich-text variant entirely. Cursor position is preserved across the paste — keep typing and your characters land where you expect.
Applies to:
- Test case detail page title
- Test suite detail page title
(The AI voice transcript field is unchanged — voice dictation is the primary input path there.)
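As a rough illustration of the plain-text paste behavior, the splice and cursor math can be isolated into a pure function. The name `applyPlainTextPaste` and the newline-collapsing step are assumptions for this sketch; in a browser handler the `pastedText` argument would typically come from `event.clipboardData.getData("text/plain")`, which never carries colors, fonts, or transforms.

```typescript
export interface PasteResult {
  value: string; // new field contents
  cursor: number; // caret position after the paste
}

export function applyPlainTextPaste(
  value: string,
  selectionStart: number,
  selectionEnd: number,
  pastedText: string,
): PasteResult {
  // Assumption: a title is single-line, so line breaks collapse to spaces.
  const clean = pastedText.replace(/\s*[\r\n]+\s*/g, " ");
  // Replace the current selection (or insert at the caret) with the text.
  const next =
    value.slice(0, selectionStart) + clean + value.slice(selectionEnd);
  return { value: next, cursor: selectionStart + clean.length };
}
```

Placing the caret at `selectionStart + clean.length` is what keeps subsequent typing landing directly after the pasted text.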
v3.01 — CoPilot recordings now show the right action count in the dashboard
chrome-extension copilot automations data-integrity
Fixed a contract gap where the Chrome extension created an automation with the right Playwright script but no action log, so the dashboard always rendered "0 recorded actions" regardless of how many actions you actually recorded.
## What changed
When CoPilot turned a recording into an automation, the resulting automation in the dashboard always read "0 recorded actions" next to the Playwright Script header — even though the Action log in the side panel showed the right count and the Playwright script itself was correct.
Root cause: the extension's `Create automation` request only sent the compiled script. The backend column for the action log defaulted to an empty array, and the dashboard renders the count from that array's length.
## Fix
The extension now sends the action log alongside the script. Dashboard count badges and any future tooling that reads `recorded_actions` get accurate data going forward.
Applies to recordings made in CoPilot v1.6.1+. Older recordings retain their original (zero-count) state — re-record to update.
Four CoPilot quick-wins shipped together: extension version surfaced in created automations, Device + Page Title + Viewport now render in the dashboard environment section, attachment file types match between extension and app, URL field wraps cleanly in the side panel.
## Highlights
- **Automation traceability**: when CoPilot creates an automation from a recording, the title and description now include the CoPilot version (e.g. `(from CoPilot v1.6.0)`). Useful when debugging selector or Playwright drift across builds.
- **Environment parity**: the bug-report environment card now displays Device, Page Title, and Viewport that the Chrome extension auto-captured. Older extension installs (≤ v1.5) keep working — a new normalization helper coalesces legacy keys onto canonical ones.
- **Attachment parity**: the CoPilot file picker now offers the same file types as the dashboard (image, video, audio, pdf, txt, csv, md, json) — pre-fix the extension was a strict subset.
- **URL field readability**: in the CoPilot side panel, long URLs in the auto-captured Environment section now wrap to multiple lines instead of truncating with an ellipsis.
## Under the hood
- New `dashboard/src/lib/environment.ts` pure helper (16 Vitest cases) normalizes the bug-report environment JSON regardless of which submitter wrote it. Reads either `device` or `device_type`, snake_case or camelCase. Drops empty values so render guards keep working.
- Chrome extension manifest bumped to v1.6.0; product version rolled v2.99 → v3.00 (the minor counter was at 99, so this small bump rolled over into a major increment).
- Web Store upload of v1.6.0 ZIP pending manual push.
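A minimal sketch of the normalization idea, assuming a small alias table per canonical key. Only the `device` / `device_type` pair comes from the notes above; the other aliases, the output key names, and the function name `normalizeEnvironment` are hypothetical stand-ins, and the real `environment.ts` helper may be shaped differently.

```typescript
export interface NormalizedEnvironment {
  device?: string;
  pageTitle?: string;
  viewport?: string;
}

// Canonical key -> accepted legacy spellings (mostly assumed for illustration).
const ALIASES: Record<keyof NormalizedEnvironment, string[]> = {
  device: ["device", "device_type", "deviceType"],
  pageTitle: ["pageTitle", "page_title"],
  viewport: ["viewport", "viewport_size", "viewportSize"],
};

export function normalizeEnvironment(
  raw: Record<string, unknown>,
): NormalizedEnvironment {
  const out: NormalizedEnvironment = {};
  const keys = Object.keys(ALIASES) as (keyof NormalizedEnvironment)[];
  for (const key of keys) {
    for (const alias of ALIASES[key]) {
      const value = raw[alias];
      // Skip empty values entirely so render guards like `if (env.device)` work.
      if (typeof value === "string" && value.trim() !== "") {
        out[key] = value.trim();
        break; // first non-empty alias wins
      }
    }
  }
  return out;
}
```

Keeping the function pure (plain object in, plain object out) is what makes it straightforward to pin with unit tests.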
Daily: Ship to prod: v3.16 — TEST-235 CoPilot Page Links auto-refreshes on tab navigation
Improvements:
- Ship to prod: v3.16 — TEST-235 CoPilot Page Links auto-refreshes on tab navigation
- Ship to prod: v3.15 — TEST-82/138/139/140 s4 cleanup batch (CoPilot UI polish + .log/.md attachments)
- Ship to prod: v3.05 — TEST-232 align Bug Reports app template to canonical ## headings + Behavior/Context naming
- Ship to prod: v3.04 — TEST-180 fix CoPilot Resources tab count badge stuck on workspace switch
- Ship to prod: v3.03 — TEST-118 cascade test_runs + test_run_results on suite delete
v2.99 — Project context preserved across Bug Reports search and filters
bug-reports multi-project reliability
Fixed a pair of coupled bugs that caused multi-project workspaces to lose their active project after a keyword search, and the sidebar dropdown to lag behind the URL when switching projects.
## What changed
Multi-project workspaces could not cleanly switch projects from the Bug Reports list after any search or filter activity. The URL silently dropped the project slug, the sidebar dropdown lagged behind the URL, and visible data sometimes disagreed with both. Two coupled defects, one fix:
- The Bug Reports page had five client-side `history.replaceState` call sites that built URLs from filter form values only, dropping the URL params that pin a tab to its workspace and project.
- The dashboard layout and reports SSR read the active project from a browser-wide cookie instead of the URL-first value the middleware had already resolved. Sibling tabs writing the cookie clobbered every other tab.
## Fix
- New pure helper `preserveContextParams` copies the project and team URL params from the current URL onto a fresh clone of the new params, with caller-explicit override semantics. 13 unit tests pin the contract.
- Wired into the three culprit `replaceState` call sites (cookie-restore script, filter fetch, Clear filters).
- Layout and reports page now resolve the active project from middleware locals first (URL-resolved), falling back to the cookie.
User-visible result: typing in the search box keeps the project context. Clicking another project in the sidebar updates the URL, the dropdown, and the data simultaneously.
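The copy-with-override behavior can be sketched as below. This is an assumption-laden illustration rather than the shipped helper: the name `preserveContextParamsSketch`, the exact param keys (`project`, `team`), and the signature are all hypothetical.

```typescript
// Param names assumed from the `?team=` / `?project=` URLs used by the dashboard.
const CONTEXT_KEYS = ["project", "team"];

export function preserveContextParamsSketch(
  currentSearch: string,
  next: URLSearchParams,
): URLSearchParams {
  const current = new URLSearchParams(currentSearch);
  const merged = new URLSearchParams(next); // fresh clone; `next` is not mutated
  for (const key of CONTEXT_KEYS) {
    const value = current.get(key);
    // Caller-explicit override: keep whatever the caller already set.
    if (value !== null && !merged.has(key)) {
      merged.set(key, value);
    }
  }
  return merged;
}
```

A call site would then build its filter params as before and pass them through the helper just ahead of `history.replaceState`, so the project and team pins survive every client-side URL rewrite.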
v2.98 — Bug Reports pagination no longer disappears after clearing filters
bug-reports pagination ux
On long Bug Reports lists, clicking Clear (or changing a filter dropdown) used to leave the bottom pagination control stale or missing. Now it updates correctly every time. Same component family as the v2.96 and v2.97 fixes — the Bug Reports list polish trilogy is now complete.
## What changed
Clicking Clear on a long Bug Reports list — or changing any filter dropdown — would re-render the table correctly but leave the bottom pagination control either stale or missing entirely. Refreshing the page would put it back. Confusing inconsistency, especially on workspaces with hundreds of reports where pagination is the only way to reach old tickets.
v2.98 fixes this. The pagination control now updates correctly on every client-side filter change.
## Root cause (a silent throw)
A dead-code call to a function that no longer exists (`setupDeleteListeners()`) was throwing a JavaScript `ReferenceError` right before the line that updates pagination. The `catch` block silently swallowed the error, the next line never ran, and the pagination control was left holding whatever state the initial page render set up. Removing the dead call let the pagination update fire as designed.
We also added a `console.error` log to the same `catch` block, so future regressions in this code path surface to the browser console instead of failing silently. The pagination math itself was extracted into a small unit-testable helper with 14 Vitest cases pinning down the layout decisions (page-count math, ellipsis placement on long lists, prev/next flag derivation).
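For a sense of what such a layout helper covers, here is a hedged sketch. The function name `buildPaginationLayout`, the return shape, and the rule of showing the first page, the last page, and one neighbor on each side of the current page are assumptions for illustration, not the shipped layout rules.

```typescript
export type PageItem = number | "ellipsis";

export interface PaginationLayout {
  totalPages: number;
  items: PageItem[];
  hasPrev: boolean;
  hasNext: boolean;
}

export function buildPaginationLayout(
  totalItems: number,
  pageSize: number,
  currentPage: number,
): PaginationLayout {
  // Page-count math: always at least one page, even for an empty list.
  const totalPages = Math.max(1, Math.ceil(totalItems / pageSize));
  const page = Math.min(Math.max(1, currentPage), totalPages);
  // Assumed rule: first, last, current, and one neighbor on each side.
  const candidates = [1, page - 1, page, page + 1, totalPages];
  const pages = Array.from(new Set(candidates))
    .filter((p) => p >= 1 && p <= totalPages)
    .sort((a, b) => a - b);
  const items: PageItem[] = [];
  let previous = 0;
  for (const p of pages) {
    if (p - previous > 1) items.push("ellipsis"); // gap in the sequence
    items.push(p);
    previous = p;
  }
  return { totalPages, items, hasPrev: page > 1, hasNext: page < totalPages };
}
```

With 200 reports at 20 per page and page 5 active, this sketch yields `1 … 4 5 6 … 10` with both prev and next enabled.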
## Why this matters
Filter UX without working pagination is a one-page-deep workflow — past the first 20 reports, there is no way to navigate further. Workspaces with backlogs in the hundreds felt this acutely.
## Impact
- No customer-facing UI change beyond the pagination now appearing when it should
- No URL or API contract change
- Same component family as v2.96 (filter persistence) and v2.97 (Clear link visibility); the three together close out the recent Bug Reports list polish work
v2.97 — Bug Reports search "Clear" link is now visible right when you'd expect it
bug-reports filters ux
Typing a search and hitting Enter on the Bug Reports list used to show the right results but no way to clear them. The Clear link now appears the moment any filter is active, including while you're still typing in the search box.
## What changed
The Clear link beside the filter dropdowns on the Bug Reports list was rendered conditionally — only when at least one filter was active at the moment the page loaded. Fast-path filter updates (typing a search and hitting Enter, switching a dropdown, etc.) used a client-side refresh that never inserted the link, so a fresh search appeared without a way to clear it. A hard reload would put the link back, which made the bug feel inconsistent.
v2.97 fixes this by always rendering the link and toggling its visibility based on the live form state. The link now appears the instant any filter input has a value, including while you're still typing in the search box (you don't have to hit Enter first).
## Why this matters
Filter UX without a Clear control puts the user in a "stuck" state — the only way to recover is a full page reload. Showing the Clear link consistently is a baseline trust signal for any filterable list.
## Impact
- No customer-facing UI change beyond "the Clear link now appears when it should"
- No URL or API contract change
- Other list pages (test cases, automations) use different list-page patterns and weren't touched
v2.96 — Bug Reports list filters persist across page nav and refresh
bug-reports filters ux
Filters and sort on the Bug Reports list now survive clicking Next/Prev or refreshing the page. Previously, the list would silently revert to "All" data even when the dropdown still showed the filter selection.
## What changed
The Bug Reports list silently lost its filters the moment you clicked Next/Prev or refreshed. The dropdowns kept their visual selection, the URL kept its query string — but the visible rows would revert to "all reports, newest first" instead of just the filtered set.
Two bugs were behind this:
1. **Pagination links dropped filters.** The Next/Prev links pointed at `/dashboard/reports?page=N` — no `status=`, `severity=`, or sort params. Clicking Next teleported you off your filter.
2. **Background polling fetched unfiltered rows.** The list polls every 30 seconds for newly-filed reports. That polling fetch wasn't respecting your current filter, so on a filtered page (e.g. `?status=closed`) it would silently add the latest 10 unfiltered "new" reports to the top of the table — exactly what made the list look broken after a refresh.
v2.96 fixes both. Pagination links now carry every active filter and sort param; polling fetches now respect the filter set so you only see newly-filed reports that match what you're currently viewing.
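The pagination half of the fix amounts to deriving each link from the current query string instead of from the page number alone. A minimal sketch, assuming a hypothetical `buildPageHref` helper and the `page` param name shown in the links above:

```typescript
export function buildPageHref(
  basePath: string,
  currentSearch: string,
  page: number,
): string {
  // Start from every param already in the URL (status, severity, sort, ...).
  const params = new URLSearchParams(currentSearch);
  // Only the page number changes; all filters ride along untouched.
  params.set("page", String(page));
  return `${basePath}?${params.toString()}`;
}
```

For example, `buildPageHref("/dashboard/reports", "?status=closed&page=1", 2)` keeps `status=closed` on the Next link. The polling fetch can reuse the same query string so that newly-filed reports are filtered the same way as the visible page.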
## Why this matters
Filter persistence is a baseline trust signal — if you filter to "Status: Closed" and the table immediately shows non-closed rows, you can't trust anything you're looking at. Triage workflows, manager-level reviews, and any "show me all the X bugs" use case rely on the filter actually filtering.
## Impact
- All bug-report filters (search, type, severity, status, resolution, reported_by, assigned_to) and sort persist across pagination + refresh
- No URL or API contract change
- Custom-saved filter views and the "Clear" link work the same
- Other list pages (test cases, automations, etc.) use different list-page logic and weren't touched by this fix
v2.95 — Useful middleware error logging
platform observability middleware
When the dashboard middleware fails to fetch a user profile or workspace memberships from Supabase, the failure now logs the full structured error context instead of an unhelpful ":undefined" stub. Internal observability fix only — no customer-facing UI change.
## What changed
The middleware logs profile / membership fetch failures as a structured object now: name, code, message, status, details, hint, and a body-excerpt (first 500 chars) of any HTML response Supabase returned. The previous log line was just `error.message`, which is undefined when Supabase's REST gateway returns HTML — exactly the case during an upstream load-balancer hiccup, the time we most want diagnostic context.
Failures also leave a Sentry breadcrumb so the next captured exception in that session carries the middleware failure in its context.
## Why
During a recent Supabase REST gateway blip, our Railway logs showed the following line across hundreds of entries:
`[middleware] profile SELECT failed for d78f… : undefined`
It took ~30 minutes to manually pull the actual cause (`<html>` body, status 502) out of the underlying request stream. Now the same failure logs:
`[middleware] profile SELECT failed for d78f… { name: 'PostgrestError', code: null, message: '(empty)', status: 502, bodyExcerpt: '<html>...502 Bad Gateway...' }`
Diagnosis time: ~30 seconds.
## Impact
- No customer-facing UI change
- Internal observability: when Supabase has a transient issue, the cause is immediately visible in our logs and Sentry breadcrumbs
- The structured logger is a reusable lib helper (`summarizeSupabaseError`), so other middleware-adjacent error sites can adopt the same shape
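A sketch of what a summarizer with that output shape could look like. This is not the actual `summarizeSupabaseError` implementation; the function name `summarizeErrorSketch`, its signature, and the `"UnknownError"` fallback are assumptions, while the field list and the 500-character excerpt follow the description above.

```typescript
export interface ErrorSummary {
  name: string;
  code: string | null;
  message: string;
  status: number | null;
  details: string | null;
  hint: string | null;
  bodyExcerpt: string | null;
}

export function summarizeErrorSketch(
  error: unknown,
  status: number | null = null,
  rawBody: string | null = null,
): ErrorSummary {
  const e = (
    typeof error === "object" && error !== null ? error : {}
  ) as Record<string, unknown>;
  // Treat missing, non-string, and empty fields alike.
  const str = (v: unknown): string | null =>
    typeof v === "string" && v !== "" ? v : null;
  const name = str(e.name);
  const message = str(e.message);
  return {
    name: name === null ? "UnknownError" : name,
    code: str(e.code),
    // An HTML gateway response leaves `message` undefined; say so explicitly.
    message: message === null ? "(empty)" : message,
    status,
    details: str(e.details),
    hint: str(e.hint),
    // Keep only the first 500 characters of any raw response body.
    bodyExcerpt: rawBody === null ? null : rawBody.slice(0, 500),
  };
}
```

Logging the returned object rather than `error.message` alone means an HTML 502 still produces a line with the status and a body excerpt.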
v2.94 — Performance run durations now show real numbers
performance data-quality runner
A long-standing data-quality bug where every "Run took X seconds" UI silently rendered blank or zero is fixed. Per-test duration aggregations are now meaningful instead of computed against null data.
## What changed
Every performance run row had a `started_at` timestamp column that was perpetually empty — the runner never wrote to it, but the dashboard read from it everywhere durations were displayed. Result: every "Run took X seconds" UI rendered blank or zero, and every per-test median/p95 duration aggregation was calculated against null data.
v2.94 fixes this in two parts:
1. The performance runner now stamps the start timestamp the moment it accepts a run (before Lighthouse fires), in a single explicit write that's independent of the throttled progress-update path.
2. A one-off backfill set `started_at = created_at` for every existing terminal-state run, since for those rows the runner-accepted timestamp is no longer recoverable but the creation timestamp is a sub-second-accurate approximation.
## Why this matters
- Run-history UI now shows real durations instead of blank or zero
- Per-test duration trend charts (median, p95) are now computed against real data
- Performance report PDFs render correct elapsed-time values
## Impact
- No customer-facing UI change beyond "the duration column now shows numbers"
- Backfill is idempotent and safe — re-running it disturbs nothing
- All future runs will stamp `started_at` automatically at the moment the runner accepts the work
Three capture-quality improvements for the bugAgent CoPilot Chrome extension. Element picker now uses Playwright-grade selectors, DOM snapshots can be attached to bug reports, and the Screenshot button gets a Full Page mode.
## What's new in v1.5.0
### Element picker uses Playwright-grade selectors
When you click "Select Element" on the Report tab, the captured selector is now far more durable. Instead of brittle CSS paths like `#root > div.app > main > section > button:nth-child(2)` (which broke on any UI refactor), you'll see human-readable descriptors that survive class renames, sibling reorders, and layout changes:
- `[data-testid="submit"]`
- `role=button[name="Submit"]`
- `label="Email address"`
- `text="Sign in"`
The priority order matches Playwright's codegen — testid → role → label → placeholder → altText → title → text → CSS fallback. The recording flow already used this engine; the bug-report flow now matches.
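The priority chain can be pictured as a simple first-match cascade. The sketch below is illustrative only: the `ElementInfo` shape and `pickSelector` name are hypothetical, and the `placeholder=`, `alt=`, and `title=` output formats are guesses modeled on the four descriptor formats listed above.

```typescript
// Hypothetical pre-extracted facts about the clicked element.
export interface ElementInfo {
  testId?: string;
  role?: string;
  accessibleName?: string;
  label?: string;
  placeholder?: string;
  altText?: string;
  title?: string;
  text?: string;
  cssPath: string; // always available as the last resort
}

export function pickSelector(el: ElementInfo): string {
  const q = (s: string): string => JSON.stringify(s); // adds quotes + escaping
  // testid -> role -> label -> placeholder -> altText -> title -> text -> CSS
  if (el.testId) return `[data-testid=${q(el.testId)}]`;
  if (el.role && el.accessibleName) {
    return `role=${el.role}[name=${q(el.accessibleName)}]`;
  }
  if (el.label) return `label=${q(el.label)}`;
  if (el.placeholder) return `placeholder=${q(el.placeholder)}`; // assumed format
  if (el.altText) return `alt=${q(el.altText)}`; // assumed format
  if (el.title) return `title=${q(el.title)}`; // assumed format
  if (el.text) return `text=${q(el.text)}`;
  return el.cssPath;
}
```

The brittle CSS path is only reached when no user-facing attribute is available, which is why the resulting descriptors tend to survive refactors.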
### Optional DOM snapshot attachment
A new "Include DOM snapshot" checkbox lives next to the existing console/network capture toggles. When checked, the report includes a sanitized HTML snapshot (`dom-snapshot.html`) of the page at capture time:
- `<script>` tags stripped entirely
- Inline event handlers (`onclick`, etc.) stripped
- Password fields and credit-card inputs blanked
- Editable areas (textarea, contenteditable) truncated to 200 chars per node
- 500 KB hard cap with a clear truncate marker
Default OFF — strict opt-in because a 100–500 KB blob is overkill for most bugs and may carry per-user state.
UI bugs in particular benefit dramatically — "the layout is broken in this section" needs the whole tree, not just URL + title.
### Full-page screenshot
The Screenshot button now has a small chevron menu (▾) with two modes:
- **Visible area**: original behavior, captures only the viewport (default click)
- **Full page**: scrolls the page in viewport-height strips, captures each, and stitches them into a single image
Useful for bugs that span multiple screens of content. Graceful fallbacks for pages that don't allow programmatic scroll or exceed the 50-strip cap (≈40,000px tall).
Known trade-offs: sticky/fixed elements may appear once per viewport they were visible in (the alternative requires browser-debugger permission, which displays a security banner across all tabs — too intrusive for the marginal benefit). Lazy-loaded content not yet rendered won't appear.
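The strip planning step for Full page mode can be sketched as a pure function. The name `planStrips` and the choice to return `null` when the cap is exceeded are assumptions; what the caller does in that case (the graceful fallback) is left out here.

```typescript
const MAX_STRIPS = 50; // cap from the notes above

// Returns the scroll offsets to capture, or null when the page needs more
// strips than the cap allows and the caller should fall back.
export function planStrips(
  pageHeight: number,
  viewportHeight: number,
): number[] | null {
  if (pageHeight <= 0 || viewportHeight <= 0) return null;
  const count = Math.ceil(pageHeight / viewportHeight);
  if (count > MAX_STRIPS) return null;
  const maxScroll = Math.max(0, pageHeight - viewportHeight);
  const offsets: number[] = [];
  for (let i = 0; i < count; i++) {
    // Clamp the final strip so it never scrolls past the page bottom.
    offsets.push(Math.min(i * viewportHeight, maxScroll));
  }
  return offsets;
}
```

Because the last offset is clamped, the final strip overlaps the previous one on pages whose height is not a multiple of the viewport, so the stitching step has to draw only the non-overlapping part of that strip.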
## Rollout
v1.5.0 is queued for Chrome Web Store review. Once approved (typical 24–48 hour cycle), Chrome's auto-update will roll it out to all installed users over the following 24–48 hours.
Four post-launch improvements to the bugAgent CoPilot Chrome extension. Console capture is broader, recordings now survive long pauses, the severity dropdown matches the dashboard, and we'll now see extension-side failures in our alerting. Icon also got a refresh to match the homepage style.
## What's new in v1.4.0
### Wider console capture
When you tick "Include console output" on a bug report, we now capture every level of console output from the page (errors, warnings, **plus info, log, and debug**). Most real bugs leave their forensic breadcrumbs at info/log level — state dumps, lifecycle traces, timing hints — and the previous "errors only" capture missed roughly 80% of that diagnostic context. Buffer doubled to absorb the wider stream.
### Recording durability
If you start a recording and let the side panel sit idle for a while, your in-flight recording no longer gets dropped. Manifest V3 service workers shut down aggressively to save memory, and our recording state used to live only in that worker's memory. Now the recording is mirrored to session storage on every action, so an idle pause — or the panel being closed and reopened — picks up exactly where you left off.
### Severity codes match the dashboard
The severity dropdown now uses the formal QA codes (S1 Blocker / S2 Critical / S3 Major / S4 Minor) instead of legacy critical/high/medium/low. S3 stays the default with a small hint reminding you to reserve S1/S2 for genuinely blocking issues.
### Internal: extension-side failures now visible to us
The extension now reports its own internal failures (a failed report submission, an OAuth callback error, a resource-sync hiccup) to our existing error-tracking pipeline. Before this, prod failures were invisible to us until a customer mentioned them. Privacy boundary intact: no tokens, no DOM contents, no page text — just the error itself plus the section it occurred in.
### Icon refresh
Icon now matches the homepage hero — dark navy background with the neon green bug, instead of the previous green-on-green treatment. Clearer at every size from 16px (browser toolbar) to 128px (Chrome Web Store listing).
## Rollout
v1.4.0 is queued for Chrome Web Store review. Once approved (typical 24–48 hour cycle), Chrome's auto-update will roll it out to all installed users over the following 24–48 hours. No action needed on your end.
v2.93 — Two-tab project independence is now bulletproof
platform multi-tenant workspaces
Switching projects in one tab no longer affects what other tabs are showing. Each tab keeps its own active project, even after browser refreshes or address-bar navigation.
## What changed
When multiple browser tabs were open in the same workspace, switching the active project in one tab could silently flip the OTHER tab to the new project on its next click — particularly for tabs that were opened by typing the URL directly, clicking a bookmark, or arriving via an external link. The wrong project's data would render with no visible warning.
v2.93 closes that gap with two changes that work together:
1. The active-project switch no longer writes a browser-wide cookie. The switching tab gets pinned via its own URL; other tabs keep whatever project they were on.
2. Any direct dashboard URL (typed, bookmarked, or deep-linked) is now automatically tagged with the active project in the URL on first load. Internal navigation within that tab carries the tag forward, so the tab stays anchored to its own project independent of every other tab.
## Why this matters
Workspaces that segregate sensitive data by project (security findings vs. internal bugs vs. external customer reports) need genuine per-tab isolation. Before this fix, opening a fresh dashboard tab while another tab was switching projects could surface the wrong data unintentionally. After this fix, each tab is its own pinned context.
## Impact
- No customer-facing UI change — the project switcher works exactly as before.
- Multi-tab workflows are safer: comparing two projects side-by-side, or having a "scratch tab" while the main tab is on a different project, both work cleanly.
- Existing users keep their current project as the default for any new tabs they open until they explicitly switch.
v2.92 — multi-tab 503 fix
reliability multi-tab middleware infra
Two-part fix triggered by a multi-tab post-mortem: dashboard now runs on 2 replicas (single-replica + concurrent SSR could starve under heavy multi-tab use), and middleware no longer clobbers active_team / active_project cookies on URL-driven navigation.
## What changed
- **Dashboard scaled to 2 replicas.** Astro SSR is single-threaded per Node process, so a power user with many tabs open could backlog one replica enough for Railway's edge router to start returning 503. Two replicas give the router a peer to fall over to.
- **Middleware no longer persists URL-driven workspace selections to cookies.** When tabs each carried a different `?team=` or `?project=` URL param, the middleware was racing to overwrite the cookie on every request. A tab making a later cookie-only fetch could silently see another tab's workspace. Per-request resolution from URL params still works; persistence still flows through the explicit workspace/project switcher endpoints and through deep-link auto-switching.
## Why
A real user (multi-tab power user) hit a 503 on the mobile page that only resolved after closing all tabs — closing the tabs released the connections that had been starving the single replica. The 503 itself was Railway-router-side (no Sentry capture), but reviewing the middleware while investigating turned up a related cross-tab cookie-bleed pattern that was worth fixing in the same release.
Daily: compliance: close OFI-009 and dedupe orphaned OFI block in improvement log
daily automation ai bug-reports api
51 commits — Improvements: - compliance: close OFI-009 and dedupe orphaned OFI block in improvement log - compliance: capture v2.90 + v2.91 ships and close NC-023/NC-024/OFI-017
Improvements:
- compliance: close OFI-009 and dedupe orphaned OFI block in improvement log
- compliance: capture v2.90 + v2.91 ships and close NC-023/NC-024/OFI-017
- Ship to prod: v2.91 — runner moduleResolution: node10 → Node16
- deps: regenerate runner/src/version.ts for v2.90
- Ship to prod: v2.90 — mcp-server version-reporter
8 commits — Improvements: - Ship to prod: v2.82 — TEST-198 perf cluster: adopt dbErrorResponse - Ship to prod: v2.81 — TEST-187 dbErrorResponse helper + use in /api/automations/runs GET
Improvements:
- Ship to prod: v2.82 — TEST-198 perf cluster: adopt dbErrorResponse
- Ship to prod: v2.81 — TEST-187 dbErrorResponse helper + use in /api/automations/runs GET
- chore(bugagent): add recording-may-1-11-57-am-from-copilot.spec.ts
- Ship to prod: v2.80 — TEST-190 hide Agent Queue from non-TestLauncher workspaces
- chore(bugagent): add recording-may-1-10-00-am-from-copilot.spec.ts
v2.82 — Better error visibility on performance test endpoints
platform observability performance api
When something fails on the database side of any performance-test endpoint, the failure is now captured to our Sentry alerting with full context. No customer-facing UI change — clients still get the same opaque 500 with no internal details leaked.
## What changed
Nine performance-test API endpoints — listing tests, creating tests, updating/deleting tests, listing trends, creating runs, completing runs, archiving/unarchiving runs — used to return generic 500s with no logs and no Sentry capture when the underlying database query failed. The actual cause was invisible everywhere except an internal error log that only saw the status code.
v2.82 routes all of those failures through a shared helper that:
- Logs the underlying database error to our Railway logs
- Captures it to Sentry, grouped per call site (e.g. `performance.run.complete`, `performance.tests.list`) so issue volume per endpoint becomes alertable
- Attaches team_id, test_id, and run_id as extras so triage has full context
- Returns a generic `Internal error` JSON body to the client — no leakage of the underlying database error message or schema details (security)
## Why
Follow-up to the v2.81 helper. The performance-test cluster was prioritized because the run-completion callback fired this 500 path during recent dogfood testing with zero diagnostic trail in Sentry — once log retention rotated, we had nothing to debug from. The helper now buys us automatic visibility on the next occurrence.
## Impact
- No change to the user-facing API contract.
- The client still receives a generic 500 JSON for unexpected failures; no internal details leak.
- Internal observability: Sentry now sees these failures grouped per endpoint, so we can alert on regressions and correlate issues across deploys.
- This is the first cluster of a multi-PR sweep. Reports, test-cases, mobile, security, and other API clusters follow.
v2.81 — Better error visibility on automation runs
platform, observability, automations, api
When the dashboard can't fetch your automation runs because of a database hiccup, the failure is now surfaced to our Sentry alerting instead of being silently dropped. No customer-facing UI change.
## What changed
The `/api/automations/runs` GET endpoint used to return a generic 500 with no logs and no Sentry capture. If a Supabase hiccup caused the underlying query to fail, the only artifact was a row in our internal error log saying "API 500" — the actual cause was invisible.
v2.81 adds a small shared helper that:
- Logs the underlying database error to our Railway logs
- Captures it to Sentry with the team context attached, grouped by call site so we can alert on volume spikes
- Returns the same opaque `Internal error` JSON to the client (no leakage of internal schema or query details)
The helper is now used in the automation-runs list endpoint as the first adoption point. The same pattern will be rolled out across the rest of the API in a follow-up.
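As a rough illustration of the pattern described above, the helper could look something like the sketch below. This is not the actual implementation: the `ErrorReporter` interface stands in for the Sentry client, and the parameter names, return shape, and the `automations.runs.list` call-site tag are all assumptions made for the example.

```typescript
// Hypothetical sketch of a shared database-error helper.
// ErrorReporter is a stand-in for a Sentry client; the real SDK call is not shown.
interface ErrorReporter {
  capture(
    error: unknown,
    context: { callSite: string; extras: Record<string, unknown> },
  ): void;
}

interface JsonErrorResponse {
  status: number;
  body: { error: string };
}

function dbErrorResponse(
  error: unknown,
  callSite: string,
  extras: Record<string, unknown>,
  reporter: ErrorReporter,
  log: (message: string) => void = console.error,
): JsonErrorResponse {
  const detail = error instanceof Error ? error.message : String(error);
  // 1. Log the underlying database error server-side.
  log(`[${callSite}] database error: ${detail}`);
  // 2. Report it, tagged with the call site so issues group per endpoint.
  reporter.capture(error, { callSite, extras });
  // 3. Return an opaque body; nothing from `detail` reaches the client.
  return { status: 500, body: { error: "Internal error" } };
}
```

An endpoint would call this in its error branch, for example `return dbErrorResponse(err, "automations.runs.list", { team_id }, reporter)`, so the logging, capture, and opaque response stay consistent across call sites.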
## Why
A recent transient Supabase gateway hiccup fired this 500 path several times for an internal user, and we had zero diagnostic information beyond the URL. Surfacing those errors to Sentry means the next time anything similar happens we can see exactly which queries failed and why, without hand-grepping logs.
## Impact
- No change to the user-facing automation-runs UI or API contract.
- The client still receives the same generic 500 JSON for unexpected failures (no leakage of internal details).
- Internal observability: Sentry now sees these failures grouped by call site, so we can alert on regressions.
v2.80 — Cleaner sidebar for non-TestLauncher workspaces
platform, ui, navigation
Tidied an internal-only feature link that was leaking into customer dashboards. No customer-facing functionality changed; the menu just stops showing a feature you can't use yet.
## What changed
An internal "Agent Queue" navigation entry — used by our own bug-fix automation — was visible in every workspace's sidebar even though only one workspace (the TestLauncher dogfood team) had data behind it. Other teams saw the menu item, clicked it, and landed on an empty page with no explanation.
v2.80 hides the entry for any workspace that isn't the dogfood team and redirects direct-URL visits back to Bug Reports.
## Why
Agent Queue is an upcoming product feature. We built the first pass for our own internal dogfood, with a single-workspace data path, and the navigation gating wasn't put in at the same time. The next feature ticket will productize it for every paying workspace; until then, it shouldn't advertise itself.
## Impact
- No change to existing functionality. Bug Reports, Test Cases, Automations, etc. are untouched.
- Sidebars for non-dogfood workspaces have one fewer entry. Cleaner, less confusing.
- The dogfood Agent Queue itself continues to work for the team that owns it.
Daily: deps(mcp-server): bump @sentry/node in /mcp-server
daily, api
10 commits
Improvements:
- deps(mcp-server): bump @sentry/node in /mcp-server
- deps(mcp-server): bump zod from 4.3.6 to 4.4.1 in /mcp-server
- deps(website): bump astro from 6.1.10 to 6.2.0 in /website
- deps(dashboard): bump @sentry/astro in /dashboard
- Ship to prod: v2.79 — TEST-181 downgrade Supabase OAuth log level (Sentry noise fix)
v2.79 — Quieter MCP OAuth logs
platform, observability, mcp, auth
The MCP server now logs Supabase OAuth token-exchange failures at warn instead of error. User-side OAuth hiccups (expired PKCE codes, mid-flow browser refreshes, parallel auth attempts) no longer generate Sentry pages.
## What changed
When a Supabase OAuth token exchange fails, the MCP server logs the failure at `warn` level instead of `error`. The diagnostic message still prints to Railway logs, so a real outage would still be debuggable — but `@sentry/node` no longer auto-captures every transient user-side failure.
## Why
Supabase PKCE token exchanges fail for user-side reasons most of the time:
- Expired auth code (the user took >5 min between Google login and redeem)
- Code-verifier mismatch from a browser refresh in the middle of the OAuth handshake
- Parallel auth attempts overwriting each other's verifier in browser storage
None of those are server bugs we can act on, and they were generating a steady stream of Sentry events with no actionable signal. The fix is one line plus an inline comment so future readers understand the rationale and don't "fix" it back to error level.
## Impact
- Cleaner Sentry inbox for the MCP server — fewer noise events drowning out real signals.
- No change to the user-facing OAuth flow. Failed exchanges still return the same error to Claude Code; only the server-side log level changed.
- Railway logs at warn level are still searchable if a real systemic issue ever surfaces.
Improved the Sentry signal-to-noise ratio. Downgraded a Supabase OAuth token-exchange log from error to warn so user-side auth failures stop generating events; resolved one stale issue and set three others to ignore (forever for browser-extension noise, until-escalating for probe-detection signals).
## What changed
* **mcp-server/src/auth.ts** — Supabase PKCE token-exchange failure log downgraded from `console.error` to `console.warn`. `@sentry/node` no longer auto-captures it. Message stays in Railway logs at warn level for debugging if a real issue surfaces.
* **JAVASCRIPT-ASTRO-B** resolved (confirmed not recurring after 2 days).
* **JAVASCRIPT-ASTRO-A + NODE-EXPRESS-3** ignored forever — both are noise from user-installed browser extensions (`views.js` injecting an `updateFrom` method that we never call).
* **NODE-EXPRESS-2** set to ignore-until-escalating — the `[security] Invalid bugAgent API key probed against /mcp` issue is a real probe-detection signal (not noise) but routine background traffic; Sentry will surface it again if probe volume escalates above the current baseline.
## Why
User-side OAuth failures (expired PKCE codes, browser refreshes mid-flow, parallel auth attempts) were generating Sentry events that aren't actionable for us. The signal-to-noise ratio in Sentry was poor, making it harder to spot real issues. This pass clears the routine noise while preserving real signals (the API-key-probe issue stays as escalation-aware).
Foundation for multi-agent operation. New claim_bug MCP tool atomically transitions a ticket from new to in-progress; a 30-min lease + 5-min reaper cron returns abandoned claims to the queue. Multi-agent capability is now unlocked — you can run multiple Claude Code sessions against the bugAgent MCP server and they'll pick non-overlapping tickets.
## What changed
The agent priority queue (read-only view, shipped in v2.69 via migration 137) and its `pick_next_bug` MCP wrapper (v2.77) were the data plane. v2.78 adds the atomic claim primitive on top: a new `claim_bug` MCP tool that race-safely transitions a ticket from `status='new'` to `'in-progress'`, sets `assigned_to` to the calling user, and stamps a `claimed_at` timestamp.
Race safety comes from Postgres's UPDATE-WHERE-RETURNING pattern: the `WHERE status='new'` clause IS the lock. If two agents race on the same id, exactly one wins; the loser's WHERE clause matches zero rows because status was no longer 'new' by then.
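The claim semantics can be modeled in a few lines. The sketch below is only an illustration: the SQL string shows the general UPDATE-WHERE-RETURNING shape (the table name `bug_reports` and the parameter order are assumptions), and `claimBug` is an in-memory stand-in that mimics what the database does when the `WHERE` clause matches zero rows.

```typescript
type TicketStatus = "new" | "in-progress";

interface Ticket {
  id: number;
  status: TicketStatus;
  assigned_to: string | null;
  claimed_at: Date | null;
}

// Shape of the statement described above. Table name is hypothetical.
const CLAIM_SQL = `
  UPDATE bug_reports
     SET status = 'in-progress', assigned_to = $2, claimed_at = now()
   WHERE id = $1 AND status = 'new'
  RETURNING *`;

// In-memory model of the same compare-and-set: the status check and the
// write happen as one step, so a second caller sees 'in-progress' and loses.
function claimBug(
  tickets: Map<number, Ticket>,
  id: number,
  userId: string,
  now: Date,
): Ticket | null {
  const ticket = tickets.get(id);
  if (ticket === undefined || ticket.status !== "new") {
    return null; // equivalent to RETURNING yielding zero rows
  }
  ticket.status = "in-progress";
  ticket.assigned_to = userId;
  ticket.claimed_at = now;
  return ticket;
}
```

If two agents call `claimBug` for the same id, the first call returns the ticket and the second returns `null`, which is the signal to go back to `pick_next_bug` and try another ticket.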
A pg_cron job runs every 5 min and releases stale claims (status `in-progress` with a `claimed_at` more than 30 min old) back to `new`. So if an agent crashes mid-fix, its ticket re-enters the queue automatically. Human dashboard assignments don't set `claimed_at`, so they're never reaped; only programmatic claims via this tool participate in the lease.
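The lease rule can be sketched as a pure function. This is an in-memory model, not the actual cron job: the field names follow the text, while clearing `claimed_at` on release is an assumption about what the real reaper does.

```typescript
interface LeasedTicket {
  id: number;
  status: string;
  claimed_at: Date | null;
}

const LEASE_MS = 30 * 60 * 1000; // 30-minute lease from the text

// Returns the ids that were released back to the queue.
function reapStaleClaims(
  tickets: LeasedTicket[],
  now: Date,
  leaseMs: number = LEASE_MS,
): number[] {
  const released: number[] = [];
  for (const ticket of tickets) {
    const isStale =
      ticket.status === "in-progress" &&
      ticket.claimed_at !== null && // human assignments have no claimed_at
      now.getTime() - ticket.claimed_at.getTime() > leaseMs;
    if (isStale) {
      ticket.status = "new";
      ticket.claimed_at = null; // assumption: the lease stamp is cleared
      released.push(ticket.id);
    }
  }
  return released;
}
```

A ticket that a human moved to in-progress from the dashboard has `claimed_at === null`, so the function skips it regardless of how long it has been open.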
## What this unlocks
Multi-agent operation is now safe. You can spin up multiple Claude Code sessions pointed at the bugAgent MCP server and have them work the queue concurrently. Each session calls `pick_next_bug` → `claim_bug` → fix → push → comment. The atomic claim ensures they pick non-overlapping tickets; the reaper handles abandoned work.
The orchestrator skeleton (TEST-178, queued) is the automation layer that wraps this pattern with a fixed system prompt, tool budget, and Claude SDK subagent dispatch. Until that lands, the same capability is available manually via parallel Claude Code sessions.
Two consumer surfaces for the existing agent_priority_queue SQL view (shipped in v2.69). A new /dashboard/agent-queue page shows what the agent loop will pick up next; a new pick_next_bug MCP tool lets agents query the same priority order programmatically. Both are read-only — atomic claim semantics arrive in a follow-up.
## What changed
The SQL view `public.agent_priority_queue` (S1+S2+S3 in `status=new`, oldest-first within each bucket) has been live since v2.69 but had no consumer surfaces. v2.77 adds:
* **`/dashboard/agent-queue`** — Astro SSR page. Severity counts at the top, dense table of the next 50 tickets in priority order. Sidebar nav entry under Bug Reports.
* **`pick_next_bug` MCP tool** — read-only, returns rows in the same shape as `list_bug_reports`. Optional `severity` filter (s1/s2/s3) and `limit` (1-50, default 1).
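The ordering and the tool's parameters can be illustrated with an in-memory equivalent. This sketch assumes S1 sorts before S2 before S3, that "oldest-first" means ascending `created_at`, and that out-of-range limits are clamped; the real tool may reject them instead.

```typescript
type Severity = "s1" | "s2" | "s3" | "s4";

interface QueueRow {
  id: number;
  severity: Severity;
  status: string;
  created_at: string; // ISO timestamp; field name is an assumption
}

const QUEUE_BUCKETS: Severity[] = ["s1", "s2", "s3"];

function pickNextBugs(
  rows: QueueRow[],
  opts: { severity?: Severity; limit?: number } = {},
): QueueRow[] {
  const limit = Math.min(50, Math.max(1, opts.limit ?? 1)); // 1-50, default 1
  return rows
    .filter((r) => r.status === "new" && QUEUE_BUCKETS.includes(r.severity))
    .filter((r) => opts.severity === undefined || r.severity === opts.severity)
    .sort(
      (a, b) =>
        QUEUE_BUCKETS.indexOf(a.severity) - QUEUE_BUCKETS.indexOf(b.severity) ||
        Date.parse(a.created_at) - Date.parse(b.created_at),
    )
    .slice(0, limit);
}
```

Because the function only reads and returns rows, calling it repeatedly has no side effects, which matches the read-only contract described for the tool.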
## Why read-only
Deliberate design choice. Agents can peek at the queue without side effects (multi-step deliberation, exploring before committing, multi-model triage). Atomic claim semantics live in a separate upcoming `claim_bug` tool so the data plane stays composable across single-agent / multi-replica / orchestrator+subagents architectures.
## What's next
* **TEST-177** — `claim_bug` atomic primitive + `claimed_at` lease + reaper cron. Foundation for any multi-consumer architecture.
* **TEST-178** — Orchestrator skeleton (long-running poller that polls the queue → claims → spawns Claude subagents per ticket).
These together ship the "agent v0 readiness" milestone.
v2.75 + v2.76 — Semgrep workflow no longer emails on every PR
devops, ci, semgrep
Three-layer fix unblocks the Semgrep SAST workflow that had been failing on every PR for weeks. CLI flag, missing permission, and GHAS-disabled state all addressed.
## What changed
The Semgrep CI workflow had been failing on every PR (and pushing failure-notification emails to the repo owner) for weeks. Fixed in three sequential layers as each one uncovered the next:
1. **v2.75** — removed the deprecated `--error` flag from `semgrep ci` (it was a `scan`-only flag in newer Semgrep CLIs).
2. **v2.76 part 1** — added `actions: read` permission so the `upload-sarif` action can read workflow-run metadata.
3. **v2.76 part 2** — `continue-on-error: true` on the SARIF upload step. The repo is private and doesn't have GitHub Advanced Security enabled, so the Security-tab upload can't land. Findings still print in workflow logs (the primary review surface today). The upload step stays in the workflow so a future GHAS license decision works without a code change.
## Effect
No more "Semgrep SAST workflow failed" emails on every PR. Workflow now exits green. If GHAS is ever enabled on this repo, the upload step starts succeeding automatically.
## Lesson
When fixing a CI workflow, the real smoke test can be the **next** workflow run, not the PR's own. When a PR's CI runs against the base branch's workflow definition (as `pull_request_target` workflows do), the new definition only takes effect on subsequent merged commits. Calling a CI fix "done" without observing one post-merge run is premature.
v2.74 — bug-reports pagination no longer disappears after kanban view round trip
bug-fix, dashboard, ui
The view toggle on /dashboard/reports correctly hid pagination when switching to kanban. The reverse direction was never restoring the display attribute, so after one round trip pagination stayed invisible until the user refreshed the page. Single-line fix in dashboard/public/js/kanban.js switchToList().
## What changed
The Bug Reports list page has a view toggle (List / Kanban) at the top. Switching to Kanban hides the pagination element with `style.display = "none"` — correct, since kanban has no pages. The reverse function (Kanban → List) was resetting the kanban container, list container, and filters but never the pagination. Result: after one round trip, the pagination div stayed `display: none` forever even though the underlying list refresh was rewriting its HTML.
One line added to switchToList() restores the display state. Round trip now leaves pagination visible.
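The asymmetry is easy to see in a stripped-down model of the two toggle functions. The element names and the use of an empty string to restore the default display value are assumptions for illustration; the actual `kanban.js` code is not reproduced here.

```typescript
// Minimal stand-in for a DOM element's inline style.
interface Displayable {
  style: { display: string };
}

interface ReportViewElements {
  list: Displayable;
  kanban: Displayable;
  pagination: Displayable;
}

function switchToKanban(el: ReportViewElements): void {
  el.list.style.display = "none";
  el.kanban.style.display = "";
  el.pagination.style.display = "none"; // kanban has no pages
}

function switchToList(el: ReportViewElements): void {
  el.kanban.style.display = "none";
  el.list.style.display = "";
  el.pagination.style.display = ""; // the previously missing restore
}
```

Without the last assignment in `switchToList`, a List to Kanban to List round trip leaves `pagination.style.display` stuck at `"none"`.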
## Note
This is the bug Jason originally flagged as "I don't see pagination anymore." We initially mis-triaged it as the page-2 count-inflation issue (TEST-24, fixed in v2.69), then mis-filed it as TEST-174 before the actual symptom was clear. TEST-174 closed as duplicate of TEST-24. TEST-175 finally closes the visibility-loss leg.
v2.73 — tests-or-explain CI gate enforces TDD discipline mechanically
devops, ci, tdd, change-management
A new GitHub Actions workflow fails any PR touching dashboard/src/** without a matching test diff or an explicit "No test added —" opt-out in the PR body or a commit message. CLAUDE.md's TDD rule is now machine-enforced rather than policy-only.
## What changed
The rule "every bug-fix commit adds a test or explains why" has lived in CLAUDE.md for a while as discipline. As of v2.73 it's mechanically enforced by `.github/workflows/tests-or-explain.yml` — a small CI workflow that runs on every PR and:
1. Looks at the diff between the PR base and head.
2. If anything under `dashboard/src/**` (excluding `*.test.ts`) was touched AND no test file was modified AND the PR body / commit messages don't contain the literal string `No test added —` (em-dash), the workflow fails.
3. Otherwise it passes (with a friendly notice noting which path satisfied the gate).
The em-dash is U+2014 — typing two ASCII hyphens won't match. CLAUDE.md notes this explicitly so anyone hitting the gate can self-resolve.
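The decision logic in the numbered steps above can be sketched as a pure function. This is an illustration rather than the workflow's actual script: the function name, the result shape, and the rule that any `*.test.ts` path counts as a test file are assumptions. The opt-out marker is written with a `\u2014` escape so the U+2014 requirement is explicit.

```typescript
// "No test added" followed by a space and U+2014 (em-dash).
const OPT_OUT_MARKER = "No test added \u2014";

interface GateResult {
  pass: boolean;
  reason: string;
}

const isTestFile = (path: string): boolean => path.endsWith(".test.ts");

const isProdSource = (path: string): boolean =>
  path.startsWith("dashboard/src/") && !isTestFile(path);

function testsOrExplain(
  changedFiles: string[],
  prBody: string,
  commitMessages: string[],
): GateResult {
  if (!changedFiles.some(isProdSource)) {
    return { pass: true, reason: "no dashboard/src production code touched" };
  }
  if (changedFiles.some(isTestFile)) {
    return { pass: true, reason: "test file modified" };
  }
  if ([prBody, ...commitMessages].some((text) => text.includes(OPT_OUT_MARKER))) {
    return { pass: true, reason: "explicit opt-out found" };
  }
  return { pass: false, reason: "production change without test or opt-out" };
}
```

A PR body containing `No test added -- trivial` (two ASCII hyphens) does not satisfy the opt-out branch, which is the failure mode the note above warns about.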
## Self-validation
The gate ran on its own introducing PR (#99) and passed trivially because the PR has no `dashboard/src/**` production-code changes — confirming both that GitHub Actions runs newly-added workflows on the PR that adds them, and that the trivial-pass branch works. The two failure paths (prod-code-without-test, prod-code-with-opt-out) will be exercised by the next real bug-fix PR.
## Required-checks list
Branch protection from TEST-170 currently requires only the `test` (Vitest) check. Once we've seen a few clean ships through the new gate, the `tests-or-explain` workflow gets promoted to required so it becomes a hard merge block instead of advisory.
v2.72 — PR-only merges to main + branch protection enforced
devops, ci, change-management
Direct pushes to main are blocked. Every change now goes through a pull request and must have the `test` (Vitest) status check passing before it can merge. Admins keep an emergency-hotfix bypass; reviewer requirement is deferred until the team grows.
## What changed
Branch protection is enabled on `main`. The everyday ship flow is now:
1. Branch off main: `git checkout -b bug/TEST-XXX-slug`
2. Work, commit, bump `package.json` version, push
3. `gh pr create` with title `Ship to prod: vX.Y — <one-liner>`
4. CI runs the `test` workflow; PR cannot merge until it's green
5. `gh pr merge --merge --delete-branch <PR>` merges with a `--no-ff`-style merge commit, keeping the per-iteration history
Documented in CLAUDE.md "Ship flow" with the multi-ticket variant (via develop) and the admin-only emergency-hotfix path. Two new "Mistakes to avoid" entries (#15: don't push directly to main; #16: don't add stuck-failing checks to required) capture the nearby footguns.
## What's NOT required (yet)
- `Semgrep SAST` — currently failing on main, independent issue
- CodeQL `Analyze (javascript-typescript)` / `Analyze (python)` — green but not yet vetted as PR-time gating
- Reviewer requirement — deferred until team grows; today's single-author flow still functions
The required set will widen as we get a few clean ships through the new flow and as Semgrep is fixed.
create_bug_report and update_bug_report now accept the full 19-value type enum the dashboard uses. update_bug_report and list_bug_reports gained resolution + root_cause fields, closing the gap that forced every agent close to fall back to raw SQL.
## What changed
The MCP wire schema had drifted: the bug-type vocabulary grew from 7 → 19 categories on the dashboard side over many releases, but the MCP server's Zod enums never followed along. Until this release, every `create_bug_report` call with a newer category (`devops`, `technical-debt`, `feature-request`, etc.) was either rejected or silently mis-classified. Resolution and root_cause had the same problem: the dashboard had the columns and the agent-loop convention required them on close, but `update_bug_report` had no way to set either field over the wire, so every close needed raw SQL.
## Now
- `create_bug_report({type})` accepts all 19 dashboard categories.
- `update_bug_report` accepts new `resolution` (6-value enum) and `root_cause` (open-ended kebab-case taxonomy with documented canonical set) fields, settable in the same call as `status`.
- `list_bug_reports` filters on those new fields too, and the returned rows include `status` / `resolution` / `root_cause` for downstream filtering.
- Public API + MCP docs updated.
No DB migration — Postgres columns are already `text`. The wire-format widening is strictly additive (new values are a superset of old); existing callers are unaffected.
v2.70 — bug-reports search now matches by short ID (PREFIX-NNN)
bug-fix, dashboard, search, tdd
The search box on /dashboard/reports now recognizes the short-ID format (TEST-24, test-024) and bare positive integers (24) as exact ticket-number lookups, in addition to the existing title + description text search.
## What changed
Typing a short ID like `TEST-24`, `test-024`, or just `24` into the bug-reports search box now matches the actual ticket #24, not only reports whose title/description literally contain those characters. Wrong-prefix inputs (e.g. another workspace's short ID format) silently fall through to plain text search — no cross-workspace leakage. Regular text queries behave exactly as before.
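The recognition rules above could be expressed roughly as follows. This is a sketch of the parsing step only, not the contents of `report-search.ts`; the function name and signature are hypothetical, and the PostgREST clause it would feed is omitted.

```typescript
// Returns the ticket number if the query is a short ID for this workspace
// or a bare positive integer; otherwise null (caller falls back to text search).
function parseShortId(query: string, workspacePrefix: string): number | null {
  const match = /^(?:([A-Za-z][A-Za-z0-9]*)-)?(\d+)$/.exec(query.trim());
  if (match === null) return null;

  const prefix = match[1];
  if (
    prefix !== undefined &&
    prefix.toUpperCase() !== workspacePrefix.toUpperCase()
  ) {
    return null; // another workspace's prefix: treat as plain text
  }

  const ticketNumber = Number.parseInt(match[2] ?? "", 10);
  return ticketNumber > 0 ? ticketNumber : null;
}
```

Under these assumptions, `TEST-24`, `test-024`, and `24` all resolve to ticket 24 in a workspace whose prefix is `TEST`, while a different prefix or an ordinary phrase returns `null`.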
## Why
Display IDs are everywhere now (kanban cards, deep links, comments, even MCP tool inputs), but the search couldn't reach them. Filed by Jason as a small ergonomics request; Russell will retest in v2.70.
## Implementation note
New pure helper `dashboard/src/lib/report-search.ts` with 12 unit tests (193 tests total now) builds the PostgREST `.or()` clause. Wired into both the SSR list page and the `/api/reports` endpoint that powers the client-side filter refresh + kanban card fetches.
v2.69 — bug-reports list page 2+ no longer inflates total count
bug-fix, dashboard, realtime, tdd
Polling and Realtime INSERT paths in the Bug Reports list view were prepending the latest 10 reports onto whatever page the user was on, inflating page 2+ tbody by ~10 rows and the "X total reports" widget by the same amount a few seconds after initial render. Both paths now no-op on non-first pages.
## What changed
The Bug Reports list page polls the latest 10 reports every 30s (with an immediate poll on visibility change) so users see new bugs come in without refreshing. That fetch was unconditionally `?limit=10&offset=0`, regardless of which page the user had open. On page 2+ those 10 rows didn't belong on the visible list — they got prepended to the tbody and bumped the "X total reports" widget by 10. Symptom: page 2 starts at the right total, jumps by 10 a few seconds later.
## Fix
A new pure helper `dashboard/src/lib/list-page.ts` exports `isOnFirstPage(search)` — defensive parsing, fails open on malformed URLs. The polling and `addReportRow` paths both gate on it and no-op when the user isn't on page 1. Six new unit tests lock in the behavior (181 tests total, was 175). The Realtime INSERT handler also benefits since it goes through the same `addReportRow` function.
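A helper with that contract might look like the sketch below. Two things here are assumptions: that the page number lives in a `page` query parameter, and that "fails open" means treating unparseable input as page 1 so polling keeps working.

```typescript
// `search` is a location.search-style string such as "?page=2&status=new".
function isOnFirstPage(search: string): boolean {
  const raw = new URLSearchParams(search).get("page");
  if (raw === null || raw.trim() === "") return true; // no page param: page 1

  const page = Number(raw);
  if (!Number.isInteger(page)) return true; // malformed: fail open

  return page <= 1;
}

// Gate used by both the poll and the row-prepend path (illustrative name).
function shouldPrependLatest(search: string): boolean {
  return isOnFirstPage(search);
}
```

With this gate in place, a poll that fires while the user is on `?page=2` simply does nothing instead of prepending rows that belong on page 1.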
## Trade-off
Quick-submit's rapid-mode also calls `addReportRow`. On page 2+ a newly-submitted bug no longer appears as a phantom row at the top of the user's current view — but the textarea still flashes "Bug submitted!" for feedback, and the new bug is on page 1 where it actually belongs.
## Bundled with this ship
Also shipping the "Agent priority bucket" CLAUDE.md section + the `public.agent_priority_queue` SQL view (migration 137) that codify which tickets the agent loop picks up. No user-facing change from those — internal documentation + queryable view for future automation.
v2.68 — CoPilot Resources tab scoped to current workspace + project
bug-fix, chrome-extension, copilot
Fixed a render-layer leak in the bugAgent CoPilot Chrome extension where the Resources tab showed every workspace's recordings to anyone in any workspace. Sync was already workspace-aware; the fix completes the scoping at render time. Extension version 1.3.1; rebuild and reload the extension to pick it up.
## What changed
The CoPilot extension keeps a local IndexedDB store of your recordings and resources for fast tab rendering. The sync layer correctly fetches per workspace + project, but the render layer was doing an unfiltered `getAll()` on the local store, so once you'd visited two workspaces the Resources tab merged their recordings together. Now the render layer filters to the workspace + project you have selected in the sidebar, and switching workspaces re-renders instantly from cache before the next sync round-trip.
## Distribution
The Chrome extension does not auto-deploy via Railway. To pick up v1.3.1, rebuild via `chrome-extension/build.sh` and reload the unpacked extension (or wait for the next Chrome Web Store push). Web app, MCP, and Runner are unchanged.
v2.67 — bug-report severity normalized to formal QA codes (s1-s4)
bug-fix, dashboard, mcp, data-migration, api
Single canonical severity vocabulary across the dashboard, MCP, and public docs. Storage migrated; a write-side trigger keeps it normalized regardless of input shape. Wire format still accepts the legacy aliases for backward compatibility.
## What changed
Bug-report severity now uses exactly four canonical values everywhere they appear in the UI: **s1 (Blocker)**, **s2 (Critical)**, **s3 (Major)**, **s4 (Minor)**. Previously the list filter showed eight options (the four formal codes plus four legacy `critical`/`high`/`medium`/`low` aliases marked `(legacy)`), and the detail-page picker would sometimes show five options when a stored row used a legacy value.
## Why it's safe even on older reports
Migration 135 rewrote every legacy-severity row in storage to its formal QA equivalent (`critical → s1`, `high → s2`, `medium → s3`, `low → s4`) — the same mapping already baked into the P0/P1/P2/P3 priority hint. Migration 136 added a `BEFORE INSERT OR UPDATE` trigger that normalizes any future legacy input at the storage boundary, so external integrations (MCP, REST, CLI agents) can keep posting either shape on the wire without re-introducing legacy in the database.
## New developer surface
`dashboard/src/lib/severity.ts` is the single source of truth for the vocabulary. It exports:
- `FORMAL_SEVERITIES` / `LEGACY_SEVERITIES` constants (in priority order)
- `normalizeSeverity(input)` — returns the formal code or `null`
- `normalizeSeverityOrDefault(input, fallback?)` — same with an explicit fallback
Use these helpers anywhere you're about to compare or write a severity value.
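For orientation, here is a minimal sketch of what helpers with these names could look like. It uses the legacy-to-formal mapping stated above; the trimming, the case-insensitive matching, and the `s3` default fallback are assumptions, not confirmed behavior of `severity.ts`.

```typescript
const FORMAL_SEVERITIES = ["s1", "s2", "s3", "s4"] as const;
const LEGACY_SEVERITIES = ["critical", "high", "medium", "low"] as const;

type FormalSeverity = (typeof FORMAL_SEVERITIES)[number];

// Both arrays are in priority order, so index i of the legacy list
// maps to index i of the formal list (critical -> s1, ..., low -> s4).
function normalizeSeverity(input: unknown): FormalSeverity | null {
  if (typeof input !== "string") return null;
  const value = input.trim().toLowerCase();

  const formalIndex = (FORMAL_SEVERITIES as readonly string[]).indexOf(value);
  if (formalIndex >= 0) return FORMAL_SEVERITIES[formalIndex] ?? null;

  const legacyIndex = (LEGACY_SEVERITIES as readonly string[]).indexOf(value);
  return legacyIndex >= 0 ? FORMAL_SEVERITIES[legacyIndex] ?? null : null;
}

function normalizeSeverityOrDefault(
  input: unknown,
  fallback: FormalSeverity = "s3", // hypothetical default
): FormalSeverity {
  return normalizeSeverity(input) ?? fallback;
}
```

Comparing `normalizeSeverity(a) === normalizeSeverity(b)` rather than the raw strings avoids treating `high` and `s2` as different severities.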
## Public API contract
Unchanged for callers — `update_bug_report` and `POST /api/reports` still accept either the formal codes (`s1`-`s4`) or the legacy aliases (`critical`/`high`/`medium`/`low`) on input. Reads always return the formal codes.
v2.57 — web sessions persist across browser restart + MCP status enum aligned
auth, mcp, bug-fix, dashboard
Two fixes shipped together: bugAgent now keeps you signed in across browser restart (matching every peer product surveyed), and the MCP server's update_bug_report / list_bug_reports tools now use the dashboard's canonical eight-value status enum.
## Web session persistence (TEST-155)
bugAgent's web app no longer signs you out when you close the browser. A survey of peer products (1Password, Notion, Slack, Linear, Loom, LastPass) showed that all of them keep web sessions alive across restart. We were the outlier with the strict-session policy, and it was actively frustrating power users. The cookie-stripping shim and the per-tab sessionStorage marker check are gone: sign in once, close the browser, reopen tomorrow, and you're still signed in.
## MCP status enum aligned (TEST-162)
The MCP server's `list_bug_reports` filter and `update_bug_report` mutator now accept exactly the dashboard's eight canonical status values:
`new`, `awaiting-triage`, `confirmed`, `in-progress`, `resolved`, `retesting`, `closed`, `reopened`
Hyphens in `awaiting-triage` and `in-progress` are deliberate — the kanban + detail page render against those literal strings. The old MCP-only values (`open`, `in_progress`, `triaged`, …) are now rejected at the validation layer, so external agents can no longer accidentally write an unrouted literal that disappears from the kanban. Public docs at `/api-reference` and `/mcp` updated in the same ship.
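The validation behavior can be illustrated without any schema library. The sketch below uses a plain TypeScript union instead of the server's actual Zod enum, and the function name and error message are invented for the example.

```typescript
const BUG_STATUSES = [
  "new",
  "awaiting-triage",
  "confirmed",
  "in-progress",
  "resolved",
  "retesting",
  "closed",
  "reopened",
] as const;

type BugStatus = (typeof BUG_STATUSES)[number];

// Accepts only the eight canonical literals; anything else is rejected
// before it can be written to storage.
function parseBugStatus(input: string): BugStatus {
  const found = BUG_STATUSES.find((status) => status === input);
  if (found === undefined) {
    throw new Error(
      `Invalid status "${input}"; expected one of: ${BUG_STATUSES.join(", ")}`,
    );
  }
  return found;
}
```

Under this model `parseBugStatus("in-progress")` succeeds, while the old underscore form `in_progress` or the old `open` value throws instead of silently producing a row the kanban cannot route.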
No DB migration — the column is already `text` and enforcement was always at the API + UI layer; this fix moves the MCP layer back into agreement with everyone else.
v2.38 — Real daily backup verification for compliance
compliance, backups
Backup verification on the compliance dashboard now actually checks the database vendor's backup state through their API instead of just attesting that backups happen. A daily cron fires the check at 04:00 UTC for every workspace that has the integration configured, so the audit trail builds itself.
## What changed
The Backups card on the Compliance dashboard previously inserted "we believe backups are happening" attestation rows when you clicked Verify. v2.38 replaces that with a real check against the database vendor's API — for workspaces that have the integration configured, the verification row now records the actual age of the most recent backup.
A daily cron at 04:00 UTC fires the same verification automatically, so the audit trail in the Backups history table builds itself even if no one opens the page.
## To enable real verification on your workspace
Add a Supabase Personal Access Token to your Compliance connection's credentials as `management_token`. Without one the row records `status: unknown` with a note explaining how to enable the real check — the dashboard still keeps the trail continuous, just flagged as assertion-only.
## Why it matters
For SOC 2 / ISO 27001 audits, an auditor wants to see independent verification of vendor backups on a recurring schedule, not just a "the vendor says they back up" assertion. The old flow was the latter; v2.38 is the former.
v2.17 — AI Assistant streams its replies
performance, ai
The AI Assistant now starts replying within about half a second instead of waiting for the entire answer to finish before showing anything. Words appear as Claude generates them, the same way Claude.ai and ChatGPT work.
## What changed
When you send a message, the assistant now streams its response token-by-token instead of waiting for the complete answer and showing it all at once.
## Impact
- **First word appears in ~500ms** instead of after the entire 10-17 second generation finishes.
- **Total response time is the same** — Claude still takes the same amount of time to think — but you can read along instead of staring at a typing indicator.
- **Long technical answers** (bug analyses, multi-step plans) feel dramatically more responsive.
- **Short answers** ("yes", one-line replies) are basically the same as before.
The public session AI page and any external integrators still receive the original buffered JSON response — nothing breaks.
v2.06 — Bell icon updates instantly
performance, notifications
New messages now appear in the bell icon as soon as they're created instead of after the next poll. The bell still works exactly the same — clicking, marking as read, deleting — it just refreshes itself in real time now.
## What changed
The bell-icon dropdown was previously checking for new messages every 30 seconds. Now it opens a single live connection per session that delivers new messages as they happen. No more waiting for the next poll cycle.
After a brief network blip, the connection reconnects and refreshes automatically, so notifications don't get dropped.
## Impact
- New messages and notifications appear instantly instead of within 30 seconds.
- The dashboard makes far fewer background requests, which keeps the rest of your tabs snappier.
v1.96 — Faster bug report detail page
performance
Opening a bug report should feel noticeably snappier — the same dozen database lookups now run in parallel instead of one after another.
## What changed
The bug report detail page was making roughly a dozen database queries sequentially before rendering — each one waiting for the previous to complete. The new version groups them into parallel batches, so the page only waits for the slowest query, not the sum.
## Impact
- The /dashboard/reports/<id> page should load substantially faster.
- Same data, same UI, same numbers — just less waiting.
Combined with the middleware auth cache from earlier this week, most authed pages should now feel responsive.
v1.95 — Faster analytics dashboard
performance
The analytics dashboard endpoint is now noticeably quicker — most of the time it spent waiting on the database has been eliminated by issuing queries in parallel.
## What changed
The analytics page was making over a dozen database queries one after another, each one waiting for the previous to finish. The new version groups them into two parallel batches: independent reads first, then a smaller batch for queries that need ids from the first round.
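The two-batch structure looks roughly like this. The query functions below are placeholders that simulate latency with timers; their names, return values, and the 30 ms delay are invented for illustration and do not reflect the real analytics queries.

```typescript
// Simulates a database call that resolves after `ms` milliseconds.
const fakeQuery = <T>(value: T, ms: number): Promise<T> =>
  new Promise((resolve) => setTimeout(() => resolve(value), ms));

// Hypothetical queries.
const fetchProjectIds = (): Promise<number[]> => fakeQuery([1, 2, 3], 30);
const fetchReportCount = (): Promise<number> => fakeQuery(42, 30);
const fetchRunCount = (projectIds: number[]): Promise<number> =>
  fakeQuery(projectIds.length * 10, 30);
const fetchTestCaseCount = (projectIds: number[]): Promise<number> =>
  fakeQuery(projectIds.length * 5, 30);

async function loadAnalytics(): Promise<{
  reportCount: number;
  runCount: number;
  testCaseCount: number;
}> {
  // Batch 1: reads that depend on nothing else run concurrently.
  const [projectIds, reportCount] = await Promise.all([
    fetchProjectIds(),
    fetchReportCount(),
  ]);

  // Batch 2: reads that need ids from batch 1, also concurrent.
  const [runCount, testCaseCount] = await Promise.all([
    fetchRunCount(projectIds),
    fetchTestCaseCount(projectIds),
  ]);

  return { reportCount, runCount, testCaseCount };
}
```

With four sequential 30 ms queries the wait would be about 120 ms; batched this way it is roughly two rounds, each bounded by the slowest query in that round.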
## Impact
- The /dashboard/analytics page should feel meaningfully faster.
- Same data, same charts, same numbers — just less waiting.
Profile / Reports list pages are next on the perf cleanup list.
v1.93 — Schema cleanup
infrastructure
Internal cleanup that brings local development environments back in line with production and fixes a silently-broken analytics RPC.
## What changed
Two small migrations to clean up some technical debt:
- A few pieces of production schema had been added directly without going through the migration system, which made fresh local development environments diverge from prod. v1.93 captures those as proper migration files so a clean clone matches what's running.
- A pre-existing duplicate stored procedure had stopped working after the schema cleanup above. Caught during a pre-ship scan and fixed in the same release. No user-visible impact (the procedure handles internal usage analytics, called fire-and-forget).
## Impact
Nothing visible. This is plumbing.
v1.91 — RLS policy rewrite for faster team-scoped queries
performance infrastructure rls
Postgres now evaluates RLS auth checks once per query instead of once per scanned row. The benefit is small at today's scale and grows with data volume.
## What changed
Every row-level security policy on every public table previously called the auth function (e.g. `auth.uid()`) directly. Postgres re-evaluated that function for every row scanned during a SELECT/UPDATE/DELETE, which made authorization cost grow linearly with table size.
v1.91 wraps each `auth.<func>()` call in `(select ...)` across all 223 policies. Postgres now evaluates the wrapped call once per query and reuses the result for every row, instead of calling the function per row.
## Impact
- No visible change at today's scale (small tables) — but the bigger your team's data gets, the more this pays off.
- No semantic change. Same rows are visible and modifiable to the same users.
## Verification
The Supabase performance advisor flagged 184 policies as needing this rewrite. After the migration: 0 remaining. A custom check that strips the canonical wrap and looks for any bare `auth.<func>()` calls also reports zero across all 223 public-schema policies.
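The rewrite and the custom check can be sketched as two small string transforms. This is an illustration only: the function names and regexes are assumptions, and the real migration worked from the expressions in `pg_policies` rather than from code like this.

```typescript
// A bare call such as auth.uid() or auth.jwt().
const AUTH_CALL = /auth\.(\w+)\(\)/g;
// The canonical wrap, tolerating the alias Postgres may print back,
// e.g. "( SELECT auth.uid() AS uid)".
const WRAPPED = /\(\s*select\s+auth\.\w+\(\)(?:\s+as\s+\w+)?\s*\)/gi;

// Wrap every auth.<func>() call in (select ...). Existing wraps are
// unwrapped first so that running the rewrite twice changes nothing.
function wrapAuthCalls(expr: string): string {
  const unwrapped = expr.replace(WRAPPED, (m) => m.match(/auth\.\w+\(\)/)![0]);
  return unwrapped.replace(AUTH_CALL, "(select auth.$1())");
}

// Verification: strip the canonical wrap, then look for any call left over.
function hasBareAuthCall(expr: string): boolean {
  const withoutWraps = expr.replace(WRAPPED, "");
  return /auth\.\w+\(\)/.test(withoutWraps);
}
```

For example, `wrapAuthCalls("user_id = auth.uid()")` yields `user_id = (select auth.uid())`, and `hasBareAuthCall` returns `false` for that result.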
## Behind the scenes
The migration was generated programmatically from the live policy expressions in `pg_policies`. Each policy was dropped and recreated with the same logic; only the auth-call wrapping differs. Mechanical, no judgment calls.
v1.81 — Faster authed pages and APIs
performance infrastructure
Authenticated requests are now ~400ms faster across the board thanks to a short-TTL middleware cache for resolved profile and team membership data.
## What changed
Every authenticated page render and API call previously did three round-trips to Supabase: a JWT validation against the auth service, a `profiles` SELECT, and a `team_members` JOIN. Sentry data showed this added a ~400ms p50 baseline to every endpoint, regardless of whether the endpoint did any real work.
v1.81 introduces a per-pod in-memory cache keyed on the auth token. On a cache hit (which is most requests during an active session), the middleware reconstructs your user and workspace context from cached data instead of making the round-trips.
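A minimal sketch of a short-TTL in-memory cache of this kind. The class name, the `AuthContext` shape, the entry cap, and the oldest-first eviction rule are all assumptions for illustration. The 30-second TTL in the usage note is inferred from the propagation window quoted in this entry, not taken from the code.

```typescript
type AuthContext = { userId: string; teamIds: string[] };

class TtlCache<V> {
  private entries = new Map<string, { value: V; expiresAt: number }>();
  private ttlMs: number;
  private maxEntries: number;
  private now: () => number;

  constructor(ttlMs: number, maxEntries: number, now: () => number = Date.now) {
    this.ttlMs = ttlMs;
    this.maxEntries = maxEntries;
    this.now = now; // injectable clock keeps TTL expiry testable
  }

  get(key: string): V | undefined {
    const hit = this.entries.get(key);
    if (!hit) return undefined;
    if (hit.expiresAt <= this.now()) {
      this.entries.delete(key); // expired: drop it and report a miss
      return undefined;
    }
    return hit.value;
  }

  set(key: string, value: V): void {
    if (!this.entries.has(key) && this.entries.size >= this.maxEntries) {
      // Evict the oldest insertion; Map iterates in insertion order.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  // Explicit invalidation, e.g. when plan state changes.
  invalidate(key: string): void {
    this.entries.delete(key);
  }
}
```

A middleware would key this on the auth token, for example `new TtlCache<AuthContext>(30_000, 1000)`, and fall through to the three round-trips only when `get(token)` returns `undefined`.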
## Impact
- The bell-icon poll, dashboard page renders, and most API endpoints get noticeably snappier.
- Profile changes (name, avatar) and team membership changes propagate within 30 seconds.
- Trial expirations explicitly invalidate the cache so plan state is never served stale.
## Behind the scenes
Shipped via TDD with 16 new unit tests covering TTL expiry, eviction, and the chunked-cookie extraction that Supabase SSR uses. The cache is short-lived and per-pod — no external infrastructure added.
Daily: Track Dependabot alerts in continuous-improvement log
daily ai bug-reports api
69 commits.
Improvements:
- Track Dependabot alerts in continuous-improvement log
- Ship to prod: v2.39 — drop 23 speculative unused indexes on never-written tables
- Drop 23 speculative unused indexes on never-written tables
- Ship to prod: v2.38 — real backup verification + daily compliance cron
- Real backup verification via Supabase Management API + daily cron
Three new automated security controls land + the OWASP ZAP gap is documented honestly rather than silently shipped half-done.
## What landed
After an honest review against industry security-program patterns (Policy-as-Code, tamper-proof audit trails, automated SAST/DAST/SCA), three of the four gaps I'd flagged close cleanly in this release.
### CTRL-SYS-006 — Semgrep AST-based SAST
- New `.github/workflows/semgrep.yml` runs Semgrep on every PR and push to `develop`/`main`. SARIF findings upload to the GitHub Security tab alongside CodeQL + Dependabot.
- Runner Dockerfile pins `semgrep==1.139.0` for user-submitted `scan_type=code` security scans; engine version stays consistent between user scans and our own CI.
- Soft-fail mode for the rollout window — flips to blocking once the ruleset is calibrated.
### CTRL-CHG-005 — Commit signature audit
- New `.github/workflows/commit-signing-audit.yml` audits the signature status of every new commit and logs unsigned ones without failing the run yet.
- Developer setup guide at `docs/commit-signing.md` covers three signing options (sigstore/gitsign, SSH-signing, GPG) plus the planned flip to enforced-signing once the rolling unsigned count is zero for 14 days.
### CTRL-CHG-006 — Immutable audit-log export to Cloudflare R2
- New `.github/workflows/audit-log-export.yml` runs daily at 03:17 UTC. Pulls the previous-day Supabase audit events, serializes to JSONL via tested helpers, and PUTs to a Cloudflare R2 bucket with Object Lock in **Compliance mode** + **7-year retention**.
- Object Lock in Compliance mode means even a Cloudflare admin cannot delete or modify the objects within the retention window. That's the *tamper-proof* property — separable from the *tamper-evident* git history.
- Pure helpers in `dashboard/src/lib/audit-export.ts` covered by 10 new Vitest cases (full suite: 139 passing in ~460ms).
- Bucket setup is one-time manual; runbook in `compliance/r2-audit-bucket-setup.md`.
## What was honestly documented as a gap
### CTRL-SYS-008 — OWASP ZAP DAST
The wiring code exists in `runner/src/zap-scanner.ts` — but it invokes ZAP via `docker run ghcr.io/zaproxy/zaproxy ...`, which requires Docker-in-Docker. Railway containers do not expose the Docker daemon socket, so the runner cannot invoke nested Docker. Recorded explicitly in the control matrix as `NOT YET` with two mitigation paths (separate-service deployment, or refactor to ZAP-as-subprocess). Not a silent skip.
## Deliberately deferred
- OPA / Policy-as-Code (revisit when IaC enters the picture)
- Sigstore *enforced* commit signing via branch protection (audit-only first)
- Checkov (minimal IaC surface) and MobSF (low scan volume)
## Compliance evidence
All four documents updated: SOC 2 control matrix, ISO 27001 SoA review history, GDPR change log, dated evidence-log entry. Cross-references between files are explicit so an auditor can trace any control row to its workflow file, runbook, or test file in two clicks.
v1.76 — test suite expanded to 129 tests + TDD-for-bugs codified in CLAUDE.md
testing ci engineering compliance documentation
Test suite jumps from 71 to 129 tests. CLAUDE.md now mandates that every bug-fix commit add a regression test. Compliance evidence updated across all four documents.
## What expanded
+58 new tests across three pure-function modules — all green in ~430ms total:
- `lib/strip-html.test.ts` — `stripHtml` + `stripHtmlFromFinding` (HTML scrubbing for scanner findings; covers entity decoding, `<script>` / `<style>` block removal, list conversion, whitespace normalization, fast-path behavior)
- `lib/adf-to-markdown.test.ts` — Atlassian Document Format → markdown round-trip for Jira sync (text marks, lists, headings, code blocks, blockquote, plus `stripSyncBoilerplate` footer scrub)
- `lib/format-description.test.ts` — `isStructuredDescription` marker detection
## CLAUDE.md
Replaced the brief Tests stub with a full section that **mandates TDD for bug fixes**:
1. Write a failing test first that reproduces the bug as concretely as possible
2. Run it. Confirm it fails for the right reason
3. Fix the code
4. Run the test again — should pass
5. Run the full suite
6. The test stays as the regression guard
Every bug-fix commit must contain at least one new test, OR document in the commit message why one isn't feasible (e.g. the bug lives in `<script is:inline>` which we can't import from Vitest yet). The discipline exists to prevent repeats of today's pattern of seven hotfixes for bugs trivial unit tests would have caught.
## Compliance evidence (all four files)
- `compliance/evidence-log.md` — new dated entry "Automated Test Gate Stood Up" with TSC / ISO / GDPR mappings
- `compliance/gdpr-compliance.md` — change-log row 1.9 covering Art. 32(1)(d) regular-testing-and-evaluation reinforcement
- `compliance/statement-of-applicability.md` — review-history entry citing A.8.25 (secure development life cycle) + A.8.32 (change management)
- `compliance/soc2-control-matrix.md` — new control row CTRL-CHG-004 ("Automated unit-test + build gate on every PR and push") plus change-log row 1.8
v1.75 — Vitest + CI on every PR (71 unit tests, all green)
testing ci engineering compliance
First automated test suite + CI gate. Every PR and push to develop/main now runs unit tests + a build typecheck before anything can merge or deploy.
## Why
We shipped seven hotfixes today (v1.67–v1.74). Several were for bugs a tiny unit test would have caught before they reached production:
- A UUID-only regex in the sidebar that didn't recognize short IDs (`TEST-156`) → users couldn't change projects from any short-ID detail page.
- A redirect loop on cross-workspace deep links when the URL had a stale `?team=` param.
- A TDZ (temporal dead zone) runtime error caused by a duplicate import: the bundle threw `getServiceClient2 before initialization` and 500'd every request to one route.
Time to put a gate under the door.
## What landed
- **Vitest** installed + configured with `npm test` (one-shot for CI) and `npm run test:watch` (TDD).
- **71 unit tests** across four files, all passing in ~400ms:
- `lib/ticket-id.test.ts` — `parseShortId`, `isShortId`, `isUuid`, `formatTicketId`, `reportDisplayId`
- `lib/route-utils.test.ts` — `isUuidPathSegment`, `isShortIdPathSegment`, `isDetailPage` (the v1.74 short-ID detection)
- `lib/active-workspace.test.ts` — `buildWorkspaceSwitchPath` (the v1.73 redirect-loop fix)
- `lib/mobile-tap-resolver.test.ts` — 26 cases covering layout/wrapper/input class detection, bounds parsing for both Android UIAutomator and iOS XCUITest formats, and every branch of `resolveTapIdentity` with concrete scenarios from the mobile recording bug investigation (parentPanel override, EditText re-tap typed-value avoidance, deeper-cache-element wins by bounds-area)
- Two new pure-function modules extracted so they're testable AND reusable:
- `lib/route-utils.ts` (lifted from `Sidebar.astro`)
- `lib/mobile-tap-resolver.ts` (parallel to the inline copy in `pages/dashboard/mobile/record.astro` — see file header for the duplication note; consolidating requires moving record.astro off `<script is:inline>`, separate work)
- **`.github/workflows/test.yml`** runs `npm test` + `npm run build` on every PR and every push to `develop`/`main`. Cached node_modules, <2 min end-to-end. Fails the run on red.
## Compliance angle
This strengthens CTRL-SYS-001 (change management) and SOC 2 CC8.1: every change merged to main now has a green CI run behind it. SoA + evidence-log entries to follow in the next compliance pass.
## What's next
This is the bottom of the testing ladder. Upcoming pull requests (separate commits) will add: SSR smoke tests for the cross-workspace deep-link redirect, more `mobile-tap-resolver` coverage on real Appetize attribute payloads, and a refactor of `record.astro`'s inline `<script>` so the duplicated tap-resolver helpers can collapse to one source of truth.
v1.72 — cross-workspace deep links no longer redirect to localhost (TEST-156 resolved)
bug-fix workspaces deep-links redirects
Following a deep link for a bug report (or any detail page) belonging to a different workspace than your active one now reliably lands on the right page. The intermittent redirect to a chrome-error page complaining about https://localhost/... is fixed.
## What was happening
When you opened a deep link like `/dashboard/reports/TEST-156` while your active workspace was different from the resource's owning workspace, the page would auto-switch your workspace and redirect to the same URL to refresh the rendering context. The redirect was using `Astro.redirect(Astro.request.url)` to send the browser back to the URL it requested.
## Why it broke
Behind Cloudflare → Railway → Node, Astro's `request.url` doesn't always reflect the public URL the browser asked for. Under specific header-forwarding conditions, Astro's Node adapter falls back to the container's internal loopback hostname — producing `https://localhost/dashboard/reports/TEST-156`. The 302 Location header carried that bad URL. The browser dutifully followed it, failed to connect (no SSL cert for `localhost`), and landed on a chrome-error page with the cryptic message:
> Unsafe attempt to load URL `https://localhost/dashboard/reports/TEST-156` from frame with URL `chrome-error://chromewebdata/`.
Intermittent because the proxy header forwarding state varies under load. The same mechanism almost certainly produced the originally-reported localhost redirect after submitting bugs (`http://localhost/dashboard/reports/TEST-NNN`) — different redirect cascade, same bad-host root.
## Fix
All 11 cross-workspace deep-link redirect sites now use a path-only Location header:
```js
return Astro.redirect(Astro.url.pathname + Astro.url.search);
```
A path-only Location is resolved by the browser against **its own** request URL (the real `https://app.bugagent.com/...` it requested), not the server's possibly-wrong view. So even when Astro's `request.url` is `https://localhost/...`, we send back `/dashboard/reports/TEST-156` and the browser stays on the correct host.
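The resolution rule can be checked with the standard `URL` constructor, which applies the same relative-reference resolution a browser uses for a `Location` header. The `?team=abc` query string is a made-up value for illustration.

```typescript
// What the browser actually requested versus what the server believed.
const browserRequestUrl = "https://app.bugagent.com/dashboard/reports/TEST-156?team=abc";
const serverViewUrl = "https://localhost/dashboard/reports/TEST-156?team=abc";

// Path-only Location value: pathname + search, with no scheme or host.
const serverView = new URL(serverViewUrl);
const locationHeader = serverView.pathname + serverView.search;

// A relative Location is resolved against the URL the browser requested,
// so the server's wrong host never reaches the final address.
const resolved = new URL(locationHeader, browserRequestUrl).href;
```

Here `resolved` is `https://app.bugagent.com/dashboard/reports/TEST-156?team=abc`, even though the server-side URL pointed at `localhost`.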
Patched: reports, test-cases, probes, probes/runs, security, explorations, explorations/detail, code-review, compliance, automations, performance. Updated `lib/active-workspace.ts` docstring with the corrected pattern as the documented example.
## Earlier defenses kept in place
The v1.70 (per-call-site `window.location.origin` pinning) and v1.71 (`tab-context.js` window.location interceptor) defenses stay live as belt-and-suspenders. They didn't fix the actual bug — that was server-side — but they protect against any future relative-URL drift on the client.
v1.67 — bug-report short-ID deep links resolve across workspaces
bug-fix workspaces deep-links reports
A link like `/dashboard/reports/TEST-156` now lands you on the report and switches your active workspace to match — regardless of which workspace you were last viewing — as long as you're a member of the workspace whose prefix is `TEST`.
## What was happening
Clicking a deep link with a short ID (PREFIX-NNN form, like `TEST-156`) was redirecting to the bug-report listing page even when the user had access to the matching workspace. The page's short-ID resolver was hard-scoped to the user's ACTIVE workspace — so if the active workspace's ticket prefix wasn't `TEST`, no row matched, the report fetch never ran, and the page bailed out before the existing workspace-auto-switch logic could kick in.
## What changed
The short-ID resolver now looks across EVERY workspace the caller is a member of. A service-client query finds candidate teams whose `ticket_prefix` matches the deep link's prefix; an explicit `team_members` intersection filters to the ones the caller actually belongs to. If a match exists, the report is fetched from that workspace and the existing `ensureActiveWorkspace` helper switches the `active_team` and `active_project` cookies so the sidebar, project picker, and quotas all reflect the right context after a single redirect.
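The prefix match plus membership intersection might look roughly like the pure function below. The `TeamRow` shape, the function name, and the prefix regex (an uppercase letter followed by uppercase letters or digits) are assumptions; the real resolver runs database queries rather than filtering arrays.

```typescript
type TeamRow = { id: string; ticketPrefix: string };

function resolveShortIdTeam(
  shortId: string,
  candidateTeams: TeamRow[], // e.g. rows returned by a service-client query
  memberTeamIds: Set<string>, // workspaces the caller actually belongs to
): { teamId: string; ticketNumber: number } | null {
  // Assumed PREFIX-NNN grammar.
  const match = /^([A-Z][A-Z0-9]*)-(\d+)$/.exec(shortId);
  if (!match) return null;
  const prefix = match[1];
  const ticketNumber = Number(match[2]);

  // Intersection: the prefix must match AND the caller must be a member.
  const team = candidateTeams.find(
    (t) => t.ticketPrefix === prefix && memberTeamIds.has(t.id),
  );
  // A miss looks the same whether the prefix is unknown or merely forbidden.
  return team ? { teamId: team.id, ticketNumber } : null;
}
```

Returning the same `null` for "no such prefix" and "not a member" is what keeps a miss from revealing whether a prefix exists in a workspace the caller cannot see.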
## Security
Guessing another workspace's PREFIX-NNN still returns null — the membership intersection enforces the same access boundary as before, just across all of the caller's workspaces instead of one. Behavior on a miss is identical (redirect to the listing page) so there's no information leak about whether the prefix exists in some workspace the caller can't see.
The API endpoints that use `lib/resolve-bug-id.ts` (score, assign, comments, upload, claude-push) are untouched — they correctly continue to scope to a single `teamId` for write paths.
Patched two medium-severity transitive dependency vulnerabilities (postcss XSS in dashboard + website) without touching any top-level package versions. Major-version Dependabot PRs (Astro 6, @astrojs/node 10, exceljs major, browserstack-node-sdk major) intentionally left open for a focused upgrade pass.
## What landed
`npm audit fix` (without `--force`) was run across the dashboard, website, and runner workspaces. Only lockfile entries changed; no top-level `package.json` constraints were touched, so every bump stays within the semver ranges already declared there.
Resolved alerts:
- **postcss** `8.5.8 → 8.5.12` in `dashboard/package-lock.json` and `website/package-lock.json` — closes the "PostCSS has XSS via Unescaped `</style>` in its CSS Stringify Output" advisory (medium severity, two alerts).
- **browserstack-node-sdk** `1.52.1 → 1.52.2` in `runner/package-lock.json` — patch bump that came along for the ride; doesn't fix any specific advisory but stays current.
## What was deliberately not auto-merged
Four alert clusters remain because they would require breaking changes that need a focused upgrade pass with manual testing:
- `astro` < 6.1.6 (XSS in `define:vars`) — open as PR #82, would jump from 5.x to 6.x.
- `@astrojs/node` < 10.0.5 (cache poisoning on malformed `if-match`) — open as PR #83, jumps from 9.x to 10.x.
- `uuid` < 14.0.0 (buffer bounds check in v3/v5/v6) in dashboard — gated by an exceljs major bump.
- `uuid` + `aws-sdk` + `@tootallnate/once` chain in runner — gated by a browserstack-node-sdk major migration.
All four require code-level review, not just a lockfile update. They'll be handled as a separate dependency upgrade sprint.
v1.65 — routine dependency bumps
dependencies maintenance
Bumped @supabase/supabase-js to 2.105.0 across dashboard and mcp-server, and @anthropic-ai/sdk to 0.91.1 in dashboard.
## Routine dependency maintenance
Merged three Dependabot PRs into main:
- **@supabase/supabase-js** 2.104.1 → 2.105.0 (dashboard + mcp-server)
- **@anthropic-ai/sdk** 0.91.0 → 0.91.1 (dashboard)
All three are minor/patch bumps with backward-compatible APIs. Verified clean builds for mcp-server (tsc) and dashboard (astro check + build) before shipping.
No behavior changes — version-tracking bump only.
v1.58 — mobile recorder uses resource-id over Appetize's typed-value text and layout wrappers
mobile recording reliability bug-fix
Re-tapping an input field after typing into it now records the field's identifier instead of the typed value. Edge clicks that hit a parent layout container also resolve to the inner element instead of recording the wrapper.
## Two bugs from v1.57 testing
### Re-tap recorded the typed value as the identifier
After typing an email address into the email field and re-tapping it, the recorder was capturing `tap "<typed address>"` instead of `tap email_field`. Appetize returns the second tap as `{ class: EditText, resource-id: "", text: "<typed address>" }`. Our previous code saw non-empty text and used it as the identifier, but for input-class elements `text` is the CURRENT VALUE, not a stable label.
### Edge clicks recorded the parent layout container
An edge click on a button or input where Appetize's hit-test returned the wrapping layout (`LinearLayout`, `ConstraintLayout`, iOS `XCUIElementTypeOther`) was being recorded as the wrapper instead of the inner element. The cache had the right element, but our code only consulted it when Appetize returned an empty identifier — and "LinearLayout" isn't empty.
## Fix
New `resolveTapIdentity()` runs on every tap regardless of what Appetize returns, with strict priority:
1. **Appetize returned a `resource-id`** → trust it. Most stable.
2. **Cache has a resource-id at (x,y)** AND Appetize returned no resource-id, OR returned an *ambiguous* class (a layout container or an input-like class whose `text` mirrors typed input) → use the cached resource-id.
3. **Appetize returned `text` on a non-input, non-layout class** (Button label, link text, etc.) → use it. These texts are stable labels.
4. **Cache has anything at (x,y)** → use it. Better than coords.
5. **Nothing usable** → fall back to coords.
Class classification handles both platforms: Android (`EditText`, `TextInputEditText`, `AutoCompleteTextView`, `SearchView`, all the `*Layout` containers, `View`, `ViewGroup`, `CardView`) and iOS (`XCUIElementTypeTextField`, `XCUIElementTypeSecureTextField`, `XCUIElementTypeSearchField`, `XCUIElementTypeTextView`, plus `XCUIElementTypeOther` / `Window` / `Application` / `ScrollView` / `Table` / `CollectionView` as wrappers).
Wired into all three tap-producing paths (the SDK click branch, the SDK unknown-type fallback, and the postMessage fallback) using the same resolver function.
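One possible reading of the priority list, as a simplified sketch. The types, the fully-qualified-class handling, and the expansion of the iOS wrapper names to `XCUIElementType*` are assumptions; the shipped `resolveTapIdentity()` also covers cases (such as bounds-based overrides) that are not modelled here.

```typescript
type AppetizeHit = { resourceId?: string; text?: string; className?: string };
type CachedElement = { resourceId?: string; label?: string };
type TapIdentity =
  | { kind: "resource-id"; value: string }
  | { kind: "text"; value: string }
  | { kind: "coords"; x: number; y: number };

const INPUT_CLASSES = new Set([
  "EditText", "TextInputEditText", "AutoCompleteTextView", "SearchView",
  "XCUIElementTypeTextField", "XCUIElementTypeSecureTextField",
  "XCUIElementTypeSearchField", "XCUIElementTypeTextView",
]);
const WRAPPER_CLASSES = new Set([
  "View", "ViewGroup", "CardView",
  "XCUIElementTypeOther", "XCUIElementTypeWindow", "XCUIElementTypeApplication",
  "XCUIElementTypeScrollView", "XCUIElementTypeTable", "XCUIElementTypeCollectionView",
]);

// Assumption: class names may arrive fully qualified (android.widget.EditText).
const shortName = (cls = ""): string => cls.slice(cls.lastIndexOf(".") + 1);
const isInput = (cls?: string): boolean => INPUT_CLASSES.has(shortName(cls));
const isWrapper = (cls?: string): boolean => {
  const name = shortName(cls);
  return name.endsWith("Layout") || WRAPPER_CLASSES.has(name);
};

function resolveTapIdentity(
  hit: AppetizeHit,
  cached: CachedElement | undefined, // cache lookup result at (x, y), if any
  x: number,
  y: number,
): TapIdentity {
  // 1. Appetize supplied a resource-id: trust it.
  if (hit.resourceId) return { kind: "resource-id", value: hit.resourceId };
  // 2. No resource-id from Appetize, but the cache has one at (x, y).
  if (cached?.resourceId) return { kind: "resource-id", value: cached.resourceId };
  // 3. Text on a non-input, non-wrapper class is treated as a stable label.
  if (hit.text && !isInput(hit.className) && !isWrapper(hit.className)) {
    return { kind: "text", value: hit.text };
  }
  // 4. Anything else the cache knows at (x, y).
  if (cached?.label) return { kind: "text", value: cached.label };
  // 5. Nothing usable: fall back to coordinates.
  return { kind: "coords", x, y };
}
```

The re-tap case from this release maps to step 2: an `EditText` hit with an empty `resource-id` and typed text resolves to the cached `email_field` rather than to the typed value.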
v1.57 — mobile recorder pre-fills its element cache from the live UI tree
mobile recording reliability
First-time edge clicks on a never-tapped input or button now resolve to the right element identifier. The recorder periodically snapshots the device's live UI hierarchy and seeds the cache so element resolution doesn't depend on the user having tapped the field before.
## Where v1.56 left off
The recent-elements cache shipped in v1.56 fixed re-taps and edge clicks on inputs the user had previously interacted with. The remaining gap was the first-touch case: a brand-new field, edge-clicked, with no cached entry to fall back on. That case still degraded to coordinates.
## What changed
The Appetize JS SDK exposes `session.getUI()` (experimental) which returns the full live UI hierarchy of the currently-displayed screen — every input, button, label with its `resource-id`, `text`, `class`, and `bounds`. Calling it on each meaningful event pre-populates the element cache, so cache hits become possible BEFORE the user has tapped the element at all.
- `snapshotUITree()` races the SDK call against a 2.5s timeout, parses the result, and feeds every node with bounds + a usable identifier into the cache. Failures are silent.
- Handles both Android UIAutomator XML (`<node resource-id="..." bounds="[x1,y1][x2,y2]">`) and iOS XCUITest JSON (`{ type, name, x, y, width, height, children }`).
- Triggers: at session start (+800ms for the first frame to render), after each tap (+400ms for navigation animations), after swipe / scroll / orientation change (+800ms for momentum / rotation). Min 500ms between snapshots, debounced.
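A rough sketch of two pieces from the list above: racing a UI-tree fetch against a timeout with silent failure, and parsing Android `bounds` strings. The `getUI` parameter stands in for the SDK call and is passed in as a plain function, so nothing here depends on Appetize's actual API. The helper names are hypothetical.

```typescript
type Bounds = { x1: number; y1: number; x2: number; y2: number };

// Parse the UIAutomator form "[x1,y1][x2,y2]"; return null on anything else.
function parseAndroidBounds(raw: string): Bounds | null {
  const m = /^\[(-?\d+),(-?\d+)\]\[(-?\d+),(-?\d+)\]$/.exec(raw);
  if (!m) return null;
  return { x1: Number(m[1]), y1: Number(m[2]), x2: Number(m[3]), y2: Number(m[4]) };
}

// Resolve with the work's value, or null if the timeout fires first.
function withTimeout<T>(work: Promise<T>, ms: number): Promise<T | null> {
  let timer: ReturnType<typeof setTimeout> | undefined;
  const timeout = new Promise<null>((resolve) => {
    timer = setTimeout(() => resolve(null), ms);
  });
  return Promise.race([work, timeout]).finally(() => clearTimeout(timer));
}

// One snapshot attempt: a timeout or any error yields null instead of throwing.
async function snapshotOnce(
  getUI: () => Promise<string>,
  timeoutMs = 2500,
): Promise<string | null> {
  try {
    return await withTimeout(getUI(), timeoutMs);
  } catch {
    return null; // failures are silent
  }
}
```

A caller would feed each parsed node that has bounds and a usable identifier into the cache, and simply skip the round when `snapshotOnce` returns `null`.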
The v1.56 cache stays as the primary fast path. The new snapshot just keeps the cache fuller so it has more chances to hit.
## Net result
Edge clicks, re-taps on blank fields, and first-time clicks on previously-unseen elements now all resolve to the right identifier rather than degrading to "tap at (312, 448)". The only remaining edge case is the very first action of a recording before the initial snapshot lands — bounded at under a second and rarely a coord-only event in practice.
v1.56 — mobile recording rescues coord-only taps via a recent-elements cache
mobile recording bug-fix
Edge clicks on input fields, and re-taps on blank or near-empty fields, now record as the element you actually meant to tap rather than degrading to "tap at x=312, y=448".
## What was happening
Appetize's hit-test during a mobile recording is inconsistent on edge clicks and on inputs with empty / single-character text. Sometimes it returns no element attributes at all; sometimes it returns a parent layout (LinearLayout, FrameLayout) that has a class but no resource-id. With nothing usable to record, the recorder was falling back to raw coordinates — so the same field, tapped twice, could end up as `tap email_field` once and `tap at (312, 448)` the next time.
User intent is the same in all three scenarios. The recording should be too.
## What changed
A small recent-elements cache. Every time a tap DOES come in with a usable identifier, we remember `{resource-id, label, class, bounds, last-tap-coords, last-seen}`. When a later tap arrives without an identifier, the cache decides what the user meant:
1. **Bounds containment.** Walk the cache for any element whose bounding box `[x1,y1][x2,y2]` (Android) or `{x,y,width,height}` (iOS) contains the new tap. If multiple match (a button inside a card inside a screen), prefer the smallest area — that's the most specific element and almost always the actual target.
2. **Coord proximity.** As a fallback for entries whose bounds we never saw, accept the nearest cached element within 80 pixels of the new tap.
Cache is invalidated on swipe, scroll, orientation change, and recording reset (because layout shifts make bounds stale), with a 30-second TTL on each entry as a backstop.
The rescue runs in all three tap-producing paths: the SDK click branch, the SDK unknown-type fallback, and the postMessage fallback path.
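The two lookup rules and the TTL backstop can be sketched as follows. The entry shape and function name are hypothetical, bounds are normalised to corner coordinates for both platforms, and the 80-pixel radius and 30-second TTL come from the description above.

```typescript
type Box = { x1: number; y1: number; x2: number; y2: number };
type CacheEntry = {
  id: string; // resource-id or label
  bounds?: Box; // absent when we never saw bounds for this element
  lastTap: { x: number; y: number };
  lastSeen: number; // ms timestamp
};

const MAX_DISTANCE_PX = 80;
const ENTRY_TTL_MS = 30000;

function rescueTap(cache: CacheEntry[], x: number, y: number, now: number): string | null {
  const live = cache.filter((e) => now - e.lastSeen <= ENTRY_TTL_MS);

  // 1. Bounds containment: the smallest containing box wins.
  let best: CacheEntry | null = null;
  let bestArea = Infinity;
  for (const e of live) {
    const b = e.bounds;
    if (!b || x < b.x1 || x > b.x2 || y < b.y1 || y > b.y2) continue;
    const area = (b.x2 - b.x1) * (b.y2 - b.y1);
    if (area < bestArea) {
      best = e;
      bestArea = area;
    }
  }
  if (best) return best.id;

  // 2. Coord proximity: only for entries whose bounds we never saw.
  let nearest: CacheEntry | null = null;
  let nearestDist = MAX_DISTANCE_PX;
  for (const e of live) {
    if (e.bounds) continue;
    const d = Math.hypot(e.lastTap.x - x, e.lastTap.y - y);
    if (d <= nearestDist) {
      nearest = e;
      nearestDist = d;
    }
  }
  return nearest ? nearest.id : null;
}
```

Invalidation on swipe, scroll, orientation change, and recording reset would simply empty the array; the TTL filter only covers entries that outlive those events.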
## Net effect
`Tap email_field` … `Type "<email address>"` … `Tap email_field` (re-tap, blank field) … `Type "<email address>"` records cleanly even when Appetize reports the second tap as a coord-only event.
v1.55 — graceful-shutdown handler now covers mobile performance + code reviews
reliability performance code-review deploys
Mobile performance profiling and AI code reviews triggered while a deployment is rolling are no longer left stuck in their "active" state forever. Same SIGTERM-driven recovery as v1.54's mobile-run fix, applied to the other two long-running flows that live in the dashboard process.
## Why
v1.54 wired the graceful-shutdown handler for mobile runs. Two other surfaces follow the same fire-and-forget pattern — the dashboard process kicks off a long-running task (BrowserStack Appium for mobile performance, Anthropic-driven analysis for code reviews) and updates the DB row when the work finishes:
- `startMobilePerformanceRun()` → `performance_runs.status = "running"`
- `runReview()` → `code_reviews.status = "analyzing"`
The code-review case is particularly easy to hit: the GitHub webhook fires `runReview()` without awaiting and returns 200 to GitHub immediately, so any auto-review in flight when a new dashboard pod replaces the old one was getting orphaned at `analyzing`.
## What changed
`lib/shutdown.ts` is now generic. Each surface declares its `{ table, activeStatus, errorStatus }` in a single map, and the SIGTERM handler iterates them all uniformly. Adding a new surface in the future is one map entry plus a pair of `track*` / `untrack*` exports.
Wired surfaces:
- `mobile_runs` (running → error) — already in v1.54
- `performance_runs` (running → error) — new in v1.55
- `code_reviews` (analyzing → failed) — new in v1.55
Web performance, security, and exploration runs dispatch to the runner service via `fetch()` and the long work happens there — they aren't exposed to this problem and were intentionally left out.
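A sketch of what a single-map design along these lines could look like. The table names and status values are taken from the list above; the map keys, the `track`/`untrack` signatures, and `pendingShutdownUpdates` are hypothetical, and the actual database write is left out.

```typescript
type Surface = { table: string; activeStatus: string; errorStatus: string };

// One entry per long-running surface that lives in the dashboard process.
const SURFACES: Record<string, Surface> = {
  mobileRuns: { table: "mobile_runs", activeStatus: "running", errorStatus: "error" },
  performanceRuns: { table: "performance_runs", activeStatus: "running", errorStatus: "error" },
  codeReviews: { table: "code_reviews", activeStatus: "analyzing", errorStatus: "failed" },
};

// Surface key -> ids of rows currently in flight.
const tracked = new Map<string, Set<string>>();

function track(surface: string, id: string): void {
  if (!tracked.has(surface)) tracked.set(surface, new Set());
  tracked.get(surface)!.add(id);
}

function untrack(surface: string, id: string): void {
  tracked.get(surface)?.delete(id);
}

type PendingUpdate = { table: string; id: string; from: string; to: string };

// What a SIGTERM handler would write: one status flip per in-flight row.
function pendingShutdownUpdates(): PendingUpdate[] {
  const updates: PendingUpdate[] = [];
  tracked.forEach((ids, key) => {
    const surface = SURFACES[key];
    if (!surface) return;
    ids.forEach((id) =>
      updates.push({ table: surface.table, id, from: surface.activeStatus, to: surface.errorStatus }),
    );
  });
  return updates;
}
```

Under this shape, adding a surface is one new `SURFACES` entry; the iteration code does not change.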
Daily: Ship to prod: v1.64 — bounds-area comparison replaces wrapper-name list for override gate
daily ai
35 commits.
Improvements:
- Ship to prod: v1.64 — bounds-area comparison replaces wrapper-name list for override gate
- mobile recording: bounds-area comparison replaces wrapper-name list as override gate
- Ship to prod: v1.63 — mobile recording uses identifier inheritance like Appium/Espresso
- mobile recording: identifier inheritance during snapshot — the general solution
- Ship to prod: v1.62 — non-wrappers win cache lookup via proximity, not just bounds-contain
v1.54 — graceful-shutdown handler stops stranding mobile runs across deploys
reliability mobile deploys
Mobile runs in flight when a new deployment rolls out are no longer left at status="running" forever. The process now catches SIGTERM, marks every still-tracked run as "error" with a clear re-run prompt, and exits.
## Why
Mobile runs execute as fire-and-forget background promises in the dashboard process — the API endpoint returns 201 immediately and the BrowserStack flow continues asynchronously, polling for completion and writing the result back to the DB.
When Railway rolls a new deployment, the OLD pod gets SIGTERM and a short grace window before SIGKILL. Until now, any run mid-execution at that moment was orphaned — DB row stuck at `running` forever because the process that was going to update it was gone.
We hit it ourselves shipping v1.53. One run got stranded mid-execution and had to be flipped manually.
## What changed
New `lib/shutdown.ts` maintains a Set of in-flight mobile run IDs:
- `runAppiumOnBrowserStack` registers the run on entry and untracks it in `finally` regardless of how the function returns.
- `installShutdownHandler()` (idempotent, called once from middleware at module load) attaches handlers for SIGTERM and SIGINT.
- On signal, the handler iterates every still-tracked run and updates `mobile_runs.status` to `error` with a "Server restarted during deployment; this run was interrupted mid-execution. Please re-run the automation." message.
The DB write is fire-and-forget — Railway only gives ~10 seconds before SIGKILL, so we don't block on it. In practice the update lands well before the kernel reaps the process.
Worst case (pod hard-killed before our handler runs): behaviour is unchanged from before — we're no worse off than the old stuck-row state.
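The track-in-`finally` pattern and the idempotent install can be sketched like this. `runTracked` and the injected `register` callback are hypothetical names; in a Node process `register` would typically be `(signal, handler) => process.on(signal, handler)`, and `onSignal` would issue the status update.

```typescript
const inFlight = new Set<string>();

// Register the run on entry and always untrack it on the way out.
async function runTracked<T>(runId: string, work: () => Promise<T>): Promise<T> {
  inFlight.add(runId);
  try {
    return await work();
  } finally {
    inFlight.delete(runId); // runs on success, on throw, and on early return
  }
}

let handlerInstalled = false;

function installShutdownHandler(
  onSignal: (runIds: string[]) => void,
  register: (signal: string, handler: () => void) => void,
): void {
  if (handlerInstalled) return; // idempotent: later calls are no-ops
  handlerInstalled = true;
  for (const signal of ["SIGTERM", "SIGINT"]) {
    // Snapshot the ids at signal time; the caller decides what to write.
    register(signal, () => onSignal(Array.from(inFlight)));
  }
}
```

Because untracking happens in `finally`, a run that throws never lingers in the set, so the signal handler only ever touches rows that are genuinely still executing.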
v1.53 — mobile recording waits for explicit move-away, not an idle timer
bug-fix mobile recording
Slow typing or pausing mid-field (to look at a password manager, sticky note, etc.) no longer splits a username or password into multiple fragmented action events. The recorder now buffers input until an explicit signal that you're done with the field.
## What changed
Until now, every keystroke during a mobile recording armed an 800ms idle timer. If you paused — even briefly — the buffer flushed and your entry split into multiple `input` events. Anyone using a password manager, glancing at a sticky note, or typing slowly would see "myUsername" land as `myUser` + `name` instead of one clean entry.
The timer is gone. The buffer now flushes only on signals that actually mean "you're done with this field":
- Tap or click on any element
- Swipe / scroll / drag
- Enter / Return
- Tab, Escape, arrow keys, function keys, or modifier-only key presses
- A multi-char text event (paste, IME, autocomplete — already final)
- Stop Recording (always flushes any leftover buffer)
Backspace stays a buffer edit (slice the last character) rather than a flush — correcting a typo mid-field shouldn't end your input session.
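A simplified model of the flush rules, with no timer anywhere. The event shapes and class name are hypothetical, and treating a multi-char text event as its own finished input (after flushing whatever was buffered) is an assumption about how "already final" is handled.

```typescript
type RecorderEvent =
  | { type: "key"; key: string } // single keystroke, DOM-style key name
  | { type: "text"; text: string } // multi-char: paste, IME, autocomplete
  | { type: "tap" }
  | { type: "swipe" }
  | { type: "stop" };

class TypedInputBuffer {
  private buffer = "";
  readonly inputs: string[] = []; // each entry is one recorded input event

  handle(ev: RecorderEvent): void {
    switch (ev.type) {
      case "key":
        if (ev.key === "Backspace") {
          this.buffer = this.buffer.slice(0, -1); // edit, not a flush
        } else if (ev.key.length === 1) {
          this.buffer += ev.key; // printable character
        } else {
          this.flush(); // Enter, Tab, Escape, arrows, F-keys, modifiers
        }
        return;
      case "text":
        this.flush();
        this.inputs.push(ev.text); // assumed to be final as delivered
        return;
      default:
        this.flush(); // tap, swipe, stop
    }
  }

  private flush(): void {
    if (this.buffer) {
      this.inputs.push(this.buffer);
      this.buffer = "";
    }
  }
}
```

Since nothing in `handle` depends on elapsed time, a pause of any length between keystrokes cannot split an entry.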
## Effect
A login flow that used to record as `userna` + `me@exam` + `ple.com` now records as a single `input` event for the full address, regardless of how slowly you typed.
v1.52 — keyboard hygiene + workspace auto-switch on deep links
bug-fix mobile workspaces deep-links recording
Mobile recording no longer leaks Backspace/Tab control bytes into action scripts, and deep links to resources in another workspace now also switch the active workspace context (sidebar, project picker, quotas) rather than just rendering the page.
## Two production bugs fixed
### Mobile recording: control characters via Backspace / Delete / Tab
Even after v1.49's API-boundary sanitizer, pressing Backspace, Delete, or Tab during a mobile recording could still leak ASCII control bytes (`\b`, `\t`, etc.) into the saved actions script. The sanitizer was scrubbing at save time, but the recording client's in-memory input buffer was already polluted in real time.
The rule now lives where the buffer is built:
- Backspace / Delete explicitly slice the buffer instead of being treated as a typed character.
- Non-typing key names (Arrow*, F-keys, Page*, Home, End, Esc, Tab, Meta, Ctrl, Alt, Shift, CapsLock, …) flush the buffer rather than appending.
- Bytes 0x00–0x1F + 0x7F are dropped at append time.
Net effect: the buffer can only ever hold characters the user actually meant to type.
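The append-time byte filter from the last bullet is small enough to show in full. This is a sketch with a hypothetical function name, not the recorder's actual code.

```typescript
// Append typed characters to the buffer, dropping ASCII control bytes
// (0x00-0x1F and 0x7F) before they can ever be stored.
function appendTyped(buffer: string, chunk: string): string {
  let out = buffer;
  for (let i = 0; i < chunk.length; i++) {
    const code = chunk.charCodeAt(i);
    if (code <= 0x1f || code === 0x7f) continue; // \b, \t, DEL, etc.
    out += chunk[i];
  }
  return out;
}
```

For example, `appendTyped("ab", "\bc\t")` returns `"abc"`: the backspace and tab bytes never reach the buffer, so a save-time sanitizer has nothing left to scrub.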
### Cross-workspace deep links: workspace context wasn't switching
v1.50 and v1.51 unblocked the 404, but landing on a deep-linked resource in another workspace didn't update your workspace context — the sidebar, project picker, and quota meters all still showed your previous workspace. New `ensureActiveWorkspace` helper updates `active_team` (and `active_project` where derivable) on the way through the page, so middleware re-runs with the right context. Server-rendered detail pages call the helper inline; client-fetched pages set the cookie and reload when the fetched resource's team_id doesn't match the active one. Single hop in either case.
### v1.51 gap closed
`api/test-suites/[id]` and its sub-routes (cases, reorder) were still scoped to the active team, so following a deep link to a suite in another workspace 404'd at the API layer regardless of the page-level fix. All three now use the `callerInTeam` pattern shared with test-runs and mobile-runs.
Cross-workspace deep links work everywhere now
workspace, bug-fix, multi-tenant
v1.50 fixed mobile detail-page deep links for multi-workspace users. v1.51 extends the same fix across every other detail page in the dashboard — bug reports, test cases, test runs, security scans, code reviews, performance tests, explorations, and more. A direct link to a resource in any workspace you belong to now opens correctly, regardless of which workspace you have selected at the time.
## What changed
Multi-workspace users (people who own or belong to more than one workspace) who followed a deep link to a resource in a workspace other than their currently selected one were getting "deleted or you don't have access" 404s.
Every detail-page API endpoint and every server-rendered detail page was filtering by the active-workspace cookie instead of the resource's own workspace. So a link bookmarked from one workspace, shared via Slack, or included in an email would silently 404 when the user happened to have a different workspace active at click time.
The fix mirrors the canonical pattern used by Compliance: fetch the resource without filtering, verify the user belongs to the resource's workspace, then proceed. Surrounding metadata queries (Jira config, team-member dropdowns, project pickers, schedule rows) now also scope to the resource's actual workspace, so dropdowns and badges reflect the right data instead of the active-workspace cookie's.
Features touched: bug reports, test cases / runs / folders, explorations, notes, sessions, security scans / schedules / runs, code reviews, performance tests, probes (and probe runs / targets), web automations, mobile (already fixed in v1.50). List views and the workspace switcher are unchanged — switching still shows that workspace's items, which is the behavior you want for navigation.
If you've been emailing yourself bug-report links or pasting them into Slack and getting frustrated when they 404, this is the fix.
Mobile deep links now work across workspaces
mobile, workspace, bug-fix
If you belonged to more than one workspace, opening a direct link to a mobile run, automation, app, or schedule in a workspace other than your active one would return "deleted or you don't have access" — even when you owned that workspace. Detail endpoints now authorize via team membership rather than the active-workspace cookie, so any deep link in any workspace you belong to opens correctly.
## What changed
Every mobile detail endpoint (run / automation / app / schedule) was filtering rows by the workspace you currently had selected, not by the workspaces you actually belonged to. So a link like `/dashboard/mobile/runs/<id>` worked from the workspace switcher but failed from a bookmark, an email, or a Slack notification if your active workspace happened to be a different one — even though you owned the workspace where the run lived.
The fix mirrors the canonical pattern already used by Compliance: fetch the resource without a workspace filter, then verify you're a member of the resource's workspace. If you are, you see it. If not, you get the same 404 as before (we don't leak the existence of resources across workspaces).
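The fetch-then-verify pattern can be sketched as follows. This is an illustration under assumptions: the in-memory `Map` stands in for a database table, and `getResourceForCaller` and the field names are made up for the example.

```typescript
// Hypothetical sketch of "fetch without a workspace filter, then check membership".
interface TeamResource {
  id: string;
  teamId: string;
}

type LookupResult =
  | { status: 200; resource: TeamResource }
  | { status: 404 };

function getResourceForCaller(
  table: Map<string, TeamResource>, // stand-in for the database
  callerTeamIds: Set<string>,       // every workspace the caller belongs to
  id: string,
): LookupResult {
  // 1. Fetch by id only; the active-workspace cookie plays no part here.
  const resource = table.get(id);
  if (resource === undefined) {
    return { status: 404 };
  }
  // 2. Authorize against the resource's own workspace.
  if (!callerTeamIds.has(resource.teamId)) {
    // Same 404 as a missing row, so existence is not leaked across workspaces.
    return { status: 404 };
  }
  return { status: 200, resource };
}
```

A caller in `team-a` and `team-b` gets a 200 for a `team-b` run regardless of which workspace is active, and an identical 404 for both a missing id and a run in a workspace they don't belong to.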
List views are unchanged — when you switch workspaces you still see only that workspace's items, which is the behavior you want for navigation. This fix only affects direct links to a specific run / automation / app / schedule.
The same deep-link pattern likely affects bug-reports, security-scans, and other detail pages. Those are server-rendered and need a slightly different fix shape, so we're treating that as a follow-up.
Mobile recording fixes: control characters, @ symbol, video readiness
mobile, recording, bug-fix
Three fixes to the mobile automation recording flow. Backspace and Delete keypresses no longer end up as literal control characters in the saved actions. Shift-symbol keys (`@`, `#`, `$`, etc.) now record as the actual symbol instead of the unshifted base digit. The run detail page now waits for BrowserStack's video upload to finish before trying to play the recording.
## What changed
### Control characters in recorded text
When correcting typos with Backspace or Delete during a recording, the raw control bytes (ASCII 0x08 / 0x7F) were being stored as if you had typed them — so a recorded action that should have read `admin` showed up as `\b\b\b\b\badmin`, and the AI-generated test script faithfully reproduced the garbage. Recordings now strip non-printable ASCII at the API boundary. Tab and newline are preserved (they are legitimate input for textareas), and a literal backslash typed by the user (like in a regex pattern) passes through untouched.
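A minimal sketch of a sanitizer with this behavior. The function name is hypothetical, and dropping carriage returns (0x0D) is an assumption, since the note only says tab and newline are preserved.

```typescript
// Hypothetical sketch of an API-boundary sanitizer for recorded text.
// Removes ASCII control bytes 0x00-0x08, 0x0B-0x1F and 0x7F.
// Keeps tab (0x09) and newline (0x0A); everything printable passes through.
function stripControlBytes(text: string): string {
  return text.replace(/[\u0000-\u0008\u000b-\u001f\u007f]/g, "");
}
```

With this rule, five raw backspace bytes followed by `admin` collapse to `admin`, while a typed backslash such as the one in the regex `\d+` is an ordinary printable character and is left alone.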
### Shift-symbol keys (@ became 2)
Typing `@` (Shift+2) was recording as `2`. The same bug affected every other shifted symbol: `!`, `#`, `$`, `%`, `^`, `&`, `*`, `(`, `)`, `_`, `+`, `:`, `"`, `<`, `>`, `?`, `~`, `|`, `{`, `}`. The recorder now resolves the character that was actually produced, preferring the typed-character field when the SDK provides it and falling back to a US QWERTY shift map otherwise. International keyboard layout support is on the roadmap.
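The resolution order can be sketched like this. The event shape (`key`, `shift`, `typedChar`) is an assumption made for the example and does not describe the actual SDK payload; the map is the standard US QWERTY layout.

```typescript
// Hypothetical sketch; the event shape is assumed, not taken from the SDK.
const US_QWERTY_SHIFT = new Map<string, string>([
  ["1", "!"], ["2", "@"], ["3", "#"], ["4", "$"], ["5", "%"],
  ["6", "^"], ["7", "&"], ["8", "*"], ["9", "("], ["0", ")"],
  ["-", "_"], ["=", "+"], [";", ":"], ["'", "\""], [",", "<"],
  [".", ">"], ["/", "?"], ["`", "~"], ["\\", "|"], ["[", "{"], ["]", "}"],
]);

interface RecordedKey {
  key: string;        // unshifted base key, e.g. "2"
  shift: boolean;     // whether Shift was held
  typedChar?: string; // character actually produced, when the SDK reports it
}

function resolveTypedChar(ev: RecordedKey): string {
  // Prefer the character the SDK says was produced.
  if (ev.typedChar !== undefined && ev.typedChar.length > 0) {
    return ev.typedChar;
  }
  if (!ev.shift) {
    return ev.key;
  }
  // Fallback: US QWERTY shift map, then plain uppercasing for letters.
  const mapped = US_QWERTY_SHIFT.get(ev.key);
  if (mapped !== undefined) {
    return mapped;
  }
  return ev.key.length === 1 ? ev.key.toUpperCase() : ev.key;
}
```

Preferring `typedChar` means a non-US layout still records correctly whenever the SDK supplies the field; only the fallback path is layout-specific.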
### Video player on the run detail page
BrowserStack returns the recording's URL as soon as a session ends, but the file at that URL is still being written for tens of seconds afterwards. The dashboard was happily handing the URL to a `<video>` element, which would start playing, run out of bytes mid-stream, and stall. The dashboard now waits for the upload to finish (server-side HEAD-probe with stability check) and shows a "Recording is finishing upload" placeholder with a Refresh button until the video is verified ready. If the upload is still in flight when the page first loads, the placeholder explains why — no more half-broken playback.
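One way to express a stability check is as a pure decision over the sizes reported by successive HEAD probes. This is a sketch under assumptions: the release note does not say what the check compares, so using the reported byte size (for example from `Content-Length`) and requiring two matching probes are both illustrative choices.

```typescript
// Hypothetical sketch: decide readiness from the sizes seen by successive HEAD probes.
// `null` means a probe failed or reported no size.
function isRecordingReady(
  probedSizes: Array<number | null>,
  requiredStableProbes: number = 2,
): boolean {
  const needed = Math.max(2, requiredStableProbes);
  if (probedSizes.length < needed) {
    return false; // not enough evidence yet
  }
  const tail = probedSizes.slice(-needed);
  const last = tail[tail.length - 1];
  if (last === null || last === undefined || last <= 0) {
    return false;
  }
  // Ready only when the most recent probes all agree on the same size.
  return tail.every((size) => size === last);
}
```

While the file is still growing, the last probes disagree and the page would keep showing the placeholder; once two consecutive probes report the same non-zero size, the URL can be handed to the player.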
Daily: Ship to prod: v1.48 — Sentry security tags now cover mcp-server too
daily, api
4 commits
Improvements:
- Ship to prod: v1.48 — Sentry security tags now cover mcp-server too
- security: extend Sentry security tags to mcp-server
- Ship to prod: v1.47 — Sentry security tags + rate-limit visibility
- security: route security-relevant signals to Sentry with stable tags
Daily: compliance: log v1.37 dep ship under CTRL-SYS-001 + nightly-scan evidence
daily, api
4 commits
Improvements:
- compliance: log v1.37 dep ship under CTRL-SYS-001 + nightly-scan evidence
- compliance: log v1.37 dep ship under CTRL-SYS-001 + capture pending nightly-scan evidence
- Ship to prod: v1.37 — bump stripe v22 + typescript v6
- deps: bump stripe to v22 (dashboard + mcp-server) and typescript to v6 (mcp-server)
Routine maintenance: upgraded Stripe SDK to v22 across the dashboard and MCP server, and TypeScript to v6 in the MCP server build. No user-facing behavior change.
## What changed
Three internal dependency upgrades (Stripe in the dashboard, Stripe in the MCP server, and TypeScript in the MCP server) shipped as a single deploy:
- **Stripe SDK**: v20 → v22 (dashboard + MCP server). Picks up newer Stripe API defaults and security patches. All checkout, customer, subscription, and webhook flows verified to compile and behave the same.
- **TypeScript**: v5 → v6 (MCP server build). Build-time-only change; no runtime impact.
## Why
Staying current on payment SDK majors keeps webhook contracts and API behavior aligned with what Stripe sends from production. Bundling all three bumps into one deploy avoids spamming the deploy queue.
## How it was verified
- Fresh `npm install` + `tsc` (MCP server) and `astro build` (dashboard) on top of `main` — both pass clean.
- Stripe API surface used by the app is limited to long-stable methods (`checkout.sessions.create`, `customers.create`/`retrieve`, `subscriptions.cancel`/`list`/`update`/`retrieve`, `webhooks.constructEvent`).
No action needed from anyone.
73 commits
Improvements:
- Ship to prod: v1.36 — unbreak OAuth callback ("Authenticating" hang)
- auth: fix OAuth callback hang — split inline define:vars from module import
- Ship to prod: v1.35 — restore OAuth (revert browser cookie adapter)
- auth: revert browser-side cookie adapter — was breaking PKCE
- Ship to prod: v1.34 — Safari session-restore beaten by per-tab marker
Logout-on-close now works in Safari too
auth, security, dashboard, safari
v1.33 flipped Supabase auth cookies to session-scoped, which worked on Chrome and Firefox but was defeated by Safari's default "Reopen All Windows from Last Session" setting — Safari restored the cookies along with the tabs. This release adds a per-tab client-side marker that Safari's restore cannot fake.
## What changed
The dashboard layout now runs a tiny inline script before any other page JS. It checks `sessionStorage` for a `bugagent_tab_marker` that is only set on a successful login/OAuth flow. If the marker is missing, the script asks any other live tab over `BroadcastChannel` whether it is authenticated. A live tab can answer, so the new tab inherits the marker, and Cmd+click / middle-click into new tabs keeps working. If no tab answers within 120 ms, the script clears the `sb-*` cookies and navigates to `/login?returnTo=<path>`.
This means:
- Close the browser completely, reopen it, and visit the dashboard → forced to sign in again (regardless of Safari preferences).
- Open a dashboard link in a new tab while other dashboard tabs are open → the new tab picks up auth without a re-login round-trip.
- Chrome, Firefox, and Safari (with or without the "Reopen Last Session" setting) all behave the same way now.
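The decision the inline script makes can be sketched as a pure function, leaving out the browser APIs (`sessionStorage`, `BroadcastChannel`, the 120 ms timer). The function and type names are hypothetical, and URL-encoding the return path is an assumption.

```typescript
// Hypothetical sketch of the per-tab decision; browser wiring is omitted.
type TabDecision =
  | { action: "continue" }        // marker already present in this tab
  | { action: "inherit-marker" }  // a live tab vouched for the session
  | { action: "logout"; redirectTo: string };

function decideTabAuth(
  hasMarker: boolean,    // bugagent_tab_marker found in sessionStorage
  peerAnswered: boolean, // another tab replied before the timeout
  currentPath: string,
): TabDecision {
  if (hasMarker) {
    return { action: "continue" };
  }
  if (peerAnswered) {
    return { action: "inherit-marker" };
  }
  // No marker and no live peer: treat this as a restored session.
  return {
    action: "logout",
    redirectTo: "/login?returnTo=" + encodeURIComponent(currentPath),
  };
}
```

A Safari-restored tab has cookies but neither a marker nor a live peer, so it takes the logout branch; a Cmd+clicked tab has a live peer and inherits the marker.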
## Why
A testing and bug-reporting tool holding a signed-in session across browser restarts is a meaningful risk on shared or unattended machines. The user now has a predictable mental model: close it → signed out.
Closing the browser now logs you out
auth, security, dashboard
Supabase auth cookies are now session-scoped. When you close the tab or quit the browser the cookies are dropped, and your next visit to the dashboard sends you back to the login page. Tabs left open continue to share the same signed-in session — only an actual close kills it.
## What changed
The `sb-*` auth cookies used to carry a long max-age so the browser held on to them across restarts. Now both the server-side and browser-side Supabase wrappers strip the max-age and expires attributes, which turns them into HTTP session cookies. Browsers delete session cookies on full close, so the next request arrives with no auth state and the middleware redirects to `/login?returnTo=…`.
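A minimal sketch of the attribute stripping, assuming cookie options are passed around as a plain object. The `CookieOptions` shape and the function name are illustrative and are not the Supabase wrapper's actual types.

```typescript
// Hypothetical sketch; CookieOptions is a simplified stand-in type.
interface CookieOptions {
  path?: string;
  maxAge?: number;
  expires?: Date;
  httpOnly?: boolean;
  secure?: boolean;
}

// Turn sb-* auth cookies into session cookies; leave every other cookie alone.
function toSessionScoped(name: string, options: CookieOptions): CookieOptions {
  if (!name.startsWith("sb-")) {
    return options; // e.g. active_team / active_project keep their persistence
  }
  const copy: CookieOptions = { ...options };
  delete copy.maxAge;  // no Max-Age attribute
  delete copy.expires; // no Expires attribute
  return copy;
}
```

A cookie with neither `Max-Age` nor `Expires` is a session cookie, which is the behavior the paragraph above relies on.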
## What it does NOT change
- Multi-tab use: while the browser is open, every tab shares the same signed-in session.
- Mid-session activity: access tokens still refresh normally while you are using the app.
- Workspace preferences: `active_team` and `active_project` cookies (which remember your last-selected workspace and project) keep their persistence.
- Logout: the explicit logout button already cleared these cookies and still does.
## Why
For a testing and bug-reporting tool, a stale session hanging around on a shared or unattended machine is more risk than convenience. Session-scoped auth gives a cleaner security default: if you close the browser, you have deliberately ended the session.
Auto-classifiers now emit formal QA severity (s1–s4)
bug-reports, severity, ai-classification, automation, mcp
Every path that auto-picks a bug severity — the heuristic parser, the Claude Haiku classifier, the AI bug-creation endpoint, failed-automation auto-bugs, and default fallbacks — now writes s1/s2/s3/s4 instead of critical/high/medium/low. The legacy names still work everywhere (API, MCP tool inputs, DB reads, Jira priority map), so nothing breaks for existing integrations. This starts the phase-out of the legacy convention.
## What changed
- **Heuristic parser** (`parse-report.ts` `SEVERITY_SIGNALS`) now emits s1/s2/s4; s3 is the fallback when no pattern fires. The matcher also recognizes the user's own s1/s2/s3/s4 mentions so pasting a QA ticket round-trips cleanly.
- **Claude Haiku parser** is instructed to return s1–s4 with explicit QA definitions (s1 Blocker, s2 Critical, s3 Major, s4 Minor). A normalization layer maps any legacy values the model produces anyway.
- **AI bug-creation endpoint** (`POST /api/ai/create-report`) accepts both shapes; unspecified severity defaults to s3 (was "medium").
- **Failed-automation auto-bug** now tags runtime errors as s2 and assertion failures as s3 (was high/medium).
- **Remaining default fallbacks** — `POST /api/reports`, note-to-report conversion, and the MCP `createBugReport` store — default to s3.
- **MCP tool schemas** (`create_bug_report`, `list_bug_reports`, `update_bug_report`) accept both formal QA and legacy values; descriptions now recommend the s1–s4 codes.
- **Jira priority map** accepts both conventions: s1/critical → Highest, s2/high → High, s3/medium → Medium, s4/low → Low.
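The dual-convention handling in the list above can be sketched as a single normalization step plus a priority lookup. Function names are hypothetical, and falling back to s3 for unrecognized strings is an assumption (the notes only state that unspecified severity defaults to s3).

```typescript
// Hypothetical sketch of a normalization layer for both severity conventions.
type Severity = "s1" | "s2" | "s3" | "s4";

const SEVERITY_ALIASES = new Map<string, Severity>([
  ["s1", "s1"], ["s2", "s2"], ["s3", "s3"], ["s4", "s4"],
  // Legacy names, paired the same way as in the Jira priority map.
  ["critical", "s1"], ["high", "s2"], ["medium", "s3"], ["low", "s4"],
]);

function normalizeSeverity(input?: string | null): Severity {
  if (input === undefined || input === null) {
    return "s3"; // unspecified severity defaults to s3
  }
  // Assumption: unknown values also fall back to s3.
  return SEVERITY_ALIASES.get(input.trim().toLowerCase()) ?? "s3";
}

const JIRA_PRIORITY: Record<Severity, string> = {
  s1: "Highest",
  s2: "High",
  s3: "Medium",
  s4: "Low",
};

function jiraPriorityFor(input?: string | null): string {
  return JIRA_PRIORITY[normalizeSeverity(input)];
}
```

Note the naming wrinkle this makes visible: the legacy value `critical` maps to s1 (Blocker), while the formal label "Critical" belongs to s2.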
## What stayed
- User-facing severity picker (already on s1–s4 since earlier work)
- Severity label map and impact-score weights (already support both conventions)
- Security scan, code-review, exploration, and SIEM event severity scales — those live in separate tables with their own enums and aren't part of this phase-out
- All existing DB rows keep their original values; no backfill
## Why
The product was quietly running two conventions in parallel. The formal QA codes match what testing orgs actually use (S1 Blocker / S2 Critical / S3 Major / S4 Minor) and align with the Severity picker the dashboard already shows. Switching the auto-classifiers over gets new bugs on the modern convention without breaking any existing consumer.
v1.12 — Developer Notes debates itself (critical + high bugs)
bug-reports, claude, ai, feature
Critical and high severity bug reports now get a five-step multi-model debate before Developer Notes is written, with a different-model adjudicator making the final call. Every round is visible in a new transcript view.
## What's new in v1.12
**Debate mode for the bugs that matter most.** Critical and high severity bug reports now go through a full five-step debate before Developer Notes is finalized:
1. **Sonnet drafts** the analysis.
2. **GPT-5 critiques** it as a skeptical peer reviewer.
3. **Sonnet rebuts** the critique point-by-point — conceding valid points, pushing back on weak ones.
4. **Claude Opus adjudicates** with independent judgment, reading the full transcript and picking the stronger argument on each point.
5. Final notes land in the usual Developer Notes block.
**Medium and low bugs keep the cheaper three-step chain** from v0.97 (draft → critique → synthesis). The debate path is ~5× the cost and latency, so it's reserved for the bugs where the extra rigor pays for itself.
**See the full debate transcript.** A new purple "Debated · adjudicated by claude-opus-4-5" chip appears next to the Developer Notes heading on critical/high bug pages. Click the chip (or the new "Show debate transcript" toggle inside the analysis modal) to read every round — color-coded by participant (neon green for the drafter, purple for the critic, amber for the adjudicator). Every round is also persisted on the bug record so auditors and agents can replay the reasoning chain later.
**Graceful degradation everywhere.** If any step fails (missing keys, rate limits, parse errors), the chain falls through to the next-best answer, so you always get useful notes and never a hard error. Operators who want to swap the adjudicator model or pin everything to the cheaper chain can configure this via env (`DEVNOTES_ADJUDICATOR_MODEL`, `DEVNOTES_DEBATE_ENABLED`).
v0.97 — Developer Notes now debates itself
bug-reports, claude, ai, feature
Developer Notes now runs a three-step chain across two models: Sonnet drafts, a powerful OpenAI model challenges the draft as a skeptical peer, and Sonnet synthesizes the final answer incorporating the critique.
## What's new in v0.97
**Developer Notes now runs a draft → challenge → synthesize chain.** Every new bug report still auto-generates Developer Notes the moment it's created — but now the output goes through three steps instead of one. Claude Sonnet writes the initial diagnosis, a powerful OpenAI model reviews it as a skeptical peer reviewer (pushing back on weak reasoning, flagging missing considerations, and offering alternative hypotheses), and Sonnet produces the final analysis incorporating the critique.
**Why two models?** Two LLMs pushing on each other tend to catch things a single pass misses. The challenger's specific brief is "don't summarize, don't agree politely — tell us what's wrong with this draft." The final output is the distilled result.
**Visible on the report page.** A small purple "Challenged by `gpt-5`" chip appears next to the "Developer Notes" heading whenever the chain ran, and hovering the chip shows a preview of the critique. When the challenger step is skipped (platform config change, transient API error, etc.) the draft is used as the final answer and the chip is hidden, so there is no user-visible breakage.
**Under the hood.** Three new columns on bug reports persist the full chain: the pre-challenger draft, the critique, and the challenger model identifier. The `push_to_claude` MCP tool and the `POST /api/claude/push` REST endpoint both now return these extra fields for observability. No per-team Claude connection needed — the platform covers both the Anthropic and OpenAI keys.
v0.87 — "Where bugs are coming from" trends
analytics, feature, ui
New source-dispersion views on the bug reports list and the analytics page. Shared palette so the same color means the same source across both.
## What's new in v0.87
**Trends card on the bug reports page.** Added a compact "Where bugs are coming from" card at the top of `/dashboard/reports` with a mini-donut and the top-5 sources as chips — dashboard, agent, automation, mobile, exploration, test case, and any others present in the last 30 days. One click through to the full analytics view.
**Two new charts on the analytics page.** `/dashboard/analytics` now has a "By Source" donut next to the existing severity/status donuts, plus a "Bug Sources Over Time" stacked daily chart so you can see how the mix has shifted across the selected time range (7d/14d/30d/90d).
**Shared palette.** The same color maps to the same source on both pages — amber for manual dashboard bugs, purple for agent-filed, neon green for failed automations, blue for mobile tests, cyan for exploratory AI, pink for test cases, and so on.
**Bonus fix.** While wiring this up we found the old source breakdown was silently bucketing every bug as "manual" because the analytics endpoint wasn't selecting the metadata column. Now fixed — the distributions you see going forward reflect the real source mix.
v0.77 — Shorter Likely Fix Area + quick-submit bugs now get it too
bug-reports, claude, fix
The Likely Fix Area block is now a glanceable 1-2 sentence summary instead of a paragraph. Bugs filed via the quick-submit path also get one — previously they skipped it.
## What's new in v0.77
**Shorter Likely Fix Area.** Tightened the prompt so the block is 1-2 sentences or up to 3 short bullets — a glanceable summary at the top of each bug, with the deeper analysis one click away in *View Analysis*. Prior output sprawled into 3-5 bullet paragraphs; the new version fits naturally under Description without dominating the page.
**Quick-submit bugs now get Likely Fix Area.** Bugs filed via the Chrome extension / FAB SDK one-shot path previously only fired Developer Notes and skipped Likely Fix Area. They now fire both, matching the behavior of dashboard-created bugs and failed-automation auto-bugs.
**Under the hood.** Recorded a missing migration for the `claude_status` column so fresh local installs no longer diverge from production schema.
Renames Claude Analysis to Developer Notes. Auto-generates on every bug — no API key or "Send to Claude" button required. Prompts to regenerate when the description or attachments change.
## What's new in v0.74
**Developer Notes replaces Claude Analysis.** Every new bug report is automatically analyzed the moment it's created — no "Send to Claude" button to click and no per-team Claude API key required. The card on the bug report page is now called **Developer Notes** and contains both the root-cause analysis and the Likely Fix Area sub-block in one place.
**Powered by bugAgent, not by your wallet.** Developer Notes now runs on the platform's Anthropic key. You no longer need to bring your own key or wire up a Claude connection in Settings.
**Stays in sync with your bug.** When you edit the description or add/remove attachments, a green "Regenerate?" banner appears inside the Developer Notes card. One click refreshes both the analysis and the Likely Fix Area against the new content, so the notes always reflect the current state of the bug.
**REST + MCP mirror the simplification.** `POST /api/claude/push` and the `push_to_claude` MCP tool no longer require a per-team connection — they use the platform key too. Auto-fire on bug creation means most callers don't need to invoke these manually anymore; they're now primarily for explicit regenerate.
v0.64 — "Likely Fix Area" on bug reports
bug-reports, feature, claude, mcp
New Sonnet-generated section on every bug report pointing at the files or modules most likely to need the fix. Grounded in your connected GitHub repo when available.
## What's new in v0.64
**Likely Fix Area.** Every bug report now includes a new auto-generated section — a short, bulleted Sonnet output that names the parts of the codebase where the fix most likely belongs, inside the existing Claude Analysis card.
**Grounded when a repo is connected.** When your project has a GitHub repo mapped and your team has a GitHub connection, the suggestions reference specific file paths pulled from real code in the repo. When no repo is connected, you get generic technology-appropriate guidance and a one-click link to wire one up in Settings → Integrations.
**Runs automatically.** Fires on every bug-report creation (human-filed or auto-generated from a failed automation run). A small Retry / Regenerate button on the report page re-runs it if you edited the description or want a fresh look.
**Powered by the platform.** Uses the bugAgent platform Anthropic key — no per-team Claude API key configuration required. This is the first step in moving AI features off BYO keys.
**MCP access.** The new `analyze_fix_area` MCP tool lets agents trigger or re-trigger the analysis for any bug by ID, and returns the output synchronously along with whether the repo-grounded path was used.
Small follow-ups on the web automation version picker: neon green highlight, sharper docs, and the sidebar 3-dot "Archive" option renamed to "Hide".
## What's new across v0.51–v0.53
**v0.51 — Neon green version highlight.** The version-picker accent (inset border on the script when a non-current version is selected, the vN chip in Run History, and the chip next to "View Automation Script →" on bug reports) switches from amber to the brand's neon green to better signal "this is a version-selection surface", not a warning.
**v0.52 — Version-control docs.** `GET /automations/:id` now documents `script_versions` (shape, 100-entry cap, sources) and the new per-run version fields; `PATCH` documents `version_source`; `POST /automations/:id/undo` cap corrected to 100. Both run endpoints and the `run_automation` MCP tool spell out the default-current contract explicitly — omitting `version_index` always runs the current live script.
**v0.53 — Sidebar "Archive" is now "Hide".** The 3-dot option on sidebar items reads "Hide" instead of "Archive" — it was never a true archive (client-side toggle, no DB state), and "Hide" matches what actually happens. Existing hidden items are preserved.
v0.50 — Failed-run bug reports deep-link to the exact script version
automations, feature, bug-reports
Bug reports auto-created from failed web automation runs now link to the precise script version that executed. MCP run_automation and the CI/CD endpoint accept version_index too.
## What's new in v0.50
A follow-up to v0.49's version picker — the missing half that makes historical runs actually useful for triage.
**Bug reports pinpoint the version that failed.** When a web automation run fails, the auto-created bug report now records the exact script version that executed (label, source, and index). The description, the attached error log, and the metadata all note "Script version: v3 (manual edit)" when a non-current version ran, and the "View Automation Script →" link deep-links the editor to that version via `?version_index=N`.
**Editor opens on the linked version.** The automation editor reads `?version_index=N` on load and opens the picker on that entry, with the same amber "non-current" border as a manual pick. Clicking through from a failed bug report lands you on the script that broke — not whatever happens to be current that day.
**MCP and CI get version replay.** The `run_automation` MCP tool and `POST /api/v1/automations/run` (API-key CI endpoint) both accept an optional `version_index` with the same semantics as the dashboard. Agents and CI pipelines can now replay a specific historical version symmetrically with the UI.
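The optional `version_index` semantics might look like the sketch below. Everything except the parameter's meaning is assumed: the data shapes, the function name, and the choice to throw on an out-of-range index are illustrative only.

```typescript
// Hypothetical sketch of resolving which script a run should execute.
interface ScriptVersion {
  script: string;
  source: string; // e.g. "manual edit"
}

interface AutomationScripts {
  currentScript: string;
  scriptVersions: ScriptVersion[];
}

function resolveScriptToRun(
  automation: AutomationScripts,
  versionIndex?: number,
): { script: string; versionIndex: number | null } {
  // Omitting version_index always runs the current live script.
  if (versionIndex === undefined) {
    return { script: automation.currentScript, versionIndex: null };
  }
  const version = Number.isInteger(versionIndex)
    ? automation.scriptVersions[versionIndex]
    : undefined;
  if (version === undefined) {
    // Assumption: an unknown index is rejected rather than silently ignored.
    throw new RangeError("unknown version_index: " + String(versionIndex));
  }
  return { script: version.script, versionIndex };
}
```

Returning the resolved index alongside the script gives the run record something to store, which is what lets a failed run point back at the exact version later.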
**Run history list responses trimmed.** The `GET /api/automations/runs` list query now skips the per-run `script_snapshot` column (it's only needed for detail views and audit, not for the history list), keeping list responses lean.
v0.49 — Pick any prior script version and run it
automations, feature
The web automation editor now has a version dropdown covering every saved edit. Run Now executes whichever version you pick, and Run History tags each row with the version that actually ran.
## What's new in v0.49
**Version picker in the script editor.** A dropdown next to the Undo button lists every saved version of your Playwright script — from the first edit through the current live version — annotated with source (manual edit, before AI optimize, before BrowserStack rewrite) and relative timestamp. Selecting an older version previews it in the editor with a subtle amber border; selecting "current" returns to the live script.
**Run Now respects the picker.** When a non-current version is selected, Run Now executes that exact historical script. The run record captures which version was used so failed-run triage no longer depends on your editor being in the same state days later.
**Version chip in Run History.** Any run that used a non-current version shows a small amber `vN` chip next to the trigger badge, with a tooltip explaining the source.
**Under the hood.** Every automation_runs row now stores the exact script text that executed, plus the version label and source — the data survives later edits and history-stack eviction. History cap raised from 10 to 100 entries, so deeply iterated automations keep a full ancestry.
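A capped history can be sketched in a few lines. The helper name is hypothetical, and evicting the oldest entries first is an assumption; the note mentions eviction but not its direction.

```typescript
// Hypothetical sketch of a capped version history.
const HISTORY_CAP = 100;

// Returns a new array with `entry` appended, trimmed to the newest `cap` entries.
function pushVersion<T>(history: T[], entry: T, cap: number = HISTORY_CAP): T[] {
  const next = history.concat([entry]);
  // Assumption: when the cap is exceeded, the oldest entries are dropped.
  return next.length > cap ? next.slice(next.length - cap) : next;
}
```

Because each run row stores its own copy of the script text, a version dropped from this list would still be recoverable from the runs that executed it.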
**Bug fix.** The version indicator beside Undo used to stay stale after a manual Save (it only updated after AI-optimize or BrowserStack rewrites). It now increments on every persisted edit.
v0.39 — SDK refresh & error backlog cleared
dependencies, reliability, compliance
Bumped Supabase, Anthropic, and Sentry SDKs to their latest minor versions, and cleared a batch of long-resolved error-tracking entries. No user-visible product changes.
## What's new in v0.39
**Dependency freshness**
- @supabase/supabase-js 2.103.3 → 2.104.1 (dashboard + MCP server)
- @anthropic-ai/sdk 0.90.0 → 0.91.0 (dashboard)
- @sentry/astro + @sentry/node 10.49.0 → 10.50.0
**Reliability**
- Cleared 5 long-resolved error-tracking entries whose underlying bugs were fixed in earlier ships but left open in our monitoring dashboard.
- Tightened the internal SOC 2 weekly checkpoint playbook so future access reviews produce consistent, comparable numbers across checkpoints.
No user-visible product changes in this release.
The CoPilot recorder now uses a focused port of Playwright's own selector-generation algorithm. Every click captures the best locator the same way `npx playwright codegen` would — with uniqueness validation on every candidate, so strict-mode "multiple elements found" errors disappear at recording time rather than being healed at run time.
## What's new
- **Selectors match Playwright codegen**: the recorder ports Playwright's `selectorGenerator` logic (not a rewrite — same priority rules, same uniqueness checks). New recordings emit `page.getByTestId(...)`, `page.getByRole("button", { name: "Submit" })`, `page.getByLabel("Email")`, `page.getByText("Accept")` etc. — whichever candidate uniquely identifies the clicked element in the DOM at capture time.
- **Implicit ARIA role detection**: `<button>`, `<a href>`, `<input type="submit">`, `<h1>-<h6>`, `<nav>`, `<section aria-label>` all get their implicit ARIA role so `getByRole` works without author intervention.
- **Accessible-name computation**: follows WAI-ARIA spec order — `aria-labelledby` > `aria-label` > `<label for>` > wrapping `<label>` > `alt` > `value` > text content.
- **No more strict-mode surprises**: if `getByRole("link", { name: "Togetherness" })` would match multiple elements on the page, the generator skips to the next priority (label / placeholder / unique CSS) instead of emitting something that'll fail at run time.
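The uniqueness rule in the list above can be sketched as a walk over candidates in priority order. This is not Playwright's `selectorGenerator`; the function name and the injected `countMatches` callback are stand-ins for whatever counts DOM matches at capture time.

```typescript
// Hypothetical sketch: pick the first candidate that matches exactly one element.
function pickUniqueLocator(
  candidatesByPriority: string[],
  countMatches: (locator: string) => number,
): string | null {
  for (const candidate of candidatesByPriority) {
    if (countMatches(candidate) === 1) {
      return candidate; // unique at capture time, so strict mode will accept it
    }
    // 0 or 2+ matches: skip to the next priority instead of emitting it.
  }
  return null;
}
```

For example, if a role-based candidate matches two links while a label-based one matches a single element, the label-based locator is the one that gets recorded.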
## Upgrade
Download `bugagent-copilot-tester-v1.3.0.zip` from the repo, remove the old extension from `chrome://extensions`, then use Load unpacked to load the 1.3.0 folder. The extension ID is preserved so saved auth and resources carry over.
Existing recordings in IndexedDB still work — they fall through to the old heuristic path. Re-recording gets you the new locators.
Improvements:
- Ship to prod: v0.29 — mouse/tap step-5 + expanded BS iOS compat
- runner: mouse/tap click as step 5 + more BS iOS compat rewrites (v0.29)
- Ship to prod: v0.28 — BS iOS compat rewrite for toHaveURL/toHaveTitle
- runner: BS iOS compat rewrite for unsupported expect() matchers (v0.28)
- Ship to prod: Chrome extension 1.3.0 — Playwright codegen-grade recorder
v0.22 + Chrome extension 1.1.0 — modal and popup handling
automations, chrome-extension, reliability, ai
Two-sided fix for the cookie-banner / privacy-modal / custom-element-wrapper case that was defeating automations. Runner learned to force-click overlays; the extension recorder learned to pick the real button inside a wrapper instead of the wrapper itself.
## Runner (v0.22)
- **Wider self-healing**: when a locator action fails with "Internal error", "Failed to execute", "not visible", "intercepted", "pointer-events", or "outside of the viewport", self-healing now triggers. Previously these fell through as hard errors.
- **Force-click fallback**: after the initial click fails with a dispatch-level error, the runner retries with `{ force: true }` on the same selector — bypasses Playwright's actionability check, usually sufficient for cookie banners or fixed-position overlays. If that still fails, Claude picks a better selector and the force-retry runs on the new locator too.
- Healing log entries now tag force-click retries as `click (force)` so you can see in stdout which path got the action through.
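A simplified sketch of the retry path described in the list above. Two things here are assumptions: the same error-substring list gates the force retry (the notes list the self-healing triggers but don't define "dispatch-level" errors), and `ClickTarget` is a minimal stand-in interface whose `click` option mirrors Playwright's `force` flag. The Claude-picked replacement selector is left out.

```typescript
// Hypothetical sketch; not the runner's actual healing code.
const HEALABLE_ERROR_FRAGMENTS = [
  "Internal error",
  "Failed to execute",
  "not visible",
  "intercepted",
  "pointer-events",
  "outside of the viewport",
];

function shouldSelfHeal(message: string): boolean {
  return HEALABLE_ERROR_FRAGMENTS.some((fragment) => message.includes(fragment));
}

// Minimal stand-in for a locator-like object.
interface ClickTarget {
  click(options?: { force?: boolean }): Promise<void>;
}

// Returns the tag that would go into the healing log.
async function clickWithForceFallback(
  target: ClickTarget,
): Promise<"click" | "click (force)"> {
  try {
    await target.click();
    return "click";
  } catch (err) {
    const message = err instanceof Error ? err.message : String(err);
    if (!shouldSelfHeal(message)) {
      throw err; // unrelated failure: surface it unchanged
    }
    // Retry on the same selector, bypassing actionability checks.
    await target.click({ force: true });
    return "click (force)";
  }
}
```

Keeping the classifier separate from the retry makes it easy to see which error messages widen the healing path without touching the click logic.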
## Chrome extension (1.1.0)
- **Smarter click capture for modals**: the recorder now uses `composedPath()` to drill into open Shadow DOM and picks the real interactive descendant (button / link / role=button / [data-testid]) when you click on a custom-element wrapper or generic container. Privacy banners, consent dialogs, and modal confirmations record as the actual button instead of the wrapper.
- Hit-testing by click coordinates disambiguates when a wrapper contains multiple interactive children.
Reload the extension (`chrome-extension/build.sh` or refresh unpacked) to pick up the recorder changes. The runner fixes apply to existing automations immediately — no rebuild required.
v0.21 — performance pass: faster pages with more users online
perfmulti-tenantinfra
Multi-tenant performance work. The dashboard's per-request team-membership check is now O(1) from a middleware-level cache, plus a migration adds targeted indexes on the hottest tables. You should feel it most on dashboards with several users active at once.
## What's new
- **Membership cache**: Every authenticated dashboard page now builds a Map of the user's active team memberships once in middleware. Authorization checks on detail and mutation endpoints become O(1) Map lookups instead of repeated database queries — a page that used to hit `team_members` four to six times per request now hits it once.
- **Indexes on hot tables**: Migration 114 adds composite indexes on `team_members`, `messages`, and `changelog_entries` plus partial indexes on the cron-polled schedule tables. These target the filter patterns the platform actually uses.
- **Stats refresh**: `ANALYZE` runs inline so the query planner picks up the new index stats immediately instead of waiting for autovacuum.
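As a rough illustration of the membership cache, the sketch below builds the Map once from a single `team_members` read and answers later checks from memory. The row shape, the `Role` values, and both function names are hypothetical; only the table name comes from the notes above.

```typescript
// Hypothetical row shape for one team_members record.
type Role = "owner" | "admin" | "member";
interface MembershipRow {
  team_id: string;
  role: Role;
  active: boolean;
}

// Built once per request (for example in middleware) from one query result.
function buildMembershipCache(rows: MembershipRow[]): Map<string, Role> {
  const cache = new Map<string, Role>();
  for (const row of rows) {
    if (row.active) cache.set(row.team_id, row.role);
  }
  return cache;
}

// Later authorization checks are O(1) Map lookups, with no further queries.
function canAccessTeam(cache: ReadonlyMap<string, Role>, teamId: string): boolean {
  return cache.has(teamId);
}

const cache = buildMembershipCache([
  { team_id: "team-a", role: "admin", active: true },
  { team_id: "team-b", role: "member", active: false },
]);
console.log(canAccessTeam(cache, "team-a"), canAccessTeam(cache, "team-b"));
```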
## Why
Our audit found that 96.6% of scans on `team_members` were sequential (22M tuples read on a 68-row table) and that `changelog_entries` was at 100% sequential scans, with zero index scans ever. Both are classic multi-tenant hot-path problems: each query against a small table is cheap on its own, but the collective read volume dominates under concurrent load.
v0.11 — versioning convention + real /health version
infraobservabilityversioning
bugAgent now has a formal build-number scheme so every shipped change is observable. The runner's /health endpoint reports the current product version, replacing a stale hardcoded value.
## What's new
- **Versioning scheme**: bugAgent now tracks a canonical product version in root `package.json`. Small changes bump by 0.01, medium by 0.1, large by 1.0. The Chrome extension follows standard semver on its own cadence.
- **/health now truthful**: The automation runner's `/health` endpoint returns the actual product version read at startup (it was previously hardcoded to `"2.1"` and had drifted out of sync with releases). Useful for confirming which version prod is running without digging into Railway.
- Every ship-to-prod commit going forward will include the new version in the one-liner.
## Try it
`curl https://bugagent-automation-production.up.railway.app/health` returns `{"status":"ok","version":"0.11",...}` once the deploy lands.
Chrome extension: scrolls now recorded in Playwright captures
chrome-extensionrecordingautomations
The CoPilot extension's Record tab now captures scroll gestures alongside clicks, fills, and submits. Generated Playwright scripts reproduce the same scroll positions on replay — useful for lazy-loaded feeds, infinite scroll, and reviewing the run video.
## What's new
- **Scroll capture**: While recording a session, window scrolls and element-level scrolls (modals, panels with `overflow:auto`) are now included in the action log.
- **Settled-position emit**: One wheel gesture fires 20–40 scroll events; the extension debounces per-target (250ms) and emits the final resting position, keeping the action log tight.
- **Replay fidelity**: Generated scripts use `page.evaluate(({x,y}) => window.scrollTo(x,y), ...)` for window scrolls and `locator.evaluate(...)` for element scrolls, so Live/Virtual replays land at the same vertical offsets the human saw — important for lazy-loaded feeds and scroll-triggered reveal animations.
- **Smart dedupe**: Consecutive scrolls to the same target are collapsed to the final position, so "scroll down → glance → scroll back up" round-trips don't bloat the script.
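The settle and dedupe rules can be illustrated offline with two pure functions. This is a sketch under assumptions: the event and action shapes are hypothetical, and a real recorder would use timers rather than replaying timestamps, but the 250 ms window and the collapse rule are the ones described above.

```typescript
// Hypothetical shapes: `target` identifies the window or a scrolled element,
// `t` is a timestamp in milliseconds.
interface ScrollEvent { target: string; x: number; y: number; t: number }
interface ScrollAction { target: string; x: number; y: number }

const SETTLE_MS = 250;

// Offline equivalent of a per-target debounce: an event counts as the resting
// position if no later event on the same target arrives within SETTLE_MS.
function settledScrolls(events: ScrollEvent[]): ScrollAction[] {
  const settled: ScrollAction[] = [];
  events.forEach((ev, i) => {
    const next = events.slice(i + 1).find((e) => e.target === ev.target);
    if (!next || next.t - ev.t >= SETTLE_MS) {
      settled.push({ target: ev.target, x: ev.x, y: ev.y });
    }
  });
  return settled;
}

// Collapse consecutive scrolls on the same target to the final position.
function collapseConsecutive(actions: ScrollAction[]): ScrollAction[] {
  const out: ScrollAction[] = [];
  for (const action of actions) {
    const last = out[out.length - 1];
    if (last && last.target === action.target) out[out.length - 1] = action;
    else out.push(action);
  }
  return out;
}
```

In a full action log, a click or fill between two scrolls would break the run of consecutive scrolls, so both positions would be kept.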
## How to try it
Reload the bugAgent CoPilot Chrome extension (rebuild with `chrome-extension/build.sh` or refresh the unpacked extension), start a recording, scroll around, stop. The generated Playwright script will include the scroll steps.
Automations: self-healing locators
automationsaireliability
Every Playwright automation now auto-repairs fragile selectors at run-time. If a click / fill / press times out, the runner asks Claude for a better CSS selector from the live page and retries once — no more flaky failures from class renames or custom-element tag names.
## What's new
- **Self-healing locators**: When an action like `page.locator("tb-banner-wrapper").click()` times out or matches zero elements, the runner captures the page's interactive DOM, asks Claude for a replacement CSS selector, and retries the same action once with the healed locator.
- Catches both fragile generations (raw custom-element tag names, brittle class-only selectors) and page drift over time — a class rename six months from now auto-repairs instead of paging the on-call person.
- **Assertions stay strict**: `expect(locator).toBeVisible()` and other test assertions are deliberately NOT healed — a failed assertion still fails the test honestly.
- **Privacy**: healing runs in the Node test worker on our runner, not inside the remote browser. The runner secret and dashboard URL never leave our infrastructure. Only the page's anonymized interactive-elements DOM is sent to Claude.
- **Works everywhere**: applies to every Node Playwright automation — both local Virtual runs and real BrowserStack devices (desktop, Android, and iOS).
## How to see it
Run any Node automation. When healing kicks in you'll see entries in the run's stdout like:
```
[bugAgent] self-healed locator: "tb-banner-wrapper" → "[data-testid=\"banner\"]" for click() after: locator.click: Timeout 30000ms exceeded.
```
No configuration required — it's on by default for every automation.
Automations now support password-only front-door gates (no username needed), pre-auth works on BrowserStack including iOS, and when the built-in heuristics can't match your signin page the runner falls back to a Claude-built locator plan.
## What's new
- **Password-only auth**: If your app sits behind a single front-door password (staging gates, "coming soon" pages, access codes), leave the username field blank on the automation's Authentication panel. Pre-auth now fills and submits just the password.
- **Pre-auth on BrowserStack**: The authenticated session now carries through on BrowserStack runs — including real iOS devices. Previously pre-auth was skipped on BS, which is why some scripts stalled at the login wall.
- **Claude-assisted fallback**: When our heuristic selectors can't find the sign-in fields (custom element wrappers, unusual input names, etc.), the runner captures the sign-in form HTML, asks Claude for a locator plan, and executes it. No configuration needed — it kicks in automatically whenever the heuristic path misses.
## Privacy
Your credentials never leave our infrastructure. The Claude fallback ships only the sign-in page's HTML plus literal "USERNAME" / "PASSWORD" placeholder tokens; the runner substitutes the real values at execute time.
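A minimal sketch of the placeholder idea follows: the plan that comes back only ever contains the literal tokens, and real values are swapped in locally just before execution. The `PlanStep` shape and `resolvePlan` name are hypothetical; only the "USERNAME" / "PASSWORD" tokens come from the description above.

```typescript
// Hypothetical locator-plan shape. Only the placeholder tokens are from the notes.
type PlanStep =
  | { action: "fill"; selector: string; value: string }
  | { action: "click"; selector: string };

interface Credentials {
  username?: string; // optional: password-only gates leave this blank
  password: string;
}

// Replace placeholder tokens with real values at execute time, on the runner.
function resolvePlan(plan: PlanStep[], creds: Credentials): PlanStep[] {
  return plan.map((step) => {
    if (step.action !== "fill") return step;
    if (step.value === "USERNAME") return { ...step, value: creds.username ?? "" };
    if (step.value === "PASSWORD") return { ...step, value: creds.password };
    return step;
  });
}

const plan: PlanStep[] = [
  { action: "fill", selector: "input[type=email]", value: "USERNAME" },
  { action: "fill", selector: "input[type=password]", value: "PASSWORD" },
  { action: "click", selector: "button[type=submit]" },
];
const resolved = resolvePlan(plan, { username: "qa@example.com", password: "s3cret" });
console.log(resolved.length);
```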
## Where to find it
Automations → open any automation → Authentication card. Toggle "Enable pre-auth", paste the sign-in URL, set the password (optionally username), save.
Pre-auth credentials, richer Live artifacts, Claude script rewrite
Automations can now log in before a test runs (encrypted credentials); Live runs produce a shareable video + Playwright trace that works without a BrowserStack login; Python scripts get a one-click Claude rewrite to the shape bugAgent Live requires.
## Pre-auth credentials on automations
Attach a sign-in URL + username + password to an automation and the runner will log in before executing your Playwright script, handing an authenticated browser context to the test. The runner walks common username / password selectors (``data-testid``, ``type=email``, ``type=password``, ``name``, ``id``, ``placeholder``, ``autocomplete`` hints) and handles two-step flows (email → Next → password).
Passwords are encrypted at rest with AES-256-GCM. ``GET`` responses never return the ciphertext; they include ``auth_has_password: true`` so the UI can show "password set" without leaking anything.
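For readers unfamiliar with the pattern, here is a minimal AES-256-GCM sketch using Node's built-in `crypto` module, plus a response shaper that exposes only a boolean. The stored shape (`iv`, `tag`, `ciphertext` as base64) and the function names are assumptions for illustration, not the product's actual schema.

```typescript
import { createCipheriv, createDecipheriv, randomBytes } from "node:crypto";

// Hypothetical storage shape: all fields base64-encoded.
interface EncryptedSecret { iv: string; tag: string; ciphertext: string }

function encryptSecret(plaintext: string, key: Buffer): EncryptedSecret {
  const iv = randomBytes(12); // 96-bit nonce, the usual size for GCM
  const cipher = createCipheriv("aes-256-gcm", key, iv);
  const ciphertext = Buffer.concat([cipher.update(plaintext, "utf8"), cipher.final()]);
  return {
    iv: iv.toString("base64"),
    tag: cipher.getAuthTag().toString("base64"), // integrity check for decryption
    ciphertext: ciphertext.toString("base64"),
  };
}

function decryptSecret(secret: EncryptedSecret, key: Buffer): string {
  const decipher = createDecipheriv("aes-256-gcm", key, Buffer.from(secret.iv, "base64"));
  decipher.setAuthTag(Buffer.from(secret.tag, "base64"));
  const plain = Buffer.concat([
    decipher.update(Buffer.from(secret.ciphertext, "base64")),
    decipher.final(), // throws if the ciphertext or tag was tampered with
  ]);
  return plain.toString("utf8");
}

// Shape a GET response so the ciphertext never leaves the server.
function toApiResponse(row: { auth_username: string | null; auth_password: EncryptedSecret | null }) {
  return { auth_username: row.auth_username, auth_has_password: row.auth_password !== null };
}
```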
## Richer Live (BrowserStack) artifacts
Previously the Live-run video embedded in the run detail was sometimes just the remote VM desktop. Two upgrades:
- **Session videos re-hosted**: the runner now fetches BrowserStack's session video after every Live run and re-hosts it on bugAgent's own storage. Watch in the dashboard with no BrowserStack login required — useful for multi-tenant teams where only the operator has a BS seat.
- **Playwright traces**: every Live run (Node or Python) now captures a full Playwright trace ``.zip``. The run detail shows an "Open replay →" link that points at ``https://trace.playwright.dev/?trace=<url>`` for a scrubbable DOM timeline + network + console + per-action screenshots. Public viewer, no auth, works in any browser.
Also added a 500 ms ``slowMo`` delay on Live runs so fast tests actually produce visible playback instead of a one-second blur.
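The "Open replay" link format mentioned above can be built with a one-line helper. This is a sketch: `traceViewerUrl` is a hypothetical name, and encoding the trace URL is a defensive choice so that signed URLs with their own query strings survive intact.

```typescript
// Build a link to the public Playwright trace viewer for a hosted trace .zip.
function traceViewerUrl(traceZipUrl: string): string {
  return `https://trace.playwright.dev/?trace=${encodeURIComponent(traceZipUrl)}`;
}

console.log(traceViewerUrl("https://example.com/traces/run-1.zip"));
```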
## "Rewrite for bugAgent Live" for Python scripts
Python scripts that use the ``sync_playwright()`` context manager can't run on bugAgent Live (the SDK wraps pytest and needs a ``def test_<name>(page: Page):`` function). The automation detail page now detects this and shows an amber banner with a one-click **Rewrite with Claude** button.
Clicking calls a new endpoint that returns a pytest-style rewrite, previewed as a side-by-side diff (red = original, green = rewrite). Accept & save writes it to the script with a ``bs_compat_rewrite`` version tag so the standard **Undo** rolls it back if you change your mind.
## Expanded BrowserStack device catalog (~70 options)
The Live picker was refreshed from BrowserStack's canonical Playwright list. Desktop now covers Chrome + Edge on Windows 10/11 and macOS Ventura → Sequoia, plus the Playwright-bundled engines (Chromium / Firefox / WebKit). Real devices added: iPhone 15 → 17 (all variants), iPad Pro/Air/10th/9th, Samsung Galaxy S21 → S26 (Ultra / Plus / Std), Pixel 6 → 10, OnePlus, Motorola.
## New API endpoint
``POST /api/automations/:id/rewrite-for-bs`` — returns a pytest-style rewrite of a Python Playwright script, preserving behavior. See [API reference](/api-reference#post-automations-rewrite-for-bs).
## Docs
Pre-auth fields (``auth_enabled``, ``auth_signin_url``, ``auth_username``, ``auth_password``) on [PATCH /automations/:id](/api-reference#patch-automations-id), ``trace_url`` + re-hosted ``video_url`` behavior on [GET /automations/runs](/api-reference#get-automations-runs), new rewrite endpoint docs linked above.
Daily: Ship to prod: expand BrowserStack device catalog
Change who owns a case from inside the run carousel — the new assignee sees the bell alert and email right away.
## Reassign during a run
The carousel card on a test run now has the same "Assigned:" picker the suite detail page has. Hand a case off without leaving the run: pick the teammate from the dropdown and the change saves automatically.
## Notifications fire on the real change
When the assignee actually changes to a new teammate, they get a bell message + email with a neon green "Open test run" button (raw URL included for plaintext clients). Re-saving the same person or clearing the assignee is silent — same rules as the suite-case flow.
## Instant bell refresh
Self-assigning used to leave you waiting up to 30 seconds for the header to poll. Now the bell reloads the moment the save succeeds.
## API
New endpoint: `PATCH /api/test-runs/:id/results` with `{ case_id, assigned_to }`. Documented on the API reference page.
Daily: Ship to prod: neon green + raw URL on test assignment emails
dailyautomationaiapi
29 commits — Improvements: - Ship to prod: neon green + raw URL on test assignment emails - emails: neon green CTA + raw URL fallback on test assignment emails
Improvements:
- Ship to prod: neon green + raw URL on test assignment emails
- emails: neon green CTA + raw URL fallback on test assignment emails
- Ship to prod: docs sweep for BrowserStack runs + schedules + create_schedule
- docs: fill BrowserStack Live gaps on runs + schedules endpoints
- Ship to prod: map browserName for Python BrowserStack endpoint
Daily: Ship to prod: rename New Script button to Upload Script
dailyautomationaimarketingapi
73 commits — New features: - add CLAUDE.md — repo-level instructions for Claude Code
New features:
- add CLAUDE.md — repo-level instructions for Claude Code
Improvements:
- Ship to prod: rename New Script button to Upload Script
- automations: rename "New Script" button to "Upload Script"
- Ship to prod: Python Playwright scripts + modal cleanup + BrowserStack fix + docs
- CLAUDE.md: document the Automation Runner (Railway + conventions)
- CLAUDE.md: strengthen "always update public docs" rule
Import test cases from Figma
test-casesfigmaaiimport
New Import dropdown in the test-cases toolbar lets you upload a Figma zip export; Claude analyzes each frame and drafts test cases into a folder you pick.
## Import test cases from Figma
The test-cases page has a new **Import ▾** button next to AI Generate. The first option, **From Figma**, takes a zip of Figma frame exports and turns it into a set of draft test cases in the folder you choose.
**What you get**
- Upload up to 100 MB (around 300–500 frames)
- Pick an existing folder or create a new one inline
- Optional project context ("B2B dashboard for logistics") to sharpen generation
- Quick (≤5 cases/screen) or Thorough (≤12 cases/screen) depth
- Progress UI with phase, screen counts, and final totals
**How it works**
1. Your zip uploads directly to Supabase Storage via a signed URL.
2. The MCP worker on Railway unzips, dedupes near-duplicate frames (exact-match hash + state-suffix name normalization), then runs a four-pass pipeline on each unique screen: classify, per-screen cases, flow-level cases (screens sharing a name prefix form a flow), and a self-critique pass. The rubric is prompt-cached so the taxonomy is paid for once per run.
3. Cases are written to your chosen folder as drafts with `ai_generated=true`, tagged `figma-import`, and linked back to the source frame name.
4. The uploaded zip is deleted from Storage once the job finishes or fails.
Requires a connected Anthropic API key in Settings → Integrations. Cases start as drafts so you can review and edit before activating.
Test Cases: sort preference now persists across navigation
test-casesbug-fixux
Changing the Cases-tab sort to Oldest now sticks, even when the URL carries unrelated filters like a folder or suite selector.
### What changed
Previously, selecting **Oldest** on the Cases tab and then clicking a folder in the sidebar (or following any link with query params) snapped the sort back to **Newest**. The restore logic treated URL-vs-localStorage as all-or-nothing — a single URL param skipped saved preferences entirely.
### The fix
`restoreFilters()` now merges per-field: URL values still win for shareable links, but your saved sort preference fills in whenever the URL does not explicitly set it. Same behavior already in place for search, priority, type, status, and tag filters.
No action needed — just works.
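A per-field merge of this kind might look like the sketch below. The key list, the `newest` default, and the `mergeFilters` name are assumptions for illustration; the real `restoreFilters()` is not shown in these notes.

```typescript
// Hypothetical filter keys, based on the fields named above.
type FilterKey = "sort" | "search" | "priority" | "type" | "status" | "tag";
type Filters = Partial<Record<FilterKey, string>>;

const FILTER_KEYS: FilterKey[] = ["sort", "search", "priority", "type", "status", "tag"];
const DEFAULTS: Filters = { sort: "newest" };

// Per-field precedence: URL value, then saved preference, then default.
// Unrelated URL params (folder, suite, ...) no longer discard saved values.
function mergeFilters(url: URLSearchParams, saved: Filters): Filters {
  const merged: Filters = {};
  for (const key of FILTER_KEYS) {
    const value = url.get(key) ?? saved[key] ?? DEFAULTS[key];
    if (value !== undefined) merged[key] = value;
  }
  return merged;
}

// A folder link with no explicit sort keeps the saved "oldest" preference.
console.log(mergeFilters(new URLSearchParams("folder=abc"), { sort: "oldest" }).sort);
```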
Shipped to production: test-cases revamp end-to-end
releasetest-casesshipped
69 commits of test-case management work merged from develop to main. Railway is rolling out now. Highlights: folders, sub-suite runs, carousel review with voice control, analytics Reports tab, PDF export.
## What went live
`develop` was fast-forwarded and merged into `main` via commit `0591cc2`, then pushed — Railway picks it up automatically. Production is rolling out now.
### Shipping highlights (69 commits)
- **Folders** with 3-level nesting and drag-drop reorder on the Cases sidebar
- **Sub-suite auto-expansion** on runs (migration 106) — running a parent suite now includes every descendant sub-suite's cases, deduped, with the originating `suite_id` recorded per result row
- **Carousel-style run review UI** — one case at a time, jumpbar dots, prev/next arrows, auto-advance on result, picks up where you left off
- **Keyboard shortcuts**: `P` Pass · `F` Fail · `B` Block · `S` Skip
- **Voice control** via Web Speech API — Pass/Fail/Block/Skip, Next/Previous, Add notes dictation, Save notes, Voice off. `?` help popover with the full vocabulary.
- **Reports tab Tier 1–6**: KPIs with prior-period deltas, pass-rate trend chart, failure analysis (failing / flaky / regressed / failing suites), suite health, coverage, tester productivity, one-click PDF export
- **Status-check fix** (migration 107) — `aborted` and `archived` were missing from the DB CHECK on some environments
- **AI Assistant**: expanded system prompt covering every new capability plus data-export guidance and team-scoping guardrails
- **MCP server**: `createTestRun` now mirrors the dashboard (was silently creating empty runs); new folder + report tools
- **Sidebar**: Compliance hidden on prod builds while the feature iterates
- **Public docs + API reference** updated to match
Hide Compliance on prod + test-case AI examples in docs
fixdocscomplianceaitest-cases
Compliance is hidden from the nav on production builds (still visible in local dev while we iterate). Docs now include AI Assistant example prompts for folders, voice, analytics, and export.
## Two changes
### Compliance menu hidden on prod
The Compliance item in the main nav is now gated on `import.meta.env.DEV` — visible during local `astro dev`, hidden on Railway production builds (and any `astro build` preview). The underlying pages and API endpoints are unchanged, so direct-URL navigation still works for dev work; this is purely a nav-visibility guard while the feature keeps evolving.
### Docs: test-case AI examples
The public docs now reflect the expanded AI Assistant capabilities. Two updates:
1. **AI Assistant examples list** — swapped in the fuller prompt set: *"How's our pass rate trending?"*, *"What cases are flaky this month?"*, *"Create a Smoke/Auth folder and move my login cases into it"*, *"Walk me through running my Billing suite hands-free"*, *"Export the results of run X as CSV"*, *"Generate a QA stakeholder PDF for the last 30 days"*, and more.
2. **Test Cases feature section** — substantially updated:
- Folders + Reports tab added to Core Concepts
- New **Hands-Free Run Execution** subsection covering the carousel, P/F/B/S shortcuts, and the voice command vocabulary
- New **Ask the AI Assistant** subsection with the example prompts + scoping note (the assistant never pulls data from teams you aren't a member of)
- Workflow rewritten to reference folders → suites → carousel → Reports-tab → PDF
- Type enum corrected (dropped `e2e`/`accessibility`, added `exploratory` to match the DB CHECK)
AI assistant + marketing: voice test runs + data export help
featureaitest-casesvoicedocs
The AI Assistant now knows the whole test case surface — folders, sub-suite auto-expansion, the carousel, P/F/B/S shortcuts, voice control, the Reports-tab analytics, and how to help you export your own data. Marketing page updated to match.
## What's new
### AI Assistant
Significant rewrite of the TEST CASE MANAGEMENT section of the assistant's system prompt. The assistant now understands:
- **Three concepts, not one** — folders (organizational), suites (test plans, many-to-many), runs (executions). Users confused them constantly; the assistant now disambiguates and explains the difference when asked.
- **Valid type enum** synced to the DB CHECK — so it won't suggest `e2e` or `other` (DB rejects those). Steers users to `integration` or `functional` instead.
- **Folder commands** — list folders, create folders with optional parent nesting.
- **Sub-suite expansion warnings** — before creating a run from a parent suite, the assistant explicitly warns the user that every descendant sub-suite's cases will be included.
- **Carousel + keyboard + voice** — the assistant can coach a tester through running a suite with the carousel UI, P/F/B/S shortcuts, and the full voice vocabulary (Pass/Fail/Block/Skip/Next/Previous/Add notes/Save notes/Voice off). Knows Fail stays put for dictation.
- **Reports analytics** — five new `LIST_DATA` types (overview, failures, suite health, coverage, tester productivity). The assistant can now answer "how's our pass rate?" / "what's flaky?" / "what haven't we tested?" with real numbers.
- **Data export help** — CSV/JSON per run, the full Reports-tab PDF (guided verbally since it's a UI button), bulk catalog export via list at per_page=500, and a historical audit-trail pattern. Explicit guardrail: team scoping applies and cross-team requests are declined.
### Under the hood
The AI chat panel's `LIST_DATA` dispatcher was extended to route the new types (`test_case_folders`, `test_reports_*`) to the right endpoints. Single-object analytics responses are auto-wrapped so the renderer handles them uniformly. New `CREATE_TEST_CASE_FOLDER` marker lets the assistant spin up folders end-to-end.
### Marketing
The MCP feature page gained a hands-free-execution paragraph on the Test Cases card covering the carousel, P/F/B/S shortcuts, and the full voice command list (with browser-support note).
Test runs: ? help popover lists every voice command
featuretest-runsvoiceux
A small ? button next to the Voice toggle reveals a grouped popover listing all voice commands with examples. Hover or tab to open.
Added a **?** help button next to the Voice toggle on the test-run carousel. Hover it (or tab onto it) to get a grouped popover with every voice command:
- **Set result** — "Pass", "Fail", "Block", "Skip" (with the auto-advance quirks called out; Fail stays put so you can dictate)
- **Navigate** — "Next", "Previous" / "Back"
- **Dictate notes** — "Add notes" → speak → "Save notes"
- **Toggle** — "Voice off"
An example row at the bottom walks through a mixed flow: "Pass" → auto-advances → "Fail" → "Add notes, the login button did not respond, save notes" → "Next".
The inline voice-command line that used to appear below the card when voice was on has been removed — the popover is now the canonical home for voice discovery, accessible any time.
Test runs: P/F/B/S shortcuts + voice control
featuretest-runsaccessibilityvoice
Keyboard shortcuts now map to action letters (P for Pass, F for Fail, B for Block, S for Skip). New Voice button enables hands-free review with speech commands and dictated notes.
## Keyboard shortcuts
On the test-run carousel, the result shortcuts changed from `1/2/3/4` to **letters that match the actions**:
- **P** — Pass
- **F** — Fail
- **B** — Block
- **S** — Skip
Same behaviour: press the key, the matching button clicks, the carousel auto-advances (Fail stays put so you can dictate an Actual Result).
## Voice control
New **Voice** button in the carousel topbar between the dots and the Next arrow. Click it to go hands-free.
**Commands (idle mode):**
- "Pass" / "Fail" / "Block" / "Skip" — set the result on the current case
- "Next" — advance
- "Previous" / "Back" — go back
- "Add notes" — switch to transcription mode
- "Voice off" — disable voice entirely
**Notes mode:** every utterance is appended to the active card's notes field (leading-cap normalized, single-space joined). Exit with **"Save notes"** or **"Voice off"**.
Navigating cards mid-dictation commits the current transcription before jumping, so you don't lose words even though the textarea never blurred.
## UI states
- **Off** — muted mic icon, neutral.
- **Listening** — neon green, "Listening" label, pulsing green dot.
- **Notes…** — amber, "Notes…" label, pulsing amber dot (visible signal that speech is being transcribed, not commanded).
## Browser support
Uses the Web Speech API. Works in Chrome, Edge, and Safari. In Firefox (no support), the button disables itself with a clear tooltip and the keyboard shortcuts still work — nothing else regresses.
Test-run carousel: stuck spinner fix + arrows moved to topbar
fixtest-runsux
Clicking dots back and forth no longer leaves a stuck "Loading details…" spinner. Prev/Next arrows are now on the same row as "Case N of M", not a separate row below.
Two small tweaks from early feedback on the new carousel review flow:
- **Stuck loading spinner on revisit.** Clicking dots back and forth could leave "Loading details…" visible on top of already-loaded details. Fixed — the cache-hit path now hides the loader just like the fetch path does, and both re-query the element by id so a detached DOM reference can't leak through.
- **Arrows moved onto the topbar.** Prev / Next now sit on the same row as "Case N of M" and the dots. One line of eye-travel instead of having to look top AND bottom of the card.
Test runs: new carousel review flow with auto-advance
featuretest-runsux
Reviewing a run is now a one-at-a-time carousel with arrow/keyboard nav. Set a result, the carousel auto-advances. When every case has a result the run auto-completes and Export Results appears.
## What's new
Reviewing a test run used to mean scrolling through a stacked accordion and clicking each case header to expand it. The new flow is closer to how testers actually work: land on one case, read it, pick a result, slide to the next.
### Carousel
- **One case visible at a time** — all the details (steps, preconditions, fail section, notes) are always in view on the active card. No more click-to-expand.
- **Jumpbar** across the top: "Case N of M" plus one clickable dot per case, coloured by its current result status (green/red/orange/amber/grey). The current dot is enlarged with a neon outline. Click any dot to jump — great for going back to edit a specific case.
- **Prev / Next** buttons at the bottom. Disabled at the ends.
- **Keyboard shortcuts**: `←` `→` navigate · `1` Pass · `2` Fail · `3` Blocked · `4` Skip. Ignored while typing in text fields.
### Auto-advance
After setting **Pass**, **Blocked**, or **Skip** the carousel waits a beat so your click registers visually, then slides to the next **untested** case. If everything forward has a result it wraps backward to find an untested one. If the whole run is done, it stays put and the Run-completed notice appears below.
**Fail** does NOT auto-advance — you almost certainly want to type an Actual Result and spawn a bug report before moving on.
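The auto-advance rule can be sketched as a small pure function. Two assumptions are baked in: the result names are hypothetical, and "wraps backward" is read here as wrapping around to the start of the run and scanning up to the current card.

```typescript
type Result = "untested" | "pass" | "fail" | "blocked" | "skip";

// Given the results after a click and the index of the card just updated,
// return the index to slide to, or null to stay put.
function nextIndexAfterResult(results: Result[], current: number): number | null {
  if (results[current] === "fail") return null; // Fail never auto-advances

  // First look forward for the next untested case.
  for (let i = current + 1; i < results.length; i++) {
    if (results[i] === "untested") return i;
  }
  // Nothing ahead: wrap around and look at the cases before the current one.
  for (let i = 0; i < current; i++) {
    if (results[i] === "untested") return i;
  }
  return null; // every case has a result, so the run is done
}

console.log(nextIndexAfterResult(["pass", "untested", "pass", "untested"], 2));
```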
### Completion
The manual **Complete Run** button is gone. Three coordinated visual signals all key off `run.status === 'completed'`:
- Green **"Run completed"** card below the active carousel card, showing final pass/fail/blocked/skipped counts.
- "Run completed" pill in the summary panel.
- **Export Results** (CSV / JSON) only appears once the run is actually completed. No point downloading an in-progress run for stakeholder distribution.
The server already auto-completes when untested hits 0; the frontend now lights up all three visuals in the same tick.
### Pick up where you left off
On page load the carousel lands on the **first untested** case, not always #1. Returning to a half-finished run drops you right where you stopped.
Reports: removed search box, date window drives everything
changetest-reportsux
Reports is now strictly a windowed-analytics view. Pick a date range — every panel refetches, including the PDF export. Dates are inclusive on both ends.
## What changed
The search-by-run-name input on the Reports tab is gone. It was a hangover from when this tab was just a filtered list of runs — Reports is fundamentally a windowed-analytics view, not a run finder. (Want to search for a specific run? Use the Runs tab.)
The date inputs now drive the entire tab: **KPIs · trend · failure analysis · suite health · coverage · tester productivity · the recent-runs table · the PDF export** all share the same `[from, to]` window. Change a date — every section refetches. Hit **Export PDF** — the PDF mirrors exactly what's on screen.
## Behavior
- Dates are **inclusive on both ends**. Picking `2026-04-15 → 2026-04-15` returns every run completed anywhere within the UTC calendar day 2026-04-15.
- The "Clear" link clears the date range and resets to the default trailing 30 days.
- Previously the run-list endpoint used local-time boundaries while the analytics endpoints used UTC — up to ~8h of drift meant the table and the KPI strip could disagree on which runs belonged to the same window. Unified on UTC-inclusive across every endpoint.
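One common way to implement a UTC-inclusive date window is to turn the two calendar dates into a half-open timestamp range. The sketch below shows that approach with hypothetical helper names; it is not necessarily how the endpoints do it.

```typescript
interface UtcWindow { start: string; endExclusive: string }

// Convert inclusive YYYY-MM-DD dates into the half-open UTC range
// [start of `from`, start of the day after `to`).
function utcWindow(from: string, to: string): UtcWindow {
  const start = new Date(`${from}T00:00:00.000Z`);
  const end = new Date(`${to}T00:00:00.000Z`);
  end.setUTCDate(end.getUTCDate() + 1);
  return { start: start.toISOString(), endExclusive: end.toISOString() };
}

// True when an ISO timestamp falls inside the window.
function inWindow(completedAt: string, w: UtcWindow): boolean {
  const t = Date.parse(completedAt);
  return t >= Date.parse(w.start) && t < Date.parse(w.endExclusive);
}

const day = utcWindow("2026-04-15", "2026-04-15");
console.log(day.start, day.endExclusive);
```

Using an exclusive upper bound avoids having to pick a "last millisecond of the day" and keeps every endpoint on the same boundaries.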
Reports: fix search box returning every run + wire pagination
fixtest-reportssearch
Typing in the Reports-tab search did nothing — the backend ignored the param. Same for pagination (stuck at page 1). Both are now honored.
## The bug
Typing anything in the Reports-tab search box still listed every run. The frontend was building `?search=...` into the URL correctly, but `/api/test-reports` never read it — no `ilike`, no filter at all. Pagination had the same pattern: the UI built `?page=...&per_page=...`, the endpoint ignored them, and the 'Page N of M' math clamped itself to 1 page no matter how many runs matched.
## Fix
- **Search** now filters by run name OR description (`ilike %q%`). User input is sanitized against PostgREST-reserved characters (`%`, comma, parens) so something like `run(1), v2%` can't break the query parser or match unintended rows.
- **Pagination** honors `page` and `per_page` (capped at 100/page) with a proper range slice. The response carries the full-match total (via PostgREST's `count: exact`) so 'Page N of M' now reflects reality.
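The two fixes can be illustrated with a pair of helpers. This is a sketch under assumptions: it strips the reserved characters rather than escaping them, the function names are hypothetical, and the range uses inclusive zero-based bounds as a supabase-js style `range(from, to)` call would expect.

```typescript
// Remove characters that PostgREST filter syntax treats specially
// (%, comma, parentheses) before building an ilike pattern.
function sanitizeSearch(input: string): string {
  return input.replace(/[%,()]/g, "").trim();
}

const MAX_PER_PAGE = 100;

// Turn page / per_page into an inclusive, zero-based row range.
function pageRange(page: number, perPage: number): { from: number; to: number } {
  const size = Math.min(Math.max(1, Math.floor(perPage) || 1), MAX_PER_PAGE);
  const current = Math.max(1, Math.floor(page) || 1);
  const from = (current - 1) * size;
  return { from, to: from + size - 1 };
}

// "Page N of M" needs the full-match total, not the size of the current slice.
function totalPages(total: number, perPage: number): number {
  const size = Math.min(Math.max(1, Math.floor(perPage) || 1), MAX_PER_PAGE);
  return Math.max(1, Math.ceil(total / size));
}

console.log(sanitizeSearch("run(1), v2%"), pageRange(2, 25), totalPages(230, 25));
```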
Agent-created test runs now contain the right cases (including sub-suite descendants). New MCP tools for folders, report KPIs, and failure analysis. Marketing + API docs updated to match.
## The real bug
MCP's `create_test_run` was inserting a `test_runs` row but NO `test_run_results` — the run appeared empty on the dashboard. It also used `status="in_progress"` where the dashboard uses `"pending"`, and had no awareness of the sub-suite expansion that's been in the web UI since the Tier 5 release.
## Fix
`create_test_run` now mirrors `POST /api/test-runs` exactly:
- Walks `get_suite_descendants` to collect the parent + every nested sub-suite.
- Fetches every linked case in that subtree, ordered by `sort_order` so your drag-drop order survives.
- Parent-first dedupe: a case linked to BOTH the parent and a sub-suite is enqueued exactly once, attributed to the parent.
- Inserts the run with `status="pending"` + prefilled `results_summary`, then bulk-inserts `test_run_results` with the originating `suite_id` per row so result pages can group by origin.
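The parent-first dedupe can be sketched as below. The link and result shapes and the `expandRunCases` name are hypothetical; the sketch assumes the suite list arrives parent first, followed by descendants in walk order.

```typescript
// Hypothetical shapes for suite-to-case links and queued result rows.
interface SuiteCaseLink { suite_id: string; case_id: string; sort_order: number }
interface QueuedResult { case_id: string; suite_id: string }

// suiteIds: parent first, then every descendant sub-suite.
function expandRunCases(suiteIds: string[], links: SuiteCaseLink[]): QueuedResult[] {
  const seen = new Set<string>();
  const queued: QueuedResult[] = [];
  for (const suiteId of suiteIds) {
    const suiteLinks = links
      .filter((link) => link.suite_id === suiteId)
      .sort((a, b) => a.sort_order - b.sort_order); // keep drag-drop order
    for (const link of suiteLinks) {
      if (seen.has(link.case_id)) continue; // already attributed to an earlier suite
      seen.add(link.case_id);
      queued.push({ case_id: link.case_id, suite_id: suiteId });
    }
  }
  return queued;
}

const rows = expandRunCases(["parent", "child"], [
  { suite_id: "child", case_id: "c1", sort_order: 0 },
  { suite_id: "parent", case_id: "c1", sort_order: 1 },
  { suite_id: "parent", case_id: "c2", sort_order: 0 },
]);
console.log(rows);
```

Because the parent is processed first, a case linked to both the parent and a sub-suite is queued once and carries the parent's `suite_id`.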
## New MCP tools
- **`list_test_case_folders`** / **`create_test_case_folder`** — folders are the one-per-case hierarchy (distinct from suites, which are M:M test plans). Nestable up to 3 levels.
- **`get_test_reports_overview`** — headline KPIs (pass rate, runs completed, cases executed) with deltas vs the prior equivalent-length window.
- **`get_test_reports_failures`** — failing / flaky / regressed / failing-suites, same rules as the Reports tab failure-analysis grid.
All tools call Supabase directly — no dashboard HTTP hop, same latency as the web UI.
## Marketing + docs
- `mcp.astro` restructured the Test Cases feature card into three labelled sub-sections (Cases & Folders · Suites & Runs · Reports) with a rewritten example workflow.
- `api-reference.astro` gained four previously-undocumented endpoints: `POST /test-cases/bulk`, `GET /test-cases/:id/links`, `GET /test-cases/review-candidates`, plus cross-referencing to the existing folder and report endpoints.
- Corrected the `type` enum on MCP and in the docs — `e2e` / `accessibility` / `other` are NOT DB-valid; the DB CHECK allows `exploratory` instead.
Reports PDF: fix WinAnsi encoding error on generate
fixtest-reportspdf
Clicking Export PDF threw "WinAnsi cannot encode ▲". Swapped unicode arrows and comparison operators for WinAnsi-safe alternatives. The on-screen dashboard keeps the triangular arrows — only the PDF was affected.
## The bug
Clicking **Export PDF** on the Reports tab surfaced:
```
WinAnsi cannot encode "▲" (0x25b2)
```
pdf-lib's `StandardFonts.Helvetica` uses WinAnsi (CP1252) encoding, which covers standard Latin-1 and a handful of common punctuation glyphs, but not the triangular delta arrows (U+25B2 / U+25BC), the rightwards arrow (U+2192), the pass↔fail left-right arrow (U+2194), or the ≤/≥ comparison operators (U+2264 / U+2265). `save()` throws on any unencodable char.
## Fix
Replaced four unsafe literals with WinAnsi-safe alternatives:
- KPI delta chips: `▲ 4%` / `▼ 2` / `→ 0` → `+4%` / `-2` / `0`
- Coverage bucket: `≤7d` → `<=7d`
- `≥50% fail rate` → `50%+ fail rate`
- `pass↔fail flips` → `pass/fail flips`
The on-screen dashboard keeps the triangular arrows — browsers render them fine; only the PDF was affected.
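For illustration, the same substitutions could be expressed as one sanitizer applied to every string before it reaches the PDF. This is a hypothetical helper (`toWinAnsiSafe` is not a name from the codebase); the fix described above swapped the literals where they were written rather than adding a function like this.

```typescript
/**
 * Replace the glyphs named in this entry with WinAnsi-safe text.
 * Hypothetical helper: it only covers the characters listed above and
 * does not guarantee that arbitrary input is encodable.
 */
function toWinAnsiSafe(text: string): string {
  return text
    .replace(/\u25B2\s*/g, "+")      // "▲ 4%" becomes "+4%"
    .replace(/\u25BC\s*/g, "-")      // "▼ 2" becomes "-2"
    .replace(/\u2192\s*/g, "")       // "→ 0" becomes "0"
    .replace(/\u2265(\d+%)/g, "$1+") // "≥50%" becomes "50%+"
    .replace(/\u2264/g, "<=")        // "≤7d" becomes "<=7d"
    .replace(/\u2194/g, "/");        // "pass↔fail" becomes "pass/fail"
}
```

Only the PDF path would call it, so the on-screen dashboard strings stay untouched.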
Reports: Tier 6 — one-click PDF export
featuretest-reportspdfexport
A new "Export PDF" button on the Reports tab downloads a 3-page brand-styled QA report. Same date range, project, and suite filter as what's on screen — share-ready in one click.
## What's new
An **Export PDF** button now sits in the Reports tab toolbar (next to the date filters). One click downloads a 3-page QA report styled in the bugAgent palette, suitable for stakeholder distribution — managers, auditors, anyone who wants the digest without logging in.
## Page contents
**Page 1 — At a glance**
- 4 KPI tiles (Pass rate · Runs · Avg duration · Cases executed) with deltas vs the prior equivalent-length window
- Pass-rate trend line chart (weekly buckets, current period)
- Coverage summary: KPIs + horizontal staleness distribution bar
**Page 2 — What to fix**
- Top 5 failing cases (≥50% fail rate, min 3 runs)
- Top 5 flaky cases (most pass↔fail flips)
- Top 5 recently regressed cases (passed before, failing now)
**Page 3 — By suite + by tester**
- Top 12 suites by health (worst pass rate first)
- Top 10 testers by cases run
## Filters carry through
The PDF reflects the same date range, project, and suite filter that's active on the dashboard — no separate scope picker. Filename includes the date range so saved files stay self-describing in a Drive folder: `bugagent-qa-report-2026-03-21_2026-04-20.pdf`.
## API
New endpoint: `GET /api/test-reports/export.pdf` — same `from`/`to`/`project`/`suite` query params as the JSON analytics endpoints. Documented at `/api-reference`.
Reports: Tier 5 — tester productivity
featuretest-reportsanalyticsteam
Per-tester rollup with cases run, pass rate, avg time per case, runs assigned + completed, bugs filed via runs, and a "backlog" flag for testers with assigned-but-not-executed work.
## What's new
A new **Tester productivity** section sits between Coverage and Archive candidates. It's a per-team-member leaderboard for capacity planning and bottleneck detection — not a quality scoreboard.
### Workspace totals
Three counters across the top: **Active testers** (≥1 execution in the period), **Cases executed**, and **Bugs filed via runs**.
### Per-tester table
One row per tester with:
- Initials avatar coloured by a hash of the user-id, so each tester keeps the same colour across visits — visual continuity matters when scanning a multi-week retrospective.
- **Cases-run distribution bar** — width relative to the busiest tester (leader fills 100%) for at-a-glance ranking.
- **Cases**, **Pass rate**, **Avg time/case**, **Runs assigned (X done)**, **Bugs filed**, **Last active**.
### Bottleneck flag
Testers with **≥3 assigned runs and ≤25% completed** get a `backlog` badge plus an amber side stripe and faint background tint. Stands out without screaming. This is the row a team lead opens this section to find.
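The flag rule reduces to a small predicate. A minimal sketch, assuming the two counts are available per tester; the function name is hypothetical.

```typescript
/**
 * True when a tester should get the `backlog` badge:
 * at least 3 assigned runs and at most 25% of them completed.
 */
function hasBacklog(assignedRuns: number, completedRuns: number): boolean {
  if (assignedRuns < 3) return false; // too few runs to call it a backlog
  return completedRuns / assignedRuns <= 0.25;
}
```

With 8 assigned and 2 completed the ratio is exactly 25%, so the tester is flagged; with 3 assigned and 1 completed (about 33%) they are not.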
## Important caveats
- **Pass rate per tester is informational only.** It mostly reflects which cases that tester was assigned (smoke on a stable suite vs hammering a fresh release). The UI surfaces a footnote to make this explicit.
- **bugs_filed counts only bugs found via test runs** — bug reports linked from `test_run_results.bug_report_id`. Standalone bugs filed from the Bugs tab, MCP, or browser extension don't count, because this metric is about *testing* output, not bug filing in general.
- **avg_duration** caps per-case duration at 24h to defend against forgotten timers wrecking the average.
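The 24h cap can be pictured as a clamp applied before averaging. A sketch under assumptions: durations are taken to be in milliseconds, and over-long values are clamped rather than dropped; neither detail is confirmed by the entry above, and the names are hypothetical.

```typescript
const MAX_CASE_DURATION_MS = 24 * 60 * 60 * 1000; // 24 hours

/**
 * Average per-case duration with each value clamped to 24h, so one
 * forgotten timer cannot dominate the result. Returns null for no data.
 */
function averageDurationMs(durationsMs: number[]): number | null {
  if (durationsMs.length === 0) return null;
  const total = durationsMs.reduce(
    (sum, d) => sum + Math.min(d, MAX_CASE_DURATION_MS),
    0,
  );
  return total / durationsMs.length;
}
```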
## API
New endpoint: `GET /api/test-reports/tester-productivity` — same `from`/`to`/`project`/`suite` query params as the other Tier 1+2+3+4 endpoints. Documented at `/api-reference`.
Reports: Tier 3 — coverage report
featuretest-reportsanalyticscoverage
A new Coverage section answers "what have I NOT been testing?" — workspace coverage %, untouched/never-run counts, a 6-bucket staleness distribution bar, and a per-suite coverage rollup sorted worst-first.
## What's new
A new **Coverage** section sits between Suite health and Archive candidates. It answers a question the prior tiers couldn't: *what have I NOT been testing?*
### KPI strip
Four cards — **Coverage %** (cases run in the period / total active cases), **Untouched in period**, **Never run** (lifetime, not period-scoped), and **Active cases** total.
### Staleness distribution
A single horizontal bar segmented into six buckets, each tinted green→red as cases get older:
- Last 7 days · 8–30 days ago · 31–90 days ago · 91–180 days ago · Older than 180 days · Never run
Segments are proportional to case counts and hover-tooltipped. A legend underneath always shows every bucket so even tiny segments stay discoverable.
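The six buckets map onto a simple classifier. A minimal sketch: the bucket keys and the function name are hypothetical, and the input is assumed to be whole days since the last execution, with `null` meaning the case has never run.

```typescript
type StalenessBucket =
  | "last_7_days"
  | "8_30_days"
  | "31_90_days"
  | "91_180_days"
  | "older_than_180_days"
  | "never_run";

/** Classify a case by days since its last execution (null = never run). */
function stalenessBucket(daysSinceLastRun: number | null): StalenessBucket {
  if (daysSinceLastRun === null) return "never_run";
  if (daysSinceLastRun <= 7) return "last_7_days";
  if (daysSinceLastRun <= 30) return "8_30_days";
  if (daysSinceLastRun <= 90) return "31_90_days";
  if (daysSinceLastRun <= 180) return "91_180_days";
  return "older_than_180_days";
}
```

Counting cases per bucket then gives the segment widths for the bar.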
### Coverage by suite
Per-suite rollup using cases linked via `test_suite_cases` (the test-plan relationship). Sorted **worst-coverage first** so the under-tested area is row 1. Suite name links to the suite's detail page — the report doubles as navigation.
## Definitions
- **Active** = `test_cases.status = 'active'`. Drafts (work in progress) and deprecated (intentionally retired) are out of scope.
- **Covered in period** = at least one execution within the selected window (status not `untested`).
- **Never run** is **lifetime**, not period-scoped — catalog hygiene means truly never-run, not "missed this sprint".
## API
New endpoint: `GET /api/test-reports/coverage` — same `from`/`to`/`project`/`suite` query params as the other Tier 1+2+4 endpoints. Documented at `/api-reference`.
Reports: Tier 2 — suite health table
featuretest-reportsanalyticssuites
A sortable per-suite rollup with pass rate, trend arrow, runs, last activity, and open bug count. Worst-pass-rate suites bubble to the top so the most-broken area is the first thing you see.
## What's new
A new "Suite health" section sits between the failure analysis grid and the run list. One row per suite that had activity in the period, showing the metrics a QA lead actually needs to spot "where's the rot?":
- **Suite** (with depth indent so nested suites read as sub-rows)
- **Pass rate** (coloured: ≥80% green, ≥50% amber, else red)
- **Trend** ▲ ▼ → vs the prior equivalent-length window (±3% threshold; new suites get a "new" badge)
- **Runs** completed in the period
- **Cases run**
- **Last run** (relative time)
- **Open bugs** — bug reports linked from any result, status not closed/resolved
Default sort is **worst pass rate first**, so the row at the top is the one you're here to find. Idle suites (zero runs in the current period) are filtered out so the table stays focused on what actually moved. The suite name is a link to the suite's detail page, so the report doubles as navigation.
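The colour bands and trend arrow can be sketched as two small functions. Assumptions here: pass rates are percentages from 0 to 100, the ±3% threshold is measured in percentage points and a change of exactly 3 counts as flat, and a suite with no prior-window data gets the "new" badge. The names are hypothetical.

```typescript
type RateColour = "green" | "amber" | "red";
type Trend = "up" | "down" | "flat" | "new";

/** Colour band for a pass rate expressed as 0..100. */
function passRateColour(passRatePct: number): RateColour {
  if (passRatePct >= 80) return "green";
  if (passRatePct >= 50) return "amber";
  return "red";
}

/**
 * Trend vs the prior equivalent-length window.
 * `priorPct` is null when the suite had no runs in that window.
 */
function trend(currentPct: number, priorPct: number | null, thresholdPct = 3): Trend {
  if (priorPct === null) return "new";
  const delta = currentPct - priorPct;
  if (delta > thresholdPct) return "up";
  if (delta < -thresholdPct) return "down";
  return "flat"; // inside the threshold band: treated as noise
}
```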
## API
New endpoint: `GET /api/test-reports/suite-health` — same `from`/`to`/`project`/`suite` query params as the other Tier 1+4 endpoints. Suite attribution uses the per-result `suite_id` for an exact rollup. Documented at `/api-reference`.
Reports: KPIs, pass-rate trend, and failure analysis
featuretest-reportsanalytics
The Reports tab is no longer just the Runs tab with extra columns. New: 4 KPI cards with deltas vs prior period, a weekly pass-rate trend chart, and a failure-analysis grid showing what to fix this week.
## What's new
The Reports tab gets a real overview, not a list. Two new sections sit above the run table:
### Tier 1 — KPI strip + pass-rate trend
Four cards across the top: **Pass rate · Runs completed · Avg run duration · Cases executed**. Each shows the current value plus a coloured delta arrow vs the prior equivalent-length window (▲ 4% / ▼ 2 / →). Below them, an inline-SVG line chart plots pass rate by week — empty weeks are zero-filled so the line is continuous instead of compressing sparse activity.
### Tier 4 — Failure analysis
Four parallel "what to fix this week?" panels:
- **Top failing cases** — pass rate < 50% with at least 3 runs (noise filter)
- **Flaky cases** — most pass↔fail flips in the period (per-case ordered history; suppresses one-off noise)
- **Failing suites** — same logic at suite level via the per-result `suite_id` added in migration 106 (parent-suite runs now include sub-suites)
- **Recently regressed** — most recent execution failed but there's an earlier pass in the period (a real regression, not a perpetually broken test)
Each row is clickable — the report doubles as a navigation surface.
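The flaky and regressed rules both work on a case's ordered execution history within the period. A minimal sketch, assuming the history is already sorted oldest to newest and reduced to pass/fail outcomes; the names are hypothetical and the real endpoint may treat other statuses differently.

```typescript
type Outcome = "passed" | "failed";

/** Number of pass/fail transitions in an ordered history. */
function countFlips(history: Outcome[]): number {
  let flips = 0;
  for (let i = 1; i < history.length; i++) {
    if (history[i] !== history[i - 1]) flips++;
  }
  return flips;
}

/**
 * Recently regressed: the latest execution failed and at least one
 * earlier execution in the period passed.
 */
function isRecentlyRegressed(history: Outcome[]): boolean {
  if (history.length === 0) return false;
  const latest = history[history.length - 1];
  const earlier = history.slice(0, -1);
  return latest === "failed" && earlier.some((outcome) => outcome === "passed");
}
```

A case that has failed on every run has zero flips and is not "regressed", which keeps perpetually broken tests out of both panels.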
## Why
A list answers "what runs have I done?" A report answers "is quality improving?", "what to fix?", and "where's the rot?" The Reports tab now answers all three.
## API
Two new endpoints power this:
- `GET /api/test-reports/overview` — KPIs (with prior-period delta) + weekly trend buckets
- `GET /api/test-reports/failures` — flaky_cases / failing_cases / failing_suites / regressed_cases
Both accept `from`, `to`, `project`, and `suite` (suite expands to descendants). Documented at `/api-reference`.
Test cases: removed redundant Done button on case detail
fixtest-casesux
The case detail page already auto-saves on every edit, so the green Done button at the bottom was misleading — and as a bonus footgun it always sent you to the generic listing instead of using the new context-aware Back link.
The "Done" button on the test case detail page suggested a confirm-and-leave action, but every field already auto-saves on blur/edit — there was nothing to "confirm." Worse, it always navigated to `/dashboard/test-cases`, ignoring the context-aware Back link in the topbar that returns you to the suite (or wherever) you came from.
Gone. Use the topbar Back link to return where you came from.
Test cases: breadcrumb returns you to the suite you came from
fixtest-casesuxsuites
Editing a case from a suite now returns you to that suite via the breadcrumb — not the generic test-cases listing. Label updates to "Back to Suite" too.
## What changed
If you were on a suite detail page and clicked into one of its cases to make edits, the case page's "← Back to Test Cases" link dumped you on the generic `/dashboard/test-cases` listing — losing your suite context. To keep editing other cases in the same suite you had to navigate back manually.
Now the suite detail page passes a `?back=<suite URL>` query parameter when linking out to a case, and the case detail page reads it to:
- Set the back link to the suite (not the listing)
- Change the label from "Back to Test Cases" to "Back to Suite"
If the param isn't present (e.g., you opened the case from the Cases tab table), the link defaults to `/dashboard/test-cases` exactly as before.
## Safety
The `back` parameter is validated strictly to prevent open-redirect or URL-injection abuse: must start with `/dashboard/`, no `..` (path traversal), no `\`, no control chars, length-capped at 200. Anything off-spec falls back to the default.
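Those rules translate into a short allow-list check. A sketch, not the shipped validator: `resolveBackLink` and `DEFAULT_BACK` are hypothetical names, and "control chars" is assumed to mean U+0000 to U+001F plus U+007F.

```typescript
const DEFAULT_BACK = "/dashboard/test-cases";

/**
 * Return the `back` query value if it passes every rule listed above,
 * otherwise the default listing URL.
 */
function resolveBackLink(raw: string | null): string {
  if (raw === null) return DEFAULT_BACK;
  const isSafe =
    raw.length <= 200 &&                    // length cap
    raw.startsWith("/dashboard/") &&        // same-site dashboard path only
    !raw.includes("..") &&                  // no path traversal
    !raw.includes("\\") &&                  // no backslashes
    !/[\u0000-\u001f\u007f]/.test(raw);     // no control characters
  return isSafe ? raw : DEFAULT_BACK;
}
```

Requiring the `/dashboard/` prefix is what rules out absolute URLs and protocol-relative `//host` values; the other checks cover ways of escaping that prefix.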
Suites: Add Cases modal now shows the folder tree
featuretest-casessuitesux
Adding cases to a suite no longer means scrolling a flat wall of names. The modal mirrors the Cases-tab folder structure with collapsible groups and per-folder counts.
## What changed
The **Add Test Cases** modal on a suite detail page used to render every case in the workspace as one undifferentiated list. With the folder tree organizing the catalog, that meant you'd open the modal looking for a smoke test in `Auth / Login` and end up scrolling past 80 cases from unrelated areas.
The modal now mirrors the Cases-tab sidebar layout:
- Folders render as **collapsible group headers** with a count of available cases
- Cases are **nested inside their folder**, indented per depth
- An **Unfiled** pseudo-section at the top holds cases with no folder
- Folders with zero available cases (already in the suite, or empty) are hidden so the modal stays tight
- Sibling order matches the sidebar — `sort_order` first, name as tiebreaker
## Search
Typing in the search box still filters by name, but now any folder containing matches **auto-expands** so hits never hide inside a collapsed branch. Clearing the search restores your collapse state — typing doesn't reset what you closed.
## Plumbing
The modal now fetches `/api/test-cases` and `/api/test-case-folders` in parallel and bumps `per_page` from 200 to 500 (partial-page truncation would split a folder's cases between "shown" and "hidden" with no good UX answer).
AI Generate: stop dropping cases silently + respect active folder
fixtest-casesaischema
Approve 3, get 3. AI-generated cases now also land in the folder you're currently viewing — same behavior as the manual "Add test case" button.
## Two bugs that combined into one mystery
### "Approved 3, only 2 created"
The DB CHECK on `test_cases.type` allows `functional, regression, smoke, integration, performance, security, usability, exploratory`. The API's whitelist had `e2e` and `other` (which the DB rejects) and was missing `exploratory`. Worse, the AI generate prompt explicitly told the model "type must be one of …, e2e, …", so the AI happily produced `e2e` cases. The API passed them through as valid, the DB rejected with a CHECK violation, the per-case POST returned 500, and the frontend's `if (r.ok) created++` quietly skipped the failures. The user saw "Created 2 test cases" with no clue the third was lost.
Three coordinated fixes:
- API whitelist now matches the DB exactly (drop `e2e`/`other`, add `exploratory`).
- AI generate prompt drops `e2e` and tells the model to use `integration` for end-to-end flows.
- Server-side coercion maps known synonyms (`e2e` → `integration`, `other` → `functional`, etc.) and falls back to `functional` for anything still unknown — defends against the model ignoring the rules.
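The coercion step might look roughly like this. A sketch under assumptions: only the two synonyms named above are included (the rest of the real map is not listed here), input is trimmed and lower-cased before matching, and the names are hypothetical.

```typescript
// The eight values the DB CHECK on test_cases.type allows.
const VALID_TYPES = [
  "functional", "regression", "smoke", "integration",
  "performance", "security", "usability", "exploratory",
] as const;
type CaseType = (typeof VALID_TYPES)[number];

// Known synonyms from this entry; the production map may hold more.
const SYNONYMS = new Map<string, CaseType>([
  ["e2e", "integration"],
  ["other", "functional"],
]);

/** Map any incoming type to a DB-valid value, defaulting to "functional". */
function coerceCaseType(raw: unknown): CaseType {
  if (typeof raw !== "string") return "functional";
  const key = raw.trim().toLowerCase();
  const valid = VALID_TYPES.find((t) => t === key);
  if (valid !== undefined) return valid;
  return SYNONYMS.get(key) ?? "functional";
}
```

Because the function always returns a member of `VALID_TYPES`, a model that ignores the prompt can no longer trigger the CHECK violation.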
### "AI cases ignore my current folder"
The manual "Add test case" button has always passed the active sidebar folder so new cases land where you're looking. AI generate didn't — generated cases always went to "All cases" regardless of context. Now AI generate inherits the same folder context: viewing "Smoke / Auth" creates the cases inside "Smoke / Auth"; viewing "All cases" leaves them unfiled.
### Bonus cleanups in the same flow
- Parallel POSTs via `Promise.allSettled` instead of a sequential for-loop, so one failure can't poison the rest of the batch and the batch is faster.
- The result alert names every failing case + reason: "Created 2 of 3. Failed: • <name>: <reason>"
- "Create selected" button disables itself during the batch so a double-click can't fire the batch twice and double-create cases.
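The batch flow above can be sketched with `Promise.allSettled` plus a formatter for the result alert. The `DraftCase` shape, the `create` callback, and the all-success wording are assumptions for illustration; only the "Created 2 of 3. Failed: …" format comes from this entry.

```typescript
interface DraftCase {
  name: string;
}

// Hypothetical: performs the per-case POST and rejects with an Error on failure.
type CreateFn = (draft: DraftCase) => Promise<void>;

/** Fire all creates in parallel; one rejection never cancels the others. */
async function createAll(drafts: DraftCase[], create: CreateFn): Promise<string> {
  const settled = await Promise.allSettled(drafts.map((draft) => create(draft)));
  return summarize(drafts, settled);
}

/** Build the alert text, naming every failed case and its reason. */
function summarize(drafts: DraftCase[], settled: PromiseSettledResult<void>[]): string {
  const failures: string[] = [];
  settled.forEach((result, i) => {
    if (result.status === "rejected") {
      const reason =
        result.reason instanceof Error ? result.reason.message : String(result.reason);
      failures.push(`• ${drafts[i]?.name ?? "(unknown)"}: ${reason}`);
    }
  });
  const created = settled.length - failures.length;
  if (failures.length === 0) return `Created ${created} test cases`;
  return `Created ${created} of ${settled.length}. Failed: ${failures.join(" ")}`;
}
```

Counting from the settled results, rather than incrementing on `r.ok`, is what makes a dropped case impossible to miss.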
Test cases: folder reorder in the sidebar is now visible and easy to hit
fixtest-casesuxdrag-drop
Drag-and-drop folder reorder on the Cases sidebar now shows clear visual feedback during the drag, has wider drop zones, and a grab cursor. Order persists per team — same as before, just usable now.
## What's different
Folder drag-and-drop reorder on the Cases-tab sidebar has been wired end-to-end since the drag-and-drop folder reordering release, but two issues made it look broken:
- **No visual feedback during drag.** The drop handler was setting `drop-into` / `drop-before` / `drop-after` classes on rows, but the only matching CSS was a leftover `.drop-target` rule. So users dragged with no green wash, no insertion line — just guesswork about whether the drop registered.
- **Drop zones too narrow.** The 25/50/25 split copied from the Suites-tab tree gave only ~7-8px to land "before" or "after" on a 28-32px sidebar row.
## Fix
- Added proper drop-zone styles for the sidebar: green wash for "into" (reparent), green top/bottom 3px line for "before/after" (reorder among siblings).
- Widened the zones to equal thirds (33/34/33) so each is clearly hittable.
- Added a grab/grabbing cursor on draggable rows so it's obvious you can pick up a folder before you commit.
## Persistence
Unchanged — this was already correct. The reorder PATCH writes `sort_order` on the team's rows and every team member sees the same order on their next load. This release just makes the existing wiring visible during the drag.
Test runs: Runs tab defaults to All Users + persists the filter
fixtest-runsux
The user filter on the Runs tab now defaults to "All Users" and remembers every choice — including an explicit "All Users" pick — across reloads.
## The two bugs, back-to-back
The user filter on the Runs tab used to land on your own name on first visit (because the option for the current user carried `selected`), and selecting "All Users" didn't survive a reload (because the localStorage restore used a truthy check that skipped the empty string, letting the HTML default reassert itself).
## Fix
- Default is now genuinely **All Users** — no more `selected={m.id === user.id}` on the option list.
- Restore uses a key-presence check instead of truthy so `""` round-trips. Your last pick — your name, another teammate, or All Users — is what you see on reload. Same treatment for the search and status filters for consistency.
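The difference between the two restore checks is easiest to see side by side. A sketch with a minimal storage interface so it runs outside a browser; in the page the store would be `localStorage`, and the function names and the `htmlDefault` parameter are hypothetical.

```typescript
// Minimal subset of the Web Storage API used here.
interface KeyValueStore {
  getItem(key: string): string | null;
}

/** Old behaviour: a saved "" (All Users) is falsy, so the default wins. */
function restoreTruthy(store: KeyValueStore, key: string, htmlDefault: string): string {
  const saved = store.getItem(key);
  return saved ? saved : htmlDefault;
}

/** Fixed behaviour: only a missing key (null) falls back to the default. */
function restoreByPresence(store: KeyValueStore, key: string, htmlDefault: string): string {
  const saved = store.getItem(key);
  return saved !== null ? saved : htmlDefault;
}
```

`getItem` returns `null` only when the key is absent, so comparing against `null` lets the empty string round-trip.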
Test runs: fix "Failed to archive" CHECK constraint error
fixtest-runsschema
Archiving a test run now works everywhere. The CHECK constraint on test_runs.status was missing 'archived' and 'aborted' on environments that hadn't been hand-patched.
## The bug
Trying to archive a test run on some environments surfaced a raw Postgres error:
```
Failed to archive run: Failed to update: new row for relation "test_runs" violates check constraint "test_runs_status_check"
```
## Why it happened
The original `044_test_cases` migration locked `test_runs.status` to one of `pending`, `in_progress`, `completed`, `cancelled`. The app later grew two more statuses — `aborted` (stopped-early runs) and `archived` (hidden from default views) — but no migration was ever added for them. Production had been hand-patched via the Supabase SQL editor months ago, but local Supabase and any fresh environment still enforced the old narrow whitelist, so archiving there bounced off the CHECK.
## Fix (migration 107)
Drop and recreate `test_runs_status_check` with the full set:
```sql
CHECK (status IN ('pending', 'in_progress', 'completed', 'cancelled', 'aborted', 'archived'))
```
Idempotent — on prod the definition matches the hand-patched version, so it's a no-op there. `cancelled` stays in the whitelist so any historical rows still using it don't retroactively break. Applied to both local and prod.
Test runs: clearer error when archive fails
fixtest-runsux
The "Failed to archive" alert now shows the actual reason (team mismatch, network, DB error) instead of a dead-end message. Archive still works the same when it succeeds.
## What changed
Archiving a test run that failed for any reason used to show "Failed to archive" with no hint at why. Both the Runs-tab list and the run detail page now:
- Show the server's actual error message in the popup: "Failed to archive run: Run not found in this team (it may have been deleted or belong to a different team)" etc.
- Catch network errors separately and label them as such: "Failed to archive run: network error (...)"
- Log the full context to the browser console for in-browser debugging.
## Server side
`PATCH /api/test-runs/:id` now uses `.maybeSingle()` instead of `.single()` so a 0-row match comes back as a clean 404 with a useful message, instead of a generic 500 from PostgREST. The success path is unchanged.
Test runs: parent-suite runs now include sub-suites
featuretest-runssuitesapi
Creating a test run from a suite with nested sub-suites now runs every case in every descendant, not just the parent's direct cases. Each result row is labeled with its originating sub-suite.
## What changed
Running a parent suite used to execute only the cases directly attached to it; anything in a sub-suite underneath got silently skipped. That turned "run Billing" into partial coverage for teams that organized their cases into nested suites like Billing > Checkout > Coupons.
Now, when you create a run from a parent suite:
- Every case in every descendant sub-suite is included.
- A case linked to both a parent and a sub-suite is added once (parent wins).
- Each result row records which sub-suite the case was pulled from, so the run page can label rows by origin.
## UI
- **Suite detail page** now shows a green banner above the case list when sub-suites contribute cases, plus a preview line inside the create-run form: "Will run 59 cases — 12 from this suite + 47 from 3 sub-suites."
- **Run detail page** shows a blue note when the run includes sub-suite cases, plus a small origin badge on each row whose sub-suite differs from the run's headline suite.
- Leaf-suite runs (no sub-suites) look identical to before — no new visual noise when it isn't needed.
## API
- `POST /api/test-runs` now walks the descendant tree via `get_suite_descendants(uuid)` and inserts one `test_run_results` row per unique case, with the originating `suite_id` recorded.
- `GET /api/test-runs/:id` — each result in `results[]` now carries `suite_id` and `case_suite_name`.
- `GET /api/test-suites/:id?include_descendants=1` — new query parameter returns `descendant_suites[]`, `descendant_case_count`, and `total_case_count` for accurate run-creation previews.
## Schema (migration 106)
- `test_run_results.suite_id` is a new nullable column pointing at `test_suites(id)` with `ON DELETE SET NULL`.
- All pre-existing rows backfilled to the run's headline suite_id (the correct answer for pre-change runs, which had no sub-suite awareness).
- New index on `(run_id, suite_id)` supports per-sub-suite grouping queries on the run page.
Test cases: pin feature retired on Suites tab
changetest-casessuites
Removed the "pin suite" button from the Suites tab. Drag-and-drop reorder (sort_order) replaces it — both did the same job, and one way is clearer than two.
## What changed
The pin button is gone from suite tree rows on the Suites tab. The pinned-first sort is also gone — siblings now sort purely by user-set `sort_order` (the drag-and-drop reorder shipped in the previous release), with `created_at` as a stable tie-breaker.
## Why
Pin and drag-to-reorder solved the same problem from two angles. With drag-to-reorder in place, pin was redundant: both surfaced "make this suite jump to the top of its sibling group." Two ways to express the same thing led to confusion ("is this pinned or just reordered?") and made the row toolbar busier than it needed to be.
## What stayed
- Cases-tab pin is unchanged — that's a different concept (a "favorite case" marker, not a sort signal)
- The `pinned` and `pinned_at` columns stay in the `test_suites` table; existing data is preserved, the API just stops reading or writing them
- `PATCH /api/test-suites/:id` no longer accepts `{pinned: ...}` in the request body
Test cases: drag-and-drop folder reordering
featuretest-casesapidrag-drop
Reorder folders in the Cases-tab sidebar by dragging — top 25% to insert before, middle to reparent, bottom 25% to insert after. Cross-parent drops reparent then reorder automatically.
## What changed
The Cases-tab folder sidebar now supports the same position-based drag-and-drop reorder pattern that shipped on the Suites tab in the previous release:
- Drop on the **top 25%** of a folder → insert the dragged folder immediately **before** it
- Drop on the **middle 50%** → **reparent** into that folder (existing behavior)
- Drop on the **bottom 25%** → insert immediately **after** it
- Cross-parent drops with a before/after zone reparent first, then reorder, so you can drag a folder out of one branch and place it precisely between two siblings in another
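The zone rules can be sketched as a function of the pointer position within the row. Assumptions: the row's top and height would come from something like `getBoundingClientRect()` in the drop handler, the exact boundary handling is a guess, and the names are hypothetical.

```typescript
type DropZone = "before" | "into" | "after";

/**
 * Classify a drop by where the pointer sits inside the target row.
 * `edgeFraction` is the share of the row height given to each of the
 * "before" and "after" zones (0.25 for the 25/50/25 split).
 */
function dropZone(
  pointerY: number,
  rowTop: number,
  rowHeight: number,
  edgeFraction = 0.25,
): DropZone {
  const offset = (pointerY - rowTop) / rowHeight; // 0 at the top edge, 1 at the bottom
  if (offset < edgeFraction) return "before";
  if (offset >= 1 - edgeFraction) return "after";
  return "into";
}
```

Keeping the split as a parameter means the same function can serve rows of different heights with a different `edgeFraction` where the default zones are too thin to hit.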
## API additions
- New `PATCH /api/test-case-folders/reorder` endpoint — accepts `{ parent_folder_id, ordered_folder_ids[] }` and rewrites `sort_order` across the contiguous sibling set in one shot. Mirrors `PATCH /api/test-suites/reorder` exactly.
- `GET /api/test-case-folders` now returns `sort_order` and orders by it (with name as tiebreaker), so the order you set in the UI persists across reloads.
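A rough picture of what the reorder payload implies for `sort_order`, and of the read-side ordering. Assumptions: positions are written as zero-based contiguous integers (the endpoint's actual numbering is not stated here), and the `Folder` shape and function names are hypothetical.

```typescript
interface Folder {
  id: string;
  name: string;
  sort_order: number;
}

/** Give each sibling the index it has in `orderedFolderIds`. */
function applyReorder(siblings: Folder[], orderedFolderIds: string[]): Folder[] {
  const position = new Map<string, number>();
  orderedFolderIds.forEach((id, index) => position.set(id, index));
  return siblings.map((folder) => {
    const index = position.get(folder.id);
    // Folders not named in the payload keep their current sort_order.
    return index === undefined ? folder : { ...folder, sort_order: index };
  });
}

/** Read-side order: sort_order first, name as tiebreaker. */
function bySortOrderThenName(a: Folder, b: Folder): number {
  return a.sort_order - b.sort_order || a.name.localeCompare(b.name);
}
```

Before any reorder, siblings that share a `sort_order` fall back to alphabetical order; after one, the user-set order takes over.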
## Why
Folder organization was already useful for cataloging hundreds or thousands of test cases, but the only sibling order was alphabetical. Teams that wanted "smoke tests on top, regression in the middle, exploratory at the bottom" had no way to express that. Drag-to-reorder solves it.
Daily: Bug fixes and improvements
dailyaiapi
3 commits
Daily: deps(mcp-server): bump @supabase/supabase-js in /mcp-server
dailyaiapi
18 commits — Bug fixes: - Fix POST /api/projects 500 — RLS policy blocked manager role inserts
Bug fixes:
- Fix POST /api/projects 500 — RLS policy blocked manager role inserts
Improvements:
- deps(mcp-server): bump @supabase/supabase-js in /mcp-server
- deps(dashboard): bump @supabase/supabase-js in /dashboard
- mcp-server: switch base image from Docker Hub to AWS Public ECR
- Chrome extension: fix side panel not opening after activeTab refactor
- Chrome extension: eliminate ALL broad-host surfaces from manifest
Fix: project creation 500 error for manager-role users
fix
Resolved a 500 error on POST /api/projects that affected team members with the manager role.
The RLS policy for inserting projects only allowed owner and admin roles, but the app uses manager as the intermediate role. Any manager trying to create a project hit a 42501 RLS violation that was returned as a generic 500. Fixed by updating the RLS policies to include manager and adding an upfront role check that returns a clear 403.
Daily: Add collapsible Page Links section with Excel export
dailyaibug-reports
92 commits — New features: - Add collapsible Page Links section with Excel export - Add debug logging to extension upload flow
New features:
- Add collapsible Page Links section with Excel export
- Add debug logging to extension upload flow
Bug fixes:
- Fix intermittent 500 on bug report submit — trigger blocked by RLS
- Fix CSP blocking screenshot upload — convert data URL without fetch
- Fix extension: attachments, console logs, network errors not sending
- Fix screenshot permission — add <all_urls> to host_permissions
Improvements:
- Replace Unicode box chars with ASCII dashes in console/network txt
- Always attach console-log.txt and network-errors.txt when checked
- Support Bearer token auth in middleware for Chrome extension
- Extension: use short-ID URLs with ?team= for dashboard links
Daily: Extract truncated-title tooltip into shared helper; add to notes cards
dailyautomationaibug-reportsnotesapi
51 commits — Bug fixes: - Fix detail page 400s — use report.id (UUID) not URL short-id - Fix bug report image attachments not displaying on local
Bug fixes:
- Fix detail page 400s — use report.id (UUID) not URL short-id
- Fix bug report image attachments not displaying on local
- Fix report save errors: don't derive UUID from the short-ID URL
Improvements:
- Extract truncated-title tooltip into shared helper; add to notes cards
- FAB + AI-created-report integrate with short IDs and tab-scoped context
- Tab-scoped workspace/project context — independent sessions per tab
- Accept short IDs in every API + MCP endpoint; update public docs
- Note-to-bug conversion: redirect to short-ID URL
Daily: chore(assign): remove diagnostic logging after root cause found
dailyaibug-reports
10 commits — Bug fixes: - fix(assign): allow unassign + auto-unassign on close - fix(assign): use report's team_id instead of active-team cookie
Bug fixes:
- fix(assign): allow unassign + auto-unassign on close
- fix(assign): use report's team_id instead of active-team cookie
- fix(search): remove id ILIKE filter to fix 500 error on UUID column
Improvements:
- chore(assign): remove diagnostic logging after root cause found
- deps(website): Bump astro from 6.1.5 to 6.1.6 in /website
- compliance: log dependabot merge + update GDPR revision history (2026-04-14)
- deps(dashboard): bump @anthropic-ai/sdk in /dashboard (#58)
Daily: Add reported_by + assigned_to filters to kanban board
dailyautomationbug-reports
8 commits — New features: - Add reported_by + assigned_to filters to kanban board - Add Reported by + Assigned to filters and sortable columns
New features:
- Add reported_by + assigned_to filters to kanban board
- Add Reported by + Assigned to filters and sortable columns
- Add sortable Resolution column to bug reports list
- Add bug report deduplication across all automated run paths
Bug fixes:
- Fix memberNameMap not defined when filters come from URL params
- Fix Top Bug Reporters 'Avg Quality' column showing '--'
- Fix cross-project report leak when filters are applied
Improvements:
- docs(compliance): add 2026-04-13 nightly scan evidence entry
Daily: Logo: all black + align TestLauncher end with Agent end
dailybug-reportsnotesapi
33 commits — New features: - Add 'by TestLauncher' tagline to light logo variant - Add high-resolution brand assets (SVG + PNG)
New features:
- Add 'by TestLauncher' tagline to light logo variant
- Add high-resolution brand assets (SVG + PNG)
- Add CSV export for filtered bug reports
- Add sortable column headers to bug reports list
Bug fixes:
- Fix sort flip-back + normalize existing report types in DB
- Fix sort persistence in cookie save on page load
- Fix sort persistence: /api/reports now respects sort and dir params
Improvements:
- Logo: all black + align TestLauncher end with Agent end
- Revert Agent text to dark navy, keep tagline styling
- Logo: neon green Agent text, smaller 'by', shifted right tagline
- Align 'by TestLauncher' with end of Agent text
- Right-align 'by TestLauncher' tagline on light logo
Daily: Show reporter name on bug report detail page
dailyautomationaibug-reportsmarketing
38 commits — New features: - Add Content-Security-Policy meta tag to marketing site pages - Add ContactModal popup for Enterprise + Migration CTAs, honeypot on forms
New features:
- Add Content-Security-Policy meta tag to marketing site pages
- Add ContactModal popup for Enterprise + Migration CTAs, honeypot on forms
Bug fixes:
- fix(security): patch basic-ftp CRLF injection CVE (HIGH)
- fix(docs): escape Astro template braces in CI/CD FAQ code blocks
Improvements:
- Show reporter name on bug report detail page
- Auto-linkify URLs in comments + Tab-to-accept mention suggestions
- Remove homepage mailto triggers so Cloudflare stops wrapping them
- Automation rename + smart [Copy N] naming + CI/CD FAQ polish
- docs(compliance): update weekly audit with Railway log findings
Daily: Remove domain verification requirement from web automations
dailyautomationaibug-reportsnotes
32 commits — New features: - Add Docs link to sidebar under Support section - Add clear all button to messages dropdown for inbox management
New features:
- Add Docs link to sidebar under Support section
- Add clear all button to messages dropdown for inbox management
- Add @mention autocomplete, notifications, and workspace-aware links in comments
Bug fixes:
- Fix notification links to always include workspace context
- Fix 6 of 8 Dependabot vulnerabilities via package updates
- Fix ReferenceError: finalStatus/run not defined in mobile run complete handler
Improvements:
- Remove domain verification requirement from web automations
- Remove domain verification check from new automation script modal
- Replace testmode query param with email whitelist for FAB access
- Update FAB tooltip from 'Report a bug' to 'Create a bug or session note'
- docs(compliance): add 2026-04-09 evidence entries and update improvement/risk logs
Daily: Clear filters refreshes both list and kanban views
dailyautomationaibug-reportsapi
96 commits — New features: - Add voice transcript to AI Assistant session replay context
New features:
- Add voice transcript to AI Assistant session replay context
Bug fixes:
- Fix pagination after filtering: return total count from reports API
- Fix status and resolution filters not working on bug reports list
- Fix stop recording killing popup before video saves to IDB
- Fix AI Assistant not seeing session data in reused popup
- Fix project dropdown showing Loading when popup opens automatically
Improvements:
- Clear filters refreshes both list and kanban views
- Attach voice transcript to bug reports + stop voice on submit
- Speed up session capture: upload media in background after response
Daily: Uppercase report IDs on list, kanban, detail
dailyautomationaibug-reportsnotes
61 commits — New features: - Add image lightbox cycling arrows - Add prev/next cycling to image lightbox on bug report detail page
New features:
- Add image lightbox cycling arrows
- Add prev/next cycling to image lightbox on bug report detail page
- Add markdown preview toggle on note detail page
- Add Edit/Preview toggle for markdown notes
Bug fixes:
- Fix source attribution + analytics accuracy
- Fix bug report source attribution and analytics accuracy
- Fix null querySelector crash on report detail page
Improvements:
- Uppercase report IDs on list, kanban, detail
- Show uppercase bug report ID on listing, kanban cards, and detail page
- Show report ID on detail page
- Show bug report ID on detail page with click-to-copy
- Render markdown in note card previews
Daily: Remove app-level HSTS
dailyaibug-reportsnotesapi
60 commits — New features: - Add HSTS header at application level for defense-in-depth
New features:
- Add HSTS header at application level for defense-in-depth
Bug fixes:
- Fix file chip styles on bug reports page for dynamically added attachments
- Fix attachment styles not applying to dynamically added files
- Fix 403: set ORIGIN env for Astro behind Railway
- Fix 403 on API uploads: skip CSRF for API routes
- Fix 403 on file uploads
Improvements:
- Remove app-level HSTS
- Remove HSTS from app middleware — Cloudflare handles it at the edge
- Set ORIGIN env in Dockerfile so Astro resolves url.origin correctly
- Force rebuild
- Force Railway rebuild — bust Docker layer cache for checkOrigin fix
23 commits — New features: - Add security header configs for Cloudflare Pages / Netlify / Vercel - Add security scan detail page with schedule support
New features:
- Add security header configs for Cloudflare Pages / Netlify / Vercel
- Add security scan detail page with schedule support
Bug fixes:
- Fix hardcoded prod Supabase client in reports Realtime subscription
- Fix WebSocket using prod key on local dev
- Fix missing CSS on security config detail page
- Fix config section CSS not applying (use :global() for JS-injected HTML)
- Fix security config detail page: define:vars is plain JS not TS
Improvements:
- Rename Organization→Workspace + free workspace creation + switch redirect
- Teach AI Assistant to help users set up the MCP server
- Update MCP Inspector instructions to use Authentication tab
- Document 6 ways to connect to the MCP server (Mac + Windows)
- Hide Repository and Branch rows for web/mobile security scans
Daily: Show Start Free button in mobile header next to hamburger menu
dailyaimarketingapi
51 commits — New features: - Add mobile schedule MCP tools + update all schedule docs - Add Slack + email notifications to mobile and exploration completions
New features:
- Add mobile schedule MCP tools + update all schedule docs
- Add Slack + email notifications to mobile and exploration completions
Bug fixes:
- Fix docs page mobile responsiveness
- Fix constellation node hover on Chrome and Safari
- Fix constellation hover for Safari: use transform scale instead of SVG r
- Fix 4 of 5 dependabot vulnerabilities via npm audit fix
- Fix: position icons inside nodes using centered group transforms
Improvements:
- Show Start Free button in mobile header next to hamburger menu
- Replace Smart Classification with Autonomous Orchestration feed
- Hero: hover shows large detail overlay at center of constellation
- Crop dashboard screenshot height to prevent vertical stretch
- Replace dashboard CSS mockup with real screenshot
Daily: Shared domain verification across all features
dailyaibug-reportsmarketingapi
30 commits — New features: - Add visual regression detection: Claude Vision compares screenshots between runs - Add Schedule button and modal to Exploratory AI page
New features:
- Add visual regression detection: Claude Vision compares screenshots between runs
- Add Schedule button and modal to Exploratory AI page
- Add screenshot comparison overlay on Recon tab
- Add green logo glow during exploration runs
- Add exploration progress endpoint for real-time phase updates
Bug fixes:
- Fix domains page: filter by d.verified boolean, not d.status string
- Fix domains page: CSS variables, global styles, back button uses history
- Fix screenshot comparison: read nested before/after objects from API
- Fix exploration trends: API resilience + field name mapping
- Fix exploration detail page data parsing + clickable link on bug reports
Improvements:
- Shared domain verification across all features
- Improve DNS TXT verification: check all records, log lookups, strip quotes
- Update homepage and docs with latest Exploratory AI features
- Make script parsing resilient — 5 fallback strategies, never fails
- Default schedule timezone to user's browser timezone
Daily: Add 'Autonomously.' in green to homepage tagline
dailyaimarketing
40 commits — New features: - Add 'Autonomously.' in green to homepage tagline - Add green glow animation to logo during code review analysis
New features:
- Add 'Autonomously.' in green to homepage tagline
- Add green glow animation to logo during code review analysis
- Add role grants to claude_connections migration
- Add claude_connections table migration and restore query
Bug fixes:
- Fix review analytics not loading
- Fix open PRs table overflowing past right frame edge
- Fix code review: handle missing migration columns + button layout
- Fix: Code Review button layout breaks during analyzing state
Improvements:
- Update homepage: advanced code review features + unlimited reviews in pricing
- Remove repo selector from Code Review — auto-detect from active project
- Make Integrations page accessible to all users
- Unlimited code reviews for paid plans, Developers access for all users, AI Assistant code review features
- Code Review Phase 3: PR security integration + review analytics dashboard
Daily: Expose bug report status in MCP server API
dailyautomationaibug-reportsapi
114 commits — New features: - Add automated error monitor workflow (GitHub Actions) - Add System Status link to website footer
New features:
- Add automated error monitor workflow (GitHub Actions)
- Add System Status link to website footer
- Add Performance Trends link on analytics dashboard
Bug fixes:
- Fix MCP server build: add appium_js to createMobileAutomation type
- Fix: cannot add postgres_changes callbacks after subscribe()
- Fix: add explicit pricing to AI Assistant to prevent hallucinated prices
- Fix: 'team is not defined' error in AI chat system prompt
Improvements:
- Expose bug report status in MCP server API
- deps(mcp-server): bump @sentry/node from 10.46.0 to 10.47.0
- AI Assistant: user-scoped storage, plan-aware random prompts
- Improve automations empty state with onboarding steps
Daily: Rewrite autonomous agents section with comprehensive platform cycles
dailyautomationai
41 commits — New features: - Add individual feature nav links in docs sidebar - Add Feature Guide link to docs sidebar navigation
New features:
- Add individual feature nav links in docs sidebar
- Add Feature Guide link to docs sidebar navigation
- Add comprehensive Feature Guide to docs with AI Assistant examples
- Add web/mobile automation + performance testing to AI assistant
Bug fixes:
- Fix sitemap: remove www subdomain from site URL
Improvements:
- Rewrite autonomous agents section with comprehensive platform cycles
- Remove Stripe from integrations docs — internal billing detail
- Comprehensive feature docs: standalone sections + Feature Guide additions
- Update docs: collapsible sidebar, AI perf testing, device filtering
- Sidebar: move collapse button to Main header + hover-to-expand
71 commits — New features: - Add p99 to k6 output: include summaryTrendStats in script options
New features:
- Add p99 to k6 output: include summaryTrendStats in script options
Bug fixes:
- Patch @astrojs/node 9.1.3 → 9.5.4: fix 4 moderate Dependabot alerts
- Fix website build: escape {id} in api-reference to prevent Astro eval
- Fix k6 load test data not showing: normalize camelCase/snake_case field names
- Fix k6 metrics: use local execution with cloud streaming
- Fix k6 script: use options.cloud (not deprecated ext.loadimpact), fix distribution format, use k6 cloud run
Improvements:
- Rename feature card: Performance & Load Testing
- Remove vendor names from homepage features: k6 and Lighthouse
- Route all auth flows to /dashboard/analytics instead of /dashboard
- Update docs and marketing for performance testing + dashboard rebrand
- Capitalize first name in dashboard welcome message
Daily: Revert @astrojs/node to v9.1.3 — v10 requires newer Astro
dailyautomationaimarketingnotesapi
117 commits — New features: - Add competitor comparison pages: bugAgent vs Functionize & Mabl - Add error telemetry: client capture, server capture, admin logs, Slack alerts
New features:
- Add competitor comparison pages: bugAgent vs Functionize & Mabl
- Add error telemetry: client capture, server capture, admin logs, Slack alerts
Bug fixes:
- Fix Dependabot alerts: path-to-regexp ReDoS + @astrojs/node memory DoS
- Fix admin logs access: check ADMIN_EMAILS like other admin pages
- Fix geo-snap filters to display on one line
Improvements:
- Revert @astrojs/node to v9.1.3 — v10 requires newer Astro
- Update homepage marketing: no-code automation, team scaling, 2FA icon
- Hide Send to Claude button when ANTHROPIC_API_KEY not configured
- Simplify saving state: remove AI conversion text
- Replace × with trash can icon on schedule calendar cards
Daily: Major homepage overhaul — concise copy, QA Wolf-inspired polish
dailyaimarketing
50 commits — Bug fixes: - Fix npm audit vulnerabilities across all packages - Fix downgrade Stripe cancellation — fetch full team data
Bug fixes:
- Fix npm audit vulnerabilities across all packages
- Fix downgrade Stripe cancellation — fetch full team data
- Fix Mobile+ not showing as active after purchase
- Fix Mobile+ activation — bypass portal for addons + success fallback
- Fix Mobile+ cancel when no Stripe subscription ID exists
Improvements:
- Major homepage overhaul — concise copy, QA Wolf-inspired polish
- Update all Stripe price IDs to new pricing
- Update Pro monthly Stripe price ID to new $49 pricing
- Update pricing: Pro $49/mo, Team $99/mo, annual saves 10%
- Allow downgrade to any lower plan with Stripe cancellation
Daily: Smaller font + spacing in admin modal activity section
dailyaimarketingapi
172 commits — Bug fixes: - Fix modal flash — show placeholders until API data loads - Fix admin modal activity counts — robust queries + hours tracking
Bug fixes:
- Fix modal flash — show placeholders until API data loads
- Fix admin modal activity counts — robust queries + hours tracking
- Fix admin modal: use active team, fresh last sign-in, update plan_limits
- Fix admin user stats, backfill storage, clean up pricing
- Fix nav/footer links to use absolute paths with anchors
Improvements:
- Smaller font + spacing in admin modal activity section
- Admin user modal: IP info, storage, activity counts
- Org-level storage tracking + updated plan limits
- Update all docs, API reference, MCP, and marketing for mobile
Daily: Add beforeunload warning during mobile app upload
dailyautomationaiapi
84 commits — New features: - Add beforeunload warning during mobile app upload - Add comprehensive mobile testing documentation and MCP tools
New features:
- Add beforeunload warning during mobile app upload
- Add comprehensive mobile testing documentation and MCP tools
- Add Mobile Testing feature: BrowserStack + Maestro Cloud
- Add test case metrics to analytics dashboard
Bug fixes:
- Fix mobile run URL: /api/mobile/runs not /runs/create
- Fix mobile automation create URL: /api/mobile/automations not /create
- Fix mobile detail pages: define:vars scope issue
- Fix signed URL upload: use correct Supabase upload/sign endpoint
- Fix mobile app upload: signed URL flow for large files
Improvements:
- Migrate Sentry to new config pattern (fixes deprecation warning)
- deps(mcp-server): bump @modelcontextprotocol/sdk in /mcp-server
- deps(mcp-server): bump @supabase/supabase-js in /mcp-server
60 commits — New features: - Add Test Cases docs, MCP tools, reports tab, homepage marketing - Add complete Test Cases feature: cases, suites, runs, execution
New features:
- Add Test Cases docs, MCP tools, reports tab, homepage marketing
- Add complete Test Cases feature: cases, suites, runs, execution
- Add draft/active toggle to automation detail page
- Add green glow sparkle to Regenerate button while optimizing
- Add download buttons for run history video and screenshots
Bug fixes:
- Fix time tracking: use window.__bugagentActiveProjectId as fallback
- Fix time tracking: always use active project, don't reset on clear
- Fix time tracking: remove project filters, add admin check
- Fix Regenerate Script to use optimize endpoint with version history
- Fix TS type annotation in inline script + update docs
Improvements:
- Show creator name on automation listing and detail pages
- Duplicate automation includes recorded_actions, description, selectors_strategy
- Change Improve with AI to subtle link style with chevron arrow
- Regenerate button: white text + green glow wave effect
- Force download for video/screenshots + spacing
Page-Aware AI Assistant
featureai-assistantcontext
The AI Assistant now knows what page you are viewing and can reference the specific content on screen.
On a bug report detail page, the AI sees the report title, description, status, and severity. On an automation detail page, it reads the full Playwright script (up to 3KB). On a note page, it reads the title and content. When you say "this report", "improve this script", or "explain this note", the AI understands what "this" refers to. The release also includes a Playwright expert persona that activates when the conversation turns to automation topics.
AI Script Optimization Pipeline
featureautomationai
Regenerate Script now sends your Playwright code through a 12-point Sonnet 4 optimization checklist.
The POST /api/automations/:id/optimize endpoint sends scripts to Claude Sonnet 4 with a comprehensive optimization checklist covering: selector reliability, wait strategies, assertions, error handling, authentication patterns, mobile compatibility, timing, cleanup, strict mode, network handling, and result verification. The optimized script is saved automatically with version history.
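A minimal sketch of how a client might call this endpoint. Only the `POST /api/automations/:id/optimize` route comes from the entry above; the base URL, the Bearer-token header, and the helper name `buildOptimizeRequest` are assumptions made for illustration.

```typescript
// Hypothetical helper: only the route shape is taken from the changelog entry.
interface PreparedRequest {
  url: string;
  init: { method: string; headers: Record<string, string> };
}

function buildOptimizeRequest(
  baseUrl: string,
  automationId: string,
  token: string,
): PreparedRequest {
  const base = baseUrl.replace(/\/+$/, ""); // tolerate a trailing slash
  return {
    url: `${base}/api/automations/${encodeURIComponent(automationId)}/optimize`,
    init: {
      method: "POST",
      headers: { Authorization: `Bearer ${token}` }, // assumed auth scheme
    },
  };
}
```

The returned pair can be passed straight to `fetch(url, init)`; building the request separately keeps the sketch easy to inspect without a network call.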
Script Version History with Undo
featureautomationapimcp
Automation scripts now track up to 10 previous versions with one-click undo.
Every script change — manual edits, AI optimization, or regeneration — is saved to a version history. The undo button on the automation detail page reverts to the previous version instantly. A version badge (v1, v2, etc.) shows the current version number. Available via API (POST /automations/:id/undo) and MCP (undo_automation_script tool).
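The behavior described above resembles a bounded history stack. The sketch below is one way to model it, not the actual implementation; the class name `ScriptHistory` and the choice to step the version badge back on undo are assumptions.

```typescript
const MAX_VERSIONS = 10; // "up to 10 previous versions" per the entry

class ScriptHistory {
  private previous: string[] = []; // oldest first
  private current: string;
  private versionNumber = 1;

  constructor(initial: string) {
    this.current = initial;
  }

  get script(): string {
    return this.current;
  }

  get badge(): string {
    return `v${this.versionNumber}`;
  }

  // Record any change: manual edit, AI optimization, or regeneration.
  update(next: string): void {
    this.previous.push(this.current);
    if (this.previous.length > MAX_VERSIONS) this.previous.shift(); // drop the oldest
    this.current = next;
    this.versionNumber += 1;
  }

  // Revert to the previous version; returns false when nothing is left to undo.
  undo(): boolean {
    const prev = this.previous.pop();
    if (prev === undefined) return false;
    this.current = prev;
    this.versionNumber -= 1; // assumption: the badge steps back on undo
    return true;
  }
}
```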
D3 Force-Directed Coverage Mind Map
featuredashboardautomationd3
Replaced the static coverage visualization with an interactive D3.js force-directed graph that auto-spaces nodes, supports drag, zoom, and pan.
The automation coverage mind map on the dashboard now uses D3.js for a force-directed layout. Nodes automatically spread apart to avoid overlap, with stronger repulsion for more scripts. Features include: drag-and-drop node repositioning, zoom and pan, curved link paths, hover highlighting of connected links, pulse animation on failing tests, auto-fit zoom to fill the 700px container, and click-to-navigate to automation details. Color-coded groups with 8 distinct colors make it easy to identify test areas at a glance.
Duplicate Automation Scripts
featureautomation
One-click duplicate creates a copy of any automation script without version history.
The Duplicate button on the automation detail page creates a new automation named "[Copy] Original Name" with the same script, target URL, and project. The copy starts in draft status with no version history. The device selector moved inline below the title, and the project dropdown was removed (the sidebar project is used instead).
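As a rough illustration of the duplicate rules above, assuming a simplified record shape (the `Automation` interface and its field names are hypothetical):

```typescript
// Hypothetical, simplified record; real automations carry more fields.
interface Automation {
  name: string;
  script: string;
  targetUrl: string;
  projectId: string;
  status: "draft" | "active";
  versions: string[];
}

function duplicateAutomation(original: Automation): Automation {
  return {
    name: `[Copy] ${original.name}`,
    script: original.script,
    targetUrl: original.targetUrl,
    projectId: original.projectId,
    status: "draft", // copies always start as drafts
    versions: [], // version history is not carried over
  };
}
```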
Playwright Runner Improvements
fixautomationrunner
Auto-fixes for common Playwright issues, line-by-line pass/fail highlighting, and intensifying run button glow.
The runner now auto-fixes getByLabel("Password") to locator("input[type=password]") and adds waitForLoadState after bare goto() calls. After a run completes, the script shows green/red line highlighting for passed/failed lines. The Run Now button glows neon green with increasing brightness the longer the run takes. Run history updates inline without page refresh.
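The two auto-fixes can be pictured as a text transform over the script source. This is a simplified sketch using regular expressions, not the runner's actual code; it assumes the Playwright page object is named `page` and that each `goto()` call sits on its own line.

```typescript
function autoFixScript(script: string): string {
  // 1. Swap the label-based password lookup for an attribute selector.
  const withLocator = script.replace(
    /getByLabel\((["'])Password\1\)/g,
    'locator("input[type=password]")',
  );

  // 2. Add a load-state wait after any goto() line not already followed by one.
  const lines = withLocator.split("\n");
  const out: string[] = [];
  lines.forEach((line, i) => {
    out.push(line);
    const isGoto = /\bpage\.goto\(.*\);?\s*$/.test(line);
    const next = lines[i + 1] ?? "";
    if (isGoto && !next.includes("waitForLoadState")) {
      const indent = line.match(/^\s*/)?.[0] ?? "";
      out.push(`${indent}await page.waitForLoadState();`);
    }
  });
  return out.join("\n");
}
```

Running the transform twice yields the same result as running it once, since a `goto()` that is already followed by a wait is left alone.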
Homepage Rebrand: QA Layer Messaging
marketinghomepage
New hero title, Context Engine section, and quality-of-testing philosophy throughout.
Homepage hero changed to "The QA Layer Your AI Stack Is Missing" with secondary messaging "Your Agents Write Code. We Make Sure It Works." Added Context Engine section explaining how every QA action is fed by deep context. Updated quality score messaging to emphasize measuring testing quality, not just bug severity. Added "Why bugAgent? Our Philosophy" to documentation with context-aware intelligence and continuous improvement loop sections.
Dev Environment + Repo Migration
infrastructuredevops
Full local development environment with Docker-based Supabase and the repo moved to TestLauncher organization.
Repository moved from hamiltonmascioli/bugAgent to TestLauncher/bugAgent. Set up local dev with Supabase CLI (Docker), Google OAuth for localhost, environment-aware configuration (.env.local/.env.example), develop branch workflow, and convenience scripts (npm run dev starts both dashboard:4321 and website:4322). Created bugAgent Test Team in GitHub for QA access. Updated all hardcoded URLs, DNS CNAME, and GitHub Pages.
bugAgent Skills + Migration Offer
marketinghomepageintegrations
New Skills ecosystem on homepage with GitHub, Claude, Jira integration cards and free migration offer.
Added bugAgent Skills section to homepage showcasing GitHub (repo sync), Claude (root cause analysis), and Jira (bi-directional sync) with SVG logos. "Build Your Own Skill" card with submission modal sends to [email protected]. Migration section offers free export from existing platforms and dedicated QA team support. All documented in docs, API reference, and MCP pages.
Quality Score Documentation
docsquality-scoreapi
Added Quality Score feature documentation to homepage, docs, API reference, and MCP pages.
Documented the Quality Score feature across the website: added a new feature card to the homepage describing the 1-10 rating system using Rapid Software Testing heuristics across 10 dimensions (reproduction steps, expected vs actual, environment details, evidence, root cause analysis, impact assessment, context and history, heuristics and oracles, clarity and structure, actionability). Updated the docs capabilities section with Quality Score details. Added quality_score (integer 1-10) and quality_breakdown (object with 10 dimension scores 0.0-1.0) to the API reference GET /reports/:id response fields and example. Updated MCP docs with qualityScore and qualityBreakdown fields on get_bug_report.
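For orientation, the two response fields might be typed and checked as follows. The field names `quality_score` and `quality_breakdown` and their ranges come from the entry above; the snake_case keys for the ten dimensions are assumptions, and the entry does not say how the overall score is derived from the breakdown, so the sketch only validates ranges.

```typescript
// The ten dimensions named in the entry; these key spellings are assumed.
const DIMENSIONS = [
  "reproduction_steps",
  "expected_vs_actual",
  "environment_details",
  "evidence",
  "root_cause_analysis",
  "impact_assessment",
  "context_and_history",
  "heuristics_and_oracles",
  "clarity_and_structure",
  "actionability",
] as const;

type Dimension = (typeof DIMENSIONS)[number];

interface QualityFields {
  quality_score: number; // integer 1-10
  quality_breakdown: Record<Dimension, number>; // each 0.0-1.0
}

function isValidQuality(q: QualityFields): boolean {
  const scoreOk =
    Number.isInteger(q.quality_score) && q.quality_score >= 1 && q.quality_score <= 10;
  const breakdownOk = DIMENSIONS.every((d) => {
    const v = q.quality_breakdown[d];
    return typeof v === "number" && v >= 0 && v <= 1;
  });
  return scoreOk && breakdownOk;
}
```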
Team Booster: Scale your QA team instantly
new-featuremcpapiteam-booster
Added Team Booster feature with scale_team MCP tool, POST /team-booster REST API endpoint, and full documentation across API reference, MCP docs, homepage features, and docs pages.
## Team Booster
Provision pre-configured tester accounts on demand via the new **Team Booster** feature.
### What's new
- **MCP tool**: `scale_team` — specify team size (1-10), location, duration, technical levels, and budget to provision tester accounts instantly
- **REST API**: `POST /team-booster` — programmatic access with Bearer token auth (Pro and Team plans only)
- **Homepage**: Team Booster feature card added to the features grid
- **API Reference**: Full endpoint documentation with request/response examples
- **MCP Docs**: Tool documentation with example workflow
- **Docs**: Team Booster added to capabilities list and solution grid
### How it works
1. Specify team size, location, duration, technical level, and budget
2. Tester accounts are provisioned in seconds
3. New testers appear in your Team Management page with full platform access
4. You will not be charged until approval has been given
Available on **Pro** and **Team** plans.
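A hedged sketch of what a `POST /team-booster` call could look like. The route, Bearer auth, and the 1-10 team size come from the notes above; the JSON field names (`teamSize`, `technicalLevels`, and so on) and the base URL are assumptions, so check the API reference for the real request shape.

```typescript
// Hypothetical request body; field names are illustrative only.
interface TeamBoosterRequest {
  teamSize: number; // 1-10 per the entry
  location: string;
  duration: string;
  technicalLevels: string[];
  budget: number;
}

function buildTeamBoosterCall(baseUrl: string, token: string, body: TeamBoosterRequest) {
  if (!Number.isInteger(body.teamSize) || body.teamSize < 1 || body.teamSize > 10) {
    throw new RangeError("teamSize must be an integer from 1 to 10");
  }
  return {
    url: `${baseUrl.replace(/\/+$/, "")}/team-booster`,
    init: {
      method: "POST",
      headers: {
        Authorization: `Bearer ${token}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify(body),
    },
  };
}
```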
Claude Analysis via MCP & API + Self-Healing Vision
mcpapiclaudeself-healing
Added push_to_claude MCP tool and POST /claude/push API endpoint. Updated self-healing docs to describe the full-circle autopilot healing engine.
Added push_to_claude MCP tool and POST /claude/push API endpoint for programmatic Claude analysis of bug reports. AI agents and API consumers can now trigger root cause analysis, read results via get_bug_report (claude_analysis and claude_pushed_at fields), and close the loop with automated fixes and re-verification. Updated the Self-Healing Development feature and documentation to describe bugAgent as a full-circle autopilot healing engine: Record > Diagnose > Automate > Heal, with humans in the loop at every stage.
Claude Integration: Self-Healing Bug Analysis
featureintegrationai
Connect your Anthropic API key to push bug reports to Claude for root cause analysis and fix suggestions. Self-healing cycle: detect, analyze, fix, verify.
Connect your Anthropic API key in Settings → Integrations to push bug reports to Claude for root cause analysis and fix suggestions. Choose from Claude Sonnet 4, Opus 4, or Haiku 3.5. Configure per-project auto-push and custom instructions. From any bug detail page, click Send to Claude or Re-analyze for on-demand analysis. Combined with Playwright automation, bugAgent now creates a self-healing cycle: detect → analyze → fix → verify. Pro and Team plans only.
Second-precise timers on bug reports and notes
featurebug-reportsnotestimer
Built-in timers on bug reports and notes track testing time to the exact second. Start, stop, resume, and click to manually edit. Time transfers automatically when converting notes to bug reports.
Bug reports and notes now include built-in timers that track testing time down to the exact second. Start, stop, and resume anytime. Click the time display to manually adjust. The timer appears on both the bug report creation form and detail page, as well as on notes. When converting a note to a bug report, tracked time transfers automatically. Perfect for tracking QA effort and billing testing hours.
The AI Assistant is now a complete QA command center — create and update reports, add comments and notes, search your backlog, use voice input, attach files, and analyze session replays, all through natural conversation.
The AI Assistant goes far beyond report creation. You can now create and update bug reports, change status, severity, and type, add comments, create testing notes in multiple formats (Markdown, Plain Text, Bug Template, Checklist, Outline), list and search reports, notes, automations, and schedules, and send feedback — all through natural conversation. Use voice input powered by Whisper transcription, attach files, and let the AI analyze session replays to auto-draft reports. Start a new chat anytime with the New Chat button. Available in both the dashboard and the in-app FAB popup (no login required for FAB). Updated homepage and documentation marketing copy to reflect the full command center capabilities.
Kanban Board View for Bug Reports
featurekanbanjiradashboard
Drag-and-drop Kanban board with 8 status columns, real-time Jira bi-directional sync, and persistent list/kanban view toggle.
Visualize and manage bug reports with a drag-and-drop Kanban board. Eight status columns (New, Awaiting Triage, Confirmed, In Progress, Resolved, Retesting, Closed, Reopened) let you move cards between stages instantly. Status changes sync bi-directionally to Jira in real time. Each card displays severity, type, description preview, and timestamps. Toggle between list and kanban views with a persistent preference. New batch sync endpoint (POST /api/jira/batch-sync-status) supports syncing multiple reports at once.
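The board model can be sketched with a status union and a grouping helper. The eight status names and the batch endpoint path come from the entry above; the `Card` shape and the `reports` payload key are assumptions.

```typescript
const STATUSES = [
  "New",
  "Awaiting Triage",
  "Confirmed",
  "In Progress",
  "Resolved",
  "Retesting",
  "Closed",
  "Reopened",
] as const;

type Status = (typeof STATUSES)[number];

interface Card {
  id: string;
  status: Status;
}

// Moving a card returns a new object rather than mutating the old one.
function moveCard(card: Card, to: Status): Card {
  return { ...card, status: to };
}

// Group cards into columns, keeping the board's column order.
function toColumns(cards: Card[]): Map<Status, Card[]> {
  const columns = new Map<Status, Card[]>();
  for (const s of STATUSES) columns.set(s, []);
  for (const c of cards) columns.get(c.status)!.push(c);
  return columns;
}

// Hypothetical body for POST /api/jira/batch-sync-status.
function batchSyncBody(cards: Card[]): string {
  return JSON.stringify({ reports: cards.map((c) => ({ id: c.id, status: c.status })) });
}
```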
Notes feature documented on website
documentationnotesnew-feature
Added Notes feature to the homepage, documentation, API reference (6 endpoints), and MCP page (5 tools). Notes supports Markdown, Plain Text, Rich Text, Checklist, and Outline formats with voice-to-text, time tracking, file attachments, and private/shared visibility.
## Notes Feature Documentation
Updated the bugAgent website to document the new Notes feature across all relevant pages:
- **Homepage Features section** — Added Notes card highlighting 5 formats, voice-to-text dictation, time tracking timer, file attachments, private/shared visibility, auto-save, and keyword search with filters.
- **Documentation page** — Added Notes to the "What is bugAgent?" capabilities list, describing all key features and noting availability on all plans.
- **API Reference** — Documented 6 Notes API endpoints: GET /notes (list with search/filters), POST /notes (create), GET /notes/:id (detail), PATCH /notes/:id (update), DELETE /notes/:id (delete), POST /notes/upload (file attachments up to 10 MB).
- **MCP page** — Added Notes tools section with 5 MCP tools (list_notes, create_note, get_note, update_note, delete_note) and an example workflow.
Notes is available on all plans (Free, Pro, Team).
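The six endpoints listed above map naturally onto a small route helper. The methods and paths follow the list; the helper names are hypothetical, and treating "10 MB" as 10 × 1024 × 1024 bytes is an assumption.

```typescript
const MAX_UPLOAD_BYTES = 10 * 1024 * 1024; // "up to 10 MB"; binary megabytes assumed

type NotesOp = "list" | "create" | "get" | "update" | "delete" | "upload";

function notesRoute(op: NotesOp, id?: string): { method: string; path: string } {
  switch (op) {
    case "list":
      return { method: "GET", path: "/notes" };
    case "create":
      return { method: "POST", path: "/notes" };
    case "upload":
      return { method: "POST", path: "/notes/upload" };
    case "get":
    case "update":
    case "delete": {
      if (!id) throw new Error(`${op} requires a note id`);
      const method = op === "get" ? "GET" : op === "update" ? "PATCH" : "DELETE";
      return { method, path: `/notes/${encodeURIComponent(id)}` };
    }
  }
}

// Client-side guard before attempting an attachment upload.
function canUpload(sizeBytes: number): boolean {
  return sizeBytes > 0 && sizeBytes <= MAX_UPLOAD_BYTES;
}
```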
GitHub Integration Docs Added to Website
docsgithubintegrations
Updated website homepage and documentation with GitHub integration feature for Playwright automation script sync.
Added GitHub as an active integration on the homepage with its own card. Updated the Playwright Automation feature and Delegate Testing sections to mention GitHub script sync. Created full GitHub Integration API documentation with five new endpoints (connect, repos, mapping, status, disconnect). Added GitHub FAQ to docs page and updated SDK docs with sync details.
Slack Integration + Schedule Notifications
featureintegrationsslackautomationsnotifications
Connect Slack and get notified when scheduled automations fail via Slack or email.
Full Slack OAuth integration for teams on Pro and Team plans. Connect your Slack workspace from Settings, then configure per-schedule failure notifications: choose None, Email, Slack (with channel picker), or both. When a scheduled automation fails, a bug report is auto-created AND notifications are sent to your configured channels. The automation detail page now has a Schedule button with an inline form for time, days, timezone, and notification preferences. The Schedules dashboard shows notification icons next to each schedule.
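The per-schedule choice (None, Email, Slack, or both) and the failure behavior can be modeled roughly as below. The type and action names are hypothetical; the sketch only shows which steps follow from each preference.

```typescript
// Hypothetical model of a schedule's notification preference.
type NotifyPref =
  | { kind: "none" }
  | { kind: "email" }
  | { kind: "slack"; channel: string }
  | { kind: "both"; channel: string };

function onScheduledFailure(pref: NotifyPref): string[] {
  const actions = ["create_bug_report"]; // a bug report is always auto-created
  if (pref.kind === "email" || pref.kind === "both") actions.push("send_email");
  if (pref.kind === "slack" || pref.kind === "both") {
    actions.push(`post_slack:${pref.channel}`);
  }
  return actions;
}
```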
Enhanced Analytics Feature Documentation
analyticsdocshomepageimprovement
Added Analytics Dashboard feature card to homepage and comprehensive Analytics docs section with all 12+ chart types, health score formula, and API access details.
Added a new Analytics Dashboard feature card to the homepage with detailed description of all 12+ chart types (Bug Reports Over Time, Quality Score Trend, Severity/Status/Type distributions, Top Bug Reporters leaderboard, Automation Health, Time Tracking, Notes Created sparkline, Resolution Time) and a mini SVG chart preview. Added comprehensive Analytics documentation section to the docs page covering the Quality Testing Health Score formula (Quality 25%, Resolution 25%, Automation 25%, Low Severity 25%), all 12 chart sections with descriptions, filtering controls (7/14/30/90 days + project), crown icon for best-performing area, and API/MCP access details. Updated existing analytics references in docs to link to the new section.
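The health score formula above is an equal-weight average of four components. The sketch below assumes each component is already normalized to a 0-100 scale and that the result is rounded to a whole number; neither detail is stated in the entry.

```typescript
// Assumed: every input is a 0-100 sub-score.
interface HealthInputs {
  quality: number;
  resolution: number;
  automation: number;
  lowSeverity: number;
}

// Weights from the entry: 25% each.
const WEIGHTS = { quality: 0.25, resolution: 0.25, automation: 0.25, lowSeverity: 0.25 };

function healthScore(i: HealthInputs): number {
  const raw =
    i.quality * WEIGHTS.quality +
    i.resolution * WEIGHTS.resolution +
    i.automation * WEIGHTS.automation +
    i.lowSeverity * WEIGHTS.lowSeverity;
  return Math.round(raw);
}
```

For example, sub-scores of 80, 60, 100, and 40 give 20 + 15 + 25 + 10 = 70.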
Time Tracking Tools & Analytics
featuretime-trackingmcpapianalytics
Added full time tracking support across MCP, API, and analytics dashboard. Team plan feature.
New MCP tools: list_time_entries, create_time_entry, update_time_entry, delete_time_entry. New REST API endpoints: GET/POST/PATCH/DELETE /time-entries. Analytics dashboard now includes Hours by Day bar chart and Hours by Category horizontal bar chart. Time Tracking feature card added to homepage. Documentation updated across API reference, MCP docs, and main docs pages.
Time Tracking Page
featureteamtime-tracking
Added a dedicated Time Tracking page for Team plan users with daily/weekly summaries, category-based entry tracking, and inline editing.
New Time Tracking page available under the dashboard for Team plan users. Features include a daily summary bar with 8-hour progress tracking, collapsible add-entry form with tester and developer category groups (Manual Testing, Exploratory Testing, Bug Reporting, Code Review, Development, Debugging, etc.), filterable card grid with search, project, category, member, and date range filters, inline card editing, delete confirmation dialog, localStorage-persisted filter state, and pagination for large entry sets.
Analytics Suite
featureanalyticsdashboard
Comprehensive analytics page with 12 chart sections for Pro/Team plans.
Added Analytics page with key stats cards, stacked bar chart for reports over time, SVG line chart for quality trends, donut charts for severity and status, horizontal bar chart for bug types, leaderboard table, automation health bars, sparklines for notes and time spent, resolution time metric, and circular product health gauge. Pure CSS/SVG charts with time range selector and project filter. Gated to Pro/Team plans.
Quality Score Display
featuredashboardmcp
Added quality score (1-10) display across dashboard and MCP server
Adds a circular quality score badge on bug detail pages (a hover tooltip shows the breakdown dimensions), a colored pill column in the reports listing table, a Q:score badge on kanban cards, and qualityScore/qualityBreakdown fields on the MCP BugReport type.
Notes — Testing Memos for Teams
featurenotescollaboration
Capture testing observations, ideas, and findings with the new Notes feature.
Notes gives testers a dedicated space to write and organize their thoughts during testing. Create notes in Markdown, Plain Text, Rich Text, Checklist, or Outline format. Notes auto-save as you type. Mark notes as Private (only you) or Shared (anyone on your team and project can read). Filter by project, author, or date range. Full-width editor with word count, Cmd+S shortcut, and auto-title from content.
Coverage map in pricing and marketing
improvementwebsite
Added Automation Coverage Map as a listed feature for Pro and Team pricing tiers.
Added Automation Coverage Map as a listed feature for Pro and Team pricing tiers. Updated Features, docs, session-replay, and API reference pages with coverage map information.
Gate coverage mind map behind paid plans
improvementdashboard
The Automation Coverage mind map on the dashboard is now only shown for Pro, Team, and Enterprise plans with at least one active automation.
The Automation Coverage mind map on the dashboard is now only shown for Pro, Team, and Enterprise plans with at least one active automation. Free plan users no longer see the section.
Rename Schedules nav to Scheduled
improvementdashboard
Updated sidebar navigation, page titles, and all references from "Schedules" to "Scheduled" across the dashboard for consistency. URL paths remain unchanged.
Automation Coverage Mind Map
featuredashboardautomationsvisualization
Dashboard now shows an interactive test coverage visualization instead of recent reports.
The dashboard home page now features an interactive SVG mind map that visualizes your automation test coverage. Claude AI analyzes your Playwright scripts to extract pages, features, and assertions, grouping them into a hierarchical map. Tests are color-coded: green (passing), red (failing), gray (untested). Click any test node to jump to its automation detail page. Failing tests show a pulse indicator. Results are cached and only re-analyzed when automations change.
Rewind — replay last 5 actions in the browser
featuresdksrt
New Rewind button on the SRT FAB toolbar. Click it to watch a replay of your last 5 actions — the SDK moves a ghost cursor to each element, highlights it, and executes the real action (clicks buttons, fills form fields with character-by-character typing, toggles checkboxes, scrolls). Supports cross-page navigation. Press ESC or click Stop to abort.
Custom password reset emails via Resend
improvementauthemail
Replaced Supabase built-in password reset emails with custom branded emails sent through Resend. The new flow uses secure tokens with 1-hour expiry, rate limiting, and a dedicated reset page with password confirmation. Email design matches the current slate branding with a security tip callout.
Renamed Session Replay to Session Replay Tools (SRT)
improvementbranding
Session Replay has been renamed to Session Replay Tools (SRT) across the entire platform — dashboard, documentation, pricing, API reference, integrations, and AI assistant. The SRT section on bug report detail pages is now hidden for free plan users.
Default Project auto-created for new signups
featureonboarding
New users signing up via email or Google OAuth now get a "Default Project" automatically created in their team. This does not apply to invited users.
Nightly usage warning emails for free plans
featureemail
Account owners on free plans approaching the 5,000 bug report limit now receive a nightly email with their current usage, remaining reports, and a link to upgrade. Warnings are sent once usage reaches 80% of the limit.
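The warning rule reduces to a small threshold check. A minimal sketch, assuming the 5,000 report limit and the 80% threshold stated in this note; the function and field names are hypothetical and not taken from the bugAgent codebase:

```typescript
// Limit and threshold come from the release note; all names are illustrative.
const FREE_PLAN_REPORT_LIMIT = 5000;
const WARNING_THRESHOLD = 0.8;

interface UsageWarning {
  used: number;
  remaining: number;
  percentUsed: number;
}

// Returns the data a warning email would need, or null below the threshold.
function buildUsageWarning(used: number): UsageWarning | null {
  const ratio = used / FREE_PLAN_REPORT_LIMIT;
  if (ratio < WARNING_THRESHOLD) return null;
  return {
    used,
    remaining: Math.max(0, FREE_PLAN_REPORT_LIMIT - used),
    percentUsed: Math.round(ratio * 100),
  };
}
```

A nightly job could call `buildUsageWarning` once per free-plan account and send an email only when the result is not null.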
Removed weekly digest and new report email features
improvementcleanup
Simplified notification preferences to only include usage warnings. Removed the weekly digest and new bug report email toggles from settings, API, and MCP server.
GitHub Integration for Automation Scripts
featureintegrationsgithubautomations
Connect GitHub to automatically sync Playwright automation scripts to your repos.
Full GitHub OAuth integration that syncs automation scripts bidirectionally with mapped GitHub repos. When you record an automation, the generated Playwright script is pushed to tests/bugagent/ in the mapped repo. Editing the script in bugAgent updates the file in GitHub. Deleting an automation removes the file. Project-to-repo mapping is configured in Settings → Integrations. SHA conflict recovery handles cases where files are edited directly on GitHub.
New Automate tool in the FAB records browser actions and generates Playwright test scripts via AI. Run on demand, schedule recurring runs, or integrate into CI/CD pipelines.
Full-stack automation feature:
1. The FAB Automate button records clicks, inputs, navigation, and form interactions with enriched selectors (role, testid, aria-label, text, CSS).
2. Claude AI generates reliable Playwright test scripts using semantic selectors and automatic assertions.
3. A separate runner service executes scripts in headless Chromium, captures video and screenshots, and uploads artifacts to storage.
4. The Dashboard Automations page lists all automations with run history, a script viewer/editor, and a CI/CD integration section with curl and GitHub Actions examples.
5. The Schedules page manages recurring cron-based runs with timezone support.
6. A public CI/CD API at /api/v1/automations/run supports pipeline integration with webhook callbacks.
New database tables: automations, automation_runs, automation_schedules with full RLS. Timezone setting added to user profile.
AI Assistant now uses all captured FAB data to auto-draft reports
featureaifabwcag
When a session is captured via the FAB, the AI uses every piece of captured data to immediately draft a complete bug report.
The AI Assistant now proactively uses all data captured by the FAB SDK to auto-draft comprehensive bug reports: console errors with stack traces, failed network requests with status codes, user click sequences, form field interactions, WCAG accessibility audit findings (grouped by severity), annotated screenshots, screen recordings, DOM mutations, and performance metrics (FCP, LCP, CLS, TTFB). When session replay data is present, the AI immediately presents a ready-to-confirm draft rather than asking step-by-step questions. On the dashboard (without session data), the AI correctly avoids analyzing the current page and instead guides the user through report creation, using any uploaded media as context.
Jira ADF formatting + sync documentation updates
fixjiraformattingdocs
Bug reports created by AI Assistant now preserve markdown formatting in Jira editor. Sync docs updated with force-sync, last-updated-wins, and media sync.
All Jira sync paths now convert markdown descriptions into proper Atlassian Document Format (ADF) nodes. Headings (## Summary, ## Steps to Reproduce), ordered lists, unordered lists, bold, italic, and inline code are rendered correctly in Jira's editor — no more lost formatting when editing. A new shared markdownToAdf() utility is used by sync.ts, force-sync.ts, merge.ts, push-field.ts, and create-report.ts. API reference updated with the new POST /jira/force-sync endpoint and complete behavior documentation for severity last-updated-wins, bi-directional comments, and media sync with filename deduplication.
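To illustrate the kind of conversion described here, the sketch below maps a small markdown subset (headings, bullet lists, ordered lists, paragraphs) to Atlassian Document Format nodes. It is not the actual `markdownToAdf()` utility: the function name `markdownToAdfSketch` is hypothetical, and inline marks such as bold, italic, and inline code are left out for brevity.

```typescript
interface AdfNode {
  type: string;
  attrs?: Record<string, unknown>;
  content?: AdfNode[];
  text?: string;
}

interface AdfDoc {
  version: 1;
  type: "doc";
  content: AdfNode[];
}

const textNode = (text: string): AdfNode => ({ type: "text", text });
const paragraph = (text: string): AdfNode => ({ type: "paragraph", content: [textNode(text)] });

// Hypothetical, simplified converter: block-level markdown only.
function markdownToAdfSketch(markdown: string): AdfDoc {
  const content: AdfNode[] = [];
  let openList: { type: string; items: AdfNode[] } | null = null;

  for (const raw of markdown.split("\n")) {
    const line = raw.trim();
    const bullet = /^[-*]\s+(.+)$/.exec(line);
    const ordered = /^\d+\.\s+(.+)$/.exec(line);
    const item = bullet || ordered;

    if (item) {
      const listType = bullet ? "bulletList" : "orderedList";
      // Start a new list node when the list kind changes or no list is open.
      if (openList === null || openList.type !== listType) {
        const items: AdfNode[] = [];
        openList = { type: listType, items };
        content.push({ type: listType, content: items });
      }
      openList.items.push({ type: "listItem", content: [paragraph(item[1])] });
      continue;
    }

    openList = null; // any non-list line closes the current list
    if (line === "") continue;

    const heading = /^(#{1,6})\s+(.+)$/.exec(line);
    if (heading) {
      content.push({
        type: "heading",
        attrs: { level: heading[1].length },
        content: [textNode(heading[2])],
      });
    } else {
      content.push(paragraph(line));
    }
  }
  return { version: 1, type: "doc", content };
}
```

Passing a description that starts with `## Summary` yields a `heading` node with `level: 2`, which Jira's editor renders as a real heading rather than literal `##` characters.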
Severity last-updated-wins + bi-directional media sync
featurejirasyncmedia
Jira sync now auto-resolves severity conflicts using timestamps and syncs all media/images in both directions without duplicates.
Both auto-sync (polling) and manual force sync now compare report.updated_at vs Jira fields.updated to determine which platform was modified most recently — the latest change wins and the other side is updated automatically. Media attachments are now synced bi-directionally: local images/videos are pushed to Jira as issue attachments, and Jira attachments are pulled into bugAgent storage. Deduplication checks both jira_attachment_id and filename (case-insensitive) to prevent any duplicates across sync cycles.
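The two rules in this entry can be sketched as follows. This is an illustration under stated assumptions, not the shipped sync code: the function names, the `SyncDirection` values, and the attachment shape are hypothetical, and timestamps are assumed to be parseable by `Date.parse`.

```typescript
type SyncDirection = "push_to_jira" | "pull_from_jira" | "in_sync";

// Last-updated-wins: the side with the newer timestamp overwrites the other.
function resolveSeverityDirection(reportUpdatedAt: string, jiraUpdated: string): SyncDirection {
  const local = Date.parse(reportUpdatedAt);
  const remote = Date.parse(jiraUpdated);
  if (local > remote) return "push_to_jira";
  if (remote > local) return "pull_from_jira";
  return "in_sync";
}

interface AttachmentRef {
  jira_attachment_id: string | null; // null for files that never came from Jira
  filename: string;
}

// An attachment is a duplicate if either the Jira ID or the
// case-insensitive filename matches something already synced.
function isDuplicateAttachment(candidate: AttachmentRef, existing: AttachmentRef[]): boolean {
  const name = candidate.filename.toLowerCase();
  return existing.some(
    (known) =>
      (candidate.jira_attachment_id !== null &&
        known.jira_attachment_id === candidate.jira_attachment_id) ||
      known.filename.toLowerCase() === name
  );
}
```

Checking both keys covers the case where the same file has been pushed to Jira (gaining an ID there) and is then seen again on the next pull under the same name.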
Manual Jira force sync button
featurejirasync
New sync button next to AUTO SYNC badge forces an immediate bi-directional sync of description and comments.
Added a sync icon button on the bug report detail page next to the existing AUTO SYNC badge. Clicking it triggers an immediate force sync: pushes the current title and description to Jira, pushes any local comments not yet synced, and pulls any Jira comments not yet in bugAgent. The button shows a spinning animation during sync and displays a toast summarizing what was synced (e.g. "Description synced, 2 comment(s) pushed to Jira, 1 comment(s) pulled from Jira"). New API endpoint: POST /api/jira/force-sync.
WCAG audit results displayed on bug report detail page
featuredashboardwcagaccessibilitydocs
WCAG accessibility audit findings now appear in the Session Replay Tools section of the bug report detail page with impact badges, rule IDs, and remediation links.
When a WCAG audit is run before sending a session to the AI, the results are stored alongside the session data and displayed in a new collapsible "WCAG Accessibility Audit" section within the SRT area on the report detail page. Each violation shows an impact badge (critical/serious/moderate/minor with color coding), the axe-core rule ID, WCAG criteria tags, a description of the issue, the CSS selector of the affected element, an HTML snippet of the offending markup, and a link to Deque's remediation documentation. The homepage Features grid now includes a dedicated WCAG Accessibility Audit card, and the SRT documentation page has a full WCAG Audit section covering how it works, what rules are checked, and the report output format.
Updated verification email branding
improvementbranding
Updated email verification templates and pages to use current slate (#94A3B8) branding instead of legacy amber (#F59E0B). Updated logo rendering, button colors, and icon styling across register, send-verification, verify-email, and verify-pending pages.
Performance optimizations
improvement
Minified SDK (44% smaller), consolidated Jira polling from 3 API calls to 1, added tab visibility detection to pause background polling, reduced font loading overhead, and enabled HTML compression.
Email verification for signups
feature
New accounts now require email verification before accessing the dashboard. Verification emails are sent via Resend with branded templates, a resend button, and automatic detection once the address is verified.
Improved AI chat interface
improvement
Removed Submit quick-reply button and refined Yes/No button detection to only appear for direct yes/no questions.
Microphone audio capture in screen recordings
feature
Screen recordings now capture microphone audio via Web Audio API mixing, enabling voice narration during bug recording sessions.
Fixed report detail page error
fix
Resolved database column error that could cause failures when loading bug report detail pages with similar reports.
Stack Trace, Network Waterfall, and Performance Metrics in Session Replay
featuresession-replayperformancedebugging
Three new collapsible sections in bug reports: full console log with search, network waterfall for failed/slow requests, and auto-captured performance metrics including FPS, memory, and page load times.
Added three major new data capture and display features to Session Replay. Stack Trace / Console Log captures all console output (log, info, warn, error, exceptions) with full stack traces — the last 50 entries are shown in a searchable, scrollable section. Network Waterfall shows failed (4xx/5xx) and slow (>1s) API requests with method, URL, status code, and duration — also searchable. Performance Metrics auto-captures page load time, first contentful paint, DOM ready, FPS, memory usage, DOM node count, and long tasks. The SDK now intercepts console.log and console.info in addition to warn/error, runs a lightweight FPS tracking loop, and collects performance data at submit time.
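The selection rules described here (failed or slow requests, and the last 50 console entries) could be expressed roughly as below. The entry shapes and function names are hypothetical; only the 4xx/5xx, one-second, and 50-entry limits come from the note.

```typescript
interface NetworkEntry {
  method: string;
  url: string;
  status: number;
  durationMs: number;
}

interface ConsoleEntry {
  level: "log" | "info" | "warn" | "error";
  message: string;
}

const SLOW_REQUEST_MS = 1000;
const MAX_CONSOLE_ENTRIES = 50;

// Keep requests that failed (4xx/5xx) or took longer than one second.
function waterfallEntries(requests: NetworkEntry[]): NetworkEntry[] {
  return requests.filter(
    (request) => request.status >= 400 || request.durationMs > SLOW_REQUEST_MS
  );
}

// Keep only the most recent console entries for display.
function recentConsoleEntries(entries: ConsoleEntry[]): ConsoleEntry[] {
  return entries.slice(-MAX_CONSOLE_ENTRIES);
}
```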
Fix recorder popup size for Chrome permission dialog
fixsdkscreen-recording
Recorder popup now opens large enough (420x550) for Chrome to display its screen sharing permission dialog. Auto-shrinks to compact bar once recording starts.
The recorder popup was opening at 320x72 which was too small for Chrome to render its getDisplayMedia permission dialog (share screen confirmation). Now opens at 420x550 so the full permission prompt with cancel/share buttons is visible. Once the user approves and recording begins, the popup auto-resizes to a compact 320x80 recording bar with timer and stop button.
Cleaner report layout: video player in attachments, no duplication
improvementuiattachmentsjira
Removed duplicate Screen Recording and DOM Replay sections from Session Replay card. Video and DOM replay now appear only in Attachments with full-width playable video player.
Removed the Screen Recording video player and DOM Replay badge from the Session Replay section since they were duplicated in Attachments. Video attachments now render as full-width playable players instead of cropped 140px thumbnails. DOM replay info cards also span full width. This is cleaner and ensures attachments are properly included when reports are pushed to Jira.
Popup-based screen recorder survives page refresh
featuresdkscreen-recording
Screen recording now runs in a separate popup window that continues recording even when the main page refreshes or navigates. Also uses displaySurface constraints to guide Chrome picker to the correct surface type.
Moved the screen recording logic from inline MediaRecorder to a dedicated popup window. The popup handles getDisplayMedia, MediaRecorder, timer, and 60-second auto-stop independently of the main page. Communication between the popup and main page uses BroadcastChannel for real-time status and IndexedDB for blob persistence. The displaySurface constraint guides Chrome's picker: choosing Full Desktop shows monitor options, and choosing Browser Window shows window options.
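As a rough illustration of the displaySurface hint, the helper below builds the options object that would be passed to `getDisplayMedia`. The `RecordingSource` type and the function name are hypothetical; `"monitor"` and `"window"` are standard `displaySurface` values, and the browser treats them as a hint rather than a guarantee.

```typescript
type RecordingSource = "desktop" | "window";

interface DisplayMediaOptionsSketch {
  video: { displaySurface: "monitor" | "window" };
}

// Map the picker choice in the SDK modal to a displaySurface hint.
function displayMediaOptionsFor(source: RecordingSource): DisplayMediaOptionsSketch {
  return {
    video: { displaySurface: source === "desktop" ? "monitor" : "window" },
  };
}

// In the recorder popup (browser only), the options would be used like this:
//   const stream = await navigator.mediaDevices.getDisplayMedia(
//     displayMediaOptionsFor("desktop")
//   );
```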
Fix video upload and session attachments display
fixvideoattachmentssdksession-replay
Video recordings now upload inline with session data instead of a separate request. Session screenshots, videos, and DOM replay data now reliably show in report attachments.
The two-phase video upload (session capture then separate FormData upload) was silently failing for all users — zero video files were ever stored. Replaced with inline base64 upload in the same JSON payload as session data, matching the pattern used for screenshots. Also improved the report detail page to ensure session screenshots and videos always appear in the attachments section even if not in the report media array. DOM replay mutation count card is now properly styled and rendered.
Recording source picker: Full Desktop or Browser Window
featuresdkscreen-recording
Users can now choose between Full Desktop and Browser Window when recording. Tab capture has been removed since it stopped on page refresh.
Added a source picker modal to the SDK screen recording flow. Users now see two options — Full Desktop (captures everything on screen) and Browser Window (captures the entire browser window). This replaces the previous tab-based capture which would stop recording whenever the page refreshed or navigated. The picker features a styled modal with icons, descriptions, and a cancel option, followed by the existing 3-second countdown before recording begins.
Fix video upload CORS, IndexedDB blob persistence, and DOM replay display
fixvideocorssdksession-replay
Fixed critical CORS preflight issue preventing video uploads. Added IndexedDB blob persistence and DOM replay display in report attachments.
Fixed critical CORS preflight issue preventing video uploads by adding separate OPTIONS exports on capture endpoints. Added IndexedDB persistence for video blobs to survive page navigation. Added DOM replay info card with mutation count in bug report attachments. Improved video upload error logging and screen recording tab selection.
Added FAQ Section to Documentation
docsfeature
New FAQ section on the documentation page with 10 expandable accordion items covering: what bugAgent is, how it works, getting started, team management, multiple projects and organizations, pricing plans, subscription cancellation, supported report types, Jira integration, and data security.
Fixed Screen Recording in Session Replay SDK
fixsession-replaysdk
Resolved video recording issues: recordings now survive page navigation by preserving the video blob and flushing MediaRecorder data on unload. Chrome users will see their current tab pre-selected in the screen picker. Session replay videos now appear in the Attachments section of bug report details alongside screenshots and uploaded files.
Updated Marketing & Docs for Expanded Report Types
docsmarketingreports
Homepage, documentation, API reference, MCP docs, and Session Replay docs now reflect the full breadth of 19 supported report types — including feature requests, enhancements, technical debt, documentation, DevOps, UX improvements, and integrations — across the AI Assistant, REST API, and MCP server.
Expanded Report Types Beyond Bugs
featureai-assistantreports
The platform now accepts 7 new report types: feature requests, enhancements, technical debt, documentation, DevOps, UX improvements, and integrations. The AI Assistant, report creation, auto-classification, filtering, and display layers all support the expanded types with new badge styles and classification patterns.
Integrations Page Enhanced
docsdashboard
Expanded the Session Replay section on the dashboard integrations page with a detailed "What gets captured" reference table, screen recording instructions, and FAB toolbar documentation.
Homepage & Docs Updated with New Features
docsfeature
Updated homepage feature cards, Session Replay documentation, API reference, MCP docs, and developer documentation with DOM replay, screen + voice recording, and FAB toolbar features. All CoTester references renamed to AI Assistant across the entire website.
MCP Server CORS Hardened
security
Restricted CORS from wildcard to an explicit allowlist of production origins. MCP clients are unaffected as CORS is browser-only.
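A minimal sketch of an allowlist check, assuming the server decides headers per request from the `Origin` header. The origins shown are placeholders, not the real production list, and the function name is hypothetical.

```typescript
// Placeholder origins for illustration only.
const ALLOWED_ORIGINS: string[] = ["https://app.example.com", "https://www.example.com"];

// Echo the origin back only when it is on the allowlist; otherwise send no
// CORS headers, so browsers block the cross-origin response.
function corsHeadersFor(origin: string | undefined): Record<string, string> {
  if (origin === undefined || ALLOWED_ORIGINS.indexOf(origin) === -1) {
    return {};
  }
  return {
    "Access-Control-Allow-Origin": origin,
    Vary: "Origin",
  };
}
```

Non-browser MCP clients typically send no `Origin` header and do not enforce CORS, which is why they are unaffected by this check.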
Session Cleanup Automation
improvementinfrastructure
Orphan session replays not attached to a bug report are automatically cleaned up after 24 hours — both database records and storage files. Hourly cron job.
Screen + Voice Recording
featuresession-replay
New FAB toolbar with satellite record button. 3-second countdown, then captures screen via getDisplayMedia with optional microphone. Records up to 60 seconds and attaches to bug report.
DOM Replay Recording
featuresession-replay
Session Replay SDK now records DOM mutations via MutationObserver in a rolling 60-second buffer. DOM snapshot and mutations are stored with the session and displayed on the report detail page.
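A rolling time-window buffer like the one described could look roughly like this. The class name and API are hypothetical; only the 60-second window and the use of MutationObserver come from the note.

```typescript
interface TimedEntry<T> {
  at: number; // timestamp in milliseconds
  payload: T;
}

// Keeps only entries that fall inside the most recent time window.
class RollingBuffer<T> {
  private entries: TimedEntry<T>[] = [];
  private readonly windowMs: number;

  constructor(windowMs: number = 60000) {
    this.windowMs = windowMs;
  }

  push(payload: T, at: number = Date.now()): void {
    this.entries.push({ at, payload });
    this.prune(at);
  }

  snapshot(now: number = Date.now()): T[] {
    this.prune(now);
    return this.entries.map((entry) => entry.payload);
  }

  // Drop entries older than the window, oldest first.
  private prune(now: number): void {
    const cutoff = now - this.windowMs;
    while (this.entries.length > 0 && this.entries[0].at < cutoff) {
      this.entries.shift();
    }
  }
}

// In a browser, a MutationObserver callback could feed the buffer:
//   const buffer = new RollingBuffer<MutationRecord>();
//   new MutationObserver((records) => records.forEach((r) => buffer.push(r)))
//     .observe(document.body, { childList: true, subtree: true, attributes: true });
```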
AI Assistant Rebrand
improvementbranding
Renamed all CoTester references across the platform to AI Assistant — sidebar, admin pages, settings, billing, header, and system prompts.
Session Replay setup on Integrations page
featuresession-replayintegrations
The Integrations page now includes a Session Replay section with a full setup guide, copyable script snippet pre-filled with your API key prefix, and configuration options reference.
SDK events persist across page reloads
fixsession-replaysdk
The Session Replay SDK now saves events to sessionStorage so clicks, errors, and navigation are preserved across page reloads. Previously, form submissions that triggered a page reload would wipe the event buffer clean.
User Journey in Session Replay
featuresession-replay
Session replay now shows a User Journey section with the last 10 pages visited during the session, displayed as a vertical timeline with page titles, URL paths, and timestamps. Collapsed by default.
Improved SDK click capture
fixsession-replaysdk
The Session Replay SDK now captures all button clicks including save buttons, form submits, and buttons with nested icon/text elements. A mousedown backup listener ensures clicks are captured even when pages reload.
Duplicate Detection on homepage and docs
docs
Duplicate Detection is now featured on the homepage and in the developer documentation under AI-Native Features.
Improved Duplicate Detection
improvementduplicate-detection
Duplicate detection now weights title similarity at 70% with a lower matching threshold, catching near-identical titles that were previously missed. All existing reports have been backfilled with similarity data.
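One way to picture the weighting is the sketch below. It assumes character-trigram overlap as the similarity measure (the original duplicate-detection release mentions trigrams), gives the description the remaining 30%, and uses a made-up threshold; none of these details beyond the 70% title weight are confirmed by the note.

```typescript
// Character trigrams of a lowercased, whitespace-normalized string.
function trigrams(text: string): Set<string> {
  const normalized = "  " + text.toLowerCase().replace(/\s+/g, " ").trim() + " ";
  const grams = new Set<string>();
  for (let i = 0; i + 3 <= normalized.length; i++) {
    grams.add(normalized.slice(i, i + 3));
  }
  return grams;
}

// Jaccard overlap of the two trigram sets, in the range 0..1.
function trigramSimilarity(a: string, b: string): number {
  const gramsA = trigrams(a);
  const gramsB = trigrams(b);
  let shared = 0;
  gramsA.forEach((gram) => {
    if (gramsB.has(gram)) shared += 1;
  });
  return shared / (gramsA.size + gramsB.size - shared);
}

interface ReportText {
  title: string;
  description: string;
}

const TITLE_WEIGHT = 0.7; // from the release note
const MATCH_THRESHOLD = 0.5; // hypothetical value

function duplicateScore(a: ReportText, b: ReportText): number {
  return (
    TITLE_WEIGHT * trigramSimilarity(a.title, b.title) +
    (1 - TITLE_WEIGHT) * trigramSimilarity(a.description, b.description)
  );
}

function isLikelyDuplicate(a: ReportText, b: ReportText): boolean {
  return duplicateScore(a, b) >= MATCH_THRESHOLD;
}
```

With this weighting, two reports with identical titles score at least 0.7 regardless of how their descriptions differ, which matches the stated goal of catching near-identical titles.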
Activity count badge fixed
fixsession-replay
The session replay badge now shows the count of meaningful user activities (clicks, navigations, errors) instead of all raw events including mouse moves and scrolls.
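The badge count amounts to filtering the raw event stream by type. A small sketch, with hypothetical event type names:

```typescript
// Event type names are illustrative; the real SDK payload may differ.
type SessionEventType = "click" | "navigation" | "error" | "mousemove" | "scroll";

interface SessionEvent {
  type: SessionEventType;
  timestamp: number;
}

const MEANINGFUL_TYPES: SessionEventType[] = ["click", "navigation", "error"];

// Count only the events a reader would consider user activity.
function activityCount(events: SessionEvent[]): number {
  return events.filter((event) => MEANINGFUL_TYPES.indexOf(event.type) !== -1).length;
}
```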
AI Analysis collapsed by default
improvementsession-replay
The AI Analysis section in session replay is now collapsed by default. Click to expand when you want to review the findings.
User Activity Log replaces Event Timeline
improvementsession-replay
The abstract dot timeline in session replay has been replaced with a readable User Activity log showing timestamped clicks, navigations, errors, and warnings.
AI Assistant asks one question at a time
improvementai
The AI Assistant now asks a single focused question per message instead of multiple questions at once, making conversations more natural.
Screenshot removed from replay section
fixui
Session replay screenshots now only appear in the attachments section, avoiding duplication on the bug report detail page.
Duplicate detection for bug reports
featureaireports
New reports are automatically checked against existing reports in the same org and project using trigram embedding similarity. Similar issues appear on the bug detail page between comments and changelog with links, severity, status, and match percentage. No clutter when no duplicates found.
User display preferences for bug reports
featuresettings
New Bug Report Display section in Settings allows users to toggle Suggested Test Case and Suggested Playwright Script sections on bug report detail pages. Both are off by default. Settings are per-user only.
Feedback system added
featurefeedback
New Support section in sidebar with Feedback button that opens a popup modal. Users can submit feedback with category (general, bug, feature, improvement). Admin page shows all feedback with name, email, message, and status management.
Session Replay SDK and AI Assistant improvements
enhancementsession-replayai-assistant
SDK now filters out clicks on the bugAgent FAB and submit buttons from recorded events. AI Assistant now receives the page screenshot URL and URL navigation history from session replays for more complete bug report drafting.
Session Replay docs updated
docssession-replay
Updated Session Replay documentation to cover new features: page screenshot capture on submit, console.error() and console.warn() interception, 60-second URL navigation history tracking, and updated privacy section to reflect screenshot capture capability.
Impact Score on Bug Reports
featurereportstriage
Bug report detail page now shows an Impact Score (0-100) combining severity, frequency, and affected user count for objective triage.
Each bug report now displays an Impact Score ring that combines three data-driven factors: severity weight (0-40 based on S1-S4), frequency (0-30 based on similar reports in the same project over 30 days), and affected users (0-30 based on distinct reporters). The score uses logarithmic scaling for frequency and user count. Hover over the score to see the full breakdown. Helps teams prioritize based on data instead of gut feel.
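The three-factor score can be sketched as below. The component ranges (0-40, 0-30, 0-30) and the logarithmic scaling come from the note; the per-severity weights and the saturation point of 50 are assumptions made only for illustration.

```typescript
type Severity = "S1" | "S2" | "S3" | "S4";

// Hypothetical weights: the note only states a 0-40 range based on S1-S4.
const SEVERITY_WEIGHT: Record<Severity, number> = { S1: 40, S2: 30, S3: 20, S4: 10 };

// Hypothetical saturation point: counts at or above this earn the full 30 points.
const SATURATION_COUNT = 50;

// Log-scale a count into 0..maxPoints so early growth matters most.
function logScaled(count: number, maxPoints: number): number {
  if (count <= 0) return 0;
  const ratio = Math.log(1 + count) / Math.log(1 + SATURATION_COUNT);
  return Math.round(Math.min(1, ratio) * maxPoints);
}

interface ImpactBreakdown {
  severity: number;
  frequency: number;
  users: number;
  total: number;
}

function impactScore(
  severity: Severity,
  similarReports30d: number,
  distinctReporters: number
): ImpactBreakdown {
  const severityPoints = SEVERITY_WEIGHT[severity];
  const frequencyPoints = logScaled(similarReports30d, 30);
  const userPoints = logScaled(distinctReporters, 30);
  return {
    severity: severityPoints,
    frequency: frequencyPoints,
    users: userPoints,
    total: severityPoints + frequencyPoints + userPoints,
  };
}
```

The logarithm means going from 1 to 5 similar reports moves the score far more than going from 41 to 45, which keeps a single noisy bug from crowding out everything else.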
Session Replay: Console Errors & URL History
featuresession-replaysdk
Session Replay SDK now captures console.error() and console.warn() calls, plus a chronological URL navigation history for the last 60 seconds.
Enhanced the Session Replay SDK to intercept console.error() and console.warn() calls (in addition to uncaught exceptions and unhandled rejections), each tagged with a severity level. Also added a rolling 60-second URL navigation history that tracks every page the user visited in chronological order. Both are included in the session payload, stored in the database, and fed to the AI analysis for richer bug report context.
Admin: Inline Plan & Admin Controls
adminusersplans
Added inline plan dropdown and admin toggle switch to the Admin > All Users page for quick user management.
The All Users admin page now supports: inline plan switching (Free/Pro/Team/Enterprise) per user that updates their team plan and limits, a toggle switch to grant or revoke platform admin access, live admin count in stats bar, and toast notifications for all actions. Added is_admin column to profiles with database migration.
Session Replay Documentation Page
documentationsession-replaysdk
Added a dedicated /session-replay documentation page with full setup guide, SDK reference, and cross-links from docs hub, header nav, and API reference.
New comprehensive Session Replay documentation page covering: SDK installation, configuration options, event types captured, AI analysis workflow, CoTester integration, privacy and security details, API endpoints, and plan comparison. Also added a Docs link to the website header navigation, a Session Replay card to the docs Get Started and Developer Resources sections, a cross-reference callout in the API reference, and linked the feature list entry.
Session Replay SDK & AI Analysis
featuresession-replaysdkaiproteam
Capture the last 60 seconds of user activity before a bug report. A lightweight JavaScript SDK records clicks, navigation, console errors, and network failures. CoTester AI analyzes the session data to auto-draft bug reports with repro steps, error analysis, and suggested severity. Available on Pro and Team plans.
New Session Replay feature: Add bugagent-sdk.js to your site, and when users click Report Bug, the last 60 seconds of their browser session is captured and sent to CoTester AI. The AI analyzes clicks, navigation paths, console errors, and failed network requests to automatically draft a structured bug report. Developers can view the event timeline, errors, and AI analysis directly on the report detail page. New API endpoints: POST /api/sessions/capture, GET /api/sessions, GET /api/sessions/:id, PATCH /api/sessions. Database migration adds session_replays table with RLS policies. Updated pricing, features, docs, API reference, and MCP pages.
Fix website build and deploy
fixwebsitedeploy
Fixed a build error that had prevented the website from deploying to GitHub Pages since March 19.
Resolved an esbuild compilation error in the API reference page where raw JSON objects inside HTML code tags were being parsed as Astro template expressions. The fix uses the set:text directive to properly escape inline JSON examples. All homepage updates (CoTester context-awareness, pricing, docs) are now live.
Admin CoTester Knowledge Base
featureaiadmincotester
New admin page for platform-wide testing expertise that enriches every CoTester AI session.
Added a new Admin CoTester Knowledge page with four sections: Master Testing Prompt for core AI instructions, Testing Keywords for domain-specific terminology, Reference URLs for testing frameworks and methodologies, and Knowledge Document uploads for testing guides and best practices. All global knowledge is injected into every CoTester AI session across all organizations, making CoTester a true testing expert that can teach testing concepts and help with complex testing problems.
CoTester AI Context-Aware Branding
improvedwebsitecotesterdocumentation
Updated all website pages to highlight the CoTester AI Assistant as context-aware with voice input, custom instructions, and knowledge document support.
The homepage Features section, Platform Pillars, documentation page, API reference, and MCP page now describe the CoTester AI Assistant as context-aware — reflecting its ability to use custom organization instructions, uploaded knowledge documents (product specs, testing playbooks), voice-to-text input, and full org data awareness in every session.
Knowledge Document Uploads for CoTester AI
featuredashboardcotesterdocuments
Upload reference documents (PDFs, markdown, text files) that the CoTester AI Assistant uses as context in every session.
The CoTester AI Instructions section in Settings now includes a Knowledge Documents area. Upload product specs, testing playbooks, API documentation, or onboarding guides — the AI will reference them automatically in every conversation. Supports PDF (with text extraction), Markdown, TXT, CSV, JSON, YAML, and HTML files. Includes drag-and-drop upload, character usage meter (200K limit), and document management with delete. Up to 20 documents, 10MB each.
Session Replay features always visible
improvementsession-replaydashboard
Stack Trace, Network Waterfall, and Performance Metrics sections now always show in Session Replay with empty state messages when no data is captured.
All three feature sections (Stack Trace / Console Log, Network Waterfall, Performance Metrics) are now always visible in the Session Replay card on bug reports, even when no data was captured. Each section displays an empty state message when data is unavailable, and includes a feature icon in the toggle header for better visual identification.
WCAG audit powered by axe-core engine
featuresdkwcagaccessibility
WCAG Audit tool now uses axe-core v4.10.2 by Deque — the industry-standard accessibility engine with 80+ rules and zero false positives.
Replaced the custom inline WCAG checker with axe-core v4.10.2 (by Deque), the same engine used by Google Lighthouse and Chrome DevTools. axe-core is lazy-loaded from CDN on first click so there's no impact on initial SDK bundle size. It covers 80+ WCAG 2.0/2.1 Level A and AA rules with zero false positives, including comprehensive checks for color contrast, ARIA validation, focus management, form labels, heading structure, link purpose, image alt text, and much more. Results now include impact severity levels (critical, serious, moderate, minor), WCAG criterion tags, element CSS selectors, offending HTML snippets, and links to remediation documentation. The overlay shows a richer summary with violation count, affected elements, rules passed, and impact breakdown.
Jira-style bug template toggle for report composer
featuredashboardreports
Enable the Template toggle to prefill the bug report textarea with a standard Jira bug template.
A new "Template" toggle in the bug report composer toolbar prefills the textarea with a structured Jira-style bug template including: Summary, Steps to Reproduce, Expected Result, Actual Result, Environment, and Additional Notes. The setting is saved to localStorage and persists across sessions. When using Rapid Entry mode, the template automatically re-fills after each submission. Toggling off clears the template if it hasn't been modified.
Voice-to-text input for bug report composer
featuredashboardvoice
Click the microphone icon in the bug report toolbar to dictate your bug report instead of typing.
Added a voice input button next to the file attachment button in the bug report composer. Click the mic to start recording — audio is transcribed in 30-second chunks via the /api/ai/transcribe endpoint. A live transcript review panel appears with a timer, recording status, and editable transcript text. Users can review and edit the transcription before inserting it into the bug report textarea. Includes discard/accept workflow, recording dot animation, and pulse effect matching the AI Assistant voice input pattern.
Delete bug reports from the reports list
featuredashboardreports
Hover over a report title to reveal a trash icon. Click it, confirm, and the report is permanently deleted.
Added a trash icon that appears on hover next to each bug report title on the reports list page. Clicking it shows a confirmation dialog. On confirm, the report is deleted via a new DELETE /api/reports/:id endpoint and the row fades out. The DELETE endpoint is scoped to the user's team to prevent unauthorized deletions.
Fix Rewind stopping on click-triggered page navigations
fixsdkrewind
Rewind now continues seamlessly when a replayed click navigates to a new page on the same site.
The root cause was that targetEl.click() on an anchor link would navigate the browser immediately, killing JS execution before the remaining rewind state could be saved. The fix adds two layers: (1) pre-click detection checks if the target is inside an <a href> or is a form submit button — if so, the full rewind state is saved to sessionStorage before the click executes; (2) a beforeunload safety net listener saves progress if the page unloads unexpectedly during rewind, covering edge cases like JS-driven window.location navigations. On the new page, the SDK picks up the saved state and resumes with the full progress bar intact.
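The pre-click detection layer might look something like this. It is a sketch with hypothetical names, written against a minimal interface so it does not depend on the DOM; a real `Element` satisfies that interface through its `closest()` and `click()` methods. The submit selector only covers elements with an explicit `type="submit"`.

```typescript
interface ClosestCapable {
  closest(selector: string): object | null;
}

interface Clickable extends ClosestCapable {
  click(): void;
}

const SUBMIT_SELECTOR = 'button[type="submit"], input[type="submit"]';

// True when replaying a click on this element may unload the page.
function clickMayNavigate(target: ClosestCapable): boolean {
  return target.closest("a[href]") !== null || target.closest(SUBMIT_SELECTOR) !== null;
}

// Save rewind state first when navigation is likely, then perform the click.
function replayClick(target: Clickable, saveState: () => void): void {
  if (clickMayNavigate(target)) {
    saveState();
  }
  target.click();
}
```

Saving before the click matters because, once the browser starts navigating, no further script on the old page is guaranteed to run.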
Rewind persists progress across page navigations
enhancementsdkrewind
The Rewind progress bar and step statuses now persist seamlessly when replaying actions that navigate to a new page.
Previously, when Rewind replayed a navigation action that loaded a new page, the progress bar would reset and only show the remaining steps — losing visual continuity. Now the full rewind state (all actions, total step count, completed/skipped step statuses, and current index) is serialized to sessionStorage before navigation and restored on the new page. The progress bar shows all original steps with previously completed ones already marked as done, and the step counter updates live (e.g. "step 3/5").
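Serializing and restoring the rewind state could be sketched as follows. The state shape mirrors the fields listed in this note, but the type names, the storage key, and the helper functions are hypothetical. The store is passed in as a parameter; in the browser it would be `window.sessionStorage`.

```typescript
type StepStatus = "pending" | "done" | "skipped";

interface RewindState {
  actions: { kind: string; selector: string }[];
  totalSteps: number;
  statuses: StepStatus[];
  currentIndex: number; // zero-based index of the step about to run
}

interface KeyValueStore {
  getItem(key: string): string | null;
  setItem(key: string, value: string): void;
  removeItem(key: string): void;
}

const REWIND_STATE_KEY = "rewind_state"; // hypothetical key name

function saveRewindState(store: KeyValueStore, state: RewindState): void {
  store.setItem(REWIND_STATE_KEY, JSON.stringify(state));
}

// Read the saved state once on the new page, then clear it.
function restoreRewindState(store: KeyValueStore): RewindState | null {
  const raw = store.getItem(REWIND_STATE_KEY);
  if (raw === null) return null;
  store.removeItem(REWIND_STATE_KEY);
  try {
    return JSON.parse(raw) as RewindState;
  } catch {
    return null; // corrupted state: start fresh rather than crash
  }
}

function progressLabel(state: RewindState): string {
  return "step " + (state.currentIndex + 1) + "/" + state.totalSteps;
}
```

Because the full list of actions and statuses travels with the state, the new page can redraw every original step instead of only the remaining ones.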
Rewind feature added to homepage
featurehomepagerewind
Added Rewind as a dedicated feature card on the homepage and as a bullet point in the Enrich platform pillar.
The homepage Features grid now includes a dedicated Rewind card describing the real-time action replay capability. The SRT feature card description was updated to mention Rewind. The Enrich pillar in Platform Pillars now lists Rewind as a feature bullet.
Rewind feature documentation added to SRT page
docssrtrewind
Added comprehensive Rewind documentation to the Session Replay Tools (SRT) docs including feature highlight, how-it-works guide, supported actions, controls, and element targeting details.
The SRT documentation page now includes full coverage of the Rewind feature: a highlight card in the overview, a capture item in the tools listing, and a dedicated section with step-by-step usage guide, supported actions (clicks, text input, checkboxes, selects, scroll, navigation), controls (ESC key, Stop button), and a callout explaining the CSS selector-based element targeting strategy.
Smarter AI Assistant quick reply buttons
improvementai
Yes/No buttons now only appear for direct yes/no questions. A new Submit button appears for questions that need typed answers.
The yes/no detection is stricter — questions with "or" options, information-seeking questions (what, which, how), and multi-part questions no longer show yes/no buttons. Instead, a Submit button appears to expedite sending your typed reply. Bug reports generated by the AI no longer include bugAgent SDK tool steps (recording, annotation, FAB) in the description or repro steps. The session replay FAB is now hidden when the AI chat panel is open.
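As a rough illustration of the stricter detection, the heuristic below rejects multi-part questions, information-seeking questions, and questions that offer "or" options. The function name and the exact word list are assumptions; the real detection logic is not shown in this changelog and may differ.

```typescript
// Question words that signal an information-seeking question (assumed list).
const INFO_WORDS = /^(what|which|how|why|where|when|who)\b/i;

// Returns true only for a single, direct yes/no question.
function isDirectYesNoQuestion(message: string): boolean {
  const text = message.trim();

  // Exactly one question mark, at the end: rules out statements and multi-part questions.
  const questionMarks = (text.match(/\?/g) ?? []).length;
  if (questionMarks !== 1 || text.charAt(text.length - 1) !== "?") return false;

  // Look only at the final sentence, so a leading "Got it." does not hide the question word.
  const lastSentence = (text.split(/[.!]\s+/).pop() ?? text).trim();

  if (INFO_WORDS.test(lastSentence)) return false; // what / which / how ...
  if (/\bor\b/i.test(lastSentence)) return false;  // "Chrome or Firefox?" needs a typed answer

  return true;
}
```

When this returns `false`, the UI would show the Submit button instead of Yes/No.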
Expanded browser console and network capture
featuresdk
SDK now captures all console levels including debug, trace, and assert. Bug report detail page shows all network requests, not just failed ones.
Added console.debug (purple badge), console.trace (gray badge with stack), and console.assert capture. Report page now displays up to 100 console entries and shows all network requests including successful ones for complete visibility.
Fix video zero duration on bug reports
fixsdk
Fixed an issue where screen recording videos showed zero duration on bug report pages.
Applied a WebM duration metadata fix on the report page (a seek-to-end workaround for a Chrome MediaRecorder bug). The SDK now passes the actual recording duration to the capture API instead of a hardcoded 60s.
Auto-display new bug reports in real-time
featuredashboard
Bug reports list now auto-refreshes every 30 seconds to show new reports from team members or the AI assistant without requiring page reload.
New reports appear with a slide-in toast notification and a highlighted row animation. Polling runs alongside Supabase Realtime subscriptions as a fallback for reliability.
Browser markup annotation tool
featuresdksession-replaymarkup
New annotate tool on the floating button lets testers draw circles, arrows, and freehand annotations directly on the page before sending to AI.
Hover over the bug FAB to reveal the new pen icon. Click it to enter annotation mode with a full toolbar: 7-color palette, circle/arrow/freehand drawing tools, undo, done/cancel. Annotations are composited over a base screenshot and sent as the screenshot in the bug report, replacing the auto-captured one. Reset clears any saved markup.
DOM Replay Mutation Timeline & Recording UI Cleanup
improvementsession-replaysdkwebsite
Interactive DOM mutation timeline in attachments, removed redundant in-browser recording indicator, updated website marketing.
DOM Replay in attachments now shows an interactive mutation timeline with color-coded entries (DOM/ATTR/TEXT), searchable filter, and collapsible panel. Removed redundant in-browser recording indicator — timer and stop button now handled entirely by the popup recorder window. Updated homepage with 10x tester velocity messaging, Delegate/Heal coming soon sections, and footer trademark.
Fixed video recording upload - videos were 0 duration
bugfixsession-replayvideoapi
Screen recording videos were being uploaded as 7-byte corrupt files due to a base64 parsing bug in the capture API.
The webm video data URL contains commas in its MIME type (e.g. video/webm;codecs=vp8,opus) which caused the base64 split to grab the wrong segment. All previously uploaded videos were only 7 bytes and showed zero duration. The fix now correctly locates the ";base64," marker to extract the actual video payload.
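The difference between the two parsing strategies can be seen in a small sketch. The function names and the sample payload are placeholders; only the `;base64,` marker and the `video/webm;codecs=vp8,opus` MIME type come from the description above.

```typescript
const BASE64_MARKER = ";base64,";

// Buggy approach: assumes the first comma separates the header from the payload.
function naiveExtract(dataUrl: string): string {
  return dataUrl.split(",")[1] ?? "";
}

// Fixed approach: locate the ";base64," marker and take everything after it.
function extractBase64Payload(dataUrl: string): string {
  const idx = dataUrl.indexOf(BASE64_MARKER);
  if (idx === -1) {
    throw new Error("Not a base64 data URL");
  }
  return dataUrl.slice(idx + BASE64_MARKER.length);
}

// Placeholder payload; a real recording would be far longer.
const sampleUrl = "data:video/webm;codecs=vp8,opus;base64,GkXfo0AgQoaBAQ==";

const wrong = naiveExtract(sampleUrl);         // "opus;base64" (a MIME fragment, not video data)
const right = extractBase64Payload(sampleUrl); // "GkXfo0AgQoaBAQ=="
```

The naive split returns a fragment of the MIME type, which is why the uploaded files were tiny and unplayable.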
Reset button added to SDK toolbar
featuresdksession-replay
New Reset button in the floating popup clears all captured session data so testers can start fresh without reloading.
A Reset button has been added next to the "Send Session to AI" button in the SDK popup. Clicking it clears all captured data: video recordings, event buffer, console logs, network requests, URL history, DOM mutations, performance metrics, sessionStorage buffers, and IndexedDB video blobs. Any active recording is also stopped. This lets testers discard a bad capture and redo their workflow cleanly.
DOM Replay moved to Session Replay section
improvementsession-replaydashboard
DOM Replay mutation timeline moved from Attachments into the Session Replay section alongside Stack Trace, Network Waterfall, and Performance Metrics.
The DOM Replay mutation timeline with searchable, color-coded mutation entries has been moved from the Attachments section into the Session Replay card. It now uses the same collapsible toggle pattern as the other session replay features, and is always visible with an empty state when no mutations were captured.
CoTester AI Custom Instructions
featuredashboardcotestersettings
Team admins can now configure persistent custom instructions for the CoTester AI Assistant in organization settings.
A new "CoTester AI Instructions" section in Settings allows admins to define: product description, testing guidelines, documentation links, known issues, custom terminology, and general instructions. These are automatically injected into every CoTester AI session so the assistant always has context about your product and team practices. Each field supports up to 5000 characters and is sanitized before injection into the system prompt.
Voice Memo Transcription with Whisper
featuredashboardcotestervoicewhisper
Upgraded CoTester AI Assistant voice input to support long recording sessions (15-20+ minutes) using OpenAI Whisper chunked transcription.
The voice-to-text feature now uses MediaRecorder with chunked Whisper transcription instead of the browser Web Speech API. Audio is recorded continuously and sent in 30-second chunks to OpenAI Whisper for accurate transcription. Text appears progressively in a review panel where users can edit before accepting. Includes a timer display, discard/use controls, and a new /api/ai/transcribe backend endpoint with full authentication and validation.
Voice-to-Text in CoTester AI Assistant
featuredashboardcotestervoice
Added a voice input button to the CoTester AI Assistant chat panel using the browser-native Web Speech API.
The CoTester AI Assistant now supports voice-to-text input. Click the microphone button next to the attach button to start dictating — real-time transcription streams directly into the text field. A pulsing red indicator shows when recording is active. Works on Chrome, Edge, and Safari. The button is gracefully hidden on unsupported browsers.
CoTester AI Assistant in Pricing & Billing
improvedwebsitedashboardcotesterpricing
Added CoTester AI Assistant as a featured capability across all pricing tiers on the website and dashboard billing page.
CoTester AI Assistant is now listed as a feature in all four pricing plans (Free, Pro, Team, Enterprise) on both the website pricing section and dashboard billing settings. Enterprise tier highlights a dedicated CoTester AI Assistant experience.
CoTester AI Assistant on Homepage
improvedwebsitecotester
Added CoTester AI Assistant to the Enrich pillar on the homepage Platform Pillars section.
Added CoTester AI Assistant to the Enrich pillar on the homepage, highlighting guided bug creation as a key platform capability alongside existing features like Jira sync and email reporting.
Sentry MCP Server & Uptime Monitoring
securitymonitoringsentrydevops
Configured Sentry MCP server integration, uptime monitors, and secured credentials from git tracking.
Configured Sentry MCP server for querying issues and alerts directly from Claude Code. Set up Sentry uptime monitors for the bugAgent dashboard (5-minute checks) and marketing site (10-minute checks). Removed .mcp.json from git tracking and added to .gitignore to protect sensitive credentials including Stripe, Supabase, and Sentry tokens. Dependabot already configured for daily dependency scans at 6 AM ET across website, dashboard, MCP server, and GitHub Actions ecosystems.
The MCP SDK auth router uses express-rate-limit in sub-routers that do not inherit the parent Express app's trust proxy setting. Behind Railway's reverse proxy, this threw ERR_ERL_UNEXPECTED_X_FORWARDED_FOR and crashed the server. Disabled the SDK's built-in rate limiting, since bugAgent applies its own rate limiting at the application level.
Consistent www.bugagent.com URLs
seobugfix
Fixed Google Search Console redirect issue by updating all website URLs to use www.bugagent.com consistently.
Updated Astro site config, CNAME, robots.txt, JSON-LD structured data, RSS feed links, OG image, and legal pages to use www.bugagent.com instead of bare bugagent.com. This eliminates the redirect flagged by Google Search Console and ensures consistent canonical URLs for SEO.
Introducing CoTester AI Assistant
new-featureimprovement
The AI chatbot has been rebranded as CoTester AI Assistant across the entire platform and documentation.
The built-in AI assistant is now called CoTester AI Assistant. Updated across the dashboard chat panel, header tooltip, system prompt, homepage features section, API reference docs, general docs, and MCP documentation. CoTester helps teams create detailed bug reports through guided conversation, answer questions about bug data, and suggest testing strategies.
Default project checkbox in settings
new-featureimprovement
Added a checkbox in the project settings section to set the active project as the default for new bug reports.
The project section in Settings now includes a Set as default project checkbox alongside the rename field. When toggled, the previous default is unset and the active project becomes the default. New bug reports are automatically assigned to the default project.
Project switcher in sidebar and streamlined bug report filters
new-featureimprovement
Added a project switcher to the sidebar matching the org switcher design, moved project filtering into the reports filter bar, and added project renaming in settings.
The project selector has been reorganized: a new sidebar project switcher (below the org dropdown) lets users quickly switch between projects or create new ones via a modal. The bug reports page filter bar now includes a project dropdown alongside type, severity, status, and resolution filters. The settings page has a new Project section for renaming the active project. New users automatically get a Default project created with their account.
Custom domain for MCP server
improvementmcp
Replaced all hardcoded Railway production URLs with mcp.bugagent.com custom domain across codebase.
Updated MCP server OAuth fallback URL, dashboard integrations setup guide, and README to use mcp.bugagent.com instead of the Railway deployment URL. This prepares for the custom domain DNS configuration on Railway.
Description field added to Jira sync
featurejirasync
Bi-directional Jira sync now includes the description field alongside title, severity, status, and type.
The Jira check endpoint now compares descriptions between bugAgent and Jira. When a description change is detected, the top-of-page sync banner appears. The merge flow handles description updates in both directions, converting between plain text and Jira ADF format.
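A minimal sketch of the plain text to ADF conversion, in both directions, is shown below. It handles only paragraphs, text nodes, and hard breaks; real Jira descriptions can contain many other node types (headings, lists, marks), which this sketch does not cover. The function names are illustrative.

```typescript
// Narrow subset of Atlassian Document Format nodes used by this sketch.
type AdfInline = { type: "text"; text: string } | { type: "hardBreak" };
interface AdfParagraph { type: "paragraph"; content: AdfInline[] }
interface AdfDoc { version: 1; type: "doc"; content: AdfParagraph[] }

// Blank lines separate paragraphs; single newlines become hardBreak nodes.
function plainTextToAdf(text: string): AdfDoc {
  const paragraphs = text
    .split(/\n{2,}/)
    .map((p) => p.trim())
    .filter((p) => p.length > 0);

  return {
    version: 1,
    type: "doc",
    content: paragraphs.map((p): AdfParagraph => {
      const content: AdfInline[] = [];
      p.split("\n").forEach((line, i) => {
        if (i > 0) content.push({ type: "hardBreak" });
        if (line.length > 0) content.push({ type: "text", text: line });
      });
      return { type: "paragraph", content };
    }),
  };
}

// Inverse of the above for documents built from the same node subset.
function adfToPlainText(doc: AdfDoc): string {
  return doc.content
    .map((p) => p.content.map((n) => (n.type === "text" ? n.text : "\n")).join(""))
    .join("\n\n");
}
```

Comparing descriptions after normalizing both sides to plain text avoids false "changed" banners caused by formatting differences alone.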
Jira project dropdown selector
featurejiradashboard
Replaced manual project key input with a dropdown fetching available Jira projects from the team connection.
When pushing a bug report to Jira for the first time, users now see a dropdown of all available Jira projects instead of typing a project key manually. The dropdown pre-selects the default project if one was previously saved. New API endpoint: GET /api/jira/projects.
Fix team invitations and update email branding
bug-fixteamemailbranding
Fixed invite emails using old gold branding — now uses the new chrome/slate color scheme. Fixed "Invalid Invitation" error when clicking invite links by switching to service role client for fetching and accepting invitations. Added RESEND_API_KEY to production environment for email delivery.
Fix team management page and invite form
bug-fixteamrls
Fixed the team settings page showing 0 members and missing the invite form. Root causes: missing foreign key from team_members.user_id to profiles.id (broke PostgREST joins), and a self-referencing RLS policy on profiles that silently failed. Team members, roles, and the invite form now display correctly.
Duplicate bug report button
new-featurereports
Added a Duplicate button on the bug report detail page. Creates a copy of the report with [Copy] prefixed to the title, preserving all fields except status (reset to new), resolution (cleared), and media. The duplicate metadata tracks the original report ID.
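The copy rules can be summarized in a short sketch. The `BugReport` shape and the `duplicated_from` metadata key are assumptions for illustration; the entry above only states that the original report ID is tracked in the duplicate's metadata.

```typescript
// Hypothetical report shape; the real schema likely has more fields.
interface BugReport {
  id: string;
  title: string;
  description: string;
  severity: string;
  type: string;
  status: string;
  resolution: string | null;
  media: string[];
  metadata: Record<string, unknown>;
}

// Copies every field, then overrides the ones the changelog says are reset.
function duplicateReport(original: BugReport, newId: string): BugReport {
  return {
    ...original,
    id: newId,
    title: `[Copy] ${original.title}`,
    status: "new",      // status is reset
    resolution: null,   // resolution is cleared
    media: [],          // media is not carried over
    metadata: { ...original.metadata, duplicated_from: original.id }, // assumed key name
  };
}
```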
Fix RLS recursion breaking team data loading
securitybug-fixrls
Fixed a critical self-referencing RLS policy on team_members that caused PostgreSQL to silently return zero rows. This broke the middleware team loading, causing the plan switcher to always show Free. Replaced with SECURITY DEFINER helper functions that safely bypass RLS recursion.
Admin Plan Switcher Fix and Uplifting Org Names
bug-fixdashboardnew-featureteams
Fixed the admin plan dropdown not persisting changes. All organizations now have unique, uplifting random names generated automatically.
Two fixes in this release:
**Admin Plan Switcher Bug Fix**
- Root cause: service role key was only checked via process.env which is empty in Astro SSR runtime. Now also checks import.meta.env as fallback.
- The endpoint was falling back to the user's Supabase client which couldn't update the teams table due to RLS policies, causing silent failure.
- Now uses locals.team?.id directly instead of a redundant DB query.
- Returns proper error if service role key is not configured.
**Uplifting Organization Names**
- All existing teams renamed from "X's Team" pattern to unique uplifting names (e.g., "Noble Wellspring Hub", "Breeze Harbor Works")
- New accounts auto-generate uplifting names via the handle_new_user() DB trigger
- Auth callback safety-net also generates uplifting names using the shared org-names utility
- Organization name field in settings validates non-blank and shows error if empty
- Unique constraint (teams_name_unique) prevents duplicate org names in the database
- Uniqueness check with friendly error message before saving
- DB function generate_org_name() uses 50 adjectives x 50 nouns x 10 suffixes (25,000 combinations) with collision retry loop
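The name generation with a collision retry loop might be sketched as follows. This is a TypeScript illustration of the idea, not the actual `generate_org_name()` DB function or the shared org-names utility. The word lists are tiny placeholders (the real lists are described as 50 x 50 x 10), and the injectable `random` parameter and `maxAttempts` limit are assumptions added to make the sketch testable.

```typescript
// Placeholder word lists; only some of these words appear in the changelog examples.
const ADJECTIVES = ["Noble", "Breeze", "Bright"];
const NOUNS = ["Wellspring", "Harbor", "Meadow"];
const SUFFIXES = ["Hub", "Works"];

function pick<T>(items: readonly T[], random: () => number): T {
  return items[Math.floor(random() * items.length)];
}

// Retries until an unused name is found or the attempt budget runs out.
function generateOrgName(
  taken: Set<string>,
  random: () => number = Math.random,
  maxAttempts = 20,
): string {
  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    const name = `${pick(ADJECTIVES, random)} ${pick(NOUNS, random)} ${pick(SUFFIXES, random)}`;
    if (!taken.has(name)) return name;
  }
  throw new Error("Could not generate a unique organization name");
}
```

A database unique constraint is still needed alongside a loop like this, because two concurrent signups can pass the check at the same time.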
Multi-Organization Support, Org Settings, and Downgrade Restrictions
multi-orgnew-featuredashboardbillingteams
Users can now belong to multiple organizations with a sidebar org switcher. Added organization name settings and billing downgrade restrictions.
Major feature additions:
**Multi-Organization Support**
- Users can belong to multiple organizations with different roles in each
- Sidebar org switcher dropdown appears when user is in 2+ orgs
- Shows org name, role, and plan for each org
- Active team tracked via cookie and profiles.active_team_id column
- Middleware loads all memberships and resolves active team (cookie > profile > first)
- New /api/switch-team endpoint validates membership and switches context
- Invite acceptance automatically sets the new org as active
- Migration 013 applied: adds active_team_id to profiles with indexes
**Organization Settings**
- New "Organization" section in account settings
- Manager, admin, and owner roles can rename the organization
- Contributors see read-only organization info
**Billing Downgrade Restrictions**
- Free plan: cannot be downgraded to from any paid plan
- Pro plan: can only upgrade to Team, no downgrade path
- Team plan: can downgrade to Pro with confirmation modal
- Downgrade modal shows warnings if >3 active members (excess will be deactivated) or if storage exceeds 1GB Pro limit
- Requires typing DOWNGRADE to confirm
**Invite Improvements**
- Client-side check prevents inviting existing org members
- Shows info message: user is already in the organization
- Users in other orgs can still be invited (multi-org supported)
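The "cookie > profile > first" resolution order can be expressed as a small pure function. The `Membership` shape and function name are illustrative. Treating a cookie or profile value that no longer matches a membership as absent is an assumption of this sketch, in line with the membership validation mentioned for /api/switch-team.

```typescript
// Hypothetical membership record loaded by the middleware.
interface Membership {
  teamId: string;
  role: string;
}

// Resolves the active team: cookie first, then profiles.active_team_id, then the first membership.
function resolveActiveTeam(
  memberships: Membership[],
  cookieTeamId: string | null,
  profileTeamId: string | null,
): Membership | null {
  const byId = (id: string | null): Membership | undefined =>
    id === null ? undefined : memberships.find((m) => m.teamId === id);

  // A stale id (user no longer a member) simply falls through to the next source.
  return byId(cookieTeamId) ?? byId(profileTeamId) ?? memberships[0] ?? null;
}
```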
Security Fix: RLS Policies for Team Roles
securityteamsrlsbug-fix
Fixed critical RLS policy issues where team_members INSERT policy referenced deprecated role, and UPDATE/DELETE policies were owner-only. Added email verification on invite acceptance.
Security fixes found during automated audit:
- **Critical**: team_members INSERT policy required role='member' which was removed in migration 011. Users could not accept invitations through the normal auth path. Fixed to allow all valid roles.
- **Critical**: team_members UPDATE/DELETE policies only allowed team owner. Managers could not manage members despite UI showing controls. Extended to include admin and manager roles.
- **High**: change_role action had no server-side allowlist validation. Added strict allowlist (contributor, manager, admin) and blocked self-role-change.
- **Medium**: Invite acceptance did not verify that the logged-in user email matched the invitation email. An authenticated user with a different email could join any team via a shared invite link. Added email match verification.
- **Fix**: team_invitations management policy now includes manager role.
- **Fix**: team_members SELECT policy now allows all team members to see each other (was previously self-only).
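The two server-side checks from the High and Medium items could be sketched like this. The function names, error messages, and the case-insensitive email comparison are assumptions; only the allowlist values (contributor, manager, admin) and the self-role-change block come from the list above.

```typescript
// Roles that change_role may assign, per the allowlist above.
const ASSIGNABLE_ROLES = ["contributor", "manager", "admin"] as const;
type AssignableRole = (typeof ASSIGNABLE_ROLES)[number];

function isAssignableRole(role: string): role is AssignableRole {
  return (ASSIGNABLE_ROLES as readonly string[]).indexOf(role) !== -1;
}

type Validation = { ok: true } | { ok: false; error: string };

// Rejects self-role-change and any role outside the allowlist (for example "owner").
function validateRoleChange(actorId: string, targetId: string, newRole: string): Validation {
  if (actorId === targetId) {
    return { ok: false, error: "You cannot change your own role" };
  }
  if (!isAssignableRole(newRole)) {
    return { ok: false, error: `Role not allowed: ${newRole}` };
  }
  return { ok: true };
}

// The logged-in user's email must match the invitation (comparison rules are assumed).
function inviteEmailMatches(userEmail: string, inviteEmail: string): boolean {
  return userEmail.trim().toLowerCase() === inviteEmail.trim().toLowerCase();
}
```

Both checks belong on the server; hiding the controls in the UI does not stop a crafted request.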
Team Management with Roles and Invitations
teamsrolesnew-featuredashboardsecurity
Full team management system with role-based access control, invite flow with 5-day expiry, owner transfer, and new user onboarding.
Built comprehensive team management for the dashboard:
- **New roles**: owner, manager, contributor (DB migration 011 applied)
- **Role-based sidebar**: contributors see limited navigation (no billing, team, integrations)
- **Middleware protection**: contributors blocked from restricted routes
- **Team page rewrite**: invite with role selection (owner/manager/contributor), role change dropdown, member status management (activate/deactivate), resend invites with fresh 5-day expiry, owner transfer with confirmation
- **Settings restrictions**: danger zone sections (delete project, flush reports, delete account) hidden based on role
- **Invite flow**: new users create password + full name directly from invite link; existing users redirected to login
- **Send-invite edge function**: supports manager role for inviting, improved welcome email with inviter name and role descriptions, 5-day expiry
- **Security**: pre-commit scan passed, managers cannot escalate to owner/admin, contributors cannot access restricted pages
Simplified Bug Reports + Internal Notes
improvementfeature
Bug reports now store full descriptions without splitting. Added Internal Testing Notes.
Bug reports now store the full description as-is without splitting into separate sections (steps, expected, actual). Added Internal Testing Notes field — editable on the bug detail page but never synced with Jira or external integrations. AI Format feature still available to reformat descriptions on demand. Removed deprecated fields from all APIs and MCP tools.
Multi-Organization Support
feature
Pro and Team plan users can now create multiple organizations from the sidebar. Click the org dropdown and select "Create Organization" to start a new workspace. A unique uplifting name is generated automatically, or you can enter a custom one. Free plan users can belong to multiple orgs via invites but cannot create new ones.
AI Description Formatter
feature
New AI wand button on bug detail page reformats descriptions into structured bug templates. format_description flag available on MCP create_bug_report and dashboard quick-submit. Jira sync handles pre-formatted descriptions without duplicate sections.
5 New MCP Tools + Setup Guides
featuremcp-servertools
Added Jira sync, comments, and team management tools plus easy setup configs
Five new MCP tools: check_jira_sync (detect remote changes), merge_jira_sync (bi-directional merge), add_comment (with auto-push to Jira), list_team_members, and invite_team_member. README updated with quick setup guides for Claude Code, Claude Desktop, Cursor, Windsurf, and remote HTTP connections. Evaluation suite with 10 test questions added for quality assurance.
MCP Server Best Practices Update
improvementmcp-server
Tool annotations, pagination metadata, and improved error handling
All 24 MCP tools now include behavior annotations (readOnlyHint, destructiveHint, idempotentHint, openWorldHint) to help clients understand tool behavior. The list_bug_reports tool returns pagination metadata (has_more, next_offset). All error responses now include the isError flag per MCP convention.
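To show how a client can use the pagination metadata, here is a minimal offset-based sketch. The `has_more` and `next_offset` names come from the entry above; the `Page` shape, the `paginate` helper, and returning `null` for `next_offset` on the last page are assumptions.

```typescript
interface Page<T> {
  items: T[];
  has_more: boolean;
  next_offset: number | null; // null when there is nothing left (assumed convention)
}

// Slices one page out of an in-memory list and reports where the next page starts.
function paginate<T>(all: T[], offset: number, limit: number): Page<T> {
  const items = all.slice(offset, offset + limit);
  const end = offset + items.length;
  const has_more = end < all.length;
  return { items, has_more, next_offset: has_more ? end : null };
}

// A client keeps requesting pages until has_more is false.
function collectAll<T>(all: T[], limit: number): T[] {
  const out: T[] = [];
  let offset: number | null = 0;
  while (offset !== null) {
    const page: Page<T> = paginate(all, offset, limit);
    out.push(...page.items);
    offset = page.next_offset;
  }
  return out;
}
```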
Bug Reports RLS Fix for Teams
fixsecurityteams
Team members can now update and sync reports created by other team members
Fixed row-level security policies on bug_reports to allow any team member to update team reports. Previously only the original creator could edit, which blocked Jira sync for reports filed by other team members.
Deleted Jira Issue Handling
featurejira
Graceful handling when linked Jira issues are deleted
When a synced Jira issue is deleted or becomes inaccessible, the report detail page shows a notification with two options: Unlink (remove the Jira reference so the report can be re-pushed as a new ticket) or Keep Reference (grey out the link as a historical record).
Bi-directional Jira Comment Sync
featurejiracomments
Comments now sync between bugAgent and Jira automatically
New comments posted in bugAgent on Jira-synced reports are automatically pushed to the Jira issue. When checking for Jira updates, new Jira comments are detected and can be imported during the merge flow. Comments from Jira appear with a "[Jira — Author]" prefix. Duplicate prevention ensures comments are never imported twice. Reactions and edits are not synced between systems.
Bi-directional Jira Sync
featurejirasync
Bug reports now detect and merge changes from Jira automatically
When viewing a bug report synced to Jira, bugAgent now checks if the Jira issue has been modified. If changes are detected, an amber banner prompts you to review. A merge dialog shows field-by-field comparison (title, severity, status, type) so you can choose which value to keep. After confirming, both systems are updated to match. Comments are never synced. After editing a synced report locally, a Sync to Jira button appears to push your changes.
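The field-by-field comparison behind the merge dialog can be sketched as a diff over the four synced fields. The type and function names are illustrative, and the sketch assumes both sides have already been normalized to the same vocabulary (for example, Jira values mapped to bugAgent severities and statuses).

```typescript
// The fields compared in the merge dialog, per the entry above.
const SYNCED_FIELDS = ["title", "severity", "status", "type"] as const;
type SyncedField = (typeof SYNCED_FIELDS)[number];
type SyncedValues = Record<SyncedField, string>;

interface FieldConflict {
  field: SyncedField;
  local: string;
  remote: string;
}

// Lists every field whose value differs between bugAgent (local) and Jira (remote).
function diffSyncedFields(local: SyncedValues, remote: SyncedValues): FieldConflict[] {
  return SYNCED_FIELDS
    .filter((f) => local[f] !== remote[f])
    .map((f) => ({ field: f, local: local[f], remote: remote[f] }));
}

// Builds the merged record: remote wins for the chosen fields, local wins elsewhere.
function applyChoices(
  local: SyncedValues,
  remote: SyncedValues,
  keepRemote: SyncedField[],
): SyncedValues {
  const merged: SyncedValues = { ...local };
  for (const f of keepRemote) merged[f] = remote[f];
  return merged;
}
```

An empty diff means no banner is needed; after the user confirms, the merged record is what both systems are updated to.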
Team-Scoped Jira Integration
featurejirateams
Jira connections now belong to the team, not individual users
Jira integrations are now shared across all team members. Any member can sync bugs to Jira using the team connection. Only managers and owners can connect, configure, or disconnect. If the person who set up the integration leaves or is removed, the connection persists for all remaining and future team members.
Sentry Error Monitoring
improvementmonitoring
Added Sentry SDK to dashboard and MCP server
Integrated Sentry error tracking with separate projects for dashboard (javascript-astro) and MCP server (node-express). Configured nightly security scan at 3am via scheduled task.
MCP Server Security Hardening
securitymcp-server
Fixed 14 security and reliability issues across the MCP server
Fixed SQL injection in search, added rate limiting on auth, removed Jira sync privilege escalation, added file upload path traversal protection, input validation limits, 15s API timeouts, pendingAuth memory leak fix, credential file permissions, and externalized Stripe price IDs to env vars.
Nightly security scan
featuresecurityautomation
Scheduled automated security scan at 3am daily. Combines Sentry issue monitoring with local code security scanning for comprehensive overnight analysis.
Sentry MCP server
featuremcpsentry
Added Sentry MCP server for AI-powered error investigation and debugging directly from Claude Code. Enables querying Sentry issues, events, and Seer AI analysis.
Sentry integration
featuremonitoringsentry
Added Sentry error monitoring to dashboard and MCP server. Configured @sentry/astro for SSR error capture, @sentry/node for MCP server monitoring, and Sentry release tracking in CI/CD pipeline.
Fix new user signup from invite
fixauthdatabase
Fixed "Database error saving new user" when accepting invites by setting search_path on all SECURITY DEFINER PostgreSQL functions.
Fix invite email branding
improvementemailui
Updated invite email template with clean white CTA button, refined typography, and consistent dark theme. Removed party emoji from invite acceptance page for a more compact layout.
Italicize Agent in bugAgent branding across homepage
uibrandingwebsite
Updated all visible bugAgent text on the homepage to italicize the "Agent" portion using em tags, consistent with the logo styling. Applied across 10 section components including Hero, Features, How It Works, Pricing, and Footer.
Fix admin users page 500 error on Railway
bugfixdashboardsecurity
Fixed server-side environment variable access across all dashboard files. Replaced import.meta.env (build-time only) with process.env fallback for SUPABASE_SERVICE_ROLE_KEY so it resolves correctly at runtime in Docker/Railway deployments. Added graceful error handling for admin pages.
Automated security scanning on every commit
securitydevopsnew-feature
Added an 8-category security scanner that runs as a pre-commit hook and CI gate. Checks for hardcoded secrets, SQL injection, XSS, auth gaps, insecure patterns, dependency vulnerabilities, sensitive file exposure, and RLS/data access scoping.
Dashboard chart upgraded to grouped bar layout
dashboardenhancementui
Updated the actual dashboard bug reports chart from a single bar to a grouped 3-bar-per-day layout showing UI (blue), Performance (amber), and Crashes (red) categories with legend, Y-axis labels, gridlines, and detailed hover tooltips.
Homepage repositioned as agentic QA platform
new-featureimprovement
Added three new homepage sections positioning bugAgent as a bug enrichment and QA delegation platform: Platform Pillars, Delegate Testing, and Agent QA Swarm.
Three new sections added to the marketing homepage:
**Platform Pillars** — "Enrich bugs. Delegate testing. Heal code."
- Three-pillar overview with Enrich (Live), Delegate (Coming Soon), and Heal (Coming Soon)
- Each pillar card has feature checklist and interactive visual
- Enrich shows raw-to-enriched bug report transformation
- Delegate shows AI or PM requesting human QA
- Heal shows the agent swarm grid
**Delegate Testing** — "Real humans. Real testing. Requested by agents or you."
- Exploratory Testing card with demo showing CI Agent requesting human testers
- Automation Testing (Playwright) card with demo showing PM triggering test suite
- "Agents in the loop" callout explaining autonomous QA delegation
**Agent QA Swarm** — "Agents that lint, scan, and heal your codebase"
- 10 specialized agent cards: Code Lint, Dependency, Accessibility, Security, Performance, Visual Regression, API Contract, Dead Code, Test Coverage, Localization
- Each card shows what the agent does and its action types (auto-fix, scan, report)
- 3-step flow: Agents scan → Auto-heal → File reports
API reference page and public changelog
new-featureimprovement
Added a comprehensive API reference page at bugagent.com/api-reference documenting all REST endpoints, and a public changelog page at bugagent.com/changelog with tag filtering, date navigation, and RSS subscription.
Two new pages added to the marketing website:
**API Reference** (`/api-reference/`)
- Left sidebar with collapsible navigation sections for all endpoint categories
- Welcome/Getting Started, Introduction, Base URL, Authentication docs
- All REST endpoints documented with method badges, auth requirements, parameter tables, and response examples
- Covers: Auth, Reports, Comments, Projects, API Keys, Profile/Settings, Usage/Stats, Billing, Jira Integration, Changelog
- SDKs section (Node.js, Python, Rust, Go — coming soon)
- Contribute section and MIT License info
- Recommendations with best practices
- Mobile responsive with floating sidebar toggle
**Changelog** (`/changelog/`)
- Fetches entries from Supabase at build time
- Groups entries by date with right-side date navigation
- Color-coded tag filtering (security, api, mcp, new-feature, agents, improvement, bugfix)
- Expandable detail sections for each entry
- Subscribe via RSS button linking to `/changelog.xml`
Footer updated to link both API Reference and Changelog.
14 new MCP tools for autonomous agents
mcpnew-featureagents
Added full account lifecycle support for autonomous AI agents: registration, login, profile management, project CRUD, API key management, settings, and usage tracking.
New MCP tools:
- register_account, login — agent self-service onboarding
- update_profile, change_password — profile management
- get_settings, update_settings — notification preferences
- create_project, delete_project — project CRUD with plan limits
- flush_reports — bulk cleanup with owner/admin check
- generate_api_key, list_api_keys, regenerate_api_key, delete_api_key — API key lifecycle
- upgrade_plan — returns Stripe checkout URL
Agent accounts are flagged with is_agent=true in profiles.
Critical security fixes for MCP server
securitymcp
Fixed 3 store functions (listReports, getReport, updateReport) that had no user/team ownership scoping, allowing any authenticated user to access any report via the MCP server.
The MCP server uses a Supabase service role key that bypasses Row Level Security. Three core functions had no application-level ownership checks. Added userId parameter and .or() ownership filtering to all three, plus a shared getUserTeamId() helper to eliminate duplicate team lookups across 7+ functions.
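Because the service role key bypasses RLS, the ownership check has to be part of each query. The sketch below builds the filter string that would be passed to the Supabase client's `.or()` method. The `user_id` and `team_id` column names and the helper name are assumptions; the UUID validation is an extra precaution added here, since the ids are interpolated into a filter string.

```typescript
const UUID_RE = /^[0-9a-f]{8}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{4}-[0-9a-f]{12}$/i;

function assertUuid(value: string, label: string): void {
  if (!UUID_RE.test(value)) {
    throw new Error(`${label} is not a valid UUID`);
  }
}

// Builds a PostgREST "or" filter: the row belongs to the user OR to the user's team.
function ownershipFilter(userId: string, teamId: string | null): string {
  assertUuid(userId, "userId");
  const clauses = [`user_id.eq.${userId}`]; // assumed column name
  if (teamId !== null) {
    assertUuid(teamId, "teamId");
    clauses.push(`team_id.eq.${teamId}`);   // assumed column name
  }
  return clauses.join(",");
}

// Intended usage with supabase-js (not executed in this sketch):
//   supabase.from("bug_reports").select("*").or(ownershipFilter(userId, teamId))
```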
API key authentication for REST endpoints
apisecuritynew-feature
REST API endpoints now accept Bearer token authentication using ba_live_ API keys, in addition to session cookies.
Added middleware that validates Authorization: Bearer ba_live_... headers by SHA-256 hashing the token and looking up the api_keys table. Populates the request context with user and team info so all existing endpoints work automatically with API key auth.
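The header parsing and hashing steps could be sketched as follows, using Node's built-in `crypto` module. The function names are illustrative, and the lookup against the `api_keys` table is left out; the sketch only shows how a `ba_live_` token is extracted from the header and turned into the SHA-256 hex digest that would be used for the lookup.

```typescript
import { createHash } from "node:crypto";

const KEY_PREFIX = "ba_live_";

// Extracts the API key from an Authorization header, or returns null if it is not one.
function parseBearerApiKey(header: string | undefined): string | null {
  if (!header) return null;
  const match = /^Bearer\s+(\S+)$/i.exec(header.trim());
  if (!match) return null;
  const token = match[1];
  // Only ba_live_ tokens are treated as API keys; anything else is left to other auth paths.
  return token.indexOf(KEY_PREFIX) === 0 ? token : null;
}

// SHA-256 hex digest of the token, so the raw key never needs to be stored.
function hashApiKey(token: string): string {
  return createHash("sha256").update(token, "utf8").digest("hex");
}
```

Storing only the hash means a leaked `api_keys` table does not expose usable keys.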
Changelog system with RSS feed
new-featureimprovement
Added a changelog database, admin API for creating entries, and an RSS feed at /changelog.xml for tracking platform updates.
- New changelog_entries table with public read access
- GET /api/admin/changelog (public) lists entries
- POST /api/admin/changelog (admin only) creates entries
- /changelog.xml serves RSS 2.0 feed with content:encoded support
- Entries support tags for categorization (security, api, mcp, new-feature, etc.)
REST API parity with MCP tools
apinew-feature
Added 12 new REST API endpoints so API key holders can programmatically access all features previously only available through MCP tools.
New endpoints:
- POST /api/auth/register, POST /api/auth/login
- GET/POST /api/reports, GET/PATCH /api/reports/:id
- GET/POST /api/keys, DELETE /api/keys/:id, POST /api/keys/:id/regenerate
- GET/PATCH /api/profile, POST /api/profile/password
- GET/PATCH /api/settings
- GET /api/usage, GET /api/stats
All endpoints enforce ownership filtering and work with both session cookies and Bearer API key authentication.