Pablo's Architecture Proposal
A faithful write-up of what Pablo wants to build and change in the Call Analyst architecture, framed so Mark can decide. Pablo's case is presented in full; an honest assessment section follows so the tradeoffs are visible.
Subject: Call Analyst (callanalyst.app) · Proposed by: Pablo (senior-engineer collaborator) · Companions: the build docs and the refactor plan.
Executive summary
Pablo is proposing to take Call Analyst from a solo proof-of-concept (a single-file PWA on managed serverless) to a production-grade, AWS-native, compliance-ready product. The proposal has two halves that reinforce each other:
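- The engineering half: decompose the monolith and move hosting, backend, packaging, and infrastructure definition onto AWS (sections 1 to 6).
- The legal half: the consent, privacy, and data-handling layer a meeting recorder needs from its first real user (section 7).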
His stated goals: reliability, scalability, predictable cost, security, and the legal standing to actually operate. His core argument: use the right tools and "do it right" now, with infra he already has battle-tested, rather than ship a centralized POC and retrofit later.
The core thesis
Pablo's central point is that "it works and it's centralized" is not the same as "it's a product." Shipping a POC as a full-fledged product means crossing a set of engineering and legal thresholds that the current build has not crossed. He argues that the gap is real, that the tooling to close it already exists (his CDK templates), and that closing it is "not a large lift" when the right tools are used and he carries the infra.
He also reframed an earlier reliability debate: the headline isn't a specific number of nines, it's whether the product is built to be operated, audited, and trusted at production standard. On that framing, the productionization gap is the real subject.
Where it stands today (what Pablo wants to move off of)
| Layer | Today |
|---|---|
| Frontend | One ~20,000-line single-file PWA (index.html), HTML+CSS+JS inline, no build step |
| Hosting | Netlify (managed); the app auto-publishes from main |
| Backend | Netlify Node functions + Deno edge functions (managed serverless) |
| Data / auth | Supabase (managed Postgres + auth + storage), plus Cloudflare R2 |
| Build segregation | Consumer vs internal via a runtime flag and a drifting git branch |
| Secrets | Scattered as per-site env vars; some keys in plaintext config; no rotation |
| Compliance | Not yet addressed — no consent framework, GDPR, residency, retention, or DPAs |
| Network posture | Public managed endpoints; no VPC / private subnets (there is no network to isolate in a managed-SaaS model) |
1 · Decompose the monolith
Separate the HTML, JS, and CSS into distinct files and modules. End the single 20k-line file. Pablo frames this as table stakes for a "grown-up" build, and as a prerequisite for any security or privacy review (an opaque single file is effectively unauditable).
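A minimal sketch of a first mechanical pass at this split, assuming the inline code lives in ordinary `<script>` and `<style>` blocks. The names `InlineAssetCollector` and `extract_inline_assets` and the sample markup are illustrative, not taken from the real index.html.

```python
from html.parser import HTMLParser


class InlineAssetCollector(HTMLParser):
    """Collects the bodies of inline <script> and <style> elements."""

    def __init__(self):
        super().__init__()
        self.scripts = []      # inline JS blocks, in document order
        self.styles = []       # inline CSS blocks, in document order
        self._current = None   # tag currently being captured, if any
        self._buffer = []

    def handle_starttag(self, tag, attrs):
        # <script src="..."> is already an external file, so skip it.
        if tag == "script" and dict(attrs).get("src"):
            return
        if tag in ("script", "style"):
            self._current = tag
            self._buffer = []

    def handle_data(self, data):
        if self._current:
            self._buffer.append(data)

    def handle_endtag(self, tag):
        if tag == self._current:
            body = "".join(self._buffer).strip()
            target = self.scripts if tag == "script" else self.styles
            if body:
                target.append(body)
            self._current = None


def extract_inline_assets(html):
    """Return the inline JS and CSS found in an HTML string."""
    collector = InlineAssetCollector()
    collector.feed(html)
    collector.close()
    return {"js": collector.scripts, "css": collector.styles}


sample = """<html><head><style>body { margin: 0; }</style>
<script src="vendor.js"></script></head>
<body><script>console.log("hi");</script></body></html>"""

assets = extract_inline_assets(sample)
```

This only inventories what is inline. Rewriting the HTML to reference the extracted files, and untangling globals shared between blocks, is where the real effort in the split would go.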
2 · Frontend on a CDN (CloudFront + S3)
Serve the static frontend from Amazon S3 behind CloudFront, defined by a spot-optimized, ready-to-go CDK template Pablo already has. This replaces Netlify hosting for the consumer app with AWS-native, globally cached delivery under unified AWS control.
3 · VPC-isolated Lambda backends
Move the backend from Netlify/Deno functions to AWS Lambda, deployed inside a VPC with private subnets (network isolation), via Pablo's second CDK template. The goal is a controlled network boundary, fine-grained IAM, and a single, auditable security perimeter for everything that touches user data. Pablo considers the Lambda lift small given the templates already exist.
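To give a sense of the per-function porting work, here is a hedged sketch of one handler in the shape Lambda expects behind an API Gateway proxy integration (an event carrying a `body` string in, a dict with `statusCode`, `headers`, and `body` out). It is written in Python for brevity, although the existing functions are Node and Deno. The `meeting_id` field and `summarize_meeting` helper are hypothetical stand-ins, not part of the current codebase.

```python
import json


def summarize_meeting(meeting_id):
    # Hypothetical stand-in for the business logic being ported.
    return {"meeting_id": meeting_id, "status": "queued"}


def _response(status, body):
    """Build a proxy-integration response dict."""
    return {
        "statusCode": status,
        "headers": {"Content-Type": "application/json"},
        "body": json.dumps(body),
    }


def handler(event, context):
    """Lambda entry point: validate the request, then call the logic."""
    try:
        payload = json.loads(event.get("body") or "{}")
    except json.JSONDecodeError:
        return _response(400, {"error": "body must be valid JSON"})

    if not isinstance(payload, dict):
        return _response(400, {"error": "body must be a JSON object"})

    meeting_id = payload.get("meeting_id")
    if not meeting_id:
        return _response(400, {"error": "meeting_id is required"})

    return _response(200, summarize_meeting(meeting_id))
```

The handler itself is small. What the sketch leaves out is the part that takes time: VPC networking, IAM permissions, and moving each function's data access off the current managed services.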
4 · Containers & images
Pablo wants container images as part of the build ("we need our images"). In a Lambda-centric design this most directly means container-image-packaged Lambdas and/or a path to ECS/Fargate for any workload that needs it. It standardizes how code is packaged and deployed, and gives a clean home for any long-running worker (for example agentic background jobs or media/transcription processing that exceeds serverless time limits).
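The "workload that needs it" test can be stated as a simple rule. The sketch below assumes job duration is the deciding factor and uses AWS Lambda's 15-minute (900-second) maximum timeout. The function name, the `headroom` safety factor, and the `"container-worker"` label are illustrative assumptions, not part of Pablo's proposal.

```python
LAMBDA_MAX_TIMEOUT_SECONDS = 900  # AWS Lambda's maximum timeout (15 minutes)


def choose_runtime(expected_seconds, headroom=0.5):
    """Route a job to Lambda only if it fits comfortably under the timeout.

    headroom is an assumed safety factor: with 0.5, any job expected to take
    more than half the Lambda limit goes to a container worker instead.
    """
    if expected_seconds <= 0:
        raise ValueError("expected_seconds must be positive")
    budget = LAMBDA_MAX_TIMEOUT_SECONDS * headroom
    return "lambda" if expected_seconds <= budget else "container-worker"
```

Under these assumptions a short request handler stays on Lambda, while an hour-long transcription job would be routed to a container worker such as an ECS/Fargate task.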
5 · Everything as CDK / CloudFormation
Express the entire stack as AWS CDK / CloudFormation — infrastructure as code — so the frontend, backend, network, and security are unified, versioned, reproducible, and owned in one place. This is the backbone of the proposal: one selected infrastructure that unifies security and operations, rather than a spread of managed SaaS each with its own controls.
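A deliberately tiny sketch of what "infrastructure as code" means in practice: a CloudFormation template for a private, encrypted, versioned S3 bucket, built as a Python dictionary and serialized to JSON. This is not Pablo's CDK template, which is not reproduced here, and the logical ID `FrontendBucket` is a hypothetical name. CDK synthesizes templates of this general shape.

```python
import json


def frontend_bucket_template():
    """Return a minimal CloudFormation template as a plain dict."""
    return {
        "AWSTemplateFormatVersion": "2010-09-09",
        "Description": "Sketch: private S3 bucket for static frontend assets",
        "Resources": {
            "FrontendBucket": {  # hypothetical logical ID
                "Type": "AWS::S3::Bucket",
                "Properties": {
                    # Block every form of public access; a CDN would read
                    # the bucket through its own granted permissions.
                    "PublicAccessBlockConfiguration": {
                        "BlockPublicAcls": True,
                        "BlockPublicPolicy": True,
                        "IgnorePublicAcls": True,
                        "RestrictPublicBuckets": True,
                    },
                    # Encrypt objects at rest with S3-managed keys.
                    "BucketEncryption": {
                        "ServerSideEncryptionConfiguration": [
                            {
                                "ServerSideEncryptionByDefault": {
                                    "SSEAlgorithm": "AES256"
                                }
                            }
                        ]
                    },
                    # Keep old object versions so a bad deploy can be undone.
                    "VersioningConfiguration": {"Status": "Enabled"},
                },
            }
        },
    }


template_json = json.dumps(frontend_bucket_template(), indent=2)
```

Because the definition is plain text, it can be reviewed, diffed, and versioned like any other code, which is the property the proposal relies on.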
6 · Repo & ownership structure
Pablo proposes feature-grouped separate repos rather than one general, multi-faceted repo, to force focus and a more mature build. Critically, he offers to own the AWS infrastructure himself. That ownership is a central variable: it changes who carries the operational burden of the migration (see the honest assessment below).
7 · The legal / compliance layer: the part a POC skips
Pablo's strongest point: Call Analyst records people's meetings, which makes it a processor of voice and transcript data full of personal information, often including people who never opted in. That front-loads a legal layer from the first real user, independent of how clean the code is:
- Recording consent: all-party consent. US two-party-consent states (California, Florida, and others) require every participant's consent, and GDPR requires a lawful basis and clear notice for everyone recorded. This is the defining legal risk for any notetaker.
- GDPR / CCPA — lawful basis, privacy policy, ToS, right to access, right to erasure (the ability to actually delete a user's data and their meeting data on request).
- Data residency & geofencing — keep EU user data in-region, or geofence the launch (e.g. US-first) to defer GDPR by intent rather than by accident.
- Retention & deletion policy, records of processing.
- Subprocessor DPAs — data-processing agreements with everyone the audio/transcript touches: Recall, Anthropic, OpenAI, Supabase, Stripe.
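The consent and geofencing items above end up as application logic wherever the app is hosted. A minimal sketch, assuming each participant record carries a boolean `consented` flag and the edge supplies a two-letter country code (CloudFront, for example, can forward a `CloudFront-Viewer-Country` header when configured to do so). The names `Participant`, `may_record`, and `LAUNCH_COUNTRIES` are hypothetical, and applying the all-party rule everywhere is a simplifying assumption, not a statement of what the law requires in each jurisdiction.

```python
from dataclasses import dataclass

LAUNCH_COUNTRIES = {"US"}  # assumed US-first geofence


@dataclass
class Participant:
    name: str
    consented: bool


def in_launch_region(country_code):
    """True if the viewer's country is inside the launch geofence."""
    return (country_code or "").upper() in LAUNCH_COUNTRIES


def may_record(participants, country_code):
    """Allow recording only inside the geofence and with all-party consent."""
    if not in_launch_region(country_code):
        return False
    if not participants:
        return False
    return all(p.consented for p in participants)
```

For example, `may_record([Participant("Ana", True), Participant("Ben", False)], "US")` returns `False`, because one participant has not consented.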
Before → after, at a glance
| Layer | Today (POC) | Pablo's target |
|---|---|---|
| Frontend code | 20k-line single file | → split HTML/JS/CSS, modular |
| Frontend hosting | Netlify | → S3 + CloudFront (CDK) |
| Backend | Netlify + Deno functions | → VPC-isolated Lambdas (CDK) |
| Packaging | raw files | → container images |
| Infra definition | console + scattered env | → CDK / CloudFormation, unified |
| Network | public managed endpoints | → VPC, private subnets, IAM |
| Repos | one repo | → feature-grouped repos |
| Secrets | plaintext / per-site env | → KMS + rotation + least privilege |
| Compliance | none | → consent, GDPR, residency, retention, DPAs |
| Ownership | Mark + AI | → Pablo owns the infra |
What Pablo is optimizing for
- Reliability
- Scalability
- Predictable cost
- Security
- Legal standing
- "Do it right"
Honest assessment (so the tradeoffs are visible)
Where Pablo is clearly right
- The compliance gap is real and front-loaded. For a meeting recorder, consent + GDPR + deletion are not "after product-market fit." They apply from the first real user. This was the strongest point in the discussion, and it was missing from the earlier engineering-only plan.
- "Centralized" is not "shippable." A 20k-line file serves traffic fine, but it is close to unauditable, which is a liability the moment a privacy or security review arrives. Splitting it is more urgent under a compliance lens, not less.
- Controlled infra is the right home for the compliance primitives (residency, KMS, IAM, geofencing).
- Ready templates + Pablo owning the infra changes the cost calculus. The original "premature, months of work, spends Mark's runway" objection weakens substantially when the senior engineer carries it on infra he has already built.
The distinctions Mark should hold onto
- AWS is not compliance. The migration buys better primitives; it does not make the product GDPR-compliant. The consent flow, deletion pipeline, DPAs, privacy policy, and retention policy are application + legal + process work that must happen regardless of host. Treat them as two tracks that meet, not one.
- The template is the easy 20%; the migration is the 80%. Standing up the CDN/VPC skeleton is fast. The real labor is porting ~40 functions to Lambda and migrating auth + Postgres + storage off Supabase, plus the cutover. Worth estimating eyes-open.
- Reliability is earned, not bought. AWS provides the floor (the current stack already rides on AWS via Supabase/Netlify); production reliability comes from architecture and operations, and someone must own the pager.
- Monorepo vs polyrepo is an open question. The frontend is one deployable app, not independent services, so feature-grouped repos add coordination cost without mapping to separate deployments. A monorepo with clean module boundaries delivers the same "grown-up" structure with less friction. Worth deciding deliberately.
- Containerize where a workload needs it (long-running workers), not everywhere, to avoid orchestration overhead on short-lived handlers.
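The first distinction above, that the deletion pipeline is application work regardless of host, can be made concrete. Below is a minimal sketch of an erasure request fanned out to every store holding a user's data, with a per-store result so failures can be retried and the outcome recorded. The store names and delete callables are hypothetical in-memory stand-ins, not real Supabase, R2, or vendor APIs.

```python
def erase_user(user_id, stores):
    """Run every store's delete function and report the outcome per store.

    stores maps a store name to a callable that takes user_id. A failure in
    one store does not stop the others, so a retry can target only the
    stores that failed.
    """
    results = {}
    for name, delete in stores.items():
        try:
            delete(user_id)
            results[name] = "deleted"
        except Exception as exc:  # record the failure and keep going
            results[name] = f"failed: {exc}"
    return results


def is_complete(results):
    """An erasure is complete only when every store confirmed deletion."""
    return all(status == "deleted" for status in results.values())


# In-memory stand-ins for the real data stores.
database = {"u1": ["profile"], "u2": ["profile"]}
recordings = {"u1": ["call.mp3"]}


def broken_store(user_id):
    raise RuntimeError("vendor timeout")


outcome = erase_user("u1", {
    "database": lambda uid: database.pop(uid, None),
    "recordings": lambda uid: recordings.pop(uid, None),
    "transcription-vendor": broken_store,
})
```

Here `outcome` reports the two local stores as deleted and the vendor as failed, so `is_complete(outcome)` is `False` and the request stays open until the vendor deletion succeeds. The same logic would be needed on Netlify or on AWS.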
Proposed sequencing (do it right without blowing the launch)
- Scope the launch by geography. Launch US-first, handle recording consent to the standard of the two-party-consent states, and defer GDPR by deliberate geofence. This makes a near-term launch legal without boiling the ocean.
- Stand up Pablo's CDK foundation — S3 + CloudFront frontend, VPC-isolated Lambda backend, KMS for secrets, region-pinned storage. The skeleton he already has.
- Run the legal / data track in parallel — consent capture, a real deletion pipeline, subprocessor DPAs, privacy policy + ToS. The part no infra does for you.
- Split the frontend into modules as it lands on the CDN, so it is auditable for the review compliance will trigger.
- Port functions to Lambda and migrate auth/data off Supabase — the genuine labor, sequenced after the skeleton, with a tested cutover and rollback.
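The proposal does not specify how the "tested cutover and rollback" in the last step would work. One common technique, swapped in here purely as an illustration, is a percentage-based rollout: each user is hashed into a stable bucket, a configurable share of buckets goes to the new backend, and rollback means setting that share back to zero. The function names and the `"lambda"` / `"netlify"` labels are hypothetical.

```python
import hashlib


def bucket_for(user_id):
    """Map a user ID to a stable bucket in the range 0..99."""
    digest = hashlib.sha256(user_id.encode("utf-8")).hexdigest()
    return int(digest, 16) % 100


def backend_for(user_id, lambda_percent):
    """Send lambda_percent of users to the new backend, the rest to the old.

    Hashing keeps each user on the same side between requests, and raising
    lambda_percent only ever moves users from old to new, never back.
    """
    if not 0 <= lambda_percent <= 100:
        raise ValueError("lambda_percent must be between 0 and 100")
    return "lambda" if bucket_for(user_id) < lambda_percent else "netlify"
```

With `lambda_percent` at 0 every user stays on the existing backend, and at 100 every user is on the new one. Rollback is a configuration change, not a redeploy, provided both backends can read the same data during the transition.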
Decisions for Mark
| Decision | The choice |
|---|---|
| Migrate to AWS now, or after PMF? | Pablo owning the infra + ready templates makes "now" reasonable. The gate is sequencing it so it doesn't consume the launch window. |
| Launch scope | US-first geofence (defer GDPR) vs full EU readiness on day one. |
| Repos | Feature-grouped repos (Pablo) vs one monorepo with module boundaries. |
| Who owns the infra long-term | Pablo carries AWS — confirm this is durable, since reliability follows ownership. |
| Compliance track owner + budget | The legal work (consent, deletion, DPAs, policy) needs an owner and likely outside counsel. This is the non-optional half. |