Diary of Engineering Learnings
This page is updated whenever we have a hard-fought learning that future Pivot engineers should know about.
2025-09-19: Cloudflare can block legitimate RPC calls as suspicious
Today we experienced a significant outage where specific RPC calls were getting blocked by Cloudflare with 403 Forbidden responses. Cloudflare's security filtering was analyzing our URL paths and request bodies, flagging certain legitimate RPC calls as suspicious. This delayed our incident response by 30+ minutes because we initially assumed the issue was with a recent frontend or backend deployment and couldn't find anything wrong in our application logs.
Lesson: if we are getting 403 Forbidden errors, we absolutely need to check Cloudflare security events immediately. Don't spend time looking for application-level issues first.
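One way to make this lesson harder to forget is to attach a triage hint to any 403 seen in monitoring or client error handling. The sketch below is only a heuristic under stated assumptions: `triage_403` is a hypothetical helper, and it assumes our RPC handlers always reply with JSON, so a 403 with a non-JSON body probably came from the edge rather than from our code. It also surfaces the `cf-ray` response header, which Cloudflare adds to proxied responses and which can be used to find the matching entry in Cloudflare security events.

```python
from typing import Mapping, Optional


def triage_403(status: int, headers: Mapping[str, str]) -> Optional[str]:
    """Return a triage hint for a 403 response, or None for any other status.

    Hypothetical helper: the JSON check is an assumption about our RPC
    responses, not something Cloudflare guarantees.
    """
    if status != 403:
        return None

    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    ray_id = lowered.get("cf-ray")
    content_type = lowered.get("content-type", "")

    if "application/json" not in content_type:
        # Assumption: our application would have answered with JSON.
        hint = "403 without a JSON body: check Cloudflare security events first"
    else:
        hint = "403 with a JSON body: may be application-level, still check Cloudflare"

    if ray_id:
        # The Ray ID identifies the request on the Cloudflare side.
        hint += f" (Ray ID {ray_id})"
    return hint
```

Logging the returned hint next to the failing request gives whoever is on call a pointer to Cloudflare before they start reading application logs.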
2024-12-02: Synadia Cloud can go down
NATS JetStream can return errors when we publish to it, or take long enough that our services time out. Most of the time, this means we should publish on a best-effort basis without blocking the request. Sometimes we do need to wait for successful publication, but only when an outage is better than degradation. For example, if we receive a Stripe webhook and publish it to NATS, we want to fail the request when NATS is failing, but that is the exception rather than the rule.
Lesson: avoid treating NATS as a single point of failure. If you can write to Postgres/Cassandra and respond 200, then do that. Don't let a NATS timeout lead the client to think the request failed.
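A minimal sketch of the best-effort pattern, assuming an async Python service. `publish_event`, its `timeout_s` default, and the `required` flag are hypothetical names for illustration, and `publish` stands in for whatever coroutine actually talks to NATS; this is not our real client code.

```python
import asyncio
import logging
from typing import Awaitable, Callable

logger = logging.getLogger("events")


async def publish_event(
    publish: Callable[[], Awaitable[None]],
    *,
    timeout_s: float = 0.5,  # hypothetical default; tune per endpoint
    required: bool = False,
) -> bool:
    """Publish with a bounded wait. Returns True if the publish succeeded.

    required=False: log the failure and let the request carry on.
    required=True: re-raise so the caller fails the request (for example,
    a webhook that the sender will retry).
    """
    try:
        await asyncio.wait_for(publish(), timeout=timeout_s)
        return True
    except Exception:
        if required:
            raise
        logger.warning("event publish failed; continuing without it", exc_info=True)
        return False
```

A handler would write to Postgres/Cassandra first, then call `await publish_event(...)` and return 200 whether or not the publish succeeded. The Stripe webhook case would pass `required=True`. The bounded wait still adds up to `timeout_s` of latency; running the publish in a background task is another option if even that is too much.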
2024-12-01: Websockets are weird when it comes to cookies
Chrome suddenly started rejecting ws.pivot.app requests before they could even leave the client.
Lesson: don't set cookies on .pivot.app. The browser will fail trying to send them to ws.pivot.app. Only set cookies on the root domain pivot.app (meaning, don't specify the domain at all).
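To illustrate "don't specify the domain at all", here is a small sketch using Python's standard `http.cookies` module. `host_only_cookie` is a hypothetical helper, and the `Path`, `Secure`, and `HttpOnly` attributes are illustrative choices rather than our actual cookie settings. The point is only that the `Domain` attribute is never set, so the browser scopes the cookie to the exact host that issued it.

```python
from http.cookies import SimpleCookie


def host_only_cookie(name: str, value: str) -> str:
    """Build a Set-Cookie header value with no Domain attribute."""
    cookie = SimpleCookie()
    cookie[name] = value
    morsel = cookie[name]
    morsel["path"] = "/"
    morsel["secure"] = True
    morsel["httponly"] = True
    # Deliberately no morsel["domain"]: omitting Domain makes the cookie
    # host-only, so it is not shared with subdomains such as ws.pivot.app.
    return morsel.OutputString()
```

Whatever framework sets our cookies, the equivalent is to leave its domain option unset rather than passing `.pivot.app`.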