
#338 – Pulling the Thread

Friday Ship #338 | March 10th, 2023


Last week we (finally) upgraded our infrastructure…

It started with an upgrade to NodeJS: the software that powers our servers. We were on a version set to expire by September and needed to upgrade in order to use the latest features, improve performance, and most importantly, stay secure.

Unfortunately, upgrading NodeJS required us to upgrade our operating system (we were a few versions behind there, too). Upgrading our OS required upgrading our hosting platform, Dokku (an open-source alternative to Heroku).

As luck would have it, that required upgrading Docker, installing a new version of Let’s Encrypt (our Certificate Authority), and updating a handful of other packages. What is typically a painless upgrade turned into a full two-day event for our infrastructure team. A story as old as time in the software industry.

Our original trusty server, and all the settings that power it, were painstakingly configured by our CEO Jordan Husney back in 2016 – long before we even had a DevSecOps team. We had hoped we’d migrate to Kubernetes before we needed to upgrade, but our luck had run out.

Rafael Romero Carmona, our newest Senior DevSecOps member, took this as a challenge. He spun up a new server running the latest software, copied over all of the bespoke settings he could find, and after we tested it on staging, we switched the DNS to the new server.

The result was a zero downtime upgrade!

However, as tech workers, we know that when something works perfectly the first time, panic ensues.

Sure enough, the next day, when we reached peak load, our app became sporadically unavailable for some users. The root cause: our reverse proxy’s file limit was not set high enough to support all of the traffic that we get on a typical Friday.
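
For readers curious what a limit like this looks like in practice, here is a minimal Python sketch. It assumes the "file limit" is the operating system's open-file-descriptor limit (`RLIMIT_NOFILE`), and that a proxied connection holds roughly two descriptors: one to the client and one to the upstream app. The function names, the two-per-connection figure, and the reserved headroom are illustrative assumptions, not our actual configuration.

```python
import resource  # Unix-only standard library module


def descriptors_needed(concurrent_connections: int,
                       per_connection: int = 2,
                       reserved: int = 64) -> int:
    """Rough estimate of the descriptors a proxy needs at a given load.

    per_connection=2 assumes one socket to the client and one to the upstream.
    reserved is a hypothetical allowance for logs, listeners, and the like.
    """
    return concurrent_connections * per_connection + reserved


def has_headroom(soft_limit: int, concurrent_connections: int) -> bool:
    """Return True if the soft limit covers the estimated need."""
    if soft_limit == resource.RLIM_INFINITY:
        return True  # no limit configured
    return soft_limit >= descriptors_needed(concurrent_connections)


def current_soft_limit() -> int:
    """Read the open-file soft limit for the current process."""
    soft, _hard = resource.getrlimit(resource.RLIMIT_NOFILE)
    return soft


if __name__ == "__main__":
    limit = current_soft_limit()
    # Hypothetical load levels, for illustration only.
    for load in (500, 5_000, 50_000):
        print(f"{load} connections: enough headroom = {has_headroom(limit, load)}")
```

Under these assumptions, a common default limit of 1024 covers fewer than 500 simultaneous proxied connections, which is why a server can look healthy on staging and still fall over at peak load.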

Our team in Europe discovered, triaged, and fixed the bug before I even sat down for breakfast in California. While remote work can be challenging at times, events like this are a clear reminder that a globally distributed team gives us a huge advantage when it comes to site reliability.

Metrics

Parabol's Metrics for March 2023 show that web traffic was down, monthly active users dipped slightly, and so did weekly meetings.

Usage is generally down this week. After several strong weeks, a dip like this is expected. We’ll investigate further if the trend continues.

This week we…

…wrapped up Sprint 117. This included building the foundation for our new Activity Library, which will support dozens of new meeting types.

…held our first Developer Experience (DX) retro. Engineers shared their pain points & we created some next steps to make development here a little better.

Next week we’ll…

…take a week off from sprinting to focus on the features that we personally care about. We call it slack week & it’s where we get to focus on the issues that matter the most to us, even if they don’t align with our sprint goals.


Have feedback? See something that you like or something you think could be better? Please write to us.

Matthew Krick


Matt is a full-stack web developer, data scientist, and global project manager. He has previously worked for Peace Corps, Ecova, and Boeing, and is the creator and lead developer of several open-source projects including Meatier and Cashay. Matt lives and works in San Diego, CA.
