#224 – Data Plumbing
This week, we shipped changes to try to resolve problems we’ve had with our marketing data accuracy.
Parabol is a data-driven company. When 2020 began and our team began to grow, so did the number, frequency, and depth of questions we wanted to ask of our data in order to guide our decision-making. Questions like:
- Which marketing channels are driving the most traffic?
- Which paid advertising campaigns are driving the right sort of users to use our product?
- Which companies are most likely to purchase our product?
These questions were surprisingly hard to answer. Now, you might be thinking: analytics is a solved problem! Just use solution X! Not so fast.
Parabol’s data has several stakeholders. First and foremost are our users: when they interact with Parabol, it should behave the way they want and expect. Second is our product team, who should be able to see which journeys and features are working and which need improvement. Third is our marketing team, who should be able to see where our users are coming from and which channels are driving the right sort of usage and revenue. Last is our sales team, who should be able to focus their time on speaking with the right prospects and provide them with accurate information.
To serve all these stakeholders, we have to bring together events and analyses from our marketing web site, our web application, and our server application. These data must be gathered whether users are authenticated (i.e. we know their Parabol username) or not (i.e. they’ve just landed on one of our properties for the first time). The information must be gathered in compliance with international data privacy regulations. Further, these data have to be served to the downstream applications our stakeholders use: product folks use Amplitude, marketing folks use Google Analytics, and our sales folks use HubSpot CRM. All of these applications have their own APIs, ways of organizing information, and eccentricities.
One way we try to reduce the burden of feeding all of these systems is by using Segment: by abstracting everything we want to measure into a Segment event, we can let Segment handle how to deliver it to each downstream application. Simple enough, right? Not so fast, again!
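To make the “one event, many destinations” idea concrete, here is a minimal sketch. It is not Segment’s SDK or our production code: `TrackEvent`, `Destination`, `fanOut`, and the payload shapes are all hypothetical names invented for illustration, and the real downstream APIs look different.

```typescript
// Hypothetical sketch: none of these names come from Segment's SDK.
interface TrackEvent {
  name: string;
  anonymousId: string;
  userId?: string;
  properties: Record<string, string>;
}

// A destination translates the shared event into its own payload shape.
type Destination = (event: TrackEvent) => Record<string, string>;

// Illustrative payload shapes only; the real APIs differ.
const destinations: Record<string, Destination> = {
  productAnalytics: (e) => ({
    event_type: e.name,
    user_id: e.userId ?? e.anonymousId,
  }),
  webAnalytics: (e) => ({
    action: e.name,
    client_id: e.anonymousId,
  }),
  crm: (e) => ({
    timeline_event: e.name,
    contact_email: e.properties.email ?? "",
  }),
};

// Send one abstract event through every destination's translator.
function fanOut(event: TrackEvent): Record<string, Record<string, string>> {
  const out: Record<string, Record<string, string>> = {};
  for (const [name, translate] of Object.entries(destinations)) {
    out[name] = translate(event);
  }
  return out;
}

const delivered = fanOut({
  name: "Account Created",
  anonymousId: "anon-123",
  userId: "user-42",
  properties: { email: "ada@example.com" },
});
console.log(delivered);
```

The application only ever describes the event once; each translator decides which identifier and fields its tool cares about. That difference in identifiers is exactly where things can go sideways.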
An acute, recent problem for us has been getting Google Analytics channel attribution working correctly. If a user signs up for Parabol, we want to know if it was via an email invitation from another Parabol user, a paid ad, a referral, etc. While Google Analytics was providing us with some data, it looked wrong. Most of the traffic was being attributed as “direct,” which is Google Analytics-speak for “I don’t know where this traffic came from.”
When we reviewed the data in Segment’s tools, everything looked correct: anonymous users were visiting the Parabol marketing site, the same anonymous users were visiting Parabol’s application, and when they signed up we paired their anonymous id with their new Parabol user id. We struggled to debug this for weeks. We needed help, and ended up paying Segment for time with a solutions architect. We were grateful to learn debugging this was difficult for them too!
As it turned out, Google Analytics doesn’t have a mechanism to tie the anonymous user id together with a stable, application-provided id, so the channel attribution would always be wrong. We needed to implement a workaround, specific to Google Analytics.
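A toy model shows why the missing link matters. This is an illustrative sketch only, not Google Analytics’ or Segment’s API and not the workaround we shipped: `AttributionStore` and its methods are hypothetical, and the store simply remembers the first channel seen for each anonymous id.

```typescript
// Hypothetical in-memory model; not Google Analytics' or Segment's API.
type Channel = "email invitation" | "paid ad" | "referral" | "direct";

class AttributionStore {
  private firstTouch = new Map<string, Channel>(); // keyed by anonymous id
  private aliases = new Map<string, string>(); // user id -> anonymous id

  // Remember the first channel that brought this anonymous visitor in.
  recordVisit(anonymousId: string, channel: Channel): void {
    if (!this.firstTouch.has(anonymousId)) {
      this.firstTouch.set(anonymousId, channel);
    }
  }

  // Pair the anonymous id with the new user id at signup.
  linkOnSignup(anonymousId: string, userId: string): void {
    this.aliases.set(userId, anonymousId);
  }

  // Without a link back to the anonymous visit, all we can say is "direct".
  channelFor(userId: string): Channel {
    const anonymousId = this.aliases.get(userId);
    if (anonymousId === undefined) return "direct";
    return this.firstTouch.get(anonymousId) ?? "direct";
  }
}

// A tool that can link the two ids attributes the signup correctly.
const linked = new AttributionStore();
linked.recordVisit("anon-123", "paid ad");
linked.linkOnSignup("anon-123", "user-42");
console.log(linked.channelFor("user-42")); // "paid ad"

// A tool that never learns about the link falls back to "direct".
const unlinked = new AttributionStore();
unlinked.recordVisit("anon-123", "paid ad");
console.log(unlinked.channelFor("user-42")); // "direct"
```

The second store saw the same paid-ad visit, but because nothing connects `anon-123` to `user-42`, the signup looks like it came from nowhere.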
While we were digging into the problem with the solutions architect, we asked, “Is it always this hard? Do all companies have this issue?”
He replied, “Oh yes, it’s spaghetti everywhere.”
There’s No Shame in Hard Things
On the internet, everybody seems to win all the time. Nothing is hard. Your company is always behind. Of course, this isn’t true. It’s become normal to hide how difficult all of this really is.
Not so here. This stuff is hard. And there is no shame in saying so.
Almost all of our top- and mid-funnel metrics were up this week, save a slight decline in the number of weekly meetings run. There were no new trends or signals needing our attention this week. All systems nominal!
This week we…
…published a new blog post on 8 Tips for Better Remote Teamwork.
…shipped v5.19.1 into production. We fixed a bug where the latest meetings weren’t always appearing on the Timeline, some interactive demo bugs, and multi-player task card editing state bugs. We also attempted a new fix for Google Analytics channel attribution, as documented above.
…continued full-bore working on Sprint Poker.
…planned our next all-company retreat, S.P.A. 16. Due to the global pandemic, it’ll be remote once again. We plan to suspend our usual duties from October 19th to October 23rd.
…kicked off a couple of batting practices for our open roles.
Next week we’ll…
…host retrospectives at the eXperience Agile conference. Parabol will be used to gather feedback on the conference and improve the experience for everybody in real time.
…wrap up Sprint #65.
Have feedback? See something that you like or something you think could be better? Leave a public response here, or write to us.