#398 – Team Insights Model Development

Friday Ship #398 | May 24th, 2024

This week we evaluated the prototypes we developed during last week’s Product Team retreat to San Diego for our upcoming team insights feature.

It’s hard for us to believe, but folks have held more than 6,000,000 discussions on the Parabol SaaS platform. That’s an awful lot of reflecting, brainstorming, and improving the way teams work. Since very early in our company’s history, we have dreamt of being able to analyze these data at scale to give teams and leaders insights into the systemic issues that keep teams from doing their best work. The astounding rate of advancement of large language models (“LLMs”) has turned what was once a difficult and costly engineering problem into something well within the reach of our small company to tackle on its own.

Last year, we began developing the very first LLM-based prototypes for transforming a team’s data into insights. Over the past few months we shipped a generalized architecture for transforming our data into formats that are more readily used by today’s LLM technologies (embedding vectors and plaintext representations, for the technical among our readers).

Last week our team gathered in La Jolla, California (near San Diego) to develop several prototypes on top of this architecture and evaluate approaches for extracting team-level and organization-wide insights from these data (we previously wrote about planning this retreat in Friday Ship #396). We rented a big house and spent nearly the entire week developing varied approaches and putting them to the test with real customers. Several folks at Parabol are fans of the card game “Dominion,” so we offered a copy of the game as a prize for the top-performing model.

Models were developed on a common set of training data (including some 8 years’ worth of the Parabol organization’s own meeting data). When it came time to evaluate model performance, novel data the developers hadn’t previously had access to was run through each model, and the model outputs were gathered and sent to customers for evaluation in a survey instrument.

While we’re not quite ready to reveal example insights publicly yet, we can show you what a piece of our model performance evaluation looked like. Below is a graph of the 7 models developed last week and their mean aggregate scores as collected by our survey instrument. Here, the model codenamed “Radiance” was the top performer, just edging out “Skylark” and “Whisperwind.”
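For the technical among our readers, here is a minimal sketch of how a mean aggregate score per model could be computed from survey responses. The model names, the scores, and the 1 to 5 scale below are placeholders invented for illustration; they are not our actual survey data or scoring scheme.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical survey responses: (model name, score from one respondent).
# Names, scores, and the 1-5 scale are made up for illustration only.
responses = [
    ("model_a", 5), ("model_a", 4),
    ("model_b", 4), ("model_b", 4),
    ("model_c", 3), ("model_c", 4),
]


def mean_scores(responses):
    """Group scores by model and return {model: mean score}."""
    by_model = defaultdict(list)
    for model, score in responses:
        by_model[model].append(score)
    return {model: mean(scores) for model, scores in by_model.items()}


def rank_models(responses):
    """Return (model, mean score) pairs sorted from best to worst."""
    return sorted(mean_scores(responses).items(),
                  key=lambda pair: pair[1], reverse=True)


ranking = rank_models(responses)
for model, score in ranking:
    print(f"{model}: {score:.2f}")
```

A real evaluation would likely also track how many responses each model received, since a mean over very few ratings can be noisy.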

One noteworthy takeaway from last week’s experimentation was experiencing just how advanced GPT-4o “omni” is as a general AI tool. Our top 3 performing prototype models had one thing in common: they all fed data directly to GPT with minimal preprocessing. Other models used more “traditional” processing pipelines involving embedding vectors, distance searches, clustering, etc. None of these models outperformed “just sending all the data to GPT with a prompt.” Fascinating!
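To make the contrast concrete, here is a rough sketch of the two styles of approach. It is not the code behind any of our prototypes: the toy 2-dimensional vectors, the example texts, and the helper names (`top_k_by_similarity`, `build_prompt`) are all assumptions for illustration, and the call to the LLM itself is left out.

```python
import math


def cosine_similarity(a, b):
    """Cosine similarity of two equal-length vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def top_k_by_similarity(query_vec, items, k):
    """items is a list of (text, vector); return the k texts closest to query_vec."""
    ranked = sorted(items,
                    key=lambda item: cosine_similarity(query_vec, item[1]),
                    reverse=True)
    return [text for text, _ in ranked[:k]]


def build_prompt(instruction, texts):
    """Join an instruction and a list of texts into one plaintext prompt."""
    return instruction + "\n\n" + "\n".join(f"- {t}" for t in texts)


# Toy data: real embedding vectors have hundreds or thousands of dimensions.
items = [
    ("Deploys are slow", [1.0, 0.0]),
    ("Too many meetings", [0.0, 1.0]),
    ("CI pipeline is flaky", [0.9, 0.1]),
]
query = [1.0, 0.0]  # hypothetical embedding of a question about delivery

# "Traditional" pipeline: narrow the data with a distance search first.
retrieved = top_k_by_similarity(query, items, k=2)
pipeline_prompt = build_prompt("Summarize the themes:", retrieved)

# Direct approach: send everything along with the prompt.
direct_prompt = build_prompt("Summarize the themes:", [t for t, _ in items])
```

The direct version has fewer moving parts, but it only works while all of the data fits in the model’s context window.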

Enjoying Time Together

We were long overdue for some time to catch up and bond as a team. While we had a good time hacking away on our laptops at the Airbnb, we also made time for a little recreation. There are plenty of good options in and around San Diego:

🥾 Hiking: we hiked Torrey Pines State Natural Reserve. It was great to get moving, take in the natural environment, and check out the wonderful view of the coast.

🃏 Games: we played several rounds of Dominion. Being remote, we play online every couple of weeks. This week we got to have good, old-fashioned rounds at the table.

🧑‍🍳 Grilling: we had a great outdoor kitchen. Jordan loves to grill out. It was the perfect combo for some great dinners.

🏄 Surfing: a few of us wanted to try surfing so we rented some boards and hit La Jolla Beach. All of us managed to learn a little and a few of us got up on the board!

🗿 Sightseeing: we visited the USS Midway, which served as a carrier in the US Navy from 1945 to 1992. It was fascinating to see the different eras of technology that were used on the ship during this timespan.

Our time together was quite energizing and refreshing. It was a good reminder of how important it is to get together to work on a project, share meals, and have some fun. We feel the quality of our work and time together was outstanding.

Metrics

Some metrics are down this week after last week’s high in weekly meetings. It’s good to see Standups growing steadily over the past few weeks. We’ll keep an eye on traffic for the rest of the month to ensure our signup growth continues.

This week we…

…kicked off a TACFI with the USAF. 🚀 A small team of us met up with leaders in San Antonio to kick off the project. We have some great deliverables that we are excited about developing and shipping to our customers.

…shipped a favorites feature for our Activity Library. ❤️ Now users can mark which activities are their favorites to run with their teams and projects. They can easily access these from the top navigation.

Next week we’ll…

…hold a strategy planning session as an executive team. 🗺️ It’s time to refresh our strategy for T3. We’ll include input that we’re gathering from everyone at the company. The good news is we feel like we’re on the right track overall, but may need to make some small adjustments to our course.

Terry Acker

Terry specializes in front-end architecture, UX strategy, UI design, and brand systems. He has previously worked for Quirky, BoomTown, and Children’s Medical Center of Dallas, and has served as an advisor to several early-stage start-ups. Terry lives and works near Tyler, TX.
