On the podcast: How to make better decisions with data, the many pitfalls of collecting and interpreting data, and why the best executive dashboard is probably a hand-written weekly email.
Key Takeaways:
📝 Balance data collection with business goals. Collecting all possible data can drown teams in noise and lead to compliance risks. Focus on collecting semantically important data that aligns with business goals and use cases to avoid unnecessary complexity and costs.
💡 Prevent exponential cost increases by structuring data early. Establishing a well-structured data collection and management process early on prevents costly modifications and adjustments later. Early alignment and thoughtful planning are crucial.
🔒 Maintain control over data collection to simplify compliance. Managing your own data collection processes can reduce legal and compliance challenges associated with third-party data processors. This is especially crucial for adhering to regulations like GDPR.
🔧 Opt for off-the-shelf data solutions early on. Leveraging open-source or ready-made solutions can save time and resources. Maintain a clear evaluation structure for transitioning to custom solutions when needed, and accept changes in data collection methods to avoid outdated systems.
📊 Simplified insights over complex dashboards. Dashboards can overwhelm executives with too much data. Instead, providing a succinct, focused summary of key insights through something as simple as a weekly email can be more effective for decision-making.
About Guest
📈 Director of Data Products at News Corp.
💡With over 15 years of experience, Taylor is an expert in building and implementing effective data collection and analytics strategies — helping organizations like Disney+, Business Insider, and Deloitte collect the right data and turn it into actionable insights.
👋 LinkedIn
Episode Highlights
[3:44] Laying a foundation: Data collection is a lot like constructing a building — setting up the right framework from the beginning can save you a lot of time, effort, and money later.
[7:30] The Goldilocks zone: Collecting either too much or too little data is costly and can potentially have ramifications for data regulation and privacy laws.
[16:58] Information overload: Data is only helpful if you derive actionable information from it.
[20:33] Distilling data: What is a “data product” team? (And why might you need one?)
[26:13] Build vs. buy: Most companies should start with an off-the-shelf data collection solution instead of building something internally — then consider a switch later when the scale and financials make sense.
[33:45] What’s in a name? What you call specific data points and even your data collection system can be very important.
[42:11] Ditch the dashboard: Fancy data analytics dashboards need to be interpreted to be valuable — and without context, they can be misleading.
[51:27] Trix are for… kids?: How Taylor’s experience promoting the television show “Bluey” on Disney+ illustrates the incredible power of data analytics.
David Barnard:
Welcome to the Sub Club Podcast, a show dedicated to the best practices for building and growing app businesses. We sit down with the entrepreneurs, investors, and builders behind the most successful apps in the world to learn from their successes and failures.
Sub Club is brought to you by RevenueCat. Thousands of the world's best apps trust RevenueCat to power in-app purchases, manage customers, and grow revenue across iOS, Android, and the web. You can learn more at revenuecat.com.
Let's get into the show.
Hello, I'm your host, David Barnard, and with me today, RevenueCat CEO Jacob Eiting. Our guest today is Taylor Wells, Director of Data Products at News Corp. Having worked on data projects at Deloitte, Disney, and Business Insider, Taylor now works on data strategy, product insights, and identity frameworks across News Corp's many digital products. On the podcast, we talk with Taylor about how to make better decisions with data, the many pitfalls of collecting and interpreting data, and why the best executive dashboard is probably a handwritten weekly email.
Hey, Taylor, thanks so much for joining us on the podcast today.
Taylor Wells:
Thanks for having me. Excited to be here.
David Barnard:
So you've had a ton of experience working on data at Deloitte, Business Insider, Disney+, and now News Corp, and I wanted to kick things off just talking about what data science is. Why data?
Taylor Wells:
I would almost separate that into two different things, why I choose to beat myself up getting into a career in data or why data matters. Honestly though, on the why-data-matters part, I would almost start with instead of looking at it as what data can do for you, it's more just like how badly can data be misused, frankly, these days? I think that you see that across most companies. So everyone, when they think of how data and analytics and data science teams work, if they're not in those spaces but are in corporate settings, they often think of the problems that they've observed in the past, broken reports, or problems trying to get to data, or a lot of time, effort, and money spent on data-related initiatives or things like that that ultimately don't show a lot of the value that they think that it's going to show.
So I think it's very valid for people to be hesitant or reluctant about the return on investment in data itself, but I think that people also realize that data as a concept and as a raw set is essentially some of the most valuable assets or products on Earth when used well. So it's almost like atomic energy or something like that, right? It can be used for great things, but you remember Fukushima and things like that instead.
So everyone knows what bad data looks like and the challenges that you have, like they start a new initiative where they're going to build out a new data warehouse or move something over, and then that turns into a slog. There's now changes that were unexpected or it's even harder to do something. It takes longer than they thought or costs are crazy out of control. And then you also have things like companies that go out and hire a lot of data scientists or data analysts, but they don't really know what they should be focused on and it's more an afterthought of like, "I'm going to hire great, talented, smart people in those spaces because I don't understand them and they'll figure it out as we're going," and that can work, but it so often doesn't.
I think that when data is driven by a business strategy that informs a data strategy, then things are incredibly effective and you're able to really extract a lot of value and learnings out of data, especially in data science, in understanding consumer interaction and engagement in ways that aren't privacy-violating or invasive, but are used to better that product and that experience for the customers. No one wants a lot of ads on the page, but they may need to generate that ad revenue in order to support things. But if you're actually making the experience better in order to have those ads on the page that are less invasive or the customer is not having to cover as much of that cost for the company because you've built things really effectively and intuitively internally, then it can ultimately add value across the board. That's how I see data and data science adding value generally.
David Barnard:
So then what would you say is the foundation of a good data practice? Data is just a bunch of raw numbers.
Taylor Wells:
Right. Well, one thing I would say is that more often than not, companies start by saying, "We should just be collecting everything and then we'll figure out what we need." So there's a lot of challenges around that, but there's also the challenge of even if you have a honed focus on what you want to collect, say, "I don't need to record a viewport of the user, but I at least need to know when they click on a tile or whether or not the video plays whenever they click on the play button," without having structure around how you expect that to change over time to grow other use cases that are likely to come, you inevitably fall into the trap of building something that will have to be reversed or modified or adjusted, and each step of that exponentially increases your cost.
Catching something early on could be 1X of your cost and then 10X if you're fixing it by cleanup downstream because now you've got new data coming in that you'd have to continually add new iterative processes to fix. And then lastly, at the last mile, it may be 100X more cost if you have bad reporting that's actually telling you the wrong things because you have all of this brittle framework.
Jacob Eiting:
Then you, yeah, make some wrong business move.
Taylor Wells:
It's basically like when you look at data, data should be looked at in the same way that you're engineering a building. You're going through the process of permits and agreements and alignment across the board on what things are going to be called. Are we going to call this a wall or are we calling the wires within the wall the conduit or are we calling it this so that if someone says that term later, we all know exactly what that is referring to?
So it's a lot of getting that alignment and then thoughtfully laying out what you expect, both the current version and the potential modularity or generic nature it needs in order for you to continue to scale that out or use it for different use cases.
And then lastly, you need to really have a deep understanding from the data team side of what the business is trying to do even further than what they're asking for today, but also what the product does itself and then what your output looks like for the user. You often see people that'll build dashboards, but they don't necessarily understand how the dashboard's even going to be consumed. They just know the generic requirements in the dash and therefore the dash may technically meet the requirements, but is a challenge to use or misses the mark on what you're trying to uncover.
Jacob Eiting:
Yeah, if nobody looks at it, it's pointless. Right? Yeah.
Taylor Wells:
Exactly. And on the flip side of that, you may have data engineers or data scientists who are trying to understand behavior based on how that's been described to them in the problem statements or the requirements, the tickets, whatever it is, but they aren't regular users of the application and therefore don't understand how that works.
And it doesn't necessarily even mean that just heavy usage will fix that. On Disney+, if you're a heavy user but you only have an Apple TV and you're saying, "This is the behavior that I understand," which is very different than a low-end Roku device or a phone and how they're engaging with a phone, like a swipe all the way down to the last row versus having to use the remote to do that, then you still may not fully understand how the product is being consumed. So that deep understanding of what the upstream is, what needs to come out of this, and what are the business teams within it trying to accomplish both now and in the future.
David Barnard:
Yeah, I think we see similar problems all the way from indie developers with just the basic Firebase analytics, just collecting a ton of data, mislabeling that data, and then making bad decisions based on data they don't fully understand. But then that propagates up at every stage, like you move from Firebase to Amplitude with the same mistakes, and then you move from Amplitude to a custom data stack, make the same mistakes, and you have a whole data team propagating the same data mistakes.
What are some concrete things you've seen in what data to collect and how to think about that data collection then?
Jacob Eiting:
Yeah, I'm curious on that spectrum of collect everything versus things that are semantically important.
Taylor Wells:
Absolutely.
Jacob Eiting:
Obviously, I think both extremes don't work, but collecting everything has cost implications and you're just drowning your data team in noise typically. So how do you navigate that spectrum?
Taylor Wells:
Right, and frankly, I think one of the more hidden problems too is that when you have all of that data, it's similar to asking, "What are the lunch options?" and giving them two or three options to choose from at a meeting versus bringing in an entire catering staff that can make any food. Now you've opened the door to-
Jacob Eiting:
Paralysis, right.
Taylor Wells:
... "Well, I didn't actually want you to order an omelet. We don't have the equipment for an omelet," but now you've ordered it because I've said that you could do that, right?
Jacob Eiting:
Yeah.
Taylor Wells:
There's also the question of data that seems meaningful and non-harmful but ends up violating data privacy and usage rules, whether inadvertently or willfully. If you're collecting all data and listing it, for example, in the EU by their GDPR constraints as functionally necessary because it's an internal tool that you also use to make sure that the site is up, that the paywall is working, that consent is being captured, et cetera, and you're flagging those so that they can't be used downstream, but you're not gating that to where it technically can't be used downstream as well or masking it or protecting it or omitting it when not needed, then you now open the door for a potential risk. Now that data may be used by a marketing team that is uninformed on how it can be used to now target somebody based on geolocation or IP address or something.
So I think that the balance is, to go back to the original question of where is the right mix, if there was a clear answer to that for an app versus a website versus a cell phone or whatever you're manufacturing, then it would be more standardized across the industry. But the fact is it really goes back to what is the company trying to accomplish?
I think that even in our earlier talks, we spoke about how data in itself is not a value-add to the business. Collecting it, storing it, having that is not a value to the business. It's whether or not you're actually using it and turning it into meaningful insights that impact the business. Not even insights just for the sake of insights, which is also a problem area where you get that 90% of the way, but it's not actually tied to the decision-makers or the right levers aren't available for those people.
So I think it is literally a process of not trial and error, but a full understanding across data. And that's why, to your example that you gave earlier of how it just continues to work the problem up in different ways, if you're starting without a foundation each time that you restart and wondering why it's not working, it's because you have to slow down and set that foundation, have a full understanding across those teams, and then everything else becomes exponentially faster and easier, but so often companies do not do that.
And frankly, I get to brag or hang my hat on the fact that Disney+ was not only such a great success, which a lot of that's a mixture of things happening, COVID and things like that, but it also was a great luxury to start from scratch from a data perspective on an app where we had the full support of the company, both financially and backing it and pushing it as, "This is what our future will look like. This will be a key factor as much as parks or anything else," but also that they gave us the runway to launch the app a year and a half... Now you see these streaming apps come out after two months of being announced because they can burn through that, but that opens the door for a lot of problems too, or they try and bolt several together and then they're building off of legacy systems.
With Disney+, we were starting from scratch, ESPN+, Disney+. We got to really go in and say, "What do we need to collect? How is this going to work across all the devices? What are the limitations of each device that we need to factor in so that we can normalize some of this information and ensure that we aren't tracking something in one way on one and not going to"-
Jacob Eiting:
That's what I mean when I say, "Semantics." When you're coming up with a tracking plan, it's like thinking about the semantics and how can you compare that? Because to some degree, you don't want to compare an action on a phone from an action on an OTT device, but in some cases, you do, probably like a watch should mean the same thing, right, to some degree?
Taylor Wells:
Yes.
Jacob Eiting:
You made the point of some amount of trial and error and guesswork, and that's what I've found historically is for indies and whatever, when you're thinking about that first iteration of what do I care about, just be logical. Go through and use your app and think about the things you do that you think would be interesting to know and put those 10 or 20 actions in, and that's a good place to start. And then I've always found it's usually if you miss something, it doesn't take long to go back and add. You might lose some historical data.
But also, you keep making the point that data as collected is in some ways valueless, right? It's like crude oil or some other raw asset. Until you're refining it and actually actioning on it, it actually has zero value. So don't worry too much about hoarding everything because I've seen people do that too, and it can be used, but as you said, there's implications for data regulation and privacy. There's also implications for costs, like running a data... If you have your own data warehouse or even just running through all of these events platforms, they usually charge on events, some of them charge on users, which is nice because then if you have a user, you can track zero or 100,000 events, which is nice, but they're not all like that. And so yeah, just being somewhat logical, it's a good start.
Taylor Wells:
Yeah. To that point, I think that there are ways to structure how you're approaching it in a generic form so that you ultimately don't hinder changes going forward when what you're trying to collect has changed, because you have that framework that actually allows for modularity or scale.
So one example would be if you were planning to track how many people clicked a download button in a streaming app to see how many people actually use the download feature, how many downloads are completing, you might build an event that is download clicked, download started, download paused, download completed, deleted, whatever. But instead, if you're focusing generically, you can say, "I'm going to build tracking for button clicked or click action," and then give keys to those buttons, ensure that there's structure in how that's built, so there's a bottom button that has to sit within a container that sits within a set within a page-
Jacob Eiting:
You just have it, right. Yeah.
Taylor Wells:
... and I've got those hierarchical keys built up so that now when someone comes back and says, "Well, now we need to know when they hit download and then pause and then share and then download again. Oh, that's a new function," well, now I can just say that either way, those were clicks of those actions and I've mapped that behind the scenes to what those keys and tags represent. And that's like how you ultimately can say, "We don't necessarily need to know everything upfront," but we haven't backed ourselves into a corner where we were tracking something that was download click before and now it's button click and I've got to merge that data and clean it and normalize it for eternity.
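To make the generic tracking structure Taylor describes concrete, here is a minimal sketch of what such a hierarchically keyed click event might look like. The field names and helper are purely illustrative assumptions, not the actual GLIMPSE schema.

```python
from dataclasses import dataclass, asdict
from datetime import datetime, timezone

@dataclass
class ClickEvent:
    """A generic interaction event keyed by UI hierarchy rather than by feature name."""
    action: str         # e.g. "click", "pause", "share"
    element_key: str    # stable key for the element, e.g. "download_button"
    container_key: str  # the container the element sits in
    set_key: str        # the set or row that container belongs to
    page_key: str       # the page or screen

    def to_payload(self) -> dict:
        # Stamp the event at send time; everything else is just the hierarchy of keys.
        payload = asdict(self)
        payload["timestamp"] = datetime.now(timezone.utc).isoformat()
        return payload

# A new "pause download" interaction needs no new event type: it is still a
# click on a keyed element inside the same hierarchy, mapped behind the scenes.
event = ClickEvent(
    action="click",
    element_key="download_pause_button",
    container_key="downloads_actions",
    set_key="my_stuff_row",
    page_key="home",
)
print(event.to_payload())
```

The point is that future interactions become new key values rather than new event names, so nothing has to be merged or normalized later.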
Jacob Eiting:
But you can do that pretty easily. I mean, that's still pretty far away down the semantic meaning stack versus tap on X value Y value. You know what I mean?
Taylor Wells:
Yeah.
Jacob Eiting:
Which you, in theory, could collect what does every single tap event on the device, and that's probably pretty useless, but hooking into your generic button class and making sure you capture all those, I think about some of the tracking plan for RevenueCat is we did have semantic stuff, like created an app, created a project, which just creates... Going into the conversation about downstream consumers, if you have an event that is project created, that sets yourself up for much better self-serve use cases because most people understand what that means and they can probably pull a thing together and it's most likely going to be correct.
But we still also have a page loaded event. So we know when a page loads and we have the path broken down, and in some cases, we've been able to retroactively, without going in and re-engineering a semantically tracked whatever, we're able to go in and retrofit that in, which is good. And that's still there's not that many page loads, there's not that many button clicks, right?
Taylor Wells:
Yeah. And honestly, as things mature over time, you really figure out what you need to be focused on.
Jacob Eiting:
Care about, yeah.
Taylor Wells:
For example, do I need to know that all the icons of the home menu were shown when they clicked the word home? I can assume fairly safely with QA testing and things like that of the app that all nine tabs are showing. I don't need to confirm that nine were there every time.
But to your point, some of that can also be done within the stream of data and you can say that, "This button with this type that's being clicked has a key that represents that," and as that data is being pulled in, I'm enriching it to then classify it as, "This is project started or app created," and send that to the teams that it's relevant to while still being able to reverse engineer the raw data that you collected before.
But where companies... I think most people think of GA, Google Analytics, and others when they think of these things, and those tools, especially Adobe, are collecting 100 pieces of metadata with each event, yet they're almost all identical for that session or that user or whatever. And in broad ways, like for a lot of the interactions, you can offload some of that by essentially just capturing when any of those states change and then applying logic downstream to how frequently they're resetting a password, instead of just saying, "I'm capturing the password field" on every event-
Jacob Eiting:
Yeah, don't recommend that one.
Taylor Wells:
Yes.
Jacob Eiting:
Don't capture your password fields. Official security advice.
Taylor Wells:
Poor example, yeah.
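As a rough sketch of the downstream enrichment Taylor describes above, classifying generic clicks into the semantic events teams actually talk about while keeping the raw fields, here is one way it could look; the mapping and names below are hypothetical.

```python
# Hypothetical lookup from (action, element key) to the semantic event name
# downstream teams recognize; raw fields are kept alongside the classification.
SEMANTIC_MAP = {
    ("click", "create_project_button"): "project_created",
    ("click", "create_app_button"): "app_created",
    ("click", "download_pause_button"): "download_paused",
}

def enrich(raw_event: dict) -> dict:
    """Attach a semantic name to a raw click event without discarding the raw data."""
    key = (raw_event.get("action"), raw_event.get("element_key"))
    enriched = dict(raw_event)
    enriched["semantic_event"] = SEMANTIC_MAP.get(key, "unclassified")
    return enriched

print(enrich({"action": "click", "element_key": "create_project_button", "page_key": "dashboard"}))
# {'action': 'click', 'element_key': 'create_project_button', 'page_key': 'dashboard',
#  'semantic_event': 'project_created'}
```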
David Barnard:
So once you start collecting data, one of the things you and I talked about before the podcast was how data is really just a symbolic representation, but that information is what we derive from the data. So tell me a little bit more about what you meant about that.
Taylor Wells:
Yeah, so I think everyone who's had a data set anywhere also knows that execs will often hone in on something and then there's not a lot of questions asked as to whether or not it's still needed. Is it tracking the right thing? Are we looking at the right thing and are we making any decisions off of it?
I can go into a theoretical company every week and give a weekly status that shows how many subscriber signups there were across the app or how many page views or whatever, but the question is rarely asked, "Are we doing something with this information each time? And if so, what? Are we retroactively doing it? Do we do it every week? Would we like to do it more frequently? Are we able to do it more frequently?" And so that's the value that ultimately eludes companies around data very often.
Jacob Eiting:
What do you think the bottleneck is there? Because you have ostensibly smart people consuming this, but there's failure to act. Is that because there aren't things to do, which I think is maybe even a valid path? What's your theory for when that fails?
Taylor Wells:
Yeah. I think it depends on the company's size, frankly. So at a larger company, what you often get is people not asking questions because a senior asked for it and they don't want to push back. And so you're building layers in between to the point that the people that are actually executing on building a report or surfacing a report or interpreting a report can't even necessarily get that question-
Jacob Eiting:
Oh, man. I love when somebody tells me it's a stupid question, just that would save me so much time because that's actually probably what the crux I'm trying to get to. I have a curiosity about something and so I ask, "Oh, can we pull this together? If somebody who knows better than me of why this data is trash could tell me before I spend all that time getting that"-
Taylor Wells:
Right, or knows what you'll ask after you get that and just gets you what you ultimately are trying to do.
Jacob Eiting:
Yeah, what I want or knows the why of what I'm trying to ask. But you're right, and I'm sure it depends on the culture you have as a company as to how much pushback you can give and stuff like that.
Taylor Wells:
Right. And again, I think a lot of these are rooted in some foundational things, like do the employees of the company across the board fully understand what the business is trying to do? I've worked at too many companies where it is unclear to all how all of it comes together. What is ad tech and marketing doing? What are the business development teams doing? What is strategy focusing on? Who are our core customers?
Even at a prior publication, I found it incredible to see that when asked what their audience was, "Who are your readers? Who are the people that you're trying to reach? But who are your readers today?", they would have strong opinions of who those are, but they were based on the assumptions of the executives. Maybe it's even the CEO. "Our readers are tech-savvy, Northeastern or West Coast, they don't want old legacy media, blah, blah, blah," whatever the definition is, yet you go back and say, "What is this based on? Are these based on personas that you've created from data science? Is this customer interviews and working sessions for how the product is actually consumed? Have you done polls? Do you send out newsletters and you see what they're responding to and then you're profiling it?" It's often, "No." It's just like, "That's what I built it for originally and I'm assuming that that's who my audience is because no one's said otherwise."
So it's really just breaking that down. If no one's pushing back to say, "We can't define our customer based on nothing, based on just our assumptions," then those assumptions only continue to compound.
Jacob Eiting:
For data teams to understand what the business should be focusing on, what do you think is the best way for those folks to get that information? Where should that come from?
Taylor Wells:
Absolutely. It should be a coupling with data product, frankly. Data product is that weird space, right? It is not a product team in the traditional sense, and I like to think of myself as a product person-
Jacob Eiting:
Can you define data product a little bit?
Taylor Wells:
Yeah, exactly. But data product lives in between. I've seen it live under product and engineering. I've seen it live solely under product. I've seen it live under its own umbrella under data, like chief data officer potentially, and then I've seen it live in some hybrid forms or embedded models like a hub-and-spoke where you have different people from the data product side working closer with the businesses while other product teams also are involved.
So say you're developing a new feature like 3D video for a streamer. You may have that team that's conceptualizing the 3D video that they've pitched and sold executives on. You have a data product person or team working alongside that with the engineers to understand what that product team envisions for that, how they actually expect it to happen, how engineers actually expect that it can be built and delivered, in order to inform how they should collect and capture that data and how it's going to impact all the stuff that they actually sit on top of downstream.
And then they have to take that message and understanding back to the data science, data engineering, and data analytics teams to explain that, sell the vision of what we're trying to track, get feedback on how it could be better done or how they envision that it's most efficiently done, what are some pitfalls, et cetera, and then lay out that roadmap of features, like in a Jira sense, like the work that needs to actually happen.
Jacob Eiting:
Right. So you're saying the folks in the company who are responsible for integrating data into the tip-of-the-spear product features and stuff like that, they're learning through osmosis or direct interaction with the product leaders and engineering teams building it, who should ideally have a better understanding of why they're doing things.
Taylor Wells:
Yes, and it often would not be effective to try and ask them to understand the data part while doing that. In fact, it often just doesn't work. If you were to go to a product team and say, "Put data first in your mind when you are developing a new product idea or a new feature," even if they're coming up with KPIs and metrics that they're going to track to measure ROI and success down the road, they still wind up going, "Oh, I forgot about that because I was so focused on seeing the feature come together in wireframes, tests, blah, blah, blah, but now it's live and we don't have any data for it."
Now, I've actually seen that even at a publication doing a podcast or a streaming audio where they'll partner with a third-party company, build this thing or this idea, we all know data's important and that we'll need to understand the data about it, but at the last second, you're like, "Now this has actually already been pushed out live and we're not collecting any data on it. Oh, no. Well, what do we do? How do we [inaudible 00:24:06] see more"-
Jacob Eiting:
You just got to do it a little bit before that. Usually I do my data tracking in the last day, which, honestly, is kind of fine because you don't know... And instrumentation these days, at least for the indie case where you're using a tool or something like that, can be done pretty quickly.
But yeah, I also... You said teams coming up with KPIs. It's interesting. I've watched us build sub-features and things like this that we design very bespoke and interesting KPIs for, and a KPI's usefulness goes down in proportion to the amount of time you spent thinking about it.
Taylor Wells:
Absolutely. 100%.
Jacob Eiting:
The best ones are just like how many people use how many dollars through a feature, and I almost wonder if, going back to interpretability, it's like if a KPI, you need a PhD to design it, you probably need at least a master's to understand it. And let's be frank, most people don't understand that stuff, so don't overthink it. You know?
Taylor Wells:
Absolutely. Yep. Don't overthink it-
Jacob Eiting:
In fact, under-think it might be... Right?
Taylor Wells:
Yes. Don't overthink it as it starts to grow. A lot of my examples too here are companies at large scale that have a lot to lose-
Jacob Eiting:
Yeah, I was going to say, most of these, our customers are-
Taylor Wells:
... and a lot of investments upfront.
Jacob Eiting:
In terms of like, oh, you have a data product person who's bringing stuff back, and most companies, it's either going to be... Well, startups, it's going to be the founder, maybe the lead engineer. Maybe if you're a little bit big, you've got somebody who's data-literate, but they're going to be going across the stack. They're probably responsible for the technical stack, but then also probably the instrumentation and all of that stuff. And I've been that guy before where I've had to be jumping up and down being like, "Hey, we don't have to do a lot, but let's make sure we collect the basics here."
The beauty is I think today there's enough tools that you can get 80% of what you need with some of this off-the-shelf stuff. But yeah, I mean, even at RevenueCat, David and I were talking about build versus buy on the way over here in terms of even just an event-tracking platform, and we were like, "Yeah, I don't know why anybody would ever build internally." I was like, "Well, actually, today we're almost building our own internally."
David Barnard:
We are.
Jacob Eiting:
Because at some point, you begin to rely on usually an internal BI system and you probably have a data warehouse, and at that point, it's like, well, tracking event streams is trivial at that point and there's some customization advantages you can get.
David Barnard:
Yeah, I'm really curious about your thoughts on that because you and I were talking about your time at Disney+ where it was the ZIRP era, there were high hopes for Disney+, and there was almost a blank check for what you were able to build inside Disney+. And I'm curious, it sounds like building from the ground up was ultimately very successful, but very few companies have that luxury. I think you told me at one point there were 300 engineers and multiple teams across the company all working on essentially building an entire analytics company inside Disney.
Taylor Wells:
And that was really not even outside of Disney direct-to-consumer. That's not taking into account parks, cruises, yeah.
David Barnard:
Yeah, so then what are your thoughts on the build versus buy and blending the two and when it actually makes sense to start building out your own data product versus relying more on off-the-shelf and what are the benefits and drawbacks of each? And then after hundreds of engineers building this, now you still need tens of engineers to maintain it over time, and very few companies, especially in our space in subscription apps, have the luxury of dedicating teams to that. And then if they do, maybe those 10 engineers would even maybe be better served-
Jacob Eiting:
Get all of our R&D spend. It almost doesn't really make sense, right?
David Barnard:
Yeah. Their time would be better spent on consumer-facing features-
Taylor Wells:
100%.
David Barnard:
... versus building infrastructure. So yeah, I've really teed up a long question, but-
Taylor Wells:
No, no, no. I think this is actually-
Jacob Eiting:
Build versus buy, what's the answer?
Taylor Wells:
... going to be one of the more succinct ones for me. I think it does depend, but I would generally lean towards use off-the-shelf, use open source, things like that early on if you're not expecting immediate, massive scale, unless you just think that you've struck the mother lode in terms of an idea for an application or a website and there is a 10% chance that you could suddenly have 100 million users or something and now you've backed yourself into a vendor corner from a volume perspective or pricing, then start with something more basic, but with a clear evaluation structure around when you cut over to other things and a culture that both acknowledges and accepts that it's okay that things may not be congruent over time when you switch.
One of the biggest pitfalls of every company that's existed for more than a couple of years that I've been at is literally that they say, "Well, we've already started to collect this and if we cut over to something new, it's not going to match the old reports," or, "I can't see, if we start to track"-
Jacob Eiting:
Yeah, you end up with these geological layers of decision-making.
Taylor Wells:
Yeah. I mean, imagine the panic that's been going across the industry in publishing, where it's not tech-focused for the most part, about the switch from GA 360 to GA4. GA 360 lived on for years and years past when it should have, past when Google said that they would kill it, because customers were just so concerned about, "My bosses are saying that this is a completely new way to do it and they're not ready to start looking at things from a session perspective versus an event perspective or a device perspective. It's a sea change in the way that we track it and we know that it's better, but it means that now I can't show year-over-year comparisons that actually have the aligned data."
Jacob Eiting:
Can't go back to the beginning, or the charts that we've relied on for years are now invalid. They just don't make sense in this new paradigm. It's very painful, that organizational change, right?
Taylor Wells:
Yes, absolutely. I mean, even think about something like for Disney+, the idea of the bundle was introduced very late in the lead-up to the launch. We had an idea, and by "we" I'm just referring more to non-executive levels, that something like that would happen, but we also thought that it was more likely that it would just become subsumed in some way in either direction. Maybe it's Disney+ merging into Hulu or vice versa, but that you end up with one app, and I still think that'll ultimately happen. But at the time, the idea of needing to synchronize user access across Hulu and Disney when someone changes to a bundle option, that was not present, the idea of ads was not present, so you didn't really have the ability to figure out how that was then going to look on a report and it... As long as the company culture was setting the expectation of, "I don't actually care if it translates over time," then you can win.
But also, again, startups have the luxury of saying, "We don't have any historical data." So as long as we're moving quickly and we end up in a good spot in the first couple of years, then you can usually establish a good practice going forward. And when those businesses are coming back and trying to challenge, like that, "But I want it this way," just pushing back on what value do you get out of being able to translate that. If they have all historical clicks of an article for the Journal or Business Insider going back for 10 years so that they can show a line graph of 10-year month-over-month growth down to a section, but the sections have changed over time and now they're frustrated that the reports aren't going to be congruent, say, "What value are you getting in comparing something that happened 10 years ago when the market was absolutely different or five years ago"-
Jacob Eiting:
Oh, it just keeps going up. It's just nice to look at charts that are going up.
Taylor Wells:
Exactly.
Jacob Eiting:
Yeah, yeah. What's wrong with that?
Taylor Wells:
But they're not pushing-
Jacob Eiting:
Is that not good? Is that not a valid executive use case?
Taylor Wells:
Yeah. They're often not pushing back on the... They're often not pushing back on the boss to say-
Jacob Eiting:
Yeah, "Why is this better?"
Taylor Wells:
"Hey, do we actually... What are we going to do if we see a big change? What about a small change?"
Jacob Eiting:
Yeah, yeah, yeah. It's like a photo album. I just like to be able to flip through and be like, "Oh, remember when we were at this much?" But it's a really good point. I mean, I think about our transition from off-the-shelf Amplitude to when did we add an ETL or when did we add a phase where we could transform some of the evented data into something more semantically meaningful? And yeah, it happened when we started to break definitions, and not just break them through foolishness, but through, okay, the app is different now.
I think one big example for us was we refactored what it meant to be a platform versus a different platform. Anyway, it was a big structural change, and suddenly X did not mean X anymore. And to capture that in a third-party tool like Amplitude can sometimes be very difficult because the operator building charts and stuff has to have that information constantly loaded in their memory or they'll pull up stuff that doesn't make sense. It's non-congruent, right?
Taylor Wells:
Mm-hmm.
Jacob Eiting:
And that's where for us, it made sense to introduce some sort of pipelining step because then what we can do is encode that change into a pipeline and be like, "Okay, in this month of this year, we changed this definition and create some normalized super definition," that takes some of the sharp edges off, right, for data consumers?
Taylor Wells:
Absolutely.
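One way to picture the pipelining step Jacob describes, encoding a definition change at a known date into a normalized "super definition", is a small transform like the one below; the cutover date, platform values, and mapping are invented for illustration.

```python
from datetime import date

# Hypothetical cutover: before this date, macOS traffic was logged as "ios";
# afterwards it became its own platform value.
PLATFORM_CUTOVER = date(2022, 6, 1)

def normalized_platform(event_date: date, raw_platform: str) -> str:
    """Collapse old and new platform values into one definition that stays
    comparable on either side of the cutover."""
    if event_date < PLATFORM_CUTOVER:
        # Legacy events never distinguished macOS, so "ios" already includes it.
        mapping = {"ios": "apple", "android": "android", "web": "web"}
    else:
        # Newer events are more granular; fold macOS back in so charts stay congruent.
        mapping = {"ios": "apple", "macos": "apple", "android": "android", "web": "web"}
    return mapping.get(raw_platform, "other")

print(normalized_platform(date(2021, 3, 1), "ios"))    # "apple"
print(normalized_platform(date(2023, 3, 1), "macos"))  # "apple"
```

Downstream charts query the normalized column, so nobody has to hold the date of the definition change in their head while building reports.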
Jacob Eiting:
But again, if you're just launching your app, you don't need that. That is a luxury of a problem to have. And to your point earlier, you're not going to launch that app that has 100 million users in the first year. And anyway, if you do and you owe Amplitude a bunch of money-
Taylor Wells:
Then you don't care about any of this.
Jacob Eiting:
... you don't care!
Taylor Wells:
You don't.
Jacob Eiting:
You're like, "Great. One fire I don't have to worry about right now." They'll cut you a deal probably, right?
Taylor Wells:
Absolutely. To that point though of at scale where it can be problematic if you do have success is when we built GLIMPSE internally-
David Barnard:
And GLIMPSE is?
Taylor Wells:
GLIMPSE is the internal event structure of clickstream data within Disney+, which is still there today, at least as of the last time that I'd looked. So those events were that generic structure. It was named GLIMPSE, it's an acronym for Gathering Live Interactions via Multipurpose Surfacing of Events.
Jacob Eiting:
Ah, I wish I were in that meeting. Somebody felt really proud to come up with that.
Taylor Wells:
That was me.
Jacob Eiting:
Congratulations.
Taylor Wells:
And I had come up with about 40 of them one night of, "What are we going to call this?" And by the way, there's a reason that I did it. It's because my boss had been calling it UAT for so long, user activity tracking, and I was saying, "UAT means user acceptance testing. We can't confuse it. You got to call it something else." So I finally said-
Jacob Eiting:
The acronym, the acronym solution.
Taylor Wells:
"... I got to come up with an acronym."
Jacob Eiting:
You came in pitching like a director pitching a new movie, you're storyboarding all these things.
Taylor Wells:
It was a rapid Slack dump and my boss at the time, the head of data, Laura Evans, she's fantastic, she just said, "I don't care. GLIMPSE fine." I had sent a big list of them.
Jacob Eiting:
Oh, you know they care.
David Barnard:
That's impressive pre-GPT-4 to be able to come up with such a fancy acronym.
Taylor Wells:
Oh, it would've saved me countless nights while my two-year-olds were asleep and should have been getting a bedtime story. Countless nights of me coming up with ridiculous acronyms.
Jacob Eiting:
Well, I mean, we were joking about it, but naming and interpretability of this data really matters. It really matters. I mean, even a cute name for a system helps. It helps people talk about a thing, and I'm battling this now. In some cases in terms of data, a lot of times in terms of how we explain the product, and it's not always... Yeah, at a high level, it's easy to explain our product, but a specific aspect of our product, you need a surprising number of names for things and the consequences of choosing those names is a lot higher than you'd think, right?
Taylor Wells:
Yes. I mean, I think design, UI, product marketing, all of that are so underrated frankly these days just because AI is the focus and people want to cut costs, and that's often an area that they feel like they can cut costs because now every app has been built in several different ways, but there aren't many new ways that people are coming up with things. They think, "Oh, I could just use standard practices and such and not have to worry about it."
But branding is incredibly important. Think about Apple Intelligence and how much we'll be saying that term for the next five to 10 years if it's successful versus if they just stuck with, "We added AI in here," because no one's going to continue to call Google Services with Gemini. It'll morph or whatever, or they'll just never say it. They'll just say it's Google with AI. But to brand that in a way that Windows did with Copilot, right? And now you see that ecosystem expanding to now their hardware is-
Jacob Eiting:
Yeah, it's under, it's a [inaudible 00:35:49] order now. Yeah.
Taylor Wells:
... it's a Copilot-driven laptop. So I think that it's so important, but it also then, like you said, allows people internally to discuss things quickly without having to go back to, "Where is this?" So when people talk about clickstream events here, they have no necessary standard source of truth in their mind of what that means.
Jacob Eiting:
What that actually means, if it's precision, yeah.
Taylor Wells:
Is it GA? Is it Parse.ly? Is it Adobe, et cetera, or is it our internal events? But at Disney+, it was the GLIMPSE events, which everyone then knew to mean these activity-tracking events that are used for product analytics, business decisions, et cetera. And then over here is Dust, which is our service-monitoring events. And I don't know, that one predated me from the BAM Tech days, but GLIMPSE was an important branding for that.
And what I was saying on the cost piece is when we built GLIMPSE, the reason was that we wanted to only have one integration within the app. We didn't want to rely on third parties. Then it became sticky in terms of GDPR and consent and things like that. We'd rather just say, "The data that we're collecting within the app-"
Jacob Eiting:
Stays here.
Taylor Wells:
"... is only going to Disney," and then we use that internally in the ways that we define, but the moment that you add Adobe or GA or anything like that, you've now opened up things like competition in terms of them being able to understand how much your volume is increasing people to monitor things and look to find insights about something versus if you're just encrypting and sending it directly-
Jacob Eiting:
Not to mention just even to add a third-party processor, you have to go through the legal rigamarole internally. As a data team, you have to advocate for that and all that stuff, and you have to maintain the DPAs and make sure all of that stuff's up-to-date. GDPR really... I mean, I actually think the requirement for disclosure of third-party data processing was good. I think it was one of the things that came out of GDPR that was really pretty well-done and maybe could have been more simplified. I feel like there's a lot of DPA theater and this is very much analytics company problems, but people just need a DPA. It's like, "All right, okay. You could also just tell your customers you're sending data to us and what you're sending and everybody will just trust everybody, but I guess we'll have to do this little dance." I get it now.
And the nice thing too is the tools for rolling some of this stuff yourself are becoming... Between the data, just talking about AWS, the ways for you to store bulk data in AWS, the tooling on processing over large flat files has gotten so much better. Tools like DBT have made it much better to manage SQL pipelines and then the database options have gotten so much better too. When I-
Taylor Wells:
Snowflake.
Jacob Eiting:
Yeah, yeah. Snowflake has changed everything. Even Redshift has come a long way from where it was when we started using it a decade ago. ClickHouse, I've not used personally, but there are options now, right?
Taylor Wells:
Yeah, yeah.
Jacob Eiting:
And so I still think when you're at day zero, do not do any of that stuff unless you're an expert and you've done it and you know exactly what you want and you know how to set it up very well, quickly and cheaply. But I think the time to it making sense to actually use this stuff is actually much earlier than it's ever been.
Taylor Wells:
Right. In that case, it was built because of that, but it was also built because of a concern that costs would-
Jacob Eiting:
Which, yeah, at that scale, when you're Disney.
Taylor Wells:
... exponentially increase in a really successful scenario, even though the bar, I think, was set very low early on. I think the original target was 40 to 60 million users in three years, and we hit 10 million the first day and 100 million in the first few quarters.
But to that point, they also had Adobe as the backup plan, right?
Jacob Eiting:
Sure, yeah.
Taylor Wells:
So let's build it internally, but we've never done this before in this sense at Disney. So let's also, in the back, have a contract with Adobe and be integrating it lightly, but able to pull it out later, right?
Jacob Eiting:
And that poor AE who was ready.
Taylor Wells:
In the first couple of weeks, we were able to show that the GLIMPSE events that we'd built were running us around $1,000 a day in total, end-to-end cost, and I think the Adobe was costing us about $33,000 a day. And so we were able to quickly say, "Let's rip this out because the data is reliable, it's working, we're confident in"-
Jacob Eiting:
Yeah, it's something you have under control, and if you have the right people who understand it to respond to enhancement requests and stuff like that, I think it can be very superior. But on the cost point, it's like if your product generates any revenue marginally per user, usually the tracking and analytics stuff is a rounding error. I mean, when it's a rounding error of 30,000, that's-
Taylor Wells:
Well, when you're talking about billions of events a day, for example-
Jacob Eiting:
Is that 30 million a year or something?
Taylor Wells:
... versus we were in another level on that. But I think that if Disney+ had launched incredibly slowly or-
Jacob Eiting:
Yeah, or wasn't Disney, right, basically?
Taylor Wells:
... anything. Yeah, like in ESPN+, I don't think that we even had Adobe integrated there. Maybe we did, but in that case, it would've been fine to run that and not have to argue that, "Oh, but we should move that internally immediately." The volume wasn't there at first.
David Barnard:
But even $33,000 versus $1,000 isn't the actual cost because you have, what, 10 engineers? All-in costs of 400K a year per engineer-
Taylor Wells:
I mean, if you're paying your resources on your team and things like that $32,000 a day combined to make those two even, then I'd love to work there.
Jacob Eiting:
It'd be a lot. Yeah, yeah, that's true.
Taylor Wells:
$33,000 a day adds up really quick, and then you really can't argue that you're getting any value out of the data because what on Earth in terms of insights would you be producing off of that data-
Jacob Eiting:
That's worth that much, yeah.
Taylor Wells:
... that would actually be worth that much?
Jacob Eiting:
What is that $33,000? That's millions, right? Yeah, yeah.
Taylor Wells:
It was a lot of money if you had just continued with that. And by the way, those were our early estimates. We added triple, quadruple the number of events down the road.
Jacob Eiting:
Well, that's, I think, one of the... Locking into a tool stack that has constraints like that, I think it forces you to not do things that might be helpful because of the costs. And that stinks because sometimes if you really want to do something and you're like, "Oh, I can't because Mixpanel or whoever's going to charge me an arm and a leg," then... Which you also run into with your own stack sometimes. But again, it's like sometimes you can just bring events into the first stage of your processing and leave them there. We might even not pull these into Snowflake yet, but we're just going to let them stack up in cold storage-
Taylor Wells:
Staged, yeah.
Jacob Eiting:
... and if we ever need them, we have them. You can make those decisions, right, and control costs quite effectively?
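A minimal sketch of the "let them stack up in cold storage" idea Jacob describes: raw events are appended to cheap, date-partitioned files at the first stage and only loaded into the warehouse if a question later calls for them. The paths and format are assumptions, with a local directory standing in for an object store.

```python
import gzip
import json
from datetime import datetime, timezone
from pathlib import Path

STAGING_ROOT = Path("raw_events")  # stand-in for an S3/GCS bucket prefix

def stage_events(events: list[dict]) -> Path:
    """Append a batch of raw events to a date-partitioned, compressed JSONL file."""
    now = datetime.now(timezone.utc)
    partition = STAGING_ROOT / f"date={now:%Y-%m-%d}"
    partition.mkdir(parents=True, exist_ok=True)
    path = partition / f"events-{now:%H%M%S%f}.jsonl.gz"
    with gzip.open(path, "wt", encoding="utf-8") as f:
        for event in events:
            f.write(json.dumps(event) + "\n")
    return path

# Nothing touches the warehouse until a question actually requires this data.
stage_events([{"action": "click", "element_key": "play_button", "page_key": "home"}])
```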
Taylor Wells:
Yes. It goes back to the original question and statement, which is that having those checkpoints or evaluation points understood across the business and the teams and then adhering to them can actually prevent a lot of what seems like a risk. You could say, "We're going to have Adobe in here, but if we're moderately successful, we need to evaluate within three months whether or not the continued cost needs to start being offset, whether or not it's that we're scaling back some of the events or whether or not we are adding additional tracking at a certain rate," or you could say, "If we're wildly successful, we need to evaluate within a month how quickly we can start to deploy an internal solution that can offset that, and how much of an overlap time we need, maybe a month, for comparison of the metrics" or something, to ensure that the things you put in worked, and then turn the old one off.
But as long as you have those, then the process is actually pretty easy. It's just that they don't often plan those and they'll say, "We're going to buy instead of build," and then it becomes, "well, we bought, so we didn't think about the scenario. Oh, well. We're here now. It's just going to be an inherent built-in cost that I have to now raise my price or do something different to offset and find it elsewhere." But in reality, you could have just said, "No, we already had a plan for when we checked this and now we checked it and it didn't pass, so let's pivot."
Jacob Eiting:
It made me think of an example, a company I worked at where we did the opposite, where we started off with a very bare bones, dumping logs into a SQL database, and we actually moved everything back to, I think it was Mixpanel in this case. And the reason was it was too hard to play with. You couldn't play with... And at the time-
Taylor Wells:
It had the right tools, yeah.
Jacob Eiting:
Yeah. At the time, Tableau wasn't where it is. Looker didn't exist. There wasn't really a world where you could easily layer on a BI tool, at least in the startup market, something a small startup would pay for, and that was prohibitive because we just weren't asking questions. Because my data process, a big part of how I'll use data, is I like to just explore, just play around. If you have good base data and the semantics are mostly understandable, and if 75% of the answers you come to are valid, right? Once in a while, you'll pull a query and you'll be like, "Wow, it doesn't make sense because whatever." I find that to be an incredibly powerful tool, and I don't know, I might be an exception in the exec world of somebody who's very from a data-
Taylor Wells:
You are. You are, but you're one of the exceptions that's actually on the right thing.
Jacob Eiting:
Yeah. But I feel like it's got to be on the rise, right? It's more-
Taylor Wells:
No.
Jacob Eiting:
It's more... No?
Taylor Wells:
At least not what I've seen. It's still very entrenched.
Jacob Eiting:
I mean, think about it this way. As the C-suite becomes more and more that they started their careers when SQL existed, I think it's likely that data literacy becomes a more common thing in the boardroom.
Taylor Wells:
I mean, I would say you're right, and an example would be Michael Paull, who was the president of Disney+ at the time. He was one of the types of people who would say, "Dashboards are meaningless if it's just showing me the same stuff. I'm now losing the thread. I don't know what I should be focused on, and I'm continually adding to that stack of data." If you're thinking about printing them out, "I'm adding to a stack of what I have to go through each time, but it all looks fairly the same as what it did the last time that I looked at it. So what exactly is it telling me?"
And it got to the point, near the end of when I was working on data products there, where I was tasked with writing an email each day by noon that just had three bullets and up to two sub-bullets within each of the bullets. Just an email that just said, "Here are the three things that you should be focused on that happened the day before. Here is the change." And even if that's, "Here is an interesting insight that you should be focused on that's based on something where I'm looking over time or something," but I'm abstracting that and they have the trust to understand and believe that you're surfacing them the right things, then suddenly dashboards become absolutely irrelevant, frankly, to me.
I love a good dashboard as a data person. I build dashboards on my own personal data. I love playing around with it, but if I needed to get value out of it before saying that, "Is it worth the time and investment," then I'd probably say, "No, it's not worth me charting how many iMessage emojis I send to each person in a list," or something like some nonsense. But companies will continue to track that and say, "Which are the most used emojis?" Instead of just saying, "Does it actually matter? Does it cost anything to do that?" And they're handing that report every month, every day or whatever.
So once you actually can raise that, and it starts at the top, of, "I don't actually want to look at dashboards anymore. I don't want to look at reports and I don't care about month-over-month if it's not actionable. I'd much rather you tell me that we're actually seeing a weird trend where people that log in after 10:00 PM on weekends wind up being our longest tenured customers over time," or, "people that sign up for the bundle annually wind up using it 40% more than other consumers, then that's what I actually want to know."
Jacob Eiting:
You're not going to have that in your dashboard, right?
Taylor Wells:
Right, because then I can aim business teams at those problems. "Hey, we need to go promote more this annual bundle. We need to go aim at ad campaigns that run in the evening because we're missing that market."
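For anyone who wants to experiment with that kind of daily digest, a minimal sketch in Python might look like the following. The daily_metrics table, its columns, and the rule of picking the three largest day-over-day swings are assumptions made up for illustration; as Taylor stresses, the value comes from an analyst's judgment about which three things matter, not from a mechanical ranking.

```python
import sqlite3
from datetime import date, timedelta

def three_bullet_digest(conn: sqlite3.Connection, today: date) -> str:
    """Pick the three biggest day-over-day metric swings and format them as bullets.
    Assumes a hypothetical daily_metrics(metric TEXT, day TEXT, value REAL) table."""
    yesterday, prior = today - timedelta(days=1), today - timedelta(days=2)
    rows = conn.execute(
        """
        SELECT a.metric, a.value AS latest, b.value AS previous
        FROM daily_metrics a
        JOIN daily_metrics b ON a.metric = b.metric
        WHERE a.day = ? AND b.day = ?
        """,
        (yesterday.isoformat(), prior.isoformat()),
    ).fetchall()

    # Rank by relative change and keep the three biggest movers.
    changes = sorted(
        ((metric, latest, (latest - previous) / previous)
         for metric, latest, previous in rows if previous),
        key=lambda r: abs(r[2]),
        reverse=True,
    )[:3]

    bullets = [
        f"- {metric}: {pct:+.1%} vs. the day before (now {latest:,.0f})"
        for metric, latest, pct in changes
    ]
    return f"Top 3 things to look at for {yesterday}:\n" + "\n".join(bullets)
```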
Chapek, after him, even made a comment at one conference that one of the craziest things he found in the Disney+ data was that more than 50% of the audience was single adults, and I think even broadly adult males. But the assumption coming in was that this was going to be a kids' app. I mean, that was what consumers interpreted initially, and then they saw it would be more than that, but it was also never planning to have R-rated content or FX content, et cetera. What are you going to do with the Fox stuff?
So he was surprised to see that this Disney-branded, family-oriented thing was, surprisingly, used heavily by adults who didn't have kids. But that's the kind of stuff you can unlock from those insights that you find early on, and then you can adjust. But if I'm just showing you a report you asked for that says, "Show me X, Y, Z," and it isn't focused on how much this thing has grown into a tool primarily used by adults who don't have children, then yeah, it could be six months before you actually realize it and you've missed out. Maybe that's when another competitor comes out and eats your lunch.
Jacob Eiting:
You made me think of a neural activation. The dashboards are raw input, but there needs to be a...
Taylor Wells:
A filter.
Jacob Eiting:
You would think that AI would get there, but I'm not sure we're anywhere close to that, because it's like, I have a bunch of dashboards. We have our reporting for board decks and all that stuff; it's a little more formal. But then I also have my own crazy dashboards that are not super interpretable, but I know what they mean and they give me signal. But again, I don't really care how many API connects we had yesterday. I care about that piece of data in context with everything else on the dashboard, in context with what I'm seeing in Slack; it's not every day-
Taylor Wells:
Budgets, budgets, people, all of it.
Jacob Eiting:
But yeah, it's like a human at this stage still has to be there to synthesize that into what you're talking about, those three bullet points, which I don't send to anybody because they're for me. Sometimes I'll send them to the team and I'll be like, "Whoa, look at this." The value is questionable. And it's interesting to think, this is just speculative, but will we be able to, in data land, get there... I think that is the holy grail. Everybody says they want to generate insights, but I'm not sure we're anywhere close to that.
David Barnard:
I was going to say, I think my biggest takeaway from this whole conversation is that the absolute best BI tool that you can build is a really smart human with access to good data who writes three bullet points a day in an email.
Taylor Wells:
And who isn't being asked to do things that don't add value, right? Who is challenged to actually get out of the normal practice in data, because I mean, some of the bullets that we would send are there to give that context. It could be, "Hey, we had a 10% increase in signups in the EU." That may be on a main dash, but drilling into that may be, "And it was primarily in the Netherlands yesterday"-
Jacob Eiting:
Yeah, figuring out why.
Taylor Wells:
"... and that's because we launched a new promo in the Netherlands overnight for three months free, but we spent with a target of 20% overnight so we actually performed poorly." But you would lose that at a typical dashboard level by saying, "Hey, look, things are going well," but in reality, you lost the money. What other anomaly? And how these questions morph over time.
Someone may want to know how a Taylor Swift Folklore exclusive that launches at midnight does over the next 24 hours. And then going forward, they may say, "Now I want to see how Beyonce's does when we launch that. And then I want to stack it against how well movies are doing versus those exclusives. And ultimately, what was the budget? How much promotion did we do? What did we initially expect? And did that one look like it outperformed when, with 3x the budget and expectations, it actually underperformed?" I need something that can synthesize that.
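As a rough sketch of the budget-and-expectations comparison Taylor is describing, something like the following could work. The Launch fields and the numbers are invented for illustration; a real version would pull from whatever launch-tracking data the team actually keeps.

```python
# Hypothetical sketch: stack launches against their forecasts and budgets,
# so a bigger raw number that missed its own expectations still stands out.
from dataclasses import dataclass

@dataclass
class Launch:
    title: str
    first_24h_streams: float
    expected_streams: float
    budget: float

def compare_launches(launches: list[Launch]) -> None:
    for l in launches:
        vs_expectation = l.first_24h_streams / l.expected_streams  # >1.0 beat the forecast
        per_dollar = l.first_24h_streams / l.budget                 # efficiency of the spend
        print(f"{l.title}: {vs_expectation:.0%} of forecast, "
              f"{per_dollar:.1f} streams per promo dollar")

compare_launches([
    Launch("Exclusive A (midnight drop)", 4.2e6, 3.0e6, 1.0e6),
    Launch("Exclusive B (3x budget)",     5.0e6, 6.0e6, 3.0e6),
])
# Exclusive B had the bigger raw number but missed its forecast and was far
# less efficient per dollar; that's the nuance a topline dashboard hides.
```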
And I may have a counterview there, in that I do think that with good data and well-informed AI, you actually can reach that. It's just that teams would need the patience and commitment to get out of their comfort zone, know that things can be wrong sometimes, and accept that you have to double-check a bit, but ultimately it will get you to that point.
I've even dumped credit card data, bank data, and things like that into GPT and run them through its new data analysis mode with prompts like, "Look for insights. What jumps out at you? Graph these by category. What's changing?" And it can do a fairly good job of saying, "Hey, you're spending more on restaurants late in the month," or, "In the summer your restaurant costs double, but you're budgeting X evenly across the year, so you should have a seasonal budget for these things, because you're going out more for conferences or it's just nice out." It does a decent job of not only finding those, but then having the memory to add that information later and continue where it left off and say, "Yeah, I see these patterns"-
Jacob Eiting:
Yeah, maybe I'll be wrong. Maybe I'll be wrong. Maybe I'll be wrong.
Taylor Wells:
I definitely would not bet-
Jacob Eiting:
Underestimate.
Taylor Wells:
... anything on it currently, but I think that it's actually surprised me in the last couple of months versus where it started about a year ago.
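For readers who would rather reproduce that kind of check locally than hand bank statements to GPT, a small pandas sketch of the seasonal-spend analysis might look like this. The CSV layout (date, category, amount columns) and the 1.5x threshold are assumptions, not anything from the episode.

```python
# Rough pandas version of the seasonal-spend check described above.
# Assumes transactions.csv has date, category, and amount columns.
import pandas as pd

tx = pd.read_csv("transactions.csv", parse_dates=["date"])

# Total spend per category per calendar month, pivoted so months are columns.
monthly = (
    tx.assign(month=tx["date"].dt.month)
      .groupby(["category", "month"])["amount"].sum()
      .unstack(fill_value=0)
)

# Compare each month to the category's average month to spot seasonal swings.
baseline = monthly.mean(axis=1)
ratio = monthly.div(baseline, axis=0)

# Flag category/month pairs running at least 50% above their baseline,
# e.g. restaurant spend doubling in the summer months.
flags = ratio[ratio >= 1.5].stack()
for (category, month), r in flags.items():
    print(f"{category}: month {month} runs {r:.0%} of its typical month")
```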
Jacob Eiting:
There's a blindness there where you're always assuming that the thing you do will never be replaced by AI.
Taylor Wells:
Exactly. Once I saw a C3 AI demo, that's an Andreessen Horowitz one, it's solely geared at AI for analytics within enterprise corporations, synthesizing all that information, citing its sources, and being able to look across all these things without people having to do the ETL and all that. I was like, "This is amazing. Oh, no. Where do I go?"
Jacob Eiting:
To a beach somewhere.
Taylor Wells:
Yes. So no one is safe.
David Barnard:
All right, we've gone super deep into all the theory and practice and I do want to end with a few more concrete examples because you shared some really cool ones with me before this, so let's just go through a few and we'll wrap up. But one of the examples you shared was Bluey and understanding it as content in ways that you wouldn't expect.
Taylor Wells:
Right. The short of that was that early on we had assumptions about what could be popular. We had assumptions about what would be better algorithmically driven versus curated sets of content. Do we want to create one called puppies and kittens or do we want something that's going to figure out whether or not you like puppies but hate kittens and don't want to see a row with a mix?
And so early on, we had to make decisions on which rows would be curated and where those would be placed, and then also what content would go in the curated rows and in what order. And a lot of times, without getting into specifics, that was based on priority, maybe the budget of the content itself: "We want to self-promote some of these blockbusters higher." Maybe it was, "Well, these probably won't be popular," or, "Oh, this is what Disney has found to be successful on its linear channels in the past in terms of audience ratings from Nielsen or whatever, so these are probably going to be our more popular shows."
One of those sets was Disney Jr. in the adult section. And they thought, essentially, my assumption of what they were thinking was, "Well, this can just go at the end, because otherwise they'd use a kid profile. So we'll just put Disney Jr. at the bottom." And one of the newer shows in there was Bluey, a BBC and ABC, Australian Broadcasting Corporation, collaboration out of Australia, a cartoon about an Australian family of dogs. It is by far the best show; I would almost say it's worth watching even if you're not a parent, or planning to be a parent-
Jacob Eiting:
It's the only cartoon that's made me cry.
Taylor Wells:
Exactly. I've cried more times than I'd like to admit, frankly, maybe more than I've cried over things that my children have [inaudible 00:54:05]-
Jacob Eiting:
Have actually done.
Taylor Wells:
I mean, in total. It's such a moving show and it's so well-formatted, seven-minute episodes. It's not long-form content, so it can be something to get them tired or to distract them for a moment or whatever, but they squeeze such a fantastic message into it. But it wasn't on the radar based on, for example, Disney Jr.'s historical numbers. It had just launched on cable channels, so there weren't really metrics on how many people were watching it or how it was used. But on Disney+, we were seeing fantastic numbers from it if you just looked at what content gets watched all the way through, a 98, 99% completion rate per episode, or how frequently people finished the whole season, or rewatched the season, things like that. But it was still a limited number of people actually doing that.
And as a data person, I wanted to use that one as an example, not only because I loved the show and I just intuitively thought that it would be successful if it had more of a platform, but why should it be a hero row? How am I going to justify that it's a big masthead image at the top of an adult profile? So then let's look at the child profile. Well, we're not really promoting the child profile. Most people may not even know that it exists. They log into Disney and if they skip the profile setup-
Jacob Eiting:
Sure, yeah. We don't have a child profile.
Taylor Wells:
... then your name is just blank. It's a Mickey head and you never think about it again unless somebody prompts you to go set that up. So they're just saying-
Jacob Eiting:
My whole Disney+ is a child profile.
Taylor Wells:
Right. And also as a parent, when you're launching this thing, you probably just bought it because the kid's screaming in the corner and you're like, "Purchase, purchase, just"-
Jacob Eiting:
Yeah, yeah, yeah, I know. "Hold on, honey, I have to do more forms."
Taylor Wells:
Yes, say the word Bluey in there, pull it up, hit play, and then we're off to the races. But now you've got an adult profile using it.
So to me, it was already inherently flawed, because most people were not setting up child profiles. And I didn't think it was because they didn't have kids; it was just because they were watching younger-age content. But how discoverable was Bluey? It was on the last row, in one of the last tiles. So even on an Apple TV with a quickly scrolling wheel, you'd still have to scroll through 25 rows and then over 15 or 20 tiles before you got to that content. And if you completed the content, it was no longer in your Continue Watching, so you'd have to go find it each time. And at the time, there was only one season, so you could burn through it in a couple of hours.
It became so frustrating, because the kids would want to watch it. So I had the luxury of at least having a captive audience I could test on and a real use case, but it was a good example, and it could have been any content: how hard is it for them to get to it, and what shows up when I start to type Bluey? If I type B-L and Blue's Clues or something like that comes up, and Bluey is buried down here because we chose to sort alphabetically or by number of views, how hard is it for that parent to ultimately get back to it? And in the child profile, the search was non-existent. You couldn't search, so you literally had to find it within these groupings that didn't have names, because the expectation is that a five-year-old is using it.
So I would use that one to say: you have this content that is being consumed heavily by a subset of users of the app, and they are religious about it, they're sharing links to it, et cetera, but you've made it nearly impossible for them to pin it as a rewatch once they've completed it, or to find any of the other similar kid content, and your root problem is actually a broken process around how the profiles are set up and how people would find that content.
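Purely to illustrate the ordering choice Taylor raises, alphabetical or view-count ranking versus ranking that weights completion and rewatch behavior, here is a toy example. It is not a description of how Disney+ search actually worked, and the catalog, numbers, and weights are made up.

```python
# Toy illustration: ranking prefix-search results by engagement signals
# (completion and rewatch rates) instead of alphabetically or by raw views.
# The catalog, figures, and scoring weights are invented for this example.
from dataclasses import dataclass

@dataclass
class Title:
    name: str
    views: int
    completion_rate: float   # share of starts watched to the end
    rewatch_rate: float      # share of viewers who rewatch the season

CATALOG = [
    Title("Blue's Clues & You!", 900_000, 0.55, 0.20),
    Title("Bluey",               400_000, 0.98, 0.70),
]

def search(prefix: str, catalog: list[Title]) -> list[Title]:
    matches = [t for t in catalog if t.name.lower().startswith(prefix.lower())]
    # Alphabetical or views-first ordering would bury the smaller title;
    # weighting engagement surfaces what people actually finish and rewatch.
    return sorted(matches, key=lambda t: t.completion_rate + t.rewatch_rate, reverse=True)

for title in search("bl", CATALOG):
    print(title.name)   # Bluey first, then Blue's Clues & You!
```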
So that problem is a representation of something you can act on. What can you do right this second? Move Bluey up, move that row up higher, or make something taggable. But what's something you could do long-term? Think about thoughtful ways to promote or nudge people who do have kids at home to set up that child profile, and not just so you can know that they have a kid, or that they have three kids because they set up three profiles, but because of the value they get out of it.
By doing this, their Bluey stuff's always right there and your recommended-for-you row is not getting muddied with child content as well. Now I know the shows that you want to watch and I can show you the right thing. So that's your long term, but in the short term, I can also profit.
And in Disney's case, how that then turns into a massive business is: if this thing is successful and you move it up and you're seeing that exponential growth, and you're seeing it in real time, versus Nielsen scores or a movie in a theater where you're waiting for box office and the run-up and all that, how quickly can we say, "Here's what we need to do: a Broadway show. We need to ramp up merch manufacturing, because maybe that needs a two-month lead time of design and engineering for what all these toys and environments are going to look like. How do we get shipping lined up for that? How do we promote the show more on Disney Jr.? Do we need to reshuffle a lot of what we've already put there? Do we need to promote it in the parks and have a ride or an experience? Have Bluey walking around?"
Whatever it is, you see how that has a knock-on effect beyond that one insight that this is a good piece of content buried down here. But you miss that if you're not understanding the data, if you're only asking for structured things that are expected and never asking, "What's the thing I should be looking at instead?" And that could cause you to miss out on maybe a small bump, but maybe an entire new business model, and in this case, what has turned into a phenomenon of a show with a Broadway-
Jacob Eiting:
Right. This might be the killer app for Disney+, I think. We maybe don't watch that many movies, but Bluey is one of these things we watch over and over and over again because it's infinitely repeatable.
It's a good example of not carrying too many assumptions about how your users will use the thing, and just listening to your users. They will tell you what they care about, and maybe you tune your data stack so that you can detect, understand, and promote how your users are actually using it, whether your app is RevenueCat or whatever. That's classic product-market-fit finding, right?
Taylor Wells:
Yes.
Jacob Eiting:
It's like don't let your assumptions carry you too far. And I think that's one of the problems with building in big orgs too is you have so many ways people think it should work because their org and their thing and whatever, and it's all very well-intentioned, but at the end of the day, you have to look at the data and the evidence.
Taylor Wells:
The presumption, yeah. The presumptions that they lead in with. And that's to that last point of how it can have an impact: coming in with the expectation of what we think these are going to be bucketed by, like Marvel-and-Star-Wars superfans combined, or Disney animated fanatics. That might be what your executives are telling you. But if you're not giving data scientists and data analytics teams the leeway to explore what the true definitions may be, or whether they can reinforce those definitions, then you can miss out. And in that case, that's where we were able to reorient and say, "Hey, the data science teams and the data analytics teams are finding that these buckets actually make more sense in terms of what behavior to expect, or what people will resonate with, based on tranches like parents with kids, parents with young kids, parents with older kids, adult couples, single adults, teens."
Whatever those groupings are, they actually have more patterns in common than users do when you try to bucket based on something like "Star Wars superfan," because a third of them may love Star Wars but also like animated movies, a third may love Star Wars and Marvel a ton, and a third may hate Marvel and love Star Wars. You know what I mean?
So if you're going in with those presumptions instead of being curious and allowing those teams the leeway to explore, you can miss out. And don't shoot down what's right in front of you based on personal biases. There are times when I have an intuition about something, and I chase it down and see early on that, "Hey, this is actually going differently than I thought," and there is that urge internally. Maybe you've gone through those executive meetings and said, "I'm really confident that this is actually"-
Jacob Eiting:
Maybe we just need to run the test longer.
Taylor Wells:
"... a good hypothesis," but you get so much farther in your career by being ruthless with yourself about what you're seeing. COVID, data, things like that, you can apply this almost anywhere. But also, that it goes so far in companies with how they will trust you with larger and larger things in the future if you're able to own up with the fact that that actually turned out to not be. "I know that I thought that that was, and I told you that and we're half-bought in on it, but we got to pivot and this is just a miss. Here's a good thing that I found of that as well, some positive, but yeah, it stinks that we missed the mark."
They now know that if I-
Jacob Eiting:
Missing the mark is better than chasing false...
Taylor Wells:
Yes, and they now know, when they're thinking about a VP-level decision or a decision to launch a new thing, that you're not going to buy into something and potentially ride it into the ground and risk the business just because you don't want to go against what you had previously stated.
Jacob Eiting:
One bias there: I think people often have a bias to believe that everything should be causally explainable. You'll have some truth in the data, but the world's chaotic, and there may not be a causal explanation other than random fluctuation. And I find that's a really hard thing people get hung up on; they're like, "If a number goes up, there must be a reason why." And it's like, "Maybe. Maybe."
Taylor Wells:
Yes, that this is doing that, that this is leading to that, yeah. It's no-
Jacob Eiting:
But sometimes also, the dice get rolled a certain way and the number goes and there may be-
Taylor Wells:
Or how quickly you can put the question back to them: what would you do to prove that those are interrelated? Would it be that, for 10% of the users, you turn off the thing you think is driving this action, and then measure whether that action actually declines? And if it doesn't, then it actually isn't tied to that.
Jacob Eiting:
Yeah, could you actually design an experiment?
Taylor Wells:
Cool, okay, then how quickly can we do that? Right?
Jacob Eiting:
You're all thinking that, right.
Taylor Wells:
So even if somebody wants to try and correlate, just challenge them to okay, trust and verify, like-
Jacob Eiting:
Well, it forces them to think about the production process of the data and then you go, "Well, I guess the production process is inherently chaotic and so likely there isn't actually data here."
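A minimal version of the holdout test Taylor describes might look like the sketch below: hide the suspected driver from 10% of users, then check whether the action rate actually drops. The conversion numbers are invented, and a real analysis would need proper experiment design rather than a single significance test.

```python
# Sketch of the 10% holdout check: remove the feature you believe drives an
# action from a small slice of users, then test whether the action rate falls.
from math import sqrt
from statistics import NormalDist

def two_proportion_z(success_a: int, n_a: int, success_b: int, n_b: int):
    """Two-sided z-test for a difference between two proportions."""
    p_a, p_b = success_a / n_a, success_b / n_b
    pooled = (success_a + success_b) / (n_a + n_b)
    se = sqrt(pooled * (1 - pooled) * (1 / n_a + 1 / n_b))
    z = (p_a - p_b) / se
    p_value = 2 * (1 - NormalDist().cdf(abs(z)))
    return p_a, p_b, p_value

# 90% of users still see the feature; 10% are held out. Numbers are invented.
exposed_rate, holdout_rate, p = two_proportion_z(
    success_a=4_500, n_a=90_000,   # exposed users who took the action
    success_b=460,   n_b=10_000,   # holdout users who took the action
)
print(f"exposed {exposed_rate:.2%} vs holdout {holdout_rate:.2%}, p={p:.3f}")
# If the holdout rate doesn't meaningfully drop, the feature probably
# isn't driving the action after all.
```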
David Barnard:
Well, I think that's a great place to wrap up.
Taylor Wells:
Yeah.
David Barnard:
We could go on for four more hours, but-
Taylor Wells:
I told you. It could turn into a book if you let it.
David Barnard:
Yeah. But thanks so much for joining us. This was a really fun conversation.
Taylor Wells:
Absolutely.
David Barnard:
I think a lot of insightful things for folks to think about, and I know if any indie developers made it all the way through to this point, a lot of this is not directly applicable except that you need to think through a lot of this even as an indie, thinking about your little Firebase-
Jacob Eiting:
Yeah, yeah, yeah. The most successful indies I know are the most data-oriented ones. Yeah, always.
David Barnard:
Yeah, your little Firebase instance, it matters because that's how you're going to improve your product. And having good data in there, making good decisions based on that data is a key to building a great product.
Taylor Wells:
And I think that approaching it with a data-oriented mindset as an engineer and a developer on front-end applications is super important, thinking, "What happens if I change this later?" and how that's going to impact all the stuff downstream that relies on that data. That's usually an afterthought in an engineer's mind unless data is pushing for it, right? So yeah, that's a big takeaway too.
David Barnard:
Yeah. All right. Well, thanks so much for joining us.
Jacob Eiting:
Thank you.
Taylor Wells:
Yeah, thank you, guys. It's been a wonderful conversation. I can always dive in to stuff like this for hours, so thank you.
David Barnard:
Thanks so much for listening. If you have a minute, please leave a review in your favorite podcast player. You can also stop by chat.subclub.com to join our private community.