Data science has traditionally been an analysis-only endeavor: using historical statistics, user interaction trends, or machine learning to predict the impact of deterministically coded software changes. For instance, “how do we think this change to the onboarding workflow will shift user behavior?” This is data science (DS) as an offline toolkit for making smarter decisions.
Increasingly, though, companies are building statistical or AI/machine learning features directly into their products. This can make our applications less deterministic – we may not know exactly how they will behave over time, or in specific situations – and harder to explain. It also requires product managers to engage much more directly with data scientists about models, predictability, how products work in production, how and why users interact with our products, and how our end users measure success. (Hint: most users don’t understand or care about F1 scores; they just want to get the right answer.)
- Provide much deeper context than traditional software projects, especially use cases and business goals
- Remember that data science projects are uncertain and our judgment may be weak
- Choosing/accessing data sets is crucial
- Describe how accurate this application needs to be, and anticipate handling “wrong” answers
- “Done” means operationalized, not just having insights
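To make the accuracy bullet concrete, here is a minimal Python sketch of the two things a product manager might ask a data scientist to pin down: a measured quality score (F1, computed from counts of right and wrong answers) and a rule for what the product does when the model is unsure. The names `Prediction`, `respond`, and `min_confidence`, the 0.8 threshold, and the "needs human review" fallback are all hypothetical, chosen only for illustration; a real product would set them from its own use cases.

```python
from dataclasses import dataclass


@dataclass
class Prediction:
    label: str         # the model's answer
    confidence: float  # model-reported score in [0, 1]; assumed to be roughly calibrated


def f1_score(true_pos: int, false_pos: int, false_neg: int) -> float:
    """F1 is the harmonic mean of precision and recall."""
    if true_pos == 0:
        return 0.0  # no correct positives, so precision and recall are both zero
    precision = true_pos / (true_pos + false_pos)
    recall = true_pos / (true_pos + false_neg)
    return 2 * precision * recall / (precision + recall)


def respond(pred: Prediction, min_confidence: float = 0.8) -> str:
    """Show the model's answer only when it is confident enough (hypothetical policy)."""
    if pred.confidence >= min_confidence:
        return pred.label
    # Anticipated "wrong answer" path: hand off instead of guessing.
    return "needs human review"
```

A call such as `respond(Prediction("approve", 0.55))` returns the fallback rather than the model's label. The threshold is a product decision as much as a modeling one: raising it sends more cases to the fallback path and shows users fewer wrong answers.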
Data-driven applications are more complicated than deterministic software products, and working with data scientists has some unique challenges. We need to approach these thoughtfully, recognize the patterns, and respect the special talents of each group.