Add Three Tablespoons of Data Science, Stir Gently
What happens when you apply data science to social or business challenges? What factors will help to make such projects successful?
One of my favorite analyses of these questions comes from Kentaro Toyama's book "Geek Heresy: Rescuing Social Change from the Cult of Technology" (2015).
In the book, Toyama summarizes what he learned as the Head of Microsoft's Research & Development Center in India.
The center's mission was to design information technologies that reduce social inequality and to roll them out. Implementation happened first in trial projects with partner nonprofit organizations. If those trials were successful, the team rolled out the technologies to a wider audience. Some of their projects included:
Special tablets for use in rural schools
Video-recorded lessons for classrooms
Learning games
A job site for illiterate women to find service sector jobs
The team found that their results were rather mixed. In all their projects, they took care to use the contemporary best practices in the field of Human-Computer Interaction. They made sure to identify and design for the specific needs of their beneficiaries. Still, when they measured the success of their projects, most of them failed to have a positive impact on the intended audience.
But some of their projects actually worked. Trying to figure out why some of them did, and most did not, Toyama identified three main factors that made a project successful:
The dedication of the involved researchers to social impact rather than research outcomes
The commitment and capacity of the partner organization
The desire and ability of the beneficiaries to take advantage of the provided technology
Quoting from the book: "All three factors, though, point to human context as what matters most. Or, to put it another way, the technology isn’t the deciding factor even in a technology project. The right people can work around a bad technology, but the wrong people will mess up even a good one."
Toyama calls his findings "The Law of Amplification of Technology": "Technology's primary effect is to amplify human forces ... and magnify existing social forces. The degree to which technology makes an impact depends on existing human capacities."
For example, they found that user interfaces with extra icons designed to help illiterate beneficiaries mostly helped semi-literate and literate beneficiaries.
Reading Toyama's findings was a lightbulb moment for me. They provide plausible explanations for questions such as:
Why technology professionals tend to be much more optimistic about the benefits of applying technology to social challenges than nonprofit professionals. Technologists experience much greater benefits from applying technology in their own lives. So it seems obvious to them that more technology will also benefit marginalized groups, if only they had access.
Why technology projects to help marginalized and poor people have very high failure rates. Most XForGood projects focus on delivering whatever X is (data science, mobile technology, internet access, ...). They focus less on the critical capacity building needed for that technology to amplify positive forces.
Why we see one unsuccessful buzzword technology or management fad after another in the tech industry. It may be AI, scrum or microservice architectures, or any other of your favorite buzzwords. As in the Tech4Good space, many industry projects I have seen for such "technologies du jour" focus on the technology. They ignore the individual and organizational capacity needed to make use of it. And as Toyama observed for Tech4Good projects, many of them fail to make companies more successful.
In the second part of his book, Toyama sketches his own theory of factors for successful technology implementation.
He calls them "heart, mind, and will". These refer to the intentions, capabilities and self-control of the designers, implementers and beneficiaries of tech projects.
I find his three factors a useful tool to analyze where data science projects failed, or where they might fail given the current organizational context of a project.
Failures of intention are a common reason for failed data science projects. I have often seen these in business settings: for example, when a data science team tries to get other departments to make use of its insights, but those departments are not motivated to change their ways of operating. I have also seen them in top-down initiatives, such as a CXO attempting to make the company more data-driven.
Failures of capabilities come in all shapes and sizes. Some examples that I have seen:
Use of biased datasets for training ML algorithms, leading to discriminatory systems. I have described several of these before.
Misinterpretation of the results of data analyses. For example, ignoring statistical limitations such as small sample sizes.
Mistakes in data collection that impair data quality
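To make the small-sample point concrete, here is a minimal sketch. It is my own illustration, not something from Toyama's book, and the numbers are made up: the same observed success rate of 60% measured once on 10 cases and once on 1,000 cases. The helper name `proportion_ci` is hypothetical; it computes a Wilson score interval, which is one common way to put an approximate 95% interval around a proportion.

```python
import math


def proportion_ci(successes, n, z=1.96):
    """Wilson score interval for a proportion (z=1.96 gives roughly 95%)."""
    p_hat = successes / n
    denom = 1 + z**2 / n
    center = (p_hat + z**2 / (2 * n)) / denom
    half_width = z * math.sqrt(p_hat * (1 - p_hat) / n + z**2 / (4 * n**2)) / denom
    return center - half_width, center + half_width


# Hypothetical data: the same 60% success rate at two sample sizes.
for successes, n in [(6, 10), (600, 1000)]:
    low, high = proportion_ci(successes, n)
    print(f"n={n}: observed {successes / n:.0%}, plausible range {low:.2f} to {high:.2f}")
```

With only 10 cases, the plausible range is wide enough to include 50%, so the data cannot tell us whether the effect is better than a coin flip. With 1,000 cases, the same observed rate gives a much narrower range.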
Failures of self-control are especially widespread in a field as buzzwordy as data science. They happen whenever we prefer a fancy machine learning model over a simpler one that will do the job. Whenever we measure our datasets by size rather than quality. And whenever we believe that hiring a small army of senior data scientists will magically prepare our company for "the coming AI future".
Summing up, I found Toyama's book to be an enjoyable and insightful read. His analysis generalizes much further than the original context of India.
Keeping an eye on these questions could help avoid some common traps in data science projects:
Intention: Does every stakeholder actually care about the success of the project?
Capabilities: What skills, resources and contacts does each stakeholder need? Are those in place?
Self-control: Is everyone focussed on achieving the goal? Or are we getting sidetracked by "shiny objects" or other distractions?