Idealistics is shutting down. This blog will continue to be updated at fullcontactphilanthropy.com.
Data does not make decisions

I participated in my first Google Hangout last Friday on the topic of using data in homeless services. The discussion was organized by Mark Horvath, a homeless advocate and founder of Invisible People. The call included Mark, myself, and three other practitioners with experience applying metrics in homeless services. You can check out the recording of the conversation here.

I was clearly the outlier on the panel. While homeless services is a part of my work, I consult with a range of organizations on a variety of issues, whereas the other folks on the hangout focus entirely on working with those experiencing homelessness, and it shows in their depth of knowledge.

There were a lot of interesting takeaways from the conversation, so I'll likely refer back to this discussion in future blog posts. One thing that stood out to me was that the conversation covered both the opportunities and the risks of using data in homeless services, a point that applies to all uses of outcomes data in the social sector.

At one point in the discussion, our attention turned to the possibility that homeless service providers could use predictive analytics to exclude people from receiving services. Before I go any further, let me describe what I mean by predictive analytics.

Predictive analytics

Predictive analysis is the process of using historical data to predict what will happen in the future. There are various statistical techniques for doing this, but the basic idea is that you fit a model explaining what happened to a set of observations in an existing data set. You then use that model to predict outcomes for a new set of people you do not yet have data on.

For example, your model might suggest that people who have a criminal background are more likely to get evicted from housing.
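To make that concrete, here is a minimal sketch of the idea in Python. All names and numbers are invented, and the "model" is just a raw eviction rate per group; a real analysis would use proper statistical tooling and far more careful data.

```python
# Historical records (invented): (has_criminal_background, was_evicted)
history = [
    (True, True), (True, True), (True, False),
    (False, False), (False, True), (False, False), (False, False),
]

def fit_eviction_rates(records):
    """'Fit' the simplest possible model: the observed eviction rate per group."""
    rates = {}
    for group in (True, False):
        outcomes = [evicted for background, evicted in records if background == group]
        rates[group] = sum(outcomes) / len(outcomes)
    return rates

model = fit_eviction_rates(history)

# Apply the model to new clients we have no outcome data for.
new_clients = [
    {"name": "Client A", "criminal_background": True},
    {"name": "Client B", "criminal_background": False},
]
for client in new_clients:
    client["predicted_eviction_risk"] = model[client["criminal_background"]]
    print(client["name"], round(client["predicted_eviction_risk"], 2))
```

In this toy data, clients with a criminal background show a higher historical eviction rate, so the model assigns them a higher predicted risk. What an organization does with that prediction is the subject of the next section.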

On the one hand, you might use this finding to provide additional supportive services that keep people with criminal backgrounds housed. On the other hand, you might exclude people with criminal backgrounds from your housing program, which brings us back to the worry my co-panelists raised on the Google Hangout.

Same data, different decisions

Limited resources are a fact of life, which makes it particularly important that organizations ration their services in ways that maximize the social value they aim to create. So does that necessarily mean we should use data to weed out those who are hardest to serve?

Not necessarily.

People make decisions, not data. Two people can look at the same data, and the same sound analysis, and make two different decisions. One organization might decide not to serve a certain demographic because it believes those individuals would not fare well in its program, while another organization might decide the exact opposite, reasoning that because those individuals face poorer prospects and risk worse outcomes, they should be a higher priority.

Indeed, that is exactly what the vulnerability index does. The vulnerability index is essentially a triage tool for prioritizing chronically homeless people for housing: the more vulnerable someone is, the higher priority they are to house.
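The same point can be sketched in a few lines of Python. The scores and names below are invented, and neither policy is the "right" one; the point is that identical scores can drive opposite decisions.

```python
# Hypothetical risk/vulnerability scores (higher = worse predicted prospects).
applicants = [("Applicant A", 0.2), ("Applicant B", 0.9), ("Applicant C", 0.6)]

# Policy 1: screen out the hardest to serve (exclude high scores).
screened_in = [name for name, score in applicants if score < 0.5]

# Policy 2: vulnerability-index-style triage (serve the highest scores first).
triaged = [name for name, score in sorted(applicants, key=lambda a: a[1], reverse=True)]

print(screened_in)  # ['Applicant A']
print(triaged)      # ['Applicant B', 'Applicant C', 'Applicant A']
```

Both policies consume the exact same data; the difference lies entirely in the values the organization brings to the decision.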

My point is not to argue whether it is better to serve those who are most or least vulnerable, but rather to illustrate that data is a tool that helps us make decisions consistent with our own values.

While data can help us better assess what has happened and what might happen in the future, it does not tell us what to do. The decisions we make should be informed by data, but data does not make decisions for us.

  • Justin Dove

    This is an interesting point. I find that it’s important to emphasize that while predictive analysis may not work well on the individual level (meaning, trying to predict whether person A will take a certain future action), it does have significant benefits for organizations trying to narrow their focus to those who are more likely to benefit from their services.

    Also, are you concerned that non-profit organizations, in an attempt to jump on the big data and predictive analysis bandwagon, will make hasty correlations, thus guiding them toward wayward strategic decisions? Of the pitfalls that I've read recently regarding non-profit organizations and data analysis, not much of it has focused on making sure that organizations are collecting quality, valid data (or at least enough data to make sound predictions). Am I over-thinking that this might be a significant concern?

    • http://www.fullcontactphilanthropy.com David Henderson

      Justin – you make a number of good points, the first of which is that models tend to tell us about "the average" person, which is interesting in the aggregate but not always instructive on a case-by-case basis.

      As to the question of organizations making hasty correlations, I do think this is a concern, although I'm not sure it is isolated to the nonprofit sector, as data literacy tends to be low across society.

      I think there are really two issues you raise here, one referring to data integrity and the other having to do with analysis of the data, assuming integrity. On that first issue, in my experience organizations tend to collect far too many indicators, which leads to poor data quality, something I have written about before:

      http://idealistics.org/fcp/2012/09/06/too-many-indicators-means-a-whole-lot-of-nothing/

      To the second issue of sound data analysis, I tend to think it is just as important to illustrate what our data does not say as much as it is to highlight what it does. Without understanding error bounds around findings and other diagnostic metrics, it is easy to conclude that models provide more precision than they actually do, which can lead to spurious conclusions.