Deployed Organization’s first machine learning model

The short version

Designed and deployed Best Friends' first production machine learning model, replacing a methodologically flawed estimation process with a continuously updating, shelter-level prediction engine that now tells us, on any given day, which shelters are saving lives and which ones need help.

  ---                                                       

The Problem

Best Friends has never received data from every shelter in the country, so estimation has always been necessary. The method we inherited extrapolated national shelter data from county-level human population counts. It was a blunt instrument: it couldn't answer shelter-level questions, couldn't break data down below the state level, and relied on figures that were sometimes three years old.

Before the pandemic, that was workable. Animal sheltering numbers were relatively stable, and three-year-old data wasn't wildly off. Then the pandemic hit, and everything changed, sometimes dramatically, sometimes overnight. A number that was two years old could now be far from reality. We also had older shelter-level data we weren't using, and internal organizational expertise (strategists who'd worked at or alongside specific shelters) that we had no way to capture.

  The existing process was leaving value on the table and producing estimates we couldn't fully defend.

  ---

What I Did

I identified that we could build a regression model using the data we did have (our own shelter records, combined with publicly available data) to predict what was happening in shelters that weren't reporting to us. I led the variable selection process, testing organization type, population size, prior intake data, legal factors like spay/neuter mandates, and demographic and economic indicators. Prior intake turned out to be the strongest predictor of outcomes and save rates. Organization type mattered significantly. Most other variables didn't move the needle.
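To make the idea concrete, here is a minimal sketch of that kind of regression, not the production model. It fits ordinary least squares on two of the predictors named above (prior intake and an organization-type flag) for shelters that report, then estimates intake for one that doesn't. The data is synthetic, and every name (`fit_ols`, `predict`, the municipal flag) is a hypothetical stand-in.

```python
"""Toy least-squares regression for shelters that do not report.

All data and names here are hypothetical illustrations.
"""
from typing import List, Sequence


def solve(a: List[List[float]], b: List[float]) -> List[float]:
    """Solve a @ x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [list(row) + [rhs] for row, rhs in zip(a, b)]  # augmented matrix
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            factor = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= factor * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):  # back substitution
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x


def fit_ols(rows: Sequence[Sequence[float]], targets: Sequence[float]) -> List[float]:
    """Return [intercept, coef_1, ...] from the normal equations X'X b = X'y."""
    design = [[1.0] + [float(v) for v in row] for row in rows]
    k = len(design[0])
    xtx = [[sum(x[i] * x[j] for x in design) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * y for x, y in zip(design, targets)) for i in range(k)]
    return solve(xtx, xty)


def predict(coefs: Sequence[float], row: Sequence[float]) -> float:
    """Apply fitted coefficients to one shelter's feature row."""
    return coefs[0] + sum(c * v for c, v in zip(coefs[1:], row))


# Synthetic reporting shelters: (prior-year intake, is_municipal flag).
reporting = [(1000, 1), (400, 0), (2500, 1), (800, 0), (1500, 0), (600, 1)]
# Synthetic current-year intake, built from an exact linear rule for clarity.
current_intake = [1070, 410, 2420, 770, 1400, 710]

coefs = fit_ols(reporting, current_intake)

# A hypothetical non-reporting private shelter with 1,200 prior-year intakes.
estimate = predict(coefs, (1200, 0))
print(round(estimate, 1))
```

A real model would need held-out evaluation and many more candidate variables. The sketch only shows the shape of the approach: learn from shelters with data, then predict for those without.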

We also created a formal process for capturing organizational expertise, covering the small number of shelters that had never sent us data but that our field teams knew well. Those shelters account for roughly 3% of the dataset.

We ran the regression manually the first year, moved to monthly updates, then rebuilt it as a machine learning model that now runs continuously, incorporating rolling 12-month data and updating predictions daily for any shelter that hasn't reported in the last 12 of 24 months.
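The daily selection step can be sketched roughly as below. This is an illustration under one stated assumption: "hasn't reported in the last 12 of 24 months" is read here as "has fewer than 12 monthly reports in the trailing 24 months". The production rule may differ, and the function names and data layout are hypothetical.

```python
"""Sketch: pick shelters that need a prediction, and sum a rolling 12 months.

The eligibility rule below is an assumed reading of the text, not a
confirmed specification. All names and data are hypothetical.
"""
from datetime import date
from typing import Dict, Iterable, Tuple

Month = Tuple[int, int]  # (year, month)


def month_index(year: int, month: int) -> int:
    """Map a calendar month to a single increasing integer."""
    return year * 12 + (month - 1)


def trailing_window(as_of: date, months: int) -> range:
    """Month indices for the `months` months ending with the as_of month."""
    end = month_index(as_of.year, as_of.month)
    return range(end - months + 1, end + 1)


def needs_prediction(reported: Iterable[Month], as_of: date,
                     window_months: int = 24, min_reports: int = 12) -> bool:
    """True if the shelter has fewer than min_reports reports in the window."""
    window = trailing_window(as_of, window_months)
    in_window = {month_index(y, m) for y, m in reported
                 if month_index(y, m) in window}
    return len(in_window) < min_reports


def rolling_12_month_total(monthly: Dict[Month, int], as_of: date) -> int:
    """Sum monthly counts over the 12 months ending with the as_of month."""
    window = trailing_window(as_of, 12)
    return sum(v for (y, m), v in monthly.items() if month_index(y, m) in window)


as_of = date(2024, 12, 15)

steady = [(2023, m) for m in range(1, 13)]   # 12 reports inside the window
sparse = [(2024, 1), (2024, 2), (2024, 3)]   # only 3 reports inside the window
lapsed = [(2022, m) for m in range(1, 13)]   # all reports predate the window

steady_flag = needs_prediction(steady, as_of)
sparse_flag = needs_prediction(sparse, as_of)
lapsed_flag = needs_prediction(lapsed, as_of)

intake = {(2024, m): 100 for m in range(1, 13)}
intake[(2023, 12)] = 999                     # falls outside the 12-month window
total = rolling_12_month_total(intake, as_of)
```

Under this reading, a scheduled daily job would recompute the flags each morning and refresh predictions only for the flagged shelters.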

Getting executive buy-in was its own challenge. There was significant reluctance to move away from the existing methodology, even when the case for change was clear. I argued that we couldn't defend what we had, and that standing still wasn't a neutral choice. The shift happened, but it required sustained advocacy.

  ---

The Result

For the first time, we have a current, shelter-level view of what's happening across the national shelter system, updated daily rather than annually. We know which shelters are no-kill, which are trending in the wrong direction, and which are right on the cusp.

A few months after launch, we tested the model's predictions against shelters that had subsequently provided their actual data. The predictions were approximately 90% accurate, with errors split evenly between false positives and false negatives. Since then, the model has received two independent external validations, from data scientists at the University of Oklahoma and the University of New Hampshire. Both confirmed approximately 87% accuracy, with correlations above 0.91 across all tested markets.
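For readers who want to see how figures like these are typically computed, here is a small sketch. It assumes that "accuracy" means agreement on a no-kill classification and that the cutoff is a 90% save rate. Both are assumptions, as are the function names, and the numbers are synthetic rather than our validation data.

```python
"""Sketch: accuracy, false positives/negatives, and Pearson correlation.

The 0.90 threshold and all data are assumptions for illustration only.
"""
from math import sqrt
from typing import Sequence, Tuple

NO_KILL_THRESHOLD = 0.90  # assumed cutoff, not taken from the text


def confusion_counts(predicted: Sequence[float], actual: Sequence[float],
                     threshold: float = NO_KILL_THRESHOLD) -> Tuple[int, int, int, int]:
    """Return (true_pos, false_pos, false_neg, true_neg) for the no-kill label."""
    tp = fp = fn = tn = 0
    for p, a in zip(predicted, actual):
        pred_label, true_label = p >= threshold, a >= threshold
        if pred_label and true_label:
            tp += 1
        elif pred_label and not true_label:
            fp += 1
        elif true_label:
            fn += 1
        else:
            tn += 1
    return tp, fp, fn, tn


def pearson(xs: Sequence[float], ys: Sequence[float]) -> float:
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    var_x = sum((x - mx) ** 2 for x in xs)
    var_y = sum((y - my) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)


# Synthetic save rates for ten shelters that later reported actual data.
actual_rates = [0.95, 0.88, 0.92, 0.70, 0.91, 0.85, 0.97, 0.60, 0.89, 0.93]
predicted_rates = [0.93, 0.91, 0.94, 0.72, 0.88, 0.83, 0.96, 0.65, 0.87, 0.92]

tp, fp, fn, tn = confusion_counts(predicted_rates, actual_rates)
acc = (tp + tn) / len(actual_rates)
r = pearson(predicted_rates, actual_rates)
print(f"accuracy={acc:.2f} false_pos={fp} false_neg={fn} r={r:.2f}")
```

Equal false positive and false negative counts, as in this toy data, suggest the model is not systematically over- or under-calling shelters as no-kill.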

That precision has changed how we plan communications, how we target interventions, and how we allocate resources.    

 Building the internal data literacy to trust and use that precision was as much work as building the model itself.
