Austen Head: Halo's business logic requires the predictions we make to be highly interpretable. The functionality to explain which features influenced individual predictions was originally designed for customer audit/explanation purposes. But now, we find ourselves using that functionality internally to easily review the impact of models before deploying them.
Q: Halo sounds like a cool company! What do you do?
Austen Head: Halo accelerates the adoption of medical advancements by improving the sales and marketing effectiveness of biotech and medtech manufacturing companies. We recommend who a client should reach out to, what the client should teach them about, why that prospect would benefit from that instruction, and which specific product from the client addresses that prospect's need.
We are able to provide these insights because we gather and synthesize data to understand the market ground truth of all organizations that might purchase a bio/medtech device (academic researchers, clinical labs, consumer biotech companies, and others).
Halo is a SaaS company of 9 people (4 on data eng/sci, 7 have PhDs). We're led by our CEO, a former Google exec who launched AdSense for content and grew it from $0 to $2B annual revenue.
Austen Head: I'll present an example of a two-layer stacked ensemble model in which the outputs at each layer are interpretable and useful. To follow the model, it helps to know that Halo's clients (enterprises) have customers of their own.
The first layer of models predicts a client's customer response based on features intrinsic to the customer (such as the amount of funding from government grants). The second layer of models predicts how much the client can change the response of customers based on features that are actionable and related to client-customer interactions (such as whether the client invited the customer to a technical webinar on a particular topic).
This framework allows us to recommend that clients reach out to customers whose predicted response in the first-layer model is much greater than their observed response, and for whom the actionable features explain a large amount of the local variability in the second-layer model.
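As a rough illustration of the two-layer idea, here is a minimal sketch in Python. It is not Halo's implementation: the data, the single-feature least-squares fits, the `gap_threshold` parameter, and the names `fit_line` and `recommend_outreach` are all hypothetical, and a single global slope stands in for the per-prediction explanations described in the talk.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least squares with one feature; returns (intercept, slope)."""
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return y_bar - slope * x_bar, slope


def recommend_outreach(intrinsic, actionable, response, gap_threshold):
    """Hypothetical two-layer sketch.

    Layer 1 predicts response from an intrinsic feature.
    Layer 2 explains what layer 1 missed using an actionable 0/1 feature.
    Returns (indices of customers to contact, estimated lift of the action).
    """
    # Layer 1: expected response given only what is intrinsic to the customer.
    b0, b1 = fit_line(intrinsic, response)
    expected = [b0 + b1 * x for x in intrinsic]

    # Gap = how far each customer falls below the layer-1 expectation.
    gap = [e - r for e, r in zip(expected, response)]

    # Layer 2: regress the layer-1 residual on the actionable feature.
    residual = [r - e for r, e in zip(response, expected)]
    _, lift = fit_line(actionable, residual)

    # Recommend customers who underperform and have not yet received
    # the action, provided the action is associated with a positive lift.
    picks = [
        i
        for i, (g, a) in enumerate(zip(gap, actionable))
        if g > gap_threshold and a == 0 and lift > 0
    ]
    return picks, lift


# Made-up toy data: one entry per customer of a single client.
grant_funding = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]  # intrinsic feature
invited_to_webinar = [1, 1, 0, 1, 0, 0]          # actionable feature
observed_response = [1.5, 2.5, 2.0, 4.5, 4.0, 5.0]

picks, lift = recommend_outreach(
    grant_funding, invited_to_webinar, observed_response, gap_threshold=0.5
)
print(picks, round(lift, 3))
```

On this toy data, only the customer at index 2 is flagged: that customer responds well below what grant funding alone would suggest and was never invited to the webinar, while invitations are associated with a positive lift in the second layer.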
About the Startups Track
The data-oriented Startups Track at DataEngConf features dozens of startups forging ahead with innovative approaches to data and new data technologies. We find the most interesting startups at the intersection of ML, AI, data infrastructure, and new applications of data science, and highlight them in technical talks from their CTOs and the lead engineers building their platforms.