Since becoming a product manager at Valimail, I have changed my mind considerably about the value of machine learning solutions and the use of PII. In this blog post, I’ll talk about how my thinking on PII has evolved.
Before joining Valimail, I had a very different perspective on using sensitive personal data during product design. I am proud of the machine learning solutions I built as a product manager in the marketing attribution and consumer insights space, but in hindsight, I was always chasing shiny new data sets that might improve the performance of the models.
That approach, however, only ever produced tiny gains in the models, and those gains came at a high cost: the risk of exposing personally identifiable information (PII).
At Valimail, the product design approach recognizes this cost. Once PII is integrated into the data pipeline, it is almost impossible to control its use. It’s like paint: once you mix two colors together, there is no way to separate them again.
First, let’s look at the four key pillars of product design we maintain here at Valimail:
- Avoid use of PII. We are highly sensitive to privacy concerns and go to great lengths to eschew personally identifiable information.
- Build solutions that work 100%. We favor deterministic outcomes, not probability scores. We go after results, not ambiguity.
- Automate everything. This is not a revolutionary principle, of course, but it’s one we are obsessed with.
- Make implementation easy for customers. Offering consulting services to operate the product for our customers is not our model. Our objective is to make the product simple and easy to use, not only from a UX standpoint, but also for implementation and operation.
Avoid Use of PII
Let’s consider how that first principle, Avoid use of PII, shaped decisions in the design and development of our newest product, Valimail Defend. Valimail Defend addresses “lookalike-domain” phishing attacks, in which a message arrives from a domain crafted to resemble one the recipient trusts (e.g. firstname.lastname@example.org).
We found that most solutions on the market were based on content scanning and made heavy use of PII, such as the sender’s full email address. They relied on similarity algorithms, feeding the email subject and message body to a machine learning model that produced a guess at whether a message was a phishing email. Most of these approaches just didn’t work: they had limited success, and they were fraught with privacy issues. Our requirement to avoid PII ruled out content-scanning methods for us.
By clearly defining what we would NOT do, we ended up with a remarkable solution. Valimail Defend uses identity, not content, as the key piece of the puzzle. Specifically, we built a clearinghouse for trusted senders of email that knows with certainty whether a sender is authorized to reach the employee inbox. If the sender is not authorized, Defend does not let the email pass through to the inbox.
This approach does not use PII, because the only data that is used is the sender’s domain. The email administrator has granular policy control over employee inboxes, and only trusted email traffic is allowed into the organization.
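To make the idea concrete, here is a minimal sketch of a domain-only, allow-or-block decision. It is an illustration under stated assumptions, not how Valimail Defend is actually built: the names `TRUSTED_DOMAINS`, `sender_domain`, and `is_authorized` are hypothetical, the trusted list is hard-coded where a real system would consult the administrator's policy, and the sketch assumes the sender's domain has already been verified as genuine by some other means.

```python
from email.utils import parseaddr

# Hypothetical trusted-sender list; a stand-in for the administrator's policy.
TRUSTED_DOMAINS = frozenset({"example.com", "partner.example"})


def sender_domain(from_header: str) -> str:
    """Return the lowercased domain of a From header, or '' if there is none."""
    _display_name, address = parseaddr(from_header)
    # Keep only the domain; the local part (the personal piece) is discarded.
    _local_part, separator, domain = address.rpartition("@")
    return domain.lower() if separator else ""


def is_authorized(from_header: str, trusted=TRUSTED_DOMAINS) -> bool:
    """Yes-or-no decision based on the sender's domain alone (no score)."""
    return sender_domain(from_header) in trusted


if __name__ == "__main__":
    for header in ("Alice <alice@example.com>", "Alice <alice@examp1e.com>"):
        print(header, "->", "deliver" if is_authorized(header) else "block")
```

Note that the lookalike domain `examp1e.com` is blocked simply because it is not on the list. Nothing in the decision depends on the subject line, the message body, or even the local part of the address.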
The Valimail Defend approach stands in stark contrast to content-scanning analysis that looks for anomalous, potentially malicious email. Content scanning means always chasing the bad guys, trying to keep up with their ever-evolving patterns of fraud by applying machine learning systems that rely on PII and other sensitive data. Defend instead takes a more deterministic approach based on who is trusted and allowed in, and who isn’t.
Of course, this is not to say that identifying trusted senders is an easy challenge. But by adopting a product design principle under which no PII can be used, and by framing the problem as one of establishing identity rather than detecting anomalies, we are able to focus our efforts on solving a more narrowly defined and, importantly, finite problem.
In the next post in this series, we’ll look at how our second pillar of product design, Build solutions that work 100%, served as both influence and objective in the building of Valimail Defend.