How we used science to improve on agile and data-driven development
As technology progresses and new technologies proliferate, the invention and use of buzzwords spikes along with them. Most of us working in software development are familiar with the term “data-driven,” but what does that actually mean? Like “agile” and “scrum,” it’s a methodology that looks different at almost every organization.
Here at Valimail, we have applied the scientific method to development by combining data-driven and agile methodologies in a way that has proven effective in solving real-world problems.
We first started this process when we hit a wall with our data storage. As our customer base grew, our data scaled accordingly, and we found that the technology we were using was not keeping pace. We picked a new technology and made plans to migrate our data and queries. Like most development organizations, we plodded along, closing ticket after ticket, until we hit the testing wall. There were numerous optimizations to choose from, and our developers were spending cycles documenting the various options. We realized that none of the documentation mattered if we didn’t actually run any tests, and none of the test results mattered if we didn’t have initial data to benchmark against (a control, if you will).
At this point, we knew we had to do something different. So we looked outside the tech world and drew upon our combined background in scientific research. There, we found a methodology better suited to our testing needs: the scientific method.
We first gathered the query and page load times running on our old database, then ran the same tests against our new data warehouse. With that, the science could begin.
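As a rough illustration of that control-versus-candidate setup, here is a minimal Python sketch. It is not the tooling we used: the query names, the SQL text, and the two `fake_*` functions are hypothetical stand-ins for real database calls. What it shows is the shape of the experiment, where the same set of queries is timed the same way against both systems.

```python
import statistics
import time
from typing import Callable, Dict


def benchmark(queries: Dict[str, str],
              run_query: Callable[[str], object],
              runs: int = 5) -> Dict[str, float]:
    """Run each named query `runs` times; return median seconds per query."""
    results = {}
    for name, sql in queries.items():
        samples = []
        for _ in range(runs):
            start = time.perf_counter()
            run_query(sql)
            samples.append(time.perf_counter() - start)
        results[name] = statistics.median(samples)
    return results


# Hypothetical query set. The same set is used for both systems.
QUERIES = {
    "daily_totals": "SELECT day, COUNT(*) FROM events GROUP BY day",
    "recent_rows": "SELECT * FROM events ORDER BY day DESC LIMIT 100",
}


def fake_old_database(sql: str) -> None:
    """Placeholder for a call into the existing database (assumption)."""
    time.sleep(0.002)


def fake_new_warehouse(sql: str) -> None:
    """Placeholder for a call into the new data warehouse (assumption)."""
    time.sleep(0.001)


control = benchmark(QUERIES, fake_old_database)     # the baseline
candidate = benchmark(QUERIES, fake_new_warehouse)  # same tests, new system

for name in QUERIES:
    print(f"{name}: control={control[name]:.4f}s "
          f"candidate={candidate[name]:.4f}s")
```

Taking the median of several runs is one simple way to keep a single slow run from skewing the comparison. Other summary statistics would work as well.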
One of the core tenets of the scientific method is that the results should be repeatable by anyone following the same methodology. By first developing hypotheses and then methods for testing these hypotheses, we arrive at a repeatable process with reproducible results. By sharing these within the community (in this case our development organization), we drive technology forward collaboratively.
Defining the Hypothesis
Being “data-driven” doesn’t really mean anything unless you know first what you’re trying to solve and why you’re trying to solve it. Otherwise, you have no idea what data you even care about.
For this particular project, we knew our goals were to optimize data ingestion as well as page load and query times. We looked at the available optimization strategies and formed a few hypotheses to answer our most pressing questions: First, which strategy would be most likely to speed up data ingestion? Next, what would speed up page load and query times? Armed with hypotheses involving a variety of configurations, we got to testing, recording our results in a shared spreadsheet.
You can be scientifically minded and follow the scientific method and still not get anything done if your hypotheses and test grounds aren’t based on real-world acceptance criteria. Or you can do everything agile and sprint in totally the wrong direction. But by putting these two methodologies together, we found a way to make decisions very quickly. By starting with at least one hypothesis to prove or disprove and identifying methods to test that hypothesis, we found unequivocal answers to a variety of questions. We took out the guesswork and deployed our new and improved data warehouse without a single incident, with significantly faster ingestion, query and page load times.
How to Use the Scientific Method in Development
Do you use data to drive decisions? If so, when do you gather your data and how? It’s all the rage to gather all the data all the time, and this is absolutely necessary for any application shop for debugging and incident forensics.
However, for product development, consider your metrics first. What exactly are you trying to do with this release? Which KPIs (key performance indicators) do you care most about? It is much easier to build in the instrumentation and gather the data you want to drive your decisions as you build out your new features. So consider front-loading your development, as we did, with a series of hypotheses, methods for testing and measuring them, and an idea of what success looks like.
When starting a new project, consider adding in science like so:
- Come up with a series of hypotheses about your product that you want or need to test before releasing.
- Build out a dashboard or spreadsheet with the metrics and KPIs you want to gather to help answer your hypotheses, and validate that those metrics will actually drive development.
- Establish a baseline for your metrics by testing before introducing any changes.
- Begin testing changes, perhaps independently, to determine which change or set of changes is most useful to your project.
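The steps above can be sketched in a few lines of Python. This is only one possible way to structure it, not a description of our actual tooling: the `Hypothesis` fields, the change names, the thresholds, and every number below are made up for illustration. The idea is that each hypothesis names one change, the metric that answers it, and what success looks like, and each change is then judged against the same baseline.

```python
from dataclasses import dataclass
from typing import Dict, List


@dataclass
class Hypothesis:
    statement: str      # what we expect to happen
    change: str         # the single change being tested
    metric: str         # the metric that answers the question
    min_speedup: float  # success threshold, as baseline / measured


def evaluate(hypotheses: List[Hypothesis],
             baseline: Dict[str, float],
             measured: Dict[str, Dict[str, float]]) -> List[dict]:
    """Compare each change to the baseline and record whether it met its bar."""
    rows = []
    for h in hypotheses:
        speedup = baseline[h.metric] / measured[h.change][h.metric]
        rows.append({
            "statement": h.statement,
            "change": h.change,
            "metric": h.metric,
            "speedup": speedup,
            "supported": speedup >= h.min_speedup,
        })
    return rows


# Illustrative values only; real ones would come from your own measurements.
baseline = {"ingest_seconds": 120.0, "page_load_seconds": 4.0}
measured = {
    "batch_inserts": {"ingest_seconds": 60.0},
    "extra_index": {"page_load_seconds": 5.0},
}
hypotheses = [
    Hypothesis("Batching inserts speeds up ingestion",
               "batch_inserts", "ingest_seconds", 1.5),
    Hypothesis("An extra index speeds up page loads",
               "extra_index", "page_load_seconds", 1.2),
]

for row in evaluate(hypotheses, baseline, measured):
    verdict = "supported" if row["supported"] else "not supported"
    print(f"{row['change']}: {row['speedup']:.2f}x on {row['metric']} ({verdict})")
```

The rows returned by `evaluate` map directly onto a spreadsheet or dashboard, with one row per hypothesis, so a disproved hypothesis is recorded just as clearly as a supported one.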
And now you have a truly data-driven product. Hypotheses, methods and measures help remove the feeling of flying blind that many of us in DevOps and product struggle with day to day.
By marrying science and agile in this manner, your team can sprint more quickly in the right direction.