Original article was published by Foreseer on Artificial Intelligence on Medium
Mitigating Risks and Minimizing Costs while Building Data Products
If you are a data provider looking to expand your product line into brand-new sectors, what do you do? If you have access to big budgets, you could buy one or more companies that sell products in the sectors you want to expand into. If you don’t have the funds to go company-shopping, you will probably consider building the product on your own. Building a data product from the ground up takes deep subject matter expertise, a determined team to handle the various aspects of the build, and an incredible amount of time. How do you minimize the time to build a brand-new data product, mitigate risks and go to market faster? We’ll explore a few approaches.
Building datasets can get labor-intensive. If you’re able to find data suppliers selling quality data that can serve as the raw material for your product, you’re in a better situation. If you have to build the dataset entirely yourself, you might need a large team of people to source, aggregate, standardize and validate data. Say you’re building a dataset covering the top commercial banks in Europe. You’ve shortlisted 500 banks to cover and 300 metrics for each bank that capture the bank’s financial standing. You’re interested in the last 5 years of historical data. The biggest source of information on banks’ financial performance is their annual reports. Now, you’re looking at a total of 500 banks x 5 annual reports (one for each year) = 2,500 reports to process and collect insights from. If collecting the 300 metrics from a single report takes 8 hours of a data analyst’s time, that is 20,000 hours of work. At roughly 2,000 working hours per analyst per year, you will need a team of 10 analysts working through a year to collect data for your product.
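The staffing estimate above can be checked with a few lines of arithmetic. The sketch below is only an illustration: it assumes the 8 hours apply to each annual report and that an analyst works roughly 2,000 hours a year. Both assumptions are needed to arrive at a ten-analyst team, and the function and constant names are made up for this example.

```python
import math

BANKS = 500
REPORTS_PER_BANK = 5            # one annual report per year of history
HOURS_PER_REPORT = 8            # assumption: 8 analyst hours per report
HOURS_PER_ANALYST_YEAR = 2000   # assumption: about 50 weeks x 40 hours


def total_hours(banks, reports_per_bank, hours_per_report):
    """Total analyst hours needed to process every report once."""
    return banks * reports_per_bank * hours_per_report


def analysts_needed(hours, months, hours_per_analyst_year=HOURS_PER_ANALYST_YEAR):
    """Smallest team that can deliver `hours` of work within `months`."""
    capacity_per_analyst = hours_per_analyst_year * months / 12
    return math.ceil(hours / capacity_per_analyst)


hours = total_hours(BANKS, REPORTS_PER_BANK, HOURS_PER_REPORT)
print(hours)                       # 20000
print(analysts_needed(hours, 12))  # 10
print(analysts_needed(hours, 6))   # 20
```

Changing any of the inputs (more metrics per report, a shorter history, slower analysts) shows quickly how sensitive the team size is to the scope of the dataset.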
Now, the most straightforward way to shorten the time to market is to hire more data analysts. With a team of 10, it takes a year. With a team of 20, it should take 6 months. Although this sounds simple in theory, it may not always be feasible. Hiring more people to build your product creates a handful of problems.
- It increases the cost of the product. Higher costs mean it takes longer to break even. With your ability to start making profits pushed further out in time, the advantage that you expected to gain by going to market faster in the first place might no longer be one.
- The other key problem that comes with adding more people to the build of the product is the cost and utility of that workforce when the build is complete. What do you do with a team of 20 data analysts when the product is built? Do you need a team of 20 data analysts to maintain the product? Most probably not. Unless your organization has multiple new product development initiatives under way, you’re probably going to find it hard to justify the cost of a large team.
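The two problems above can be put into numbers with a rough break-even model. This is a sketch under stated assumptions, not a costing method: every figure in it (monthly cost per analyst, hiring cost, monthly revenue) is hypothetical, and it assumes the whole build team stays on the payroll after launch.

```python
def breakeven_month(team_size, analyst_months_of_work, monthly_cost_per_analyst,
                    hiring_cost_per_analyst, monthly_revenue):
    """Months from project start until cumulative revenue covers cumulative cost.

    Assumes (hypothetically) that the whole team is retained after launch.
    Returns None if revenue never exceeds the running cost of the team.
    """
    build_months = analyst_months_of_work / team_size
    build_cost = (analyst_months_of_work * monthly_cost_per_analyst
                  + team_size * hiring_cost_per_analyst)
    monthly_margin = monthly_revenue - team_size * monthly_cost_per_analyst
    if monthly_margin <= 0:
        return None
    return build_months + build_cost / monthly_margin


# Hypothetical inputs: 120 analyst-months of work (10 analysts for a year),
# 5,000 per analyst per month, 10,000 to hire each analyst,
# 150,000 of revenue per month once the product is live.
for team in (10, 20):
    print(team, breakeven_month(team, 120, 5_000, 10_000, 150_000))
# 10 19.0
# 20 22.0
```

With these made-up inputs the team of 20 launches six months earlier but breaks even three months later, which is the trade-off the first bullet describes. Different inputs can reverse the result, so the model is only useful with your own numbers.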
The second approach to building data products faster is to automate processes. Automating starts with process discovery. What steps do your data analysts go through to complete a unit of work, say, collecting data for a bank from its annual report? Once you’ve discovered the processes that dictate your workflow, you make sure that they are as lean as they can be. Which parts of the process add little value? Which parts of the process result in output that your customers don’t pay for? Which parts of the process slow you down? Is the value they add worth the effort? Engineering your processes so that they are lean and optimized is a fundamental step before you start automating.
Once your processes are fine-tuned, you begin the journey of automation. You start with the low-hanging opportunities and automate bottom-up. How do I automate the collection of annual reports from the banks’ websites? How do I determine if an annual report has information worth the time of a data analyst? What parts of the reports present data in a structured fashion that I can extract using a simple rules-based system? Along the way, you hire experts in scripting, RPA, web scraping and similar fields. Once the low-hanging opportunities are exhausted, you look at the more daunting automation problems. How do I automate the collection of unstructured data? How do I make a system extract data from annual reports that present data with no definite structure or consistency?
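To make the idea of a simple rules-based system concrete, here is a minimal sketch using regular expressions from Python’s standard library. It assumes the report text has already been extracted and that a metric appears as a label followed by a number on the same line. The rule set, metric names and sample text are all hypothetical.

```python
import re

# Hypothetical rules: metric name -> regex for the label as printed in a report.
RULES = {
    "total_assets": r"Total assets",
    "net_income": r"Net (?:income|profit)",
}

# A number with optional thousands separators, decimals, and
# accounting-style parentheses for negative values.
NUMBER = r"\(?-?[\d,]+(?:\.\d+)?\)?"


def parse_number(token):
    """Convert '1,250,300' or '(4,120.5)' to a float; parentheses mean negative."""
    negative = token.startswith("(") and token.endswith(")")
    value = float(token.strip("()").replace(",", ""))
    return -value if negative else value


def extract_metrics(text, rules=RULES):
    """Return the metrics whose label and value were found on one line of text."""
    found = {}
    for metric, label in rules.items():
        match = re.search(rf"^\s*{label}\s+({NUMBER})", text,
                          flags=re.IGNORECASE | re.MULTILINE)
        if match:
            found[metric] = parse_number(match.group(1))
    return found


sample = """
Total assets        1,250,300
Net profit          (4,120.5)
"""
print(extract_metrics(sample))
# {'total_assets': 1250300.0, 'net_income': -4120.5}
```

Metrics that no rule matches are simply missing from the result, so they can be routed to a data analyst. Rules like these only cover the structured parts of a report; the unstructured parts are the harder problem discussed next.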
It is now that you start having the build-or-buy discussion for a smart, AI-based technology solution with your stakeholders. As you engage in discussions, you realize that both the “build” and “buy” approaches raise at least two critical considerations.
- The cost. Building or buying a smart, AI-driven technology solution to solve your core unstructured data problems is not going to come cheap. If you hire a team of developers, data scientists and architects, besides the high costs, you’re usually looking at an incredibly long time for the team to deliver a solution that starts adding value to your business. If you engage a third-party solution provider, you might expedite the process, but you’re looking at higher maintenance costs, the increased risks that come with greater dependence on a third party, and a lack of accountability.
- The risk of failure. The other key consideration is the real possibility of your in-house team or the third-party vendor failing to deliver the solution that your business needs. Did you know that fewer than 20% of unstructured data proof-of-concept engagements succeed? The risk of failure is profound because of the substantial investment that typically goes into these engagements before the first signs of failure emerge.
Now, if the risks of hiring large teams or investing in technology upfront seem prohibitive for your business at the nascent stage of your product lifecycle, you could consider partnering with technology solution providers that run launch partner programs. These programs let you use the provider’s technology capabilities at practically no cost to build your data products. You’re required to pay the technology partner only when the data product you build starts bringing in revenue. After evaluating your product, the technology partner takes on the risk of its success or failure upfront and is entitled to an agreed share of your product revenue. Launch partner programs with technology companies are a great tool to mitigate risks, keep the costs of building products low, leverage proven smart capabilities and go to market faster.