So You Want to Do a Data Science Project: Now What? (Part I)

In the first article of this series, Applying Data Science to Make Smarter Business Decisions, we examined how businesses can leverage data science and the foundational themes you need to build into a data science project. So now that you have decided to do a data science project, what’s next?

First, there are some tough questions to ask about the availability of internal skills, the level of urgency, and your company strategy. Executing a data science project also requires three major knowledge-area building blocks:

  1. Knowledge of the data: not only database administration skills, but also an understanding of what is being measured and whether the source data “makes sense” in terms of being helpful to business users.
  2. Knowledge of how to refine the data into information: a working knowledge of the “salad bar” of analytic tools and techniques that pertain to the type of data you have collected, as well as the programming language(s) required to stage the data for analysis. The information that’s generated must also support making smarter decisions.
  3. Knowledge of the data presentation tools: those best suited to deliver a clear representation of the information and to drive the required smarter decisions.

If your organization is lucky enough to have resources with all this knowledge already up and running (perhaps on different pieces of different projects), you will likely be able to run the project internally. If time is short, you might need some staff augmentation. And if you do not have these resources readily available, assess what is missing and come up with a plan to fill the gaps.

Should your business strategy require more and more data science projects on a regular basis, it might make sense to establish and grow these skills internally. However, if data science is not a strategic technology, but instead a means to increase operational excellence, working with an external partner who has already demonstrated the skills and can provide the necessary resources will yield better value over time than ramping up an internal team.

Questions Hold the Key to Discovering All You Need to Know

After deciding which resources will tackle your data science project, there are several key questions to ask to uncover all you need to know about where to start:

  • Who owns each of the data sets you want to tap into?
  • Will those owners readily share their data?
  • Do you have sufficient storage and computational infrastructure to carry out the analytics?
  • Who is the target audience that will receive the information?
  • How will you deliver the information, inferences, and propensities to the target audience(s)?
  • How will the information integrate into their current decision-making process?
  • What is the cadence for feeding information into the process?
  • How will business leaders know if the new decision support system is working?

If you aren’t quite sure about some of these answers, we highly recommend collaborating with a data science partner who offers a complete platform of services to get you up and running. The partner should be able to offer resources who can stay with you throughout the lifecycle of the project, including the roll-out of the solution to the decision-making team.

Data Science in Action: 2 Use-Cases

The benefits of going through these questions are demonstrated by a project I collaborated on with a mid-sized commercial real estate company in Los Angeles. They manually packaged bundles of properties for large clients and had trouble getting the bundles set up to sell quickly. As a result, properties sequestered in the bundles sat in inventory for too long, which caused issues with the agents on the selling side.

We helped this customer by designing a rich, slice-and-dice tool that allows sales agents to look at key features in properties bought by specific clients in the past. The agents can look at many different aspects, such as multi-tenant, mixed-use, large truck access, parking, permitting, and proximity to other features such as freeways and the availability of cold storage.

Through this tool, the customer models different packages and tests the propensity to purchase without having to bombard investors with options. This approach reduces the number of iterations, improves profitability, and shortens the time properties spend in inventory. A side benefit is that customer satisfaction went way up on the investor side; investors perceived the firm as being much “smarter.”

Previously, agents had the information they needed, but settling on a bundle took too many back-and-forth iterations with clients. Our data science solution solved this challenge by modeling past buying behavior and allowing the agents to test the bundles against the models to come up with the one that would score highest. Our approach combined presentation tools, a rich data slicer/dicer, and knowing what questions to ask. This made it easier and faster to rank one potential property bundle against another.
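To make the idea of testing bundles against a model of past buying behavior a little more concrete, here is a minimal Python sketch. It is not the tool built for this client. The feature names, the simple frequency-based preference weights, and the functions `feature_preferences`, `score_bundle`, and `rank_bundles` are all hypothetical, chosen only to illustrate how one candidate bundle might be ranked against another.

```python
from collections import Counter


def feature_preferences(past_purchases):
    """Share of a client's past purchases that had each feature (assumed model)."""
    counts = Counter(f for prop in past_purchases for f in set(prop))
    n = len(past_purchases)
    return {feature: count / n for feature, count in counts.items()}


def score_bundle(bundle, prefs):
    """Average, over the properties in a bundle, of their summed preference weights."""
    if not bundle:
        return 0.0
    total = sum(sum(prefs.get(f, 0.0) for f in set(prop)) for prop in bundle)
    return total / len(bundle)


def rank_bundles(bundles, prefs):
    """Bundle names ordered from highest to lowest score."""
    return sorted(bundles, key=lambda name: score_bundle(bundles[name], prefs), reverse=True)


# Hypothetical data: each property is described by a set of features.
past_purchases = [
    {"parking", "cold_storage"},
    {"parking", "truck_access"},
    {"parking"},
]
bundles = {
    "A": [{"parking", "cold_storage"}, {"parking"}],
    "B": [{"mixed_use"}, {"truck_access"}],
}

prefs = feature_preferences(past_purchases)
print(rank_bundles(bundles, prefs))
```

With this sample data, bundle "A" ranks ahead of bundle "B" because its properties share more features with what the client bought in the past. A real model would likely weigh many more attributes and be validated against actual sales outcomes.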

In the second use case, a Tiempo client in the semiconductor industry had developed a very expensive infrastructure for testing chip designs. As this test infrastructure was about to become overloaded, we helped them use machine learning to test “smarter” rather than conducting 100% testing each time.

By looking only at where designs change and where things usually break, they reduced testing to 80% of the previous full-test workload. As a result, the company can defer a large capital equipment investment in replicating the test machine for another year or so.

In this use case, our customer had all the data in-house and knew what questions to ask. But they lacked the knowledge of how to convert that data into information: which tests were the most valuable to conduct following a particular design change to the chip. Once the information was available, it was easily plugged into the test automation system as the “smarter” test regime, which created more bandwidth for other products to be tested on the same platform.
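As a rough illustration of what a “smarter” test regime could look like, the Python sketch below picks the tests that either touch a changed part of the design or have a history of failing. This is a simple rule-based stand-in, not the machine learning model used in the project. The `ChipTest` record, the block names, the failure rates, and the 5% threshold are all assumptions made for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ChipTest:
    name: str
    blocks: frozenset   # design blocks the test exercises (hypothetical)
    fail_rate: float    # historical share of runs that failed (hypothetical)


def select_tests(tests, changed_blocks, fail_threshold=0.05):
    """Keep tests that cover a changed block or that fail often historically."""
    changed = set(changed_blocks)
    return [
        t for t in tests
        if (t.blocks & changed) or t.fail_rate >= fail_threshold
    ]


# Hypothetical test catalog for illustration only.
tests = [
    ChipTest("alu_timing", frozenset({"alu"}), 0.01),
    ChipTest("cache_coherency", frozenset({"l2"}), 0.10),
    ChipTest("io_loopback", frozenset({"io"}), 0.00),
    ChipTest("fpu_rounding", frozenset({"fpu", "alu"}), 0.02),
    ChipTest("pll_lock", frozenset({"clock"}), 0.01),
]

selected = select_tests(tests, changed_blocks={"alu"})
print([t.name for t in selected])
```

In this toy catalog, a change to the hypothetical "alu" block selects the two tests that exercise it plus the one test with a high historical failure rate, and skips the rest. A learned model would replace the fixed rule with predictions of which tests are most likely to catch a defect for a given change.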

Perhaps a Money-Making Proposition

During the execution of your data science project, your team may stumble upon other “rhythms in the chaos” of your data that offer similar benefits to the business in other areas. Sometimes the results can even create industry-specific or sector-specific tools that can drive royalty or licensing revenue through data science as a service (DSaaS).

So the key decision then comes down to this: if data science capability is a strategic technology for your business, build the core team internally and outsource what isn’t strategic. Otherwise, find a partner who can fill in any skill gaps you have, work with your existing teams, and deliver an end-to-end solution.

And whether you take on the project internally or externally, set up a high-level framework that helps your internal or external team understand what you’re trying to get done and begin the collaboration. Either way, stay open-minded, observant, and ready for a “smarter decision” business world. It’s going to be a good adventure!

In the next article of this series, So You Want to Do a Data Science Project: Now What? (Part II), we discuss how to create a high-level framework to understand the mission of your data science project and how a solid understanding of the overall process will foster stronger collaboration among your team members—whether they are internal or external.