Data mining and predictive analytics are proven technologies that have become an integral part of the daily operations of leading organizations, from the FORTUNE 500 to government agencies and academic institutions. With the benefits of these technologies no longer a matter of debate, the focus has shifted to devising organization-specific processes that ensure successful business outcomes from data mining investments.

While several methodologies for implementing a data mining project exist and have been successfully adopted, few companies apply the same rigor to planning these projects. We distinguish between the two as follows: planning is the process of deciding the business case for a data mining project, whereas implementation focuses on rolling out the project once a go/no-go decision has been made.

In this article, we discuss four high-level steps for planning data mining projects. The objective is to articulate a high-level framework that allows companies to make informed decisions about committing time and resources to data mining projects using a comprehensive cost-benefit analysis.

Define focus

Successful data mining initiatives often start with a narrow focus that addresses the most critical organizational issue, such as retaining customers longer, improving lifetime value, or making marketing investments more profitable. What business issues does each department face? Is there a department in which the company is not making the kind of progress it had hoped for? What insight is needed to improve decision making and get things back on track?

Typically, this step involves devising ways of collecting and organizing department-level business challenges and then narrowing them down to the one that is widely agreed to benefit the most from data mining investment.

Identify a clear, strategic business outcome

Once the focus of initial data mining engagements is finalized, the next step is to develop consensus on the strategic outcome for the specific business issues that need addressing. In this step, we move from identifying the most critical problem(s) to pinning down the end game in terms of business outcomes. Developing this strategic context is critical to maximizing the value of data mining and avoiding the “ad hoc trap” wherein enterprise resources are wasted on considerations around tactical issues such as algorithm selection, data planning, tool selection and so on.

Once again, companies may craft their own steps to identify these outcomes, but the common focus of all such approaches should be to ensure that the end goals are specific, actionable, valuable to the business, and realistically achievable within resource and time constraints.

Line up the right resources

At a conceptual level, data mining can be thought of as a set of four components: the right business outcome, the right people, the right data, and the right tools. Once you’ve identified the right business outcome with justifiable solution costs and executive commitment, securing the necessary resource commitments for the remaining components is usually much easier.

Line up the right staff: Data mining requires several roles, including business experts, data specialists, statisticians, and project management staff. First, the right resources for each role need to be identified and executive buy-in secured to ensure uninterrupted engagement of these resources regardless of commitments to other departmental projects. Second, all team members need to be brought on board with the specific project objectives so that they share a common execution goal and feel justified in investing their time.

Line up the right data: At a very simplistic level, data mining involves understanding the data and putting together random samples of data that are likely to be representative of the wider population. This requires coordination between business experts and data specialists who can plan the configuration and sourcing of these datasets as a prerequisite to modeling. A number of considerations come into play in this step, but the overall focus remains ensuring that the underlying datasets can produce models that take us in the direction of realizing the strategic business outcome.

Line up the right data mining tool: For data mining to be successful from a business perspective, models must be developed quickly and deployed cost-effectively for use within current operational systems and business processes. Successful data mining requires tools that broadly support three major characteristics: an open architecture to support data integration without native connectors, rapid model development capabilities, and flexible model deployment options. An open architecture helps the tool integrate quickly with existing data sources without much technical rework. Rapid model development capabilities typically involve support for a large number of statistical techniques and an easy-to-use GUI environment for data exploration, partitioning, and testing. Finally, flexible deployment options involve the ability to deploy results from predictive models into production data and business processes.
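
As an illustration of the data step above, the following minimal Python sketch draws a simple random sample and compares one key rate in the sample against the wider population. The customer records, the 20% churn rate, the sample size, and the function names are all hypothetical; a real project would check several attributes and might well prefer stratified sampling.

```python
import random


def draw_sample(records, sample_size, seed=42):
    """Draw a simple random sample without replacement (fixed seed for repeatability)."""
    rng = random.Random(seed)
    return rng.sample(records, sample_size)


def churn_rate(records):
    """Return the share of records flagged as churned."""
    return sum(r["churned"] for r in records) / len(records)


# Hypothetical population: 10,000 customers, every fifth one has churned.
population = [{"customer_id": i, "churned": i % 5 == 0} for i in range(10_000)]
sample = draw_sample(population, 1_000)

population_rate = churn_rate(population)
sample_rate = churn_rate(sample)
gap = abs(sample_rate - population_rate)

print(f"population churn rate: {population_rate:.3f}")
print(f"sample churn rate:     {sample_rate:.3f}")
print(f"absolute gap:          {gap:.3f}")
```

A small gap on the attributes that matter to the business outcome is one rough signal that the sample is representative enough to model on; a large gap suggests revisiting how the data was sourced.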

Create an executable data mining strategy

This last step involves developing a roadmap for bringing together the resources identified in Step 3 to realize the strategic business outcome outlined in Step 2. At a very high level, four key activities are involved:

Standardize on a data mining method: Adopting a consistent methodology for data mining lies at the heart of realizing benefits from analytics investments. Such a method would typically comprise a set of repeatable steps, each with a definite agenda, specific inputs and outputs, the templates and artifacts it requires, and the roles needed to execute it. A structured approach developed along these lines ensures purpose and repeatability and allows data mining to be implemented as discrete projects with defined budgets and timelines. CRISP-DM is the de facto method used by many organizations worldwide, although many have chosen to tailor it to their own business context.

Define clear data mining goals: Regardless of the method adopted, a clear definition of data mining goals is usually the first step in any data mining project. This involves translating the business objective into technical goals. For example, reducing churn might be a business goal, but from a data mining point of view the project objective would be to a) identify the behavioral factors that drive churn and b) identify the customers most likely to churn. Notice that developing strategies for handling such customers will most likely not be part of the data mining project, even though it is an integral part of the overall business objective for which data mining is used.

Define data mining success: The most common data mining success criterion is the predictive accuracy of the model, but models only need to be accurate to a certain level to achieve business objectives. Plan to make tough tradeoffs to determine when predictive accuracy is high enough to achieve your business goal. Considerations beyond model accuracy typically include how easy the model is to understand, the availability of the underlying data, the complexity of the data transformations required, model stability, and ease of deployment.

Create project plan: The project plan describes how the data mining goals will be achieved, including the specific steps and a proposed timeline. Base the plan on the process steps of the data mining methodology you have chosen, and make sure to confirm resource commitments as you plan collaboratively with project stakeholders.
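
To make the "accurate enough" tradeoff from the success-criteria activity concrete, here is a minimal Python sketch for a churn-retention campaign. It is an illustration under stated assumptions, not a prescribed method: it uses precision among contacted customers as a stand-in for predictive accuracy, and the contact cost, save rate, and customer value are hypothetical figures.

```python
def break_even_precision(contact_cost, save_rate, customer_value):
    """Lowest precision at which contacting flagged customers pays for itself.

    Assumption: each contacted true churner is retained with probability
    save_rate, and each retained customer is worth customer_value.
    """
    return contact_cost / (save_rate * customer_value)


def campaign_profit(n_contacted, precision, contact_cost, save_rate, customer_value):
    """Expected profit from contacting the customers the model flags."""
    saved_customers = n_contacted * precision * save_rate
    return saved_customers * customer_value - n_contacted * contact_cost


def is_accurate_enough(precision, contact_cost, save_rate, customer_value):
    """True when the model's precision is above the break-even point."""
    return precision > break_even_precision(contact_cost, save_rate, customer_value)


# Hypothetical figures, for illustration only.
CONTACT_COST = 10.0      # cost of one retention offer
SAVE_RATE = 0.25         # share of contacted churners who stay
CUSTOMER_VALUE = 200.0   # value of one retained customer

threshold = break_even_precision(CONTACT_COST, SAVE_RATE, CUSTOMER_VALUE)
profit = campaign_profit(1_000, 0.35, CONTACT_COST, SAVE_RATE, CUSTOMER_VALUE)

print(f"break-even precision: {threshold:.2f}")
print(f"expected profit at 35% precision: {profit:.0f}")
```

Under this framing, any model whose precision clears the break-even point already meets the business goal, so further accuracy gains can be weighed against interpretability, stability, and ease of deployment.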

Putting it all together

The series of activities outlined above culminates in a concrete, albeit high-level, project plan that provides ballpark estimates of the time, cost, and resources required to deliver on specific data mining goals. At this stage of the planning process, careful evaluation is needed to ascertain whether the expected outcome justifies the effort, given a set of risks, constraints, and assumptions. Once stakeholder agreement and buy-in are obtained, companies can move on to the actual implementation phase, which is typically an elaboration of the activities outlined in the project plan.
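
The go/no-go evaluation described above can be sketched as a simple risk-adjusted cost-benefit comparison. The following Python example is one possible framing, not a standard formula: the `ProjectEstimate` fields, the three-year horizon, the 1.5 return threshold, and all figures are hypothetical assumptions chosen for illustration.

```python
from dataclasses import dataclass


@dataclass
class ProjectEstimate:
    """Ballpark planning figures for one candidate project (all hypothetical)."""
    name: str
    expected_annual_benefit: float
    success_probability: float   # crude allowance for delivery risk
    one_time_cost: float
    annual_running_cost: float
    horizon_years: int = 3

    def risk_adjusted_benefit(self):
        return self.expected_annual_benefit * self.success_probability * self.horizon_years

    def total_cost(self):
        return self.one_time_cost + self.annual_running_cost * self.horizon_years

    def net_benefit(self):
        return self.risk_adjusted_benefit() - self.total_cost()


def go_no_go(estimate, min_return_ratio=1.5):
    """Return "go" when risk-adjusted benefit covers cost by the required ratio."""
    ratio = estimate.risk_adjusted_benefit() / estimate.total_cost()
    return "go" if ratio >= min_return_ratio else "no-go"


churn = ProjectEstimate("churn reduction", 400_000, 0.6, 150_000, 50_000)
cross_sell = ProjectEstimate("cross-sell pilot", 100_000, 0.5, 120_000, 20_000)

for project in (churn, cross_sell):
    print(f"{project.name}: net benefit {project.net_benefit():,.0f}, "
          f"decision {go_no_go(project)}")
```

The threshold and horizon are deliberately explicit parameters, since they encode the constraints and assumptions that stakeholders need to agree on before committing resources.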

About this sample

This article outlines the high-level steps for building a business case for data mining projects. It is NOT about how to build project implementation plans; rather, it is about using a structured approach to make go/no-go decisions about individual projects.

Target audience

Business Analytics Project Managers, other Senior Stakeholders

About the Client

UK-based Business Analytics Strategy Consultancy