Data science technologies optimize production, delivery, fraud detection, prediction, and more like no other technologies in modern times; together they form one of the most revolutionary sets of technologies of the 21st century. Data is the new oil, and every company and institution wants to understand it and use it to their advantage. In an extremely competitive business environment, where companies compete not just with local brands and products but internationally, the difference between successful companies has become a matter of data strategy: how well a company can react fast and intelligently by gaining knowledge from its data. Companies want sharp reflexes and fast decisions to gain a competitive advantage, so a large share of business decisions today focuses on optimizing data intelligence. Despite all the advantages and importance of data, companies face extreme challenges in adopting successful data science projects and practices. For example, according to one report by Gartner, almost 60% of data science projects fail. Last year in July, Deborah Leff, CTO for data science and AI at IBM, publicly stated that only 1 in 10 data science projects succeeds. So, despite all the wow factor and possibilities, data science projects fail far more often than they succeed.
Even though the failure rate of data science projects is extreme, organizations will be forced to adopt data science to compete in the market. Organizations that have implemented a data culture hold a strong competitive advantage: they have a more refined strategy, better decision-making capabilities, reduced costs, and better sales, and are overall much better suited to the modern world. It is also true that most failures to achieve goals do not stem from problems with data science techniques. In most projects, the reasons for failure are poor data quality and a lack of management willingness to use data to the organization's advantage. Because data science leads to basic and critical evaluation of past business decisions, much of management glosses over it and fails to take the hard step of self-evaluation and reform.
So, why do data science projects fail?
Understanding why data science projects fail is extremely important, since a project is more likely to fail than to succeed. As with all remedies, prevention is better than cure, and the same holds for data science. Understanding why projects fail helps an organization adopt a preventive strategy that supports successful implementation. Special care should be taken before committing to a data science initiative: the data scientist, the team, and the organization should discuss, at the start, the scope of the project and the amount of data sharing that will be required to understand the problem. Everyone should be on the same page, and organizational resistance should be reduced through preventive measures. The list of reasons data science projects fail is long and complicated, but broadly the failures fall into two categories:
- Organizational failure: the organization's structure, management, and policies are responsible for the failure of the data science effort.
- Data team failure: the team you have hired, leased, or are consulting is asking the wrong questions, over-committing, or simply not technically capable of implementing the project.
Some of the reasons why data science projects fail are as follows:
1. Bad Data:
The biggest reason data science projects fail is the difficulty of collecting good data. Many organizational data collection practices tend to produce biased, untrustworthy, or outdated data; a KPMG report states that 80 percent of CEOs are concerned about the quality of their data. An organization should first establish the minimum data quality its use cases require, assess the quality of its present data against those expectations, and then adopt best practices to close the gap. Data governance should be implemented, along with a data protection strategy and supporting infrastructure, and major improvements should be made in how large amounts of data are stored and processed. The importance of data quality can also be gauged from this Harvard report. Data should be unbiased, correct, and collected with careful thought about what is being measured. Almost 80 percent of a data scientist's time is spent cleansing data.
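The first step described above, assessing present data quality against expectations, can be sketched as a small audit. The following is a minimal illustration only, assuming a pandas DataFrame with hypothetical column names (`order_id`, `amount`, `updated_at`) and an illustrative staleness threshold; it is not a substitute for a real data governance program.

```python
# Minimal data-quality audit sketch. The schema (order_id, amount, updated_at)
# and the 365-day staleness threshold are illustrative assumptions.
import pandas as pd

def audit_quality(df: pd.DataFrame, as_of: pd.Timestamp,
                  max_age_days: int = 365) -> dict:
    """Count rows, missing cells, exact duplicate rows, and stale rows."""
    stale = 0
    if "updated_at" in df.columns:
        # Age of each record relative to a fixed reference date.
        age_days = (as_of - pd.to_datetime(df["updated_at"])).dt.days
        stale = int((age_days > max_age_days).sum())
    return {
        "rows": len(df),
        "missing_cells": int(df.isna().sum().sum()),
        "duplicate_rows": int(df.duplicated().sum()),
        "stale_rows": stale,
    }

# Illustrative data: one missing amount, one duplicated row, one outdated record.
orders = pd.DataFrame({
    "order_id": [1, 2, 2, 3],
    "amount": [10.0, 20.0, 20.0, None],
    "updated_at": ["2024-01-01", "2024-01-02", "2024-01-02", "2010-06-01"],
})
report = audit_quality(orders, as_of=pd.Timestamp("2024-06-01"))
# report == {"rows": 4, "missing_cells": 1, "duplicate_rows": 1, "stale_rows": 1}
```

Comparing such a report against the minimum quality the organization has agreed on turns "our data is bad" into a measurable gap that a governance effort can actually close.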
2. Unrealistic Expectations and over promising:
Data science is not an elixir of business life, nor a super remedy. Organizations tend to overestimate the scope or benefits of data-driven projects, which places unrealistic expectations on the process; the definition of a successful result should therefore be clear from the start. Overestimation can lead to the adoption of very complex but unnecessary processes that burden the implementation and delay the project. Managing expectations is very important for both the data scientist and management if the project is to be implemented well. It is also very important for the data scientist to communicate expectations to management and other stakeholders in order of priority, and to proactively clear up any misunderstanding. Data science has a history of over-promising and under-delivering, and unrealistic expectations are harmful: this lack of realistic professionalism has created a credibility problem for the field.
3. Lack of firm support by key stakeholders:
Stakeholder management is also very important for the success of a data science project, and data-driven leadership is extremely critical in this regard. Because data science is a collaborative field that needs support along multiple dimensions, a lack of support from even one key stakeholder can disrupt the whole process. Stakeholders need to be committed to the project's goals and to their individual responsibilities for making it a success, and they need to be convinced of the importance of data-driven decisions for the business. Understanding stakeholders' expectations and their hierarchy is very important, and special care should be given to those stakeholders who have difficulty understanding the concepts.
4. Problematic Data Science Team:
A successful data science project needs a range of skill sets, such as data modeling experts and data acquisition experts, and the team should include subject matter experts as well as people well versed in internal business operations. Relying too much on individual talent can hamper the project. The data team can be a collaboration of in-house and external members, and it should be flexible enough to implement the objectives. Many organizations create a data science team just to follow the industry trend; without a proper goal in mind, a data science team is a burden. When creating a team, it should be properly evaluated along all the important dimensions: leadership, technical ability, workforce, knowledge requirements, and so on.
5. Wrong question:
In the hierarchy of importance, understanding the problem your company is facing matters most for successful prioritization of objectives; a wrong question can result in a complete waste of time, effort, and resources. The objectives and requirements should be stated and understood properly, so that the chance of asking questions that do not benefit the project is minimized. Starting with the right question and an understanding of the right problems can increase the odds of success manifold, especially for projects that need fast implementation. Data science projects are not an endless chase after idealistic goals; they are about prioritizing needs and deliverability. Asking the right, highest-priority question is as important as the project itself, because the way a data scientist asks a question determines the direction taken to solve the problem.