5 skills of a data scientist Jun26



A data scientist is capable on both the technical and the business side. To that end, he needs to be proficient in the following three technical and two business skills.


On the technical side, the data scientist is a data hacker, able to extract data from multiple sources, model and cluster it, and finally display the derived information:


Data Munging is the process of converting and mapping raw data into a usable format. It covers retrieving, cleansing, parsing and validating data before further analysis.

Real-world data is inconsistent, incomplete and non-uniform, so it has to be extracted and transformed before it is usable. Most data scientists manipulate the data with a high-level scripting language such as Python, and leverage the scalability of frameworks such as Hadoop to cope with the volume of data. The data is then fed into analytical systems.
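As a rough illustration of the retrieve, cleanse, parse and validate steps, here is a minimal Python sketch using only the standard library. The sample records, the column names and the `munge` and `parse_date` helpers are invented for this example; a real pipeline would read from actual sources and apply its own validation rules.

```python
import csv
import io
from datetime import datetime

# Hypothetical raw export: inconsistent casing, stray whitespace,
# two different date formats, a missing date and an invalid amount.
RAW = """customer,signup_date,amount
 Alice ,2014-06-01,120.50
BOB,2014/06/03,80
carol,,15.25
dave,2014-06-05,n/a
"""

# Date formats we assume may appear in the raw data.
DATE_FORMATS = ("%Y-%m-%d", "%Y/%m/%d")


def parse_date(text):
    """Try each known date format; return None when none matches."""
    for fmt in DATE_FORMATS:
        try:
            return datetime.strptime(text.strip(), fmt).date()
        except ValueError:
            continue
    return None


def munge(raw_text):
    """Parse, cleanse and validate rows; return (clean_rows, rejected_rows)."""
    clean, rejected = [], []
    for row in csv.DictReader(io.StringIO(raw_text)):
        name = row["customer"].strip().title()   # cleanse: trim and normalize case
        date = parse_date(row["signup_date"])    # parse: accept several formats
        try:
            amount = float(row["amount"])        # parse: text to number
        except ValueError:
            amount = None
        # validate: keep only rows where every field could be interpreted
        if name and date is not None and amount is not None:
            clean.append({"customer": name, "signup_date": date, "amount": amount})
        else:
            rejected.append(row)
    return clean, rejected


clean_rows, rejected_rows = munge(RAW)
print(len(clean_rows), "clean rows,", len(rejected_rows), "rejected rows")
```

Keeping the rejected rows instead of silently dropping them makes it possible to inspect why the data failed validation before it reaches any analytical system.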

Statistical Learning is the framework for machine learning. It draws on competencies in both statistics and computer science. Once the data has been munged, data scientists use statistical learning to analyze it, for example to derive predictions or classifications.

Machine learning algorithms demand hard and continuous study, which makes mastering them a highly valuable skill. Data scientists implement their models with statistics software such as R or MATLAB.
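To give a feel for what "deriving a prediction" means, here is a minimal sketch of one of the simplest statistical learning models, a straight line fitted by ordinary least squares. It is written in plain Python rather than R or MATLAB, and the `ad_spend` and `sales` figures are made up for illustration only.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both left unnormalized,
    # since the normalization cancels out in the ratio).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical munged data: advertising spend versus sales.
ad_spend = [1.0, 2.0, 3.0, 4.0, 5.0]
sales = [2.1, 3.9, 6.2, 7.8, 10.1]

slope, intercept = fit_line(ad_spend, sales)

# Use the fitted model to predict sales for an unseen spend level.
predicted = slope * 6.0 + intercept
print(f"slope={slope:.2f} intercept={intercept:.2f} prediction={predicted:.2f}")
```

In practice a statistics package would also report how uncertain the fit is; this sketch only shows the fit-then-predict pattern.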

Visualization is the proficiency in building eye-catching graphics to display the information. Data scientists use interactive charts so that the recipient can play with the newly derived, value-added information.

Complete solutions such as BIRT are available on the market and let you focus essentially on the content itself. If you want to present truly ad hoc solutions, however, you might consider libraries such as D3.js or Raphael.js to display interactive, highly customized results.
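Libraries such as D3.js and Raphael.js work by turning data values into vector shapes in the browser. The sketch below is not D3.js; it is a small standard-library Python stand-in for the same idea, mapping each value to an SVG rectangle. The `bar_chart_svg` function, its parameters and the regional figures are all assumptions made for this example.

```python
from xml.sax.saxutils import escape


def bar_chart_svg(data, bar_height=20, gap=6, max_width=300, label_width=80):
    """Render (label, value) pairs as a horizontal SVG bar chart string."""
    largest = max(value for _, value in data)
    parts = []
    for i, (label, value) in enumerate(data):
        y = i * (bar_height + gap)
        # Scale each bar relative to the largest value.
        width = max_width * value / largest
        parts.append(
            f'<text x="0" y="{y + bar_height - 5}">{escape(label)}</text>'
        )
        # The nested <title> element gives a hover tooltip in most browsers.
        parts.append(
            f'<rect x="{label_width}" y="{y}" width="{width:.1f}" '
            f'height="{bar_height}"><title>{escape(label)}: {value}</title></rect>'
        )
    height = len(data) * (bar_height + gap)
    return (
        f'<svg xmlns="http://www.w3.org/2000/svg" '
        f'width="{label_width + max_width}" height="{height}">'
        + "".join(parts)
        + "</svg>"
    )


# Hypothetical figures per sales region.
svg = bar_chart_svg([("North", 120), ("South", 80), ("East & West", 60)])

# Write the chart to disk so it can be opened in a browser.
with open("chart.svg", "w", encoding="utf-8") as handle:
    handle.write(svg)
```

A dedicated library adds what this sketch lacks: axes, transitions and richer interaction.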


On the business side, the data scientist is a man from the field; he knows the processes and performance drivers of an organization. He is also comfortable speaking with anyone in the hierarchy.


Communication is an essential aspect of the data scientist’s work. Being able to develop different views for different audiences, get their attention and create storytelling visualizations is a necessity. A data scientist does not just report the information; he makes it valuable for the organization and drives its use.

Business Acumen is the business know-how behind the hacking; it is what keeps the data scientist heading in the right direction. If you do not understand the essence of the business, it is very easy to derive correlations between factors that are not directly related and to misread them as causal.


In a few words, if you are looking for a data scientist, you are looking for a business hacker: a person who understands the business, finds bottlenecks and looks for solutions.