What kind of skills are required to become a data scientist?

In the industry, the reality is that data science is so new that companies do not yet have a well-defined career path for it. How do you get hired for a data scientist position? How many years of experience are required? What skills do you need to bring to the table? Math, statistics, machine learning, information technology, computer science, and what else?

Well, the answer is probably a little bit of everything plus one more critical skill: domain-specific expertise.

There is an ongoing debate about whether applying generic data science techniques to any dataset, without an intimate understanding of its meaning, leads to the desired business outcome. Many companies are leaning toward making sure data scientists have a substantial amount of domain expertise, the rationale being that without it you may unknowingly introduce bias at any step, such as when filling gaps in the data cleansing phase or during the feature selection process, and ultimately build models that may well fit a given dataset but still end up being worthless. Imagine a data scientist with no chemistry background studying unwanted molecule interactions for a pharmaceutical company developing new drugs. This is also probably why we're seeing a proliferation of statistics courses specialized in a particular domain, such as biostatistics for biology, or supply chain analytics for analyzing operations management related to supply chains, and so on.

To summarize, a data scientist should, in theory, be somewhat proficient in the following areas:

  • Data engineering / information retrieval
  • Computer science
  • Math and statistics
  • Machine learning
  • Data visualization
  • Business intelligence
  • Domain-specific expertise

Note

If you are thinking about acquiring these skills but don't have the time to attend traditional classes, I strongly recommend using online courses.

I particularly recommend this course on Coursera (https://www.coursera.org/): https://www.coursera.org/learn/data-science-course.

The classic Drew Conway Venn Diagram provides an excellent visualization of what data science is and why data scientists are a bit of a unicorn:

Drew Conway's Data Science Venn Diagram

By now, I hope it has become clear that the perfect data scientist who fits the preceding description is more the exception than the norm and that, most often, the role involves multiple personas. Yes, that's right: the point I'm trying to make is that data science is a team sport, and this idea will be a recurring theme throughout this book.
