- Data Analysis with Python
- David Taieb
- 402 words
- 2021-06-11 13:31:42
What kind of skills are required to become a data scientist?
In the industry, the reality is that data science is so new that companies do not yet have a well-defined career path for it. How do you get hired for a data scientist position? How many years of experience are required? What skills do you need to bring to the table? Math, statistics, machine learning, information technology, computer science, and what else?
Well, the answer is probably a little bit of everything plus one more critical skill: domain-specific expertise.
There is a debate going on about whether applying generic data science techniques to any dataset, without an intimate understanding of its meaning, leads to the desired business outcome. Many companies are leaning toward making sure data scientists have a substantial amount of domain expertise, the rationale being that without it you may unknowingly introduce bias at any step, such as when filling gaps during the data cleansing phase or during the feature selection process, and ultimately build models that may fit a given dataset well but still end up being worthless. Imagine a data scientist with no chemistry background studying unwanted molecule interactions for a pharmaceutical company developing new drugs. This is also probably why we're seeing a proliferation of statistics courses specialized in a particular domain, such as biostatistics for biology, or supply chain analytics for analyzing operations management related to supply chains, and so on.
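To make the point about filling gaps more concrete, here is a minimal sketch in Python. Everything in it is hypothetical and chosen for illustration only: the measurements, the `DETECTION_LIMIT` value, and the assumption that a missing reading means "below what the instrument can detect". Substituting half the detection limit is just one convention a domain expert might choose, not a recommendation.

```python
from statistics import mean

# Hypothetical instrument detection limit (an assumption for this sketch).
DETECTION_LIMIT = 1.0

# Hypothetical concentration measurements; None marks a missing reading.
# Assumed domain knowledge: a reading is missing only when the true
# value fell below the detection limit.
measurements = [4.0, None, 6.0, None, 5.0]


def fill_with_observed_mean(values):
    """Generic approach: replace each gap with the mean of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]


def fill_with_half_detection_limit(values, limit):
    """Domain-informed approach: replace each gap with half the detection limit."""
    return [limit / 2 if v is None else v for v in values]


naive = fill_with_observed_mean(measurements)
informed = fill_with_half_detection_limit(measurements, DETECTION_LIMIT)

print("Mean after generic fill:        ", mean(naive))
print("Mean after domain-informed fill:", mean(informed))
```

Under these assumptions, the generic fill pulls the missing readings up to the level of the observed ones, so the estimated average ends up noticeably higher than with the domain-informed fill. Neither number is "the truth", but only someone who knows why the values are missing can tell which fill is defensible.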
To summarize, a data scientist should, in theory, be somewhat proficient in the following areas:
- Data engineering / information retrieval
- Computer science
- Math and statistics
- Machine learning
- Data visualization
- Business intelligence
- Domain-specific expertise
Note
If you are thinking about acquiring these skills but don't have the time to attend traditional classes, I strongly recommend using online courses.
I particularly recommend this course on Coursera (https://www.coursera.org/): https://www.coursera.org/learn/data-science-course.
Drew Conway's classic Venn diagram provides an excellent visualization of what data science is and why data scientists are a bit of a unicorn:

Drew Conway's Data Science Venn Diagram
By now, I hope it has become pretty clear that the perfect data scientist who fits the preceding description is more the exception than the norm and that, most often, the role involves multiple personas. Yes, that's right: the point I'm trying to make is that data science is a team sport, and this idea will be a recurring theme throughout this book.