- Building Analytics Teams
- John K. Thompson Douglas B. Laney
- 503 words
- 2021-06-18 18:30:47
The implications of proprietary versus open source tools
Over the past 20 years, there has been a significant change in the technology that enables teams and individuals to design, develop, and deploy analytical applications. That change has been the evolution and refinement of open source software and related platforms.
Twenty years ago, there were only proprietary software offerings from SAS, SPSS, Statistica, Minitab, and others. An entire generation, or possibly two generations, of psychologists, social scientists, mathematicians, and analysts grew up using these software systems in undergraduate- and graduate-level academic programs. When they entered the business, research, and governmental workforces, those people brought their favorite tools with them.
More recently, open source systems like Knime, RapidMiner, and others offer community versions for free. In addition to the many open source tools and community versions available, the rise and evolution of the R and Python languages have provided a rich toolset for people to build advanced analytics applications without purchasing expensive proprietary software.
Most people stop the discussion at the point that the community version of an open source tool carries no license fee. That misses the point. As Shawn Rogers, the coauthor of my first book, Analytics: How to Win with Intelligence, is fond of saying, "Open source is free like a puppy." Yes, it is great to get the puppy, and it is free at that moment, but many expenses come along with the free bundle of joy. In almost all cases, if you are going to use open source software for production purposes, you will need to buy support, the enterprise version, management software, an interactive development environment, or other software and/or services to make the environment effective, efficient, productive, secure, and collaborative.
The truly important element of this market characteristic, for the purpose of our discussion, is that the delineation between proprietary and open source software splits the people you will be looking to hire by age.
From my informal research and observation over the past 5 to 7 years, most data scientists who are over 40 years old will predominantly want to use proprietary software. The vast majority of data scientists under 30 years old will use open source coupled with R or Python.
This simple observation, combined with the fact that data scientists are typically allowed to use the tools they feel most comfortable with, means that to have a collaborative team, you want the majority of the team to use tools that foster the sharing of code, approaches, and methodologies.
In practical and simplistic terms, this means that you will end up with either an older team using proprietary tools or a younger team using open source tools supplemented with R and/or Python. Again, as with the evolution of the team, you can let this develop organically, but doing so will cost you and your team productivity and will cause team conflict and other management headaches.
Pick one approach or the other. Do not mix and match.