- Data Analysis with Python
- David Taieb
- 2021-06-11 13:31:42
Deep diving into a concrete example
Early on, we wanted to build a data pipeline that extracted insights from Twitter by running sentiment analysis on tweets containing specific hashtags and deployed the results to a real-time dashboard. This application was a perfect starting point for us because the data science analytics were not too complex, and the application covered many aspects of a real-life scenario:
- High-volume, high-throughput streaming data
- Data enrichment with sentiment analysis NLP
- Basic data aggregation
- Data visualization
- Deployment into a real-time dashboard
To try things out, the first implementation was a simple Python application that used two libraries: tweepy (a widely used open source Twitter client library for Python: https://pypi.python.org/pypi/tweepy) to connect to Twitter and receive a stream of tweets, and textblob (a simple Python library for basic NLP: https://pypi.python.org/pypi/textblob) to enrich each tweet with sentiment analysis.
The results were then saved into a JSON file for analysis. This prototype was a great way to get things started and experiment quickly, but after a few iterations we realized that we needed to get serious and build an architecture that satisfied our enterprise requirements.
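The enrichment and save steps of such a prototype might look roughly like the sketch below. It is not the original implementation: the function names (`textblob_polarity`, `enrich`, `save_as_json`), the `text` field, and the positive/negative/neutral labels are assumptions made for illustration. The tweet source is deliberately left as a plain iterable of dictionaries, because the tweepy streaming classes differ between library versions.

```python
import json
from typing import Callable, Iterable, Iterator


def textblob_polarity(text: str) -> float:
    """Return TextBlob's polarity score, a float between -1.0 and 1.0."""
    # Imported lazily so the rest of the sketch runs without textblob installed.
    from textblob import TextBlob
    return TextBlob(text).sentiment.polarity


def enrich(
    tweets: Iterable[dict],
    score: Callable[[str], float] = textblob_polarity,
) -> Iterator[dict]:
    """Yield a copy of each tweet with sentiment fields added.

    Assumes (hypothetically) that each tweet is a dict with a "text" key.
    """
    for tweet in tweets:
        polarity = score(tweet["text"])
        if polarity > 0:
            label = "positive"
        elif polarity < 0:
            label = "negative"
        else:
            label = "neutral"
        yield {**tweet, "polarity": polarity, "sentiment": label}


def save_as_json(records: Iterable[dict], path: str) -> None:
    """Write all enriched records to a single JSON file."""
    with open(path, "w", encoding="utf-8") as f:
        json.dump(list(records), f, ensure_ascii=False, indent=2)
```

With textblob installed, a call such as `save_as_json(enrich(tweets), "tweets.json")` would write the enriched tweets to disk, where `tweets` is whatever iterable the streaming client produces. Passing the scoring function as a parameter keeps the enrichment step testable without network access or an NLP model.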