- Practical Big Data Analytics
- Nataraj Dasgupta
- 2021-07-02 19:26:26
The fundamental premise of Hadoop
The fundamental premise of Hadoop is that instead of attempting to perform a task on a single large machine, the task can be subdivided into smaller segments that are then delegated to multiple smaller machines. Each of these smaller machines performs the task on its own portion of the data. Once the machines have completed their allocated tasks, the individual results are aggregated to produce the final result.
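The split, delegate, and aggregate pattern described above can be sketched in a few lines of Python. This is only an illustration of the idea, not Hadoop's API: the function names (`split_into_chunks`, `process_chunk`, `distributed_sum`) are hypothetical, and a local thread pool is assumed as a stand-in for the separate smaller machines.

```python
from concurrent.futures import ThreadPoolExecutor


def split_into_chunks(data, num_chunks):
    """Subdivide the data into roughly equal segments, one per worker."""
    chunk_size = max(1, -(-len(data) // num_chunks))  # ceiling division
    return [data[i:i + chunk_size] for i in range(0, len(data), chunk_size)]


def process_chunk(chunk):
    """Work done by one 'smaller machine': sum only its own portion."""
    return sum(chunk)


def distributed_sum(data, num_workers=4):
    """Split the task, delegate the segments, then aggregate the results."""
    chunks = split_into_chunks(data, num_workers)
    # Assumption: threads stand in for separate machines in this sketch.
    with ThreadPoolExecutor(max_workers=num_workers) as pool:
        partial_results = list(pool.map(process_chunk, chunks))
    # Aggregate the individual results into the final result.
    return sum(partial_results)


numbers = list(range(1, 101))
total = distributed_sum(numbers)
print(total)  # 5050
```

Each worker only ever sees its own segment, and the final step combines the partial sums. A real cluster must also move the data and the results across a network, which this local sketch leaves out.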
Although, in theory, this may appear relatively simple, there are various technical considerations to bear in mind. For example:
- Is the network fast enough to collect the results from each individual server?
- Can each individual server read data fast enough from the disk?
- If one or more of the servers fail, do we have to start all over?
- If there are multiple large tasks, how should they be prioritized?
Many more such considerations must be taken into account when working with a distributed architecture of this nature.
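To make the failure question concrete, the sketch below shows one possible answer: because the task is already split into independent segments, only the segment whose server failed needs to be rerun. This section does not describe how Hadoop itself handles failures, so treat this purely as an illustration. The names `run_with_retries` and `flaky_sum` are hypothetical, and the failure is simulated with a `RuntimeError`.

```python
def run_with_retries(chunks, task, max_attempts=3):
    """Run task on every chunk, retrying only the chunks that fail."""
    results = {}
    for index, chunk in enumerate(chunks):
        for attempt in range(1, max_attempts + 1):
            try:
                results[index] = task(chunk)
                break  # this segment succeeded; move on to the next one
            except RuntimeError:
                if attempt == max_attempts:
                    raise  # give up after the final attempt
    return [results[i] for i in range(len(chunks))]


failed_once = set()


def flaky_sum(chunk):
    """Hypothetical task that fails the first time it sees the chunk [3, 4]."""
    key = tuple(chunk)
    if key == (3, 4) and key not in failed_once:
        failed_once.add(key)
        raise RuntimeError("simulated server failure")
    return sum(chunk)


partials = run_with_retries([[1, 2], [3, 4], [5, 6]], flaky_sum)
total = sum(partials)
print(partials, total)  # [3, 7, 11] 21
```

The segments that had already finished are never recomputed. That is the practical benefit of splitting the work into independent pieces in the first place.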