Hands-On Big Data Modeling
Modeling and managing data is a central focus of all big data projects. In fact, a database is considered to be effective only if you have a logical and sophisticated data model. This book will help you develop practical skills in modeling your own big data projects and improve the performance of analytical queries for your specific business requirements. To start with, you’ll get a quick introduction to big data and understand the different data modeling and data management platforms for big data. Then you’ll work with structured and semi-structured data with the help of real-life examples. Once you’ve got to grips with the basics, you’ll use the SQL Developer Data Modeler to create your own data models containing different file types such as CSV, XML, and JSON. You’ll also learn to create graph data models and explore data modeling with streaming data using real-world datasets. By the end of this book, you’ll be able to design and develop efficient data models for varying data sizes easily and efficiently.
Brand: 中圖公司
Listed: 2021-06-10 18:21:22
Publisher: Packt Publishing
The digital rights to this book are provided by 中圖公司, which has authorized 上海閱文信息技術有限公司 to produce and distribute it.
- Cover page
- Title Page
- About Packt
- Why subscribe?
- Packt.com
- Contributors
- About the authors
- About the reviewers
- Packt is searching for authors like you
- Preface
- Who this book is for
- What this book covers
- To get the most out of this book
- Download the example code files
- Download the color images
- Conventions used
- Get in touch
- Reviews
- Introduction to Big Data and Data Management
- The concept of big data
- Interesting insights regarding big data
- Characteristics of big data
- Sources and types of big data
- Challenges of big data
- Introduction to big data modeling
- Uses of models
- Introduction to managing big data
- Importance and implications of big data modeling and management
- Benefits of big data management
- Challenges in big data management
- Setting up big data modeling platforms
- Getting started on Windows
- Getting started on macOS
- Summary
- Further reading
- Data Modeling and Management Platforms
- Big data management
- Data ingestion
- Data storage
- Data quality
- Data operations
- Data scalability and security
- Big data management services
- Data cleansing
- Data integration
- Big data management vendors
- Big data storage and data models
- Storage models
- Block-based storage
- File-based storage
- Object-based storage
- Data models
- Relational stores (SQLs)
- Scalable relational systems
- Database as a Service (DaaS)
- NoSQL stores
- Document stores
- Key-value stores
- Extensible-record stores
- Big data programming models
- MapReduce
- MapReduce functionality
- Hadoop
- Features of Hadoop frameworks
- Yet Another Resource Negotiator
- Functional programming
- Spark
- Reasons to choose Apache Spark
- Flink
- Advantages of Flink
- SQL data models
- Hive Query Language (HQL)
- Cassandra Query Language (CQL)
- Spark SQL
- Apache Drill
- Getting started with Python and R
- Python on macOS
- Python on Windows
- R on macOS
- R on Windows
- Summary
- Further reading
- Defining Data Models
- Data model structures
- Structured data
- Unstructured data
- Sources of unstructured data
- Comparing structured and unstructured data
- Data operations
- Subsetting
- Union
- Projection
- Join
- Data constraints
- Types of constraints
- Value constraints
- Uniqueness constraints
- Cardinality constraints
- Type constraints
- Domain constraints
- Structural constraints
- A unified approach to big data modeling and data management
- Summary
- Further reading
- Categorizing Data Models
- Levels of data modeling
- Conceptual data modeling
- Logical data modeling
- Benefits of constructing LDMs
- Physical data modeling
- Features of the physical data model
- Types of data model
- Hierarchical database models
- Relational models
- Advantages of the relational data model
- Network models
- Object-oriented database model
- Entity-relationship models
- Object-relational models
- Summary
- Further reading
- Structures of Data Models
- Semi-structured data models
- Exploring the semi-structured data model of JSON data
- Installing Python and the Tweepy library
- Getting authorization credentials to access the Twitter API
- VSM with Lucene
- Lucene
- Graph-data models
- Graph-data models with Gephi
- Summary
- Further reading
- Modeling Structured Data
- Getting started with structured data
- NumPy
- Operations using NumPy
- Pandas
- Matplotlib
- Seaborn
- IPython
- Modeling structured data using Python
- Visualizing the location of houses based on latitude and longitude
- Factors that affect the price of houses
- Visualizing more than one parameter
- Gradient-boosting regression
- Summary
- Further reading
- Modeling with Unstructured Data
- Getting started with unstructured data
- Tools for intelligent analysis
- New methods of data processing
- Tools for analyzing unstructured data
- Weka
- KNIME
- Characteristics of KNIME
- The R language
- Unstructured text analysis using R
- Data ingestion
- Data cleaning and transformations
- Data visualization
- Improving the model
- Summary
- Further reading
- Modeling with Streaming Data
- Data stream and data model versus data format
- Why is streaming data different?
- Use cases of stream processing
- What is a data stream?
- Data streaming systems
- How streaming works
- Data harvesting
- Data processing
- Data analytics
- Importance and implications of streaming data
- Needs for stream processing
- Challenges with streaming data
- Streaming data solutions
- Exploring streaming sensor data from the Twitter API
- Analyzing the streaming data
- Summary
- Further reading
- Streaming Sensor Data
- Sensor data
- Data lakes
- Differences between data lakes and data warehouses
- How a data lake works
- Exploring streaming sensor data from a weather station
- Summary
- Further study
- Concept and Approaches of Big-Data Management
- Non-DBMS-based approach to big data
- Filesystems
- Problems with processing files
- DBMS-based approach to big data
- Advantages of the DBMS
- Declarative Query Language (DQL)
- Data independence
- Controlling data redundancy
- Centralized data management and concurrent access
- Data integrity
- Data availability
- Efficient access through optimization
- Parallel and distributed DBMS
- Parallel DBMS
- Motivations for parallel DBMS
- Architectures for parallel databases
- Distributed DBMS
- Features of a distributed DBMS
- Merits of a distributed DBMS
- DBMS and MapReduce-style systems
- Summary
- Further reading
- DBMS to BDMS
- Characteristics of BDMS
- BASE properties
- Exploring data management with Redis
- Getting started with Redis on macOS
- Advanced key-value stores
- Redis and Hadoop
- Aerospike
- Aerospike technology
- AsterixDB
- Data models
- The Asterix query language
- Getting started with AsterixDB
- Unstructured data in AsterixDB
- Inserting into datasets
- Querying in AsterixDB
- Summary
- Further reading
- Modeling Bitcoin Data Points with Python
- Introduction to Bitcoin data
- Theory
- Importing Bitcoin data into iPython
- Importing required libraries
- Preprocessing and model creation
- Predicting Bitcoin price using Recurrent Neural Network
- Importing packages
- Importing datasets
- Preprocessing
- Constructing the RNN model
- Prediction
- Summary
- Further reading
- Modeling Twitter Feeds Using Python
- Importing Twitter feed data
- Modeling Twitter feeds
- The frequency of the tweets
- Sentiment analysis
- Installing TextBlob
- Parts of speech
- Noun-phrase extraction
- Tokenization
- Bag of words
- Summary
- Further reading
- Modeling Weather Data Points with Python
- Introduction to weather data
- Importing data
- Forecasting Nepal's temperature change
- Modeling with data
- Persistence model forecast
- Weather statistics by country
- Linear regression to predict the temperature of a city
- Summary
- Further reading
- Modeling IMDb Data Points with Python
- Introduction to IMDb data
- Episode data
- Rating data
- Theory
- Modeling with the IMDb dataset
- Starting the platform
- Importing the required libraries
- Importing a file
- Data cleansing
- Clustering
- Summary
- Further reading
- Other Books You May Enjoy
- Leave a review - let other readers know what you think (updated: 2021-06-10 18:59:37)