- Apache Spark Quick Start Guide
- Shrey Mehrotra Akash Grade
- 2021-07-02 13:39:56
Spark graph processing
Spark also includes a component for processing graph data. A graph consists of vertices and edges, where each edge defines a relationship between two vertices. Examples of graph data include customers' product ratings, social networks, Wikipedia pages and the links between them, and flights between airports.
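To make the vertex and edge terminology concrete, here is a minimal sketch in plain Python (not Spark) of a tiny social network. The names, the `follows` relationship, and the variable names are hypothetical sample data chosen for illustration only.

```python
from collections import Counter

# Hypothetical sample data: a tiny social network.
# Each vertex has an id and a dictionary of properties.
vertices = {
    1: {"name": "Alice"},
    2: {"name": "Bob"},
    3: {"name": "Carol"},
}

# Each edge is (source id, destination id, relationship).
edges = [
    (1, 2, "follows"),
    (2, 3, "follows"),
    (3, 1, "follows"),
    (1, 3, "follows"),
]

# In-degree: count how many edges point at each vertex.
in_degree = Counter(dst for _, dst, _ in edges)

# Resolve vertex ids to names for a readable result.
followers_by_name = {
    props["name"]: in_degree.get(vertex_id, 0)
    for vertex_id, props in vertices.items()
}
print(followers_by_name)
```

The same shape, a collection of vertices with properties plus a collection of edges that reference vertex ids, is what a graph library works with at a much larger scale.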
Spark provides GraphX to process such data. GraphX builds on RDDs for its computation and lets users create vertices and edges that carry properties. Using GraphX, you can define a graph, manipulate it, or query it for insights.
GraphFrames is an external package that uses DataFrames instead of RDDs and represents the vertices and edges of a graph as DataFrames.
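The tabular layout can be sketched without Spark. GraphFrames expects a vertex DataFrame with an `id` column and an edge DataFrame with `src` and `dst` columns. The plain-Python rows below mimic that layout. The airports, the `flights_per_day` values, and the variable names are made-up sample data, and the lookup is only a stand-in for the join Spark would perform.

```python
# Vertex table: one row per vertex, keyed by "id".
# Hypothetical sample data for illustration.
vertex_rows = [
    {"id": "JFK", "city": "New York"},
    {"id": "SFO", "city": "San Francisco"},
    {"id": "ORD", "city": "Chicago"},
]

# Edge table: "src" and "dst" refer to vertex ids; other columns are edge properties.
edge_rows = [
    {"src": "JFK", "dst": "SFO", "flights_per_day": 4},
    {"src": "SFO", "dst": "ORD", "flights_per_day": 2},
    {"src": "ORD", "dst": "JFK", "flights_per_day": 6},
]

# Index vertices by id so each edge can look up both of its endpoints.
by_id = {row["id"]: row for row in vertex_rows}

# Combine each edge with the properties of its source and destination vertex.
routes = [
    (by_id[e["src"]]["city"], by_id[e["dst"]]["city"], e["flights_per_day"])
    for e in edge_rows
]

for origin, destination, flights in routes:
    print(f"{origin} -> {destination}: {flights} flights per day")
```

In an actual Spark job, the two tables would be DataFrames passed to the package's `GraphFrame` constructor, and the lookup would be expressed as a DataFrame query rather than a Python dictionary.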