- Apache Hive Essentials
- Dayong Du
- 2021-07-23 20:25:30
Using the Hive command line and Beeline
Hive first started with HiveServer1. However, this version of the Hive server was not very stable; it sometimes suspended or blocked client connections silently. Since version 0.11, Hive has included a new server, HiveServer2, alongside HiveServer1. HiveServer2 is an enhanced Hive server designed for multiclient concurrency and improved authentication. It also supports Beeline as an alternative command-line interface. HiveServer1 was deprecated and has been removed from Hive as of version 1.0.0.
The primary difference between the two Hive servers is how clients connect to Hive. The Hive CLI is an Apache Thrift-based client, whereas Beeline is a JDBC client based on the SQLLine CLI (http://sqlline.sourceforge.net/). The Hive CLI connects directly to the Hive drivers and requires Hive to be installed on the same machine as the client. Beeline, by contrast, connects to HiveServer2 through JDBC and does not require the Hive libraries to be installed on the same machine as the client. That means we can run Beeline remotely from outside the Hadoop cluster.
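As a rough illustration of the JDBC connection described above, the following sketch assembles a HiveServer2 JDBC URL and the matching Beeline argument list. The helper names, the host `hs2.example.com`, and the user `analyst` are hypothetical; port 10000 and the `default` database are assumed here as common defaults and may differ in a given cluster.

```python
def hive2_jdbc_url(host, port=10000, database="default"):
    """Build a HiveServer2 JDBC URL of the form jdbc:hive2://host:port/database."""
    return f"jdbc:hive2://{host}:{port}/{database}"


def beeline_argv(host, user, port=10000, database="default"):
    """Return an argument list for starting Beeline against a remote HiveServer2.

    -u passes the JDBC URL and -n the user name to connect as.
    """
    return ["beeline", "-u", hive2_jdbc_url(host, port, database), "-n", user]


# Hypothetical host and user, for illustration only.
argv = beeline_argv("hs2.example.com", "analyst")
print(" ".join(argv))
```

The resulting list could be handed to `subprocess.run` on any machine that has Beeline installed, which need not be a node of the Hadoop cluster.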
The following table lists the commonly used commands for both Beeline and the Hive CLI. For more on using HiveServer2 and Beeline, refer to https://cwiki.apache.org/confluence/display/Hive/HiveServer2+Clients.

The following is the command-line syntax in Beeline or Hive CLI:

Note
In Beeline, a trailing semicolon (;) is not needed after a command that starts with an exclamation mark (!).
When a query runs in the Hive CLI, the MapReduce statistics are shown on the console while it is processing, whereas Beeline does not show them.
Neither Beeline nor the Hive CLI supports running a pasted query that contains <tab> characters, because <tab> triggers autocomplete by default in both environments. Running the query from a file avoids this issue.
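One possible workaround for pasted queries, sketched below, is to replace tab characters with spaces before pasting. The function name and the sample query are hypothetical and are not part of Hive or Beeline.

```python
def strip_tabs_for_paste(query, width=4):
    """Replace every tab with `width` spaces so that pasting the text into an
    interactive shell does not trigger tab autocomplete."""
    return query.replace("\t", " " * width)


# Hypothetical query text containing tab indentation.
raw = "SELECT\tname\nFROM\temployee;"
print(strip_tabs_for_paste(raw, width=2))
```

Saving the query to a file and running it with the `-f` option of either client sidesteps the problem without modifying the text.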
The Hive CLI shows the exact line and position of a query or syntax error when the query spans multiple lines. Beeline, however, processes a multiple-line query as a single line, so it reports only the position and always gives the line number as 1. In this respect, the Hive CLI is more convenient than Beeline for debugging Hive queries.
In both the Hive CLI and Beeline, the up and down arrow keys retrieve up to 10,000 previous commands. In Beeline, the !history command shows the full history.
Both the Hive CLI and Beeline support variable substitution; refer to https://cwiki.apache.org/confluence/display/Hive/LanguageManual+VariableSubstitution.
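To make variable substitution concrete, here is a minimal sketch that builds a client invocation passing variables with `--hivevar` and referencing them in the query as `${hivevar:name}`. The helper name, the table `employee`, and the variable names are assumptions made for illustration.

```python
def with_hivevars(base_argv, variables, query):
    """Append one --hivevar name=value pair per variable, then the query via -e."""
    argv = list(base_argv)
    for name, value in variables.items():
        argv += ["--hivevar", f"{name}={value}"]
    argv += ["-e", query]
    return argv


# The client substitutes ${hivevar:...} references before running the query.
query = "SELECT * FROM ${hivevar:tbl} LIMIT ${hivevar:n};"
argv = with_hivevars(["hive"], {"tbl": "employee", "n": 10}, query)
print(argv)
```

The same variable arguments could be appended to a Beeline argument list that already carries the connection options.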
Hive configuration settings and properties can be viewed and overridden with the set keyword from the command-line environment. For more details, refer to the Apache Hive wiki at https://cwiki.apache.org/confluence/display/Hive/Configuration+Properties.
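As an illustration of overriding properties with set, the sketch below renders a mapping of properties as SET statements that could be placed at the top of a query file. The helper name is hypothetical, and the two properties are only examples; whether they suit a given workload is an assumption.

```python
def render_set_statements(settings):
    """Render a mapping of Hive properties as SET statements, one per line."""
    return "\n".join(f"SET {key}={value};" for key, value in settings.items())


# Example properties chosen for illustration only.
script = render_set_statements({
    "hive.cli.print.header": "true",
    "hive.exec.parallel": "true",
})
print(script)
```

Overrides applied this way affect only the current session. Issuing set with a property name and no value displays the current setting instead of changing it.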