This was written in a bit of a hurry; if you spot any omissions or mistakes, corrections are very welcome.
1. Download the software
Go to http://mirror.mel.bkb.net.au/pub/apache/ and download the Hive package.
I downloaded http://mirror.mel.bkb.net.au/pub/apache//hive/stable/hive-0.12.0.tar.gz
and uploaded it to the SUSE Linux machine.
Run tar zxf hive-0.12.0.tar.gz    # extract the package
On the master node: mv hive-0.12.0 /usr/    # move it into the same directory as hadoop, for easier management
ln -s hive-0.12.0 hive    # create a symbolic link (run inside /usr)
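One benefit of the symlink: a later Hive upgrade only needs the link repointed at the new version directory, so HIVE_HOME (/usr/hive) stays valid. The layout can be sketched in a throwaway directory (the real commands above run in /usr):

```shell
# Illustration only: reproduce the version-dir + symlink layout under /tmp/demo.
mkdir -p /tmp/demo/hive-0.12.0
cd /tmp/demo
ln -sf hive-0.12.0 hive      # -f so the sketch is re-runnable
readlink hive                 # prints: hive-0.12.0
```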
2. Set environment variables
These are set for the hadoop user (the Hadoop account on this machine is named hadoop).
vi .bash_profile
export JAVA_HOME=/usr/lib/jvm/java-1.6.0-openjdk-1.6.0.0.x86_64/jre
export HADOOP_HOME=/usr/hadoop
export HIVE_HOME=/usr/hive
export HADOOP_CONF_DIR=$HOME/conf
export HIVE_CONF_DIR=$HOME/hive-conf
export CLASSPATH=$HIVE_HOME/lib:$JAVA_HOME/lib:$JAVA_HOME/jre/lib:$HADOOP_HOME
export PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$JAVA_HOME/bin:/sbin/:/bin:.:$PATH
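The edits to .bash_profile only take effect in new login shells; sourcing the file (or exporting by hand) and echoing a variable is a quick sanity check. A minimal sketch using the values above:

```shell
# Sanity check: export the key variables from the section above and confirm they resolve.
# (In practice you would run `source ~/.bash_profile` instead of exporting by hand.)
export HADOOP_HOME=/usr/hadoop
export HIVE_HOME=/usr/hive
export PATH=$HIVE_HOME/bin:$HADOOP_HOME/bin:$PATH
echo "HIVE_HOME=$HIVE_HOME"   # should print HIVE_HOME=/usr/hive
```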
3. Configure Hive
Change into the directory /usr/hive:
cp -r conf $HIVE_CONF_DIR/
cd $HIVE_CONF_DIR/
cp hive-default.xml.template hive-default.xml
cp hive-env.sh.template hive-env.sh
vi hive-env.sh    # edit the file and add the following two lines
export HADOOP_HEAPSIZE=512
export HIVE_CONF_DIR=/home/hadoop/hive-conf
4. Test Hive
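The session below loads /home/hadoop/product.txt, whose contents are never shown; since the table is declared with FIELDS TERMINATED BY ',', the file must be comma-delimited. A two-row sample matching the query output can be created like this (written to /tmp here for illustration):

```shell
# Create a sample data file in the comma-delimited format the product table expects.
printf '1,zhang\n2,wang\n' > /tmp/product.txt
cat /tmp/product.txt
# 1,zhang
# 2,wang
```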
$ hive
hive> show tables;
OK
Time taken: 4.824 seconds
hive> create table product(id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',';
OK
Time taken: 0.566 seconds
......
hive> select * from product;
OK
1 zhang
2 wang
1 zhang
2 wang
Time taken: 0.099 seconds, Fetched: 4 row(s)
# Loading from the local filesystem requires the LOCAL keyword; without it, the path is read from HDFS, so the file must be uploaded to an HDFS directory first.
hive> load data local inpath '/home/hadoop/product.txt' into table product;
Copying data from file:/home/hadoop/product.txt
Copying file: file:/home/hadoop/product.txt
Loading data to table default.product
Table default.product stats: [num_partitions: 0, num_files: 3, num_rows: 0, total_size: 45, raw_data_size: 0]
OK
Time taken: 0.57 seconds
hive> select * from product;
OK
1 zhang
2 wang
1 zhang
2 wang
1 zhang
2 wang
Time taken: 0.132 seconds, Fetched: 6 row(s)
hive> CREATE TABLE product1 (id int,name string) ROW FORMAT DELIMITED FIELDS TERMINATED BY ',' LOCATION '/tmp/product/';
OK
Time taken: 0.566 seconds
### Just upload a file into the /tmp/product/ directory and the table sees it: hadoop fs -put /home/hadoop/product.txt /tmp/product
hive> select * from product1;
OK
1 zhang
2 wang
Time taken: 0.101 seconds, Fetched: 2 row(s)
$ hadoop dfs -lsr /user/hive
drwxrwxr-x - hadoop supergroup 0 2013-11-27 20:52 /user/hive/warehouse
drwxr-xr-x - hadoop supergroup 0 2013-11-27 23:27 /user/hive/warehouse/product
-rw-r--r-- 1 hadoop supergroup 15 2013-11-27 20:39 /user/hive/warehouse/product/product.txt
-rw-r--r-- 1 hadoop supergroup 15 2013-11-27 20:51 /user/hive/warehouse/product/product_copy_1.txt
-rw-r--r-- 1 hadoop supergroup 15 2013-11-27 23:27 /user/hive/warehouse/product/product_copy_2.txt
$ hadoop dfs -lsr /tmp/product
-rw-r--r-- 1 hadoop supergroup 15 2013-11-27 22:58 /tmp/product/product.txt
#### Commands to delete a file or a whole directory; since the warehouse files hold the table's data, deleting them also deletes the corresponding rows from the table.
hadoop fs -rm /user/hive/warehouse/product/product_copy_2.txt
hadoop fs -rmr /user/hive/warehouse/product/
5. Common errors
error 1:
-------------------------------------------------
This error can appear when running a hadoop command:
FAILED: Could not create the Java Virtual Machine
Solution:
Reduce HADOOP_HEAPSIZE in hadoop-env.sh to 200,
and likewise reduce HADOOP_HEAPSIZE to 200 in hive-env.sh under hive-conf.
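Concretely, the setting in question is the HADOOP_HEAPSIZE environment variable; the relevant line in hive-env.sh (and its counterpart in hadoop-env.sh) would read:

```shell
# hive-env.sh / hadoop-env.sh: cap the JVM heap at 200 MB
export HADOOP_HEAPSIZE=200
```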
error 2:
-------------------------------------------------
The hadoop datanode fails to start; the logs show a namespaceID problem:
java.io.IOException: Incompatible namespaceIDs in /tmp/hadoop/hadoop-hadoop/dfs/data:
NameNode namespaceID = 1307672299; DataNode namespaceID = 389959598
This happens because the namenode and datanode namespaceIDs do not match.
Solution:
1. Stop hadoop, delete /tmp/hadoop/*, run hadoop namenode -format, then start hadoop (note: reformatting wipes the data in HDFS).
2. Stop hadoop, edit /tmp/hadoop/hadoop-hadoop/dfs/data/current/VERSION, change namespaceID = 389959598 to namespaceID = 1307672299, then start hadoop.
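Solution 2 boils down to one line changed in the datanode's VERSION file. A sketch on a throwaway copy (the real file is /tmp/hadoop/hadoop-hadoop/dfs/data/current/VERSION; the second line below is illustrative filler):

```shell
# Simulate the datanode VERSION file, then rewrite its namespaceID to the namenode's value.
mkdir -p /tmp/demo-dfs/current
printf 'namespaceID=389959598\nstorageType=DATA_NODE\n' > /tmp/demo-dfs/current/VERSION
sed -i 's/^namespaceID=.*/namespaceID=1307672299/' /tmp/demo-dfs/current/VERSION
cat /tmp/demo-dfs/current/VERSION
```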