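Integrating Hadoop and HBase with Spring

Hadoop is an essential system in big-data environments. A Hadoop cluster lets you share server resources efficiently, and it has been used for offline (batch) processing for many years.

Spring Hadoop simplifies working with Apache Hadoop by providing a unified configuration model and easy-to-use APIs for HDFS, MapReduce, Pig, and Hive. It also integrates with other Spring ecosystem projects such as Spring Integration and Spring Batch.

Official documentation and API reference for Spring Hadoop 2.5:

spring-hadoop documentation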

spring-hadoop API

Spring Hadoop

  1. Add the repository and configure the dependencies

```xml
<repositories>
  <repository>
    <id>cloudera</id>
    <url>https://repository.cloudera.com/artifactory/cloudera-repos/</url>
    <releases>
      <enabled>true</enabled>
    </releases>
    <snapshots>
      <enabled>false</enabled>
    </snapshots>
  </repository>
</repositories>

<properties>
  <project.build.sourceEncoding>UTF-8</project.build.sourceEncoding>
  <hadoop.version>2.6.0-cdh5.7.0</hadoop.version>
</properties>

<dependencies>
  <!-- Hadoop client; "provided" scope keeps it out of the packaged artifact -->
  <dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-client</artifactId>
    <version>${hadoop.version}</version>
    <scope>provided</scope>
  </dependency>

  <dependency>
    <groupId>com.kumkee</groupId>
    <artifactId>UserAgentParser</artifactId>
    <version>0.0.1</version>
  </dependency>
  <dependency>
    <groupId>junit</groupId>
    <artifactId>junit</artifactId>
    <version>4.10</version>
    <scope>test</scope>
  </dependency>

  <!-- Spring for Apache Hadoop -->
  <dependency>
    <groupId>org.springframework.data</groupId>
    <artifactId>spring-data-hadoop</artifactId>
    <version>2.5.0.RELEASE</version>
  </dependency>
</dependencies>
```

  2. Add the Hadoop configuration to the Spring configuration file

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xmlns:context="http://www.springframework.org/schema/context"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
        http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd
        http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd">

    <!-- ...... -->

    <!-- Load application.properties so that ${...} placeholders can be resolved -->
    <context:property-placeholder location="application.properties"/>

    <!-- Hadoop configuration; the HDFS URI comes from the properties file -->
    <hdp:configuration id="hadoopConfiguration">
        fs.defaultFS=${spring.hadoop.fsUri}
    </hdp:configuration>

    <!-- HDFS FileSystem bean, accessed as user "root" -->
    <hdp:file-system id="fileSystem" configuration-ref="hadoopConfiguration" user="root"/>
</beans>
```

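All this configuration needs is the URL of the Hadoop server and the properties file to load.

Next, create a properties file named application.properties and add the Hadoop settings to it.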
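As an illustration, the properties file could contain a single entry such as `spring.hadoop.fsUri=hdfs://localhost:8020`; the host and port here are assumptions, so substitute the address of your own NameNode. The JDK-only sketch below imitates what `<context:property-placeholder>` does with that entry. Spring performs this substitution for you, and the class name `PlaceholderSketch` is hypothetical.

```java
import java.io.IOException;
import java.io.StringReader;
import java.util.Properties;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

// Illustrative only: shows how a ${key} placeholder is filled in from a
// properties file. This is not Spring's implementation.
final class PlaceholderSketch {

    private static final Pattern PLACEHOLDER = Pattern.compile("\\$\\{([^}]+)\\}");

    // Replaces every ${key} in the template with the matching property value.
    static String resolve(String template, Properties props) {
        Matcher m = PLACEHOLDER.matcher(template);
        StringBuffer out = new StringBuffer();
        while (m.find()) {
            String value = props.getProperty(m.group(1));
            if (value == null) {
                throw new IllegalArgumentException("No value for " + m.group(1));
            }
            m.appendReplacement(out, Matcher.quoteReplacement(value));
        }
        m.appendTail(out);
        return out.toString();
    }

    public static void main(String[] args) throws IOException {
        // Assumed content of application.properties; host and port are made up.
        Properties props = new Properties();
        props.load(new StringReader("spring.hadoop.fsUri=hdfs://localhost:8020\n"));

        // The line used inside <hdp:configuration> in the Spring file.
        System.out.println(resolve("fs.defaultFS=${spring.hadoop.fsUri}", props));
    }
}
```

If the key is missing from the properties file, the sketch fails fast with an exception, which is broadly what you want from a misconfigured placeholder as well.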

  3. Test

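```java
import java.io.IOException;

import org.apache.hadoop.fs.FileSystem;
import org.apache.hadoop.fs.Path;
import org.junit.After;
import org.junit.Before;
import org.junit.Test;
import org.springframework.context.ApplicationContext;
import org.springframework.context.support.ClassPathXmlApplicationContext;

public class SpringHadoopApp {

    private ApplicationContext ctx;
    private FileSystem fileSystem;

    @Before
    public void setUp() {
        // Load the Spring configuration and look up the HDFS FileSystem bean
        ctx = new ClassPathXmlApplicationContext("applicationContext.xml");
        fileSystem = (FileSystem) ctx.getBean("fileSystem");
    }

    @After
    public void tearDown() throws IOException {
        ctx = null;
        fileSystem.close();
    }

    /**
     * Creates a directory on HDFS.
     *
     * @throws Exception if the directory cannot be created
     */
    @Test
    public void testMkdirs() throws Exception {
        fileSystem.mkdirs(new Path("/SpringHDFS/"));
    }
}
```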

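Alternatively, you can configure Hadoop by loading its own configuration files directly:
copy /etc/hadoop/core-site.xml and /etc/hadoop/hdfs-site.xml into the project and use them for the configuration.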
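With this approach the files are usually referenced through the `resources` attribute of `<hdp:configuration>`, the same way the HBase setup later in this post points at hbase-site.xml. To make clear what the copied files contribute, here is a JDK-only sketch that reads the `name`/`value` pairs out of a `core-site.xml`-style document. Hadoop's own configuration loading does this for you; the class name `SiteXmlSketch` and the sample value are assumptions for illustration.

```java
import java.io.ByteArrayInputStream;
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;
import javax.xml.parsers.DocumentBuilderFactory;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.NodeList;

// Illustrative only: extracts the <property> entries from a Hadoop-style
// *-site.xml document using the JDK's DOM parser.
final class SiteXmlSketch {

    // Parses <configuration><property><name/><value/></property>...</configuration>.
    static Map<String, String> parse(String xml) {
        try {
            Document doc = DocumentBuilderFactory.newInstance().newDocumentBuilder()
                    .parse(new ByteArrayInputStream(xml.getBytes(StandardCharsets.UTF_8)));
            Map<String, String> result = new LinkedHashMap<>();
            NodeList properties = doc.getElementsByTagName("property");
            for (int i = 0; i < properties.getLength(); i++) {
                Element property = (Element) properties.item(i);
                String name = property.getElementsByTagName("name").item(0).getTextContent().trim();
                String value = property.getElementsByTagName("value").item(0).getTextContent().trim();
                result.put(name, value);
            }
            return result;
        } catch (Exception e) {
            throw new IllegalArgumentException("Invalid site XML", e);
        }
    }

    public static void main(String[] args) {
        // Made-up sample; a real core-site.xml normally holds more properties.
        String coreSite =
                "<configuration>"
              + "<property><name>fs.defaultFS</name><value>hdfs://localhost:8020</value></property>"
              + "</configuration>";
        System.out.println(parse(coreSite).get("fs.defaultFS"));
    }
}
```

The practical point is that `fs.defaultFS` then comes from the cluster's own file rather than from application.properties, so the two approaches should not disagree about its value.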

Spring Data HBase

  1. Add the dependencies

```xml
<!-- No version is given for hadoop-auth in the original; it is assumed
     to be managed elsewhere, for example in dependencyManagement -->
<dependency>
    <groupId>org.apache.hadoop</groupId>
    <artifactId>hadoop-auth</artifactId>
</dependency>
<dependency>
    <groupId>org.apache.hbase</groupId>
    <artifactId>hbase-client</artifactId>
    <version>1.2.3</version>
    <scope>compile</scope>
    <!-- Exclude the bundled log4j binding to avoid logging conflicts -->
    <exclusions>
        <exclusion>
            <groupId>log4j</groupId>
            <artifactId>log4j</artifactId>
        </exclusion>
        <exclusion>
            <groupId>org.slf4j</groupId>
            <artifactId>slf4j-log4j12</artifactId>
        </exclusion>
    </exclusions>
</dependency>
```

  2. Copy the HBase configuration file and set up applicationContext.xml

Copy the HBase configuration file hbase-site.xml into the resources directory, then create a new Spring configuration file named applicationContext.xml.

```xml
<?xml version="1.0" encoding="UTF-8"?>
<beans xmlns="http://www.springframework.org/schema/beans"
       xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"
       xmlns:context="http://www.springframework.org/schema/context"
       xmlns:hdp="http://www.springframework.org/schema/hadoop"
       xsi:schemaLocation="http://www.springframework.org/schema/beans http://www.springframework.org/schema/beans/spring-beans.xsd
    http://www.springframework.org/schema/context http://www.springframework.org/schema/context/spring-context.xsd
    http://www.springframework.org/schema/hadoop http://www.springframework.org/schema/hadoop/spring-hadoop.xsd">

    <context:annotation-config/>
    <context:component-scan base-package="com.sample.hbase"/>

    <!-- No id is set, so the bean gets the default name "hadoopConfiguration" -->
    <hdp:configuration resources="hbase-site.xml"/>
    <!-- Likewise registered under the default name "hbaseConfiguration" -->
    <hdp:hbase-configuration configuration-ref="hadoopConfiguration"/>

    <bean id="hbaseTemplate" class="org.springframework.data.hadoop.hbase.HbaseTemplate">
        <property name="configuration" ref="hbaseConfiguration"/>
    </bean>
</beans>
```

This configures the HbaseTemplate bean and the location of the HBase configuration file.

  3. Test

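```java
import java.util.List;

import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.util.Bytes;
import org.junit.Assert;
import org.junit.Test;
import org.junit.runner.RunWith;
import org.springframework.beans.factory.annotation.Autowired;
import org.springframework.data.hadoop.hbase.HbaseTemplate;
import org.springframework.data.hadoop.hbase.RowMapper;
import org.springframework.test.context.ContextConfiguration;
import org.springframework.test.context.junit4.SpringJUnit4ClassRunner;

@RunWith(SpringJUnit4ClassRunner.class)
@ContextConfiguration(locations = {"classpath*:applicationContext.xml"})
public class BaseTest {

    @Autowired
    private HbaseTemplate template;

    @Test
    public void testFind() {
        // Scan column cf:name of table "user" and map each row to a String
        List<String> rows = template.find("user", "cf", "name", new RowMapper<String>() {
            @Override
            public String mapRow(Result result, int i) throws Exception {
                return result.toString();
            }
        });
        Assert.assertNotNull(rows);
    }

    @Test
    public void testPut() {
        // Write "Alice" to cf:name in row "xiaogao" of table "user"
        template.put("user", "xiaogao", "cf", "name", Bytes.toBytes("Alice"));
    }
}
```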
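The `Bytes.toBytes("Alice")` call in `testPut` is needed because HBase stores row keys, column names, and cell values as raw byte arrays, and `Bytes.toBytes(String)` encodes the text as UTF-8. The JDK-only sketch below shows the equivalent round trip without any HBase dependency; the class name `CellBytesSketch` is hypothetical.

```java
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Illustrative only: the UTF-8 round trip behind writing and reading a
// String cell value, using nothing but the JDK.
final class CellBytesSketch {

    // Encodes a String the way it would be stored in a cell.
    static byte[] encode(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // Decodes the stored bytes back into a String.
    static String decode(byte[] stored) {
        return new String(stored, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        byte[] stored = encode("Alice");
        System.out.println(Arrays.toString(stored)); // [65, 108, 105, 99, 101]
        System.out.println(decode(stored));          // Alice
    }
}
```

In the test above, `mapRow` returns `result.toString()`, which describes the whole row. To get the stored name back as text you would instead decode the cell, for example with `Bytes.toString(result.getValue(Bytes.toBytes("cf"), Bytes.toBytes("name")))`.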



