nanjingjiangbiao_T

Open Source Cloud Computing Technology Series (4): Installing and Configuring Cloudera (Stable Release 0.18.3)

 

To save space, let's get straight to the point.

First, set up a Debian 5.0 machine in a VirtualBox virtual machine.

Among open source Linux distributions, Debian has always had the purest Linux pedigree: convenient to use and efficient to run. Taking a fresh look at the latest 5.0 release feels like meeting an old friend again.

You only need to download debian-501-i386-CD-1.iso to install it; after that, Debian's strong network support makes it easy to configure the remaining software packages. The details are omitted here; you can find everything you need at www.debian.org.

Now let's see how convenient and simple the stable 0.18.3 release is.

Step 1. Configure the Cloudera repository

Create a new configuration file: vi /etc/apt/sources.list.d/cloudera.list

more /etc/apt/sources.list.d/cloudera.list
deb http://archive.cloudera.com/debian lenny contrib
deb-src http://archive.cloudera.com/debian lenny contrib
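
If you would rather generate this file than type it in vi, a small shell function can print the two lines. This is only a sketch: the function name cloudera_sources is made up for illustration, and it assumes the repository base URL is http://archive.cloudera.com/debian (the same base as the key URL in the next step) with the release codename defaulting to lenny.

```shell
#!/bin/sh
# Hypothetical helper: print the deb and deb-src lines for the Cloudera
# repository. $1 is the Debian release codename (assumed default: lenny).
cloudera_sources() {
    release="${1:-lenny}"
    base="http://archive.cloudera.com/debian"   # assumed base URL
    printf 'deb %s %s contrib\n' "$base" "$release"
    printf 'deb-src %s %s contrib\n' "$base" "$release"
}

# Usage (as root):
#   cloudera_sources lenny > /etc/apt/sources.list.d/cloudera.list
```

The function only prints to standard output, so you can inspect the result before redirecting it into /etc/apt/sources.list.d/.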

Add the Cloudera key

debian:~# curl -s http://archive.cloudera.com/debian/archive.key | apt-key add -
OK

Update the APT index

debian:~# apt-get update
Ign cdrom://[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10] lenny Release.gpg
Ign cdrom://[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10] lenny/main Translation-en_US
Ign cdrom://[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10] lenny Release
Ign cdrom://[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10] lenny/main Packages/DiffIndex
Get:1 http://archive.cloudera.com lenny Release.gpg [197B]
Get:2 http://volatile.debian.org lenny/volatile Release.gpg [189B]
Ign http://volatile.debian.org lenny/volatile/main Translation-en_US
Hit http://ftp.us.debian.org lenny Release.gpg
Ign http://archive.cloudera.com lenny/contrib Translation-en_US
Hit http://security.debian.org lenny/updates Release.gpg
Ign http://security.debian.org lenny/updates/main Translation-en_US
Get:3 http://volatile.debian.org lenny/volatile Release [40.7kB]
Ign http://ftp.us.debian.org lenny/main Translation-en_US
Hit http://security.debian.org lenny/updates Release
Get:4 http://archive.cloudera.com lenny Release [2391B]
Hit http://ftp.us.debian.org lenny Release
Ign http://security.debian.org lenny/updates/main Packages/DiffIndex
Ign http://archive.cloudera.com lenny/contrib Packages
Ign http://security.debian.org lenny/updates/main Sources/DiffIndex
Ign http://ftp.us.debian.org lenny/main Packages/DiffIndex
Ign http://ftp.us.debian.org lenny/main Sources/DiffIndex
Hit http://security.debian.org lenny/updates/main Packages
Hit http://ftp.us.debian.org lenny/main Packages
Ign http://archive.cloudera.com lenny/contrib Sources
Ign http://volatile.debian.org lenny/volatile/main Packages/DiffIndex
Hit http://security.debian.org lenny/updates/main Sources
Ign http://volatile.debian.org lenny/volatile/main Sources/DiffIndex
Hit http://ftp.us.debian.org lenny/main Sources
Get:5 http://archive.cloudera.com lenny/contrib Packages [4480B]
Get:6 http://volatile.debian.org lenny/volatile/main Packages [7471B]
Get:7 http://volatile.debian.org lenny/volatile/main Sources [2350B]
Get:8 http://archive.cloudera.com lenny/contrib Sources [1431B]
Fetched 59.2kB in 4s (12.5kB/s)
Reading package lists… Done
debian:~#

List the Cloudera packages

debian:~# apt-cache search hadoop
hadoop - A software platform for processing vast amounts of data
hadoop-conf-pseudo - Pseudo-distributed Hadoop configuration
hadoop-datanode - Data Node for Hadoop
hadoop-doc - Documentation for Hadoop
hadoop-jobtracker - Job Tracker for Hadoop
hadoop-namenode - Name Node for Hadoop
hadoop-native - Native libraries for Hadoop (e.g., compression)
hadoop-pipes - Interface to author Hadoop MapReduce jobs in C++
hadoop-secondarynamenode - Secondary Name Node for Hadoop
hadoop-tasktracker - Task Tracker for Hadoop
hive - A data warehouse infrastructure built on top of Hadoop
libhdfs0 - JNI Bindings to access Hadoop HDFS from C
pig - A platform for analyzing large data sets using Hadoop
debian:~#

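OK, that completes the preparation. Now for the actual installation, which is just as convenient.

We will install Hadoop in pseudo-distributed mode, which lets us try out the full Hadoop feature set.

Yesterday we tried hadoop-conf-pseudo 0.18.3-0cloudera0.3.0~intrepid; today Cloudera released a trial package based on the latest Hadoop 0.20, so let's take the chance to try it early. That is the pace of open source software: something new every day.

Java 6 is required.

Configure /etc/apt/sources.list as follows: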

debian:~/codeblue2/client/examples# more /etc/apt/sources.list
#
# deb cdrom:[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10]/ lenny main

deb cdrom:[Debian GNU/Linux 5.0.1 _Lenny_ - Official i386 CD Binary-1 20090413-00:10]/ lenny main

deb http://ftp.us.debian.org/debian/ lenny main contrib non-free
deb-src http://ftp.us.debian.org/debian/ lenny main contrib non-free

deb http://security.debian.org/ lenny/updates main contrib non-free
deb-src http://security.debian.org/ lenny/updates main contrib non-free

deb http://volatile.debian.org/debian-volatile lenny/volatile main contrib non-free
deb-src http://volatile.debian.org/debian-volatile lenny/volatile main contrib non-free

Then run apt-get update, and install the Java 6 runtime:

debian:~# apt-get install sun-java6-jre

The installation is foolproof, so the output is omitted here.

Before trying 0.20, let's first go over installing 0.18.3, since it is the stable release.

apt-get -y install hadoop-conf-pseudo
Reading package lists… Done
Building dependency tree
Reading state information… Done
The following extra packages will be installed:
hadoop hadoop-native liblzo2-2
The following NEW packages will be installed:
hadoop hadoop-conf-pseudo hadoop-native liblzo2-2
0 upgraded, 4 newly installed, 0 to remove and 0 not upgraded.
Need to get 12.0MB/12.1MB of archives.
After this operation, 21.5MB of additional disk space will be used.
Get:1 http://archive.cloudera.com lenny/contrib hadoop 0.18.3-4cloudera0.3.0~lenny [11.9MB]
Get:2 http://archive.cloudera.com lenny/contrib hadoop-conf-pseudo 0.18.3-4cloudera0.3.0~lenny [93.1kB]
Get:3 http://archive.cloudera.com lenny/contrib hadoop-native 0.18.3-4cloudera0.3.0~lenny [92.7kB]
Fetched 4336kB in 23s (184kB/s)
Selecting previously deselected package liblzo2-2.
(Reading database … 103556 files and directories currently installed.)
Unpacking liblzo2-2 (from …/lzo2/liblzo2-2_2.03-1_i386.deb) …
Selecting previously deselected package hadoop.
Unpacking hadoop (from …/hadoop_0.18.3-4cloudera0.3.0~lenny_all.deb) …
Selecting previously deselected package hadoop-conf-pseudo.
Unpacking hadoop-conf-pseudo (from …/hadoop-conf-pseudo_0.18.3-4cloudera0.3.0~lenny_all.deb) …
Selecting previously deselected package hadoop-native.
Unpacking hadoop-native (from …/hadoop-native_0.18.3-4cloudera0.3.0~lenny_i386.deb) …
Processing triggers for man-db …
Setting up liblzo2-2 (2.03-1) …
Setting up hadoop (0.18.3-4cloudera0.3.0~lenny) …
Setting up hadoop-conf-pseudo (0.18.3-4cloudera0.3.0~lenny) …
Setting up hadoop-native (0.18.3-4cloudera0.3.0~lenny) …

Check where the files were installed.

debian:~# dpkg -L hadoop-conf-pseudo
/.
/etc
/etc/hadoop
/etc/hadoop/conf.pseudo
/etc/hadoop/conf.pseudo/hadoop-default.xml
/etc/hadoop/conf.pseudo/configuration.xsl
/etc/hadoop/conf.pseudo/log4j.properties
/etc/hadoop/conf.pseudo/slaves
/etc/hadoop/conf.pseudo/sslinfo.xml.example
/etc/hadoop/conf.pseudo/hadoop-env.sh
/etc/hadoop/conf.pseudo/masters
/etc/hadoop/conf.pseudo/hadoop-metrics.properties
/etc/hadoop/conf.pseudo/commons-logging.properties
/etc/hadoop/conf.pseudo/hadoop-site.xml
/usr
/usr/share
/usr/share/doc
/usr/share/doc/hadoop-conf-pseudo
/usr/share/doc/hadoop-conf-pseudo/copyright
/usr/share/doc/hadoop-conf-pseudo/changelog.Debian.gz
/usr/share/doc/hadoop-conf-pseudo/changelog.gz
/usr/share/lintian
/usr/share/lintian/overrides
/usr/share/lintian/overrides/hadoop-conf-pseudo

debian:~# ls -l /var/lib/hadoop/cache/hadoop/dfs/name
total 8
drwxr-xr-x 2 hadoop hadoop 4096 2009-06-24 02:58 current
drwxr-xr-x 2 hadoop hadoop 4096 2009-06-24 02:58 image

Start the Hadoop services:

debian:~# /etc/init.d/hadoop-namenode start
Starting Hadoop namenode daemon: starting namenode, logging to /var/log/hadoop/hadoop-hadoop-namenode-debian.out
hadoop-namenode.

debian:~# /etc/init.d/hadoop-datanode start
Starting Hadoop datanode daemon: starting datanode, logging to /var/log/hadoop/hadoop-hadoop-datanode-debian.out
hadoop-datanode.
debian:~# /etc/init.d/hadoop-jobtracker start
Starting Hadoop jobtracker daemon: starting jobtracker, logging to /var/log/hadoop/hadoop-hadoop-jobtracker-debian.out
hadoop-jobtracker.
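
Starting the daemons one at a time gets repetitive, so the same init scripts can be driven from a loop. The sketch below is a dry-run helper: hadoop_ctl is a hypothetical name, and it only prints the commands it would run, assuming the init scripts follow the /etc/init.d/hadoop-<daemon> naming seen above.

```shell
#!/bin/sh
# Hypothetical helper: print one init-script command per daemon.
# $1 is the action (start, stop, ...); remaining arguments are daemon names.
hadoop_ctl() {
    action="$1"
    shift
    for daemon in "$@"; do
        printf '/etc/init.d/hadoop-%s %s\n' "$daemon" "$action"
    done
}

# Usage: review the commands first, then pipe them to a root shell:
#   hadoop_ctl start namenode datanode jobtracker
#   hadoop_ctl start namenode datanode jobtracker | sh
```

Printing first and executing second keeps the helper safe to experiment with on a machine where Hadoop is not installed.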

Check that the processes are running normally:

hadoop 7926 1 0 03:01 ? 00:00:12 /usr/lib/jvm/java-6-sun//bin/java -Xmx100m -Dcom.sun.man
hadoop 8007 1 1 03:02 ? 00:00:14 /usr/lib/jvm/java-6-sun//bin/java -Xmx100m -Dcom.sun.man
hadoop 8053 1 0 03:02 ? 00:00:13 /usr/lib/jvm/java-6-sun//bin/java -Xmx100m -Dcom.sun.man
hadoop 8108 1 0 03:02 ? 00:00:11 /usr/lib/jvm/java-6-sun//bin/java -Xmx100m -Dhadoop.log
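
The listing above looks like ps -ef output filtered for the hadoop user, although the exact command is not shown, so treat that as an assumption. Under that assumption, a small awk filter can count the Hadoop JVMs instead of reading the list by eye. The function name count_hadoop_jvms is made up for this sketch.

```shell
#!/bin/sh
# Hypothetical helper: read `ps -ef` style output on stdin and count the
# processes owned by user "hadoop" whose command (column 8) ends in "java".
count_hadoop_jvms() {
    awk '$1 == "hadoop" && $8 ~ /java$/ { n++ } END { print n + 0 }'
}

# Usage:
#   ps -ef | count_hadoop_jvms
```

Comparing the count with the number of daemons you started is a quick sanity check; it does not tell you which daemon is missing, so check the logs under /var/log/hadoop if the numbers differ.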

Installing Hive and Pig takes just one command each, which is quick and convenient.

apt-get install hive

apt-get install pig

OK, let's autoremove 0.18.3 and try the latest 0.20.

debian:~# apt-get autoremove hadoop-conf-pseudo

debian:~# wget http://archive.cloudera.com/hadoop-summit-09/hadoop-20-debs/deb_lenny_i386/hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb

debian:~# dpkg -i hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb
Selecting previously deselected package hadoop-0.20.
(Reading database … 103589 files and directories currently installed.)
Unpacking hadoop-0.20 (from hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb) …
Setting up hadoop-0.20 (0.20.0-1cloudera0.5.0~lenny) …
Processing triggers for man-db …
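
When installing a downloaded .deb by hand, it can help to confirm that the installed version matches the file you fetched. Debian package files are named name_version_architecture.deb, so the fields can be split on the underscore. The helper below is a sketch; deb_field is a hypothetical name, not part of dpkg.

```shell
#!/bin/sh
# Hypothetical helper: extract one field from a .deb file name.
# $1 is the file name or path, $2 is the field number:
#   1 = package name, 2 = version, 3 = architecture
deb_field() {
    file="$(basename "$1" .deb)"
    printf '%s\n' "$file" | cut -d_ -f"$2"
}

# Usage: compare the version in the file name with what dpkg reports:
#   deb_field hadoop-0.20_0.20.0-1cloudera0.5.0~lenny_all.deb 2
#   dpkg-query -W -f='${Version}\n' hadoop-0.20
```

If the two versions differ, the package was most likely installed from another source, such as the APT repository configured earlier.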

We are keeping an eye on further developments in 0.20.

Reposted from: http://rdc.taobao.com/blog/dw/archives/414
