1. Install Hadoop

1.1 Download Hadoop 1.1.2 (the stable version of Hadoop):
http://archive.apache.org/dist/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz

1.2 Untar the package to /opt.
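For example, assuming wget is available and you have write access to /opt:

cd /opt
wget http://archive.apache.org/dist/hadoop/core/hadoop-1.1.2/hadoop-1.1.2.tar.gz
tar -xzf hadoop-1.1.2.tar.gz
# the package extracts to /opt/hadoop-1.1.2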
1.3 Find the JAVA_HOME and HADOOP_PREFIX directories and add them to your .bashrc file:

echo 'export JAVA_HOME=/usr/lib/jvm/java-6-openjdk' >> ~/.bashrc
echo 'export HADOOP_PREFIX=/opt/hadoop-1.1.2' >> ~/.bashrc
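Reload the file so the variables take effect in your current shell:

source ~/.bashrc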
1.4 Add $HADOOP_PREFIX/bin to your $PATH variable:

export PATH=$PATH:$HADOOP_PREFIX/bin
export PATH=$PATH:$HADOOP_PREFIX/sbin
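At this point you can sanity-check the installation; assuming the paths above match your system, this should print "Hadoop 1.1.2" among other details:

hadoop version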
2. Decide the running mode of your Hadoop:

- local (standalone) mode
- pseudo-distributed mode

2.1 If you want to start Hadoop in pseudo-distributed mode, edit the following three configuration files in $HADOOP_PREFIX/conf: core-site.xml, hdfs-site.xml and mapred-site.xml.
# core-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>hadoop.tmp.dir</name>
    <value>/home/username/projects/hdfs_test</value>
  </property>
  <property>
    <name>fs.default.name</name>
    <value>hdfs://localhost:54310</value>
  </property>
</configuration>

# hdfs-site.xml
<?xml version="1.0" encoding="UTF-8"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>dfs.replication</name>
    <value>1</value>
  </property>
</configuration>

# mapred-site.xml
<?xml version="1.0"?>
<?xml-stylesheet type="text/xsl" href="configuration.xsl"?>
<configuration>
  <property>
    <name>mapred.job.tracker</name>
    <value>localhost:54311</value>
  </property>
  <property>
    <name>mapred.system.dir</name>
    <value>/home/username/projects/mapred/system</value>
    <final>true</final>
  </property>
</configuration>
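hadoop.tmp.dir above points at a local path; it may help to create it up front so the daemons do not run into permission problems (the path is the example one from core-site.xml):

mkdir -p /home/username/projects/hdfs_test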
2.2 Edit hadoop-env.sh in $HADOOP_PREFIX/conf/ and set JAVA_HOME in it:

vim $HADOOP_PREFIX/conf/hadoop-env.sh
export JAVA_HOME=/usr/lib/jvm/java-6-openjdk
3. Format the HDFS filesystem

hadoop namenode -format
4. Start Hadoop

start-all.sh
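You can verify that all the daemons came up with jps (part of the JDK); in pseudo-distributed mode you should see NameNode, DataNode, SecondaryNameNode, JobTracker and TaskTracker:

jps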
5. Useful commands to check the HDFS filesystem

hadoop fs -ls /
# lists the root of HDFS

hadoop fs -ls
# (without '/') if you get nothing but the following error message:
# "ls: Cannot access .: No such file or directory."
# you need to do the following steps to make it work:
hadoop fs -mkdir /user
hadoop fs -mkdir /user/username
# Now you should be able to run "hadoop fs -ls", because by default Hadoop looks for
# the "/user/username" directory within HDFS. The error message means that directory
# does not exist in HDFS yet; creating it avoids the error.

6. Copy files from the local Linux filesystem to HDFS

I used two text files for testing: file1.txt and file2.txt, located at /home/username/projects/hadoop.
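If you want to reproduce the test setup, something like this creates the two files (their contents are arbitrary):

mkdir -p /home/username/projects/hadoop
echo "hello world hello" > /home/username/projects/hadoop/file1.txt
echo "hello hadoop" > /home/username/projects/hadoop/file2.txt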
6.1 Copy the files from the local filesystem to HDFS:

hadoop dfs -copyFromLocal /home/username/projects/hadoop /user/username
# check the copy result
hadoop dfs -ls /user/username/

6.2 Download the word-count example from
http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-examples/1.0.3/hadoop-examples-1.0.3.jar
and put it in /home/username/projects/hadoop.
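For example, again assuming wget:

cd /home/username/projects/hadoop
wget http://repo1.maven.org/maven2/org/apache/hadoop/hadoop-examples/1.0.3/hadoop-examples-1.0.3.jar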
6.3 Run the MapReduce job:

hadoop jar hadoop-examples-1.0.3.jar wordcount /user/username /user/username-output
# notes for this command:
# (1) if you see an IO exception, try using the full path of the jar package
# (2) "/user/username-output" is the output path for MapReduce; it must not exist
#     before you run the job.
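Once the job finishes, you can peek at the result directly in HDFS; with this example jar the output file should be named part-r-00000:

hadoop fs -ls /user/username-output
hadoop fs -cat /user/username-output/part-r-00000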
6.4 Retrieve the job result from HDFS:

# merge the mapreduce outputs and copy them to the local path /home/username/projects/hadoop/output
hadoop dfs -getmerge /user/username-output /home/username/projects/hadoop/output
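The merged result is now a plain local file, so the usual tools work on it:

head /home/username/projects/hadoop/output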
7. Hadoop Web Interfaces

NameNode daemon:    http://localhost:50070/
JobTracker daemon:  http://localhost:50030/
TaskTracker daemon: http://localhost:50060/
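A quick way to check from the shell that the NameNode web UI is answering (assuming curl is installed):

curl -s http://localhost:50070/ > /dev/null && echo "NameNode web UI is up"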
8. Questions

Q. The warning "$HADOOP_HOME is deprecated" always shows up when you run hadoop commands.
A. Replace HADOOP_HOME with HADOOP_PREFIX in your ~/.bashrc file. Then open a new terminal and use echo to check whether $HADOOP_HOME is still defined somewhere else.
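For example (the files being grepped are just a guess at the usual places such a variable gets set):

echo $HADOOP_HOME
grep -n "HADOOP_HOME" ~/.bashrc ~/.profile /etc/profile 2>/dev/null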