<?xml version='1.0' encoding='UTF-8'?><?xml-stylesheet href="http://www.blogger.com/styles/atom.css" type="text/css"?><feed xmlns='http://www.w3.org/2005/Atom' xmlns:openSearch='http://a9.com/-/spec/opensearchrss/1.0/' xmlns:georss='http://www.georss.org/georss' xmlns:gd='http://schemas.google.com/g/2005' xmlns:thr='http://purl.org/syndication/thread/1.0'><id>tag:blogger.com,1999:blog-1481083534869484444</id><updated>2011-12-02T06:32:39.768-08:00</updated><category term='Hadoop Eclipse'/><category term='map/reduce'/><category term='no-SQL DB'/><category term='ubuntu 8.10'/><category term='cloud computing'/><category term='ec2'/><category term='reduce'/><category term='Amazon'/><category term='Data Mining'/><category term='Machine Learning'/><category term='tiger/line'/><category term='Hadoop Windows'/><category term='Business Intelligence'/><category term='hadoop'/><category term='big data'/><category term='AWS'/><category term='Market Basket Analysis'/><category term='census'/><category term='Eclipse 3.3.2'/><category term='hadoop example'/><category term='amazon ec2 and s3'/><category term='Column Oriented DB'/><category term='map reduce'/><category term='Whirr'/><category term='flickr'/><category term='digg'/><category term='map reduce example'/><category term='hadoop tutorial'/><category term='Hadoop 0.19.2'/><category term='HBase'/><category term='data set'/><category term='noSQL'/><category term='map/reduce example'/><title type='text'>Cloud Computing and Hadoop</title><subtitle type='html'>The blog is to share the information, tutorial, tips etc about Cloud Computing in Map/Reduce, mostly Hadoop at this moment (Aug 2009)</subtitle><link rel='http://schemas.google.com/g/2005#feed' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/posts/default'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default?max-results=100'/><link rel='alternate' type='text/html' 
href='http://dal-cloudcomputing.blogspot.com/'/><link rel='hub' href='http://pubsubhubbub.appspot.com/'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><generator version='7.00' uri='http://www.blogger.com'>Blogger</generator><openSearch:totalResults>11</openSearch:totalResults><openSearch:startIndex>1</openSearch:startIndex><openSearch:itemsPerPage>100</openSearch:itemsPerPage><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-5941515122488518115</id><published>2011-08-01T15:51:00.000-07:00</published><updated>2011-08-01T15:59:38.327-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='flickr'/><category scheme='http://www.blogger.com/atom/ns#' term='data set'/><category scheme='http://www.blogger.com/atom/ns#' term='big data'/><category scheme='http://www.blogger.com/atom/ns#' term='census'/><category scheme='http://www.blogger.com/atom/ns#' term='tiger/line'/><category scheme='http://www.blogger.com/atom/ns#' term='digg'/><title type='text'>Data Sets</title><content type='html'>1. TIGER/Line dataset (&lt;a href="http://www.census.gov/geo/www/tiger/"&gt;http://www.census.gov/geo/www/tiger/&lt;/a&gt;)&lt;br /&gt;2.2 million California roads in the TIGER/Line dataset, which is widely used in spatial database research.&lt;br /&gt;&lt;br /&gt;2. &lt;a style="TEXT-DECORATION: none" href="http://www.isi.edu/integration/people/lerman/index.html"&gt;Kristina Lerman&lt;/a&gt; at USC ISI&lt;br /&gt;&lt;a href="http://www.isi.edu/integration/people/lerman/downloads.html"&gt;http://www.isi.edu/integration/people/lerman/downloads.html&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;a. Digg 2009&lt;br /&gt;This anonymized data set consists of the voting records for 3553 stories promoted to the front page over a period of a month in 2009. 
The voting record for each story contains the ID of the voter and the time stamp of the vote. In addition, data about friendship links of voters was collected from Digg.&lt;br /&gt;&lt;br /&gt;&lt;a style="COLOR: rgb(93,0,1); BACKGROUND-COLOR: transparent; TEXT-DECORATION: none" href="http://www.isi.edu/integration/people/lerman/load.html?src=http://www.isi.edu/~lerman/downloads/digg2009.html" target="_blank"&gt;Download Digg 2009 data set&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;b. Flickr personal taxonomies&lt;br /&gt;This anonymized data set contains personal taxonomies constructed by 7,000+ Flickr users to organize their photos, as well as the tags they associated with the photos. Personal taxonomies are shallow hierarchies (trees) containing collections and their constituent sets (aka photo-albums) and collections.&lt;br /&gt;&lt;br /&gt;&lt;a style="COLOR: rgb(93,0,1); BACKGROUND-COLOR: transparent; TEXT-DECORATION: none" href="http://www.isi.edu/integration/people/lerman/load.html?src=http://www.isi.edu/~lerman/downloads/flickr/flickr_taxonomies.html" target="_blank"&gt;Download Flickr data set&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;c. Wrapper maintenance&lt;br /&gt;Wrappers facilitate access to Web-based information sources by providing a uniform querying and data extraction capability. When a wrapper stops working due to changes in the layout of web pages, our task is to automatically reinduce the wrapper. 
The data sets used for experiments in our JAIR 2003 paper contain web pages downloaded from two dozen sources over a period of a year.&lt;br /&gt;&lt;br /&gt;&lt;a style="COLOR: rgb(93,0,1); BACKGROUND-COLOR: transparent; TEXT-DECORATION: none" href="http://www.isi.edu/integration/people/lerman/load.html?src=http://www.isi.edu/~lerman/projects/reinduction/index.html" target="_blank"&gt;Data set&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-5941515122488518115?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/5941515122488518115/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/08/data-sets.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/5941515122488518115'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/5941515122488518115'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/08/data-sets.html' title='Data Sets'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-8638502470696310853</id><published>2011-07-27T15:23:00.000-07:00</published><updated>2011-07-27T16:03:01.471-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='map reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='no-SQL DB'/><category 
scheme='http://www.blogger.com/atom/ns#' term='HBase'/><category scheme='http://www.blogger.com/atom/ns#' term='noSQL'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Market Basket Analysis Algorithm with Map/Reduce and HBase on AWS EC2</title><content type='html'>Slides and papers on Hadoop MapReduce and HBase at PDPTA 2011 (http://www.world-academy-of-science.org/worldcomp11/ws/conferences/pdpta11) and at EDB 2011 (http://dke.khu.ac.kr/edb2011/)&lt;br /&gt;&lt;br /&gt;(1) (Submitted on June 30, 2011) “Market Basket Analysis Algorithm with no-SQL DB HBase and Hadoop”, Jongwook Woo, Siddharth Basopia, Yuhang Xu, Seon Ho Kim, The Third International Conference on Emerging Databases (EDB 2011), Songdo Park Hotel, Incheon, Korea, Aug. 25-27, 2011 (&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/2011/edb11.pdf"&gt;pdf &lt;/a&gt;- only pages 1 and 12)&lt;br /&gt;&lt;br /&gt;(2) “Market Basket Analysis Algorithm with Map/Reduce of Cloud Computing”, Jongwook Woo and Yuhang Xu, The 2011 International Conference on Parallel and Distributed Processing Techniques and Applications (PDPTA 2011), Las Vegas (July 18-21, 2011) (&lt;a href="http://www.slideshare.net/dalgual/mba-pdpta11-8706980"&gt;Slide&lt;/a&gt;, &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/2011/marketPDPTA11.pdf"&gt;pdf&lt;/a&gt;)&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-8638502470696310853?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/8638502470696310853/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/07/market-basket-analysis-algorithm-with.html#comment-form' title='0 Comments'/><link rel='edit' 
type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8638502470696310853'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8638502470696310853'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/07/market-basket-analysis-algorithm-with.html' title='Market Basket Analysis Algorithm with Map/Reduce and HBase on AWS EC2'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-4729811812092951373</id><published>2011-06-21T16:49:00.000-07:00</published><updated>2011-06-23T09:53:14.025-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Whirr'/><category scheme='http://www.blogger.com/atom/ns#' term='ec2'/><category scheme='http://www.blogger.com/atom/ns#' term='Amazon'/><category scheme='http://www.blogger.com/atom/ns#' term='HBase'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><category scheme='http://www.blogger.com/atom/ns#' term='AWS'/><title type='text'>How to set up Hadoop and HBase together with Whirr on Amazon EC2</title><content type='html'>It is not easy to set up both Hadoop and HBase on EC2 at the same time. This is to illustrate how to set them up together with Apache Incubator project Whirr. 
It also describes how to log in to the master node so that you can run your Hadoop code with HBase data on the node remotely.&lt;br /&gt;&lt;br /&gt;References&lt;br /&gt;&lt;br /&gt;[1] Phil Whelan, http://www.philwhln.com/run-the-latest-whirr-and-deploy-hbase-in-minutes&lt;br /&gt;[2] http://incubator.apache.org/whirr/quick-start-guide.html&lt;br /&gt;[3] http://incubator.apache.org/whirr/whirr-in-5-minutes.html&lt;br /&gt;[4] http://stackoverflow.com/questions/5113217/installing-hbase-hadoop-on-ec2-cluster&lt;br /&gt;[5] http://www.philwhln.com/map-reduce-with-ruby-using-hadoop&lt;br /&gt;[5.1] http://www.cloudera.com/blog/2011/01/map-reduce-with-ruby-using-apache-hadoop/&lt;br /&gt;&lt;br /&gt;********************** Install Hadoop/HBase on Whirr [1] on Ubuntu 10.04 ********************** &lt;br /&gt;NOTE: install JDK 1.6, not just the JRE&lt;br /&gt;1) mvn clean install&lt;br /&gt;First run: install error (hbsql not found)&lt;br /&gt;Second run: completed successfully&lt;br /&gt;&lt;br /&gt;2) Set the following:&lt;br /&gt;export AWS_ACCESS_KEY_ID=xxxxxxxxxxxxxxxxxxxxxx&lt;br /&gt;export AWS_SECRET_ACCESS_KEY=xxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxxx&lt;br /&gt;&lt;br /&gt;2.a) 5 min test of Whirr [3]&lt;br /&gt;ssh-keygen -t rsa -P '' -f ~/.ssh/id_rsa_whirr&lt;br /&gt;bin/whirr launch-cluster --config recipes/zookeeper-ec2.properties --private-key-file ~/.ssh/id_rsa_whirr&lt;br /&gt;&lt;br /&gt;echo "ruok" | nc $(awk '{print $3}' ~/.whirr/zookeeper/instances | head -1) 2181; echo&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;2.b) bin/whirr destroy-cluster --config recipes/zookeeper-ec2.properties&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;3) jongwook@localhost:~/whirr$ bin/whirr launch-cluster --config hbase-ec2.properties &lt;br /&gt;&lt;br /&gt;3.a) Exception in thread "main" org.apache.commons.configuration.ConfigurationException: Invalid key pair: (/home/jongwook/.ssh/id_rsa, /home/jongwook/.ssh/id_rsa.pub)&lt;br /&gt;&lt;br /&gt;Solution)&lt;br /&gt;ssh-keygen -t rsa -P 
''&lt;br /&gt;&lt;br /&gt;4) You will see the following for about 5 min&lt;br /&gt;&lt;&lt; (ubuntu@184.72.173.143:22) error acquiring Session(ubuntu@184.72.173.143:22): Session.connect: java.io.IOException: End of IO Stream Read&lt;br /&gt;&lt;&lt; bad status -1 ExecResponse(ubuntu@50.17.19.46:22)[./setup-jongwook status]&lt;br /&gt;&lt;&lt; bad status -1 ExecResponse(ubuntu@174.129.131.50:22)[./setup-jongwook status]&lt;br /&gt;&lt;br /&gt;5) Then an hbase folder containing shell and XML files is generated under '.whirr'&lt;br /&gt;jongwook@localhost:~/whirr$ ls -al /home/jongwook/.whirr/&lt;br /&gt;total 12&lt;br /&gt;drwxr-xr-x  3 jongwook jongwook 4096 2011-06-17 16:19 .&lt;br /&gt;drwxr-xr-x 46 jongwook jongwook 4096 2011-06-17 16:09 ..&lt;br /&gt;drwxr-xr-x  2 jongwook jongwook 4096 2011-06-17 16:19 hbase&lt;br /&gt;&lt;br /&gt;6) Set up a proxy server at System &gt; Preferences &gt; Network Proxy [5, 5.1]&lt;br /&gt;Mark SOCKS Proxy&lt;br /&gt;Proxy Server: localhost&lt;br /&gt;port: 6666&lt;br /&gt;&lt;br /&gt;6.a) In another terminal (term2), set up the Hadoop environment:&lt;br /&gt;jongwook@localhost:~/whirr$ source ~/Documents/setupHadoop0.20.2.sh &lt;br /&gt;&lt;br /&gt;6.b) In the same terminal (term2), run the Hadoop proxy to connect the external and internal clusters&lt;br /&gt;jongwook@localhost:~/whirr$ sh ~/.whirr/hbase/hadoop-proxy.sh &lt;br /&gt;Running proxy to Hadoop cluster at ec2-184-xx-xxx-0.compute-1.amazonaws.com. Use Ctrl-c to quit.&lt;br /&gt;Warning: Permanently added 'ec2-184-xx-xxx-0.compute-1.amazonaws.com,184.72.152.0' (RSA) to the list of known hosts.&lt;br /&gt;&lt;br /&gt;7) Run a sample hadoop shell command in the original terminal (term1)&lt;br /&gt;11/06/17 17:03:20 WARN conf.Configuration: DEPRECATED: hadoop-site.xml found in the classpath. Usage of hadoop-site.xml is deprecated. 
Instead use core-site.xml, mapred-site.xml and hdfs-site.xml to override properties of core-default.xml, mapred-default.xml and hdfs-default.xml respectively&lt;br /&gt;Found 4 items&lt;br /&gt;drwxr-xr-x   - hadoop supergroup          0 2011-06-17 16:19 /hadoop&lt;br /&gt;drwxr-xr-x   - hadoop supergroup          0 2011-06-17 16:19 /hbase&lt;br /&gt;drwxrwxrwx   - hadoop supergroup          0 2011-06-17 16:18 /tmp&lt;br /&gt;drwxrwxrwx   - hadoop supergroup          0 2011-06-17 16:18 /user&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;8) In another terminal (term3), set up the Hadoop environment:&lt;br /&gt;jongwook@localhost:~/whirr$ source ~/Documents/setupHadoop0.20.2.sh &lt;br /&gt;&lt;br /&gt;8.a) In the same terminal (term3), run the HBase proxy to connect the external and internal clusters. NOTE: close the Hadoop proxy first, because both proxies use port 6666&lt;br /&gt;jongwook@localhost:~/whirr$ sh ~/.whirr/hbase/hbase-proxy.sh &lt;br /&gt;Running proxy to HBase cluster at ec2-184-72-152-0.compute-1.amazonaws.com. 
Use Ctrl-c to quit.&lt;br /&gt;Warning: Permanently added 'ec2-184-72-152-0.compute-1.amazonaws.com,184.xx.xxx.0' (RSA) to the list of known hosts.&lt;br /&gt;&lt;br /&gt;9) Log in to the master to run Hadoop code with HBase data; the user name is your local login, e.g., jongwook for me.&lt;br /&gt;jongwook@localhost:~/whirr$ ssh -i /home/jongwook/.ssh/id_rsa jongwook@ec2-75-xx-xx-xx.compute-1.amazonaws.com &lt;br /&gt;OR&lt;br /&gt;jongwook@localhost:~/whirr$ ssh -i /home/jongwook/.ec2/id_rsa-dal_keypair jongwook@ec2-75-xx-xx-x.compute-1.amazonaws.com &lt;br /&gt;&lt;br /&gt;10) Now run the Hadoop pi demo:&lt;br /&gt;[root@ip-10-116-94-104 ~]# cd /usr/local/hadoop-0.20.2/&lt;br /&gt;[root@ip-10-116-94-104 hadoop-0.20.2]# bin/hadoop jar hadoop-0.20.2-examples.jar pi 20 1000&lt;br /&gt;&lt;br /&gt;11) Set up PATH and CLASSPATH to run HBase and Hadoop code&lt;br /&gt;export HADOOP_HOME=/usr/local/hadoop-0.20.2&lt;br /&gt;export HBASE_HOME=/usr/local/hbase-0.89.20100924&lt;br /&gt;export PATH=$HADOOP_HOME/bin:$HBASE_HOME/bin:$PATH&lt;br /&gt;&lt;br /&gt;# CLASSPATH for HADOOP&lt;br /&gt;export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-core.jar:$HADOOP_HOME/hadoop-0.20.2-ant.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-examples.jar:$HADOOP_HOME/hadoop-0.20.2-test.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HADOOP_HOME/hadoop-0.20.2-tools.jar:$CLASSPATH&lt;br /&gt;#export CLASSPATH=$HADOOP_HOME/commons-logging-1.0.4.jar:$HADOOP_HOME/commons-logging-api-1.0.4.jar:$CLASSPATH&lt;br /&gt;&lt;br /&gt;# CLASSPATH for HBASE&lt;br /&gt;export CLASSPATH=$HBASE_HOME/hbase-0.89.20100924.jar:$HBASE_HOME/lib/zookeeper-3.3.1.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HBASE_HOME/lib/commons-logging-1.1.1.jar:$HBASE_HOME/lib/avro-1.3.2.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HBASE_HOME/lib/log4j-1.2.15.jar:$HBASE_HOME/lib/commons-cli-1.2.jar:$CLASSPATH&lt;br /&gt;export 
CLASSPATH=$HBASE_HOME/lib/jackson-core-asl-1.5.2.jar:$HBASE_HOME/lib/jackson-mapper-asl-1.5.2.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HBASE_HOME/lib/commons-httpclient-3.1.jar:$HBASE_HOME/lib/jetty-6.1.24.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HBASE_HOME/lib/jetty-util-6.1.24.jar:$HBASE_HOME/lib/hadoop-core-0.20.3-append-r964955-1240.jar:$CLASSPATH&lt;br /&gt;export CLASSPATH=$HBASE_HOME/lib/hbase-0.89.20100924.jar:$HBASE_HOME/lib/hsqldb-1.8.0.10.jar:$CLASSPATH&lt;br /&gt;&lt;br /&gt;12) Run HBase demo:&lt;br /&gt;jongwook@ip-10-xx-xx-xx:/usr/local$ cd hbase-0.89.20100924/&lt;br /&gt;jongwook@ip-10-xx-xx-xx:/usr/local/hbase-0.89.20100924$ ls&lt;br /&gt;bin  CHANGES.txt  conf docs  hbase-0.89.20100924.jar  hbase-webapps  lib  LICENSE.txt NOTICE.txt  README.txt&lt;br /&gt;jongwook@ip-10-108-155-6:/usr/local/hbase-0.89.20100924$ bin/hbase shell&lt;br /&gt;HBase Shell; enter 'help&lt;RETURN&gt;' for list of supported commands.&lt;br /&gt;Type "exit&lt;RETURN&gt;" to leave the HBase Shell&lt;br /&gt;Version: 0.89.20100924, r1001068, Tue Oct  5 12:12:44 PDT 2010&lt;br /&gt;&lt;br /&gt;hbase(main):001:0&gt; status 'simple'&lt;br /&gt;5 live servers&lt;br /&gt;    ip-10-71-70-182.ec2.internal:60020 1308520337148&lt;br /&gt;        requests=0, regions=1, usedHeap=158, maxHeap=1974&lt;br /&gt;    domU-12-31-39-0F-B5-21.compute-1.internal:60020 1308520337138&lt;br /&gt;        requests=0, regions=0, usedHeap=104, maxHeap=1974&lt;br /&gt;    domU-12-31-39-0B-90-11.compute-1.internal:60020 1308520336780&lt;br /&gt;        requests=0, regions=0, usedHeap=104, maxHeap=1974&lt;br /&gt;    domU-12-31-39-0B-C1-91.compute-1.internal:60020 1308520336747&lt;br /&gt;        requests=0, regions=1, usedHeap=158, maxHeap=1974&lt;br /&gt;    ip-10-108-250-193.ec2.internal:60020 1308520336863&lt;br /&gt;        requests=0, regions=0, usedHeap=102, maxHeap=1974&lt;br /&gt;0 dead servers&lt;br /&gt;Aggregate load: 0, regions: 2&lt;div class="blogger-post-footer"&gt;&lt;img width='1' 
height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-4729811812092951373?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/4729811812092951373/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/06/how-to-set-up-hadoop-and-hbase-together.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/4729811812092951373'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/4729811812092951373'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/06/how-to-set-up-hadoop-and-hbase-together.html' title='How to set up Hadoop and HBase together with Whirr on Amazon EC2'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-9023110475913588110</id><published>2011-03-29T12:07:00.001-07:00</published><updated>2011-03-30T11:04:13.892-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Machine Learning'/><category scheme='http://www.blogger.com/atom/ns#' term='Business Intelligence'/><category scheme='http://www.blogger.com/atom/ns#' term='map reduce example'/><category scheme='http://www.blogger.com/atom/ns#' term='Market Basket Analysis'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop example'/><category scheme='http://www.blogger.com/atom/ns#' term='Data Mining'/><category scheme='http://www.blogger.com/atom/ns#' 
term='hadoop'/><title type='text'>Market Basket Analysis Example in Hadoop</title><content type='html'>Market Basket Analysis is one of the important approaches to analyzing associations in Data Mining. The basic idea is to find the associated pairs of items in a store when there are huge volumes of transaction data as follows:&lt;br /&gt;trax1: cracker, icecream, beer&lt;br /&gt;trax2: chicken, pizza, coke, bread&lt;br /&gt;...&lt;br /&gt;&lt;br /&gt;The following is the example code that I implemented on Hadoop 0.21.0, which takes the input "AssociationsSP.txt" and generates the top 10 associated items that customers purchased together. After I complete a conference paper based on this example code, I will post more details.&lt;br /&gt;&lt;br /&gt;Download&lt;br /&gt;- &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/market/ItemCount.java"&gt;ItemCount.java&lt;/a&gt; source file, to give an idea of what the code looks like&lt;br /&gt;- &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/market/cloud9-csulaud-0.1.jar"&gt;cloud9-csulaud-0.1.jar&lt;/a&gt; file to execute the code&lt;br /&gt;- &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/market/AssociationsSP.txt"&gt;AssociationsSP.txt&lt;/a&gt; input file&lt;br /&gt;- &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/market/itemscount_sort2.txt"&gt;itemscount_sort2.txt&lt;/a&gt; and &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/market/itemscount_sort4.txt"&gt;itemscount_sort4.txt&lt;/a&gt; sample outputs for two- and four-item associations&lt;br /&gt;&lt;br /&gt;(1) Create a directory "data" on HDFS and upload the input file to it:&lt;br /&gt;&gt; hadoop fs -mkdir data &lt;br /&gt;&gt; hadoop fs -put AssociationsSP.txt data/&lt;br /&gt;&lt;br /&gt;(2) Type in and run the example code (output dir: itemcount, 5 
reducers, 2 pairs of association): &lt;br /&gt;&gt; hadoop jar cloud9-csulaud-0.1.jar edu.calstatela.hadoop.example.associations.ItemCount data/AssociationsSP.txt itemcount 5 2&lt;br /&gt;&lt;br /&gt;(3) Type in the following to see the analysis:&lt;br /&gt;&gt; hadoop jar cloud9-csulaud-0.1.jar edu.calstatela.hadoop.utils.analysis.AnalyzeInputCount itemcount&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-9023110475913588110?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/9023110475913588110/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/03/market-basket-analysis-example-in.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/9023110475913588110'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/9023110475913588110'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/03/market-basket-analysis-example-in.html' title='Market Basket Analysis Example in Hadoop'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-7769527726114962591</id><published>2011-03-24T12:11:00.000-07:00</published><updated>2011-03-24T12:11:02.951-07:00</updated><title type='text'>Tom White: Learning MapReduce</title><content type='html'>&lt;a 
href="http://www.lexemetech.com/2008/03/learning-mapreduce.html"&gt;Tom White: Learning MapReduce&lt;/a&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-7769527726114962591?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='related' href='http://www.lexemetech.com/2008/03/learning-mapreduce.html' title='Tom White: Learning MapReduce'/><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/7769527726114962591/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/03/tom-white-learning-mapreduce.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/7769527726114962591'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/7769527726114962591'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/03/tom-white-learning-mapreduce.html' title='Tom White: Learning MapReduce'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-8027208182515979395</id><published>2011-02-26T19:27:00.000-08:00</published><updated>2011-02-26T19:48:02.591-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='ubuntu 8.10'/><category scheme='http://www.blogger.com/atom/ns#' term='map reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop 0.19.2'/><category 
scheme='http://www.blogger.com/atom/ns#' term='Eclipse 3.3.2'/><title type='text'>set up Hadoop 0.19.2 on Eclipse 3.3.2 for Ubuntu 8.10</title><content type='html'>&lt;font style="font-weight: bold;"&gt;a.&lt;/font&gt;&lt;font style="font-weight: bold;"&gt; &lt;/font&gt;&lt;font&gt;Refer to: http://ebiquity.umbc.edu/Tutorials/Hadoop/00%20-%20Intro.html&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;b.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;Download Eclipse 3.3.2 Europa: http://www.eclipse.org/downloads/packages/release/europa/winter&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;c.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;Download Hadoop 0.19.2:  http://apache.osuosl.org//hadoop/core/hadoop-0.19.2/&lt;br /&gt;&lt;br /&gt;Set up Hadoop, including the following hadoop-site.xml: http://hadoop.apache.org/common/docs/r0.19.2/quickstart.html#Local&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;/font&gt;&lt;font&gt;&amp;lt;configuration&amp;gt;&lt;br /&gt;&amp;lt;property&amp;gt;&lt;br /&gt;   &amp;lt;name&amp;gt;fs.default.name&amp;lt;/name&amp;gt;&lt;br /&gt;   &amp;lt;value&amp;gt;hdfs://localhost:9000/&amp;lt;/value&amp;gt;&lt;br /&gt; &amp;lt;/property&amp;gt;&lt;br /&gt; &amp;lt;property&amp;gt;&lt;br /&gt;   &amp;lt;name&amp;gt;mapred.job.tracker&amp;lt;/name&amp;gt;&lt;br /&gt;   &amp;lt;value&amp;gt;localhost:9001&amp;lt;/value&amp;gt;&lt;br /&gt; &amp;lt;/property&amp;gt;&lt;br /&gt; &amp;lt;property&amp;gt;&lt;br /&gt;   &amp;lt;name&amp;gt;dfs.replication&amp;lt;/name&amp;gt;&lt;br /&gt;   &amp;lt;value&amp;gt;1&amp;lt;/value&amp;gt;&lt;br /&gt; &amp;lt;/property&amp;gt;&lt;br /&gt;&amp;lt;/configuration&amp;gt;&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;d.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;cp 
[yourpath]/hadoop-0.19.2/contrib/eclipse-plugin/hadoop-0.19.2-eclipse-plugin.jar [yourpath]/eclipse/plugins/&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;e.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;Start Eclipse: you can launch it from the "File Browser" of Ubuntu&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;f.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;Window &gt; open perspective &gt; other &gt; map/reduce&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;g.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;To start Hadoop, open 5 terminals and type in the following, one daemon per terminal&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop namenode -format&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop namenode&lt;br /&gt;&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop secondarynamenode&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop jobtracker&lt;br /&gt;&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop datanode&lt;br /&gt;Issue: if you see the error "Unexpected version of storage directory /tmp/hadoop-jongwook/dfs/data", remove the "data" folder named in the error&lt;br /&gt;&lt;br /&gt;~/apache/hadoop-0.19.2/bin/hadoop tasktracker&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;h.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;At eclipse: http://ebiquity.umbc.edu/Tutorials/Hadoop/17%20-%20set%20up%20hadoop%20location%20in%20the%20eclipse.html&lt;br /&gt;The new Hadoop location uses the following values, as defined in hadoop-site.xml:&lt;br /&gt;map/reduce master: localhost:9001&lt;br /&gt;DFS master: localhost:9000&lt;br /&gt;user name: jongwook&lt;br /&gt;mapred.job.tracker: localhost:9001&lt;br /&gt;&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;font style="font-weight: bold;"&gt;i.&lt;/font&gt; &lt;/font&gt;&lt;font&gt;How to run a Hadoop 
example in Eclipse&lt;br /&gt;Refer to: &lt;a href="http://dal-cloudcomputing.blogspot.com/2009/08/hadoop-example-mymaxtemperaturewithcomb.html"&gt;http://dal-cloudcomputing.blogspot.com/2009/08/hadoop-example-mymaxtemperaturewithcomb.html&lt;/a&gt;&lt;br /&gt;- Create (or import) the Hadoop example shown in the above blog post as a Hadoop project in Eclipse: File &gt; New &gt; Map/Reduce Project&lt;br /&gt;- 1901 does not exist, so use only 1902&lt;br /&gt;- Open the Hadoop project perspective view in Eclipse&lt;br /&gt; - You can create DFS folders or upload files to DFS from this view&lt;br /&gt;&lt;/font&gt;&lt;font style="font-weight: bold;"&gt;&lt;br /&gt;&lt;/font&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-8027208182515979395?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/8027208182515979395/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/02/set-up-hadoop-0192-on-eclipse-332-for.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8027208182515979395'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8027208182515979395'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/02/set-up-hadoop-0192-on-eclipse-332-for.html' title='set up Hadoop 0.19.2 on Eclipse 3.3.2 for Ubuntu 8.10'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-8295544446564462896</id><published>2011-02-25T12:56:00.000-08:00</published><updated>2011-02-25T16:57:35.775-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='map/reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Column Oriented DB'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='no-SQL DB'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>The Technical Demand of Cloud Computing (no-SQL DB, Map/Reduce Hadoop)</title><content type='html'>&lt;span style="font-weight:bold;"&gt;&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/mapReduceKosen.pdf"&gt;The Technical Demand of Cloud Computing&lt;/a&gt;&lt;/span&gt; (in Korean)&lt;br /&gt;Technical report supported by a grant from &lt;a href="http://www.kisti.re.kr/english/index.jsp"&gt;KISTI &lt;/a&gt;(Korea Institute of Science and Technology Information, 한국과학기술정보연구원)&lt;br /&gt;by Jongwook Woo, California State University, Los Angeles&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-8295544446564462896?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/8295544446564462896/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/02/technical-demand-of-cloud-computing.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8295544446564462896'/><link rel='self' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1481083534869484444/posts/default/8295544446564462896'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2011/02/technical-demand-of-cloud-computing.html' title='The Technical Demand of Cloud Computing (no-SQL DB, Map/Reduce Hadoop)'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-1364237680507309810</id><published>2010-01-28T11:55:00.000-08:00</published><updated>2010-01-28T12:48:42.548-08:00</updated><title type='text'>How to clone a Github Repository</title><content type='html'>I want to create a backup repository that includes all git history before I restart a project. We need to restructure our project, so we want a separate repository that preserves the old code and its history while the new code is developed in the existing repository.&lt;br /&gt;&lt;br /&gt;That is,&lt;br /&gt;&lt;br /&gt;a. Local PC: the source code is version-controlled with git@github.com:myaccount/current_project.git&lt;br /&gt;&lt;br /&gt;b. GitHub server: create a new repo named https://github.com/myaccount/backup_project&lt;br /&gt;&lt;br /&gt;c. Push the local PC's source code to the new repo named https://github.com/myaccount/backup_project&lt;br /&gt;&lt;br /&gt;d. 
Keep the old repository: git@github.com:myaccount/current_project.git&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;The steps below assume you have a GitHub account.&lt;br /&gt;(1) Create a new repository for the backup&lt;br /&gt;Go to http://github.com/repositories/new&lt;br /&gt;&lt;br /&gt;(2) Fill out the text boxes on the page, including the URL for the new repository&lt;br /&gt;For example, the URL should be&lt;br /&gt;https://github.com/myaccount/backup_project&lt;br /&gt;&lt;br /&gt;(3) Make sure you have a local repository for the current project. If you do not, create one:&lt;br /&gt;git clone git@github.com:myaccount/current_project.git current_project_01282010&lt;br /&gt;&lt;br /&gt;(4) Then&lt;br /&gt;cd current_project_01282010&lt;br /&gt;git remote add origin git@github.com:myaccount/backup_project.git&lt;br /&gt;&lt;br /&gt;(5) If you get the following error:&lt;br /&gt;fatal: remote origin already exists.&lt;br /&gt;&lt;br /&gt;cd .git&lt;br /&gt;vi config&lt;br /&gt;&lt;br /&gt;Then change the url of the origin remote from "git@github.com:myaccount/current_project.git" to "git@github.com:myaccount/backup_project.git"&lt;br /&gt;&lt;br /&gt;(6) At github.com/myaccount/backup_project.git, don't forget to add your local account as a Repository Collaborator.&lt;br /&gt;&lt;br /&gt;(7) Push your local repository to git@github.com:myaccount/backup_project.git&lt;br /&gt;git push origin master&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-1364237680507309810?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/1364237680507309810/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2010/01/how-to-clone-github-repository.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' 
href='http://www.blogger.com/feeds/1481083534869484444/posts/default/1364237680507309810'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/1364237680507309810'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2010/01/how-to-clone-github-repository.html' title='How to clone a Github Repository'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-435914082737793119</id><published>2009-12-21T15:46:00.000-08:00</published><updated>2009-12-21T15:59:20.526-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='map/reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='amazon ec2 and s3'/><category scheme='http://www.blogger.com/atom/ns#' term='reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Introduction to Cloud Computing</title><content type='html'>Jongwook Woo, California State University, Los Angeles &lt;a href="http://www.calstatela.edu/faculty/jwoo5/publications/2009/cloudComputingKOCSEA09.pdf"&gt;&lt;br /&gt;   Introduction to Cloud Computing&lt;/a&gt;, the 10th KOCSEA 2009 Symposium, UNLV, &lt;br /&gt;   Dec 18-19, 2009&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-435914082737793119?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/435914082737793119/comments/default' title='Post Comments'/><link 
rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/12/introduction-to-cloud-computing.html#comment-form' title='1 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/435914082737793119'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/435914082737793119'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/12/introduction-to-cloud-computing.html' title='Introduction to Cloud Computing'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>1</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-4850037683300915440</id><published>2009-08-11T20:07:00.000-07:00</published><updated>2009-08-11T20:09:11.004-07:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop Eclipse'/><category scheme='http://www.blogger.com/atom/ns#' term='map/reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop Windows'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop tutorial'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop example'/><category scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Set up Hadoop in Eclipse</title><content type='html'>&lt;p class="style1"&gt;&lt;strong&gt;Set up Hadoop in Eclipse&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://ebiquity.umbc.edu/Tutorials/Hadoop/00%20-%20Intro.html"&gt;&lt;br /&gt;Hadoop on Windows with Eclipse&lt;/a&gt; &lt;br /&gt;&lt;/p&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' 
src='https://blogger.googleusercontent.com/tracker/1481083534869484444-4850037683300915440?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/4850037683300915440/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/08/set-up-hadoop-in-eclipse.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/4850037683300915440'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/4850037683300915440'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/08/set-up-hadoop-in-eclipse.html' title='Set up Hadoop in Eclipse'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry><entry><id>tag:blogger.com,1999:blog-1481083534869484444.post-5502007421119957712</id><published>2009-08-11T19:59:00.000-07:00</published><updated>2011-02-25T16:50:40.160-08:00</updated><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop Eclipse'/><category scheme='http://www.blogger.com/atom/ns#' term='map reduce example'/><category scheme='http://www.blogger.com/atom/ns#' term='map/reduce example'/><category scheme='http://www.blogger.com/atom/ns#' term='map/reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='map reduce'/><category scheme='http://www.blogger.com/atom/ns#' term='Hadoop Windows'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop tutorial'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop example'/><category 
scheme='http://www.blogger.com/atom/ns#' term='cloud computing'/><category scheme='http://www.blogger.com/atom/ns#' term='hadoop'/><title type='text'>Hadoop Example: MyMaxTemperatureWithCombiner</title><content type='html'>&lt;p class="body1"&gt;&lt;br /&gt;&lt;/p&gt;&lt;p class="style1"&gt;&lt;ol&gt;&lt;li&gt;&lt;p class="style1"&gt;&lt;strong&gt;Set up Hadoop in Eclipse&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://ebiquity.umbc.edu/Tutorials/Hadoop/00%20-%20Intro.html"&gt;Hadoop on Windows with Eclipse&lt;/a&gt;&lt;br /&gt;&lt;/p&gt;&lt;/li&gt;&lt;li&gt;&lt;p class="style1"&gt;&lt;strong&gt;Hadoop Example&lt;/strong&gt;&lt;br /&gt;&lt;br /&gt;&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/MyMaxTemperatureWithCombiner.java"&gt;MyMaxTemperatureWithCombiner.java&lt;/a&gt;,&lt;br /&gt;&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/MaxTemperatureMapper.java"&gt;MaxTemperatureMapper.java&lt;/a&gt;,&lt;br /&gt;&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/MaxTemperatureReducer.java"&gt;MaxTemperatureReducer.java&lt;/a&gt;&lt;br /&gt;&lt;br /&gt;&lt;br /&gt;&lt;strong&gt;How to run the example code:&lt;br /&gt;&lt;/strong&gt;&lt;br /&gt;&lt;/p&gt;&lt;/li&gt;&lt;ol&gt;&lt;br /&gt;&lt;li&gt;Set up Hadoop as shown above (Set up Hadoop in Eclipse) &lt;/li&gt;&lt;br /&gt;&lt;li&gt;Make a directory named "tempIn" in your HDFS:&lt;br /&gt;&lt;span class="style2"&gt;&lt;br /&gt;&lt;em&gt;bin/hadoop fs -mkdir tempIn&lt;br /&gt;&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Copy the input files &lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/1901"&gt;1901&lt;/a&gt; and&lt;br /&gt;&lt;a href="http://www.calstatela.edu/centers/hipic/contents/research/cloudComputing/files/1902"&gt;1902&lt;/a&gt; to your HDFS:&lt;br /&gt;&lt;br /&gt;&lt;span class="style2"&gt;&lt;em&gt;bin/hadoop 
fs -put 1901 tempIn/&lt;/em&gt;&lt;/span&gt;&lt;br /&gt;&lt;br /&gt;&lt;span class="style2"&gt;&lt;em&gt;bin/hadoop fs -put 1902 tempIn/&lt;/em&gt;&lt;/span&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In the Eclipse IDE, import the three Java files above into a package named "edu.calstatela.hipic.hadoop.util"&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;Start the Hadoop cluster as shown above (Set up Hadoop in Eclipse)&lt;br /&gt;&lt;/li&gt;&lt;br /&gt;&lt;li&gt;In the Eclipse IDE, right-click MyMaxTemperatureWithCombiner.java and choose "Run as" &amp;gt; "Run Hadoop Application"&lt;/li&gt;&lt;br /&gt;&lt;li&gt;You will see the map/reduce results in the HDFS folder "tempOut" under DFS Location in the Eclipse IDE&lt;/li&gt;&lt;/ol&gt;&lt;/ol&gt;&lt;div class="blogger-post-footer"&gt;&lt;img width='1' height='1' src='https://blogger.googleusercontent.com/tracker/1481083534869484444-5502007421119957712?l=dal-cloudcomputing.blogspot.com' alt='' /&gt;&lt;/div&gt;</content><link rel='replies' type='application/atom+xml' href='http://dal-cloudcomputing.blogspot.com/feeds/5502007421119957712/comments/default' title='Post Comments'/><link rel='replies' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/08/hadoop-example-mymaxtemperaturewithcomb.html#comment-form' title='0 Comments'/><link rel='edit' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/5502007421119957712'/><link rel='self' type='application/atom+xml' href='http://www.blogger.com/feeds/1481083534869484444/posts/default/5502007421119957712'/><link rel='alternate' type='text/html' href='http://dal-cloudcomputing.blogspot.com/2009/08/hadoop-example-mymaxtemperaturewithcomb.html' title='Hadoop Example: MyMaxTemperatureWithCombiner'/><author><name>Dalgual</name><uri>http://www.blogger.com/profile/02066647887262326056</uri><email>noreply@blogger.com</email><gd:image rel='http://schemas.google.com/g/2005#thumbnail' width='16' height='16' 
src='http://img2.blogblog.com/img/b16-rounded.gif'/></author><thr:total>0</thr:total></entry></feed>
