Tuesday, July 09, 2013

Solr on CDH4 using Cloudera Manager

Install Cloudera Manager on an AWS instance

Launch an AWS instance running CentOS 6.3 and ssh into it. Then download the Cloudera Manager 4.5 installer and run it on the remote instance:
$ wget http://archive.cloudera.com/cm4/installer/latest/cloudera-manager-installer.bin
$ chmod +x cloudera-manager-installer.bin
$ sudo ./cloudera-manager-installer.bin

(1) Follow the command-line installer, accepting the licenses.

(2) Open Cloudera Manager's Web UI to launch the services. The installer may show a local URL, but you can use the instance's public URL instead.

(3) All Services > Add Cluster > Continue on Cloudera Manager Express Wizard > CentOS 6.3; m1.xlarge; ...

Note: m1.large or larger is recommended.

Stop unnecessary services

Stop HBase, Hive, Oozie, Sqoop

All components are installed under '/usr/lib/'

For example, /usr/lib/solr, /usr/lib/zookeeper, ...

Optional: Activate Solr service

After launching Cloudera Manager and its instances as shown in whirr_cm [1], go to the 'Host' > 'Parcels' tab of Cloudera Manager's Web UI. There you can download the latest available CDH, Solr, and Impala parcels.
Download "SOLR 0.9.1-1.cdh4.3.0.p0.275" > Distribute > Activate > Restart the current Cluster

Optional: download CDH 4.3.0-1

Download "CDH 4.3.0-1.cdh4.3.0.p0.22" > Distribute > Activate > Restart the current Cluster
Note: Restarting the cluster will take several minutes

Add Solr service

Actions (of Cluster 1 - CDH4) > Add a Service > Solr > ZooKeeper as a dependency

Open the Hue Web UI

The default login/password is admin. You should see Solr listed; select it.

Update the Solr configuration on a ZooKeeper node

The Solr configuration file is '/etc/default/solr'. Open it for editing:
$ sudo vi /etc/default/solr
Note: it may not recognize 'localhost'; if so, use '127.0.0.1' instead.
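
The post does not show the edited contents, so the following is only a sketch of what the relevant lines might look like on a single-node setup. The variable names SOLR_ZK_ENSEMBLE, SOLR_HDFS_HOME and SOLR_HDFS_CONFIG come from the Cloudera Search Installation Guide [4]; the ports (2181 for ZooKeeper, 8020 for the NameNode) and the /solr paths are assumptions to adjust for your cluster:

```shell
# /etc/default/solr (sketch; hosts, ports and paths are assumptions)
# ZooKeeper ensemble and znode that Solr uses; 127.0.0.1 instead of localhost
SOLR_ZK_ENSEMBLE=127.0.0.1:2181/solr
# HDFS directory for Solr index data, matching the /solr directory in HDFS
SOLR_HDFS_HOME=hdfs://127.0.0.1:8020/solr
# Directory holding the Hadoop client configuration
SOLR_HDFS_CONFIG=/etc/hadoop/conf
```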

Create the /solr directory in HDFS:

$ sudo -u hdfs hadoop fs -mkdir /solr
$ sudo -u hdfs hadoop fs -chown solr /solr
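
To confirm that the directory exists and is owned by solr, a small check along these lines may help. The function name check_solr_hdfs_dir is hypothetical; it only wraps the standard 'hadoop fs -ls' listing:

```shell
# Hypothetical helper: succeeds if /solr shows up in the HDFS root listing.
check_solr_hdfs_dir() {
  # List the HDFS root as the hdfs user and keep only the /solr entry;
  # the third column of the matching line is the owner (expected: solr).
  sudo -u hdfs hadoop fs -ls / | grep ' /solr$'
}

# Usage (on a cluster node): check_solr_hdfs_dir && echo "/solr is present"
```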

Create a collection

Switch to the root account and initialize Solr's state in ZooKeeper. From here on, I run the shell commands as root.
$ sudo su
$ solrctl init
or, if ZooKeeper already holds Solr state from an earlier run (this erases that state):
$ solrctl init --force
Then restart the Solr service from Cloudera Manager's Web UI.
Run the following commands on a ZooKeeper node to create a collection:
$ solrctl instancedir --generate $HOME/solr_configs
$ solrctl instancedir --create collection $HOME/solr_configs
$ solrctl collection --create collection -s 1
While 'solrctl collection ...' is running, you can go to /var/log/solr and check that Solr is running without errors:
$ tail -f solr-cmf-solr1-SOLR_SERVER-ip-10-138-xx-xx.ec2.internal.log.out 
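
Once the create command returns, the solrctl listing subcommands offer another way to confirm that the configuration and the collection were registered. This is a minimal sketch; the wrapper name verify_collection is hypothetical, and both listings should include the name 'collection' used above:

```shell
# Hypothetical helper: list what solrctl has registered in ZooKeeper.
verify_collection() {
  # Config directories uploaded with 'solrctl instancedir --create'
  solrctl instancedir --list
  # Collections created with 'solrctl collection --create'
  solrctl collection --list
}

# Usage (as root on the node used above): verify_collection
```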
Upload example data to Solr:
$ cd /usr/share/doc/solr-doc-4.3.0+52/example/exampledocs/
$ java -Durl=http://127.0.0.1:8983/solr/collection/update -jar post.jar *.xml
SimplePostTool version 1.5
Posting files to base url http://127.0.0.1:8983/solr/collection/update using content-type application/xml..
POSTing file gb18030-example.xml
POSTing file hd.xml
POSTing file ipod_other.xml
...
POSTing file utf8-example.xml
POSTing file vidcard.xml
14 files indexed.
COMMITting Solr index changes to http://127.0.0.1:8983/solr/collection/update..
Time spent: 0:00:00.818
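
Before moving to Hue, the index can also be queried directly over HTTP. The sketch below assumes the same host, port and collection name as the post.jar command above and uses Solr's standard select handler; the function name solr_select_url is hypothetical:

```shell
# Hypothetical helper: build a select URL for the collection indexed above.
solr_select_url() {
  # $1 = query term; wt=json and indent=true ask Solr for readable JSON
  printf 'http://127.0.0.1:8983/solr/collection/select?q=%s&wt=json&indent=true\n' "$1"
}

# Usage (needs the running Solr server from this walkthrough):
# curl "$(solr_select_url photo)"
```

The term is inserted as-is, so a query containing spaces or special characters would need URL encoding first.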

Query using Hue Web UI

Open the Hue Web UI from Cloudera Manager's Hue service and select the Solr tab.
1. Make sure to import the collections; importing cores may not be needed.
2. Select the "Search page" link at the top right of the Solr web UI page.
3. By default, the page shows 1-15 of 32 results.
4. Type 'photo' in the search box; the page then shows 1-2 of 2 results.

Customize the view of Solr Web UI

Select 'Customize this collection' to open the Visual Editor for the view.
Note: you can see the same content from https://github.com/hipic/cdh-solr

References

[1]. http://github.com/hipic/whirr_cm

[2]. http://blog.cloudera.com/blog/2013/03/how-to-create-a-cdh-cluster-on-amazon-ec2-via-cloudera-manager/

[3]. Managing Clusters with Cloudera Manager, http://www.cloudera.com/content/cloudera-content/cloudera-docs/CM4Ent/latest/PDF/Managing-Clusters-with-Cloudera-Manager.pdf

[4]. Cloudera Search Installation Guide, http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-Installation-Guide

[5]. Cloudera Search User Guide, http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/Cloudera-Search-User-Guide

[6]. Cloudera Search Installation Guide (PDF), http://www.cloudera.com/content/cloudera-content/cloudera-docs/Search/latest/PDF/Cloudera-Search-Installation-Guide.pdf
