AssassinDrake.com




HAKKAPELIITTA!'s Guide to a Better Life


Volume 5: How to Build a Cluster


Group members

Brandon Crismon
John Fowler
J-P Mascarella
Matt Vince

1. Preparing Hardware

Choosing Hardware

When choosing hardware, keep in mind its overall cost and value. If you haven't already read Volume 4: How to choose a CPU, we suggest that you do so before deciding which CPU you will use. First, decide how many nodes you want; only then can you determine how much you can spend on each CPU.

If this is your first time, we suggest having four nodes. You should have no problem following this guide no matter what cluster size you choose, but we warn you that the more nodes you have, the more time and energy you will need to put into it. After all, the more nodes you have, the more chances there are for something to go wrong.

Tip:
To simplify things, you should try to make your nodes as similar as possible.

At this time you may also choose your network connection method. In this manual, we use Ethernet but you can choose any method you like as long as you are able to give static IPs to all the nodes. You also have your choice of video cards, but just about anything will do for our purposes (check your OS requirements).

Necessary Hardware

First, you must get hardware supported by the operating system you choose. The things you will need beyond what your operating system requires are mentioned here. You will need a network adapter for each node and one switch. The switch must have enough ports to connect all your nodes, and you will need enough cable to connect them all to it (if you need cable at all). Other things to consider are the mouse, keyboard and monitor. You will need at least one of each. We suggest having one of each for every node so that you don't need to constantly swap them around. Even if your operating system does not require both a CD-ROM drive and a floppy drive, we suggest you have both. They will be useful in later steps, especially if you are not connecting your cluster to the outside world.

Side Note:
Our cluster used:
4 Dell OptiPlex GX1p PCs (Pentium III 500 MHz, 256 MB RAM)
1 Ethernet Switch

Optional Hardware

If you plan to work by yourself, you might find it handy (and economical) to use a KVM switch instead of having a separate keyboard, mouse and monitor for each node. For our cluster, we had separate keyboards, monitors, and mice. Other than that, you might also want at least one set of speakers, or anything else you like.

Friendly Disclaimer:
You may want to return to this section if you haven't decided what operating system you want.

2. Installing an Operating System

Choosing an Operating System

At this point we assume you will be installing the exact same OS on all nodes. If this is not your intention, make sure you are capable of handling any problems that may arise.

Reminder Tip:
To simplify things, you should try to make your nodes as similar as possible.

When you choose an operating system, keep in mind what you plan to do with your cluster. If you haven't already read Volume 3: How to choose an OS, we suggest that you do so before deciding which you will use. In our tutorial, we use RedHat 9. If you plan to use a non-Linux OS, the rest of this manual may be of very little use from this point on. If your nodes came with a preinstalled OS, you may skip to part 3 now. If you plan on installing an OS other than RedHat, skip to 'Installing a Different OS' now.

Installing RedHat 9

For installing from CD: Place RedHat Install Disc 1 in the CD-ROM Drive and start the computer. If you don't start to see a screen like that in figure 2.1 followed by a prompt to test the CD-ROM drive, followed by the welcome screen in figure 2.2, then you should check your BIOS to make sure you can boot from CD. If your BIOS doesn't give the option to boot from CD, make a boot diskette following the directions in Red Hat Linux Installation Guide. You can also use the Red Hat Linux Installation Guide to install RedHat 9 for a more detailed installation description but you should also read here so you will know what you need to set up your cluster.

HAKKAPELIITTA! Boot
figure 2.1

Once you get to the welcome screen in figure 2.2, click Next. You should now see a list of available languages. Choose whatever language you understand best, or if you can't decide, choose English and click Next. In a similar manner, choose a keyboard configuration, click Next, then choose a mouse configuration and click Next. You may encounter an Upgrade Examine screen; if so, simply choose a new installation and click Next.

HAKKAPELIITTA! Running Install
figure 2.2

Now you should be asked to choose an Installation Type (see figure 2.3 below). If you want, you can choose a server type installation; in this tutorial, we choose Custom. Click Next. You will then be asked how to partition the hard disk. Choose to automatically partition and click Next. You may partition the drive manually if you like, but unless you know what you are doing, it's best to let the installer do it for you. Next you must choose what to do with data from previous partitions. If you have more than one hard drive, choose the ones you want to use for this installation. If you don't know what to do at this step and you know that this copy of RedHat is all you want on the machine, choose to remove all partitions on this system. Click Next after you have made your choice. The next screen may be unfamiliar; all you need to know is that it shows the partitions you are about to create. Click Next. Now the Boot Loader Configuration screen should appear. Just click Next.

HAKKAPELIITTA! Running Install
figure 2.3

Tip:
This manual is written in English and is intended for English-speaking users. If you can't decide on a language, choose English.

The Network Configuration screen, which should appear next, and the next few steps must be set up properly in order for your cluster to work! Set the Hostname option to manually and pick a name. This name must be different for each node. For our cluster we picked a name that would be easy to remember later. You'll notice from figure 2.4 that we used the same name for all nodes, just with a different number following it. This is a good time to take notes on what you are naming things in case you forget. After you have set your hostname, click Next. This will bring you to the Firewall Configuration screen; choose no firewall and click Next. If you want to use a firewall on your cluster, now or at any other time, you MUST allow ssh.

HAKKAPELIITTA! Running Install
figure 2.4

Tip:
Pick names which you can remember, not only for Hostnames but also users, etc. We used hakka (short for Hakkapeliitta!).

Now you have the option of choosing additional language support. Unlike the previous language selection, here you can choose multiple languages that will be supported once installation is complete. Choose your languages and then click Next. A map of the world should now appear on the screen. Use this map or the list below it to choose the correct time zone for each node. The map should now have a little red 'x' over your selected city. Click Next.

Trivia:
Superstition has it that if you set the time zone wrong, your nodes will take hours longer to communicate. This is totally untrue. Go ahead and set the time zone wrong; nothing bad will happen...

The time has come to set the root password. Think up a good password, type it in the root password field, and retype it in the confirm field. Make sure to remember this password. Click Next. If you want to configure authentication for the node, do so now; this is not necessary for your cluster to work. Click Next. If you chose a custom installation (figure 2.3), you should now see the Package Group Selection menu. Below is a list of the packages we installed on our nodes. Not all of these packages are needed to run a cluster, but they are useful and make a good starting point. If you did not choose a custom installation, just take a look at the screen and click Next to prepare for installation.

Desktops: X-windows system, Gnome desktop environment
Applications: Editors, Graphical Internet, Text-based internet
Servers: Server configuration tools, Web server, Mail server, Windows file server, DNS name server, FTP server, SQL database server, Network server
Development: Development tools, X software development, Gnome software development
System: Administration tools, System tools, Printing support

Tip:
You can always go back after installation and change the packages you've installed.

After you have chosen all the packages you want at this time, click Next. If an Unresolved Dependencies screen appears (which it shouldn't if you installed the same packages as we did), click Next.

You will be given one final chance to abort the installation and restart the computer. Click Next and wait for the installer to complete, supplying the second and third install discs as necessary.

Installing a Different OS

If you chose to install an OS other than RedHat 9, please read the distribution's provided manual for installation instructions.

Reminder Tip:
This manual is written in English and is intended for English-speaking users. If you can't decide on a language, choose English.

3. Installing MPICH

Set Up IP Addresses

Now you must get your network connections set up. To do this, you must configure the IP address of each node. Whether you plan to use these computers as part of a larger network or as a standalone network, you will need to choose static IP addresses. You will also need to add each node to the hosts file of the other nodes.

To do this, click on the red hat at the bottom left corner of the screen.  Go to 'System Settings' -> 'Network'.  On the 'Devices' tab, make sure your network card is selected (it probably will be called 'eth0') and click 'Edit'.  Click 'Statically Set IP Addresses' and fill in the information for your network and click 'OK'.  Next, click on the 'Hosts' tab and add the IP addresses of each of your nodes into the list.  Do this for each node in the cluster.
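
The Hosts tab holds the same kind of information that Linux normally keeps in /etc/hosts: one line per node with an IP address followed by a hostname. As a rough sketch, the lines for a four-node cluster could be printed like this. The function name hosts_entries and the 192.168.1.x addresses are assumptions made up for illustration; only the hakka1 to hakka4 names come from this guide.

```shell
# hosts_entries NETWORK_PREFIX FIRST_HOST_NUMBER NAME_PREFIX NODE_COUNT
# Print one hosts-file line per node. All four arguments are illustrative;
# substitute the values that match your own network.
hosts_entries() {
    prefix=$1; first=$2; name=$3; count=$4
    i=1
    while [ "$i" -le "$count" ]; do
        # Produces lines such as "192.168.1.11 hakka1"
        echo "$prefix.$((first + i - 1)) $name$i"
        i=$((i + 1))
    done
}

# Example with assumed addresses 192.168.1.11 through 192.168.1.14:
hosts_entries 192.168.1 11 hakka 4
```

Look over the printed lines before entering them on each node, whether through the Hosts tab or by editing the hosts file as root.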

After you do this for each node, test the connections between the nodes using ping:

$ ping IPADDRESS

Where IPADDRESS is the IP address of the node you are trying to reach.

Tip:
If you cannot get a response from ping, try checking your network switch and check the link light on the switch.
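
With more than a couple of nodes, pinging each one by hand gets tedious. The sketch below loops over a list of nodes and reports which ones answer. The function name ping_nodes and the overridable PING variable are our own inventions for illustration; "ping -c 1" simply sends a single request instead of pinging forever.

```shell
# Command used to probe a node. PING may be overridden from the environment;
# that hook is an assumption added here so the function is easy to try out.
PING=${PING:-"ping -c 1"}

# ping_nodes HOST...
# Report which nodes answer. Returns non-zero if any node fails to respond.
ping_nodes() {
    failed=0
    for node in "$@"; do
        if $PING "$node" >/dev/null 2>&1; then
            echo "$node: reachable"
        else
            echo "$node: no response"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hostnames from this guide):
# ping_nodes hakka1 hakka2 hakka3 hakka4
```

Running it with hostnames rather than IP addresses also confirms that the hosts entries from the previous step are in place.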

Set Up SSH

Now that you are able to ping the other nodes, it is time to configure SSH. If you followed the same install as in part 2, SSH is already installed and you do not need to install it now. If you need to get SSH, or need to get version 2, download it from the SSH download site at http://www.ssh.org/support/downloads/secureshellserver/non-commercial.html and install it.

To configure SSH, at the terminal prompt type:

$ ssh-keygen -t dsa

This will generate two files, "id_dsa" and "id_dsa.pub". Once this has been done on all the nodes, you will need to combine all of the "id_dsa.pub" files into one larger file called "authorized_keys2". On every node, this file must be stored in the same directory where the id_dsa files were created: the .ssh directory inside your home directory (~/.ssh).

To do this, create a text file on a floppy disk with all the public keys in it. Copy the file to the "authorized_keys2" file on each node. Also you must change file permissions for "authorized_keys2". Do this now at the prompt by changing to the .ssh directory and entering:

$ chmod go-rwx authorized_keys2
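
If you'd rather not paste the keys together by hand, the combining step can be sketched as a small shell function. Treat this as a sketch only: the function name build_authorized_keys, the /mnt/floppy mount point, and the idea of renaming each node's id_dsa.pub to something like hakka1.pub before copying it to the floppy are assumptions made for illustration.

```shell
# build_authorized_keys KEY_DIR SSH_DIR
# Concatenate every *.pub file in KEY_DIR into SSH_DIR/authorized_keys2,
# then remove group and other access, as in the chmod step above.
build_authorized_keys() {
    key_dir=$1
    ssh_dir=$2
    mkdir -p "$ssh_dir"
    cat "$key_dir"/*.pub > "$ssh_dir/authorized_keys2"
    chmod go-rwx "$ssh_dir/authorized_keys2"
}

# Example (hypothetical layout): the floppy is mounted at /mnt/floppy and
# holds hakka1.pub ... hakka4.pub, one renamed id_dsa.pub per node.
# build_authorized_keys /mnt/floppy "$HOME/.ssh"
```

Note that the function overwrites any existing authorized_keys2 file, so keep a copy if yours already holds keys you need.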

After this has been done you can test ssh by typing:

$ ssh HOSTNAME/IP

Where HOSTNAME/IP is either the hostname or the IP address of the node you want to access.

Tip:
If SSH is not working, make sure the keys were created correctly. Also, be sure you created the same user account for all the nodes, i.e. "hakka".
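
To test key-based login to every node in one go, something like the following sketch may help. The function name check_ssh and the overridable SSH variable are hypothetical. BatchMode=yes is an OpenSSH option that makes ssh give up instead of asking for a password, so a node whose key is missing shows up as a failure rather than a prompt.

```shell
# Command used to test a login. SSH may be overridden from the environment;
# that hook is an assumption added here so the function is easy to try out.
SSH=${SSH:-"ssh -o BatchMode=yes"}

# check_ssh HOST...
# Run the harmless command "true" on each node and report the result.
# Returns non-zero if any login fails.
check_ssh() {
    failed=0
    for node in "$@"; do
        if $SSH "$node" true >/dev/null 2>&1; then
            echo "$node: key login OK"
        else
            echo "$node: key login FAILED"
            failed=1
        fi
    done
    return "$failed"
}

# Example (hostnames from this guide):
# check_ssh hakka1 hakka2 hakka3 hakka4
```

A node you have never connected to before may also be reported as failed, because ssh cannot ask you to accept its host key in batch mode. Logging in to each node once by hand first avoids that.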

Download MPICH

To download MPICH, go to http://www-unix.mcs.anl.gov/mpi/mpich/download.html. For our install of RedHat 9, we need to install the 'Unix (all flavors)' version. Download this file to your home directory for each node on the cluster.

Trivia:
The name mpich is derived from MPI and Chameleon; Chameleon both because mpich can run (adapt its color) on a wide range of environments and because the initial implementation of mpich used the Chameleon message-passing portability system. Mpich is pronounced "Em Pee Eye See Aych," not "Emm Pitch."

Configure, Make and Install MPICH

Perform the following actions for each of the nodes on your cluster:

1. Unzip 'mpich.tar.gz':

$ tar zxovf mpich.tar.gz

2. Change into the directory that tar created, then invoke configure and wait for it to complete:

$ ./configure

3. Make MPICH and wait for make to complete:

$ make

4. Set up MPICH:

$ cd /usr/local
$ su
$ mkdir mpich
$ cd mpich
$ mkdir share
$ cd share
$ gedit machines.LINUX

In this file, type the host names for each node. Ex:
hakka1
hakka2
hakka3
hakka4

Save this file and exit gedit. You should be back at the terminal prompt.

$ /usr/local/mpich/sbin/tstmachines

Figure 3.1 shows sample output of this command.

tstmachines output
figure 3.1
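
If tstmachines reports problems, one thing worth ruling out is a hostname in the machines file that has no matching entry in the hosts file. The sketch below prints any such names. The function name missing_hosts is hypothetical, and the check only looks at the hosts file you give it, so it will not know about names resolved some other way (DNS, for example).

```shell
# missing_hosts MACHINES_FILE HOSTS_FILE
# Print each name from the machines file that does not appear as a whole
# word on a non-comment line of the hosts file.
missing_hosts() {
    machines=$1; hosts=$2
    while read -r name; do
        # Skip blank lines in the machines file.
        [ -z "$name" ] && continue
        if ! grep -v '^[[:space:]]*#' "$hosts" | grep -qwF -- "$name"; then
            echo "$name"
        fi
    done < "$machines"
}

# Example: pass the machines file you created above and /etc/hosts.
# missing_hosts /path/to/your/machines/file /etc/hosts
```

No output means every listed machine has a hosts entry.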

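Finally, tell MPICH to start remote processes with ssh. In bash, the default shell on RedHat 9, enter the following at the command prompt:

$ export P4_RSHCOMMAND=ssh

In csh or tcsh, enter setenv P4_RSHCOMMAND ssh instead.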
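
A variable set at the prompt lasts only for that shell session, so you would have to set it again after every login. One way around this is to put the setting in your shell's startup file. The sketch below assumes you use bash and that ~/.bashrc is read when you log in; the helper name add_line_once is hypothetical. For csh or tcsh, the equivalent would be the setenv line in ~/.cshrc.

```shell
# add_line_once FILE LINE
# Append LINE to FILE unless an identical line is already there, so the
# function is safe to run more than once.
add_line_once() {
    file=$1; line=$2
    touch "$file"
    if ! grep -qxF -- "$line" "$file"; then
        echo "$line" >> "$file"
    fi
}

# Example for bash users (assumes ~/.bashrc is your startup file):
# add_line_once "$HOME/.bashrc" 'export P4_RSHCOMMAND=ssh'
```

Do this on each node, then log out and back in (or start a new terminal) for the setting to take effect.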

Tip:
For extra help with MPICH, visit the official help page of MPICH at http://www-unix.mcs.anl.gov/mpi/mpich/docs/mpichman-chp4/mpichman-chp4.htm

Make and run sample programs

At the terminal prompt type:

$ cd examples/basic
$ make cpi
$ ../../bin/mpirun -np 4 cpi

This will run an MPI program on your cluster (if you have more or fewer than 4 nodes, adjust the -np value accordingly). Figure 3.2 shows sample output of the sample program.

cpi output
figure 3.2

4. Finding Benchmarks

Plain and Simple

All you need to know about your benchmarks is how multiple nodes compare to just one node. After you get the wall clock time from the sample (or any other program you run), rerun the program with fewer nodes and compare the wall clock times.
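
Rerunning the program by hand for each node count works fine, but the reruns can also be scripted. The sketch below assumes that the program prints a line of the form "wall clock time = 0.144339", and that you run it from the examples/basic directory so the mpirun path from the sample commands applies. The names wall_time, time_runs, MPIRUN and PROGRAM are made up for illustration.

```shell
# wall_time
# Read program output on stdin and print just the number from a line such
# as "wall clock time = 0.144339" (the output format assumed here).
wall_time() {
    sed -n 's/.*wall clock time = \([0-9.][0-9.]*\).*/\1/p'
}

# Placeholders to adjust for your own setup.
MPIRUN=${MPIRUN:-../../bin/mpirun}
PROGRAM=${PROGRAM:-cpi}

# time_runs MAX_NODES
# Run the program with MAX_NODES, MAX_NODES-1, ..., 1 processes and print
# one "nodes time" pair per line.
time_runs() {
    n=$1
    while [ "$n" -ge 1 ]; do
        echo "$n $($MPIRUN -np "$n" "$PROGRAM" | wall_time)"
        n=$((n - 1))
    done
}

# Example: time_runs 4
```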

Here are some times for our cluster:
Number of nodes    Wall clock time (seconds)
4                  0.144339
3                  0.134244
2                  0.127366
1                  0.126866

Examining the results may reveal some startling information. For our cluster it was more efficient to use only one node! When you test your cluster, test multiple programs. This will give you a better idea of how well your cluster performs. It is possible that for this program one node is best, but for others four might be best. If you want even more accurate results, run each program a few times and average the wall clock times.
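
One common way to put numbers on the comparison is speedup (the one-node time divided by the n-node time) and efficiency (speedup divided by n). A speedup below 1 means the extra nodes made things slower. The sketch below computes both with awk, along with a simple average for repeated runs. The function names speedup_table and average are hypothetical, and the input format ("nodes time" per line) is just a convention chosen for this example.

```shell
# speedup_table
# Read "nodes time" lines on stdin and print: nodes, time, speedup,
# efficiency. The input must include a line for 1 node.
speedup_table() {
    awk '
        { t[$1] = $2; order[NR] = $1 }
        END {
            for (i = 1; i <= NR; i++) {
                n = order[i]
                s = t[1] / t[n]          # speedup = T(1) / T(n)
                printf "%d %s %.2f %.2f\n", n, t[n], s, s / n
            }
        }'
}

# average
# Read one number per line on stdin and print the mean.
average() {
    awk '{ sum += $1 } END { if (NR) printf "%.6f\n", sum / NR }'
}

# The wall clock times from our cluster:
printf '4 0.144339\n3 0.134244\n2 0.127366\n1 0.126866\n' | speedup_table
```

To average several runs of the same configuration, feed the individual times to average, one per line, and use the result as that configuration's time.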

Tip:
For more accurate results test more than one program.



© 2004 J-P Mascarella