This doesn’t seem to be clearly documented, but to make this persistent add the following under your [mysqld] section of my.cnf
external_locking = TRUE
delay_key_write = Off
query_cache_size = 0
Now we can move on to configuring our services.
On both nodes, install these services and disable them at boot (the cluster will manage starting and stopping them):
yum install postfix mysql-server dovecot
chkconfig postfix off
chkconfig mysqld off
chkconfig dovecot off
Copy the required data to our shared glusterfs storage.
cp -rp /var/lib/mysql /mysql/data/
cp -p /etc/my.cnf /mysql/data
cp -rp /etc/postfix /mail/data
cp -rp /etc/dovecot /mail/data
cp -p /etc/init.d/postfix /mail/data/postfix.sh
cp -p /etc/init.d/dovecot /mail/data/dovecot.sh
mkdir /mail/data/vmail
/mail/data/vmail will hold our users' mail; the rest ensures our configuration files live on the shared storage so both nodes see a consistent environment.
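Since the Postfix and Dovecot snippets further down use a static UID/GID of 5000 and a vmail user, you will presumably also want that user on both nodes; a minimal sketch (the UID/GID and home directory here are assumptions):

groupadd -g 5000 vmail
useradd -u 5000 -g vmail -d /mail/data/vmail -s /sbin/nologin vmail
chown vmail:vmail /mail/data/vmail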
Update /mysql/data/my.cnf with
datadir=/mysql/data/mysql
Create /mail/data/mail.sh since our cluster service needs to call both postfix and dovecot.
#!/bin/bash
if [ "$1" == "status" ]; then
    ps -ef | grep -v grep | grep "/usr/libexec/postfix/master"
    exit $?
else
    /mail/data/dovecot.sh $1
    /mail/data/postfix.sh $1
    exit 0
fi
Please note this is a quick and dirty hack: the status check only verifies that the Postfix master process is running, and we care about Dovecot as well.
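A slightly more thorough version might check for both daemons, roughly like the sketch below (process paths assume the stock CentOS 6 packages):

#!/bin/bash
if [ "$1" == "status" ]; then
    # Report failure if either the Postfix master or the Dovecot master is missing
    ps -ef | grep -v grep | grep -q "/usr/libexec/postfix/master" || exit 1
    ps -ef | grep -v grep | grep -q "/usr/sbin/dovecot" || exit 1
    exit 0
else
    /mail/data/dovecot.sh $1
    /mail/data/postfix.sh $1
    exit 0
fi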
Now again on both nodes, make some symbolic links to the shared storage for our services.
mv /etc/postfix /etc/postfix.bak
ln -s /mail/data/postfix /etc/postfix
mv /etc/dovecot /etc/dovecot.bak
ln -s /mail/data/dovecot /etc/dovecot
You should now be able to start your services. If you run into any errors, check /var/log/messages or /var/log/cluster/cluster.log
clusvcadm -d postfix-svc
clusvcadm -d mysql-svc
clusvcadm -e postfix-svc
clusvcadm -e mysql-svc
To store users' mail in /mail/data/vmail, make the following changes to /etc/postfix/main.cf – in this example we are using LDAP. (The snippets below use /var/vmail as the mailbox path, so presumably /var/vmail is a symlink to /mail/data/vmail on both nodes.)
Both nodes in this case are replicating LDAP information from another server, so both the main LDAP server and one node could go down and users could still authenticate to the cluster services.
accounts_server_host = localhost
accounts_search_base = dc=example,dc=com
#Assumes users have a mail: attribute, if not use something else
accounts_query_filter = (mail=%u)
#accounts_result_attribute = homeDirectory
accounts_result_attribute = mail
#accounts_result_format = %u/Mailbox
accounts_result_format = /var/vmail/%u/
accounts_scope = sub
accounts_cache = yes
accounts_bind = yes
accounts_bind_dn = cn=admin,dc=example,dc=com
accounts_bind_pw = PASSWORD
accounts_version = 3
virtual_transport = virtual
virtual_uid_maps = static:5000
virtual_gid_maps = static:5000
virtual_mailbox_base = /
virtual_mailbox_maps = ldap:accounts
virtual_mailbox_domains = example.com
For Dovecot, configure LDAP normally and then make the following changes
conf.d/auth-ldap.conf.ext: args = uid=vmail gid=vmail home=/var/vmail/%u/
conf.d/10-mail.conf: mail_location = maildir:/var/vmail/%u
dovecot.conf: mail_location = maildir:/var/vmail/%u
Once this is done, restart your services (clusvcadm -R servicename) and send some test e-mails to yourself.
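For a quick smoke test from another machine, something like the following should confirm that the VIP is answering on SMTP and IMAP (mail.example.com standing in for the mail-svc VIP):

telnet mail.example.com 25
telnet mail.example.com 143

It is also worth relocating the service to the other node (clusvcadm -r postfix-svc) and repeating the test.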
Things needed for RHCS on Centos 6
I have two physical ESXi 5.1 servers, so the below will assume ESXi for fencing.
Install and configure glusterfs
I assume you have added additional disks to your VM for this (sdb, sdc)
Add the EPEL/EL repos for your distribution to /etc/yum.repos.d
yum install glusterfs glusterfs-server
Partition /dev/sdb and /dev/sdc, format the resulting volumes as EXT4, and add the following to /etc/fstab ON BOTH NODES
In the below example, I am using LVM
/dev/vg01/mysql /mysql ext4 defaults 0 0
localhost:/gv0 /mysql/data glusterfs defaults 0 0
/dev/vg02/mail /mail ext4 defaults 0 0
localhost:/gv1 /mail/data glusterfs defaults 0 0
Mount /mysql and /mail (not /*/data yet)
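For reference, the LVM and mount steps above might look roughly like this on each node (use of whole disks and of all free space per volume group are assumptions):

pvcreate /dev/sdb /dev/sdc
vgcreate vg01 /dev/sdb
vgcreate vg02 /dev/sdc
lvcreate -n mysql -l 100%FREE vg01
lvcreate -n mail -l 100%FREE vg02
mkfs.ext4 /dev/vg01/mysql
mkfs.ext4 /dev/vg02/mail
mkdir -p /mysql /mail
mount /mysql
mount /mail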
Now on only one node, do the following commands
gluster volume create gv0 replica 2 transport tcp centos-cluster1:/mysql/brick centos-cluster2:/mysql/brick
gluster volume start gv0
gluster volume create gv1 replica 2 transport tcp centos-cluster1:/mail/brick centos-cluster2:/mail/brick
gluster volume start gv1
You should now be able to mount /mail/data and /mysql/data on both nodes.
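Before the volume create commands will succeed, the two nodes have to be in the same trusted pool and the brick directories have to exist; afterwards the fstab entries can be mounted. Roughly:

# On both nodes, create the brick and mount-point directories
mkdir -p /mysql/brick /mail/brick /mysql/data /mail/data
# On one node, add the other to the trusted pool
gluster peer probe centos-cluster2
gluster peer status
# On both nodes, mount the Gluster volumes defined in fstab
mount /mysql/data
mount /mail/data

If these mounts fail at boot because glusterd is not yet running, adding _netdev to the fstab options is the usual workaround.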
Install RHCS and tools on both nodes
yum install luci ricci rgmanager cman fence-agents corosync
Once done with this step, make sure each node has the other node in its /etc/hosts file, and configure your nodes to use static IPs.
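As an illustration, /etc/hosts on each node would contain entries along these lines (addresses made up):

10.10.0.11  centos-cluster1
10.10.0.12  centos-cluster2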
Use luci to create the basic configuration of your cluster. (https://NODE1IP:8084)
Manage Cluster – Add
Once created add your nodes under “Nodes”
Create your fence devices (Use fence_vmware or something similar for now, we will change it later most likely) for each ESXi server.
Create a failover domain (Prioritized=No, Restricted=No)
Create two IP address resources (one for mail services, one for mysql)
Finally create your service groups for each service.
Order should be: IP Resource – Service
For your postfix-svc, use the “Script” type and define the script file as /mail/data/mail.sh
For mysql-svc, use the “Mysql” type and use for the Config File: /mysql/data/my.cnf
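The resulting <rm> section of /etc/cluster/cluster.conf should end up looking roughly like the sketch below; the resource and domain names plus the IP addresses are illustrative, and the exact attributes luci writes may differ:

<rm>
  <failoverdomains>
    <failoverdomain name="mail-fd" ordered="0" restricted="0"/>
  </failoverdomains>
  <service autostart="1" domain="mail-fd" name="postfix-svc" recovery="relocate">
    <ip address="10.10.0.20" monitor_link="on"/>
    <script file="/mail/data/mail.sh" name="mail-script"/>
  </service>
  <service autostart="1" domain="mail-fd" name="mysql-svc" recovery="relocate">
    <ip address="10.10.0.21" monitor_link="on"/>
    <mysql config_file="/mysql/data/my.cnf" name="mysql-res"/>
  </service>
</rm>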
You should now be able to run “clustat” on either node, although your services will be failed or disabled for now.
Finally, let’s finish up fencing.
See http://linuxadministration.us/?p=256 for ESXi 5.1 (if you’re using something else, you’ll have to do this part yourself I’m afraid)
Test fencing with the “fence_node” command. Do not skip this step! Make sure fencing works before moving forward.
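For example, from the first node:

fence_node centos-cluster2

The other VM should be power-cycled; if nothing happens, fix the fence device configuration before continuing.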
I’ve been working on a project to learn more about RHCS in general and decided to build a HA Mail server.
Technologies Used:
– Centos 6.6
– RHCS
– Postfix
– Dovecot (IMAP)
– OpenLDAP
– GlusterFS
– MySQL
– Roundcube
Over the coming weeks I’ll be posting the steps required to set this up.
The general idea is to have two RHCS services
mail-svc
mysql-svc
Each node runs the following locally
While the above two services could certainly be cluster services, this is not required.
We NAT our public IP to the VIP for mail-svc.
To avoid SSH host key issues, copy your SSH host keys from /etc/ssh to the other node (so SSH to the public IP will not produce host key warnings after a failover).
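For example (hostname assumed):

scp -p /etc/ssh/ssh_host_* centos-cluster2:/etc/ssh/
ssh centos-cluster2 'service sshd restart'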
Turns out that using the “free” version of ESXi 5.1 does not work with RHCS fencing due to SOAP limitations.
Place the following file in /usr/sbin/fence_esxi and chmod a+x /usr/sbin/fence_esxi
Make sure to install paramiko (yum install python-paramiko)
#!/usr/bin/python
import paramiko
import sys
import time
import datetime
import re
sys.path.append("/usr/share/fence")
from fencing import *
device_opt = [ "help", "version", "agent", "quiet", "verbose", "debug", "action", "ipaddr", "login", "passwd", "passwd_script", "ssl", "port", "uuid", "separator", "ipport", "power_timeout", "shell_timeout", "login_timeout", "power_wait" ]
options = check_input(device_opt, process_input(device_opt))
f = open("/var/log/cluster/fence_esxi.log","w+")
ts = time.time()
st = datetime.datetime.fromtimestamp(ts).strftime('%Y-%m-%d %H:%M:%S')
f.write(st + " starting fencing.\n")
f.write("-n " + options["-n"] + "\n")
f.write("-a " + options["-a"] + "\n")
f.write("-l " + options["-l"] + "\n")
f.write("-p " + options["-p"] + "\n")
client = paramiko.SSHClient()
client.load_system_host_keys()
client.connect(options["-a"],username=options["-l"],password=options["-p"])
command="esxcli vm process list | grep ^" + options["-n"] + " -A 1 | tail -n 1 | sed \'s/ */ /g\' | cut -d \" \" -f 4"
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
wwid = stdout.read()
f.write("wwid: " + wwid + "\n")
if len(wwid) < 2:
f.write("VM not found or alread offline \n")
client.close()
sys.exit(1)
f.write("VM found \n")
command="esxcli vm process kill --type=soft --world-id=" + wwid
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
#Give the VM some time to shut down gracefully
time.sleep(30)
f.write("Waited 30 seconds \n")
command="vm-support -V | grep centos | cut -d \"(\" -f 2 | cut -d \")\" -f 1"
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
status = stdout.read()
f.write("VM Status: " + status + "\n")
sregex = re.compile('Running')
if sregex.search(status):
f.write("VM still running, hard kill required \n")
command="esxcli vm process kill --type=hard --world-id=" + wwid
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
time.sleep(30)
else:
f.write("VM successfully soft killed \n")
#Get VM info while powered off
command="vim-cmd vmsvc/getallvms | grep " + options["-n"] + " | sed 's/ */ /g' | cut -d \" \" -f 1"
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
vmid = stdout.read()
#Start VM back up
command="vim-cmd vmsvc/power.on " + vmid
f.write("Cmd: " + command + "\n")
stdin, stdout, stderr = client.exec_command(command)
while not stdout.channel.exit_status_ready():
f.write("Waiting for command to finish... \n")
time.sleep(2)
f.write("fence_esxi exiting...")
f.close()
client.close()
sys.exit(0)
In your cluster.conf you would then have something like the following. Make sure to enable SSH on your ESXi hosts
<fence>
<method name="fence-cluster1">
<device name="esx1" port="centos-cluster1" ssl="on" />
</method>
</fence>
<fencedevices>
<fencedevice agent=”fence_esxi” ipaddr=”esx2.FQDN.com” login=”root” name=”esx2″ passwd=”YOURPASSWORDHERE” delay=”60″ />
</fencedevices>
Following the excellent guide here: https://wiki.debian.org/LDAP/OpenLDAPSetup
I was able to get LDAP replication working fairly easily. There are two problems with this however.
1. The default slapd configuration will use dc=nodomain (if no domain was picked at install), or else whatever domain you did pick. You are not asked to choose during the install, so if your domain differs from your LDAP server's, replication will not function.
2. The above guide does NOT use SSL for replication for some reason
On your client, do the following to change dc=nodomain to whatever it should be for replication
/etc/init.d/slapd stop
rm /var/lib/ldap/*
vi /etc/ldap/slapd.d/cn\=config/olcDatabase\=\{1\}hdb.ldif
Update all dc=nodomain entries to dc=your,dc=domain
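A quick way to do that non-interactively (the domain here is just an example):

sed -i 's/dc=nodomain/dc=example,dc=com/g' /etc/ldap/slapd.d/cn\=config/olcDatabase\=\{1\}hdb.ldif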
Then start slapd
/etc/init.d/slapd start
Create an LDIF file like the following (in this case, mirror.ldif)
dn: olcDatabase={1}hdb,cn=config
changeType: modify
add: olcSyncrepl
olcSyncrepl: rid=004 provider=ldaps://YOURMASTERHOSTNAME:636 bindmethod=simple binddn="cn=mirrormode,dc=bbis,dc=us" credentials=YOURPASSWORD tls_reqcert=never searchbase="dc=bbis,dc=us" schemachecking=on type=refreshAndPersist retry="60 +" tls_cert=/etc/ldap/ssl/server.pem tls_cacert=/etc/ldap/ssl/server.pem tls_key=/etc/ldap/ssl/server.pem
-
add: olcMirrorMode
olcMirrorMode: TRUE
-
Note that “rid=004” should be unique for each LDAP server you bring into play. Replace dc=bbis,dc=us with your domain.
Now add it to your schema
ldapmodify -QY EXTERNAL -H ldapi:/// -f mirror.ldif
Use ldapsearch to verify functionality
ldapsearch -H ldap://127.0.0.1 -x
crypto ca trustpoint comodo
enrollment terminal
chain-validation stop
revocation-check none
crypto ca authenticate comodo
[Comodo ROOT CA]
crypto ca import comodo pkcs12 tftp: password PASSWORDYOUUSED
[Exported PFX]
This assumes you have already requested and received your UCC certificate (IIS/Apache/etc.)
crypto ca trustpoint godaddy
enrollment terminal
chain-validation stop
revocation-check none
exit
crypto ca authenticate godaddy
-----BEGIN CERTIFICATE-----
Root Godaddy CA Cert (gd-class2-root.crt)
https://certs.godaddy.com/anonymous/repository.pki
-----END CERTIFICATE-----
!Intermediate trustpoint
crypto ca trustpoint intermediate-primary
enrollment terminal
chain-validation continue godaddy
revocation-check none
crypto ca authenticate intermediate-primary
-----BEGIN CERTIFICATE-----
This is the first file inside the PFX container (gd-g2_iis_intermediates)
-----END CERTIFICATE-----
crypto ca trustpoint intermediate-secondary
enrollment terminal
chain-validation continue intermediate-primary
crypto ca authenticate intermediate-secondary
-----BEGIN CERTIFICATE-----
This is the second file inside the PFX container (gd-g2_iis_intermediates)
-----END CERTIFICATE-----
crypto pki import godaddypriv pkcs12 tftp: password PASSWORDHERE
#pkcs12 you export from Windows
crypto pki trustpoint intermediate-secondary
rsakeypair godaddypriv
crypto ca import intermediate-secondary certificate
-----BEGIN CERTIFICATE-----
This should be the CRT godaddy gave you, the file you import into IIS
-----END CERTIFICATE-----
Get a list of VMkernel interfaces and their MTUs
esxcfg-vmknic --list
Set MTU
esxcfg-vmknic --mtu 9000 "Management Network"
esxcfg-vmknic --mtu 9000 "iscsi-name"
/etc/network/interfaces
auto bond0
iface bond0 inet static
address 10.10.0.100
gateway 10.10.0.1
netmask 255.255.255.0
slaves eth0 eth1 eth2
bond-mode 802.3ad
bond-miimon 100
bond-lacp-rate 4
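After bringing the bond up (ifup bond0 or a reboot), the negotiated LACP state can be checked with:

cat /proc/net/bonding/bond0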
ExtremeWare configuration
As you may be able to guess, this configures ports 41, 42 and 43 for LACP
enable sharing 41 grouping 41,42,43 dynamic