
SeaRiver Blog


heartbeat + drbd  

2007-11-29 02:50:34 | Category: load balancer


I. Packages to install (CentOS 4.4 x86_64):
heartbeat-2.0.7-1.c4
heartbeat-pils-2.0.7-1.c4
heartbeat-stonith-2.0.7-1.c4

drbd-0.7.23-1.c4
kernel-module-drbd-2.6.9-42.0.10.ELsmp-0.7.23-1.el4.centos

Install them with yum or rpm.

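II. DRBD configuration
(1) Main configuration file
/etc/drbd.conf

(2) Main configuration items. Create one resource for each pair of disks to be synchronized, as follows: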
...
resource r1 {
  protocol C;
  incon-degr-cmd "echo '!DRBD! pri on incon-degr' | wall ; sleep 60 ; halt -f";
  startup {
    degr-wfc-timeout 120;    # 2 minutes.
  }
  disk {
    on-io-error   detach;
  }
  net {
  }
  syncer {
    rate 10M;
    group 1;
    al-extents 257;
  }

  #Master
  on prmdbrw02 {
    device     /dev/drbd0;
    disk       /dev/sda2;
    address    192.168.43.14:7789;
    meta-disk  internal;
  }

  #Slave
  on prmdbbk01 {
    device    /dev/drbd1;
    disk      /dev/sda2;
    address   192.168.43.13:7789;
    meta-disk internal;
  }
}
...
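
DRBD applies the "on" section whose name equals the local host name (uname -n), so a mistyped host name there leaves the resource unconfigured on that node. The following is a minimal sketch of a pre-flight check, assuming the layout used above where "on", the host name and "{" share one line. The function names drbd_conf_hosts and host_in_drbd_conf are hypothetical helpers, not part of DRBD.

```shell
#!/bin/sh
# Hypothetical helpers, not part of DRBD: list the host names used in
# "on <host> {" sections of a drbd.conf-style file and check a name against them.

# Print one host name per line. Assumes "on", the host name and "{" sit on
# the same line, as in the example above.
drbd_conf_hosts() {
    awk '$1 == "on" && $3 == "{" { print $2 }' "$1"
}

# Succeed if host $2 has an "on" section in file $1.
host_in_drbd_conf() {
    drbd_conf_hosts "$1" | grep -Fqx -- "$2"
}
```

On each node it could be called as: host_in_drbd_conf /etc/drbd.conf "$(uname -n)" || echo "no on-section for this host"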

III. heartbeat configuration
1. Main configuration files (all of them must exist)
/etc/ha.d/ha.cf
/etc/ha.d/authkeys
/etc/ha.d/haresources
Note that the configuration files must be identical on both machines.
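
One way to verify this is to copy the peer's /etc/ha.d to a scratch directory and compare the three files byte by byte. The following is a minimal sketch. The function name ha_conf_diff and the scratch-directory approach are assumptions for illustration, not something heartbeat provides.

```shell
#!/bin/sh
# Hypothetical helper: report which heartbeat config files differ between two
# directories, e.g. the local /etc/ha.d and a copy fetched from the peer.
# Returns 0 if all three files are identical, 1 otherwise.
ha_conf_diff() {
    local_dir=$1
    peer_dir=$2
    status=0
    for f in ha.cf authkeys haresources; do
        if ! cmp -s "$local_dir/$f" "$peer_dir/$f"; then
            echo "DIFFERENT: $f"
            status=1
        fi
    done
    return $status
}
```

For example, after copying the peer's files to /tmp/peer-ha.d (a placeholder path), run: ha_conf_diff /etc/ha.d /tmp/peer-ha.d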

2. Configuration options:

2.1 cat /etc/ha.d/authkeys
auth 1
1 crc
#2 sha1 HI!
#3 md5 Hello!

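Explanation: this file is used for authentication between the two cluster nodes. The algorithm and the key (if any) must be the same on every node in the cluster. Three algorithms are currently available: md5, sha1 and crc. crc provides no authentication; it can only detect corrupted packets. sha1 and md5 each need a key for authentication. In terms of CPU cost, sha1 is the most expensive and md5 comes next, but sha1 is also the stronger of the two, so sha1 is generally recommended.
To use sha1, change the auth directive in authkeys to 2 and uncomment the corresponding "2 sha1" line (remove the leading #). Replace the key after it with your own; it must be the same on both nodes. After saving, set the file's permissions to 600, otherwise heartbeat will fail to start. The command is: chmod 600 authkeys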
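
The steps above can be sketched as a small helper that writes the file and restricts its permissions in one go. The function name write_authkeys and the key-generation line in the comment are assumptions for illustration; only the file format and the mode 600 requirement come from the text above.

```shell
#!/bin/sh
# Hypothetical helper: write a heartbeat authkeys file that uses sha1 with the
# given shared key, readable and writable only by its owner.
# usage: write_authkeys PATH KEY
write_authkeys() {
    path=$1
    key=$2
    # Create the file under a restrictive umask so it is never world-readable.
    ( umask 077
      printf 'auth 2\n2 sha1 %s\n' "$key" > "$path" ) || return 1
    # Also covers the case where the file already existed with looser permissions.
    chmod 600 "$path"
}

# Example (run as root, then copy the same file to the other node):
#   key=$(dd if=/dev/urandom bs=32 count=1 2>/dev/null | od -An -tx1 | tr -d ' \n')
#   write_authkeys /etc/ha.d/authkeys "$key"
```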


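2.2 ha.cf
heartbeat's main configuration file. Because the file is fairly long, my notes are written inline at the relevant places. To enable an option (directive), just remove the comment marker in front of it.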
#
#       There are lots of options in this file.  All you have to have is a set
#       of nodes listed {"node ...} one of {serial, bcast, mcast, or ucast},
#       and a value for "auto_failback".
#
#       ATTENTION: As the configuration file is read line by line,
#                  THE ORDER OF DIRECTIVE MATTERS!
#
#       In particular, make sure that the udpport, serial baud rate
#       etc. are set before the heartbeat media are defined!
#       debug and log file directives go into effect when they
#       are encountered.
#
#       All will be fine if you keep them ordered as in this example.
#
#
#       Note on logging:
#       If any of debugfile, logfile and logfacility are defined then they
#       will be used. If debugfile and/or logfile are not defined and
#       logfacility is defined then the respective logging and debug
#       messages will be logged to syslog. If logfacility is not defined
#       then debugfile and logfile will be used to log messages. If
#       logfacility is not defined and debugfile and/or logfile are not
#       defined then defaults will be used for debugfile and logfile as
#       required and messages will be sent there.
#
#       File to write debug messages to (heartbeat's debug output)
#debugfile /var/log/ha-debug
#
#
#       File to write other messages to (heartbeat's log output)
#
logfile /var/log/ha-log
#
#
#       Facility to use for syslog()/logger
#
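# If the log files above are not defined, log messages are sent to syslog
# facility local0 (which ends up in /var/log/messages here). If none of the
# three logging directives is defined, heartbeat by default creates ha-debug
# and ha-log under /var/log and writes the corresponding messages there.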
logfacility     local0
#
#
#       A note on specifying "how long" times below...
#
#       The default time unit is seconds
#               10 means ten seconds
#
#       You can also specify them in milliseconds
#               1500ms means 1.5 seconds
#
#
#       keepalive: how long between heartbeats?
#
# Interval between heartbeat packets. The default unit is seconds; to use
# milliseconds, append the ms suffix, e.g. 1500ms means 1.5 s.
keepalive 250ms
#
#       deadtime: how long-to-declare-host-dead?
#
#               If you set this too low you will get the problematic
#               split-brain (or cluster partition) problem.
#               See the FAQ for how to use warntime to tune deadtime.
#
# How long without heartbeats before the peer node is declared dead.
deadtime 1
#
#       warntime: how long before issuing "late heartbeat" warning?
#       See the FAQ for how to use warntime to tune deadtime.
#
# How long a heartbeat may be overdue before a "late heartbeat" warning is issued.
warntime 500ms
#
#
#       Very first dead time (initdead)
#
#       On some machines/OSes, etc. the network takes a while to come up
#       and start working right after you've been rebooted.  As a result
#       we have a separate dead time for when things first come up.
#       It should be at least twice the normal dead time.
#
# Startup grace period: how long heartbeat waits for the system or the network to come up.
initdead 30
#
#
#       What UDP port to use for bcast/ucast communication?
#
# UDP port used for broadcast/unicast communication.
udpport 694
#
#       Baud rate for serial ports...
#
# Baud rate for serial-port communication.
#baud   19200
#
#       serial  serialportname ...
# Serial device to use; on Linux this is /dev/ttyS0 (or ttyS1, ttyS2, ttyS3, ...).
#serial /dev/ttyS0      # Linux
#serial /dev/cuaa0      # FreeBSD
#serial /dev/cuad0      # FreeBSD 6.x
#serial /dev/cua/a      # Solaris
#
#
#       What interfaces to broadcast heartbeats over?
#
# Network interface(s) used for heartbeats.
bcast   eth3            # Linux
#bcast  eth1 eth2       # Linux
#bcast  le0             # Solaris
#bcast  le1 le2         # Solaris
#
#       Set up a multicast heartbeat medium
#       mcast [dev] [mcast group] [port] [ttl] [loop]
#
#       [dev]           device to send/rcv heartbeats on
#       [mcast group]   multicast group to join (class D multicast address
#                       224.0.0.0 - 239.255.255.255)
#       [port]          udp port to sendto/rcvfrom (set this value to the
#                       same value as "udpport" above)
#       [ttl]           the ttl value for outbound heartbeats.  this affects
#                       how far the multicast packet will propagate.  (0-255)
#                       Must be greater than zero.
#       [loop]          toggles loopback for outbound multicast heartbeats.
#                       if enabled, an outbound packet will be looped back and
#                       received by the interface it was sent on. (0 or 1)
#                       Set this value to zero.
#
#
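# For multicast, this line sets the interface to use, the multicast group
# address (within 224.0.0.0 - 239.255.255.255), the port, the TTL (time to
# live, i.e. how many router hops a packet may cross), and whether loopback
# is enabled (whether packets sent by this node are also received by it).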
#mcast eth0 225.0.0.1 694 1 0
#
#       Set up a unicast / udp heartbeat medium
#       ucast [dev] [peer-ip-addr]
#
#       [dev]           device to send/rcv heartbeats on
#       [peer-ip-addr]  IP address of peer to send packets to
#
# For unicast, configure the network interface and the peer's IP address.
#ucast eth0 192.168.1.2
#
#
#       About boolean values...
#
#       Any of the following case-insensitive values will work for true:
#               true, on, yes, y, 1
#       Any of the following case-insensitive values will work for false:
#               false, off, no, n, 0
#
#
#
#       auto_failback:  determines whether a resource will
#       automatically fail back to its "primary" node, or remain
#       on whatever node is serving it until that node fails, or
#       an administrator intervenes.
#
#       The possible values for auto_failback are:
#               on      - enable automatic failbacks
#               off     - disable automatic failbacks
#               legacy  - enable automatic failbacks in systems
#                       where all nodes do not yet support
#                       the auto_failback option.
#
#       auto_failback "on" and "off" are backwards compatible with the old
#               "nice_failback on" setting.
#
#       See the FAQ for information on how to convert
#               from "legacy" to "on" without a flash cut.
#               (i.e., using a "rolling upgrade" process)
#
#       The default value for auto_failback is "legacy", which
#       will issue a warning at startup.  So, make sure you put
#       an auto_failback directive in your ha.cf file.
#       (note: auto_failback can be any boolean or "legacy")
#
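# Decides what happens when the preferred owner of a resource recovers:
# either the resource migrates back to that owner, or it keeps running on
# the current node until the current node fails.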
auto_failback off
#
#
#       Basic STONITH support
#       Using this directive assumes that there is one stonith
#       device in the cluster.  Parameters to this device are
#       read from a configuration file. The format of this line is:
#
#         stonith <stonith_type> <configfile>
#
#       NOTE: it is up to you to maintain this file on each node in the
#       cluster!
#
# In clusters with shared resources, STONITH fencing is used to keep data consistent.
#stonith baytech /etc/ha.d/conf/stonith.baytech
#
#       STONITH support
#       You can configure multiple stonith devices using this directive.
#       The format of the line is:
#         stonith_host <hostfrom> <stonith_type> <params...>
#         <hostfrom> is the machine the stonith device is attached
#              to or * to mean it is accessible from any host.
#         <stonith_type> is the type of stonith device (a list of
#              supported drives is in /usr/lib/stonith.)
#         <params...> are driver specific parameters.  To see the
#              format for a particular device, run:
#           stonith -l -t <stonith_type>
#
#
#       Note that if you put your stonith device access information in
#       here, and you make this file publicly readable, you're asking
#       for a denial of service attack ;-)
#
#       To get a list of supported stonith devices, run
#               stonith -L
#       For detailed information on which stonith devices are supported
#       and their detailed configuration options, run this command:
#               stonith -h
#
#stonith_host *     baytech 10.0.0.3 mylogin mysecretpassword
#stonith_host ken3  rps10 /dev/ttyS1 kathy 0
#stonith_host kathy rps10 /dev/ttyS1 ken3 0
#
#       Watchdog is the watchdog timer.  If our own heart doesn't beat for
#       a minute, then our machine will reboot.
#       NOTE: If you are using the software watchdog, you very likely
#       wish to load the module with the parameter "nowayout=0" or
#       compile it without CONFIG_WATCHDOG_NOWAYOUT set. Otherwise even
#       an orderly shutdown of heartbeat will trigger a reboot, which is
#       very likely NOT what you want.
#
# This directive sets up the watchdog timer: if the node produces no
# heartbeat for one minute, the node reboots.
#watchdog /dev/watchdog
#
#       Tell what machines are in the cluster
#       node    nodename ...    -- must match uname -n
#node   ken3
#node   kathy
# Nodes in the cluster. Note: each node name must match uname -n.
node prmdbrw01
node prmdbbk01

#
#       Less common options...
#
#       Treats 10.10.10.254 as a pseudo-cluster-member
#       Used together with ipfail below...
#       note: don't use a cluster node as ping node
#
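# The ping directive and the ping_group directive below define pseudo cluster
# members. They must be used together with the ipfail directive further down.
# Their purpose is to monitor the physical link: if a cluster node cannot
# reach these pseudo members, it is not allowed to take over resources or
# services, and it releases the resources it holds.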
ping  192.168.33.1
#
#       Treats 10.10.10.254 and 10.10.10.253 as a pseudo-cluster-member
#       called group1. If either 10.10.10.254 or 10.10.10.253 are up
#       then group1 is up
#       Used together with ipfail below...
#
#ping_group group1 10.10.10.254 10.10.10.253
#
#       HBA ping directive for Fiber Channel
#       Treats fc-card-name as pseudo-cluster-member
#       used with ipfail below ...
#
#       You can obtain HBAAPI from http://hbaapi.sourceforge.net.  You need
#       to get the library specific to your HBA directly from the vendor
#       To install HBAAPI stuff, all You need to do is to compile the common
#       part you obtained from the sourceforge. This will produce libHBAAPI.so
#       which you need to copy to /usr/lib. You need also copy hbaapi.h to
#       /usr/include.
#
#       The fc-card-name is the name obtained from the hbaapitest program
#       that is part of the hbaapi package. Running hbaapitest will produce
#       a verbose output. One of the first line is similar to:
#               Apapter number 0 is named: qlogic-qla2200-0
#       Here fc-card-name is qlogic-qla2200-0.
#
#hbaping fc-card-name
#
#
#       Processes started and stopped with heartbeat.  Restarted unless
#               they exit with rc=100
#
#respawn userid /path/name/to/run
# Processes that are started and stopped together with heartbeat can be defined here.
respawn hacluster /usr/lib64/heartbeat/ipfail
#
#       Access control for client api
#               default is no access
#
# Access permissions for the client processes you start.
#apiauth client-name gid=gidlist uid=uidlist
#apiauth ipfail gid=haclient uid=hacluster

###########################
# The options below are rarely used and are not described in detail here.
#       Unusual options.
#
###########################
#
#       hopfudge maximum hop count minus number of nodes in config
#hopfudge 1
#
#       deadping - dead time for ping nodes
#deadping 30
#
#       hbgenmethod - Heartbeat generation number creation method
#               Normally these are stored on disk and incremented as needed.
#hbgenmethod time
#
#       realtime - enable/disable realtime execution (high priority, etc.)
#               defaults to on
#realtime off
#
#       debug - set debug level
#               defaults to zero
#debug 1
#
#       API Authentication - replaces the fifo-permissions-based system of the past
#
#
#       You can put a uid list and/or a gid list.
#       If you put both, then a process is authorized if it qualifies under either
#       the uid list, or under the gid list.
#
#       The groupname "default" has special meaning.  If it is specified, then
#       this will be used for authorizing groupless clients, and any client groups
#       not otherwise specified.
#
#       There is a subtle exception to this.  "default" will never be used in the
#       following cases (actual default auth directives noted in brackets)
#                 ipfail        (uid=HA_CCMUSER)
#                 ccm           (uid=HA_CCMUSER)
#                 ping          (gid=HA_APIGROUP)
#                 cl_status     (gid=HA_APIGROUP)
#
#       This is done to avoid creating a gaping security hole and matches the most
#       likely desired configuration.
#
#apiauth ipfail uid=hacluster
#apiauth ccm uid=hacluster
#apiauth cms uid=hacluster
#apiauth ping gid=haclient uid=alanr,root
#apiauth default gid=haclient

#       message format in the wire, it can be classic or netstring,
#       default: classic
#msgfmt  classic/netstring

#       Do we use logging daemon?
#       If logging daemon is used, logfile/debugfile/logfacility in this file
#       are not meaningful any longer. You should check the config file for logging
#       daemon (the default is /etc/logd.cf)
#       more information can be found at http://www.linux-ha.org/ha_2ecf_2fUseLogdDirective
#       Setting use_logd to "yes" is recommended
#
# use_logd yes/no
#
#       the interval we  reconnect to logging daemon if the previous connection failed
#       default: 60 seconds
#conn_logd_time 60
#
#
#       Configure compression module
#       It could be zlib or bz2, depending on whether you have the corresponding
#       library in the system.
#compression    bz2
#
#       Configure compression threshold
#       This value determines the threshold to compress a message,
#       e.g. if the threshold is 1, then any message with size greater than 1 KB
#       will be compressed, the default is 2 (KB)
#compression_threshold 2
#crm 1
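
The comments in the listing say initdead should be at least twice deadtime and point to warntime for tuning deadtime. The following is a minimal sketch of a sanity check for these four values. The helper names to_ms and check_timing are hypothetical, and the ordering keepalive < warntime < deadtime is a rule of thumb assumed here rather than something heartbeat enforces.

```shell
#!/bin/sh
# Hypothetical helpers for sanity-checking heartbeat timing values.

# Convert a heartbeat time value ("250ms", "1", "30") to milliseconds.
# Only whole seconds and the "ms" suffix are handled in this sketch.
to_ms() {
    case $1 in
        *ms) echo "${1%ms}" ;;
        *)   echo $(( $1 * 1000 )) ;;
    esac
}

# usage: check_timing KEEPALIVE WARNTIME DEADTIME INITDEAD
# Prints one line per violated rule of thumb; returns 1 if any rule is violated.
check_timing() {
    ka=$(to_ms "$1"); warn=$(to_ms "$2"); dead=$(to_ms "$3"); init=$(to_ms "$4")
    rc=0
    [ "$ka" -lt "$warn" ]   || { echo "keepalive should be shorter than warntime"; rc=1; }
    [ "$warn" -lt "$dead" ] || { echo "warntime should be shorter than deadtime"; rc=1; }
    [ "$init" -ge $(( dead * 2 )) ] || { echo "initdead should be at least 2x deadtime"; rc=1; }
    return $rc
}
```

With the values from the listing (check_timing 250ms 500ms 1 30) all three checks pass. The listing's own comment warns that a deadtime set too low risks split-brain, and this sketch does not judge whether 1 second is long enough for a given network.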

 

2.3 haresources
heartbeat's resource configuration file
#
#       This is a list of resources that move from machine to machine as
#       nodes go down and come up in the cluster.  Do not include
#       "administrative" or fixed IP addresses in this file.
#
# <VERY IMPORTANT NOTE>
#       The haresources files MUST BE IDENTICAL on all nodes of the cluster.
#
#       The node names listed in front of the resource group information
#       is the name of the preferred node to run the service.  It is
#       not necessarily the name of the current machine.  If you are running
#       auto_failback ON (or legacy), then these services will be started
#       up on the preferred nodes - any time they're up.
#
#       If you are running with auto_failback OFF, then the node information
#       will be used in the case of a simultaneous start-up, or when using
#       the hb_standby {foreign,local} command.
#
#       BUT FOR ALL OF THESE CASES, the haresources files MUST BE IDENTICAL.
#       If your files are different then almost certainly something
#       won't work right.
# </VERY IMPORTANT NOTE>
#
#
#       We refer to this file when we're coming up, and when a machine is being
#       taken over after going down.
#
#       You need to make this right for your installation, then install it in
#       /etc/ha.d
#
#       Each logical line in the file constitutes a "resource group".
#       A resource group is a list of resources which move together from
#       one node to another - in the order listed.  It is assumed that there
#       is no relationship between different resource groups.  These
#       resource in a resource group are started left-to-right, and stopped
#       right-to-left.  Long lists of resources can be continued from line
#       to line by ending the lines with backslashes ("\").
#
#       These resources in this file are either IP addresses, or the name
#       of scripts to run to "start" or "stop" the given resource.
#
#       The format is like this:
#
#node-name resource1 resource2 ... resourceN
#
#
#       If the resource name contains an :: in the middle of it, the
#       part after the :: is passed to the resource script as an argument.
#       Multiple arguments are separated by the :: delimiter
#
#       In the case of IP addresses, the resource script name IPaddr is
#       implied.
#
#       For example, the IP address 135.9.8.7 could also be represented
#       as IPaddr::135.9.8.7
#
#       THIS IS IMPORTANT!!     vvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvvv
#
#       The given IP address is directed to an interface which has a route
#       to the given address.  This means you have to have a net route
#       set up outside of the High-Availability structure.  We don't set it
#       up here -- we key off of it.
#
#       The broadcast address for the IP alias that is created to support
#       an IP address defaults to the highest address on the subnet.
#
#       The netmask for the IP alias that is created defaults to the same
#       netmask as the route that it selected in the step above.
#
#       The base interface for the IP alias that is created defaults to the
#       same interface as the route that it selected in the step above.
#
#       If you want to specify that this IP address is to be brought up
#       on a subnet with a netmask of 255.255.255.0, you would specify
#       this as IPaddr::135.9.8.7/24 .
#
#       If you wished to tell it that the broadcast address for this subnet
#       was 135.9.8.210, then you would specify that this way:
#               IPaddr::135.9.8.7/24/135.9.8.210
#
#       If you wished to tell it that the interface to add the address to
#       is eth0, then you would need to specify it this way:
#               IPaddr::135.9.8.7/24/eth0
#
#       And this way to specify both the broadcast address and the
#       interface:
#               IPaddr::135.9.8.7/24/eth0/135.9.8.210
#
#       The IP addresses you list in this file are called "service" addresses,
#       since they're the publicly advertised addresses that clients
#       use to get at highly available services.
#
#       For a hot/standby (non load-sharing) 2-node system with only
#       a single service address,
#       you will probably only put one system name and one IP address in here.
#       The name you give the address to is the name of the default "hot"
#       system.
#
#       Where the nodename is the name of the node which "normally" owns the
#       resource.  If this machine is up, it will always have the resource
#       it is shown as owning.
#
#       The string you put in for nodename must match the uname -n name
#       of your machine.  Depending on how you have it administered, it could
#       be a short name or a FQDN.
#
#-------------------------------------------------------------------
#
#       Simple case: One service address, default subnet and netmask
#               No servers that go up and down with the IP address
#
#just.linux-ha.org      135.9.216.110
#
#-------------------------------------------------------------------
#
#       Assuming the administrative addresses are on the same subnet...
#       A little more complex case: One service address, default subnet
#       and netmask, and you want to start and stop http when you get
#       the IP address...
#
#just.linux-ha.org      135.9.216.110 http
#-------------------------------------------------------------------
#
#       A little more complex case: Three service addresses, default subnet
#       and netmask, and you want to start and stop http when you get
#       the IP address...
#
#just.linux-ha.org      135.9.216.110 135.9.215.111 135.9.216.112 httpd
#-------------------------------------------------------------------
#
#       One service address, with the subnet, interface and bcast addr
#       explicitly defined.
#
#just.linux-ha.org      135.9.216.3/28/eth0/135.9.216.12 httpd
#
#-------------------------------------------------------------------
#
#       An example where a shared filesystem is to be used.
#       Note that multiple arguments are passed to this script using
#       the delimiter '::' to separate each argument.
#
#node1  10.0.0.170 Filesystem::/dev/sda1::/data1::ext2
#
#       Regarding the node-names in this file:
#
#       They must match the names of the nodes listed in ha.cf, which in turn
#       must match the `uname -n` of some node in the cluster.  So they aren't
#       virtual in any sense of the word.
#
#ws009.ltsp     192.168.100.100 mysql
#mjmnode01 10.103.30.100/24/eth2:0
prmdbrw01 IPaddr::192.168.33.100/24/eth2 mysql_umount mysql

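The listing above is the haresources file. It configures the resources or services of the cluster you deploy. Each effective line has the following format:
node-name resource1 resource2 ... resourceN
Here node-name is the name of one of the cluster nodes and must match uname -n.
Each resource in the group resource1 resource2 ... resourceN is a shell script. The scripts are searched for in /etc/init.d/ and /usr/local/etc/ha.d/resource.d (this path depends on where heartbeat is installed; in this setup it is /etc/ha.d/resource.d). heartbeat provides a convenient framework for extending resources: to control a resource of your own, you only need to write a shell script that accepts the start and stop arguments. The resource scripts that heartbeat currently supports can be found in the paths listed above.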
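
A script of the kind described above can be sketched as follows. The name myservice and the echo placeholders are hypothetical, and this sketch assumes the resource is listed without "::" arguments, so the action arrives as the first argument.

```shell
#!/bin/sh
# Hypothetical resource script skeleton, e.g. saved as
# /etc/ha.d/resource.d/myservice (the name and the commands are placeholders).

myservice_start() {
    echo "starting myservice"   # replace with the real start command
}

myservice_stop() {
    echo "stopping myservice"   # replace with the real stop command
}

# Dispatch on the action passed as the first argument.
myservice_main() {
    case "$1" in
        start) myservice_start ;;
        stop)  myservice_stop ;;
        *)     echo "Usage: myservice {start|stop}" >&2; return 1 ;;
    esac
}

# Run only when invoked with an argument, so the file can also be sourced.
if [ $# -gt 0 ]; then
    myservice_main "$@"
fi
```

Such a script would then be referenced by its file name in haresources, after the node name.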


3. haresources uses two custom script files, located in
/etc/ha.d/resource.d
mysql_umount mysql
They are mainly used to set the DRBD state, mount the DRBD device, and start the mysql service.

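IV. Service testing
1. In the current environment, stop heartbeat on the master; the backup automatically takes over the mysql service.
2. Do not test in the existing environment by bringing down the heartbeat NIC. Each node will conclude that its peer is dead and try to take over the resources, which causes a resource conflict.
3. The mysql_umount script needs to be modified. After the master goes down, the DRBD state does not change immediately, so the backup's attempt to take over the DRBD resource may conflict. A retry step needs to be added.
4. Test by making the master unable to ping the gateway so that it releases its resources, then check whether the backup takes them over.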
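
The retry step mentioned in item 3 could look like the following minimal sketch. The helper name retry, the attempt count and the delay are assumptions; the original mysql_umount script is not shown in this post, so this is not its actual content.

```shell
#!/bin/sh
# Hypothetical helper: run a command up to MAX times, sleeping DELAY seconds
# between attempts, until it succeeds.
# usage: retry MAX DELAY command [args...]
retry() {
    max=$1; delay=$2; shift 2
    attempt=1
    while :; do
        "$@" && return 0                       # success: stop retrying
        [ "$attempt" -ge "$max" ] && return 1  # out of attempts: give up
        attempt=$(( attempt + 1 ))
        sleep "$delay"
    done
}

# Example use inside a script such as mysql_umount (resource name r1 taken from
# the drbd.conf above; 10 attempts and a 3 second delay are arbitrary):
#   retry 10 3 drbdadm primary r1 || exit 1
```

Because POSIX sh has no local variables, the helper's variables (max, delay, attempt) are global; the wrapped command should not reuse those names.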


V. The PantaRhei environment:
1、Master 192.168.99.171
2、M/S  192.168.99.174
3、Backup 192.168.99.173

The relationship is as follows:

                heartbeat + DRBD
      Master ---------------------- Backup
           \                      /
            \                    /
 Replication \                  / DRBD
              \                /
               \              /
                \            /
                 \          /
                  \        /
                      M/S

 
