[draft] How to build an HA cluster with Corosync and Pacemaker, configure a VIP, and test failover

This guide walks through building a High Availability (HA) cluster with Corosync and Pacemaker, configuring a VIP (Virtual IP), and testing failover.

Time synchronization
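
Cluster nodes should keep consistent clocks so that logs and cluster events line up across nodes. One way to handle this step is sketched below, assuming the nodes use systemd's timedatectl; the time_synced helper and the TIMEDATECTL override are hypothetical names introduced only for illustration.

```shell
#!/usr/bin/env bash
# Sketch: report whether systemd considers the system clock NTP-synchronized.
# TIMEDATECTL is a hypothetical override so the check can be tried without systemd.
time_synced() {
    local state
    state="$("${TIMEDATECTL:-timedatectl}" show --property=NTPSynchronized --value 2>/dev/null)"
    [ "$state" = "yes" ]
}

# Usage on each node (not executed here):
#   sudo timedatectl set-ntp true
#   time_synced && echo "clock synchronized" || echo "clock NOT synchronized"
```

The function returns success only when timedatectl reports NTPSynchronized=yes, so it can be used as a precondition before starting the cluster.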

Configure the hosts file

cat <<EOF | sudo tee -a /etc/hosts

# HA Cluster
192.168.0.120 vip.cluster.local vip
192.168.0.121 server1.cluster.local server1
192.168.0.122 server2.cluster.local server2
192.168.0.123 server3.cluster.local server3
EOF
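
Because tee -a appends unconditionally, running the command above twice duplicates the entries. A minimal sketch of an idempotent alternative follows; the add_host function and the HOSTS_FILE variable are hypothetical names, and the script writes to a temporary file unless HOSTS_FILE is set.

```shell
#!/usr/bin/env bash
# Sketch: add a hosts entry only if the IP is not already the first field of a line.
# HOSTS_FILE is a hypothetical variable; set it to /etc/hosts (and run as root) for real use.
HOSTS_FILE="${HOSTS_FILE:-$(mktemp)}"

add_host() {
    local ip="$1"
    shift
    # awk exits 0 when some line already starts with this exact IP
    if ! awk -v ip="$ip" '$1 == ip { found = 1 } END { exit !found }' "$HOSTS_FILE"; then
        printf '%s %s\n' "$ip" "$*" >> "$HOSTS_FILE"
    fi
}

add_host 192.168.0.120 vip.cluster.local vip
add_host 192.168.0.121 server1.cluster.local server1
add_host 192.168.0.122 server2.cluster.local server2
add_host 192.168.0.123 server3.cluster.local server3
echo "entries written to $HOSTS_FILE"
```

Comparing the whole first field avoids treating 192.168.0.12 as already present just because 192.168.0.120 is listed.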

1. Install the required packages

Install the required packages on all nodes.

sudo apt update
sudo apt install -y corosync pacemaker pcs
$ corosync -v
Corosync Cluster Engine, version '3.1.6'
Copyright (c) 2006-2021 Red Hat, Inc.

Built-in features: dbus monitoring watchdog augeas systemd xmlconf vqsim nozzle snmp pie relro bindnow
Available crypto models: nss openssl
Available compression models: zlib lz4 lz4hc lzo2 lzma bzip2 zstd
$ pacemakerd --version
Pacemaker 2.1.2
Written by Andrew Beekhof
$ pcs --version
0.10.11

Check cluster status

$ sudo pcs status
Cluster name: debian

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: server1 (version 2.1.2-ada5c3b36e2) - partition with quorum
  * Last updated: Wed Oct 30 11:40:57 2024
  * Last change:  Wed Oct 30 11:39:17 2024 by hacluster via crmd on server1
  * 1 node configured
  * 0 resource instances configured

Node List:
  * Online: [ server1 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

2. Configure Corosync

To configure Corosync, edit /etc/corosync/corosync.conf on every node, either manually in an editor:

sudo vim /etc/corosync/corosync.conf

or by writing the whole file at once:

cat <<EOF | sudo tee /etc/corosync/corosync.conf
totem {
    version: 2
    secauth: off
    interface {
        ringnumber: 0
        bindnetaddr: 192.168.0.0
        mcastport: 5405
        ttl: 1
    }
}

logging {
    to_syslog: yes
}

nodelist {
    node {
        ring0_addr: 192.168.0.121
        nodeid: 1
    }
    node {
        ring0_addr: 192.168.0.122
        nodeid: 2
    }
    node {
        ring0_addr: 192.168.0.123
        nodeid: 3
    }
}

quorum {
    provider: corosync_votequorum
}
EOF
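
The nodelist section repeats the same structure for every node, so it can be generated from a list of addresses instead of typed by hand. The following is a minimal sketch; gen_nodelist is a hypothetical helper name, and it assumes nodeid values are simply assigned in argument order starting at 1, as in the configuration above.

```shell
#!/usr/bin/env bash
# Sketch: print a corosync nodelist block for the given ring0 addresses.
# nodeid values are assigned in argument order, starting at 1.
gen_nodelist() {
    local id=1 addr
    echo "nodelist {"
    for addr in "$@"; do
        printf '    node {\n        ring0_addr: %s\n        nodeid: %d\n    }\n' "$addr" "$id"
        id=$((id + 1))
    done
    echo "}"
}

gen_nodelist 192.168.0.121 192.168.0.122 192.168.0.123
```

The output can be pasted into corosync.conf in place of a handwritten nodelist section.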

Restart the corosync service

sudo systemctl restart corosync

Check cluster status

$ sudo pcs status
Cluster name: 

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: NONE
  * Last updated: Wed Oct 30 13:32:33 2024
  * Last change:  Wed Oct 30 13:30:52 2024 by hacluster via crmd on server1
  * 1 node configured
  * 0 resource instances configured

Node List:
  * Node server1: UNCLEAN (offline)

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/enabled
  pacemaker: active/enabled
  pcsd: active/enabled

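3. Configure the Pacemaker cluster

Set up the cluster with pcs. Enable pcsd and set the hacluster password on every node; node authentication and cluster creation only need to be run on one node.

Enable the pcsd service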

sudo systemctl --now enable pcsd

Set the hacluster user password (replace the example value with a strong password outside a test environment)

echo 'hacluster:hacluster' | sudo chpasswd

Authenticate the nodes

sudo pcs host auth -u hacluster -p hacluster \
    192.168.0.121 192.168.0.122 192.168.0.123
192.168.0.121: Authorized
192.168.0.122: Authorized
192.168.0.123: Authorized

---

Remove the existing cluster (only needed when a previous cluster configuration exists)

sudo pcs cluster destroy

Stop the cluster services

sudo systemctl stop corosync pacemaker

---

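Create the cluster. As the output shows, pcs destroys any existing cluster configuration on the listed hosts and distributes a new corosync.conf, which replaces the file written manually in step 2.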

sudo pcs cluster setup my_cluster \
    192.168.0.121 192.168.0.122 192.168.0.123
No addresses specified for host '192.168.0.121', using '192.168.0.121'
No addresses specified for host '192.168.0.122', using '192.168.0.122'
No addresses specified for host '192.168.0.123', using '192.168.0.123'
Destroying cluster on hosts: '192.168.0.121', '192.168.0.122', '192.168.0.123'...
192.168.0.122: Successfully destroyed cluster
192.168.0.121: Successfully destroyed cluster
192.168.0.123: Successfully destroyed cluster
Requesting remove 'pcsd settings' from '192.168.0.121', '192.168.0.122', '192.168.0.123'
192.168.0.121: successful removal of the file 'pcsd settings'
192.168.0.122: successful removal of the file 'pcsd settings'
192.168.0.123: successful removal of the file 'pcsd settings'
Sending 'corosync authkey', 'pacemaker authkey' to '192.168.0.121', '192.168.0.122', '192.168.0.123'
192.168.0.121: successful distribution of the file 'corosync authkey'
192.168.0.121: successful distribution of the file 'pacemaker authkey'
192.168.0.122: successful distribution of the file 'corosync authkey'
192.168.0.122: successful distribution of the file 'pacemaker authkey'
192.168.0.123: successful distribution of the file 'corosync authkey'
192.168.0.123: successful distribution of the file 'pacemaker authkey'
Sending 'corosync.conf' to '192.168.0.121', '192.168.0.122', '192.168.0.123'
192.168.0.121: successful distribution of the file 'corosync.conf'
192.168.0.122: successful distribution of the file 'corosync.conf'
192.168.0.123: successful distribution of the file 'corosync.conf'
Cluster has been successfully set up.

Start the cluster

sudo pcs cluster start --all
192.168.0.121: Starting Cluster...
192.168.0.122: Starting Cluster...
192.168.0.123: Starting Cluster...

Stop the cluster (run this only when the cluster needs to be shut down)

sudo pcs cluster stop --all

Check cluster status

Immediately after startup, the nodes can briefly appear as UNCLEAN (offline):

$ sudo pcs status
Cluster name: my_cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: unknown
  * Current DC: NONE
  * Last updated: Wed Oct 30 14:37:19 2024
  * Last change:  Wed Oct 30 14:37:15 2024 by hacluster via crmd on 192.168.0.121
  * 3 nodes configured
  * 0 resource instances configured

Node List:
  * Node 192.168.0.121: UNCLEAN (offline)
  * Node 192.168.0.122: UNCLEAN (offline)
  * Node 192.168.0.123: UNCLEAN (offline)

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Less than a minute later, all three nodes are reported online:

$ sudo pcs status
Cluster name: my_cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: 192.168.0.123 (version 2.1.2-ada5c3b36e2) - partition with quorum
  * Last updated: Wed Oct 30 14:38:12 2024
  * Last change:  Wed Oct 30 14:37:36 2024 by hacluster via crmd on 192.168.0.123
  * 3 nodes configured
  * 0 resource instances configured

Node List:
  * Online: [ 192.168.0.121 192.168.0.122 192.168.0.123 ]

Full List of Resources:
  * No resources

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled
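
The phrase "partition with quorum" in the status output means that the running nodes hold a majority of the votes. As a worked example of that arithmetic, here is a small sketch that assumes one vote per node and no special votequorum options; quorum_votes and tolerated_failures are hypothetical helper names.

```shell
#!/usr/bin/env bash
# Sketch: majority quorum arithmetic for a cluster of N equally weighted nodes.
quorum_votes() {          # minimum votes needed: floor(N/2) + 1
    echo $(( $1 / 2 + 1 ))
}

tolerated_failures() {    # nodes that can fail while quorum is kept
    echo $(( $1 - ($1 / 2 + 1) ))
}

for n in 2 3 5; do
    echo "nodes=$n quorum=$(quorum_votes "$n") tolerated_failures=$(tolerated_failures "$n")"
done
```

For the three-node cluster in this guide, quorum is 2 votes, so the cluster keeps quorum with one node down but not with two.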

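4. Basic cluster settings

Disable STONITH (fencing). Without a fencing device, Pacemaker normally refuses to start resources while stonith-enabled is true, so a test cluster typically turns it off.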

sudo pcs property set stonith-enabled=false

Enable STONITH (fencing) again once fencing devices are configured (recommended for production)

sudo pcs property set stonith-enabled=true

Check the cluster properties

sudo pcs property config
Cluster Properties:
 cluster-infrastructure: corosync
 cluster-name: my_cluster
 dc-version: 2.1.2-ada5c3b36e2
 have-watchdog: false
 stonith-enabled: true

5. Add the VIP resource

Add a VIP resource so that the cluster manages the address 192.168.0.120.

sudo pcs resource create VirtualIP ocf:heartbeat:IPaddr2 ip=192.168.0.120 \
    cidr_netmask=24 op monitor interval=30s

Check cluster status

sudo pcs status
Cluster name: my_cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: 192.168.0.123 (version 2.1.2-ada5c3b36e2) - partition with quorum
  * Last updated: Wed Oct 30 14:40:25 2024
  * Last change:  Wed Oct 30 14:40:12 2024 by root via cibadmin on 192.168.0.121
  * 3 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ 192.168.0.121 192.168.0.122 192.168.0.123 ]

Full List of Resources:
  * VirtualIP   (ocf:heartbeat:IPaddr2):         Started 192.168.0.121

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

Check the resource status

sudo pcs resource status
  * VirtualIP   (ocf:heartbeat:IPaddr2):         Started 192.168.0.121
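
The resource status only reports what Pacemaker believes. To confirm on the node itself that the address is actually assigned, the output of ip can be inspected. This is a minimal sketch that assumes the one-line format of ip -o -4 addr show, where the fourth field is the address with its prefix length; has_ipv4_addr is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: succeed if the given IPv4 address appears in `ip -o -4 addr show` output on stdin.
# Assumes field 4 of each line is ADDRESS/PREFIX (for example 192.168.0.120/24).
has_ipv4_addr() {
    awk -v ip="$1" '
        { split($4, parts, "/"); if (parts[1] == ip) found = 1 }
        END { exit !found }
    '
}

# Usage on the node that should hold the VIP (not executed here):
#   ip -o -4 addr show | has_ipv4_addr 192.168.0.120 && echo "VIP is assigned here"
```

Running the check on every node should show the VIP on exactly one of them.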

Review the resource configuration

sudo pcs resource config
 Resource: VirtualIP (class=ocf provider=heartbeat type=IPaddr2)
  Attributes: cidr_netmask=24 ip=192.168.0.120
  Operations: monitor interval=30s (VirtualIP-monitor-interval-30s)
              start interval=0s timeout=20s (VirtualIP-start-interval-0s)
              stop interval=0s timeout=20s (VirtualIP-stop-interval-0s)
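
Show the configuration of a single resource

sudo pcs resource config VirtualIP

Delete the resource (only when it needs to be recreated)

sudo pcs resource delete VirtualIP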

Start (enable) the resource

sudo pcs resource enable VirtualIP

Clear the resource failure history

sudo pcs resource cleanup VirtualIP

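6. Failover settings

Configure the failover behavior. The VIP is a plain (primitive) resource, so Pacemaker already starts it on another node when its current node fails. A master/slave (promotable) configuration does not apply to IPaddr2, and the pcs resource master command was removed in pcs 0.10. To express a preference for where the VIP runs, a location constraint with a score of 100 can be added, for example:

sudo pcs constraint location VirtualIP prefers 192.168.0.121=100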

Check cluster status

Check the cluster status to verify that everything is configured correctly.

sudo pcs status
Cluster name: my_cluster

WARNINGS:
No stonith devices and stonith-enabled is not false

Cluster Summary:
  * Stack: corosync
  * Current DC: server3.cluster.local (version 2.1.2-ada5c3b36e2) - partition with quorum
  * Last updated: Wed Oct 30 11:57:10 2024
  * Last change:  Wed Oct 30 11:56:42 2024 by root via cibadmin on server1.cluster.local
  * 3 nodes configured
  * 1 resource instance configured

Node List:
  * Online: [ server1.cluster.local server2.cluster.local server3.cluster.local ]

Full List of Resources:
  * VirtualIP   (ocf:heartbeat:IPaddr2):         Stopped

Daemon Status:
  corosync: active/disabled
  pacemaker: active/disabled
  pcsd: active/enabled

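7. Failover test

To test failover, move the VIP resource off the node that currently runs it and confirm that another node takes it over.

Check which node is currently running the VIP resource.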

sudo pcs status

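Force the VIP resource to move away from that node. Without a destination node, pcs adds a location constraint that bans the resource from its current node.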

sudo pcs resource move VirtualIP

Check the status to confirm that the VIP has moved to another node.

sudo pcs status
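
Instead of rerunning the status command by hand, the check can be polled until the resource is reported on a different node. The following is a minimal sketch that assumes the resource line format shown in the status output earlier in this guide; current_node, wait_for_failover, and the STATUS_CMD override are hypothetical names.

```shell
#!/usr/bin/env bash
# Sketch: wait until a resource is reported as Started on a node other than the old one.
# STATUS_CMD is a hypothetical override; by default the real pcs command is used.
STATUS_CMD="${STATUS_CMD:-sudo pcs resource status}"

current_node() {   # $1 = resource name; prints the node name, or nothing if not started
    $STATUS_CMD 2>/dev/null |
        awk -v r="$1" '$2 == r && $(NF-1) == "Started" { print $NF }'
}

wait_for_failover() {   # $1 = resource, $2 = node it ran on before, $3 = timeout in seconds
    local res="$1" old="$2" timeout="${3:-60}" node i=0
    while [ "$i" -lt "$timeout" ]; do
        node="$(current_node "$res")"
        if [ -n "$node" ] && [ "$node" != "$old" ]; then
            echo "$node"
            return 0
        fi
        sleep 1
        i=$((i + 1))
    done
    return 1
}

# Usage (not executed here):
#   old="$(current_node VirtualIP)"
#   sudo pcs resource move VirtualIP
#   wait_for_failover VirtualIP "$old" 60 && echo "failover complete"
```

Polling once per second also gives a rough upper bound on how long the VIP took to come up on the new node.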

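8. Restore the cluster

After the test, remove the location constraint created by pcs resource move so that the resource is again allowed to run on the original node. Whether it actually moves back depends on resource stickiness and placement scores.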

sudo pcs resource clear VirtualIP
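
To confirm that nothing was left behind, the constraint list can be checked for temporary entries. This is a minimal sketch that assumes the cli-ban- and cli-prefer- id prefixes that Pacemaker's tooling gives to move and ban constraints; leftover_move_constraints is a hypothetical helper name.

```shell
#!/usr/bin/env bash
# Sketch: count leftover temporary constraints in `pcs constraint --full` output (read from stdin).
# Assumes constraint ids of the form cli-ban-<resource>-on-<node> or cli-prefer-<resource>.
leftover_move_constraints() {   # $1 = resource name
    # grep -c prints the match count; "|| true" keeps a zero count from being an error
    grep -cE "cli-(ban|prefer)-$1" || true
}

# Usage (not executed here):
#   sudo pcs constraint --full | leftover_move_constraints VirtualIP
```

A result of 0 indicates that no move or ban constraint for the resource remains.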

The HA cluster built with Corosync and Pacemaker is now configured, and failover can be tested through the VIP.

 

 
