Tuesday, June 19, 2018

[389-users] Master-slave replication procedure

Hello,

I'm trying to work out the right (Ansible-automated) procedure for setting up master-slave replication, but I often get RUV errors on agreements that point to replicas which have already been initialized.

My setup has 4 master servers (multi-master replication between them works correctly) and 4 independent slave servers.

List of steps:

0) setup master-master replication between master servers (works OK)

1) create replication user cn=myreplicationusername,cn=config on all slaves 

2) create the replica LDAP entry:
dn: cn=replica,cn="dc=test,dc=com",cn=mapping tree,cn=config
      nsds5replicaroot: "dc=test,dc=com"
      nsds5replicaid: "{{ range(1,65530) | random }}"
      nsds5replicatype: "2"
      nsds5ReplicaBindDN: "cn=myreplicationusername,cn=config"
      nsds5flags: "0"


3) create an ro agreement from every master to every slave:
on every master server, create the following LDAP entry
for every slave:
    dn: "cn=ro-to-{{ one of slaves }},cn=replica,cn="dc=test,dc=com",cn=mapping tree,cn=config"
    objectClass:
      - nsds5replicationagreement
      - top
    attributes:
      nsds5replicahost: "{{ one of slaves }}"
      nsds5replicaport: "389"
      nsds5ReplicaBindDN: "cn=myreplicationusername,cn=config"
      nsds5replicabindmethod: "SIMPLE"
      nsds5ReplicaTransportInfo: "LDAP"
      nsds5replicaroot: "dc=test,dc=com"
      description: "Agreement between {{ me }} and {{ one of slaves }}"
      nsds5replicaupdateschedule: "0001-2359 0123456"
      nsds5replicatedattributelist: "(objectclass=*) $ EXCLUDE authorityRevocationList"
      nsds5replicacredentials: "unbreakable"


4) refresh the replicas (created in step 2) on all hosts except the first master:

on {{ first master server }}, update all agreements with nsds5BeginReplicaRefresh: "start"

5) wait until the nsds5BeginReplicaRefresh attribute disappears

6) run tests. 

This is the pain point and the reason I'm emailing the list: I add a dummy entry to every master server and check for it on all slaves.

But the tests often fail on a random server.
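
For reference, steps 3 and 4 above can be sketched as a small Python script that only generates LDIF text and does not talk to any server. The hostnames are taken from the test output in this post; the helper names (agreement_ldif, refresh_ldif, all_agreements) are made up for illustration and are not part of 389-ds or Ansible. Treat it as a sketch of the procedure as described, not as a verified fix.

```python
from itertools import product

SUFFIX = "dc=test,dc=com"
# DN quoting copied from the post; adjust if your server expects escaped form.
REPLICA_DN = f'cn=replica,cn="{SUFFIX}",cn=mapping tree,cn=config'
BIND_DN = "cn=myreplicationusername,cn=config"

# Hostnames as they appear in the test output of the post.
MASTERS = [f"ldap-master{i:02d}.test.com" for i in range(1, 5)]
SLAVES = [f"ldap-slave{i:02d}.test.com" for i in range(1, 5)]


def agreement_dn(slave):
    """DN of the agreement that points at one slave (same on every master)."""
    return f"cn=ro-to-{slave},{REPLICA_DN}"


def agreement_ldif(master, slave, password):
    """LDIF 'add' record for one master-to-slave agreement (step 3).

    Assumes the password needs no base64 encoding (no leading space, etc.).
    """
    attrs = [
        ("dn", agreement_dn(slave)),
        ("changetype", "add"),
        ("objectClass", "top"),
        ("objectClass", "nsds5replicationagreement"),
        ("cn", f"ro-to-{slave}"),
        ("nsds5replicahost", slave),
        ("nsds5replicaport", "389"),
        ("nsds5ReplicaBindDN", BIND_DN),
        ("nsds5replicabindmethod", "SIMPLE"),
        ("nsds5ReplicaTransportInfo", "LDAP"),
        ("nsds5replicaroot", SUFFIX),
        ("description", f"Agreement between {master} and {slave}"),
        ("nsds5replicaupdateschedule", "0001-2359 0123456"),
        ("nsds5replicatedattributelist",
         "(objectclass=*) $ EXCLUDE authorityRevocationList"),
        ("nsds5replicacredentials", password),
    ]
    return "\n".join(f"{key}: {value}" for key, value in attrs) + "\n"


def refresh_ldif(slave):
    """LDIF 'modify' record that starts a total refresh of one slave (step 4)."""
    return "\n".join([
        f"dn: {agreement_dn(slave)}",
        "changetype: modify",
        "replace: nsds5BeginReplicaRefresh",
        "nsds5BeginReplicaRefresh: start",
        "-",
    ]) + "\n"


def all_agreements(password):
    """Map each master to the list of agreement records it should receive."""
    plan = {master: [] for master in MASTERS}
    for master, slave in product(MASTERS, SLAVES):
        plan[master].append(agreement_ldif(master, slave, password))
    return plan


if __name__ == "__main__":
    plan = all_agreements("unbreakable")
    print(sum(len(records) for records in plan.values()), "agreements")
    # The refresh records are applied on the first master only (step 4).
    print(refresh_ldif(SLAVES[0]))
```

Each record could then be fed to ldapmodify on the matching master, with the refresh records applied on the first master only.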

# ./test.sh
Testing master-slave replication ...
-----------
Adding entry to ldap-master01.test.com
adding new entry "uid=slave-repl-test-1,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-1 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master01
deleting entry "uid=slave-repl-test-1,dc=test,dc=com"

-----------
Adding entry to ldap-master02.test.com
adding new entry "uid=slave-repl-test-2,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-2 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master02
deleting entry "uid=slave-repl-test-2,dc=test,dc=com"

-----------
Adding entry to ldap-master03.test.com
adding new entry "uid=slave-repl-test-3,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-3 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master03
deleting entry "uid=slave-repl-test-3,dc=test,dc=com"

-----------
Adding entry to ldap-master04.test.com
adding new entry "uid=slave-repl-test-4,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-4 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master04
deleting entry "uid=slave-repl-test-4,dc=test,dc=com"
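
One thing worth ruling out is that the check simply runs before the change has reached the slave. A minimal polling helper, sketched below in Python, retries a check until a timeout instead of looking only once. The same pattern fits step 5 (waiting for nsds5BeginReplicaRefresh to disappear). The names wait_until and make_lagging_slave are hypothetical, and the "slave" here is simulated; a real check would run an LDAP search instead.

```python
import time


def wait_until(check, timeout=30.0, interval=1.0):
    """Call check() repeatedly until it returns True or the timeout expires.

    Returns True on success and False if the deadline passed first.
    """
    deadline = time.monotonic() + timeout
    while True:
        if check():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)


def make_lagging_slave(polls_until_visible):
    """Simulated slave: the entry becomes visible after a few polls.

    Stand-in for a real search such as 'does uid=slave-repl-test-1 exist?'.
    """
    state = {"polls": 0}

    def entry_visible():
        state["polls"] += 1
        return state["polls"] >= polls_until_visible

    return entry_visible


if __name__ == "__main__":
    found = wait_until(make_lagging_slave(3), timeout=5.0, interval=0.1)
    print("entry replicated" if found else "entry still missing after timeout")
```

If the entry is still missing after a generous timeout, replication lag is unlikely to be the cause and the agreement status is the next place to look.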

Agreement update status as listed on ldap-master01:

ldap-master01: 

dn: cn=ro-to-ldap-slave01.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave01.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave02.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave02.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave03.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave03.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave04.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave04.test.com
nsds5replicaLastUpdateStatus: Error (19) Replication error acquiring replica: Replica has different database generation ID, remote replica may need to be initialized (RUV error)
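
To make the automated test report this condition directly, the status strings above can be parsed and sorted into rough categories. The Python sketch below assumes the "Error (N) message" layout seen in this output; other 389-ds versions may format the value differently. The category names and the keyword matching are my own heuristics based only on the two messages shown here, not an official classification.

```python
import re
from typing import NamedTuple


class ReplStatus(NamedTuple):
    code: int
    message: str


# Matches the layout seen in this post, e.g. "Error (19) Replication error ..."
STATUS_RE = re.compile(r"^Error \((-?\d+)\)\s*(.*)$")


def parse_status(value):
    """Split an nsds5replicaLastUpdateStatus value into code and message."""
    match = STATUS_RE.match(value.strip())
    if match is None:
        raise ValueError(f"unrecognized status format: {value!r}")
    return ReplStatus(int(match.group(1)), match.group(2))


def classify(status):
    """Heuristic category: 'ok', 'transient', 'needs_init' or 'failed'."""
    if status.code == 0:
        return "ok"
    if "busy replica" in status.message:
        # Another supplier currently holds the replica; normally retried.
        return "transient"
    if "generation ID" in status.message:
        # Supplier and consumer databases do not share the same origin.
        return "needs_init"
    return "failed"


if __name__ == "__main__":
    samples = [
        "Error (1) Can't acquire busy replica",
        "Error (19) Replication error acquiring replica: Replica has "
        "different database generation ID, remote replica may need to be "
        "initialized (RUV error)",
    ]
    for sample in samples:
        status = parse_status(sample)
        print(status.code, classify(status))
```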



The fourth agreement looks uninitialized, even though it was initialized in step 4 like the others. I know that "Can't acquire busy replica" is fine.

What am I doing wrong?

389-ds 1.3.7.10-1ubuntu1 on Ubuntu 18.04.

Thank you for your help.

Michal
