Tuesday, June 19, 2018

[389-users] Re: Master-slave replication procedure



On 06/19/2018 04:47 PM, Michal Medvecky wrote:
Hello,

I'm trying hard to figure out the right (ansible-automated) procedure for setting up master-slave replication, but I often get RUV errors on agreements pointing to already initialized replicas.

My scenario is with 4 master servers (with multimaster replication working correctly) and 4 (independent) slave servers.

List of steps:

0) set up master-master replication between master servers (works OK)

1) create replication user cn=myreplicationusername,cn=config on all slaves 

2) create LDAP entry:
dn: cn=replica,cn="dc=test,dc=com",cn=mapping tree,cn=config
      nsds5replicaroot: "dc=test,dc=com"
      nsds5replicaid: "{{ range(1,65530) | random }}"

If these are read-only consumers, the replica ID must be 65535 (for all of them), not a random number. Only masters get unique replica IDs. This is probably your problem. Fix this first, and if you still have problems, turn on replication error logging and share the results.

Thanks,
Mark

      nsds5replicatype: "2"
      nsds5ReplicaBindDN: "cn=myreplicationusername,cn=config"
      nsds5flags: "0"
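
Following Mark's point above, the consumer entry from step 2 could be generated with a fixed replica ID instead of a random one (in the Ansible template this amounts to nsds5replicaid: "65535"). The sketch below is only an illustration: the function names are hypothetical, and the master ID scheme (index + 1) is an assumption, since the thread only says that master IDs must be unique.

```python
# Hypothetical helpers for illustration; not part of 389-ds or Ansible.
READ_ONLY_REPLICA_ID = 65535  # shared by all read-only consumers (per Mark's reply)


def consumer_replica_attrs(suffix, bind_dn):
    """Return the attributes for a read-only consumer's replica entry."""
    return {
        "nsds5replicaroot": suffix,
        "nsds5replicaid": str(READ_ONLY_REPLICA_ID),  # fixed, never random
        "nsds5replicatype": "2",
        "nsds5ReplicaBindDN": bind_dn,
        "nsds5flags": "0",
    }


def master_replica_id(index):
    """Map a 0-based master index to a unique replica ID below 65535.

    The index + 1 scheme is an assumption made for this sketch.
    """
    replica_id = index + 1
    if not 1 <= replica_id < READ_ONLY_REPLICA_ID:
        raise ValueError(f"master replica ID out of range: {replica_id}")
    return replica_id


if __name__ == "__main__":
    attrs = consumer_replica_attrs(
        "dc=test,dc=com", "cn=myreplicationusername,cn=config"
    )
    print(attrs["nsds5replicaid"])
    print([master_replica_id(i) for i in range(4)])
```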


3) create a read-only agreement from every master to every slave:
on every master server, create one LDAP entry
per slave:
    dn: "cn=ro-to-{{ one of slaves }},cn=replica,cn="dc=test,dc=com",cn=mapping tree,cn=config"
    objectClass:
      - nsds5replicationagreement
      - top
    attributes:
      nsds5replicahost: "{{ one of slaves }}"
      nsds5replicaport: "389"
      nsds5ReplicaBindDN: "cn=myreplicationusername,cn=config"
      nsds5replicabindmethod: "SIMPLE"
      nsds5ReplicaTransportInfo: "LDAP"
      nsds5replicaroot: "dc=test,dc=com"
      description: "Agreement between {{ me }} and {{ one of slaves }}"
      nsds5replicaupdateschedule: "0001-2359 0123456"
      nsds5replicatedattributelist: "(objectclass=*) $ EXCLUDE authorityRevocationList"
      nsds5replicacredentials: "unbreakable"
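
As a rough sketch of step 3, the per-slave agreement entries for one master could be built like this. The function name and its parameters are hypothetical; the attribute names and values are copied from the entry above, and nothing here talks to a server.

```python
# Hypothetical helper for illustration; it only builds (dn, attributes) pairs.
def agreement_entries(me, slaves, suffix, bind_dn, credentials):
    """Yield one read-only agreement entry per slave for the master `me`."""
    for slave in slaves:
        dn = (
            f'cn=ro-to-{slave},cn=replica,cn="{suffix}",'
            "cn=mapping tree,cn=config"
        )
        attrs = {
            "objectClass": ["nsds5replicationagreement", "top"],
            "nsds5replicahost": slave,
            "nsds5replicaport": "389",
            "nsds5ReplicaBindDN": bind_dn,
            "nsds5replicabindmethod": "SIMPLE",
            "nsds5ReplicaTransportInfo": "LDAP",
            "nsds5replicaroot": suffix,
            "description": f"Agreement between {me} and {slave}",
            "nsds5replicaupdateschedule": "0001-2359 0123456",
            "nsds5replicatedattributelist":
                "(objectclass=*) $ EXCLUDE authorityRevocationList",
            "nsds5replicacredentials": credentials,
        }
        yield dn, attrs


if __name__ == "__main__":
    slaves = [f"ldap-slave0{i}.test.com" for i in range(1, 5)]
    for dn, _ in agreement_entries(
        "ldap-master01.test.com", slaves, "dc=test,dc=com",
        "cn=myreplicationusername,cn=config", "unbreakable",
    ):
        print(dn)
```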


4) refresh the replicas (created in step 2) on all hosts except the first master:

on {{ first master server }}, update all agreements with nsds5BeginReplicaRefresh: "start"

5) wait until nsds5BeginReplicaRefresh attribute disappears
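
The wait in step 5 can be sketched as a polling loop with a timeout. This is a minimal illustration, not the procedure used in the playbook: wait_for_refresh and its read_attrs callback are hypothetical, and read_attrs is assumed to return the agreement's current attributes as a dict (for example from an LDAP search against the agreement DN).

```python
import time


# Hypothetical helper for illustration; `read_attrs` is supplied by the caller
# and is assumed to return the agreement entry's attributes as a dict.
def wait_for_refresh(read_attrs, timeout=600.0, interval=2.0,
                     sleep=time.sleep, clock=time.monotonic):
    """Poll until nsds5BeginReplicaRefresh is gone from the agreement entry."""
    deadline = clock() + timeout
    while True:
        attrs = read_attrs()
        # LDAP attribute names are case-insensitive, so compare lowercased.
        pending = any(
            name.lower() == "nsds5beginreplicarefresh" for name in attrs
        )
        if not pending:
            return attrs
        if clock() >= deadline:
            raise TimeoutError("replica refresh did not finish in time")
        sleep(interval)
```

The attribute disappearing only means the refresh ended, not that it succeeded, so it is probably worth also reading nsds5replicaLastInitStatus on the returned entry before moving on to the tests.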

6) run tests. 

And this is the pain point and the reason I'm emailing the list - I add a dummy record to every master server and check it on all slaves.

But tests often fail on a random server.

# ./test.sh
Testing master-slave replication ...
-----------
Adding entry to ldap-master01.test.com
adding new entry "uid=slave-repl-test-1,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-1 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-1 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master01
deleting entry "uid=slave-repl-test-1,dc=test,dc=com"

-----------
Adding entry to ldap-master02.test.com
adding new entry "uid=slave-repl-test-2,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-2 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-2 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master02
deleting entry "uid=slave-repl-test-2,dc=test,dc=com"

-----------
Adding entry to ldap-master03.test.com
adding new entry "uid=slave-repl-test-3,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-3 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-3 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master03
deleting entry "uid=slave-repl-test-3,dc=test,dc=com"

-----------
Adding entry to ldap-master04.test.com
adding new entry "uid=slave-repl-test-4,dc=test,dc=com"

Checking entry on slave servers
Checking uid=slave-repl-test-4 on ldap-slave01 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave02 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave03 ... 1 results ✓
Checking uid=slave-repl-test-4 on ldap-slave04 ... 0 results ☠
Removing entry from ldap-master04
deleting entry "uid=slave-repl-test-4,dc=test,dc=com"
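
To see whether the failures really are spread randomly, the test output can be grouped by slave. The sketch below assumes the exact "Checking ... on ... ... N results" line format shown above; failing_slaves is a hypothetical helper, not part of test.sh. For the run above it would report only ldap-slave04, for all four test entries.

```python
import re
from collections import defaultdict

# Matches lines such as:
#   Checking uid=slave-repl-test-1 on ldap-slave04 ... 0 results
CHECK_RE = re.compile(r"Checking (\S+) on (\S+) \.\.\. (\d+) results")


# Hypothetical helper for illustration; not part of test.sh.
def failing_slaves(output):
    """Map each slave to the test entries it failed to return."""
    failures = defaultdict(list)
    for line in output.splitlines():
        match = CHECK_RE.search(line)
        if match and int(match.group(3)) == 0:
            entry, slave = match.group(1), match.group(2)
            failures[slave].append(entry)
    return dict(failures)
```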

List agreement update status on ldap-master01:

ldap-master01: 

dn: cn=ro-to-ldap-slave01.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave01.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave02.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave02.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave03.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave03.test.com
nsds5replicaLastUpdateStatus: Error (1) Can't acquire busy replica

dn: cn=ro-to-ldap-slave04.test.com,cn=replica,cn=dc\3Dtest\2Cdc\3Dcom,cn=mapping tree,cn=config
cn: ro-to-ldap-slave04.test.com
nsds5replicaLastUpdateStatus: Error (19) Replication error acquiring replica: Replica has different database generation ID, remote replica may need to be initialized (RUV error)

The fourth agreement appears to be uninitialized, but it definitely was initialized.
I know that "Can't acquire busy replica" is fine.
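
A check like the following could separate the harmless "busy replica" status from the one that calls for re-initialization. It is a sketch only: classify_status is a hypothetical name, it relies on the "Error (N) message" format shown in the output above, and treating code 0 as success is an assumption about this 389-ds version.

```python
import re

# Format taken from the nsds5replicaLastUpdateStatus values quoted above.
STATUS_RE = re.compile(r"^Error \((-?\d+)\)\s*(.*)$", re.DOTALL)


# Hypothetical helper for illustration; not part of 389-ds.
def classify_status(status):
    """Return 'ok', 'busy', 'needs_init', 'error' or 'unknown'."""
    match = STATUS_RE.match(status.strip())
    if not match:
        return "unknown"
    code, message = int(match.group(1)), match.group(2)
    if code == 0:
        return "ok"  # assumption: code 0 means the last update succeeded
    if "busy replica" in message:
        return "busy"  # transient; another master holds the replica
    if "generation ID" in message:
        return "needs_init"  # consumer data does not match this master
    return "error"
```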

What am I doing wrong?

389-ds 1.3.7.10-1ubuntu1 on Ubuntu 18.04.

Thank you for your help,

Michal


_______________________________________________
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://getfedora.org/code-of-conduct.html
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/389-users@lists.fedoraproject.org/message/5RYYJTZKPOAXXM2MJXYL6T2X6UDUVGPI/
