Friday, February 26, 2021

[389-users] How to analyze large Multi Master Replication (test)-network?

We are testing scenarios with Multi Master Replication (MMR) with many replicas. We need to gather experience in this area.
The current scenario is a star topology with one central server with replication agreements with all others, and the others only with the central server. It is scaled up to 63 servers replicating with the central server.
To analyze the behaviour in certain situations, we would like a reliable indicator of the replication status.

What we found was the attribute nsds50ruv of the replication agreement. The Red Hat documentation says:

Information about the replication topology — all of the suppliers which are supplying updates to each other and other replicas within the same replication group — is contained in a set of metadata called the replica update vector (RUV). The RUV contains information about the supplier like its ID and URL, its latest change state number for changes made on the local server, and the CSN of the first change. Both suppliers and consumers store RUV information, and they use it to control replication updates.
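To make the value layout concrete, here is a minimal Python sketch that splits an nsds50ruv element into its parts, assuming the commonly documented layout ("{replicageneration} <CSN>" for the generation element, and "{replica <rid> <URL>} [<minCSN> [<maxCSN>]]" per supplier). The sample values and host names are made up, and this is our reading of the format, not the server's authoritative parser.

```python
import re

# Hypothetical sample values, modeled on the layout described above.
SAMPLE_RUV = [
    "{replicageneration} 5e3c7e21000000010000",
    "{replica 1 ldap://central.example.com:389} 5e3c8e21000000010000 5f0a1101000100010000",
]

def parse_ruv_element(value):
    """Split one nsds50ruv value into its parts (a sketch, not the server's parser)."""
    gen = re.match(r"\{replicageneration\}\s+(\S+)", value)
    if gen:
        return {"type": "generation", "csn": gen.group(1)}
    rep = re.match(r"\{replica (\d+) (ldap://\S+)\}\s*(\S+)?\s*(\S+)?", value)
    if rep:
        return {
            "type": "replica",
            "rid": int(rep.group(1)),          # replica ID of the supplier
            "url": rep.group(2),               # supplier URL
            "min_csn": rep.group(3),           # CSN of the first change (may be absent)
            "max_csn": rep.group(4),           # latest CSN seen from this supplier (may be absent)
        }
    return None

for element in SAMPLE_RUV:
    print(parse_ruv_element(element))
```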

1. How is the attribute managed and used, and are there limitations for highly scaled MMR scenarios?
2. Is it necessary that all replicas of all servers in the MMR are listed in this attribute? Are the entries automatically distributed, and who controls this?
3. What do those entries have to look like for the particular server "types": the central server with replication agreements to all others (1), and the "others" (2)?
(1) {replicageneration} <ID> + n * {replica <replica ID> ldap://<FQDN>:389}
(2) {replicageneration} <ID> + {replica <replica ID of central server> ldap://<FQDN of central server>:389}
4. Some entries end with some kind of ID, possibly a replication generation ID, and some don't. Why is that?
5. What is the meaning of the trailing identifier in the attribute nsruvreplicalastmodified; e.g.: {replica 1 ldap://<FQDN>:389} 00000000
6. What is a replication group and is that a way to help handling a highly scaled MMR scenario while limiting replication traffic and conflicts, or does "replication group" mean the complete network of the servers in the topology?
7. In what order must the MMR network be set up? First set up all the other MMR servers and then the central server?
8. We have noticed that the entries appear only after a restart of the central MMR server. Is this correct, or are there other steps required to configure the MMR?
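Regarding question 5: our working assumption, which we would like confirmed, is that the trailing hex field of nsruvreplicalastmodified is a Unix timestamp of the last received change, with 00000000 meaning no update has been seen yet. Under that assumption, and the commonly documented CSN layout (8 hex digits of Unix time, 4 hex sequence number, 4 hex replica ID, 4 hex subsequence number), the values can be decoded like this:

```python
from datetime import datetime, timezone

def decode_csn(csn):
    """Decode a CSN, assuming the layout: 8 hex chars Unix time,
    4 hex sequence number, 4 hex replica ID, 4 hex subsequence number."""
    assert len(csn) == 20, "expected a 20-character CSN"
    return {
        "time": datetime.fromtimestamp(int(csn[0:8], 16), tz=timezone.utc),
        "seqnum": int(csn[8:12], 16),
        "rid": int(csn[12:16], 16),
        "subseq": int(csn[16:20], 16),
    }

def decode_last_modified(hexts):
    """Decode the trailing hex timestamp of nsruvreplicalastmodified;
    00000000 is assumed to mean 'no update received yet'."""
    ts = int(hexts, 16)
    return None if ts == 0 else datetime.fromtimestamp(ts, tz=timezone.utc)
```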

Documentation read so far:
[RHC 10, Ch.]:
[RHA 10, Ch. 15]:

Best regards,
