Friday, June 2, 2017

[389-users] Re: Broken replicas and CleanRUV question

On 06/02/2017 08:47 AM, Predrag Zečević - Technical Support Analyst wrote:
> On 05/31/17 20:44, Mark Reynolds wrote:
>>
>>
>> On 05/31/2017 06:00 AM, Predrag Zečević - Technical Support Analyst
>> wrote:
>>> Hi all,
>>>
>>> A long time ago we started with 389-DS and, due to lack of
>>> experience, I installed and used the admin server (which we later
>>> abandoned, because it is too complicated and requires someone at the keyboard).
>>>
>>> As a consequence, we started to replicate the netscapeRoot
>>> space... Over time, we have upgraded the software from the initial
>>> 389-ds-1.2.1-1.el5 (started from the FDS repository, moved to the EPEL one
>>> later) to today's 389-ds-base-1.3.5.14-1.el6.x86_64 (this one is
>>> compiled from source, and that was introduced before we migrated the
>>> boxes from RHEL5 to RHEL6 - actually CentOS).
>>>
>>> During various phases of the upgrades, the netscapeRoot replicas went out of
>>> sync (we did not spot that, because of a bug in the monitoring script -
>>> that is another issue).
>>>
>>> Our setup includes MultiMaster ReadWrite replication (ldap1 <-->
>>> ldap2) and one ReadOnly (ldap3, consumes from both suppliers in MMR).
>>>
>>> Right now, this:
>>> $ for ldap in ldap1 ldap2; do
>>>     ldapsearch -x -H ldaps://${ldap}.MyDomain.com \
>>>       -b "cn=mapping tree,cn=config" -D "cn=Directory Manager" \
>>>       -w ${DMPASS} -o ldif-wrap=no objectClass=nsDS5ReplicationAgreement |
>>>     awk -vLDAP=${ldap} '
>>>       /^dn/ {printf("#===== %s =====#\n%s\n", LDAP, $0); next}
>>>       /^nsDS5ReplicaHost:/ {printf("%s\n", $0); next}
>>>       /^nsds5replicaLastUpdateStatus:/ {printf("%s\n", $0); next}'
>>>   done
>>>
>>> returns (I have excluded the output for the working MyDomain replicas):
>>> #===== ldap1 =====#
>>> dn: cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> nsDS5ReplicaHost: ldap2.MyDomain.com
>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions started since server startup
>>> #===== ldap1 =====#
>>> dn: cn=2eLDAPror,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> nsDS5ReplicaHost: ldap3.MyDomain.com
>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions started since server startup
>>> #===== ldap2 =====#
>>> dn: cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> nsDS5ReplicaHost: ldap1.MyDomain.com
>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions started since server startup
>>> #===== ldap2 =====#
>>> dn: cn=2eLDAPror,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> nsDS5ReplicaHost: ldap3.MyDomain.com
>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions started since server startup
>>>
>>> I have tried various tricks to recover that replication, but w/o
>>> luck...
>>>
>>> When I check (for example ldap1) with this:
>>> $ ldapsearch -xLLLo ldif-wrap=no -H ldaps://ldap1.MyDomain.com -D
>>> 'cn=directory manager' -w ${DMPASS} -b o=netscapeRoot
>>> '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
>>>
>>>
>>> I get this result:
>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> objectClass: nsDS5Replica
>>> objectClass: top
>>> nsDS5ReplicaRoot: o=netscaperoot
>>> nsDS5ReplicaType: 3
>>> nsDS5Flags: 1
>>> nsDS5ReplicaId: 11
>>> nsds5ReplicaPurgeDelay: 604800
>>> nsDS5ReplicaBindDN: cn=replication manager,cn=config
>>> nsDS5ReplicaReferral: ldap://ldap2.MyDomain.com:636/o%3dnetscaperoot
>>> cn: replica
>>> nsState:: CwAAAAAAAACRKiRZAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAAA==
>>> nsDS5ReplicaName: dc964102-1dd111b2-8970c75e-63880000
>>> nsds50ruv: {replicageneration} 4dcb9f790000000b0000
>>> nsds50ruv: {replica 11 ldap://ldap1.MyDomain.com:0}
>>> nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0} 4dda4a3a000000150000 4fd5f742000300150000
>>> nsds5agmtmaxcsn: o=netscaperoot;2eLDAPror;ldap3.MyDomain.com;636;unavailable
>>> nsruvReplicaLastModified: {replica 11 ldap://ldap1.MyDomain.com:0} 00000000
>>> nsruvReplicaLastModified: {replica 21 ldap://ldap2.MyDomain.com:0} 00000000
>>> nsds5ReplicaChangeCount: 1
>>> nsds5replicareapactive: 0
>>>
>>> I tried a CleanRUV task (LDIF applied with the ldapmodify command to all
>>> suppliers and consumers):
>>>
>>> $ cat /tmp/ldap.cleanRUV-tasks-for-netscapeRoot-replica.11.ldif
>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> changetype: modify
>>> replace: nsds5task
>>> nsds5task: CLEANRUV11
>>>
>>> At some point, ldap1 replied:
>>> "ldap_modify: Server is unwilling to perform (53)"
>>>
>>> which explains nothing, because that error means:
>>>
>>> "Indicates that the LDAP server cannot process the request because of
>>> server-defined restrictions. This error is returned for the following
>>> reasons: The add entry request violates the server's structure
>>> rules...OR...The modify attribute request specifies attributes that
>>> users cannot modify...OR...Password restrictions prevent the
>>> action...OR...Connection restrictions prevent the action. "
>>>
>>> Right now, the CleanRUV task is stuck...
>> You should be using the cleanAllRUV task:
>>
>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/configuration_command_and_file_reference/perl_scripts#cleanallruv.pl
>>
>
> Hi Mark,
>
> I have tried the Perl script from above:
>
> LDAP1# /usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager"
> -w ${DMPASS} -b "dn:\
> cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config"
> -r 11 -P LDAPS

Hi Predrag,

Close, but it's this:

/usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager" -w
${DMPASS} -b "o=netscaperoot" -r 11 -P LDAPS
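
As a rough illustration of why the suffix is what matters here: judging by
the error output in this thread, cleanallruv.pl adds a task entry under
cn=cleanallruv,cn=tasks,cn=config, and the task then looks up the replica
from the base DN it is given. A minimal sketch of such a task entry follows,
assuming the replica-base-dn and replica-id attributes described in the
cleanAllRUV how-to linked in this thread. The function name and the
"clean <rid>" entry name are made up for this example.

```shell
#!/bin/sh
# Print a cleanAllRUV task entry for one replicated suffix and one replica ID.
# Assumption: attribute names follow the cleanAllRUV how-to; the function name
# and the "clean <rid>" entry label are hypothetical.
cleanallruv_task_ldif() {
    suffix=$1   # replicated suffix, e.g. o=netscaperoot (not an agreement DN)
    rid=$2      # replica ID to remove from the RUV
    cat <<EOF
dn: cn=clean ${rid},cn=cleanallruv,cn=tasks,cn=config
changetype: add
objectClass: extensibleObject
cn: clean ${rid}
replica-base-dn: ${suffix}
replica-id: ${rid}
EOF
}

# Show the entry for the case discussed in this thread.
cleanallruv_task_ldif "o=netscaperoot" 11
```

The printed LDIF could then be piped to something like
ldapmodify -x -H ldaps://ldap1.MyDomain.com -D "cn=directory manager" -w ${DMPASS}
but only after checking it against the documentation for the installed version.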

Regards,
Mark

> ldap_initialize( ldaps://ldap1.MyDomain.com:636/??base )
> ldap_add: Operations error (1)
> additional info: Could not find replica from dn((null))
> Failed to add task entry "cn=cleanallruv_2017_6_2_14_31_58,
> cn=cleanallruv, cn=tasks, cn=config" error (1)
>
> LDAP1# /usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager"
> -w ${DMPASS} -b
> "cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
> tree,cn=config" -r 11 -P LDAPS
> ldap_initialize( ldaps://ldap1.MyDomain.com:636/??base )
> ldap_add: Operations error (1)
> additional info: Could not find replica from
> dn(cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config)
> Failed to add task entry "cn=cleanallruv_2017_6_2_14_42_1,
> cn=cleanallruv, cn=tasks, cn=config" error (1)
>
> Replica DN specified as '"cn=replica,cn=o\3Dnetscaperoot,cn=mapping
> tree,cn=config"' also fails... Although THIS was returned as DN from
> ldapsearch command:
> $ ldapsearch -xLLLo ldif-wrap=no -H ldaps://ldap1.MyDomain.com -D
> 'cn=directory manager' -w ${DMPASS} -b o=netscapeRoot
> '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
>
> What string do I have to specify as the replica DN?
>
> Thanks in advance.
>
> With best regards.
> Predrag Zečević
>
>> or read up on:
>>
>> http://www.port389.org/docs/389ds/howto/howto-cleanruv.html#cleanallruv
>>
>> It also looks like your replicas are not initialized - so I would also
>> try that after cleaning out the old replica ids(ruvs).
>>> and replication is still broken... A similar situation is present on
>>> ldap2, with RUV 21 (if not worse):
>>>
>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>> objectClass: nsDS5Replica
>>> objectClass: top
>>> nsDS5ReplicaRoot: o=netscaperoot
>>> nsDS5ReplicaType: 3
>>> nsDS5Flags: 1
>>> nsDS5ReplicaId: 21
>>> nsds5ReplicaPurgeDelay: 604800
>>> nsDS5ReplicaBindDN: cn=replication manager,cn=config
>>> nsDS5ReplicaReferral: ldap://ldap1.MyDomain.com:636/o%3dnetscaperoot
>>> cn: replica
>>> nsState:: FQAAAAAAAADeiyVZAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
>>> nsDS5ReplicaName: cb016902-1dd111b2-821cbcea-f7780000
>>> nsds50ruv: {replicageneration} 4dcb9f790000000b0000
>>> nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0} 4dda4a3a000000150000 4fd5f742000300150000
>>> nsds5agmtmaxcsn: o=netscaperoot;2eLDAPmmr;ldap1.MyDomain.com;636;unavailable
>>> nsds5agmtmaxcsn: o=netscaperoot;2eLDAPror;ldap3.MyDomain.com;636;unavailable
>>> nsruvReplicaLastModified: {replica 21 ldap://ldap2.MyDomain.com:0} 00000000
>>> nsds5ReplicaChangeCount: 1
>>> nsds5replicareapactive: 0
>>>
>>>
>>> # What would be the proper way to get out of this situation?
>>> # Do I have to execute a CleanAllRUV task and start replication from
>>> scratch, or is there a better way?
>>>
>>> BTW, loglevel is set to 8192, so from ldap1 logs:
>>> $ sudo grep cleanruv_task: /var/log/dirsrv/slapd-ldap?/errors
>>> [31/May/2017:09:11:39 +0200] NSMMReplicationPlugin - cleanruv_task:
>>> cleaning rid (11)...
>>>
>>> we see that the task "started" and never finished.
>>>
>>> Any advice or documentation that is more up to date than:
>>> * http://directory.fedoraproject.org/docs/389ds/howto/howto-cleanruv.html#cleanruv
>>> * https://access.redhat.com/documentation/en-us/red_hat_directory_server/9.0/html/administration_guide/managing_replication-solving_common_replication_conflicts
>>> * http://directory.fedoraproject.org/docs/389ds/FAQ/troubleshoot-cleanallruv.html
>>>   (the CleanRUV troubleshooting FAQ is missing entirely)
>>>
>>> is welcome.
>>>
>>> With best regards.
>>> Predrag Zečević
>>
>
_______________________________________________
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
