Wednesday, June 14, 2017

[389-users] Re: Broken replicas and CleanRUV question

On 06/02/17 16:22, Mark Reynolds wrote:
>
>
> On 06/02/2017 08:47 AM, Predrag Zečević - Technical Support Analyst wrote:
>> On 05/31/17 20:44, Mark Reynolds wrote:
>>>
>>>
>>> On 05/31/2017 06:00 AM, Predrag Zečević - Technical Support Analyst
>>> wrote:
>>>> Hi all,
>>>>
>>>> A long time ago we started with 389-DS and, due to lack of
>>>> experience, I installed and used the admin server (which was abandoned
>>>> later, because it is too complicated and requires someone at the keyboard).
>>>>
>>>> As a consequence, we started to replicate the netscapeRoot
>>>> suffix... Over time, we upgraded the software from the initial
>>>> 389-ds-1.2.1-1.el5 (started from the FDS repository, moved to the EPEL one
>>>> later) to today's 389-ds-base-1.3.5.14-1.el6.x86_64 (this one is
>>>> compiled from source and was introduced before we migrated the
>>>> boxes from RHEL5 to RHEL6, actually CentOS).
>>>>
>>>> During various phases of the upgrades, the netscapeRoot replicas went
>>>> out of sync (we did not spot that because of a bug in the monitoring
>>>> script, but that is another issue).
>>>>
>>>> Our setup includes MultiMaster ReadWrite replication (ldap1 <-->
>>>> ldap2) and one ReadOnly (ldap3, consumes from both suppliers in MMR).
>>>>
>>>> Right now, this:
>>>> $ for ldap in ldap1 ldap2; do
>>>>   ldapsearch -x -H ldaps://${ldap}.MyDomain.com \
>>>>     -b "cn=mapping tree,cn=config" -D "cn=Directory Manager" \
>>>>     -w ${DMPASS} -o ldif-wrap=no objectClass=nsDS5ReplicationAgreement |
>>>>   awk -v LDAP=${ldap} '
>>>>     /^dn/ {printf("#===== %s =====#\n%s\n", LDAP, $0); next}
>>>>     /^nsDS5ReplicaHost:/ {printf("%s\n", $0); next}
>>>>     /^nsds5replicaLastUpdateStatus:/ {printf("%s\n", $0); next}'
>>>> done
>>>>
>>>> returns (I have excluded working MyDomain replicas output):
>>>> $ #===== ldap1 =====#
>>>> dn: cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>>>> tree,cn=config
>>>> nsDS5ReplicaHost: ldap2.MyDomain.com
>>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions
>>>> started since server startup
>>>> #===== ldap1 =====#
>>>> dn: cn=2eLDAPror,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>>>> tree,cn=config
>>>> nsDS5ReplicaHost: ldap3.MyDomain.com
>>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions
>>>> started since server startup
>>>> #===== ldap2 =====#
>>>> dn: cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>>>> tree,cn=config
>>>> nsDS5ReplicaHost: ldap1.MyDomain.com
>>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions
>>>> started since server startup
>>>> #===== ldap2 =====#
>>>> dn: cn=2eLDAPror,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>>>> tree,cn=config
>>>> nsDS5ReplicaHost: ldap3.MyDomain.com
>>>> nsds5replicaLastUpdateStatus: Error (0) No replication sessions
>>>> started since server startup
>>>>
>>>> I have tried various tricks to recover that replication, but w/o
>>>> luck...
>>>>
>>>> When I check (for example ldap1) with this:
>>>> $ ldapsearch -xLLLo ldif-wrap=no -H ldaps://ldap1.MyDomain.com -D
>>>> 'cn=directory manager' -w ${DMPASS} -b o=netscapeRoot
>>>> '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
>>>>
>>>>
>>>> I get as result:
>>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>>> objectClass: nsDS5Replica
>>>> objectClass: top
>>>> nsDS5ReplicaRoot: o=netscaperoot
>>>> nsDS5ReplicaType: 3
>>>> nsDS5Flags: 1
>>>> nsDS5ReplicaId: 11
>>>> nsds5ReplicaPurgeDelay: 604800
>>>> nsDS5ReplicaBindDN: cn=replication manager,cn=config
>>>> nsDS5ReplicaReferral: ldap://ldap2.MyDomain.com:636/o%3dnetscaperoot
>>>> cn: replica
>>>> nsState:: CwAAAAAAAACRKiRZAAAAAAAAAAAAAAAAAQAAAAAAAAAAAAAAAAAAAA==
>>>> nsDS5ReplicaName: dc964102-1dd111b2-8970c75e-63880000
>>>> nsds50ruv: {replicageneration} 4dcb9f790000000b0000
>>>> nsds50ruv: {replica 11 ldap://ldap1.MyDomain.com:0}
>>>> nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0}
>>>> 4dda4a3a000000150000 4fd5f742000300150000
>>>> nsds5agmtmaxcsn:
>>>> o=netscaperoot;2eLDAPror;ldap3.MyDomain.com;636;unavailable
>>>> nsruvReplicaLastModified: {replica 11 ldap://ldap1.MyDomain.com:0}
>>>> 00000000
>>>> nsruvReplicaLastModified: {replica 21 ldap://ldap2.MyDomain.com:0}
>>>> 00000000
>>>> nsds5ReplicaChangeCount: 1
>>>> nsds5replicareapactive: 0
>>>>
>>>> Tried to CleanRUV (ldif applied with ldapmodify command to all
>>>> suppliers and consumers):
>>>>
>>>> $ cat /tmp/ldap.cleanRUV-tasks-for-netscapeRoot-replica.11.ldif
>>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>>> changetype: modify
>>>> replace: nsds5task
>>>> nsds5task: CLEANRUV11
>>>>
>>>> At some moment, ldap1 replied:
>>>> "ldap_modify: Server is unwilling to perform (53)"
>>>>
>>>> which explains nothing, because that error means:
>>>>
>>>> "Indicates that the LDAP server cannot process the request because of
>>>> server-defined restrictions. This error is returned for the following
>>>> reasons: The add entry request violates the server's structure
>>>> rules...OR...The modify attribute request specifies attributes that
>>>> users cannot modify...OR...Password restrictions prevent the
>>>> action...OR...Connection restrictions prevent the action. "
>>>>
>>>> Right now, CleanRUV task is stuck...
>>> You should be using the cleanAllRUV task:
>>>
>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/configuration_command_and_file_reference/perl_scripts#cleanallruv.pl
>>>
>>
>> Hi Mark,
>>
>> I have tried perl script from above:
>>
>> LDAP1# /usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager"
>> -w ${DMPASS} -b "dn:\
>> cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config"
>> -r 11 -P LDAPS
>
> Hi Predrag,
>
> Close, but it's this:
>
> /usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager" -w
> ${DMPASS} -b "o=netscaperoot" -r 11 -P LDAPS
>
> Regards,
> Mark

Hi Mark,

that also did not help. Right now, when I search all replicas (using
-b "o=netscapeRoot"
'(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
as base and filter),

I get very confusing, inconsistent results:
### ldap1 ###
dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
nsds50ruv: {replica 11 ldap://ldap1.MyDomain.com:0}
### ldap2 ###
dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0}
nsds50ruv: {replica 11 ldap://ldap1.MyDomain.com:0}
### ldap3 ###
dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0} 4dda4a3a000000150000
4fd5f742000300150000
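
For reading the RUV lines above, a small shell sketch that splits a CSN into its fields may help. It assumes the commonly documented 389-DS CSN layout of 20 hex digits: 8 for the timestamp (seconds since the epoch), 4 for the sequence number, 4 for the replica ID and 4 for the sub-sequence number. The function name `decode_csn` is made up for illustration.

```shell
#!/bin/sh
# decode_csn: split a 20-hex-digit CSN into its fields.
# Assumed layout: timestamp(8) seqnum(4) rid(4) subseq(4).
decode_csn() {
    csn=$1
    # Reject empty input and anything containing a non-hex character.
    case $csn in
        *[!0-9a-fA-F]*|"") echo "invalid CSN: $csn" >&2; return 1 ;;
    esac
    # Reject anything that is not exactly 20 digits long.
    if [ "${#csn}" -ne 20 ]; then
        echo "invalid CSN: $csn" >&2
        return 1
    fi
    ts=$(printf '%s' "$csn" | cut -c1-8)
    seqnum=$(printf '%s' "$csn" | cut -c9-12)
    rid=$(printf '%s' "$csn" | cut -c13-16)
    sub=$(printf '%s' "$csn" | cut -c17-20)
    # printf converts the 0x-prefixed hex fields to decimal.
    printf 'time=%d seq=%d rid=%d subseq=%d\n' \
        "0x$ts" "0x$seqnum" "0x$rid" "0x$sub"
}
```

With that layout, `decode_csn 4dda4a3a000000150000` prints `time=1306151482 seq=0 rid=21 subseq=0`, and on GNU systems `date -d @1306151482` turns the timestamp into a readable date. Under the same assumption, the newest CSN that ldap3 holds for replica 21 (4fd5f742000300150000) would date from mid-2012.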

I can afford to live without that replication (since we are not using
the admin server at all), so the next question is: how do I permanently
remove all agreements for netscapeRoot from all servers involved?
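
One possible approach, sketched under assumptions rather than confirmed by anyone on the list: the agreements and the replica configuration are ordinary entries under `cn=mapping tree,cn=config`, so they could be deleted with `ldapmodify`, agreements first and the `cn=replica` entry last. The helper below only prints the LDIF and changes nothing itself. The function name `gen_remove_ldif` is hypothetical, and the agreement names are the ones from the output earlier in this thread.

```shell
#!/bin/sh
# gen_remove_ldif: print LDIF that deletes the named replication
# agreements for a suffix, then the replica entry itself.
# Usage: gen_remove_ldif SUFFIX [AGREEMENT...]
gen_remove_ldif() {
    suffix=$1
    shift
    # Escape "=" as \3D, matching how the suffix appears inside the
    # DNs shown above (cn=o\3Dnetscaperoot).
    escaped=$(printf '%s' "$suffix" | sed 's/=/\\3D/g')
    replica_dn="cn=replica,cn=${escaped},cn=mapping tree,cn=config"
    # One delete record per agreement, each followed by a blank line.
    for agmt in "$@"; do
        printf 'dn: cn=%s,%s\nchangetype: delete\n\n' "$agmt" "$replica_dn"
    done
    # The agreements are children of the replica entry, so the
    # parent is deleted last.
    printf 'dn: %s\nchangetype: delete\n' "$replica_dn"
}
```

On ldap1 this might be used as `gen_remove_ldif o=netscaperoot 2eLDAPmmr 2eLDAPror | ldapmodify -x -H ldaps://ldap1.MyDomain.com -D "cn=directory manager" -w ${DMPASS}`, and likewise on ldap2. On ldap3, which as a pure consumer presumably has no agreements, only the suffix would be passed. It seems prudent to review the generated LDIF and back up `dse.ldif` before applying anything.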

Thank you in advance for your time.

With best regards.
Predrag Zečević
>
>> ldap_initialize( ldaps://ldap1.MyDomain.com:636/??base )
>> ldap_add: Operations error (1)
>> additional info: Could not find replica from dn((null))
>> Failed to add task entry "cn=cleanallruv_2017_6_2_14_31_58,
>> cn=cleanallruv, cn=tasks, cn=config" error (1)
>>
>> LDAP1# /usr/sbin/cleanallruv.pl -v -Z ldap1 -D "cn=directory manager"
>> -w ${DMPASS} -b
>> "cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>> tree,cn=config" -r 11 -P LDAPS
>> ldap_initialize( ldaps://ldap1.MyDomain.com:636/??base )
>> ldap_add: Operations error (1)
>> additional info: Could not find replica from
>> dn(cn=2eLDAPmmr,cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config)
>> Failed to add task entry "cn=cleanallruv_2017_6_2_14_42_1,
>> cn=cleanallruv, cn=tasks, cn=config" error (1)
>>
>> Replica DN specified as '"cn=replica,cn=o\3Dnetscaperoot,cn=mapping
>> tree,cn=config"' also fails... Although THIS was returned as DN from
>> ldapsearch command:
>> $ ldapsearch -xLLLo ldif-wrap=no -H ldaps://ldap1.MyDomain.com -D
>> 'cn=directory manager' -w ${DMPASS} -b o=netscapeRoot
>> '(&(nsuniqueid=ffffffff-ffffffff-ffffffff-ffffffff)(objectclass=nstombstone))'
>>
>> What string do I have to specify as the replica DN?
>>
>> Thanks in advance.
>>
>> With best regards.
>> Predrag Zečević
>>
>>> or read up on:
>>>
>>> http://www.port389.org/docs/389ds/howto/howto-cleanruv.html#cleanallruv
>>>
>>> It also looks like your replicas are not initialized - so I would also
>>> try that after cleaning out the old replica ids(ruvs).
>>>> and replication is still broken... Similar situation is present on
>>>> ldap2, with RUV 21 (if not worse):
>>>>
>>>> dn: cn=replica,cn=o\3Dnetscaperoot,cn=mapping tree,cn=config
>>>> objectClass: nsDS5Replica
>>>> objectClass: top
>>>> nsDS5ReplicaRoot: o=netscaperoot
>>>> nsDS5ReplicaType: 3
>>>> nsDS5Flags: 1
>>>> nsDS5ReplicaId: 21
>>>> nsds5ReplicaPurgeDelay: 604800
>>>> nsDS5ReplicaBindDN: cn=replication manager,cn=config
>>>> nsDS5ReplicaReferral: ldap://ldap1.MyDomain.com:636/o%3dnetscaperoot
>>>> cn: replica
>>>> nsState:: FQAAAAAAAADeiyVZAAAAAAAAAAAAAAAAAQAAAAAAAAABAAAAAAAAAA==
>>>> nsDS5ReplicaName: cb016902-1dd111b2-821cbcea-f7780000
>>>> nsds50ruv: {replicageneration} 4dcb9f790000000b0000
>>>> nsds50ruv: {replica 21 ldap://ldap2.MyDomain.com:0}
>>>> 4dda4a3a000000150000 4fd5f742000300150000
>>>> nsds5agmtmaxcsn:
>>>> o=netscaperoot;2eLDAPmmr;ldap1.MyDomain.com;636;unavailable
>>>> nsds5agmtmaxcsn:
>>>> o=netscaperoot;2eLDAPror;ldap3.MyDomain.com;636;unavailable
>>>> nsruvReplicaLastModified: {replica 21 ldap://ldap2.MyDomain.com:0}
>>>> 00000000
>>>> nsds5ReplicaChangeCount: 1
>>>> nsds5replicareapactive: 0
>>>>
>>>>
>>>> # What would be the proper way to get out of this situation?
>>>> # Do I have to execute the CleanAllRUV task and start replication from
>>>> scratch, or is there a better way?
>>>>
>>>> BTW, loglevel is set to 8192, so from ldap1 logs:
>>>> $ sudo grep cleanruv_task: /var/log/dirsrv/slapd-ldap?/errors
>>>> [31/May/2017:09:11:39 +0200] NSMMReplicationPlugin - cleanruv_task:
>>>> cleaning rid (11)...
>>>>
>>>> we can see that the task was "started" but never finished.
>>>>
>>>> Any advice or documentation that is more up to date than:
>>>> *
>>>> http://directory.fedoraproject.org/docs/389ds/howto/howto-cleanruv.html#cleanruv
>>>>
>>>> *
>>>> https://access.redhat.com/documentation/en-us/red_hat_directory_server/9.0/html/administration_guide/managing_replication-solving_common_replication_conflicts
>>>>
>>>> *
>>>> http://directory.fedoraproject.org/docs/389ds/FAQ/troubleshoot-cleanallruv.html
>>>>
>>>> (troubleshooting for CleanRUV is missing from the FAQ entirely)
>>>>
>>>> is welcome.
>>>>
>>>> With best regards.
>>>> Predrag Zečević
>>>
>>
>

--
Predrag Zečević
Technical Support Analyst
2e Systems GmbH

Telephone: +49 6196 9505 815, Facsimile: +49 6196 9505 894
Mobile: +49 174 3109 288, Skype: predrag.zecevic
E-mail: predrag.zecevic@2e-systems.com

Headquarter: 2e Systems GmbH, Königsteiner Str. 87,
65812 Bad Soden am Taunus, Germany
Company registration: Amtsgericht Königstein (Germany), HRB 7303
Managing director: Phil Douglas

http://www.2e-systems.com/ - Making your business fly!
_______________________________________________
389-users mailing list -- 389-users@lists.fedoraproject.org
To unsubscribe send an email to 389-users-leave@lists.fedoraproject.org
