Tuesday, October 19, 2021

[389-users] Re: changelog program - _cl5AddThread - Invalid changelog state - 2

On 19-10-2021 15:58, Kees Bakker wrote:
On 19-10-2021 14:13, Mark Reynolds wrote:
On 10/19/21 5:35 AM, Kees Bakker wrote:
On 18-10-2021 20:18, Mark Reynolds wrote:
On 10/18/21 1:52 PM, Kees Bakker wrote:
On 18-10-2021 16:30, Mark Reynolds wrote:
On 10/18/21 8:17 AM, Kees Bakker wrote:
Hi,

Today I tried 389-base 1.4.4.17 to pick up a fix for retro changelog trimming [1].

Unfortunately the ns-slapd got into some sort of deadlock, I think.
Anyway, I reverted 389-base back to 1.4.3.23.

Yeah, the replication changelog was moved in 1.4.4, so by downgrading you
most likely corrupted the changelog.  Stop the server, then remove the old
changelog, /var/lib/dirsrv/slapd-INST/db/changelogdb, and the new one,
/var/lib/dirsrv/slapd-INST/db/userroot/replication_changelog.db
Hmm. I don't have these (files?).

/var/lib/dirsrv/slapd-INST/db/changelogdb/  <===  this directory, and
its contents, are the 1.4.3 replication changelog (typically defined in
cn=changelog5,cn=config).
We don't have cn=changelog5,cn=config

Right, you need to create it to fix the issue on 1.4.3, like in the link
I sent you below.  Otherwise you have no changelog and replication can
not work.
Ah, you're right. I really appreciate your feedback.

I looked through all the backups; we never had /var/lib/dirsrv/slapd-INST/db/changelogdb/
However, we did have /var/lib/dirsrv/slapd-INST/cldb/, which is now gone.

With that in mind I constructed an ldapmodify to re-create the cldb database.
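
A minimal sketch of what such an entry could look like, since the exact LDIF used here is not shown in the thread. The attribute names follow the cn=changelog5,cn=config entry described in the Red Hat Directory Server 10 guide linked in this thread; the helper name changelog5_ldif, the optional max-age value, and the use of the cldb path as the changelog directory are assumptions for illustration.

```python
from typing import Optional


def changelog5_ldif(changelog_dir: str, max_age: Optional[str] = None) -> str:
    """Render an LDIF 'add' record for the 1.4.3-style replication changelog.

    changelog_dir is the directory the server should use for the changelog;
    max_age is an optional trimming age such as "7d" (assumed value).
    """
    lines = [
        "dn: cn=changelog5,cn=config",
        "changetype: add",
        "objectClass: top",
        "objectClass: extensibleObject",
        "cn: changelog5",
        f"nsslapd-changelogdir: {changelog_dir}",
    ]
    if max_age:
        lines.append(f"nsslapd-changelogmaxage: {max_age}")
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    # Assumed path: the cldb directory that existed on the backups.
    print(changelog5_ldif("/var/lib/dirsrv/slapd-INST/cldb"), end="")
```

The printed record could then be fed to ldapmodify bound as the Directory Manager. Review the output against your own instance before applying it.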

From what I can see the replication agreements are still present. I now have to make sense of the last few bullets
  • Use one supplier, a data master, as the source for initializing consumers.

Check.

  • Do not reinitialize a data master when the replication agreements are created. For example, do not initialize server1 from server2 if server2 has already been initialized from server1.

Makes sense, if I understand correctly.

  • For a multi-master scenario, initialize all of the other master servers in the configuration from one master.

Check.

  • For cascading replication, initialize all of the hubs from a supplier, then initialize the consumers from the hubs.

Not applicable for us.

Everything is back to normal, it seems. In summary, I did:
* re-create the changelog db
* re-initialize the replica
* restart dirsrv
ipa-healthcheck is happy too.




HTH,
Mark


On the other hand, we do have
cn=changelog,cn=ldbm database,cn=plugins,cn=config
with
nsslapd-directory: /var/lib/dirsrv/slapd-INST/db/changelog


/var/lib/dirsrv/slapd-INST/db/userroot/replication_changelog.db <=== in
1.4.4 we moved the global replication changelog into a database file for
each backend.

If you don't see these, then there is nothing to clean up.


I do have this directory: /var/lib/dirsrv/slapd-INST/db/changelog
Should I remove that whole directory?

No, that is the retro changelog database.  There is no need to remove
it.
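
Because three similarly named locations are in play, a read-only check can help before anything is deleted. The following is only a sketch: the function name classify_changelog_paths is hypothetical, the relative paths are the ones named in this thread, and userroot is simply the backend from the example path, so other backends would need their own entries.

```python
from pathlib import Path

# Relative paths mentioned in the thread, with the role each was given there.
KNOWN_PATHS = {
    "db/changelogdb": "replication changelog (1.4.3 layout)",
    "cldb": "replication changelog (location found on the backups)",
    "db/userroot/replication_changelog.db": "replication changelog (1.4.4, per backend)",
    "db/changelog": "retro changelog (leave in place)",
}


def classify_changelog_paths(instance_dir):
    """Return {relative path: role} for the known paths that exist.

    Nothing is modified; the function only checks for existence.
    """
    root = Path(instance_dir)
    return {rel: role for rel, role in KNOWN_PATHS.items() if (root / rel).exists()}


if __name__ == "__main__":
    # slapd-INST is the placeholder instance name used throughout the thread.
    for rel, role in classify_changelog_paths("/var/lib/dirsrv/slapd-INST").items():
        print(f"{rel}: {role}")
```

An empty result corresponds to the "nothing to clean up" case described above.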

So I suspect the downgrade to 1.4.3 screwed everything up.  It sounds
like you simply need to recreate the replication changelog (if you are
staying on 1.4.3); please follow the instructions at this link:

https://access.redhat.com/documentation/en-us/red_hat_directory_server/10/html/administration_guide/managing_replication-configuring-replication-cmd#Configuring-Replication-Suppliers-cmd

I don't feel comfortable executing the commands in this document. If
there is no "simpler" method, I will probably re-install the
replica. In any event, I want to be very careful before I take the
next step.

Just to be safe, it might be a good idea to restart the server after
adding the replication changelog config entry.

HTH,
Mark



Start the server, and reinit the agreements on this server

That should do it.

Mark


But now I have a replication problem. Could this have been caused by
the update to 1.4.4.17? And if so, how can I fix it?

[18/Oct/2021:12:17:41.750334062 +0200] - ERR - NSMMReplicationPlugin - changelog program - _cl5AddThread - Invalid changelog state - 2
[18/Oct/2021:12:17:41.782505596 +0200] - ERR - NSMMReplicationPlugin - send_updates - agmt="cn=iparep4.example.com-to-rotte.example.com" (rotte:389): Changelog database was in an incorrect state
[18/Oct/2021:12:17:41.827732779 +0200] - ERR - NSMMReplicationPlugin - repl5_inc_run - agmt="cn=iparep4.example.com-to-rotte.example.com" (rotte:389): Incremental update failed and requires administrator action
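
To check whether other replicas log the same condition, the errors file can be scanned for it. This is a rough sketch, assuming each entry sits on a single line in the log file (the wrapping seen in mail is from the mail client) and follows the "[timestamp] - SEVERITY - plugin - message" shape of the entries above. The function names are hypothetical, and the timestamp is kept as a string because it carries nanosecond precision.

```python
import re

# Shape of the entries quoted above: [timestamp] - SEVERITY - plugin - message
ENTRY_RE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] - (?P<severity>[A-Z]+) - "
    r"(?P<plugin>\S+) - (?P<message>.*)$"
)


def parse_entry(line):
    """Split one log line into its fields, or return None if it does not match."""
    match = ENTRY_RE.match(line.strip())
    return match.groupdict() if match else None


def changelog_state_errors(lines):
    """Return the parsed ERR entries that report an invalid changelog state."""
    hits = []
    for line in lines:
        entry = parse_entry(line)
        if (
            entry
            and entry["severity"] == "ERR"
            and "Invalid changelog state" in entry["message"]
        ):
            hits.append(entry)
    return hits


if __name__ == "__main__":
    sample = [
        "[18/Oct/2021:12:17:41.750334062 +0200] - ERR - NSMMReplicationPlugin - "
        "changelog program - _cl5AddThread - Invalid changelog state - 2",
    ]
    for hit in changelog_state_errors(sample):
        print(hit["timestamp"], hit["message"])
```

Passing an open file object works as well, since the function only iterates over lines.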

[1] https://github.com/389ds/389-ds-base/pull/4895

--
Directory Server Development Team




