Tuesday, December 7, 2021

[389-users] Re: Recent commits in stable 389ds branches - discussion

Hi Mark,

Thank you for your detailed reply. I have no objections; I just wanted to share my experience with the recent changes, since some of them were a bit unexpected for me. My comments are below.

On 12/3/21 6:29 AM, Ivanov Andrey (M.) wrote:
I'd like to discuss several commits made over the last couple of months in the stable branches of 389ds. I will be talking about 1.4.4 https://github.com/389ds/389-ds-base/tree/389-ds-base-1.4.4 since it's the one we are using in production, but I think the same applies to 1.4.3. These commits are welcome and go in the right direction, however the changes they produce are not something one expects when the server version changes in the fourth digit (e.g. 1.4.4.17 -> 1.4.4.18). Here they are:
I guess we don't follow the same principles :-)  For the most part these are all minor RFEs except for Rust, but Rust has been in use in our product (1.4.x series) for well over a year now, so I'm surprised to see issues arise about it now.  But adding these RFEs is not out of line IMHO, obviously you feel a little different about that.
Yes, I would think these changes (especially the move of certain files to /dev/shm and the Rust dependency during the server build) should have landed in 1.4.5 (corresponding to RHEL 8.n -> RHEL 8.n+1).

To take my own experience as an example: the move of the DB files to /dev/shm broke the startup of a newly created server (dscreate -f ...), since by default the size of /dev/shm is only 50% of the server's memory. In my case these files were larger than 50% of memory, so I had to make adjustments (either increase the memory of the VM or change the db_home_dir parameter to move those files back to disk). As for Rust: I have never had Rust installed on my build server, and I used my usual ./configure switches that worked for 1.4.4.17. The server compiled OK, but the error log at startup was filled with messages of "ERR" severity.  Anyway, the problem is resolved, and I think that, as you say, we don't have the same perception of how significant a change should be relative to the version number change.
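The /dev/shm sizing problem described above can be checked before an instance is created or moved. The following is only a minimal sketch, not anything shipped with 389ds: the function names, the 10% headroom, and the idea of measuring an existing directory as an estimate are all assumptions made for illustration.

```python
import shutil
from pathlib import Path


def dir_size(path):
    """Total size in bytes of the regular files below `path`."""
    return sum(p.stat().st_size for p in Path(path).rglob("*") if p.is_file())


def fits(required_bytes, free_bytes, headroom=0.10):
    """True if `required_bytes` plus a safety margin fits into `free_bytes`.

    The 10% default headroom is an arbitrary illustrative value.
    """
    return required_bytes * (1 + headroom) <= free_bytes


def fits_in_shm(required_bytes, shm_path="/dev/shm", headroom=0.10):
    """Compare a required size with the space currently free in `shm_path`."""
    free = shutil.disk_usage(shm_path).free
    return fits(required_bytes, free, headroom)


if __name__ == "__main__":
    # Hypothetical estimate: 6 GiB of files that would live in /dev/shm.
    needed = 6 * 1024**3
    if not fits_in_shm(needed):
        print("/dev/shm is too small: enlarge it or set db_home_dir to a disk path")
```

If the check fails, the two remedies are the ones mentioned above: give the machine (and therefore /dev/shm) more memory, or point db_home_dir back to a directory on disk.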


1) Some database files [presumably memory-mapped files that are ok to be lost at reboot] that were previously in /var/lib/dirsrv/slapd-instance/db/ are now moved to /dev/shm/slapd-instance/. This modification seems to work fine (and should increase performance), however there is an error message at server startup when /dev/shm is empty (for example, after each OS reboot) and the server needs to create the files:
[03/Dec/2021:12:12:14.887200364 +0100] - ERR - bdb_version_write - Could not open file "/dev/shm/slapd-model/DBVERSION" for writing Netscape Portable Runtime -5950 (File not found.)
After the next 389ds restart this ERR message does not appear, but it appears after each OS reboot (since /dev/shm is cleaned up after each reboot).

We can look into modifying this behavior, especially since it's not a fatal error.  We can change the logging severity to NOTICE (from ERR) or something like that.

Yes, I think that if it is not critical, the logging severity for this particular message should be lowered to NOTICE. Every message with ERR severity makes me a bit nervous about the health and integrity of the server data, especially at startup.
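For admins who want to review startup messages by severity, the error-log lines quoted in this thread can be split into their fields. This is a rough sketch that assumes every line follows the "[timestamp] - SEVERITY - function - message" layout seen in the excerpts here; the regular expression and function names are illustrative, not part of 389ds.

```python
import re
from collections import Counter

# Layout assumed from the log excerpts quoted in this thread.
LOG_LINE = re.compile(
    r"^\[(?P<timestamp>[^\]]+)\] - (?P<severity>[A-Z]+) - "
    r"(?P<function>\S+) - (?P<message>.*)$"
)


def parse_line(line):
    """Return the fields of one error-log line as a dict, or None if it does not match."""
    match = LOG_LINE.match(line.strip())
    return match.groupdict() if match else None


def count_by_severity(lines):
    """Count parsed lines per severity (ERR, INFO, ...), skipping unparsable ones."""
    records = (parse_line(line) for line in lines)
    return Counter(rec["severity"] for rec in records if rec)


def errors_only(lines):
    """Yield (function, message) for every line logged with ERR severity."""
    for line in lines:
        rec = parse_line(line)
        if rec and rec["severity"] == "ERR":
            yield rec["function"], rec["message"]
```

Run against the error log after a restart, `errors_only` gives a short list to compare with known benign messages such as the DBVERSION one.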


To be honest error log messages should not be expected to be static.  As work is done to the server logging messages are added/removed and/or changed all the time, and that's not going to change.

I agree 100% with that, and as you say, the severity of this ultimately benign case (absence of "/dev/shm/slapd-xxx/DBVERSION") should be lowered to NOTICE in order not to scare the admin :))



Now I know when we added the "wtime" and "optime" to the access logging that did cause some issues for Admins who parse our access logs.  We could have done better with communicating this change (live and learn).  But at the same time this new logging is tremendously useful, and has helped many customers troubleshoot various performance issues.  So while these changes can be disruptive we felt the pros outweighed the cons.

To be honest, I found that change very interesting and useful when it was introduced; it simplifies debugging performance issues and lockups.


2) The server's UNIX socket was moved to /run/slapd-instance.socket, and a new keyword ("ldapi") has appeared in the .inf file for dscreate.
It works fine, but it had an impact on our scripts that use the LDAPI socket path.
In this case using /var/run was outdated and was causing issues with systemd/tmpfiles on RHEL, and moving it to /run was the correct thing to do.  What I don't understand is why adding the option to set the LDAPI path in the INF file is a problem. Can you elaborate on that please?
It was not a problem, just a change that I did not expect to happen in this tag/branch. Once again, it's a matter of perception.
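Scripts that hard-code the socket path can be made tolerant of the move from /var/run to /run. The sketch below is an illustration under assumptions: the helper names are invented here, and the socket file name pattern "slapd-<instance>.socket" is taken from the path quoted above. In an ldapi:// URL the socket path has to be percent-encoded, which is what `quote(..., safe="")` does.

```python
import os
from urllib.parse import quote


def find_socket(instance, run_dirs=("/run", "/var/run"), exists=os.path.exists):
    """Return the first existing socket path for `instance`, or None.

    `run_dirs` lists the new location first and the old one as a fallback.
    """
    for run_dir in run_dirs:
        candidate = f"{run_dir}/slapd-{instance}.socket"
        if exists(candidate):
            return candidate
    return None


def ldapi_url(socket_path):
    """Build an ldapi:// URL; every '/' in the path becomes %2F."""
    return "ldapi://" + quote(socket_path, safe="")


if __name__ == "__main__":
    path = find_socket("model")  # "model" is the instance name seen in the logs above
    print(ldapi_url(path) if path else "no LDAPI socket found")
```

Alternatively, the path can be pinned explicitly with the "ldapi" keyword in the .inf file, so that scripts and server agree regardless of the default.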



See my comment https://github.com/389ds/389-ds-base/issues/5008#issuecomment-983759224. Rust becomes a requirement for building the server, which is fine, but then it should be enabled by default in "./configure". Without it the server does not compile the new plugin and complains about it when starting:
[01/Dec/2021:12:54:04.460194603 +0100] - ERR - symload_report_error - Could not open library "/Local/dirsrv/lib/dirsrv/plugins/libpwdchan-plugin.so" for plugin PBKDF2

Yes I do understand this frustration, and it is now fixed for non-rust builds. 

In our specfile we do enable Rust by default (and have for over a year now), so I guess you don't use our specfile (389-ds-base/rpm/389-ds-base.spec.in) as a reference for building your server.  Also we have discussed moving to Rust on the public devel mailing list for a long time now, so if you are not on this list (389-devel) then I strongly suggest that you, or anyone who builds the server for themselves, subscribe to it.  Again, we probably could have communicated this more "loudly".

I was aware of the existence of Rust, but I have never enabled it explicitly in "./configure", since I thought it was used for some optional "exotic" plugins that we do not use or compile, or for some container integration. And yes, I do not build RPMs or use the spec file, though I check the spec file from time to time (mainly at major version changes) to stay in sync with dev.


-------------------------------------------------------------------------------------------
Just to add to the previous mail: there is another phenomenon, apparently linked to the new plugin. At each start of the server, two error messages about plugins with NULL identities are displayed:
...
[03/Dec/2021:14:41:38.945576751 +0100] - INFO - main - 389-Directory/1.4.4.17 B2021.337.1333 starting up
[03/Dec/2021:14:41:38.951185055 +0100] - ERR - allow_operation - Component identity is NULL
[03/Dec/2021:14:41:38.951846429 +0100] - ERR - allow_operation - Component identity is NULL
[03/Dec/2021:14:41:39.546909815 +0100] - INFO - PBKDF2_SHA256 - Based on CPU performance, chose 2048 rounds



Yes we are aware of this useless error message popping up.  I thought we removed it already, if not we will be doing that very soon.
It's essentially just the ERR severity that I found a bit disconcerting.


...

Thank you and keep up the good work. We have been using 389ds in production since 2007 and we are quite happy with it :)

Great to hear, and it is always nice working with you Andrey.  You never get too upset about most issues :-) 

Anyway as for 1.4.4 we are not planning any other major changes with it.  Any performance/stability improvements will be backported, but that should be it at this stage of 1.4.4 (well there are some nice UI changes coming). 

Yes, I really appreciate the new UI LDAP editor. Even though I do not use it, the absence of this editor was a bit of a drawback compared to the previous Java-based console.

Thanks again for your comments,

Andrey
