Fedora Info
Thursday, October 17, 2024
[Test-Announce] REMINDER: F41 Go/No-Go in One Week, and other dates
Please be advised of a few dates for the upcoming F41 release, and F42
development.
The Fedora Linux 41 Final Go/No-Go[1] is now happening on Thursday
24th October. You can find details in fedocal[2], and our schedule[3]
has been updated slightly to reflect new target dates. F41 Final is
still targeting a release date of Tuesday 29th October; however, if
the release candidate is deemed unsuitable, our next release target
date is Tuesday 12th November. We are currently in final freeze, which
means you are unable to land any major changes at this time, and
updates not fixing a release blocker bug will be held for the updates
repository rather than pushed into the release. Please refer to our
updates policy: final freeze section[4] for more details.
Fedora Linux 39 will go EOL on 26th November 2024.
For Fedora Linux 42 (the answer to life, the universe and everything),
please take note of some important upcoming dates for proposing
changes:
- Changes needing infra changes: 18th December 2024
- System Wide: 24th December 2024
- Self Contained: 14th January 2025
- F42 Branching AND Changes Testable: 4th February 2025
For other dates, please check the full F42 schedule[5].
[1] https://fedoraproject.org/wiki/Go_No_Go_Meeting
[2] https://calendar.fedoraproject.org/meeting/10917
[3] https://fedorapeople.org/groups/schedule/f-41/f-41-key-tasks.html
[4] https://docs.fedoraproject.org/en-US/fesco/Updates_Policy/#final-freeze
[5] https://fedorapeople.org/groups/schedule/f-42/f-42-all-tasks.html
--
Aoife Moloney
Fedora Operations Architect
Fedora Project
Matrix: @amoloney:fedora.im
IRC: amoloney
--
_______________________________________________
test-announce mailing list -- test-announce@lists.fedoraproject.org
To unsubscribe send an email to test-announce-leave@lists.fedoraproject.org
Fedora Code of Conduct: https://docs.fedoraproject.org/en-US/project/code-of-conduct/
List Guidelines: https://fedoraproject.org/wiki/Mailing_list_guidelines
List Archives: https://lists.fedoraproject.org/archives/list/test-announce@lists.fedoraproject.org
Do not reply to spam, report it: https://pagure.io/fedora-infrastructure/new_issue
Wednesday, October 16, 2024
[389-users] Re: Inconsistent Ldap connection issues
Yes, I can during the next round of testing. I'll see if I can spot anything obvious in Wireshark. I'll look for mis-colored connections, right? (I have not looked for missing SYN-ACKs before, and wanted to check.)
-Gary
On 10/16/24 2:26 AM, William Brown via 389-users wrote:
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
Correct, those are the 2 different "contacting ldap" error issues. I have searched for various things in the logs, but I haven't read them line by line. I don't see "err=1", no fd errors, or "Not listening for new connections - too many fds open".
So, that means the error is happening *before* 389-ds gets a chance to accept on the connection.
Are there any routers, middlewares, firewalls, idp's etc between the client/ldap server? Load balancer?
We encountered a similar issue recently with another load test, where the load tester wasn't averaging its connections; it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading its connections out, or is it bursting?
It's a ramp-up of 500 users logging in and starting their searches; the initial ramp-up is 60 seconds, but the searches and login/logouts run over 6 minutes. I just sliced up the logs to see what that first minute was like:
Peak Concurrent Connections: 689
Total Operations: 18770
Total Results: 18769
Overall Performance: 100.0%
Total Connections: 2603 (21.66/sec) (1299.40/min)
- LDAP Connections: 2603 (21.66/sec) (1299.40/min)
- LDAPI Connections: 0 (0.00/sec) (0.00/min)
- LDAPS Connections: 0 (0.00/sec) (0.00/min)
- StartTLS Extended Ops: 2571 (21.39/sec) (1283.42/min)
Searches: 13596 (113.12/sec) (6787.01/min)
Modifications: 0 (0.00/sec) (0.00/min)
Adds: 0 (0.00/sec) (0.00/min)
Deletes: 0 (0.00/sec) (0.00/min)
Mod RDNs: 0 (0.00/sec) (0.00/min)
Compares: 0 (0.00/sec) (0.00/min)
Binds: 2603 (21.66/sec) (1299.40/min)
With the settings below, the test results are in; they still get 1 ldap error per test.
Any chance that you can get a tcpdump over the 6 minutes and try to find the SYN without ACK around the time of the failure?
We still don't know what the cause *is*, so just tweaking values won't help. We need to know what layer is triggering the error before we make changes.
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
Suggestions? Should I bump these up more?
Reading these numbers, this doesn't look like the server should be under any stress at all - I have tested with 2cpu / 4gb ram and can easily get 10,000 simultaneous connections launched and accepted by 389-ds.
My thinking at this point is there is something in between the client and 389 that is not coping.
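That SYN-without-SYN-ACK hunt can be sketched roughly in Python, assuming the capture has already been exported to per-packet tuples (for example with tshark's field output); the addresses and ports below are made-up illustrations, not data from this thread:

```python
# Find client SYNs that never got a matching SYN-ACK back.
# Each packet is (timestamp, src, dst, sport, dport, flags), where flags is
# "S" for a bare SYN and "SA" for a SYN-ACK (a simplification of real flags).

def unanswered_syns(packets):
    """Return [(4-tuple, syn_timestamp)] for SYNs with no SYN-ACK reply."""
    pending = {}  # (client, server, cport, sport) -> time of first SYN
    for ts, src, dst, sport, dport, flags in packets:
        if flags == "S":                         # client opens a connection
            pending.setdefault((src, dst, sport, dport), ts)
        elif flags == "SA":                      # server replies: match the
            pending.pop((dst, src, dport, sport), None)  # reversed 4-tuple
    return sorted(pending.items(), key=lambda kv: kv[1])

packets = [
    (1.0, "10.0.0.5", "10.0.0.9", 50001, 389, "S"),
    (1.1, "10.0.0.9", "10.0.0.5", 389, 50001, "SA"),  # answered
    (2.0, "10.0.0.5", "10.0.0.9", 50002, 389, "S"),   # never answered
]
print(unanswered_syns(packets))
# -> [(('10.0.0.5', '10.0.0.9', 50002, 389), 2.0)]
```

The timestamps of any unmatched entries can then be lined up against the moments the client reported "can not contact ldap server".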
--
Sincerely,
William Brown
Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia
[389-users] Re: Inconsistent Ldap connection issues
Are there any routers, middlewares, firewalls, idp's etc between the client/ldap server? Load balancer?
When this first started happening, the client, a cluster of containers, just spoke to the ldap server directly over a peering connection. Since the error was "unable to connect to ldap", I thought perhaps the one ldap server could not handle it. So I added a load balancer (AWS NLB) and a second ldap server. It didn't help. Since this was happening before the load balancer, I don't think it's that. There is an ALB in front of the cluster.
-Gary
On 10/15/24 17:26, William Brown wrote:
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
Correct, those are the 2 different "contacting ldap" error issues. I have searched for various things in the logs, but I haven't read them line by line. I don't see "err=1", no fd errors, or "Not listening for new connections - too many fds open".
So, that means the error is happening *before* 389-ds gets a chance to accept on the connection.
We still don't know what the cause *is*, so just tweaking values won't help. We need to know what layer is triggering the error before we make changes.
We encountered a similar issue recently with another load test, where the load tester wasn't averaging its connections; it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading its connections out, or is it bursting?
It's a ramp-up of 500 users logging in and starting their searches; the initial ramp-up is 60 seconds, but the searches and login/logouts run over 6 minutes. I just sliced up the logs to see what that first minute was like:
Peak Concurrent Connections: 689
Total Operations: 18770
Total Results: 18769
Overall Performance: 100.0%
Total Connections: 2603 (21.66/sec) (1299.40/min)
- LDAP Connections: 2603 (21.66/sec) (1299.40/min)
- LDAPI Connections: 0 (0.00/sec) (0.00/min)
- LDAPS Connections: 0 (0.00/sec) (0.00/min)
- StartTLS Extended Ops: 2571 (21.39/sec) (1283.42/min)
Searches: 13596 (113.12/sec) (6787.01/min)
Modifications: 0 (0.00/sec) (0.00/min)
Adds: 0 (0.00/sec) (0.00/min)
Deletes: 0 (0.00/sec) (0.00/min)
Mod RDNs: 0 (0.00/sec) (0.00/min)
Compares: 0 (0.00/sec) (0.00/min)
Binds: 2603 (21.66/sec) (1299.40/min)
With the settings below, the test results are in; they still get 1 ldap error per test.
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
Suggestions? Should I bump these up more?
Reading these numbers, this doesn't look like the server should be under any stress at all - I have tested with 2cpu / 4gb ram and can easily get 10,000 simultaneous connections launched and accepted by 389-ds.
My thinking at this point is there is something in between the client and 389 that is not coping.
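William's point that these rates should not stress the server can be backed up with some back-of-envelope queueing arithmetic; a minimal Python sketch, where the burst and accept rates are illustrative assumptions rather than measurements from this test:

```python
# Rough check: can a connection burst overflow a TCP listen backlog?
# queued = (arrival rate - accept rate) * burst length; overflow occurs
# only if queued exceeds the backlog (somaxconn / listen-backlog-size).

def backlog_headroom(conn_per_sec, accept_per_sec, burst_seconds, backlog):
    """Worst-case queued connections minus backlog (negative = headroom)."""
    queued = max(0.0, (conn_per_sec - accept_per_sec) * burst_seconds)
    return queued - backlog

# The access-log summary shows ~22 connections/sec on average. Even an
# assumed 100x burst (2200/sec) for a full second, drained by a slow
# acceptor at 200/sec, stays far below a backlog of 8192:
print(backlog_headroom(2200, 200, 1.0, 8192))  # -> -6192.0 (ample headroom)
```

Which supports the conclusion that, at these rates, bumping somaxconn and tcp_max_syn_backlog further is unlikely to change anything.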
--
Sincerely,
William Brown
Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia
[Test-Announce] Re: Fedora Linux 41 Final Go/No-Go Meeting on Thursday 17th Oct
Hi all,
The Fedora Linux 41 Final Go/No-Go meeting[1] will be held next Thursday, 17th October @ 1700 UTC in #meeting:fedoraproject.org on Matrix.
At this time, we will determine the status of the F41 Final for the current target date[2] of Tuesday October 22nd. For more information about the Go/No-Go meeting, see the wiki[3].
--
[389-users] Re: Inconsistent Ldap connection issues
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
Correct, those are the 2 different "contacting ldap" error issues. I have searched for various things in the logs, but I haven't read them line by line. I don't see "err=1", no fd errors, or "Not listening for new connections - too many fds open".
So, that means the error is happening *before* 389-ds gets a chance to accept on the connection.
Are there any routers, middlewares, firewalls, idp's etc between the client/ldap server? Load balancer?
We encountered a similar issue recently with another load test, where the load tester wasn't averaging its connections; it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading its connections out, or is it bursting?
It's a ramp-up of 500 users logging in and starting their searches; the initial ramp-up is 60 seconds, but the searches and login/logouts run over 6 minutes. I just sliced up the logs to see what that first minute was like:
Peak Concurrent Connections: 689
Total Operations: 18770
Total Results: 18769
Overall Performance: 100.0%
Total Connections: 2603 (21.66/sec) (1299.40/min)
- LDAP Connections: 2603 (21.66/sec) (1299.40/min)
- LDAPI Connections: 0 (0.00/sec) (0.00/min)
- LDAPS Connections: 0 (0.00/sec) (0.00/min)
- StartTLS Extended Ops: 2571 (21.39/sec) (1283.42/min)
Searches: 13596 (113.12/sec) (6787.01/min)
Modifications: 0 (0.00/sec) (0.00/min)
Adds: 0 (0.00/sec) (0.00/min)
Deletes: 0 (0.00/sec) (0.00/min)
Mod RDNs: 0 (0.00/sec) (0.00/min)
Compares: 0 (0.00/sec) (0.00/min)
Binds: 2603 (21.66/sec) (1299.40/min)
With the settings below, the test results are in; they still get 1 ldap error per test.
Any chance that you can get a tcpdump over the 6 minutes and try to find the SYN without ACK around the time of the failure?
We still don't know what the cause *is*, so just tweaking values won't help. We need to know what layer is triggering the error before we make changes.
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
Suggestions? Should I bump these up more?
Reading these numbers, this doesn't look like the server should be under any stress at all - I have tested with 2cpu / 4gb ram and can easily get 10,000 simultaneous connections launched and accepted by 389-ds.
My thinking at this point is there is something in between the client and 389 that is not coping.
--
Sincerely,
William Brown
Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia
Tuesday, October 15, 2024
[389-users] Re: Inconsistent Ldap connection issues
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
Correct, those are the 2 different "contacting ldap" error issues. I have searched for various things in the logs, but I haven't read them line by line. I don't see "err=1", no fd errors, or "Not listening for new connections - too many fds open".
We still don't know what the cause *is*, so just tweaking values won't help. We need to know what layer is triggering the error before we make changes.
We encountered a similar issue recently with another load test, where the load tester wasn't averaging its connections; it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading its connections out, or is it bursting?
It's a ramp-up of 500 users logging in and starting their searches; the initial ramp-up is 60 seconds, but the searches and login/logouts run over 6 minutes. I just sliced up the logs to see what that first minute was like:
Peak Concurrent Connections: 689
Total Operations: 18770
Total Results: 18769
Overall Performance: 100.0%
Total Connections: 2603 (21.66/sec) (1299.40/min)
- LDAP Connections: 2603 (21.66/sec) (1299.40/min)
- LDAPI Connections: 0 (0.00/sec) (0.00/min)
- LDAPS Connections: 0 (0.00/sec) (0.00/min)
- StartTLS Extended Ops: 2571 (21.39/sec) (1283.42/min)
Searches: 13596 (113.12/sec) (6787.01/min)
Modifications: 0 (0.00/sec) (0.00/min)
Adds: 0 (0.00/sec) (0.00/min)
Deletes: 0 (0.00/sec) (0.00/min)
Mod RDNs: 0 (0.00/sec) (0.00/min)
Compares: 0 (0.00/sec) (0.00/min)
Binds: 2603 (21.66/sec) (1299.40/min)
With the settings below, the test results are in; they still get 1 ldap error per test.
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
Suggestions? Should I bump these up more?
Sincerely,
William Brown
Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia
[389-users] Re: Inconsistent Ldap connection issues
Hi William,
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
Correct, those are the 2 different "contacting ldap" error issues. I have searched for various things in the logs, but I haven't read them line by line. I don't see "err=1", no fd errors, or "Not listening for new connections - too many fds open".
We encountered a similar issue recently with another load test, where the load tester wasn't averaging its connections; it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading its connections out, or is it bursting?
It's a ramp-up of 500 users logging in and starting their searches; the initial ramp-up is 60 seconds, but the searches and login/logouts run over 6 minutes. I just sliced up the logs to see what that first minute was like:
Peak Concurrent Connections: 689
Total Operations: 18770
Total Results: 18769
Overall Performance: 100.0%
Total Connections: 2603 (21.66/sec) (1299.40/min)
- LDAP Connections: 2603 (21.66/sec) (1299.40/min)
- LDAPI Connections: 0 (0.00/sec) (0.00/min)
- LDAPS Connections: 0 (0.00/sec) (0.00/min)
- StartTLS Extended Ops: 2571 (21.39/sec) (1283.42/min)
Searches: 13596 (113.12/sec) (6787.01/min)
Modifications: 0 (0.00/sec) (0.00/min)
Adds: 0 (0.00/sec) (0.00/min)
Deletes: 0 (0.00/sec) (0.00/min)
Mod RDNs: 0 (0.00/sec) (0.00/min)
Compares: 0 (0.00/sec) (0.00/min)
Binds: 2603 (21.66/sec) (1299.40/min)
With the settings below, the test results are in; they still get 1 ldap error per test.
net.ipv4.tcp_max_syn_backlog = 8192
net.core.somaxconn = 8192
Suggestions? Should I bump these up more?
Thanks,
Gary
Ah yes, of course. Here is 1 run of their web app load test; it is 6 minutes long, and it should mostly be only the test itself. I will start looking for
We encountered 2 "Can not contact ldap server" errors during this run.
2 "can't contact ldap server" errors in this run below.
These errors are only shown on the client, yes? Is there any evidence of a failed connection in the access log?
After the run I bumped up these from 4096,
net.ipv4.tcp_max_syn_backlog = 6144
net.core.somaxconn = 6144
Yet we still get the ldap errors (this one and the StartTLS request error previously mentioned).
Should I bump up the nsslapd-listen-backlog-size, net.ipv4.tcp_max_syn_backlog, or net.core.somaxconn more?
We encountered a similar issue recently with another load test, where the load tester wasn't averaging it's connections, it would launch 10,000 connections at once and hope they all worked. With your load test, is it actually spreading it's connections out, or is it bursting?
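One hedged way to answer the "spreading out vs bursting" question from the load tester's connection timestamps is to compare the busiest one-second window against the average per-second rate; the timestamps below are synthetic examples, not data from this test:

```python
# Classify a connection pattern: ratio near 1 means evenly spread,
# a large ratio means most connections land in one burst window.

from collections import Counter

def burst_ratio(conn_times, window_seconds):
    """Peak 1-second connection count divided by the mean per-second rate."""
    per_second = Counter(int(t) for t in conn_times)     # bucket by second
    mean_rate = len(conn_times) / window_seconds
    return max(per_second.values()) / mean_rate

spread = [float(i) for i in range(9)]        # one connection per second
burst = [1.0 + 0.01 * i for i in range(9)]   # all nine in one second
print(burst_ratio(spread, 9), burst_ratio(burst, 9))  # -> 1.0 9.0
```

Run over the real test's timestamps, a high ratio would point at exactly the kind of bursting load tester William describes.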
--
Sincerely,
William Brown
Senior Software Engineer,
Identity and Access Management
SUSE Labs, Australia