Fedora Info: [389-users] Re: CPU Scalability / Scaling

On Fri, Aug 14, 2020 at 6:19 PM Marc Sauton <msauton@redhat.com> wrote:

On Fri, Aug 14, 2020 at 1:31 PM Ben Spencer <isatworktoday@gmail.com> wrote:

On Fri, Aug 14, 2020, 10:53 AM David Boreham <david@bozemanpass.com> wrote:

On 8/14/2020 9:04 AM, Ben Spencer wrote:
> After a little investigation I didn't find any recent information on
> how well / linearly 389 scales from a CPU perspective. I also realize
> this is a more complicated topic with many factors which actually play
> into it.
>
> Throwing the basic question out there: Does 389 scale fairly
> linearly as the number of CPUs are increased? Is there a point where
> it drops off?

Cached reads (cached anywhere : filesystem cache, db page pool, entry
cache) should scale quite well, at least to 4/6/8 CPU. I'm not sure
about today's 8+ CPU systems but would assume probably not great scaling
beyond 8 until proven otherwise.

Interesting, since we are currently sitting at 10 CPUs per server. Things grew organically over time without much thought given.

Having more CPUs may simply displace problems, for example toward more thread contention and new performance problems.

That is one of the concerns we have.

Related to CPU and configurations, the "autotuning" for nsslapd-threadnumber is recommended.
http://www.port389.org/docs/389ds/design/autotuning.html
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/performance_tuning_guide/ds-threads
(an "excessive" manual thread setting will be counterproductive)

We use autotuning, but if the information is to be believed, 389 supports 512 threads. I could read this to mean that performance drops off at 512 threads, but it is not clear whether performance is linear up to that point. Do 512 threads handle 16x more work than 32 threads without an execution-time drop-off or an increase in queuing?

There is another aspect, on the LDAP client side:
High CPU use is often the result of "poorly" designed applications that hammer an LDAP server with a constant flow of complex search filters with pattern matching.
And very often all that long server-side CPU processing is useless.
Analysing the LDAP server access log can help tune or change the filters those applications are sending, which can have a high impact on the server side.
Often only the global settings are kept, and a server-side configuration that can really help optimize CPU and I/O is overlooked: the "fine grained" ID list scan limit.
http://www.port389.org/docs/389ds/design/fine-grained-id-list-size.html
https://access.redhat.com/documentation/en-us/red_hat_directory_server/11/html/configuration_command_and_file_reference/database_plug_in_attributes#nsIndexIDListScanLimit
It can also reduce or optimize the system resources that index use consumes.
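
As a rough illustration of the access log analysis mentioned above, here is a minimal Python sketch that pairs each SRCH line with its RESULT line and counts, per filter, how many searches were slow or flagged as unindexed. The line shapes, the sample lines, the `notes=U` / `notes=A` interpretation and the 1-second threshold are assumptions modeled on typical 389 access logs, so check them against your own logs before trusting the output.

```python
import re
from collections import defaultdict

# Assumed line shapes; adjust the patterns to match your server version.
SRCH = re.compile(r'conn=(\d+) op=(\d+) SRCH base="[^"]*" scope=\d+ filter="(.*?)" attrs=')
RESULT = re.compile(r'conn=(\d+) op=(\d+) RESULT err=\d+.*?etime=([\d.]+)')
NOTES = re.compile(r'notes=(\S+)')


def summarize(lines, slow_etime=1.0):
    """Group search results by filter and count slow / unindexed searches."""
    pending = {}  # (conn, op) -> filter text of a SRCH still waiting for its RESULT
    report = defaultdict(lambda: {"count": 0, "slow": 0, "unindexed": 0})
    for line in lines:
        m = SRCH.search(line)
        if m:
            pending[(m.group(1), m.group(2))] = m.group(3)
            continue
        m = RESULT.search(line)
        if not m:
            continue
        flt = pending.pop((m.group(1), m.group(2)), None)
        if flt is None:
            continue  # RESULT of a non-search operation, or SRCH not seen
        entry = report[flt]
        entry["count"] += 1
        if float(m.group(3)) >= slow_etime:
            entry["slow"] += 1
        notes = NOTES.search(line)
        # Assumption: notes containing U or A mark (partially) unindexed searches.
        if notes and {"U", "A"} & set(notes.group(1).split(",")):
            entry["unindexed"] += 1
    return dict(report)


# Made-up sample lines, for illustration only.
SAMPLE = [
    '[14/Aug/2020:10:00:00.000000000 -0500] conn=7 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(cn=*smith*)" attrs=ALL',
    '[14/Aug/2020:10:00:02.500000000 -0500] conn=7 op=1 RESULT err=0 tag=101 nentries=120 etime=2.500000000 notes=U',
    '[14/Aug/2020:10:00:03.000000000 -0500] conn=8 op=1 SRCH base="dc=example,dc=com" scope=2 filter="(uid=bjensen)" attrs="cn mail"',
    '[14/Aug/2020:10:00:03.001000000 -0500] conn=8 op=1 RESULT err=0 tag=101 nentries=1 etime=0.001000000',
]

if __name__ == "__main__":
    for flt, stats in summarize(SAMPLE).items():
        print(flt, stats)
```

Filters that show up with a high slow or unindexed count are the ones worth raising with the application owners or considering for a fine-grained ID list scan limit.
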

Unfortunately we've gone through this exercise previously with no actionable items.

So there should be some investigation before adding more system resources: logs, pstacks or some gdb stack traces, and index configurations.

Writes are going to be heavily serialized, assume no CPU scaling. Fast
I/O is what you need for write throughput.
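
To make the effect of a serialized write path concrete, here is a small sketch using Amdahl's law, which is a generic model and not something measured on 389. The parallel fraction of 0.9 is a made-up illustrative value; the real figure depends on your read/write mix and would have to be measured.

```python
def amdahl_speedup(parallel_fraction, cpus):
    """Ideal speedup on `cpus` CPUs when only `parallel_fraction` of the work scales."""
    serial_fraction = 1.0 - parallel_fraction
    return 1.0 / (serial_fraction + parallel_fraction / cpus)


if __name__ == "__main__":
    # Hypothetical workload: 90% scalable cached reads, 10% serialized writes.
    for cpus in (4, 8, 10, 16, 32):
        print(cpus, round(amdahl_speedup(0.9, cpus), 2))
```

Under this model the speedup can never exceed 1 / (serial fraction), no matter how many CPUs are added, which is one way to reason about why a steady write load caps the benefit of a bigger server.
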
> Where am I going with this?
> We are faced with either adding more CPUs to the existing servers or
> adding more instances or a combination of the two. The current servers
> have 10 CPU with the entire database fitting in RAM but, there is a
> regular flow of writes. Sometimes somewhat heavy thanks to batch
> updates. Gut feeling tells me to have more servers than a few huge
> servers largely because of the writes/updates and lock contention.
> Needing to balance the server sprawl as well.

I'd look at whether I/O throughput (Write IOPS particularly) can be
upgraded as a first step. Then perhaps look at system design to see if
the batch updates can be throttled/trickled to reduce the cross-traffic
interference. Usually the write load is the limiting factor for scaling
because it has to be replayed on every server regardless of its read
workload.

Something to consider. Hard to resolve in the environment where the servers are.
Large bulk updates always degrade LDAP server performance.
Adding more replicas will create more contention for replication sessions.
LDAP replication can be tuned to accommodate competition between replication sessions, but there is no dynamic tuning that adapts to traffic pattern changes or large bursts of modifications, so throttling scheduled updates seems a good and easier approach.
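
As a sketch of what throttling a scheduled batch could look like on the client side, the following Python function paces updates to a maximum rate. The `apply_update` callback is hypothetical and stands in for whatever actually sends the LDAP modify; the rate of 50 per second is an arbitrary example, not a recommendation.

```python
import time


def trickle(updates, apply_update, max_per_second, sleep=time.sleep, clock=time.monotonic):
    """Apply updates one at a time, never faster than max_per_second.

    `apply_update` is a hypothetical callback that performs one write.
    `sleep` and `clock` are injectable so the pacing can be tested without waiting.
    """
    interval = 1.0 / max_per_second
    next_slot = clock()
    applied = 0
    for update in updates:
        now = clock()
        if now < next_slot:
            sleep(next_slot - now)  # wait until the next allowed send time
        apply_update(update)
        applied += 1
        # Schedule the next slot relative to when this update actually went out.
        next_slot = max(next_slot, clock()) + interval
    return applied


if __name__ == "__main__":
    # Stand-in for real LDAP modifies: just print each item.
    trickle(range(5), print, max_per_second=50)
```

Spreading the same batch over a longer window keeps the write burst, and the replication traffic it generates on every replica, from competing as hard with the regular read load.
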

Since we are weighing adding CPUs against adding more replicas, those tuning options may be useful. A couple of questions around this:

1) Is the contention on inbound only, inbound and outbound, or the entire replication domain?

2) What are some of the tuning knobs?

Sunday, August 16, 2020
