Thursday, June 30, 2016

[389-devel] Re: Replication after full online init

On Thu, 2016-06-30 at 14:53 -0700, Noriko Hosoi wrote:
> On 06/30/2016 12:45 AM, Ludwig Krispenz wrote:
> > Hi William,
> >
> > the reason that after a total init the consumer does not have the
> > latest state of the supplier RUV and is receiving updates based on the
> > RUV at start of the total init is independent of the modrdn problem.
> > When a supplier is performing a total init it is still accepting
> > changes, the total init can take a while and there are scenarios where
> > an entry which is already sent is updated before total init finishes.
> > We cannot lose these changes.
> OK... Then, RUV needs to be created at the time when the supplier
> starts online init?
>
> The test case would be something like this?
> 1. run online init on the supplier.
> 2. do some operation like move entries against the supplier while the
> online init is still running on the consumer.
> 3. do some operation which depends upon the previous operation done in
> the step 2.
> 4. check the consumer is healthy or not.
>
> Isn't it a timestamp issue: from which operation should changes be
> replayed after the total update? Regardless of how 48755 is fixed,
> unless the step 2 operation(s) are replayed after the online init is
> done, the consumer could end up broken/inconsistent?
>

It's not the "post init" operations I'm worried about.

It's that operations that were already part of the init sent to the
consumer are then replayed from the changelog.

Operations that occur after the init starts definitely still need to
be replayed, and this makes sense.

Let's say we have:

1 - insert A
2 - insert ou=B
3 - modrdn A under ou=B
4 - insert C
xxxxxx <<-- We start to transmit the data here.
5 - modrdn C


Once the online init is complete, the master replays the log from event
1 -> 5 to the consumer, even though it should now be up to date at
position 4.

Previously we could not guarantee this because in the scenario above, A
would have sorted before ou=B, but could not be applied because the
consumer hadn't seen B yet. So after the init, the consumer would have
B and C, but not A, so we had to replay 1 -> 4 to fix this up.
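
The ordering problem described above can be sketched with a toy model.
This is only an illustration, not 389-ds code: DNs are plain strings,
the consumer is a Python set, the dc=example,dc=com suffix and the
helper names parent_dn and total_init are made up, and it assumes an
entry is simply skipped when its parent has not arrived yet.

```python
# Toy model only: DNs are plain strings and the "consumer" is a set.
def parent_dn(dn):
    """Return the parent DN, or None for a single-component DN."""
    head, sep, tail = dn.partition(",")
    return tail if sep else None


def total_init(entries_in_send_order, suffix):
    """Send entries in order; skip any entry whose parent is not present yet."""
    consumer = {suffix}
    skipped = []
    for dn in entries_in_send_order:
        if parent_dn(dn) in consumer:
            consumer.add(dn)
        else:
            skipped.append(dn)
    return consumer, skipped


SUFFIX = "dc=example,dc=com"   # hypothetical suffix
B = "ou=B," + SUFFIX
A = "cn=A," + B                # A now lives under ou=B after the modrdn
C = "cn=C," + SUFFIX

# A was created first, so it is sent before its new parent ou=B.
consumer, skipped = total_init([A, B, C], SUFFIX)

# Sending the parent first lets every entry be applied.
consumer_ok, skipped_ok = total_init([B, A, C], SUFFIX)
```

In the first run the consumer ends up with B and C but not A, which is
the state that replaying 1 -> 4 used to repair. In the second run
nothing is skipped.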

So I am suggesting that when we begin the online init, we set the RUV
of the consumer to match the CSN of the master at the moment we begin
the transmission of data, so that we only need to replay event 5+,
rather than 1 -> 5+.
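
As a rough sketch of the difference, again a toy model rather than the
real implementation: CSNs are reduced to the small integers used in the
example above, the RUV is reduced to a single "max CSN seen" value for
one supplier, and changes_to_replay is a hypothetical helper. Real CSNs
and RUVs carry more structure than this.

```python
# Toy model: CSNs are small integers and there is a single supplier.
CHANGELOG = [
    (1, "insert A"),
    (2, "insert ou=B"),
    (3, "modrdn A under ou=B"),
    (4, "insert C"),
    # total init starts transmitting data here
    (5, "modrdn C"),
]


def changes_to_replay(changelog, consumer_max_csn):
    """Return the changes whose CSN is newer than the consumer's recorded CSN."""
    return [(csn, op) for csn, op in changelog if csn > consumer_max_csn]


# Behaviour described above: the consumer's RUV does not cover events 1..4,
# so everything in the changelog is replayed after the init.
replay_all = changes_to_replay(CHANGELOG, consumer_max_csn=0)

# Suggested behaviour: the consumer's RUV is set to the supplier's CSN at the
# moment transmission begins (4 in this example), so only event 5 is replayed.
replay_new = changes_to_replay(CHANGELOG, consumer_max_csn=4)
```

With the consumer's value at 4, only the modrdn of C is selected for
replay. Any change made while the init is running has a later CSN and
is still picked up.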

Does that make sense?


--
Sincerely,

William Brown
Software Engineer
Red Hat, Brisbane
