based refactors: Improve init/cleanup and split Corosync-related functions into new file#4135
Open
nrwahl2 wants to merge 28 commits into
Open
based refactors: Improve init/cleanup and split Corosync-related functions into new file#4135nrwahl2 wants to merge 28 commits into
nrwahl2 wants to merge 28 commits into
Conversation
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The NULL checks will get de-duplicated in an upcoming commit. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Also move the ipcs_{ro,rw,shm} extern declarations from
based_callbacks.h to based_ipc.h, now that there is a more appropriate
home for them. And make ipc_{ro,rw}_callbacks static to based_ipc.c,
since we no longer use them in pacemaker-based.c.
The goal is to look more like attrd, the fencer, and the scheduler.
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The goal is to look more like attrd, the fencer, and the scheduler. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This seems like a more correct location for it. based_terminate() is where we free data structures. This also makes the calls to pcmk__stop_based_ipc() redundant, and allows making the qb_ipcs_service_t declarations static. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Also make -1 the "uninitialized" value. 0 is a valid file descriptor, even though it's much more likely to be used for stdin. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To look more like attrd and fenced. More change are coming. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It doesn't make sense to have this handful of unrelated initializations in a separate function, when most of main() is for initializing things. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Place the call right before the first thing that needs it (based_activate_cib()). The rest of the initialization helpers are called in this region, so let's collect them. We can't make them all contiguous though -- based_remote_init() requires that the CIB be read and activated first. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and fenced. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
pcmk__corosync_connect() doesn't use the caches directly. pcmk__get_node() initializes the caches when it gets called. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Using include-what-you-use. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
It shouldn't matter whether this is a Corosync cluster or not. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and fenced. Notes: * Previously, we weren't freeing the cluster object on exit in based_terminate(). Now we are. * pcmk_cluster_free() calls pcmk__cluster_destroy_node_caches(), which is why we drop the call to that function. * I'm fairly certain that the reason the pcmk_cluster_disconnect() call previously occurred before the done section, is that prior to a recent commit, we weren't NULL-checking the cluster argument before disconnect. We should be able to call based_cluster_disconnect() regardless of how based_terminate() wherever we want to free the cluster object. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And NULL-check the attrd_cluster variable. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
pcmk_cluster_disconnect() now returns EINVAL if passed a NULL argument. Previously, if given a NULL argument in a Corosync cluster, pcmk_cluster_disconnect() would call down to pcmk__cpg_disconnect(), which would dereference the NULL pointer. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To mirror attrd and the fencer. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
The CIB manager started sending a shutdown request with commit 1f61358 (2006). There is not much detail about the issue this aimed to fix. However, the commit message says the following: """ Fix for a timing issue where a remote CCM registered the DC exited before the last CIB update(s) from the DC were applied. ... As long as one node has the update, the join process will make sure it ends up on the new DC. If for some reason *none* of our peers reply, we still exit in at most N seconds and we're no worse off than we were before. """ It very much sounds like the problem was tied to legacy mode and however we handled CIB updates back then. As of Pacemaker 1.1.12, all modifying operations get forwarded and processed on all nodes. Commit 15c4d2a: """ Feature: cib: Send all r/w operations via the cluster connection and have all nodes process them The idea of a master cib instance goes away, and all instances apply all raw (and now ordered thanks to corosync) updates (not diffs) This ensures updates are never lost, even if there is no DC """ That was the replacement for legacy mode behavior. Given that modifying ops get forwarded to all nodes, I don't see how "...a remote CCM registered the DC exited before the last CIB update(s) from the DC were applied" would present a problem. If the DC's CIB manager receives a modifying request, it will forward it via the cluster layer to all nodes for processing. A request could always be dropped if a node abruptly leaves the cluster before it's able to forward a request that a client sent, but there's nothing we can do about that. At least the CIB would remain in a consistent state. Note that this also makes the peer status callback pointless. So we drop it, and we also drop the associated "-1" logic from based_terminate(). Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And make it static. It seems logical to have it in the same file as the main() function. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
This gets rid of the last use of based_cluster outside of based_corosync.c. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
And rename to cluster. We don't typically care much about making variable names distinct, although we try to make function names distinct. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To replace cib_shutdown_flag. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To replace global stand_alone variable. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
To drop a global. Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
Signed-off-by: Reid Wahl <nrwahl@protonmail.com>
add6809 to
93dc367
Compare
Contributor
Author
|
I haven't run cts-lab on this in a while (assuming I ever did so), so that should be done before merging. |
Contributor
Author
Running now. 12 tests successful so far. |
Contributor
Author
Did a 35-test run, all good. |
This file contains hidden or bidirectional Unicode text that may be interpreted or compiled differently than what appears below. To review, open the file in an editor that reveals hidden Unicode characters.
Learn more about bidirectional Unicode characters
Sign up for free
to join this conversation on GitHub.
Already have an account?
Sign in to comment
Add this suggestion to a batch that can be applied as a single commit.This suggestion is invalid because no changes were made to the code.Suggestions cannot be applied while the pull request is closed.Suggestions cannot be applied while viewing a subset of changes.Only one suggestion per line can be applied in a batch.Add this suggestion to a batch that can be applied as a single commit.Applying suggestions on deleted lines is not supported.You must change the existing code in this line in order to create a valid suggestion.Outdated suggestions cannot be applied.This suggestion has been applied or marked resolved.Suggestions cannot be applied from pending reviews.Suggestions cannot be applied on multi-line comments.Suggestions cannot be applied while the pull request is queued to merge.Suggestion cannot be applied right now. Please check back later.
Next batch from #4011