Bug 1076231 - sbd pacemaker process segfaults over and over
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: High Availability
Version: Current
Hardware: Other Other
Priority: P5 - None
Severity: Normal
Assigned To: Yan Gao
Reported: 2018-01-16 16:47 UTC by Kristoffer Gronlund
Modified: 2018-02-28 10:44 UTC (History)

Attachments
hb_report showing the problem (74.68 KB, application/x-bzip)
2018-01-16 16:48 UTC, Kristoffer Gronlund

Description Kristoffer Gronlund 2018-01-16 16:47:31 UTC
After installing the latest update to sbd (sbd-1.3.1+20171220.1e93740-66.4.x86_64), I'm seeing repeated crashes of the pcmk servant:

Jan 16 16:43:36 webui sbd[17134]:  warning: cleanup_servant_by_pid: Servant for pcmk (pid: 23957) has terminated

Setup:

This is using the Hawk development environment which is based on Tumbleweed, with updated HA packages from network:ha-clustering:Factory.

Nothing strange other than installing and running crm cluster init --enable-sbd.
Comment 1 Kristoffer Gronlund 2018-01-16 16:48:25 UTC
Created attachment 756276
hb_report showing the problem
Comment 2 Yan Gao 2018-01-17 13:51:04 UTC
I suspect that sbd and pacemaker are from different repositories?

sbd-1.3.1+20171220.1e93740-66.4.x86_64 is built against pacemaker-1.1.18+20180104.7ba28d854, but you are running pacemaker-1.1.18+20171221.c91a650ec?
Comment 3 Yan Gao 2018-01-17 14:09:25 UTC
Well, sbd-1.3.1+20171220.1e93740-1.1 from openSUSE:Factory seems to be built against an even older pacemaker-1.1.18-1.2, but meanwhile there's actually pacemaker-1.1.18+20180104.7ba28d854 in there.

And they are not ABI compatible...

Of course a new submission of sbd will trigger a rebuild, but apparently the build service doesn't work the way I thought :-\

Could you please try sbd and pacemaker both from network:ha-clustering:Factory?
Comment 4 Kristoffer Gronlund 2018-01-17 15:02:14 UTC
Hmm. That seems like something that the package dependencies should have caught :/

Could it be that pacemaker-1.1.18-1.2 sorts as newer than pacemaker-1.1.18+20180104.7ba28d854? I would have thought not..
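
For reference on the sorting question: RPM compares the version field first and only looks at the release (the "-1.2" part) when the versions are equal. The Python sketch below is a simplified approximation of RPM's segment-wise version comparison, not the real rpmvercmp. It ignores epochs, releases, and the special "~" and "^" characters, and the function name compare_versions is made up for illustration.

```python
import re


def compare_versions(a, b):
    """Return 1 if a sorts newer than b, -1 if older, 0 if equal.

    Simplified sketch of RPM-style comparison: split each version into
    runs of digits or letters (everything else is a separator), then
    compare the runs pairwise.
    """
    seg_a = re.findall(r"[0-9]+|[A-Za-z]+", a)
    seg_b = re.findall(r"[0-9]+|[A-Za-z]+", b)

    for x, y in zip(seg_a, seg_b):
        x_num, y_num = x.isdigit(), y.isdigit()
        if x_num and y_num:
            # Numeric segments compare as integers (leading zeros ignored).
            nx, ny = int(x), int(y)
            if nx != ny:
                return 1 if nx > ny else -1
        elif x_num != y_num:
            # A numeric segment sorts newer than an alphabetic one.
            return 1 if x_num else -1
        elif x != y:
            # Alphabetic segments compare as plain strings.
            return 1 if x > y else -1

    # All shared segments are equal: the version with segments left over wins.
    if len(seg_a) == len(seg_b):
        return 0
    return 1 if len(seg_a) > len(seg_b) else -1


# The two pacemaker versions discussed in this comment.
print(compare_versions("1.1.18", "1.1.18+20180104.7ba28d854"))  # prints -1
```

Under these assumptions, 1.1.18 sorts as older than 1.1.18+20180104.7ba28d854, because the "+" acts as a separator and the extra trailing segments make the second version newer.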
Comment 5 Yan Gao 2018-01-17 15:48:14 UTC
Hmm, it seems that openSUSE:Factory probably doesn't run with the same policy as normal branches like n:h:F...
Comment 6 Yan Gao 2018-02-14 11:30:06 UTC
A new submission of SBD got accepted:
https://build.opensuse.org/request/show/572562

So it got built against the latest pacemaker. It has been released for Tumbleweed.

It seems to be working well for me. You'd probably like to verify that too, Kristoffer/Eric ;)
Comment 7 Yan Gao 2018-02-26 14:36:53 UTC
Could you please confirm if it works for you too, Eric?
Comment 8 zhen ren 2018-02-28 05:00:10 UTC
(In reply to Yan Gao from comment #7)
> Could you please confirm if it works for you too, Eric?

I tested on openSUSE Tumbleweed 20180221 and no longer see the coredump error.
Comment 9 Yan Gao 2018-02-28 10:44:02 UTC
Thanks for verifying, Eric. I'm closing this.