Bugzilla – Bug 906689
xendomains fails to auto start domus
Last modified: 2015-09-09 07:42:59 UTC
I have a Xen server with several domUs (all HVM). Starting the VMs with the xl command works fine, and every config has a symbolic link in /etc/xen/auto. However, xendomains fails to restore the saved state of any of the domUs and gives little information about the reason:

Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: napier
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: An error occurred while restoring domain napier:
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: Loading new save file /var/lib/xen/save/napier (new xl fmt info 0x0/0x0/660)
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: Savefile contains xl domain config
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: WARNING: you seem to be using "kernel" directive to override HVM guest firmware. Ignore that. Use "firmware_override" instead if you really want a
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: WARNING: ignoring device_model directive.
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: WARNING: Use "device_model_override" instead if you really want a non-default device_model
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: xc: error: 0-length read: Internal error
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: xc: error: rdexact failed (read rc: 0, errno: 0): Internal error
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: xc: error: read: p2m_size (0 = Success): Internal error
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: libxl: error: libxl_create.c:959:libxl__xc_domain_restore_done: restoring domain: Resource temporarily unavailable
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: libxl: error: libxl_create.c:1041:domcreate_rebuild_done: cannot (re-)build domain: -3
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: libxl: error: libxl.c:1405:libxl__destroy_domid: non-existant domain 4
Nov 22 13:54:23 pdrvsrv001 xendomains[1445]: libxl: error: libxl.c:1369:domain_destroy_callback: unable to destroy guest with domid 4

Note that in 13.1, autostart works fine.
I have tested a save and a restore operation on a running VM:

[root|pdrvsrv001:/var/tmp]xl list
Name          ID   Mem VCPUs  State   Time(s)
Domain-0       0  8882     4  r-----   2984.5
descartes      8  1000     1  -b----   1854.4
napier         9  1000     1  -b----    233.6
lagrange      10  1000     1  -b----    481.9
pdrapp001     11  1000     1  -b----   1179.8
pdrdb001      12   768     1  -b----    129.8
pdrfs001      13   768     1  -b----    113.6
galilee       15  1000     1  -b----      8.0

[root|pdrvsrv001:/var/tmp]xl save 15 /var/tmp/galilee
Saving to /var/tmp/galilee new xl format (info 0x0/0x0/735)
xc: Saving memory: iter 0 (last sent 0 skipped 0): 1044481/1044481 100%

[root|pdrvsrv001:/var/tmp]xl restore /var/tmp/galilee
Loading new save file /var/tmp/galilee (new xl fmt info 0x0/0x0/735)
Savefile contains xl domain config
Parsing config from <saved>
WARNING: you seem to be using "kernel" directive to override HVM guest firmware. Ignore that. Use "firmware_override" instead if you really want a non-default firmware
WARNING: ignoring device_model directive.
WARNING: Use "device_model_override" instead if you really want a non-default device_model

and it works perfectly fine. I have attached to this ticket the log of the commands run with the -v option.
Created attachment 614759 [details] save and restore state operation on a vm
Is the domain properly saved? To me it looks like only the config header is stored and the memory dump is missing. It looks like systemd just kills the xl process.
While I have seen the "xc: error: 0-length read: Internal error" with SLE12, the xendomains script appears to work there. Please check what is in /var/lib/xen/save: is it just a small text file?
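A quick way to run that check is to list each save file with its size and flag anything far too small to hold a guest's memory. The following is only a sketch: the function name `check_save_files` and the 1 MiB default threshold are assumptions made for illustration, not part of xendomains or xl.

```shell
#!/bin/sh
# Sketch: list save files and flag those too small to contain a memory image.
# check_save_files and the 1 MiB default threshold are illustrative assumptions.
#   $1  directory to inspect (default: /var/lib/xen/save, as named in this report)
#   $2  minimum plausible size in bytes (default: 1048576)
check_save_files() {
    dir=${1:-/var/lib/xen/save}
    min_bytes=${2:-1048576}
    for f in "$dir"/*; do
        # Skip the unexpanded glob (empty directory) and anything not a regular file.
        [ -f "$f" ] || continue
        size=$(wc -c < "$f" | tr -d ' ')
        if [ "$size" -lt "$min_bytes" ]; then
            verdict="suspiciously small"
        else
            verdict="plausible"
        fi
        # One line per file: name, size in bytes, verdict.
        printf '%s %s %s\n' "${f##*/}" "$size" "$verdict"
    done
}
```

Calling `check_save_files` with no arguments inspects /var/lib/xen/save. A save file of only a few hundred bytes for a guest with 1000 MB of memory would suggest that only the config header was written.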
This sounds similar to a bug we had with libvirt-guests and qemu/kvm, where the qemu process running the VM was killed off before libvirt-guests could save the VM memory: https://bugzilla.redhat.com/show_bug.cgi?id=1031696

Xen is quite different in this regard, but perhaps something similar is happening.
(In reply to Olaf Hering from comment #4)
> While I have seen the "xc: error: 0-length read: Internal error" with SLE12
> the xendomains script appears to work there.
>
> Please check what is in /var/lib/xen/save, is it just a small text file?

Hi,

I only see text files, in fact the same as the domU definitions in /etc/xen/vm.

Thanks,
Romain
If this is still reproducible, please adjust /usr/lib/systemd/system/xendomains.service. It has an After= line, but no Requires=. I think it should look like this:

After=xencommons.service network-online.target
Requires=xencommons.service network-online.target
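For testing, the same two lines can also go into a systemd drop-in instead of the packaged unit file, so that a package update does not overwrite the change. This is a minimal sketch under assumptions: the function name `write_xendomains_dropin` and the file name `requires.conf` are hypothetical, and /etc/systemd/system is assumed to be the local configuration directory.

```shell
#!/bin/sh
# Sketch: write a drop-in carrying the suggested After=/Requires= lines.
# write_xendomains_dropin and requires.conf are hypothetical names.
#   $1  systemd configuration root (assumed default: /etc/systemd/system)
write_xendomains_dropin() {
    root=${1:-/etc/systemd/system}
    dropin_dir="$root/xendomains.service.d"
    mkdir -p "$dropin_dir" || return 1
    # Dependency settings in a drop-in are added to those of the packaged unit.
    cat > "$dropin_dir/requires.conf" <<'EOF'
[Unit]
After=xencommons.service network-online.target
Requires=xencommons.service network-online.target
EOF
}
```

After running `write_xendomains_dropin` as root, `systemctl daemon-reload` makes systemd pick up the drop-in, and `systemctl cat xendomains.service` should show it alongside the packaged unit.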
SUSE-SU-2015:1042-1: An update that solves 7 vulnerabilities and has one errata is now available.

Category: security (important)
Bug References: 906689,931625,931626,931627,931628,932770,932790,932996
CVE References: CVE-2015-3209,CVE-2015-4103,CVE-2015-4104,CVE-2015-4105,CVE-2015-4106,CVE-2015-4163,CVE-2015-4164
Sources used:
SUSE Linux Enterprise Software Development Kit 12 (src): xen-4.4.2_06-21.1
SUSE Linux Enterprise Server 12 (src): xen-4.4.2_06-21.1
SUSE Linux Enterprise Desktop 12 (src): xen-4.4.2_06-21.1
(In reply to Olaf Hering from comment #7)
> If this is still reproducible, please adjust
> /usr/lib/systemd/system/xendomains.service. It has an After= line, but no
> Requires=. I think it should look like this:
>
> After=xencommons.service network-online.target
> Requires=xencommons.service network-online.target

Hi,

I have made the suggested change and will test it soon.

Romain
I assume this is fixed.
openSUSE-SU-2015:1092-1: An update that solves 17 vulnerabilities and has 10 fixes is now available.

Category: security (important)
Bug References: 861318,882089,895528,901488,903680,906689,910254,912011,918995,918998,919098,919464,919663,921842,922705,922706,922709,923758,927967,929339,931625,931626,931627,931628,932770,932790,932996
CVE References: CVE-2014-3615,CVE-2015-2044,CVE-2015-2045,CVE-2015-2151,CVE-2015-2152,CVE-2015-2751,CVE-2015-2752,CVE-2015-2756,CVE-2015-3209,CVE-2015-3340,CVE-2015-3456,CVE-2015-4103,CVE-2015-4104,CVE-2015-4105,CVE-2015-4106,CVE-2015-4163,CVE-2015-4164
Sources used:
openSUSE 13.2 (src): xen-4.4.2_06-23.1
Assuming this is fixed.