Bug 1189868

Summary: [Build 20210825] 389ds: dsctl status fails
Product: [openSUSE] openSUSE Tumbleweed Reporter: Dominique Leuenberger <dimstar>
Component: Other Assignee: William Brown <william.brown>
Status: RESOLVED FIXED QA Contact: E-mail List <qa-bugs>
Severity: Normal    
Priority: P5 - None CC: dimstar, wegao, yuwang
Version: Current   
Target Milestone: ---   
Hardware: Other   
OS: Other   
URL: https://openqa.opensuse.org/tests/1888017/modules/openldap_to_389ds/steps/75
Whiteboard:
Found By: openQA Services Priority:
Business Priority: Blocker: Yes
Marketing QA Status: --- IT Deployment: ---

Description Dominique Leuenberger 2021-08-26 21:31:33 UTC
## Observation

openQA test in scenario opensuse-Tumbleweed-DVD-x86_64-openldap_to_389ds@64bit fails in
[openldap_to_389ds](https://openqa.opensuse.org/tests/1888017/modules/openldap_to_389ds/steps/75)

Error: 'NoneType' object has no attribute 'get'



## Test suite description
ldap_to_389ds tools test
Maintainer: wegao@suse.com


## Reproducible

Fails since (at least) Build [20210825](https://openqa.opensuse.org/tests/1887217)


## Expected result

Last good: [20210824](https://openqa.opensuse.org/tests/1885598) (or more recent)


## Further details

Always latest result in this scenario: [latest](https://openqa.opensuse.org/tests/latest?arch=x86_64&distri=opensuse&flavor=DVD&machine=64bit&test=openldap_to_389ds&version=Tumbleweed)
Comment 1 openQA Review 2021-09-10 00:48:38 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: openldap_to_389ds
https://openqa.opensuse.org/tests/1909473

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
3. The label in the openQA scenario is removed
Comment 2 William Brown 2021-09-12 23:40:29 UTC
I can't work out what openQA is doing in these links, so can someone please provide reproduction steps that I can run manually? I've been unable to reproduce this issue so far, so I'm not sure of the true origin yet.
Comment 3 William Brown 2021-09-20 03:17:38 UTC
Reminder for needinfo
Comment 4 Dominique Leuenberger 2021-09-20 09:50:03 UTC
The entire test code is documented in

https://openqa.opensuse.org/tests/1925185/modules/openldap_to_389ds/steps/1/src

The test downloads a pre-installed disk image
(See Logs&Assets of the referenced, always latest build)
boots it, then does:

     zypper in 389-ds sssd sssd-tools
     zypper info 389-ds

     # Install openldap since we need to use the slaptest tools
     zypper in openldap2 sssd-ldap openldap2-client
 
     # Disable and stop the nscd daemon because it conflicts with sssd
     systemctl --no-pager disable nscd
     systemctl --no-pager stop nscd
 
     # On newer environments, nsswitch.conf is located in /usr/etc
     # Copy it to /etc directory
     f=/etc/nsswitch.conf; [ ! -f $f ] && cp /usr$f $f
     # Configure nsswitch with sssd
     sed -i 's/^passwd:.*/passwd: compat sss/' /etc/nsswitch.conf
     sed -i 's/^group:.*/group: compat sss/' /etc/nsswitch.conf
     cat /etc/nsswitch.conf
 
<===> This is a bit special: it downloads the data from
https://github.com/os-autoinst/os-autoinst-distri-opensuse/tree/master/data/openldap_to_389ds (proxied through openQA) and puts it into a 'test' directory.
     # Prepare test env
     assert_script_run "cd; curl -L -v " . autoinst_url . "/data/openldap_to_389ds > openldap_to_389ds.data && cpio -id < openldap_to_389ds.data && mv data test && ls test";
<===>

     cd test
 
     # We need to start openldap to generate the database files, which are stored in the directory
     mkdir /tmp/ldap-sssdtest
     slapd -h 'ldap:///' -f slapd.conf
     ldapadd -x -D 'cn=root,dc=ldapdom,dc=net' -wpass -f db.ldif
     killall slapd
     ps -aux | grep slapd
 
     # setup sssd
     cp ./sssd.conf /etc/sssd/sssd.conf
     systemctl stop sssd
     rm -rf /var/lib/sss/db/*
     systemctl restart sssd
     systemctl status sssd
 
     # Prepare data file for migration
     sed -i 's/^root_password.*/root_password = $password/' ./instance.inf
     mkdir slapd.d
     dscreate from-file ./instance.inf
     dsctl localhost status

The last command, dsctl localhost status, is the one failing; it returns

Error: 'NoneType' object has no attribute 'get'
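
For readers unfamiliar with the message: it is the text Python produces when `.get` is called on `None`, which usually means a lookup returned nothing and the caller did not check for that. A minimal sketch of the failure pattern follows. The names `lookup_instance`, `instance_port` and `describe_failure` are hypothetical, chosen for illustration only, and this is not lib389 code.

```python
def lookup_instance(name, known):
    """Hypothetical helper: return the config dict for `name`, or None if unknown."""
    return known.get(name)


def instance_port(name, known):
    """Read the port of an instance without checking that the lookup succeeded."""
    config = lookup_instance(name, known)
    # If the lookup returned None, this line raises AttributeError.
    return config.get("port")


def describe_failure(name, known):
    """Format the outcome the way a CLI might print it."""
    try:
        return str(instance_port(name, known))
    except AttributeError as exc:
        return f"Error: {exc}"


# An unknown instance reproduces the wording of the reported message.
print(describe_failure("localhost", {}))
```

The wording alone does not say which lookup returned `None`, which is why verbose output is needed to locate the origin.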
Comment 5 William Brown 2021-09-21 01:52:43 UTC
Hi there, I've just tried to reproduce this and I can't do it with latest TW following the steps and using the provided config files:


localhost:~ # dsctl -v localhost status
DEBUG: The 389 Directory Server Administration Tool
DEBUG: Inspired by works of: ITS, The University of Adelaide
DEBUG: Called with: Namespace(func=<function instance_status at 0x7f7a8854ec10>, instance='localhost', json=False, list=False, remove_all=False, verbose=True)
DEBUG: Allocate <class 'lib389.DirSrv'> with None
DEBUG: Allocate <class 'lib389.DirSrv'> with b'localhost':636
DEBUG: Allocate <class 'lib389.DirSrv'> with b'localhost':636
DEBUG: Instance allocated
DEBUG: systemd status -> True
INFO: Instance "localhost" is running


Can you please change 

assert_script_run("dscreate from-file ./instance.inf", timeout => 180);
to
assert_script_run("dscreate -v from-file ./instance.inf", timeout => 180);

and 

     assert_script_run "dsctl localhost status";
to
     assert_script_run "dsctl -v localhost status";


I also need to see the output of 

dsctl -v -l 

rpm -qa | grep -i 389

Thanks,
Comment 6 William Brown 2021-10-04 23:19:11 UTC
Reminder for needinfo
Comment 7 openQA Review 2021-10-18 23:59:19 UTC
This is an autogenerated message for openQA integration by the openqa_review script:

This bug is still referenced in a failing openQA test: openldap_to_389ds
https://openqa.opensuse.org/tests/1974916

To prevent further reminder comments one of the following options should be followed:
1. The test scenario is fixed by applying the bug fix to the tested product or the test is adjusted
2. The openQA job group is moved to "Released" or "EOL" (End-of-Life)
3. The bugref in the openQA scenario is removed or replaced, e.g. `label:wontfix:boo1234`
Comment 8 Dominique Leuenberger 2021-10-19 12:22:32 UTC
https://openqa.opensuse.org/tests/1978570

This test run executed

dscreate -v from-file ./instance.inf

https://openqa.opensuse.org/tests/1978570#step/openldap_to_389ds/75

and

 dsctl -v localhost status

https://openqa.opensuse.org/tests/1978570#step/openldap_to_389ds/78
Comment 9 William Brown 2021-10-21 00:30:14 UTC
I need the content of ./instance.inf in order to try to reproduce this, and I can't make sense of the openqa website so I can't find the file content it's so confusing :( can you upload or provide this file for me? 

On my own TW setup I can't reproduce. 

INFO: Perform post-installation tasks ...
DEBUG: cn=config set REPLACE: ('nsslapd-rootpw', '********')
DEBUG: systemd status -> True
DEBUG: systemd status -> True
DEBUG: systemd status -> True
DEBUG: systemd status -> True
DEBUG:  Instance setup complete
DEBUG: FINISH: Completed installation for instance: slapd-localhost
localhost:/home/admin # dsctl localhost status
Instance "localhost" is running
localhost:/home/admin # dsctl -v localhost status
DEBUG: The 389 Directory Server Administration Tool
DEBUG: Inspired by works of: ITS, The University of Adelaide
DEBUG: Called with: Namespace(func=<function instance_status at 0x7ff47f651c10>, instance='localhost', json=False, list=False, remove_all=False, verbose=True)
DEBUG: Allocate <class 'lib389.DirSrv'> with None
DEBUG: Allocate <class 'lib389.DirSrv'> with b'localhost':636
DEBUG: Allocate <class 'lib389.DirSrv'> with b'localhost':636
DEBUG: Instance allocated
DEBUG: systemd status -> True
INFO: Instance "localhost" is running
Comment 10 Dominique Leuenberger 2021-10-21 07:30:52 UTC
(In reply to William Brown from comment #9)
> I need the content of ./instance.inf in order to try to reproduce this, and
> I can't make sense of the openqa website so I can't find the file content
> it's so confusing :( can you upload or provide this file for me? 

As mentioned in comment #4: all config files used by the test come from

https://github.com/os-autoinst/os-autoinst-distri-opensuse/tree/master/data/openldap_to_389ds

There is also instance.inf

(if you want, I can also give you an introduction to openQA)
Comment 11 Dominique Leuenberger 2021-10-21 08:35:04 UTC
CC maintainer of the openQA test module
Comment 12 William Brown 2021-10-22 00:21:20 UTC
> 
> (if you want, I can also give you an introduction to openQA)

Sure, that would help. But really the whole UI needs work because it's super confusing :| 

Anyway, with these files I still can't reproduce the problem:

localhost:/home/admin # dscreate from-file /root/test.inf
Starting installation ...
Validate installation settings ...
Create file system structures ...
Create self-signed certificate database ...
selinux is disabled, will not relabel ports or files.
Perform SELinux labeling ...
selinux is disabled, will not relabel ports or files.
Create database backend: dc=example,dc=com ...
Perform post-installation tasks ...
Completed installation for instance: slapd-localhost
localhost:/home/admin # dsctl localhost status
Instance "localhost" is running

As far as I can see, the issue is something in openQA and the testcase, not 389-ds itself.
Comment 13 WEI GAO 2021-10-25 03:11:09 UTC
Trying to provide an environment for William to debug.
Comment 14 William Brown 2021-10-25 04:31:47 UTC
https://github.com/389ds/389-ds-base/issues/4959

Found cause, reported upstream and will work on a fix.
Comment 15 William Brown 2021-10-25 04:54:12 UTC
https://github.com/389ds/389-ds-base/pull/4960

I've put in a fix to the error message here, but we can't actually do much else besides being clearer about what's going on.

The issue is that susetest machines in openqa seem to have an invalid hostname that doesn't resolve to localhost:

susetest:~/test # cat /etc/hosts
#
# hosts         This file describes a number of hostname-to-address
#               mappings for the TCP/IP subsystem.  It is mostly
#               used at boot time, when no name servers are running.
#               On small systems, this file can be used instead of a
#               "named" name server.
# Syntax:
#
# IP-Address  Full-Qualified-Hostname  Short-Hostname
#

127.0.0.1	localhost
# fallback hostname used by NetworkManager
127.0.0.1	localhost.localdomain

# special IPv6 addresses
::1             localhost ipv6-localhost ipv6-loopback

fe00::0         ipv6-localnet

ff00::0         ipv6-mcastprefix
ff02::1         ipv6-allnodes
ff02::2         ipv6-allrouters
ff02::3         ipv6-allhosts


Due to this, when lib389 tries to determine whether the instance status can be retrieved, we check whether the named instance is local or not. To determine this we use a number of factors, but one is whether the hostname matches a local IP on the machine.

In this case, we likely need to add to /etc/hosts:

127.0.0.1	susetest

Which would resolve this detection issue.
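
To illustrate the kind of check described above, here is a minimal sketch, not the lib389 implementation. It assumes the simplest possible rule: a hostname counts as local only if it resolves to a loopback address, whereas lib389 uses several factors. `resolves_to_loopback` and `make_resolver` are hypothetical names; the fake resolver only stands in for the contents of /etc/hosts so that the example does not depend on the machine it runs on.

```python
import ipaddress
import socket


def resolves_to_loopback(hostname, resolve=socket.gethostbyname):
    """Return True if `hostname` resolves to a loopback address.

    Simplification (assumption): only loopback counts as local here.
    """
    try:
        address = resolve(hostname)
    except OSError:
        # socket.gaierror is a subclass of OSError: the name did not resolve.
        return False
    return ipaddress.ip_address(address).is_loopback


def make_resolver(table):
    """Build a fake resolver from a name -> address dict (illustration only)."""
    def resolve(hostname):
        try:
            return table[hostname]
        except KeyError:
            raise socket.gaierror(f"cannot resolve {hostname}") from None
    return resolve


# Hosts file as shown above: there is no entry for susetest.
before = make_resolver({"localhost": "127.0.0.1"})
# Hosts file after adding "127.0.0.1 susetest".
after = make_resolver({"localhost": "127.0.0.1", "susetest": "127.0.0.1"})

print(resolves_to_loopback("susetest", before))
print(resolves_to_loopback("susetest", after))
```

Calling `resolves_to_loopback("susetest")` without the second argument uses the system resolver, so it can be run directly on an affected machine.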
Comment 16 WEI GAO 2021-10-25 09:02:15 UTC
https://progress.opensuse.org/issues/101435 for further tracking of the openQA task.
Comment 17 yutao wang 2021-10-27 01:41:49 UTC
Added '127.0.0.1	susetest' to /etc/hosts.
With this change the test passes.
Verified case: https://openqa.opensuse.org/tests/1993549#
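
A test that wants to assert the workaround is in place could check the hosts file content directly. The following is a rough sketch under that assumption; `hosts_entries` is a hypothetical helper, not part of the openQA test module or of lib389.

```python
def hosts_entries(text):
    """Parse hosts-file text into a dict mapping each name to its addresses."""
    entries = {}
    for line in text.splitlines():
        # Drop comments and surrounding whitespace.
        line = line.split("#", 1)[0].strip()
        if not line:
            continue
        address, *names = line.split()
        for name in names:
            entries.setdefault(name, []).append(address)
    return entries


sample = """\
127.0.0.1\tlocalhost
# fallback hostname used by NetworkManager
127.0.0.1\tlocalhost.localdomain
127.0.0.1\tsusetest
::1             localhost ipv6-localhost ipv6-loopback
"""

entries = hosts_entries(sample)
print("susetest" in entries, entries.get("susetest"))
```

On a real system the input would be read from /etc/hosts, for example with `open("/etc/hosts").read()`.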
Comment 18 William Brown 2021-10-27 01:42:36 UTC
Thanks, I'll get the upstream code merged to help with clearer errors about this.
Comment 19 Dominique Leuenberger 2022-10-26 11:19:54 UTC
This issue has been fixed (also in 389ds upstream)

https://github.com/389ds/389-ds-base/issues/4959