Bug 1155690 - nginx starts faster than the network and needs Requires=network.target
Status: RESOLVED FIXED
Classification: openSUSE
Product: openSUSE Tumbleweed
Component: Other
Version: Current
Hardware: All openSUSE Factory
Priority: P5 - None
Severity: Normal
Assigned To: Artem Chernikov
Reported: 2019-11-02 21:48 UTC by Илья Индиго
Modified: 2021-03-19 08:05 UTC (History)

Description Илья Индиго 2019-11-02 21:48:19 UTC
nginx starts faster than the network (systemd-networkd).
When using SSL Stapling,

ssl_stapling on;
ssl_stapling_verify on;
resolver 1.1.1.1 8.8.8.8 valid=300s ipv6=off;

when the server boots up, the OCSP access errors appear in the logs.

[warn] 933#933: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org" in the certificate

After=network.target alone is not enough; you also need Wants=network.target in the service file.
Comment 1 Илья Индиго 2019-11-02 22:06:37 UTC
Wants=network.target is also not enough, you need Requires=network.target
Comment 2 Илья Индиго 2019-11-02 22:29:39 UTC
I do not understand. Even this is not enough. Errors still appear when the server boots.
The IP address is static from the router, which constantly has an Internet connection.
Comment 3 Aaron Puchert 2019-11-05 17:15:32 UTC
What about After=network-online.target?
Comment 4 Илья Индиго 2019-11-07 14:26:37 UTC
(In reply to Aaron Puchert from comment #3)
> What about After=network-online.target?

After=network-online.target remote-fs.target nss-lookup.target

sudo systemctl daemon-reexec && sudo systemctl reboot

2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:51 [warn] 897#897: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
2019/11/07 17:19:52 [warn] 932#932: "ssl_stapling" ignored, host not found in OCSP responder "ocsp.int-x3.letsencrypt.org"...
Comment 5 Aaron Puchert 2019-11-07 23:09:49 UTC
Hmm, if you're using NetworkManager, that's not very surprising:

$ systemctl list-dependencies --after network.target
network.target
● ├─NetworkManager.service
● └─network-pre.target
$ systemctl list-dependencies --after network-online.target
network-online.target
● └─network.target
●   ├─NetworkManager.service
●   └─network-pre.target

So it doesn't make a difference. The dhclient is a child of NetworkManager and might be started later than that. If you use wicked, could you post the outputs of these two commands? If not, maybe try wicked.
Comment 6 Илья Индиго 2019-11-07 23:20:22 UTC
I use systemd-networkd.
Comment 7 Илья Индиго 2019-11-07 23:22:28 UTC
sudo systemctl list-dependencies --after network.target

network.target
● ├─systemd-networkd.service
● └─network-pre.target
●   └─firewalld.service
Comment 8 Aaron Puchert 2019-11-07 23:49:35 UTC
And network-online.target has the same dependencies?
Comment 9 Илья Индиго 2019-11-07 23:51:01 UTC
sudo systemctl list-dependencies --after network-online.target

network-online.target
● └─network.target
●   ├─systemd-networkd.service
●   └─network-pre.target
●     └─firewalld.service
Comment 10 Aaron Puchert 2019-11-08 00:07:08 UTC
Could you try systemd-networkd-wait-online.service?
Comment 11 Stefan Schubert 2019-12-10 15:22:55 UTC
(In reply to Aaron Puchert from comment #10)
> Could you try systemd-networkd-wait-online.service?
Comment 12 Stefan Schubert 2019-12-10 15:24:36 UTC
Artem, do you have an idea? :-)
Comment 13 Илья Индиго 2019-12-11 07:57:54 UTC
Sorry for not responding right away.
I had a lot of work, then I got sick, and then I completely forgot.

Using the "systemd-networkd-wait-online.service" service for something is a very bad idea.

By default, it waits for ALL network interfaces to be initialized, with NO time limit.
This causes the server to hang on the line:
Job is Running for Network to be Configured (... / no limit).

And nothing more can be done except to boot into single-user mode and disable it.

I solved the problem like this.
At first I added After=php-fpm.service and Wants=php-fpm.service to nginx.service (I need nginx to start last anyway), but even this was not enough: one line still appeared in the log.
Then I added After=mariadb.service and Wants=mariadb.service to php-fpm.service.

Only after that was the start of nginx reliably delayed until the network connection was fully established.

I understand that this is not a universal solution.
If you think that nothing can be done here, you can close the bug.
Comment 14 Aaron Puchert 2019-12-11 14:03:23 UTC
(In reply to Илья Индиго from comment #13)
> Using the "systemd-networkd-wait-online.service" service for something is a
> very bad idea.
> 
> By default, it waits for ALL network interfaces to be initialized, with NO
> time limit.
> This causes the server to hang on the line:
> Job is Running for Network to be Configured (... / no limit).

Are you aware of the options for systemd-networkd-wait-online? [1]

With -i you can specify which interfaces you want to wait for, or you can ignore some interfaces, and you can also specify a timeout.

[1] http://man7.org/linux/man-pages/man8/systemd-networkd-wait-online.service.8.html
Comment 15 Илья Индиго 2019-12-11 17:23:29 UTC
Where and how to specify these options?
sudo systemctl edit systemd-networkd-wait-online.service ?
Comment 16 Aaron Puchert 2019-12-12 01:27:17 UTC
(In reply to Илья Индиго from comment #15)
> Where and how to specify these options?
> sudo systemctl edit systemd-networkd-wait-online.service ?

If you look at the unit file behind that service, /usr/lib/systemd/system/systemd-networkd-wait-online.service, you see that it has this line:

ExecStart=/usr/lib/systemd/systemd-networkd-wait-online

This refers to the executable from the man page in comment 14; that is where we want to add the options. If you edit this file directly, your changes will be overwritten by the next systemd update. So instead we use the drop-in mechanism [1]: create a directory /etc/systemd/system/systemd-networkd-wait-online.service.d and, in that directory, a file with a name ending in .conf and the content

[Service]
ExecStart=
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online <options>

where you add your options. The empty ExecStart= line clears the command inherited from the packaged unit; without it, this Type=oneshot service would run the original command as well as yours. Then check with

systemctl cat systemd-networkd-wait-online.service
systemctl show systemd-networkd-wait-online.service

that your options appear. Maybe you'll have to run "systemctl daemon-reload" first.

[1] https://www.freedesktop.org/software/systemd/man/systemd.unit.html
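
As a concrete illustration of the drop-in described in comment 16, here is a minimal sketch. The file name override.conf is an assumption (any name ending in .conf works), and the interface names and timeout are the ones the reporter ends up using in comment 17; they will differ on other systems.

```ini
# /etc/systemd/system/systemd-networkd-wait-online.service.d/override.conf
# (the file name "override.conf" is an assumption; any *.conf name works)

[Service]
# An empty assignment resets the ExecStart= list inherited from the packaged unit.
ExecStart=
# Succeed as soon as any one of the listed interfaces is online,
# and give up after 30 seconds instead of waiting indefinitely.
# enp3s0 and wlp2s0 are the reporter's interface names, not universal ones.
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online -i enp3s0 -i wlp2s0 --any --timeout=30
```

After saving the file, "systemctl daemon-reload" makes systemd pick it up, and "systemctl cat systemd-networkd-wait-online.service" should then list the drop-in below the packaged unit.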
Comment 17 Илья Индиго 2019-12-12 08:34:22 UTC
Thank you, Aaron!

nginx.service
Wants=network-online.target
After=network-online.target remote-fs.target nss-lookup.target

systemd-networkd-wait-online.service
ExecStart=/usr/lib/systemd/systemd-networkd-wait-online -i enp3s0 -i wlp2s0 --any --timeout=30

3 reboots one after another without errors in nginx log.
Comment 18 Aaron Puchert 2019-12-13 00:14:28 UTC
(In reply to Илья Индиго from comment #17)
> 3 reboots one after another without errors in nginx log.

That's good to hear. So I think the solution for this bug would be to add

After=systemd-networkd-wait-online.service

to nginx.service. Unless I'm mistaken, this will only ensure that nginx is started after systemd-networkd-wait-online.service if that is enabled (it seems to be disabled by default), otherwise it's a no-op.
Comment 19 Илья Индиго 2019-12-13 07:47:02 UTC
> After=systemd-networkd-wait-online.service
"After=" only guarantees that the daemon will be started after the specified daemons, but it does not guarantee that it will start later than them.
nginx is a very fast daemon.

I also conducted an experiment and confirmed that the "After=" section does not affect the result at all.
A stable absence of errors in the log is possible for me only if 2 factors are observed.

1 nginx.service
Wants=network-online.target

2 Configured and running systemd-networkd-wait-online.service

Then everything is stably good for me.

So I recommend at least adding to the standard nginx.service
Wants=network-online.target
And also, just in case, replace in the After section "network.target" with "network-online.target".

And also change or specify timeout=120 for "systemd-networkd-wait-online.service".
The documentation states that it is 120 seconds, but in fact it is infinite, that is, equal to zero.
Comment 20 Aaron Puchert 2019-12-13 22:44:12 UTC
(In reply to Илья Индиго from comment #19)
> > After=systemd-networkd-wait-online.service

Let me correct this, it should be

After=network-online.target

Because systemd-networkd-wait-online.service has

WantedBy=network-online.target

that should be enough, and nginx shouldn't care about the details of the network configuration.

> "After=" only guarantees that the daemon will be started after the specified
> daemons, but it does not guarantee that it will start later than them.
> nginx is a very fast daemon.

What do you mean by this? What is the difference between "after" and "later"?

> I also conducted an experiment and confirmed that the "After=" section does
> not affect the result at all.
> A stable absence of errors in the log is possible for me only if 2 factors
> are observed.
> 
> 1 nginx.service
> Wants=network-online.target

Wants= and Requires= only say that you want network-online.target at some point in time; they do not say anything about the chronological order in which the units should be started. ("Note that requirement dependencies do not influence the order in which services are started or stopped.")

So we definitely also need After= here.

I was under the impression that network-online.target is enabled by default, but that's not the case. So we need both Wants= and After=. That's not without precedent, other services like winbind.service have the same.
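
A minimal sketch of that Wants= plus After= combination, written as a local drop-in rather than a change to the packaged unit. The directory follows the usual drop-in convention; the file name network-online.conf is an assumption for illustration.

```ini
# /etc/systemd/system/nginx.service.d/network-online.conf
# (hypothetical file name; any *.conf name in this directory works)

[Unit]
# Pull network-online.target into the boot transaction ...
Wants=network-online.target
# ... and do not start nginx until that target has been reached.
After=network-online.target
```

Both settings accumulate with the values already in nginx.service, so the existing After= entries stay in effect. The target only delays startup if a service that actually waits for the network, such as systemd-networkd-wait-online.service, is enabled behind it.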

> 2 Configured and running systemd-networkd-wait-online.service

Correct, but this can't be fixed on the distribution side, I believe. Especially if users have to add their own flags to make it work properly.
 
> And also change or specify timeout=120 for
> "systemd-networkd-wait-online.service".
> The documentation states that it is 120 seconds, but in fact it is infinite,
> that is, equal to zero.

That seems wrong, maybe you can open a separate bug report for that?
Comment 21 Aaron Puchert 2019-12-13 23:02:27 UTC
On the other hand, to quote [1]: "network server software should generally not pull this [=network-online.target] in (since server software generally is happy to accept local connections even before any routable network interface is up), its primary purpose is network client software that cannot operate without network." That seems right to me. By the way, this is a very interesting read and it should answer all your questions.

If I understand correctly, nginx still starts correctly without these changes, there are just some warnings in the beginning because it assumes that the network is already up. Maybe you can turn off these warnings or find a way to delay them for some time?

Anyway, I take back my recommendation to add "After=network-online.target" to nginx.service, because that shouldn't be needed for an HTTP server.

[1] https://www.freedesktop.org/wiki/Software/systemd/NetworkTarget/
Comment 22 Илья Индиго 2019-12-14 07:38:42 UTC
> Let me correct this, it should be
> After=network-online.target
I do not mind, but this alone without Wants=network-online.target is not enough.
Yesterday I spent about an hour experimenting, rebooting the server several times in a row after each change.

> What do you mean by this? What is the difference between "after" and "later"?
Translation difficulties.
Suppose we have daemon A (networkd), which takes 3 seconds to start, and daemon B (nginx), which takes 1 second.
In daemon B, specify After=daemonA.service
Suppose that at the 10th second of booting it is daemon B's turn: daemon A is launched first, at the 10th second, and daemon B at the 11th.
But daemon A finishes loading only at the 13th second, since it takes 3 seconds to load, while daemon B is already up at the 12th, even though it was launched after daemon A.

> So we need both Wants= and After=. 
I completely agree with you, I wrote about it below.
Here I just gave practical results of the experiments.

> That seems wrong, maybe you can open a separate bug report for that?
Well, then I will need to conduct additional experiments to accurately reproduce the scenario of behavior, after which I will write.
Comment 23 Aaron Puchert 2019-12-14 15:39:45 UTC
(In reply to Илья Индиго from comment #22)
> > Let me correct this, it should be
> > After=network-online.target
> I do not mind, but this alone without Wants=network-online.target is not
> enough.

Correct, it also needs to be active.

> Translation difficulties.
> Suppose we have daemon A (networkd), which takes 3 seconds to start, and
> daemon B (nginx), which takes 1 second.
> In daemon B, specify After=daemonA.service
> Suppose that at the 10th second of booting it is daemon B's turn: daemon A
> is launched first, at the 10th second, and daemon B at the 11th.
> But daemon A finishes loading only at the 13th second, since it takes 3
> seconds to load, while daemon B is already up at the 12th, even though it
> was launched after daemon A.

I think there is a misunderstanding here. Systemd services have different types, and systemd-networkd-wait-online.service has

Type=oneshot

https://www.freedesktop.org/software/systemd/man/systemd.service.html talks about when services are considered started and how that depends on the type. For oneshot services, it says "the service manager will consider the unit up after the main process exits". Now systemd-networkd-wait-online.service will only exit when the network is actually up, so if nginx has an After= dependency on this service and this service is enabled, nginx will start after the network is up.

The problem is that After= doesn't imply Requires= or Wants=, so if the unit isn't enabled for other reasons, nothing will happen. After= only does something if both units are enabled.
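
The oneshot behaviour can be tried in isolation with two throwaway units. This is only a sketch: the unit names wait-demo.service and consumer-demo.service and the sleep/echo commands are hypothetical stand-ins for systemd-networkd-wait-online.service and nginx.service. The block shows two separate files, each introduced by a comment with its path.

```ini
# /etc/systemd/system/wait-demo.service (hypothetical)
[Unit]
Description=Stand-in for a service that blocks until some condition holds

[Service]
Type=oneshot
# The unit counts as started only once this command has exited.
ExecStart=/usr/bin/sleep 3

# /etc/systemd/system/consumer-demo.service (hypothetical)
[Unit]
Description=Stand-in for a fast daemon such as nginx
# Wants= pulls wait-demo.service into the same transaction;
# After= delays this unit until wait-demo.service has finished starting.
Wants=wait-demo.service
After=wait-demo.service

[Service]
Type=oneshot
ExecStart=/usr/bin/echo "started after wait-demo finished"
```

After "systemctl daemon-reload", starting consumer-demo.service should take about three seconds, because the sleep has to exit first. If the Wants= line is removed and nothing else starts wait-demo.service, the After= line has nothing to order against and the delay disappears.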
Comment 24 Илья Индиго 2019-12-30 12:42:59 UTC
-After=network.target remote-fs.target nss-lookup.target
+After=network-online.target remote-fs.target nss-lookup.target
+Wants=network-online.target

Accepted to Factory.
Comment 27 Swamp Workflow Management 2020-04-22 13:14:09 UTC
SUSE-RU-2020:1064-1: An update that has three recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1150711,1155690,1156202
CVE References: 
Sources used:
SUSE Linux Enterprise Module for Server Applications 15-SP1 (src):    nginx-1.16.1-6.10.4
SUSE Linux Enterprise Module for Open Buildservice Development Tools 15-SP1 (src):    nginx-1.16.1-6.10.4

NOTE: This line indicates an update has been released for the listed product(s). At times this might be only a partial fix. If you have questions please reach out to maintenance coordination.
Comment 28 Swamp Workflow Management 2020-05-01 22:25:29 UTC
openSUSE-RU-2020:0577-1: An update that has three recommended fixes can now be installed.

Category: recommended (moderate)
Bug References: 1150711,1155690,1156202
CVE References: 
Sources used:
openSUSE Leap 15.1 (src):    nginx-1.16.1-lp151.4.9.1
Comment 29 Swamp Workflow Management 2020-05-04 19:16:46 UTC
SUSE-SU-2020:1171-1: An update that solves one vulnerability and has three fixes is now available.

Category: security (moderate)
Bug References: 1150711,1155690,1156202,1160682
CVE References: CVE-2019-20372
Sources used:
SUSE Linux Enterprise Server for SAP 15 (src):    nginx-1.16.1-3.12.7
SUSE Linux Enterprise Server 15-LTSS (src):    nginx-1.16.1-3.12.7
SUSE Linux Enterprise High Performance Computing 15-LTSS (src):    nginx-1.16.1-3.12.7
SUSE Linux Enterprise High Performance Computing 15-ESPOS (src):    nginx-1.16.1-3.12.7
Comment 30 Илья Индиго 2020-05-05 09:02:21 UTC
nginx 1.16.1 is the old stable branch.
It needs an update to 1.18.0.