[error_logger:error,2019-12-16T00:25:20.166-08:00,ns_1@cb.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]** Generic server ns_heart terminating
** Last message in was beat
** When Server state == {state,undefined,<0.29983.60>,[],0}
[ns_server:debug,2019-12-16T00:25:20.166-08:00,ns_1@cb.local:<0.29983.60>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {buckets_events,<0.29982.60>} exited with reason {enotsup,
By strace'ing the beam.smp process which was running at ~100% CPU I see a number of read syscalls failing with EOPNOTSUPP - as per the ns_heart error seen in the logs:
, when enabled
CONFIG_PSI ("Pressure stall information tracking").
When CONFIG_PSI_DEFAULT_DISABLED is also enabled (and psi=1 kernel
parameter is not set) these files are created, but not readable.
I believe that EOPNOTSUPP needs to be handled as possible error code in this path.
Dave Rigby
added a comment - - edited By strace 'ing the beam.smp process which was running at ~100% CPU I see a number of read syscalls failing with EOPNOTSUPP - as per the ns_heart error seen in the logs:
$ strace -p 76463 -f -e trace=%desc
...
76469 openat(AT_FDCWD, "/proc/pressure/cpu" , O_RDONLY) = 47
76469 fstat(47, {st_mode=S_IFREG|0444, st_size=0, ...}) = 0
76469 read (47, 0x7f8dea68b6c8, 16384) = -1 EOPNOTSUPP (Operation not supported)
76469 close(47)
...
It appears that something is preventing ns_heart from reading from the /proc file /proc/pressure/cpu .
From https://patchwork.ozlabs.org/patch/1031852/ , it appears that those files can be present, but unreadable if CONFIG_PSI_DEFAULT_DISABLED is set:
Kernel v4.20 added /proc/pressure/
Unknown macro: {cpu,io,memory}
, when enabled
CONFIG_PSI ("Pressure stall information tracking").
When CONFIG_PSI_DEFAULT_DISABLED is also enabled (and psi=1 kernel
parameter is not set) these files are created, but not readable.
Signed-off-by: Petr Vorel <pvorel@suse.cz>
—
testcases/kernel/fs/proc/proc01.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/testcases/kernel/fs/proc/proc01.c b/testcases/kernel/fs/proc/proc01.c
index c886e7e85..d917591bf 100644
--- a/testcases/kernel/fs/proc/proc01.c
+++ b/testcases/kernel/fs/proc/proc01.c
@@ -111,6 +111,7 @@ static const struct mapping known_issues[] = {
{"read", "/proc/sys/vm/nr_hugepages", EOPNOTSUPP},
{"read", "/proc/sys/vm/nr_overcommit_hugepages", EOPNOTSUPP},
{"read", "/proc/sys/vm/nr_hugepages_mempolicy", EOPNOTSUPP},
+ {"read", "/proc/pressure/*", EOPNOTSUPP},
{"", "", 0}
};
I believe that EOPNOTSUPP needs to be handled as possible error code in this path.
[error_logger:error,2019-12-16T00:25:20.166-08:00,ns_1@cb.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]** Generic server ns_heart terminating
** Last message in was beat
** When Server state == {state,undefined,<0.29983.60>,[],0}
** Reason for termination ==
** {enotsup,[{misc,raw_read_loop,2,[{file,"src/misc.erl"},{line,934}]},
{misc,with_file,3,[{file,"src/misc.erl"},{line,1782}]},
{ns_heart,'-grab_procfs_files/0-lc$^0/1-0-',1,
[{file,"src/ns_heart.erl"},{line,509}]},
{ns_heart,'-grab_procfs_files/0-lc$^0/1-0-',1,
[{file,"src/ns_heart.erl"},{line,515}]},
{ns_heart,current_status_slow_inner,0,
[{file,"src/ns_heart.erl"},{line,347}]},
{ns_heart,current_status_slow,1,
[{file,"src/ns_heart.erl"},{line,250}]},
{ns_heart,update_current_status,1,
[{file,"src/ns_heart.erl"},{line,187}]},
{ns_heart,handle_info,2,[{file,"src/ns_heart.erl"},{line,118}]}]}
[ns_server:debug,2019-12-16T00:25:20.166-08:00,ns_1@cb.local:<0.29983.60>:ns_pubsub:do_subscribe_link_continue:152]Parent process of subscription {buckets_events,<0.29982.60>} exited with reason {enotsup,
[{misc,
raw_read_loop,
2,
[{file,
"src/misc.erl"},
{line,
934}]},
{misc,
with_file,
3,
[{file,
"src/misc.erl"},
{line,
1782}]},
{ns_heart,
'-grab_procfs_files/0-lc$^0/1-0-',
1,
[{file,
"src/ns_heart.erl"},
{line,
509}]},
{ns_heart,
'-grab_procfs_files/0-lc$^0/1-0-',
1,
[{file,
"src/ns_heart.erl"},
{line,
515}]},
{ns_heart,
current_status_slow_inner,
0,
[{file,
"src/ns_heart.erl"},
{line,
347}]},
{ns_heart,
current_status_slow,
1,
[{file,
"src/ns_heart.erl"},
{line,
250}]},
{ns_heart,
update_current_status,
1,
[{file,
"src/ns_heart.erl"},
{line,
187}]},
{ns_heart,
handle_info,
2,
[{file,
"src/ns_heart.erl"},
{line,
118}]}]}
[error_logger:error,2019-12-16T00:25:20.167-08:00,ns_1@cb.local:error_logger<0.32.0>:ale_error_logger_handler:do_log:203]
=========================CRASH REPORT=========================
crasher:
initial call: ns_heart:init/1
pid: <0.29982.60>
registered_name: ns_heart
exception error: enotsup
in function misc:raw_read_loop/2 (src/misc.erl, line 934)
in call from misc:with_file/3 (src/misc.erl, line 1782)
in call from ns_heart:'-grab_procfs_files/0-lc$^0/1-0-'/1 (src/ns_heart.erl, line 509)
in call from ns_heart:'-grab_procfs_files/0-lc$^0/1-0-'/1 (src/ns_heart.erl, line 515)
in call from ns_heart:current_status_slow_inner/0 (src/ns_heart.erl, line 347)
in call from ns_heart:current_status_slow/1 (src/ns_heart.erl, line 250)
in call from ns_heart:update_current_status/1 (src/ns_heart.erl, line 187)
in call from ns_heart:handle_info/2 (src/ns_heart.erl, line 118)
ancestors: [ns_heart_sup,ns_server_sup,ns_server_nodes_sup,<0.206.0>,
ns_server_cluster_sup,root_sup,<0.118.0>]
message_queue_len: 0
messages: []
links: [<0.29981.60>,<0.29983.60>,<0.293.0>]
dictionary: []
trap_exit: true
status: running
heap_size: 6772
stack_size: 27
reductions: 8403
neighbours: