- Jul 17, 2019
-
-
Joel Fernandes (Google) authored
struct pid's count is an atomic_t field used as a refcount. Use refcount_t for it which is basically atomic_t but does additional checking to prevent use-after-free bugs.

For memory ordering, the only change is with the following:

    -	if ((atomic_read(&pid->count) == 1) ||
    -	     atomic_dec_and_test(&pid->count)) {
    +	if (refcount_dec_and_test(&pid->count)) {
    		kmem_cache_free(ns->pid_cachep, pid);

Here the change is from:

    Fully ordered --> RELEASE + ACQUIRE (as per refcount-vs-atomic.rst)

This ACQUIRE should take care of making sure the free happens after the refcount_dec_and_test().

The above hunk also removes atomic_read() since it is not needed for the code to work and it is unclear how beneficial it is. The removal lets refcount_dec_and_test() check for cases where get_pid() happened before the object was freed.

Link: http://lkml.kernel.org/r/20190701183826.191936-1-joel@joelfernandes.org
Signed-off-by:
Joel Fernandes (Google) <joel@joelfernandes.org>
Reviewed-by:
Andrea Parri <andrea.parri@amarulasolutions.com>
Reviewed-by:
Kees Cook <keescook@chromium.org>
Cc: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Matthew Wilcox <willy@infradead.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Will Deacon <will.deacon@arm.com>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Elena Reshetova <elena.reshetova@intel.com>
Cc: Jann Horn <jannh@google.com>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: KJ Tsanaktsidis <ktsanaktsidis@zendesk.com>
Cc: Michal Hocko <mhocko@suse.com>
Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> -
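To make the resulting pattern concrete, here is a sketch of how the pid helpers look with pid->count as a refcount_t (close to the converted code, but shown only as an illustration of the refcount conversion described above):

    static inline struct pid *get_pid(struct pid *pid)
    {
    	if (pid)
    		refcount_inc(&pid->count);	/* warns if it sees 0, i.e. a use-after-free */
    	return pid;
    }

    void put_pid(struct pid *pid)
    {
    	struct pid_namespace *ns;

    	if (!pid)
    		return;

    	ns = pid->numbers[pid->level].ns;
    	if (refcount_dec_and_test(&pid->count)) {	/* RELEASE + ACQUIRE, see above */
    		kmem_cache_free(ns->pid_cachep, pid);
    		put_pid_ns(ns);
    	}
    }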
Oleg Nesterov authored
task->saved_sigmask and ->restore_sigmask are only used in the ret-from-syscall paths. This means that set_user_sigmask() can save ->blocked in ->saved_sigmask and do set_restore_sigmask() to indicate that ->blocked was modified.

This way the callers do not need two sigset_t's passed to set/restore, and restore_user_sigmask(), renamed to restore_saved_sigmask_unless(), turns into a trivial helper which just calls restore_saved_sigmask().

Link: http://lkml.kernel.org/r/20190606113206.GA9464@redhat.com
Signed-off-by:
Oleg Nesterov <oleg@redhat.com>
Cc: Deepa Dinamani <deepa.kernel@gmail.com>
Cc: Arnd Bergmann <arnd@arndb.de>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Davidlohr Bueso <dave@stgolabs.net>
Cc: Eric Wong <e@80x24.org>
Cc: Jason Baron <jbaron@akamai.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Al Viro <viro@ZenIV.linux.org.uk>
Cc: Eric W. Biederman <ebiederm@xmission.com>
Cc: David Laight <David.Laight@aculab.com>
Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> -
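A sketch of what the two helpers described above boil down to after this rework (based on the changelog; the exact code in the tree may differ slightly):

    int set_user_sigmask(const sigset_t __user *umask, size_t sigsetsize)
    {
    	sigset_t kmask;

    	if (!umask)
    		return 0;
    	if (sigsetsize != sizeof(sigset_t))
    		return -EINVAL;
    	if (copy_from_user(&kmask, umask, sizeof(sigset_t)))
    		return -EFAULT;

    	set_restore_sigmask();			/* tell the ret-from-syscall path to restore */
    	current->saved_sigmask = current->blocked;
    	set_current_blocked(&kmask);
    	return 0;
    }

    static inline void restore_saved_sigmask_unless(bool interrupted)
    {
    	if (interrupted)
    		WARN_ON(!test_thread_flag(TIF_SIGPENDING));
    	else
    		restore_saved_sigmask();
    }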
Elvira Khabirova authored
PTRACE_GET_SYSCALL_INFO is a generic ptrace API that lets ptracer obtain details of the syscall the tracee is blocked in.

There are two reasons for a special syscall-related ptrace request.

Firstly, with the current ptrace API there are cases when ptracer cannot retrieve necessary information about syscalls. Some examples include:

* The notorious int-0x80-from-64-bit-task issue. See [1] for details. In short, if a 64-bit task performs a syscall through int 0x80, its tracer has no reliable means to find out that the syscall was, in fact, a compat syscall, and misidentifies it.

* Syscall-enter-stop and syscall-exit-stop look the same for the tracer. Common practice is to keep track of the sequence of ptrace-stops in order not to mix the two syscall-stops up. But it is not as simple as it looks; for example, strace had a (just recently fixed) long-standing bug where attaching strace to a tracee that is performing the execve system call led to the tracer identifying the following syscall-exit-stop as syscall-enter-stop, which messed up all the state tracking.

* Since the introduction of commit 84d77d3f ("ptrace: Don't allow accessing an undumpable mm"), both PTRACE_PEEKDATA and process_vm_readv become unavailable when the process dumpable flag is cleared. On such architectures as ia64 this results in all syscall arguments being unavailable for the tracer.

Secondly, ptracers also have to support a lot of arch-specific code for obtaining information about the tracee. For some architectures, this requires a ptrace(PTRACE_PEEKUSER, ...) invocation for every syscall argument and return value.

ptrace(2) man page:

    long ptrace(enum __ptrace_request request, pid_t pid, void *addr, void *data);
    ...
    PTRACE_GET_SYSCALL_INFO
    	Retrieve information about the syscall that caused the stop. The
    	information is placed into the buffer pointed by "data" argument,
    	which should be a pointer to a buffer of type
    	"struct ptrace_syscall_info".  The "addr" argument contains the
    	size of the buffer pointed to by "data" argument (i.e.,
    	sizeof(struct ptrace_syscall_info)).  The return value contains
    	the number of bytes available to be written by the kernel.  If the
    	size of data to be written by the kernel exceeds the size
    	specified by "addr" argument, the output is truncated.

[ldv@altlinux.org: selftests/seccomp/seccomp_bpf: update for PTRACE_GET_SYSCALL_INFO]
Link: http://lkml.kernel.org/r/20190708182904.GA12332@altlinux.org
Link: http://lkml.kernel.org/r/20190510152842.GF28558@altlinux.org
Signed-off-by:
Elvira Khabirova <lineprinter@altlinux.org>
Co-developed-by:
Dmitry V. Levin <ldv@altlinux.org>
Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org>
Reviewed-by:
Oleg Nesterov <oleg@redhat.com>
Reviewed-by:
Kees Cook <keescook@chromium.org>
Reviewed-by:
Andy Lutomirski <luto@kernel.org>
Cc: Eugene Syromyatnikov <esyr@redhat.com>
Cc: Benjamin Herrenschmidt <benh@kernel.crashing.org>
Cc: Greentime Hu <greentime@andestech.com>
Cc: Helge Deller <deller@gmx.de> [parisc]
Cc: James E.J. Bottomley <jejb@parisc-linux.org>
Cc: James Hogan <jhogan@kernel.org>
Cc: kbuild test robot <lkp@intel.com>
Cc: Michael Ellerman <mpe@ellerman.id.au>
Cc: Paul Burton <paul.burton@mips.com>
Cc: Paul Mackerras <paulus@samba.org>
Cc: Ralf Baechle <ralf@linux-mips.org>
Cc: Richard Kuo <rkuo@codeaurora.org>
Cc: Shuah Khan <shuah@kernel.org>
Cc: Vincent Chen <deanbo422@gmail.com>
Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org> -
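For illustration, a minimal tracer-side use of the new request described above could look like this (a sketch; the struct fields and PTRACE_SYSCALL_INFO_* constants are those exposed by the linux/ptrace.h UAPI header):

    #include <sys/ptrace.h>
    #include <linux/ptrace.h>	/* struct ptrace_syscall_info, PTRACE_SYSCALL_INFO_* */

    static void report_syscall_stop(pid_t pid)
    {
    	struct ptrace_syscall_info info;
    	long size;

    	/* addr = size of the buffer, data = the buffer itself */
    	size = ptrace(PTRACE_GET_SYSCALL_INFO, pid,
    		      (void *)sizeof(info), &info);
    	if (size <= 0)
    		return;

    	switch (info.op) {
    	case PTRACE_SYSCALL_INFO_ENTRY:
    		/* syscall-enter-stop: info.entry.nr, info.entry.args[0..5] */
    		break;
    	case PTRACE_SYSCALL_INFO_EXIT:
    		/* syscall-exit-stop: info.exit.rval, info.exit.is_error */
    		break;
    	}
    }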
Weitao Hou authored
fix lenght to length Link: http://lkml.kernel.org/r/20190521050937.4370-1-houweitaoo@gmail.com Signed-off-by:
Weitao Hou <houweitaoo@gmail.com>
Acked-by:
Kees Cook <keescook@chromium.org>
Cc: Colin Ian King <colin.king@canonical.com>
Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 16, 2019
-
-
Cong Wang authored
trace_get_fields() is the only way to read tracepoint fields at run time, as their fields are defined at compile-time with macros. Make this function visible to all users; it will be used by the trace event injection code to calculate the size of a tracepoint entry.

Link: http://lkml.kernel.org/r/20190525165802.25944-4-xiyou.wangcong@gmail.com
Cc: Ingo Molnar <mingo@redhat.com>
Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
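As a rough sketch of the intended use of trace_get_fields() above, a caller can walk the returned list to derive the size of an entry (struct and member names here follow struct ftrace_event_field as I recall it and should be treated as an assumption, not the actual injection code):

    /* Hypothetical helper: compute the entry size from the field list. */
    static int event_entry_size(struct trace_event_call *call)
    {
    	struct ftrace_event_field *field;
    	struct list_head *head = trace_get_fields(call);
    	int size = 0;

    	list_for_each_entry(field, head, link)
    		size = max(size, field->offset + field->size);

    	return size;
    }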
Cong Wang authored
filter_assign_type() could detect dynamic string and static string, but not string pointers. Teach filter_assign_type() to detect string pointers, and this will be needed by trace event injection code. BTW, trace event hist uses FILTER_PTR_STRING too. Link: http://lkml.kernel.org/r/20190525165802.25944-3-xiyou.wangcong@gmail.com Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Cong Wang authored
All callers of tracing_generic_entry_update() have to initialize entry->type, so let's just simply move it inside. Link: http://lkml.kernel.org/r/20190525165802.25944-2-xiyou.wangcong@gmail.com Cc: Ingo Molnar <mingo@redhat.com> Signed-off-by:
Cong Wang <xiyou.wangcong@gmail.com>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Check the 'registered' state only in trace_kprobe and remove TP_FLAG_REGISTERED from trace_probe, since this feature is only used by trace_kprobe.

Link: http://lkml.kernel.org/r/155931588704.28323.4952266828256245833.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Add trace_event_call access APIs for trace_probe. Instead of accessing trace_probe.call directly, use the trace_probe_event_call() method for those accesses. This hides the relationship between trace_event_call and trace_probe from trace_kprobe and trace_uprobe.

Link: http://lkml.kernel.org/r/155931587711.28323.8335129014686133120.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Add trace_probe_name() and trace_probe_group_name() functions for accessing probe name and group name of trace_probe. Link: http://lkml.kernel.org/r/155931586717.28323.8738615064952254761.stgit@devnote2 Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Add trace_probe_test/set/clear_flag() functions for accessing the trace_probe.flags field. This flags field should not be accessed directly.

Link: http://lkml.kernel.org/r/155931585683.28323.314290023236905988.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
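Presumably the accessors introduced above are thin wrappers over the flags word, along these lines (a sketch inferred from the changelog, not the exact definitions):

    static inline void trace_probe_set_flag(struct trace_probe *tp, unsigned int flag)
    {
    	tp->flags |= flag;
    }

    static inline void trace_probe_clear_flag(struct trace_probe *tp, unsigned int flag)
    {
    	tp->flags &= ~flag;
    }

    static inline bool trace_probe_test_flag(struct trace_probe *tp, unsigned int flag)
    {
    	return !!(tp->flags & flag);
    }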
Masami Hiramatsu authored
Add trace_event_file access APIs for the trace_probe data structure. This simplifies enabling/disabling operations in uprobe and kprobe events so that those don't touch deep inside the trace_probe. This also removes a redundant synchronization when the kprobe event is used from perf, since perf itself uses tracepoint_synchronize_unregister() after disabling the (ftrace-defined) event, so we don't have to synchronize in that path. We also no longer need to identify the local trace_kprobe.

Link: http://lkml.kernel.org/r/155931584587.28323.372301976283354629.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Since trace_event_call is a field of trace_probe, these operations should be done in trace_probe.c. trace_kprobe and trace_uprobe use new functions to register/unregister trace_event_call. Link: http://lkml.kernel.org/r/155931583643.28323.14828411185591538876.stgit@devnote2 Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Add common trace_probe init and cleanup function in trace_probe.c, and use it from trace_kprobe.c and trace_uprobe.c Link: http://lkml.kernel.org/r/155931582664.28323.5934870189034740822.stgit@devnote2 Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Set the event call's print format right after parsing the command, to simplify (un)register_uprobe_event().

Link: http://lkml.kernel.org/r/155931581659.28323.5404667166417404076.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Set the event call's print format right after parsing the command, to simplify (un)register_kprobe_event().

Link: http://lkml.kernel.org/r/155931580625.28323.5158822928646225903.stgit@devnote2
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org> -
Masami Hiramatsu authored
Since the arm64 kernel initializes the breakpoint trap vector in arch_initcall(), initializing kprobes (and running the smoke test) in postcore_initcall() causes a kernel panic. To fix this issue, move the kprobe initialization to subsys_initcall() (which is called right after arch_initcall). In-kernel kprobe users (ftrace and bpf) use fs_initcall(), which is called after subsys_initcall(), so this shouldn't cause further problems.

Link: http://lkml.kernel.org/r/155956708268.12228.10363800793132214198.stgit@devnote2
Link: http://lkml.kernel.org/r/20190709153755.GB10123@lakrids.cambridge.arm.com
Reported-by:
Anders Roxell <anders.roxell@linaro.org>
Fixes: b5f8b32c ("kprobes: Initialize kprobes at postcore_initcall")
Tested-by:
Anders Roxell <anders.roxell@linaro.org>
Tested-by:
Mark Rutland <mark.rutland@arm.com>
Signed-off-by:
Masami Hiramatsu <mhiramat@kernel.org>
Signed-off-by:
Steven Rostedt (VMware) <rostedt@goodmis.org>
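The fix above is essentially a change of initcall level, relying on the initcall ordering (core -> postcore -> arch -> subsys -> fs -> device -> late); a sketch:

    /* kernel/kprobes.c (sketch): run after arch_initcall(), which sets up the
     * breakpoint trap vector on arm64, but before the fs_initcall() users
     * (ftrace, bpf) that register kprobes. */
    subsys_initcall(init_kprobes);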
-
- Jul 15, 2019
-
-
Mauro Carvalho Chehab authored
Those files belong to the admin guide, so add them. Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> -
Mauro Carvalho Chehab authored
The stuff under sysctl describes /sys interface from userspace point of view. So, add it to the admin-guide and remove the :orphan: from its index file. Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> -
Mauro Carvalho Chehab authored
Rename the /proc/sys/ documentation files to ReST, using the README file as a template for an index.rst, adding the other files there via TOC markup. Despite being written on different times with different styles, try to make them somewhat coherent with a similar look and feel, ensuring that they'll look nice as both raw text file and as via the html output produced by the Sphinx build system. At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings. Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org> -
Mauro Carvalho Chehab authored
Convert the locking documents to ReST and add them to the kernel development book where they belong. Most of the work here is just to make Sphinx properly parse the text files, as they're already in good shape, not requiring massive changes in order to be parsed.

The conversion is actually:

- add blank lines and indentation in order to identify paragraphs;
- fix table markups;
- add some list markups;
- mark literal blocks;
- adjust title markups.

At its new index.rst, let's add a :orphan: while this is not linked to the main index.rst file, in order to avoid build warnings.

Signed-off-by:
Mauro Carvalho Chehab <mchehab+samsung@kernel.org>
Acked-by:
Federico Vaga <federico.vaga@vaga.pv.it>
-
- Jul 14, 2019
-
-
Dmitry V. Levin authored
The introduction of clone3 syscall accidentally broke CLONE_PIDFD support in traditional clone syscall on compat x86 and those architectures that use do_fork to implement clone syscall. This bug was found by strace test suite. Link: https://strace.io/logs/strace/2019-07-12 Fixes: 7f192e3c ("fork: add clone3") Bisected-and-tested-by:
Anatoly Pugachev <matorola@gmail.com>
Signed-off-by:
Dmitry V. Levin <ldv@altlinux.org>
Link: https://lore.kernel.org/r/20190714162047.GB10389@altlinux.org
Signed-off-by:
Christian Brauner <christian@brauner.io>
-
- Jul 13, 2019
-
-
Yuyang Du authored
The stats variable nr_unused_locks is incremented every time a new lock class is registered and decremented when the lock is first used in __lock_acquire(). And after all, it is shown and checked in lockdep_stats.

However, under configurations where either CONFIG_TRACE_IRQFLAGS or CONFIG_PROVE_LOCKING is not defined:

The commit:

  09180651 ("locking/lockdep: Consolidate lock usage bit initialization")

missed marking the LOCK_USED flag at IRQ usage initialization because mark_usage() is not called. And the commit:

  886532ae ("locking/lockdep: Move mark_lock() inside CONFIG_TRACE_IRQFLAGS && CONFIG_PROVE_LOCKING")

further made mark_lock() not defined, such that the LOCK_USED flag cannot be marked at all when the lock is first acquired.

As a result, fix this by not showing and checking the stats under such configurations for lockdep_stats.

Reported-by:
Qian Cai <cai@lca.pw>
Signed-off-by:
Yuyang Du <duyuyang@gmail.com>
Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Paul E. McKenney <paulmck@linux.vnet.ibm.com>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Will Deacon <will.deacon@arm.com>
Cc: arnd@arndb.de
Cc: frederic@kernel.org
Link: https://lkml.kernel.org/r/20190709101522.9117-1-duyuyang@gmail.com
Signed-off-by:
Ingo Molnar <mingo@kernel.org> -
Peter Zijlstra authored
John reported a DEBUG_PREEMPT warning caused by commit: aacedf26 ("sched/core: Optimize try_to_wake_up() for local wakeups") I overlooked that ttwu_stat() requires preemption disabled. Reported-by:
John Stultz <john.stultz@linaro.org>
Tested-by:
John Stultz <john.stultz@linaro.org>
Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Thomas Gleixner <tglx@linutronix.de>
Fixes: aacedf26 ("sched/core: Optimize try_to_wake_up() for local wakeups")
Link: https://lkml.kernel.org/r/20190710105736.GK3402@hirez.programming.kicks-ass.net
Signed-off-by:
Ingo Molnar <mingo@kernel.org> -
Alexander Shishkin authored
So far, we tried to disallow grouping exclusive events for the fear of complications they would cause with moving between contexts. Specifically, moving a software group to a hardware context would violate the exclusivity rules if both groups contain matching exclusive events.

This attempt was, however, unsuccessful: the check that we have in the perf_event_open() syscall is both wrong (looks at wrong PMU) and insufficient (group leader may still be exclusive), as can be illustrated by running:

  $ perf record -e '{intel_pt//,cycles}' uname
  $ perf record -e '{cycles,intel_pt//}' uname

ultimately successfully.

Furthermore, we are completely free to trigger the exclusivity violation by:

  perf -e '{cycles,intel_pt//}' -e '{intel_pt//,instructions}'

even though the helpful perf record will not allow that, the ABI will.

The warning later in the perf_event_open() path will also not trigger, because it's also wrong.

Fix all this by validating the original group before moving, getting rid of broken safeguards and placing a useful one to perf_install_in_context().

Signed-off-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com>
Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org>
Cc: <stable@vger.kernel.org>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Cc: mathieu.poirier@linaro.org
Cc: will.deacon@arm.com
Fixes: bed5b25a ("perf: Add a pmu capability for "exclusive" events")
Link: https://lkml.kernel.org/r/20190701110755.24646-1-alexander.shishkin@linux.intel.com
Signed-off-by:
Ingo Molnar <mingo@kernel.org> -
Peter Zijlstra authored
Syzcaller reported the following Use-after-Free bug:

    close()				clone()
					  copy_process()
					    perf_event_init_task()
					      perf_event_init_context()
					        mutex_lock(parent_ctx->mutex)
					        inherit_task_group()
					          inherit_group()
					            inherit_event()
					              mutex_lock(event->child_mutex)
					              // expose event on child list
					              list_add_tail()
					              mutex_unlock(event->child_mutex)
					        mutex_unlock(parent_ctx->mutex)

					        ...
					        goto bad_fork_*

					  bad_fork_cleanup_perf:
					    perf_event_free_task()

      perf_release()
        perf_event_release_kernel()
          list_for_each_entry()
            mutex_lock(ctx->mutex)
            mutex_lock(event->child_mutex)
            // event is from the failing inherit
            // on the other CPU
            perf_remove_from_context()
            list_move()
            mutex_unlock(event->child_mutex)
            mutex_unlock(ctx->mutex)

					    mutex_lock(ctx->mutex)
					    list_for_each_entry_safe()
					      // event already stolen
					    mutex_unlock(ctx->mutex)

					  delayed_free_task()
					    free_task()

          list_for_each_entry_safe()
            list_del()
            free_event()
              _free_event()
                // and so event->hw.target
                // is the already freed failed clone()
                if (event->hw.target)
                  put_task_struct(event->hw.target)
                    // WHOOPSIE, already quite dead

Which puts the lie to the comment on perf_event_free_task(): 'unexposed, unused context' not so much.

Which is a 'fun' confluence of fail; copy_process() doing an unconditional free_task() and not respecting refcounts, and perf having creative locking. In particular:

  82d94856 ("perf/core: Fix lock inversion between perf,trace,cpuhp")

seems to have overlooked this 'fun' parade.

Solve it by using the fact that detached events still have a reference count on their (previous) context. With this perf_event_free_task() can detect when events have escaped and wait for their destruction.

Debugged-by:
Alexander Shishkin <alexander.shishkin@linux.intel.com>
Reported-by:
<syzbot+a24c397a29ad22d86c98@syzkaller.appspotmail.com>
Signed-off-by:
Peter Zijlstra (Intel) <peterz@infradead.org>
Acked-by:
Mark Rutland <mark.rutland@arm.com>
Cc: <stable@vger.kernel.org>
Cc: Alexander Shishkin <alexander.shishkin@linux.intel.com>
Cc: Arnaldo Carvalho de Melo <acme@redhat.com>
Cc: Jiri Olsa <jolsa@redhat.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Stephane Eranian <eranian@google.com>
Cc: Thomas Gleixner <tglx@linutronix.de>
Cc: Vince Weaver <vincent.weaver@maine.edu>
Fixes: 82d94856 ("perf/core: Fix lock inversion between perf,trace,cpuhp")
Signed-off-by:
Ingo Molnar <mingo@kernel.org>
-
- Jul 12, 2019
-
-
Aneesh Kumar K.V authored
Architectures like powerpc use different address ranges to map ioremap and vmalloc ranges. The memunmap() check used by the nvdimm layer was wrongly using is_vmalloc_addr() to check for the ioremap range, which fails for ppc64. This results in ppc64 not freeing the ioremap mapping. The side effect of this is an unbind failure during module unload with the papr_scm nvdimm driver.

Link: http://lkml.kernel.org/r/20190701134038.14165-1-aneesh.kumar@linux.ibm.com
Signed-off-by:
Aneesh Kumar K.V <aneesh.kumar@linux.ibm.com>
Fixes: b5beae5e ("powerpc/pseries: Add driver for PAPR SCM regions")
Cc: Dan Williams <dan.j.williams@intel.com>
Cc: <stable@vger.kernel.org>
Signed-off-by:
Andrew Morton <akpm@linux-foundation.org>
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
-
- Jul 10, 2019
-
-
Arnd Bergmann authored
On 32-bit x86 when building with clang-9, the 'division' loop gets turned back into an inefficient division that causes a link error:

  kernel/time/vsyscall.o: In function `update_vsyscall':
  vsyscall.c:(.text+0xe3): undefined reference to `__udivdi3'

Use the existing __iter_div_u64_rem() function which is used to address the same issue in other places.

Fixes: 44f57d78 ("timekeeping: Provide a generic update_vsyscall() implementation")
Signed-off-by:
Arnd Bergmann <arnd@arndb.de>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Reviewed-by:
Nathan Chancellor <natechancellor@gmail.com>
Tested-by:
Nathan Chancellor <natechancellor@gmail.com>
Link: https://lkml.kernel.org/r/20190710130206.1670830-1-arnd@arndb.de
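For reference, __iter_div_u64_rem() from linux/math64.h avoids the 64-by-32 division entirely (it iterates, so it is meant for small quotients), which is why no libgcc helper such as __udivdi3 is emitted on 32-bit. Its use looks roughly like this sketch:

    #include <linux/math64.h>

    /* Split a 64-bit nanosecond count into seconds and leftover nanoseconds
     * without emitting a 64-bit division. */
    u64 rem;
    u32 secs = __iter_div_u64_rem(nsec, NSEC_PER_SEC, &rem);
    /* secs == nsec / NSEC_PER_SEC, rem == nsec % NSEC_PER_SEC */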
-
- Jul 09, 2019
-
-
Masahiro Yamada authored
Currently, kheaders_data.tar.xz contains some build scripts as well as headers. None of them is needed in the header archive.

For ARCH=x86, this commit excludes the following from the archive:

  arch/x86/include/asm/Kbuild
  arch/x86/include/uapi/asm/Kbuild
  include/asm-generic/Kbuild
  include/config/auto.conf
  include/config/kernel.release
  include/config/tristate.conf
  include/uapi/asm-generic/Kbuild
  include/uapi/Kbuild
  kernel/gen_kheaders.sh

This change is actually motivated by the planned header compile-testing, because it will generate more build artifacts, which should not be included in the archive.

Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by:
Joel Fernandes (Google) <joel@joelfernandes.org> -
Masahiro Yamada authored
The -R option of 'ls' is supposed to be used for directories:

  -R, --recursive
         list subdirectories recursively

Since 'find ... -type f' only matches regular files, we do not expect directories to be passed to the 'ls' command here. Giving -R is harmless at least, but unneeded.

Signed-off-by:
Masahiro Yamada <yamada.masahiro@socionext.com>
Reviewed-by:
Joel Fernandes (Google) <joel@joelfernandes.org>
-
- Jul 08, 2019
-
-
YueHaibing authored
If CONFIG_NET is not set and CONFIG_CGROUP_BPF=y, gcc building fails:

  kernel/bpf/cgroup.o: In function `cg_sockopt_func_proto':
  cgroup.c:(.text+0x237e): undefined reference to `bpf_sk_storage_get_proto'
  cgroup.c:(.text+0x2394): undefined reference to `bpf_sk_storage_delete_proto'
  kernel/bpf/cgroup.o: In function `__cgroup_bpf_run_filter_getsockopt':
  (.text+0x2a1f): undefined reference to `lock_sock_nested'
  (.text+0x2ca2): undefined reference to `release_sock'
  kernel/bpf/cgroup.o: In function `__cgroup_bpf_run_filter_setsockopt':
  (.text+0x3006): undefined reference to `lock_sock_nested'
  (.text+0x32bb): undefined reference to `release_sock'

Reported-by:
Hulk Robot <hulkci@huawei.com>
Suggested-by:
Stanislav Fomichev <sdf@fomichev.me>
Fixes: 0d01da6a ("bpf: implement getsockopt and setsockopt hooks")
Signed-off-by:
YueHaibing <yuehaibing@huawei.com>
Acked-by:
Yonghong Song <yhs@fb.com>
Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net>
-
- Jul 07, 2019
-
-
zhengbin authored
The user value is validated after converting the timeval to a timespec, but for a wide range of negative tv_usec values the multiplication overflow turns them into positive numbers. So the 'validated later' step does not catch the invalid input.

Signed-off-by:
zhengbin <zhengbin13@huawei.com>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562460701-113301-1-git-send-email-zhengbin13@huawei.com
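The kind of validation this implies, done before any conversion can overflow, would look something like this (an illustrative sketch, not the exact fix):

    if (tv.tv_usec < 0 || tv.tv_usec >= USEC_PER_SEC)
    	return -EINVAL;

    ts.tv_sec  = tv.tv_sec;
    ts.tv_nsec = tv.tv_usec * NSEC_PER_USEC;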
-
- Jul 06, 2019
-
-
Zenghui Yu authored
Fix typo in the comment on top of __irq_domain_add(). Signed-off-by:
Zenghui Yu <yuzenghui@huawei.com>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562388072-23492-1-git-send-email-yuzenghui@huawei.com -
Shijith Thotton authored
The NMI handlers handle_percpu_devid_fasteoi_nmi() and handle_fasteoi_nmi() do not update the interrupt counts. Due to that the NMI interrupt count does not show up correctly in /proc/interrupts. Add the statistics and treat the NMI handlers in the same way as per cpu interrupts and prevent them from updating irq_desc::tot_count as this might be corrupted due to concurrency. [ tglx: Massaged changelog ] Fixes: 2dcf1fbc ("genirq: Provide NMI handlers") Signed-off-by:
Shijith Thotton <sthotton@marvell.com>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Link: https://lkml.kernel.org/r/1562313336-11888-1-git-send-email-sthotton@marvell.com
-
- Jul 04, 2019
-
-
Jann Horn authored
Fix two issues:

When called for PTRACE_TRACEME, ptrace_link() would obtain an RCU reference to the parent's objective credentials, then give that pointer to get_cred(). However, the object lifetime rules for things like struct cred do not permit unconditionally turning an RCU reference into a stable reference.

PTRACE_TRACEME records the parent's credentials as if the parent was acting as the subject, but that's not the case. If a malicious unprivileged child uses PTRACE_TRACEME and the parent is privileged, and at a later point, the parent process becomes attacker-controlled (because it drops privileges and calls execve()), the attacker ends up with control over two processes with a privileged ptrace relationship, which can be abused to ptrace a suid binary and obtain root privileges.

Fix both of these by always recording the credentials of the process that is requesting the creation of the ptrace relationship: current_cred() can't change under us, and current is the proper subject for access control. This change is theoretically userspace-visible, but I am not aware of any code that it will actually break.

Fixes: 64b875f7 ("ptrace: Capture the ptracer's creds not PT_PTRACE_CAP")
Signed-off-by:
Jann Horn <jannh@google.com>
Acked-by:
Oleg Nesterov <oleg@redhat.com>
Cc: stable@vger.kernel.org
Signed-off-by:
Linus Torvalds <torvalds@linux-foundation.org>
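Conceptually, the fix described above makes ptrace_link() record the requester's credentials; a sketch of the idea (not the literal diff):

    static void ptrace_link(struct task_struct *child, struct task_struct *new_parent)
    {
    	/*
    	 * current is the task asking for the ptrace relationship:
    	 * the tracer for PTRACE_ATTACH, the child itself for
    	 * PTRACE_TRACEME.  current_cred() cannot change under us,
    	 * so it is the right subject for later access checks.
    	 */
    	__ptrace_link(child, new_parent, current_cred());
    }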
-
- Jul 03, 2019
-
-
Greg Kroah-Hartman authored
When calling debugfs functions, there is no need to ever check the return value. The function can work or not, but the code logic should never do something different based on this. Cc: Konrad Rzeszutek Wilk <konrad.wilk@oracle.com> Cc: Christoph Hellwig <hch@lst.de> Cc: Marek Szyprowski <m.szyprowski@samsung.com> Cc: Robin Murphy <robin.murphy@arm.com> Cc: iommu@lists.linux-foundation.org Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org>
Link: https://lore.kernel.org/r/20190612144314.GA16803@kroah.com
Signed-off-by:
Greg Kroah-Hartman <gregkh@linuxfoundation.org> -
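The pattern being discouraged above versus the preferred one, sketched with a made-up directory name:

    /* Unnecessary: acting on the debugfs return value. */
    struct dentry *d = debugfs_create_dir("example", NULL);	/* "example" is hypothetical */
    if (!d)
    	return -ENOMEM;		/* wrong: a debugfs failure should not change behaviour */

    /* Preferred: just make the call and carry on. */
    debugfs_create_dir("example", NULL);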
Alexei Starovoitov authored
When equivalent state is found the current state needs to propagate precision marks. Otherwise the verifier will prune the search incorrectly.

There is a price for correctness:

                        before  before  broken  fixed
                        cnst    spill   precise precise
  bpf_lb-DLB_L3.o         1923    8128    1863    1898
  bpf_lb-DLB_L4.o         3077    6707    2468    2666
  bpf_lb-DUNKNOWN.o       1062    1062     544     544
  bpf_lxc-DDROP_ALL.o   166729  380712   22629   36823
  bpf_lxc-DUNKNOWN.o    174607  440652   28805   45325
  bpf_netdev.o            8407   31904    6801    7002
  bpf_overlay.o           5420   23569    4754    4858
  bpf_lxc_jit.o          39389  359445   50925   69631

Overall precision tracking is still very effective.

Fixes: b5dc0163 ("bpf: precise scalar_value tracking")
Reported-by:
Lawrence Brakmo <brakmo@fb.com>
Signed-off-by:
Alexei Starovoitov <ast@kernel.org>
Acked-by:
Andrii Nakryiko <andriin@fb.com>
Tested-by:
Lawrence Brakmo <brakmo@fb.com>
Signed-off-by:
Daniel Borkmann <daniel@iogearbox.net> -
Thomas Gleixner authored
free_irq() ensures that no hardware interrupt handler is executing on a different CPU before actually releasing resources and deactivating the interrupt completely in a domain hierarchy.

But that does not catch the case where the interrupt is in flight at the hardware level but not yet serviced by the target CPU. That creates an interesting race condition:

           CPU 0              CPU 1                    IRQ CHIP

                                                   interrupt is raised
                                                   sent to CPU1
                            Unable to handle
                            immediately
                            (interrupts off,
                             deep idle delay)
   mask()
   ...
   free()
     shutdown()
     synchronize_irq()
     release_resources()
                            do_IRQ()
                              -> resources are not available

That might be harmless and just trigger a spurious interrupt warning, but some interrupt chips might get into a wedged state.

Utilize the existing irq_get_irqchip_state() callback for the synchronization in free_irq().

synchronize_hardirq() is not using this mechanism as it might actually deadlock under certain conditions, e.g. when called with interrupts disabled and the target CPU is the one on which the synchronization is invoked. synchronize_irq() uses it because that function cannot be called from non-preemptible contexts as it might sleep.

No functional change intended and according to Marc the existing GIC implementations where the driver supports the callback should be able to cope with that core change. Famous last words.

Fixes: 464d1230 ("x86/vector: Switch IOAPIC to global reservation mode")
Reported-by:
Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Reviewed-by:
Marc Zyngier <marc.zyngier@arm.com>
Tested-by:
Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.279463375@linutronix.de -
Thomas Gleixner authored
The function might sleep, so it cannot be called from interrupt context. Not even with care. Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Cc: Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.189241552@linutronix.de -
Thomas Gleixner authored
When interrupts are shut down, they are immediately deactivated in the irqdomain hierarchy. While this looks obviously correct, there is a subtle issue: there might be an interrupt in flight when free_irq() is invoking the shutdown. This is properly handled at the irq descriptor / primary handler level, but the deactivation might completely disable resources which are required to acknowledge the interrupt.

Split the shutdown code and deactivate the interrupt after synchronization in free_irq(). Fix up all other usage sites where this is not an issue to invoke the combined shutdown_and_deactivate() function instead.

This still might be an issue if servicing of the in-flight interrupt is delayed on a remote CPU beyond the invocation of synchronize_irq(), but that cannot be handled at that level and needs to be handled in the synchronize_irq() context.

Fixes: f8264e34 ("irqdomain: Introduce new interfaces to support hierarchy irqdomains")
Reported-by:
Robert Hodaszi <Robert.Hodaszi@digi.com>
Signed-off-by:
Thomas Gleixner <tglx@linutronix.de>
Reviewed-by:
Marc Zyngier <marc.zyngier@arm.com>
Link: https://lkml.kernel.org/r/20190628111440.098196390@linutronix.de
-