Commit graph

3113 commits

Author SHA1 Message Date
Ilya Leoshkevich 5502e5ca33 linux-user/s390x: Fix single-stepping SVC
Currently single-stepping SVC executes two instructions. The reason is
that EXCP_DEBUG for the SVC instruction itself is masked by EXCP_SVC.
Fix by re-raising EXCP_DEBUG.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Message-Id: <20230510230213.330134-2-iii@linux.ibm.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
(cherry picked from commit 01b9990a3fb84bb9a14017255ab1a4fa86588215)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-06-07 11:55:26 +03:00
Michael Tokarev 89bf901afb linux-user: fix getgroups/setgroups allocations
linux-user getgroups(), setgroups(), getgroups32() and setgroups32()
used alloca() to allocate grouplist arrays, with unchecked gidsetsize
coming from the "guest".  With NGROUPS_MAX being 65536 (linux, and it
is common for an application to allocate NGROUPS_MAX for getgroups()),
this means a typical allocation is half the megabyte on the stack.
Which just overflows stack, which leads to immediate SIGSEGV in actual
system getgroups() implementation.

An example of such issue is aptitude, eg
https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=811087#72

Cap gidsetsize to NGROUPS_MAX (return EINVAL if it is larger than that),
and use heap allocation for grouplist instead of alloca().  While at it,
fix coding style and make all 4 implementations identical.

Try to not impose random limits - for example, allow gidsetsize to be
negative for getgroups() - just do not allocate negative-sized grouplist
in this case but still do actual getgroups() call.  But do not allow
negative gidsetsize for setgroups() since its argument is unsigned.

Capping by NGROUPS_MAX seems a bit arbitrary, - we can do more, it is
not an error if set size will be NGROUPS_MAX+1. But we should not allow
integer overflow for the array being allocated. Maybe it is enough to
just call g_try_new() and return ENOMEM if it fails.

Maybe there's also no need to convert setgroups() since this one is
usually smaller and known beforehand (KERN_NGROUPS_MAX is actually 63, -
this is apparently a kernel-imposed limit for runtime group set).

The patch fixes aptitude segfault mentioned above.

Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
Message-Id: <20230409105327.1273372-1-mjt@msgid.tls.msk.ru>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 1e35d327890bdd117a67f79c52e637fb12bb1bf4)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 21:09:59 +03:00
Daniil Kovalev 95cb7a7255 linux-user: Fix mips fp64 executables loading
If a program requires fr1, we should set the FR bit of CP0 control status
register and add F64 hardware flag. The corresponding `else if` branch
statement is copied from the linux kernel sources (see `arch_check_elf` function
in linux/arch/mips/kernel/elf.c).

Signed-off-by: Daniil Kovalev <dkovalev@compiler-toolchain-for.me>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Message-Id: <20230404052153.16617-1-dkovalev@compiler-toolchain-for.me>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit a0f8d2701b205d9d7986aa555e0566b13dc18fa0)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-05-18 21:09:59 +03:00
Mathis Marion 73a11e3723 linux-user: fix timerfd read endianness conversion
When reading the expiration count from a timerfd, the endianness of the
64bit value read is the one of the host, just as for eventfds.

Signed-off-by: Mathis Marion <mathis.marion@silabs.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230220085822.626798-2-Mathis.Marion@silabs.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit d759a62b122dcdf76d6ea10c56c5dff1d04d731d)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-10 11:38:34 +03:00
Ilya Leoshkevich b6abbe6250 linux-user: Fix unaligned memory access in prlimit64 syscall
target_rlimit64 contains uint64_t fields, so it's 8-byte aligned on
some hosts, while some guests may align their respective type on a
4-byte boundary. This may lead to an unaligned access, which is an UB.

Fix by defining the fields as abi_ullong. This makes the host alignment
match that of the guest, and lets the compiler know that it should emit
code that can deal with the guest alignment.

While at it, also use __get_user() and __put_user() instead of
tswap64().

Fixes: 163a05a839 ("linux-user: Implement prlimit64 syscall")
Reported-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20230224003907.263914-2-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 9c1da8b5ee7f6e80e6b683e7fb73df1029a7cbbe)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-10 11:37:09 +03:00
Mathis Marion b57641e907 linux-user: fix sockaddr_in6 endianness
The sin6_scope_id field uses the host byte order, so there is a
conversion to be made when host and target endianness differ.

Signed-off-by: Mathis Marion <mathis.marion@silabs.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Philippe Mathieu-Daudé <philmd@linaro.org>
Message-Id: <20230307154256.101528-2-Mathis.Marion@silabs.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
(cherry picked from commit 44cf6731d6b9a48bcd57392e8cd6f0f712aaa677)
Signed-off-by: Michael Tokarev <mjt@tls.msk.ru>
2023-04-10 11:34:41 +03:00
Icenowy Zheng 16c81dd563 linux-user: always translate cmsg when recvmsg
It's possible that a message contains both normal payload and ancillary
data in the same message, and even if no ancillary data is available
this information should be passed to the target, otherwise the target
cmsghdr will be left uninitialized and the target is going to access
uninitialized memory if it expects cmsg.

Always call the function that translate cmsg when recvmsg, because that
function should be empty-cmsg-safe (it creates an empty cmsg in the
target).

Signed-off-by: Icenowy Zheng <uwu@icenowy.me>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20221028081220.1604244-1-uwu@icenowy.me>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-11-02 17:29:17 +01:00
Helge Deller 8b95210fcb linux-user: Add strace output for timer_settime64() syscall
Add missing timer_settime64() strace output and specify format for
timer_settime().

Signed-off-by: Helge Deller <deller@gmx.de>

Message-Id: <Y1b5eIXFoMRDcDL9@p100>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-11-02 17:21:06 +01:00
Helge Deller af804f39cc linux-user: Add close_range() syscall
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <Y1dLJoEDhJ2AAYDn@p100>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-11-02 17:17:07 +01:00
Helge Deller dcd86148e2 linux-user/hppa: Detect glibc ABORT_INSTRUCTION and EXCP_BREAK handler
The glibc on the hppa platform uses the "iitlbp %r0,(%sr0, %r0)"
assembler instruction as ABORT_INSTRUCTION.
If this (in userspace context) illegal assembler statement is found,
dump the registers and report the failure to userspace the same way as
the Linux kernel on physical hardware.

For other illegal instructions report TARGET_ILL_ILLOPC instead of
TARGET_ILL_ILLOPN as si_code.

Additionally add the missing EXCP_BREAK exception handler which occurs
when the "break x,y" assembler instruction is executed and report
EXCP_ASSIST traps.

Signed-off-by: Helge Deller <deller@gmx.de>

Message-Id: <Y1osHVsylkuZNUnY@p100>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-11-02 17:14:02 +01:00
Stefan Hajnoczi 08a5d04606 Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
 Speed up breakpoint check.
 Improve assertions for atomic.h.
 Move restore_state_to_opc to TCGCPUOps.
 Cleanups to TranslationBlock maintenance.
 -----BEGIN PGP SIGNATURE-----
 
 iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
 aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
 q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
 tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
 GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
 yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
 sOhkLQ==
 =DUb8
 -----END PGP SIGNATURE-----

Merge tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu into staging

Revert incorrect cflags initialization.
Add direct jumps for tcg/loongarch64.
Speed up breakpoint check.
Improve assertions for atomic.h.
Move restore_state_to_opc to TCGCPUOps.
Cleanups to TranslationBlock maintenance.

# -----BEGIN PGP SIGNATURE-----
#
# iQFRBAABCgA7FiEEekgeeIaLTbaoWgXAZN846K9+IV8FAmNYlo4dHHJpY2hhcmQu
# aGVuZGVyc29uQGxpbmFyby5vcmcACgkQZN846K9+IV9y2wf9EKsCA6VtYI2Qtftf
# q/ujYFmUf8AKTb9eVcA0XX71CT1dEnFR7GQyT8B8X13x0pSbOX7tbEWHPreegTFV
# tESiejvymi6Q9devAB58GVwNoU/zPIQQGhCPxkVUKDmRztJz22MbGUzd7UKPPgU8
# 2nVMkIpLTMBsKeFLxE/D3ZntmdKsgyI/1Dtkl9TxvlDGsCbMjbNcr8lM+TLaG2oX
# GZhFyJHKEVy0cobukvhhb/9rU7AWdG/BnFmZM16JxvHV/YCwJBx3Udhcy9xPePUU
# yIjkGsUAq4aB6H9RFuTWh7GmaY5u6gMbTTi2J7hDos0mzauYJtpgEB/H42LpycGE
# sOhkLQ==
# =DUb8
# -----END PGP SIGNATURE-----
# gpg: Signature made Tue 25 Oct 2022 22:08:14 EDT
# gpg:                using RSA key 7A481E78868B4DB6A85A05C064DF38E8AF7E215F
# gpg:                issuer "richard.henderson@linaro.org"
# gpg: Good signature from "Richard Henderson <richard.henderson@linaro.org>" [full]
# Primary key fingerprint: 7A48 1E78 868B 4DB6 A85A  05C0 64DF 38E8 AF7E 215F

* tag 'pull-tcg-20221026' of https://gitlab.com/rth7680/qemu: (47 commits)
  accel/tcg: Remove restore_state_to_opc function
  target/xtensa: Convert to tcg_ops restore_state_to_opc
  target/tricore: Convert to tcg_ops restore_state_to_opc
  target/sparc: Convert to tcg_ops restore_state_to_opc
  target/sh4: Convert to tcg_ops restore_state_to_opc
  target/s390x: Convert to tcg_ops restore_state_to_opc
  target/rx: Convert to tcg_ops restore_state_to_opc
  target/riscv: Convert to tcg_ops restore_state_to_opc
  target/ppc: Convert to tcg_ops restore_state_to_opc
  target/openrisc: Convert to tcg_ops restore_state_to_opc
  target/nios2: Convert to tcg_ops restore_state_to_opc
  target/mips: Convert to tcg_ops restore_state_to_opc
  target/microblaze: Convert to tcg_ops restore_state_to_opc
  target/m68k: Convert to tcg_ops restore_state_to_opc
  target/loongarch: Convert to tcg_ops restore_state_to_opc
  target/i386: Convert to tcg_ops restore_state_to_opc
  target/hppa: Convert to tcg_ops restore_state_to_opc
  target/hexagon: Convert to tcg_ops restore_state_to_opc
  target/cris: Convert to tcg_ops restore_state_to_opc
  target/avr: Convert to tcg_ops restore_state_to_opc
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-10-26 10:53:41 -04:00
Richard Henderson 8f39e01db9 accel/tcg: Call tb_invalidate_phys_page for PAGE_RESET
When PAGE_RESET is set, we are replacing pages with new
content, which means that we need to invalidate existing
cached data, such as TranslationBlocks.  Perform the
reset invalidate while we're doing other invalidates,
which allows us to remove the separate invalidates from
the user-only mmap/munmap/mprotect routines.

In addition, restrict invalidation to PAGE_EXEC pages.
Since cdf7130851, we have validated PAGE_EXEC is present
before translation, which means we can assume that if the
bit is not present, there are no translations to invalidate.

Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-26 11:11:28 +10:00
Helge Deller bd5ccd6108 linux-user: Add guest memory layout to exception dump
When the emulation stops with a hard exception it's very useful for
debugging purposes to dump the current guest memory layout (for an
example see /proc/self/maps) beside the CPU registers.

The open_self_maps() function provides such a memory dump, but since
it's located in the syscall.c file, various changes (add #includes, make
this function externally visible, ...) are needed to be able to call it
from the existing EXCP_DUMP() macro.

This patch takes another approach by re-defining EXCP_DUMP() to call
target_exception_dump(), which is in syscall.c, consolidates the log
print functions and allows to add the call to dump the memory layout.

Beside a reduced code footprint, this approach keeps the changes across
the various callers minimal, and keeps EXCP_DUMP() highlighted as
important macro/function.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <Y1bzAWbw07WBKPxw@p100>
[lv: remove pc declaration and setting]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-25 09:20:40 +02:00
WANG Xuerui 35a2c85f7d linux-user: Implement faccessat2
User space has been preferring this syscall for a while, due to its
closer match with C semantics, and newer platforms such as LoongArch
apparently have libc implementations that don't fallback to faccessat
so normal access checks are failing without the emulation in place.

Tested by successfully emerging several packages within a Gentoo loong
stage3 chroot, emulated on amd64 with help of static qemu-loongarch64.

Reported-by: Andreas K. Hüttel <dilfridge@gentoo.org>
Signed-off-by: WANG Xuerui <xen0n@gentoo.org>
Message-Id: <20221009060813.2289077-1-xen0n@gentoo.org>
[lv: removing defined(__NR_faccessat2) in syscall.c,
     adding defined(TARGET_NR_faccessat2) on print_faccessat()]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
Daniel P. Berrangé ed98cdecf8 linux-user: remove conditionals for many fs.h ioctls
These ioctls have been defined in linux/fs.h for a long time

  * BLKGETSIZE64 - <2.6.12 (linux.git epoch)
  * BLKDISCARD - 2.6.28 (d30a2605be9d5132d95944916e8f578fcfe4f976)
  * BLKIOMIN - 2.6.32 (ac481c20ef8f6c6f2be75d581863f40c43874ef7)
  * BLKIOOPT - 2.6.32 (ac481c20ef8f6c6f2be75d581863f40c43874ef7)
  * BLKALIGNOFF - 2.6.32 (ac481c20ef8f6c6f2be75d581863f40c43874ef7)
  * BLKPBSZGET - 2.6.32 (ac481c20ef8f6c6f2be75d581863f40c43874ef7)
  * BLKDISCARDZEROES - 2.6.32 (98262f2762f0067375f83824d81ea929e37e6bfe)
  * BLKSECDISCARD - 2.6.36 (8d57a98ccd0b4489003473979da8f5a1363ba7a3)
  * BLKROTATIONAL - 3.2 (ef00f59c95fe6e002e7c6e3663cdea65e253f4cc)
  * BLKZEROOUT - 3.6 (66ba32dc167202c3cf8c86806581a9393ec7f488)
  * FIBMAP - <2.6.12 (linux.git epoch)
  * FIGETBSZ - <2.6.12 (linux.git epoch)

and when building with latest glibc, we'll see compat definitions
in syscall.c anyway thanks to the previous patch. Thus we can
assume they always exist and remove the conditional checks.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20221004093206.652431-3-berrange@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
Daniel P. Berrangé c5495f4ecb linux-user: add more compat ioctl definitions
GLibc changes prevent us from including linux/fs.h anymore,
and we previously adjusted to this in

  commit 3cd3df2a95
  Author: Daniel P. Berrangé <berrange@redhat.com>
  Date:   Tue Aug 2 12:41:34 2022 -0400

    linux-user: fix compat with glibc >= 2.36 sys/mount.h

That change required adding compat ioctl definitions on the
QEMU side for any ioctls that we would otherwise obtain
from linux/fs.h.  This commit adds more that were initially
missed, due to their usage being conditionalized in QEMU.

Signed-off-by: Daniel P. Berrangé <berrange@redhat.com>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20221004093206.652431-2-berrange@redhat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
Laurent Vivier 00ed8a3459 linux-user: don't use AT_EXECFD in do_openat()
AT_EXECFD gives access to the binary file even if
it is not readable (only executable).

Moreover it can be opened with flags and mode that are not the ones
provided by do_openat() caller.

And it is not available because loader_exec() has closed it.

To avoid that, use only safe_openat() with the exec_path.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220927124357.688536-3-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
Laurent Vivier f07eb1c4f8 linux-user: handle /proc/self/exe with execve() syscall
If path is /proc/self/exe, use the executable path
provided by exec_path.

Don't use execfd as it is closed by loader_exec() and otherwise
will survive to the exec() syscall and be usable child process.

Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220927124357.688536-2-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
Laurent Vivier 46187d707e linux-user: fix pidfd_send_signal()
According to pidfd_send_signal(2), info argument can be a NULL pointer.
Fix strace to correctly manage ending comma in parameters.

Fixes: cc054c6f13 ("linux-user: Add pidfd_open(), pidfd_send_signal() and pidfd_getfd() syscalls")
cc: Helge Deller <deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
Reviewed-by: Helge Deller <deller@gmx.de>
Message-Id: <20221005163826.1455313-1-laurent@vivier.eu>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 17:46:19 +02:00
WANG Xuerui eeed22916b linux-user: Fix more MIPS n32 syscall ABI issues
In commit 80f0fe3a85 ("linux-user: Fix syscall parameter handling for
MIPS n32") the ABI problem regarding offset64 on MIPS n32 was fixed,
but still some cases remain where the n32 is incorrectly treated as any
other 32-bit ABI that passes 64-bit arguments in pairs of GPRs. Fix by
excluding TARGET_ABI_MIPSN32 from various TARGET_ABI_BITS == 32 checks.

Closes: https://gitlab.com/qemu-project/qemu/-/issues/1238
Signed-off-by: WANG Xuerui <xen0n@gentoo.org>
Cc: Philippe Mathieu-Daudé <f4bug@amsat.org>
Cc: Jiaxun Yang <jiaxun.yang@flygoat.com>
Cc: Andreas K. Hüttel <dilfridge@gentoo.org>
Cc: Joshua Kinard <kumba@gentoo.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Tested-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Tested-by: Andreas K. Huettel <dilfridge@gentoo.org>
Message-Id: <20221006085500.290341-1-xen0n@gentoo.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-10-21 16:37:36 +02:00
WANG Xuerui 7bf36a5c52
linux-user: Fix struct statfs ABI on loongarch64
Previously the 32-bit version was incorrectly chosen, leading to funny
but incorrect output from e.g. df(1). Simply select the version
corresponding to the 64-bit asm-generic definition.

For reference, this program should produce the same output no matter
natively compiled or not, for loongarch64 or not:

```c
#include <stdio.h>
#include <sys/statfs.h>

int main(int argc, const char *argv[])
{
  struct statfs b;
  if (statfs(argv[0], &b))
    return 1;

  printf("f_type = 0x%lx\n", b.f_type);
  printf("f_bsize = %ld\n", b.f_bsize);
  printf("f_blocks = %ld\n", b.f_blocks);
  printf("f_bfree = %ld\n", b.f_bfree);
  printf("f_bavail = %ld\n", b.f_bavail);

  return 0;
}

// Example output on my amd64 box, with the test binary residing on a
// btrfs partition.

// Native and emulated output after the fix:
//
// f_type = 0x9123683e
// f_bsize = 4096
// f_blocks = 268435456
// f_bfree = 168406890
// f_bavail = 168355058

// Output before the fix, note the messed layout:
//
// f_type = 0x10009123683e
// f_bsize = 723302085239504896
// f_blocks = 168355058
// f_bfree = 2250817541779750912
// f_bavail = 1099229433104
```

Fixes: 1f63019632 ("linux-user: Add LoongArch syscall support")
Signed-off-by: WANG Xuerui <xen0n@gentoo.org>
Cc: Song Gao <gaosong@loongson.cn>
Cc: Xiaojuan Yang <yangxiaojuan@loongson.cn>
Cc: Andreas K. Hüttel <dilfridge@gentoo.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Tested-by: Andreas K. Huettel <dilfridge@gentoo.org>
Message-Id: <20221006100710.427252-1-xen0n@gentoo.org>
Signed-off-by: Song Gao <gaosong@loongson.cn>
2022-10-17 10:28:35 +08:00
Paolo Bonzini 5d2456789a linux-user: i386/signal: support XSAVE/XRSTOR for signal frame fpstate
Add support for saving/restoring extended save states when signals
are delivered.  This allows using AVX, MPX or PKRU registers in
signal handlers.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 10:27:35 +02:00
Paolo Bonzini 2796f290b5 linux-user: i386/signal: support FXSAVE fpstate on 32-bit emulation
Linux can use FXSAVE to save/restore XMM registers even on 32-bit
systems.  This requires some care in order to keep the FXSAVE area
aligned to 16 bytes; for this reason, get_sigframe is changed to
pass the offset into the FXSAVE area rather than the full frame
size.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Paolo Bonzini 5154d35bed linux-user: i386/signal: move fpstate at the end of the 32-bit frames
Recent versions of Linux moved the 32-bit fpstate towards the end of the
frame, so that the variable-sized xsave data does not overwrite the
(ABI-defined) extramask[] field.  Follow suit in QEMU.

Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Paolo Bonzini <pbonzini@redhat.com>
2022-10-11 09:36:01 +02:00
Alex Bennée f7e15affa8 plugins: add [pre|post]fork helpers to linux-user
Special care needs to be taken in ensuring locks are in a consistent
state across fork events. Add helpers so the plugin system can ensure
that.

Signed-off-by: Alex Bennée <alex.bennee@linaro.org>
Fixes: https://gitlab.com/qemu-project/qemu/-/issues/358
Reviewed-by: Daniel P. Berrangé <berrange@redhat.com>
Tested-by: Daniel P. Berrangé <berrange@redhat.com>
Message-Id: <20221004115221.2174499-1-alex.bennee@linaro.org>
2022-10-06 11:53:41 +01:00
Richard Henderson ab419fd8a0 target/sh4: Fix TB_FLAG_UNALIGN
The value previously chosen overlaps GUSA_MASK.

Rename all DELAY_SLOT_* and GUSA_* defines to emphasize
that they are included in TB_FLAGs.  Add aliases for the
FPSCR and SR bits that are included in TB_FLAGS, so that
we don't accidentally reassign those bits.

Fixes: 4da06fb306 ("target/sh4: Implement prctl_unalign_sigbus")
Resolves: https://gitlab.com/qemu-project/qemu/-/issues/856
Reviewed-by: Yoshinori Sato <ysato@users.sourceforge.jp>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
2022-10-04 12:33:05 -07:00
Stefan Hajnoczi 36cd0aeac3 linux-user pull request 20220928-v2
use 'max' instead of 'qemu32' / 'qemu64'
 add  pidfd_open(), pidfd_send_signal() and pidfd_getfd()
 Improve madvise(MADV_DONTNEED)
 futex syscal rework
 strace improvement
 HP/PA fixes and improvement
 Misc fixes
 -----BEGIN PGP SIGNATURE-----
 
 iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM0riISHGxhdXJlbnRA
 dml2aWVyLmV1AAoJEPMMOL0/L748gH4P/2wesXJKPMY2zQzP3Rld4iyefoPGG/Yp
 mdq59BbjO2jQMR8GBss/nl9l84cIzzkYRQIogaKsjljtZYm/OO5xRefqrzJY6apD
 eidxv20dAVjuaXHAIdGhbFlxot1ctExbZs9atB4uj5DWxfYGD6e/stoBy5/pSmr4
 M5EbGHhyrRI7tRbHGtVQVvG6AT6XGE0pT9tzT5JLaApF8UPMkgJwmez16PNWvcMm
 v8GEvKm/vEVS8CCpzLV4kfwVeo3f54VAOrEBDi29ph2Yo50IA21k8BvoRZaSp+Kn
 G6TMnnly/DkMspAs5EOVfat+kv3TziNNdDH7EnVU1vV1yTDdZgW/1204Uy/JY0Pw
 WotwAFuO9FYeHKmjY0CfnIIZZHYZpDYUOZ8M6dESD/O0EjoB8LMf5p9cbYlze4DE
 csJZCsVcz19HDv6QZXi5mvvDcJ83B2IDb8/PUAzSc0n62lXL9qjYD0wdb0QsLdAT
 I25qLDge1HCmQfCIKcaoHYvE0pDmvkF6ftuQUXLtIwtaV0Z/N5wDf2PEHikjOYHM
 gD2izz23/2wQx6KP/9ZNnCJ5QEBkEgm5wpHncsvjzSzi1uIdNlHyzJJwGTAcc5qZ
 hOeoJ7dT0D6g0BGnvOdg2W/bDx18KW65mNDxE4d+W0uzn0YmQtArk2YsnhKQNO46
 12/0ltPFnSV/
 =DIzQ
 -----END PGP SIGNATURE-----

Merge tag 'linux-user-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu into staging

linux-user pull request 20220928-v2

use 'max' instead of 'qemu32' / 'qemu64'
add  pidfd_open(), pidfd_send_signal() and pidfd_getfd()
Improve madvise(MADV_DONTNEED)
futex syscal rework
strace improvement
HP/PA fixes and improvement
Misc fixes

# -----BEGIN PGP SIGNATURE-----
#
# iQJGBAABCAAwFiEEzS913cjjpNwuT1Fz8ww4vT8vvjwFAmM0riISHGxhdXJlbnRA
# dml2aWVyLmV1AAoJEPMMOL0/L748gH4P/2wesXJKPMY2zQzP3Rld4iyefoPGG/Yp
# mdq59BbjO2jQMR8GBss/nl9l84cIzzkYRQIogaKsjljtZYm/OO5xRefqrzJY6apD
# eidxv20dAVjuaXHAIdGhbFlxot1ctExbZs9atB4uj5DWxfYGD6e/stoBy5/pSmr4
# M5EbGHhyrRI7tRbHGtVQVvG6AT6XGE0pT9tzT5JLaApF8UPMkgJwmez16PNWvcMm
# v8GEvKm/vEVS8CCpzLV4kfwVeo3f54VAOrEBDi29ph2Yo50IA21k8BvoRZaSp+Kn
# G6TMnnly/DkMspAs5EOVfat+kv3TziNNdDH7EnVU1vV1yTDdZgW/1204Uy/JY0Pw
# WotwAFuO9FYeHKmjY0CfnIIZZHYZpDYUOZ8M6dESD/O0EjoB8LMf5p9cbYlze4DE
# csJZCsVcz19HDv6QZXi5mvvDcJ83B2IDb8/PUAzSc0n62lXL9qjYD0wdb0QsLdAT
# I25qLDge1HCmQfCIKcaoHYvE0pDmvkF6ftuQUXLtIwtaV0Z/N5wDf2PEHikjOYHM
# gD2izz23/2wQx6KP/9ZNnCJ5QEBkEgm5wpHncsvjzSzi1uIdNlHyzJJwGTAcc5qZ
# hOeoJ7dT0D6g0BGnvOdg2W/bDx18KW65mNDxE4d+W0uzn0YmQtArk2YsnhKQNO46
# 12/0ltPFnSV/
# =DIzQ
# -----END PGP SIGNATURE-----
# gpg: Signature made Wed 28 Sep 2022 16:27:14 EDT
# gpg:                using RSA key CD2F75DDC8E3A4DC2E4F5173F30C38BD3F2FBE3C
# gpg:                issuer "laurent@vivier.eu"
# gpg: Good signature from "Laurent Vivier <lvivier@redhat.com>" [full]
# gpg:                 aka "Laurent Vivier <laurent@vivier.eu>" [full]
# gpg:                 aka "Laurent Vivier (Red Hat) <lvivier@redhat.com>" [full]
# Primary key fingerprint: CD2F 75DD C8E3 A4DC 2E4F  5173 F30C 38BD 3F2F BE3C

* tag 'linux-user-for-7.2-pull-request' of https://gitlab.com/laurent_vivier/qemu: (37 commits)
  linux-user: Add parameters of getrandom() syscall for strace
  linux-user: Lock log around strace
  linux-user: Update print_futex_op
  linux-user: Implement PI futexes
  linux-user: Convert signal number for FUTEX_FD
  linux-user: Implement FUTEX_WAKE_BITSET
  linux-user: Sink call to do_safe_futex
  linux-user: Combine do_futex and do_futex_time64
  linux-user: Set ELF_BASE_PLATFORM for MIPS
  linux-user: Introduce stubs for ELF AT_BASE_PLATFORM
  linux-user/s390x: Save/restore fpc when handling a signal
  linux-user: Don't assume 0 is not a valid host timer_t value
  linux-user: fix bug about missing signum convert of sigqueue
  linux-user/hppa: Fix setup_sigcontext()
  linux-user/hppa: Allow PROT_GROWSUP and PROT_GROWSDOWN in mprotect()
  linux-user/hppa: Increase guest stack size to 80MB for hppa target
  linux-user/hppa: Drop stack guard page on hppa target
  linux-user/hppa: Add signal trampoline for hppa target
  linux-user: Add proper strace format strings for getdents()/getdents64()
  linux-user: Fix TARGET_PROT_SEM for XTENSA
  ...

Signed-off-by: Stefan Hajnoczi <stefanha@redhat.com>
2022-09-28 17:03:54 -04:00
Helge Deller 4a877b82f7 linux-user: Add parameters of getrandom() syscall for strace
Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220927093538.8954-2-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-28 22:24:42 +02:00
Richard Henderson c5a1c6b88c linux-user: Lock log around strace
Do not allow syscall arguments to be interleaved between threads.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-8-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson 53b578f31f linux-user: Update print_futex_op
Use a table for the names; print unknown values in hex,
since the value contains flags.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220829021006.67305-7-richard.henderson@linaro.org>
[lv: update print_futex() according to
"linux-user: Show timespec on strace for futex()"]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson c72a90df47 linux-user: Implement PI futexes
Define the missing FUTEX_* constants in syscall_defs.h

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-6-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson 0f94673112 linux-user: Convert signal number for FUTEX_FD
The val argument to FUTEX_FD is a signal number.  Convert to match
the host, as it will be converted back when the signal is delivered.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-5-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson a6180f8aed linux-user: Implement FUTEX_WAKE_BITSET
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-4-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson 57b9ccd4c0 linux-user: Sink call to do_safe_futex
Leave only the argument adjustments within the shift,
and sink the actual syscall to the end.  Sink the
timespec conversion as well, as there will be more users.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-3-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Richard Henderson 0fbc0f8da1 linux-user: Combine do_futex and do_futex_time64
Pass a boolean to select between time32 and time64.

Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220829021006.67305-2-richard.henderson@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Jiaxun Yang fbf47c18aa linux-user: Set ELF_BASE_PLATFORM for MIPS
Match most appropriate base platform string based on insn_flags.
Logic is aligned with aligned with set_isa() from
arch/mips/kernel/cpu-probe.c in Linux kernel.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220803103009.95972-3-jiaxun.yang@flygoat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Jiaxun Yang fcdc0ab4b4 linux-user: Introduce stubs for ELF AT_BASE_PLATFORM
AT_BASE_PLATFORM is a elf auxiliary vector pointing to a string
to pass some architecture information.
See getauxval(3) man-page.

Signed-off-by: Jiaxun Yang <jiaxun.yang@flygoat.com>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Message-Id: <20220803103009.95972-2-jiaxun.yang@flygoat.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Ilya Leoshkevich 2941e0fa05 linux-user/s390x: Save/restore fpc when handling a signal
Linux kernel does this in fpregs_store() and fpregs_load(), so
qemu-user should do this as well.

Found by running valgrind's none/tests/s390x/test_sig.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220817123902.585623-1-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Peter Maydell 9e59899f8c linux-user: Don't assume 0 is not a valid host timer_t value
For handling guest POSIX timers, we currently use an array
g_posix_timers[], whose entries are a host timer_t value, or 0 for
"this slot is unused".  When the guest calls the timer_create syscall
we look through the array for a slot containing 0, and use that for
the new timer.

This scheme assumes that host timer_t values can never be zero.  This
is unfortunately not a valid assumption -- for some host libc
versions, timer_t values are simply indexes starting at 0.  When
using this kind of host libc, the effect is that the first and second
timers end up sharing a slot, and so when the guest tries to operate
on the first timer it changes the second timer instead.

Rework the timer allocation code, so that:
 * the 'slot in use' indication uses a separate array from the
   host timer_t array
 * we grab the free slot atomically, to avoid races when multiple
   threads call timer_create simultaneously
 * releasing an allocated slot is abstracted out into a new
   free_host_timer_slot() function called in the correct places

This fixes:
 * problems on hosts where timer_t 0 is valid
 * the FIXME in next_free_host_timer() about locking
 * bugs in the error paths in timer_create where we forgot to release
   the slot we grabbed, or forgot to free the host timer

Reported-by: Jon Alduan <jon.alduan@gmail.com>
Signed-off-by: Peter Maydell <peter.maydell@linaro.org>
Message-Id: <20220725110035.1273441-1-peter.maydell@linaro.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
fanwenjie 9b9145f04d linux-user: fix bug about missing signum convert of sigqueue
Fixes: 66fb9763af ("basic signal handling")
Fixes: cf8b8bfc50 ("linux-user: add support for rt_tgsigqueueinfo() system call")
Signed-off-by: fanwenjie <fanwj@mail.ustc.edu.cn>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Helge Deller 2319a53758 linux-user/hppa: Fix setup_sigcontext()
We don't emulate a preemptive kernel on this level, and the hppa architecture
doesn't allow context switches on the gateway page. So we always have to return
to sc_iaoq[] and not to gr[31].
This fixes the remaining random segfaults which still occured.

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-8-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:19:05 +02:00
Helge Deller 4c184e70ad linux-user/hppa: Allow PROT_GROWSUP and PROT_GROWSDOWN in mprotect()
The hppa platform uses an upwards-growing stack and required in Linux
kernels < 5.18 an executable stack for signal processing.  For that some
executables and libraries are marked to have an executable stack, for
which glibc uses the mprotect() syscall to mark the stack like this:
 mprotect(xfa000000,4096,PROT_EXEC|PROT_READ|PROT_WRITE|PROT_GROWSUP).

Currently qemu will return -TARGET_EINVAL for this syscall because of the
checks in validate_prot_to_pageflags(), which doesn't allow the
PROT_GROWSUP or PROT_GROWSDOWN flags and thus triggers this error in the
guest:
 error while loading shared libraries: libc.so.6: cannot enable executable stack as shared object requires: Invalid argument

Allow mprotect() to handle both flags and thus fix the guest.
The glibc tst-execstack testcase can be used to reproduce the issue.

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-7-deller@gmx.de>
[lvivier: s/elif TARGET_HPPA/elif defined(TARGET_HPPA)/]
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 13:18:53 +02:00
Helge Deller 0a3346b593 linux-user/hppa: Increase guest stack size to 80MB for hppa target
The hppa target requires a much bigger stack than many other targets,
and the Linux kernel allocates 80 MB by default for it.

This patch increases the guest stack for hppa to 80MB, and prevents
that this default stack size gets reduced by a lower stack limit on the
host.

Since the stack grows upwards on hppa, the stack_limit value marks the
upper boundary of the stack. Fix the output of /proc/self/maps (in the
guest) to show the [stack] marker on the correct memory area.

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-6-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:33:56 +02:00
Helge Deller f43882052f linux-user/hppa: Drop stack guard page on hppa target
The stack-overflow check when building the "grep" debian package fails
on the debian hppa target. Reason is, that the guard page at the top
of the stack (which is added by qemu) prevents the fault handler in the
grep program to properly detect the stack overflow.

The Linux kernel on a physical machine doesn't install a guard page
either, so drop it and as such fix the build of "grep".

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-5-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:33:56 +02:00
Helge Deller 47393189ce linux-user/hppa: Add signal trampoline for hppa target
In Linux kernel v5.18 the vDSO for signal trampoline was added.
This code mimiks the bare minimum of this vDSO and thus avoids that the
parisc emulation needs executable stacks.

Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-4-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:33:19 +02:00
Helge Deller 785783bab1 linux-user: Add proper strace format strings for getdents()/getdents64()
Signed-off-by: Helge Deller <deller@gmx.de>
Message-Id: <20220924114501.21767-3-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:33:19 +02:00
Helge Deller d124853bd7 linux-user: Fix TARGET_PROT_SEM for XTENSA
The xtensa platform has a value of 0x10 for PROT_SEM.

Signed-off-by: Helge Deller <deller@gmx.de>
Reviewed-by: Laurent Vivier <laurent@vivier.eu>
Message-Id: <20220924114501.21767-2-deller@gmx.de>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:33:19 +02:00
Ilya Leoshkevich f93b76958a linux-user: Passthrough MADV_DONTNEED for certain file mappings
This is a follow-up for commit 892a4f6a75 ("linux-user: Add partial
support for MADV_DONTNEED"), which added passthrough for anonymous
mappings. File mappings can be handled in a similar manner.

In order to do that, mark pages, for which mmap() was passed through,
with PAGE_PASSTHROUGH, and then allow madvise() passthrough for these
pages. Drop the explicit PAGE_ANON check, since anonymous mappings are
expected to have PAGE_PASSTHROUGH anyway.

Add PAGE_PASSTHROUGH to PAGE_STICKY in order to keep it on mprotect().

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220725125043.43048-1-iii@linux.ibm.com>
Message-Id: <20220906000839.1672934-5-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:30:46 +02:00
Ilya Leoshkevich 375ce49be2 linux-user: Implement stracing madvise()
The default implementation has several problems: the first argument is
not displayed as a pointer, making it harder to grep; the third
argument is not symbolized; and there are several extra unused
arguments.

Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220906000839.1672934-4-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:30:44 +02:00
Ilya Leoshkevich 8655b4c709 linux-user: Fix madvise(MADV_DONTNEED) on alpha
MADV_DONTNEED has a different value on alpha, compared to all the other
architectures. Fix by using TARGET_MADV_DONTNEED instead of
MADV_DONTNEED.

Fixes: 892a4f6a75 ("linux-user: Add partial support for MADV_DONTNEED")
Signed-off-by: Ilya Leoshkevich <iii@linux.ibm.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20220906000839.1672934-3-iii@linux.ibm.com>
Signed-off-by: Laurent Vivier <laurent@vivier.eu>
2022-09-27 09:30:09 +02:00