This is the final entry in the main switch that was in a
different form. After this, we have the option to convert
the switch into a function dispatch table.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Split out a whole bunch of placeholder functions, which are
currently identical. That won't last as more code gets moved.
Use CASE_32_64_VEC for some logical operators that previously
missed the addition of vectors.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This puts the separate mb optimization into the same framework
as the others. While fold_qemu_{ld,st} are currently identical,
that won't last as more code gets moved.
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Rather than try to keep these up-to-date across folding,
re-read nb_oargs at the end, after re-reading the opcode.
A couple of asserts need dropping, but that will take care
of itself as we split the function further.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
There was no real reason for calls to have separate code here.
Unify init for calls vs non-calls using the call path, which
handles TCG_CALL_DUMMY_ARG.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Break the final cleanup clause out of the main switch
statement. When fully folding an opcode to mov/movi,
use "continue" to process the next opcode, else break
to fall into the final cleanup.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Luis Pires <luis.pires@eldorado.org.br>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Notice when the input is known to be zero-extended and force
the TCG_BSWAP_IZ flag on. Honor the TCG_BSWAP_OS bit during
constant folding. Propagate the input to the output mask.
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We're going to change how to look up the call flags from a TCGop,
so extract it as a helper.
Tested-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Because we now store uint64_t in TCGTemp, we can now always
store the full 64-bit duplicate immediate. So remove the
difference between 32- and 64-bit hosts.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Do not allocate a large block for indexing. Instead, allocate
for each temporary as they are seen.
In general, this will use less memory, if we consider that most
TBs do not touch every target register. This also allows us to
allocate TempOptInfo for new temps created during optimization.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
These will hold a single constant for the duration of the TB.
They are hashed, so that each value has one temp across the TB.
Not used yet, this is all infrastructure.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
This propagates the extended value of TCGTemp.val that we did before.
In addition, it will be required for vector constants.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
The temp_fixed, temp_global, temp_local bits are all related.
Combine them into a single enumeration.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
Enable this on i386 to restrict the set of input registers
for an 8-bit store, as required by the architecture. This
removes the last use of scratch registers for user-only mode.
Reviewed-by: Philippe Mathieu-Daudé <f4bug@amsat.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
To be able to compile this file with -Werror=implicit-fallthrough,
we need to add some fallthrough annotations to the case statements
that might fall through. Unfortunately, the typical "/* fallthrough */"
comments do not work here as expected since some case labels are
wrapped in macros and the compiler fails to match the comments in
this case. But using __attribute__((fallthrough)) seems to work fine,
so let's use that instead (by introducing a new QEMU_FALLTHROUGH
macro in our compiler.h header file).
Signed-off-by: Thomas Huth <thuth@redhat.com>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Message-Id: <20201211152426.350966-11-thuth@redhat.com>
Signed-off-by: Thomas Huth <thuth@redhat.com>
This reverts commit cd0372c515.
The patch is incorrect in that it retains copies between globals and
non-local temps, and non-local temps still die at the end of the BB.
Failing test case for hppa:
.globl _start
_start:
cmpiclr,= 0x24,%r19,%r0
cmpiclr,<> 0x2f,%r19,%r19
---- 00010057 0001005b
movi_i32 tmp0,$0x24
sub_i32 tmp1,tmp0,r19
mov_i32 tmp2,tmp0
mov_i32 tmp3,r19
movi_i32 tmp1,$0x0
---- 0001005b 0001005f
brcond_i32 tmp2,tmp3,eq,$L1
movi_i32 tmp0,$0x2f
sub_i32 tmp1,tmp0,r19
mov_i32 tmp2,tmp0
mov_i32 tmp3,r19
movi_i32 tmp1,$0x0
mov_i32 r19,tmp1
setcond_i32 psw_n,tmp2,tmp3,ne
set_label $L1
In this case, both copies of "mov_i32 tmp3,r19" are removed. The
second because opt thought it was redundant. The first is removed
later by liveness because tmp3 is known to be dead. This leaves
the setcond_i32 with an uninitialized input.
Revert the entire patch for 5.2, and a proper optimization across
the branch may be considered for the next development cycle.
Reported-by: qemu@igor2.repo.hu
Reviewed-by: Peter Maydell <peter.maydell@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We can easily propagate temp values through the entire extended
basic block (in this case, the set of blocks connected by fallthru),
simply by not discarding the register state at the branch.
Reviewed-by: Alex Bennée <alex.bennee@linaro.org>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
When the two arguments are identical, this can be reduced to
dup_vec or to mov_vec from a tcg_constant_vec.
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>
We currently search both the root and the tcg/ directories for tcg
files:
$ git grep '#include "tcg/' | wc -l
28
$ git grep '#include "tcg[^/]' | wc -l
94
To simplify the preprocessor search path, unify by expliciting the
tcg/ directory.
Patch created mechanically by running:
$ for x in \
tcg.h tcg-mo.h tcg-op.h tcg-opc.h \
tcg-op-gvec.h tcg-gvec-desc.h; do \
sed -i "s,#include \"$x\",#include \"tcg/$x\"," \
$(git grep -l "#include \"$x\""); \
done
Acked-by: David Gibson <david@gibson.dropbear.id.au> (ppc parts)
Reviewed-by: Paolo Bonzini <pbonzini@redhat.com>
Reviewed-by: Alistair Francis <alistair.francis@wdc.com>
Reviewed-by: Stefan Weil <sw@weilnetz.de>
Reviewed-by: Richard Henderson <richard.henderson@linaro.org>
Signed-off-by: Philippe Mathieu-Daudé <philmd@redhat.com>
Message-Id: <20200101112303.20724-2-philmd@redhat.com>
Signed-off-by: Richard Henderson <richard.henderson@linaro.org>