Commit Graph

56 Commits

Author SHA1 Message Date
Pokechu22
78bfd25964 Fix all uninitialized variable warnings (C26495) 2021-10-13 12:32:16 -07:00
Pierre Bourdon
e149ad4f0a treewide: convert GPLv2+ license info to SPDX tags
SPDX standardizes how source code conveys its copyright and licensing
information. See https://spdx.github.io/spdx-spec/1-rationale/ . SPDX
tags are adopted in many large projects, including things like the Linux
kernel.
2021-07-05 04:35:56 +02:00
JosJuice
2a9d88739c JitArm64: Skip accurate single/double conversion if store-safe 2021-04-25 15:56:58 +02:00
JosJuice
b3b5016f54 Jits: Fix interpreter fallback handling of discarded registers
When the interpreter writes to a discarded register, its type
must be changed so that it is no longer considered discarded.

Fixes a 62ce1c7 regression.
2021-04-25 13:01:40 +02:00
JosJuice
62ce1c7653 Jits: Discard registers which we know will be overwritten
This commit adds a new "discarded" state for registers.
Discarding a register is like flushing it, but without
actually writing its value back to memory. We can discard
a register only when it is guaranteed that no instruction
will read from the register before it is next written to.

Discarding reduces the register pressure a little, and can
also let us skip a few flushes on interpreter fallbacks.
2021-03-24 20:48:44 +01:00
JosJuice
1845c5948d PPCAnalyst: Rework the store-safe logic
The output of instructions like fabsx and ps_sel is store-safe
if and only if the relevant inputs are. The old code was always
marking the output as store-safe if the output was a single,
and never otherwise.

Also, the old code was treating the output of psq_l/psq_lu as
store-safe, which seems incorrect (if dequantization is disabled).
2021-03-24 12:02:09 +01:00
Lioncash
7c12081693 PowerPC/PPCAnalyst: Remove unimplemented LogFunctionCall prototype
This doesn't have an implementation, so it can be removed.
2019-12-15 00:23:32 -05:00
Techjar
ff972e3673 Reformat repo to clang-format 7.0 rules 2019-05-06 18:48:04 +00:00
degasus
55db7c7a05 Jit64: Optimized idle skipping detection. 2019-04-20 20:52:39 +02:00
Lioncash
f4ec419929 SymbolDB: Namespace code under the Common namespace
Moves more common code into the Common namespace where it belongs.
2018-05-27 18:01:40 -04:00
Lioncash
3a8a67025e PPCAnalyst: Make CodeBuffer an alias for std::vector<CodeOp>
This class effectively acted as a "discount vector", that would simply
allocate memory and then delete it in the destructor when it goes out of
scope.

We can just use a std::vector directly to reduce this boilerplate.
2018-05-18 17:19:45 -04:00
Sepalani
7d36165489 PPCSymbolDB: Do not truncate fixed size symbols
Fix comparison warning
2018-04-10 21:50:33 +04:00
Lioncash
f2b2f5b4c7 PPCAnalyst: Make ReorderType an enum class
Makes the values strongly typed and doesn't dump them into the class
itself.
2018-04-08 21:38:19 -04:00
Lioncash
5e5a56bd9b PPCAnalyst: Remove unnecessary includes from header 2018-04-08 21:34:12 -04:00
Lioncash
4bd3b28823 PPCAnalyst: in-class initialize PPCAnalyzer's members
Eliminates the need to assign in the constructor initializer list.
2018-04-08 21:32:15 -04:00
Lioncash
433a56636b PPCAnalyst: Move public interface above private interface 2018-04-08 21:31:19 -04:00
Sepalani
7f552581e7 CodeView: Set Symbol Size added 2017-05-06 13:18:00 +01:00
Lioncash
aaa6430db6 PPCAnalyst: Make SetInstructionStats' opinfo parameter a const pointer
trivial const-correctness stuff
2017-02-18 04:14:26 -05:00
degasus
ca10cf5afe PPCAnalyst: Update comments 2017-01-28 03:03:04 +01:00
degasus
f31b25fe39 Jit64: Enable branch following. 2017-01-28 02:48:56 +01:00
degasus
70caf447b9 JitCache: Get physical addresses from PPCAnalyst.
So we support all kind of degenerated blocks now, not just range+length based ones.
2017-01-23 20:33:44 +01:00
hthh
789975e350 Jit: FIFO optimization improvements
This introduces speculative constants, allowing FIFO writes to be
optimized in more places.

It also clarifies the guarantees of the FIFO optimization, changing
the location of some of the checks and potentially avoiding redundant
checks.
2016-08-15 20:09:52 +10:00
Pierre Bourdon
3570c7f03a Reformat all the things. Have fun with merge conflicts. 2016-06-24 10:43:46 +02:00
Lioncash
6f3598eee4 PPCAnalyst: Remove extra whitespace from CodeBuffer 2015-06-04 19:13:29 -04:00
Lioncash
293c3bae6f PPCAnalyst: Mark some functions as const
Also removes the redundant inline specifier, as functions defined in a class/struct definition are inline by default.
2015-06-04 19:11:38 -04:00
Tillmann Karras
30ebb2459e Set copyright year to when a file was created 2015-05-25 13:22:31 +02:00
Tillmann Karras
cefcb0ace9 Update license headers to GPLv2+ 2015-05-25 13:22:31 +02:00
Fiora
e8cfcd3aeb JIT: make instruction merging generic
Now it should be easier to merge more than 2-instruction-long sequences.
Also correct some minor inconsistencies in behavior between instruction
merging cases.
2015-01-11 09:11:18 -08:00
Fiora
8237004448 JIT: optimize for the common case of unquantized psq_l/st
Optimistically assume used GQRs are 0 in blocks that only use one GQR, and
bail at the start of the block and recompile if that assumption fails.

Many games use almost entirely unquantized stores (e.g. Rebel Strike, Sonic
Colors), so this will likely be a big performance improvement across the board
for games with heavy use of paired singles.
2015-01-10 14:14:43 -08:00
Fiora
72c96c20d3 JIT: more optimizing of float ops based on known input characteristics
If the inputs are both float singles, and the top half is known to be identical
to the bottom half, we can use packed arithmetic instead of scalar to skip
the movddup.

This is slower on a few rather old CPUs, plus the Atom+Silvermont, so detect
Atom and disable it in that case.

Also avoid PPC_FP on stores if we know that the output came from a float op.
2014-11-29 11:33:11 -08:00
Fiora
7df50b0710 JIT: skip weird fmul rounding if the input is known to be single precision 2014-11-29 11:30:51 -08:00
Fiora
97fba41860 JIT: merge fcmpx and cror
Almost all uses of boolean condition-register ops in real code seem to be
the combination fcmpx + cror (e.g. for <= or >=). This merges the two.
2014-10-29 00:30:27 -07:00
comex
b6a7438053 Add BitSet and, as a test, convert some JitRegCache stuff to it.
This is a higher level, more concise wrapper for bitsets which supports
efficiently counting and iterating over set bits.  It's similar to
std::bitset, but the latter does not support efficient iteration (and at
least in libc++, the count algorithm is subpar, not that it really
matters).  The converted uses include both bitsets and, notably,
considerably less efficient regular arrays (for in/out registers in
PPCAnalyst).

Unfortunately, this may slightly pessimize unoptimized builds.
2014-10-25 16:56:51 -04:00
skidau
e50dad67ce Merge pull request #1252 from FioraAeterna/regallocator
JIT: add basic register allocation heuristics
2014-10-14 12:45:48 +11:00
Fiora
7ba9a8537b JIT: add basic register allocation heuristics
Should be at least a bit better than the previous LRU approach. Currently
has two basic components: whether a register is dirty (dirty registers need
to be stored, so clobbering them hurts more) and how many other registers will
be used between now and the next time a register gets used.

Also don't pre-load values that don't need to be in registers.
2014-10-09 20:09:14 -07:00
Lioncash
bb377d0fc3 PPCAnalyst: Remove unnecessary casts 2014-10-09 20:19:01 -04:00
Fiora
8fe730194b JIT: load registers if they're going to be used later in the block 2014-10-03 11:58:04 -07:00
Fiora
f103234e2b JIT: flush a register if it won't be used for the rest of the block
This should dramatically reduce code size in the case of blocks with
lots of branches, and certainly doesn't hurt elsewhere either.

This can probably be improved a good bit through smarter tracking of register
usage, e.g. discarding registers that are going to be overwritten, but this
is a good start and should help reduce code size and register pressure.
Unlike that sort of change, this is a "safe" patch; it only flushes registers,
which can't affect correctness, unlike actually discarding data.

As part of this, refactor PPCAnalyst to support distinguishing between
float and integer registers (to properly handle instructions that access
both, like floating-point loads and stores).

Also update every instruction in the interpreter flags table I could find
that didn't have all the correct flags.
2014-09-22 16:00:25 -07:00
Fiora
08ac10d00a PPCAnalyst/JIT: add ability to easily toggle branch and carry merging 2014-09-13 13:48:24 -07:00
Fiora
54129a8ca5 PPCAnalyst: refactor, add carry op reordering and non-cmp reordering
Tries as hard as possible to push carry-using operations (like addc and adde)
next to each other. Refactor the instruction reordering to be more flexible
and allow multiple passes.

353 -> 192 x86 instructions on a carry-heavy code block in Pokemon Puzzle.
12% faster overall in Pokemon Puzzle; probably less in typical games (Virtual
Console games seem to be carry-heavy for some reason; maybe a different
compiler?)
2014-09-13 13:48:23 -07:00
Fiora
45d84605a9 JIT64: optimize carry calculations further
Keep carry flags in the x86 flags register if used in the next instruction.
2014-09-13 13:48:20 -07:00
Fiora
bea2504a51 JIT64: optimize carry calculations
Omit carry calculations that get overwritten later in the block before they're
used. Very common in the case of srawix and friends.
2014-09-13 13:47:43 -07:00
Ryan Houdek
71cb09f1ca Merge pull request #1027 from rohit-n/change-include
Include CommonTypes.h instead of Common.h.
2014-09-10 00:35:16 -05:00
Rohit Nirmal
fbc64984ca Include CommonTypes.h instead of Common.h. 2014-09-08 15:39:58 -04:00
Ryan Houdek
2b06257e16 Beginning of the AArch64 JIT branch.
This is the bare minimum required to run a few games on AArch64.
Was able to run starfield and Animal Crossing to the Nintendo logo.
QEmu emulation is literally the slowest thing in the world, it maxes out at around 12mhz on my Core i7-4930MX.
2014-09-06 20:14:52 -05:00
Fiora
07e0c917c6 Revert "JIT64: optimize CA calculations" 2014-09-05 10:26:30 -07:00
Fiora
3aa40dab00 JIT64: optimize carry calculations
Omit carry calculations that get overwritten later in the block before they're
used. Very common in the case of srawix and friends.
2014-09-01 20:41:48 -07:00
Lioncash
eb535be874 Core: Clean up brace placements 2014-08-30 18:06:49 -04:00
Fiora
7dbc623dc0 JIT: Initial FPRF support
Doesn't support all the FPSCR flags, just the FPRF ones.
Add PPCAnalyzer support to remove unnecessary FPRF calculations.

POV-ray benchmark with enableFPRF forced on for an extreme comparison:
Before: 1500s
After, fmul/fmadd only: 728s
After, all float: 753s

In real games that use FPRF, like F-Zero GX, FPRF previously cost a few percent
of total runtime.

Since FPRF is so much faster now, if enableFPRF is set, just do it for every
float instruction, not just fmul/fmadd like before. I don't know if this will
fix any games, but there's little good reason not to.
2014-08-26 10:57:03 -07:00
Ryan Houdek
cdec575bef Fixes games that use the MMU to page in code(Rogue Leader).
The issue was that on memory exception we wouldn't call in to PPCAnalyst and our code_block would retain the previous blocks information.
This would cause us to compile the previous blocks instructions in prior to the exception exit.
2014-05-09 09:10:45 -05:00