I tried making the new templated Interpret callback test only the relevant exceptions (EXCEPTION_DSI, EXCEPTION_PROGRAM, or both), but didn't find a significant performance boost in it. As I am learning, the biggest bottleneck is the number of callbacks emitted, not usually the actual contents of them.
WritePC is now needed far less, only for instructions that end the block. Unfortunately, WritePC still needs to update `PowerPCState::npc` to support the false path of conditional branch instructions. Both drawbacks should be smoothed over by optimized cached instructions in the future.
Before:
1. In theory there could be multiple, but in practice they were (manually) cleared before creating one
2. (Some of) the conditions to clear one were either to reach it, to create a new one (due to the point above), or to step. This created weird behavior: let's say you Step Over a `bl` (thus creating a temporary breakpoint on `pc+4`), and you reached a regular breakpoint inside the `bl`. The temporary one would still be there: if you resumed, the emulation would still stop there, as a sort of Step Out. But, if before resuming, you made a Step, then it wouldn't do that.
3. The breakpoint widget had no idea concept of them, and will treat them as regular breakpoints. Also, they'll be shown only when the widget is updated in some other way, leading to more confusion.
4. Because only one breakpoint could exist per address, the creation of a temporary breakpoint on a top of a regular one would delete it and inherit its properties (e.g. being log-only). This could happen, for instance, if you Stepped Over a `bl` specifically, and pc+4 had a regular breakpoint.
Now there can only be one temporary breakpoint, which is automatically cleared whenever emulation is paused. So, removing some manual clearing from 1., and removing the weird behavior of 2. As it is stored in a separate variable, it won't be seen at all depending on the function used (fixing 3., and removing some checks in other places), and it won't replace a regular breakpoint, instead simply having priority (fixing 4.).
Change misleading names.
Fix function usage: Intepreter and Step Out will not check breakpoints in their own wrong way anymore (e.g. breaking on log-only breakpoints).
When the immediate value is zero, we can do a negation. On ARM64 the NEG
/NEGS instructions are just an alias for SUB/SUBS with a hardcoded WZR.
Before:
```
ldr w22, [x29, #0x28]
mov w21, #0x0 ; =0
subs w22, w21, w22
```
After:
```
ldr w22, [x29, #0x28]
negs w22, w22
```
Using shifts and bit tests makes the code unnecessarily annoying to
reason about. I'm replacing it with subtracting from 3 to translate the
bit order from the PowerPC format to the usual format.
BI contains both the field and the flag (5 bits total), so we need to
shift away the 2 flag bits to get the 3 field bits. (Same as the
CRBA/CRBB handling in the code just below the BI code.)
NetBSD doesn't put packages in /usr/local like /CMakeLists.txt thought.
The `#ifdef __NetBSD__` around iconv was actually breaking compilation
on NetBSD when using the system libiconv (there's also a GNU iconv
package)
A C program included from C++ source broke on NetBSD specifically, work
around it.
This doesn't fix compilation on NetBSD, which is currently broken, but
is closer to correct.
This is a JitArm64 version of 219610d8a0.
Due to limitations on how far you can jump with a single AArch64 branch
instruction, going above the former limit of 128 MiB of code (counting
nearcode and farcode combined) requires a bit of restructuring. With the
restructuring in place, the limit now is 256 MiB. See the new large
comment in Jit.h for a description of the new memory layout.