
A high-level core?

blueshogun96

A lowdown dirty shame
Hey, a few days ago, I had this idea for an HLE interpreter. Ok, let's say that I'm emulating an x86 CPU. Would it be possible to do this?

Code:
switch( dwOpcode )
{
    ...
    case 0x38354fae: // Let's say... mov eax, ebx
        __asm mov eax, ebx
        break;
    ...
}

It's just an idea that I haven't gotten around to trying for myself, but has anyone ever done anything similar?
 

Doomulation

?????????????????????????
Yeah, but that would mean writing a case for thousands of different opcodes. You'd need to extract the opcode first and not the data... unless that isn't what you're doing, because that sure doesn't seem like an instruction.
 

ShizZy

Emulator Developer
Well, that really isn't HLE, that's an interpreter written with assembly (which is faster than a C interpreter, if done right). The purpose of HLE is to override CPU instructions with fast, exact C code. You don't execute per opcode; instead you detect a function from the game's engine or SDK, and you replace the CPU instructions with an emulator function that does the equivalent. And thus you can override the need for real CPU and hardware emulation, at the expense of compatibility. Some stuff you might HLE are OS calls (OSEnableInterrupts, OSDisableInterrupts, OSSelectThread), or graphics vector math, etc.
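A minimal sketch of that detect-and-replace idea, assuming a PPC-ish guest; the CpuState layout, the handler names, and the entry-point addresses in the table are all made up for illustration. When the emulated PC lands on the entry point of a known SDK routine, the interpreter is skipped and a native C++ equivalent runs instead:

Code:
#include <cstdint>
#include <unordered_map>

// Hypothetical emulated CPU state; field names are illustrative only.
struct CpuState {
    uint32_t pc;        // emulated program counter
    uint32_t lr;        // emulated return-address register (PPC keeps this in LR)
    uint32_t gpr[32];   // emulated general-purpose registers
};

// Native replacements for guest library routines; the bodies would just poke
// the emulated state. The addresses below are invented for the example.
void HLE_OSDisableInterrupts(CpuState&) { /* flip an emulated IRQ flag, set return value... */ }
void HLE_OSEnableInterrupts (CpuState&) { /* ... */ }

using HleHandler = void (*)(CpuState&);

// Map from guest entry-point address to native handler.
static const std::unordered_map<uint32_t, HleHandler> g_hleTable = {
    { 0x80001234u, HLE_OSDisableInterrupts },
    { 0x80001298u, HLE_OSEnableInterrupts  },
};

// Called before interpreting the instruction at cpu.pc.
bool TryHle(CpuState& cpu)
{
    auto it = g_hleTable.find(cpu.pc);
    if (it == g_hleTable.end())
        return false;      // not a known routine: fall back to normal interpretation
    it->second(cpu);       // run the fast native version instead
    cpu.pc = cpu.lr;       // pretend the routine returned to its caller
    return true;
}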
 

Danny

Programmer | Moderator
ShizZy said:
Well, that really isn't HLE, that's an interpreter written with assembly (which is faster than a C interpreter, if done right). The purpose of HLE is to override CPU instructions with fast, exact C code. You don't execute per opcode; instead you detect a function from the game's engine or SDK, and you replace the CPU instructions with an emulator function that does the equivalent. And thus you can override the need for real CPU and hardware emulation, at the expense of compatibility. Some stuff you might HLE are OS calls (OSEnableInterrupts, OSDisableInterrupts, OSSelectThread), or graphics vector math, etc.

ShizZy you're making my head spin! :p

You guys certainly know your stuff on HLE!
 

smcd

Active member
Doing operations such as changing the actual eax/ebx of the machine currently running the interpreter will mess up the state/execution of the emulator...
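In other words, the guest registers should live in the emulator's own data, never in the host's real eax/ebx. A minimal sketch of that separation (the struct layout is illustrative):

Code:
#include <cstdint>

// Emulated (guest) x86 register file - entirely separate from the host CPU's registers.
struct GuestRegs {
    uint32_t eax, ebx, ecx, edx;
    uint32_t esp, ebp, esi, edi;
    uint32_t eip, eflags;
};

// "mov eax, ebx" touches only the emulated state, never the host's eax/ebx.
inline void Op_MovEaxEbx(GuestRegs& r)
{
    r.eax = r.ebx;
}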
 

civilian0746

evil god
You shouldn't be writing cores in C or other high-level languages, because you have to do lots of preprocessing to identify the instructions and operands. In addition, the way suggested in the first post is like... having a case statement for every combination of instruction and operand values.
 

Doomulation

?????????????????????????
Oh? So you suggest doing a core in assembly, then?
It doesn't matter which language you use, you still need to identify the opcode. Unless it IS the system being emulated, of course.
 

Exophase

Emulator Developer
I think part of what he's suggesting is having a switch case for every possible instruction, to avoid decoding the operands or other opcode fields. For platforms with fixed-width 16-bit instructions this is possible (like SH, Thumb, etc.). For platforms with 32-bit instructions it isn't feasible; there are too many possible cases. For platforms with variable-length instructions (x86 falls under this, as per your example) it's not possible to switch on the instruction directly, because you have to fetch in multiple steps. You could fetch up to the largest size, pad the remainder with something that produces otherwise-invalid opcodes (they'd probably have to be), and switch on that, but that's only even possibly feasible for instructions that vary between 1 and 2 or 3 bytes, like on the 6502. It's really a trade-off between speed and space, and very large switch sets will thrash the cache (and possibly RAM at that) pretty quickly.
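A hedged sketch of the fixed-width variant, using a 16-bit Thumb-sized instruction word and a lookup table built once rather than a literal 65,536-case switch; the handler names and the example encoding are invented:

Code:
#include <cstdint>

using Handler = void (*)(uint16_t opcode);

void Op_Undefined(uint16_t) { /* raise an emulated undefined-instruction exception */ }
void Op_MovRegReg(uint16_t) { /* operands are implied by which table slot this is  */ }

// One entry per possible 16-bit instruction word: 256 KiB of pointers on a
// 32-bit host (512 KiB on a 64-bit one) - the space/speed and cache-thrashing
// trade-off mentioned above.
static Handler g_dispatch[0x10000];

void BuildDispatchTable()
{
    for (unsigned op = 0; op < 0x10000; ++op)
        g_dispatch[op] = Op_Undefined;

    // Fill in the real encodings; every operand combination gets its own slot,
    // so handlers never decode fields at run time.
    // g_dispatch[0x1C08] = Op_MovRegReg;   // invented encoding, for illustration
}

inline void Step(uint16_t opcode)
{
    g_dispatch[opcode](opcode);
}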
 
OP
blueshogun96

blueshogun96

A lowdown dirty shame
Ok, so I'm gonna scratch that idea. :p

Anyway, forgive me if I see this incorrectly, but are you all saying that it's not a good idea to do an interpreter on an x86 CPU? This is my first time emulating a RISC CPU btw; so used to the CISC stuff. Also, to do it the normal way, should I set up my union this way? I'll be honest, out of all the years I've been programming (started with BASIC in 1989), the one thing I never learned how to use in C was unions, but I know everything else lol.

Code:
typedef union
{
  DWORD d;
  WORD w;

  struct
  {
    unsigned char h,l;
  }b;
}iPIIIReg;
 
Last edited:

ShizZy

Emulator Developer
The union looks fine, but keep in mind a word is 32 bits (well, on most systems), and 2 bytes is only 16. So for that to work, you'd have to map 4 bytes in that union, or nest two 2-byte fields inside it. As for a RISC interpreter on x86 - why not? What else would you write it on, a Mac? ;)
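For an x86-style register in particular, one common layout is a 32-bit value with 16-bit and 8-bit views, à la EAX/AX/AH/AL. A sketch, assuming a little-endian host and using fixed-width types so the WORD/DWORD size question doesn't matter:

Code:
#include <stdint.h>

// One 32-bit GPR with smaller views, mirroring EAX / AX / AH / AL.
// Assumes a little-endian host; on a big-endian host the two byte
// fields would need to be swapped.
typedef union
{
    uint32_t e;        // EAX - full 32 bits
    uint16_t w;        // AX  - low 16 bits
    struct
    {
        uint8_t l;     // AL - lowest byte
        uint8_t h;     // AH - second byte
    } b;
} iPIIIReg;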

Here's a crash course in RISC architecture emulation (using PPC as an example):
Unlike CISC, where opcodes are of variable size and can be identified simply by a one- or two-byte field, RISC as you know uses fixed-length opcodes, usually 32-bit. Take a pseudo PPC opcode:

--------------32 Bits----------------------------------------------
[Bits 0-5][Bits 6-10][Bits 11-15][Bits 16-20][Bits 21-30][Bit 31]

As you know, it's divided up into sections (PPC is big-endian, btw). In this example, the first 6 bits, 0-5, are the opcode identifier, which has a possible range from 0 to 63. Each identifier specifies an instruction or, in some cases, a branch to other instructions. So, a simple way to decode it would be:
Code:
switch((opcode >> 26) & 0x3F)
{
case 0:  Opcode_NOP(opcode); break;
case 1:  Opcode_OR(opcode);  break;
...
case 31: Opcode_31(opcode);  break;
}
In this case, the value 31 has multiple opcodes under it, so you'd call a function with another switch, and this time check the opcode extension. On PPC, this extension is bits 21-30; it's probably different on the CPU you're emulating. So, just do switch((opcode >> 1) & 0x3FF).
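A minimal sketch of that second-level decode; the extended-opcode values here are placeholders, so check them against the PPC manual:

Code:
// Primary opcode 31 covers many instructions; the extended opcode sits in
// bits 21-30 (PPC bit numbering), i.e. (opcode >> 1) & 0x3FF.
void Opcode_31(u32 opcode)
{
    switch((opcode >> 1) & 0x3FF)
    {
    case 0x10A: Opcode_ADD(opcode);     break;   // placeholder extended opcode
    case 0x1BC: Opcode_OR(opcode);      break;   // placeholder extended opcode
    default:    Opcode_Invalid(opcode); break;
    }
}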

Tada... instruction decoding on RISC, no strings attached. It's really very easy to do an interpreter like this. Now, take the other sections. Bits 6-10, 11-15, and 16-20 are register fields in this example; each field tells you which GPR the instruction manipulates. In this example, the first would be REGD, the second REGA, and the third REGB. Each is 5 bits, and thus can have a value from 0-31, which points to one of the PPC's 32 general-purpose registers, respectively. Here's a pseudo op:

In your cpu manual, it might say something like this:

Opcode: MUL
regDest = regSrcA * regSrcB

Now, in C++:
Code:
void Opcode_MUL(u32 opcode)
{
    // regDest (bits 6-10) = regSrcA (bits 11-15) * regSrcB (bits 16-20)
    reg.gpr[(opcode>>21)&0x1F] = reg.gpr[(opcode>>16)&0x1F] * reg.gpr[(opcode>>11)&0x1F];
}

A good tip is to use macros for all that shifting, i.e. #define REG_D ((opcode>>21)&0x1F), so you are less prone to making mistakes. As for bit 31, it could be anything depending on the CPU; on PPC, though, it's a flag called Rc that tells the CPU to update a field of the CR register.
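With macros for each field, the MUL handler above reads almost like the manual's pseudo-op (REG_A and REG_B are named here by analogy with REG_D):

Code:
// Field-extraction macros for the example encoding above.
#define REG_D ((opcode >> 21) & 0x1F)
#define REG_A ((opcode >> 16) & 0x1F)
#define REG_B ((opcode >> 11) & 0x1F)

void Opcode_MUL(u32 opcode)
{
    // regDest = regSrcA * regSrcB, straight from the manual
    reg.gpr[REG_D] = reg.gpr[REG_A] * reg.gpr[REG_B];
}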

There you have it. As for most of the other parts, it acts pretty much in the same nature as a CISC processor, to an extent. Hope that helps a little :)
 
Last edited:
OP
blueshogun96

blueshogun96

A lowdown dirty shame
Hmmmm, interesting :)

You have an interesting way of emulating RISC CPUs. Your advice is greatly appreciated. I guess the RISC CPU stuff sounds more complicated than it really is, lol. I'm assuming you know I'm emulating a Pentium III.
 

ShizZy

Emulator Developer
Pentium 3? Why would you emulate it on an x86 system when you can already run native files? What happened to PSP?

Anyways, if that's your plan, why not start with a much easier x86 processor, like the 80386? Once you get that running, and trust me it'll be a bit easier, modifying it up to P3 standards won't be hard.
 

Doomulation

?????????????????????????
ShizZy said:
The union looks fine, but keep in mind a word is 32 bits (well, on most systems), and 2 bytes is only 16.
Isn't a WORD supposed to be 2 bytes and not 4? Does this apply to architectures other than the PC? A WORD is a short int if I'm not mistaken, and an int is a long int, aka 4 bytes.
 

bcrew1375

New member
Doomulation said:
Isn't a WORD supposed to be 2 bytes and not 4? Does this apply to architectures other than the PC? A WORD is a short int if I'm not mistaken, and an int is a long int, aka 4 bytes.

I thought the same thing, but apparently a WORD is now associated with 32 bits.
 

smcd

Active member
Check it out and see for yourself :p

Code:
#include <limits.h>
#include <stdio.h>
#include <windows.h>

int main(void)
{
    /* cast to int so the value matches the %d format specifier */
    printf("There are %d bits in a WORD under this Windows OS.\r\n",
           (int)(sizeof(WORD) * CHAR_BIT));
    return 0;
}

I get 16 as the result under Windows XP Pro, 32-bit edition.
 
Last edited:

Doomulation

?????????????????????????
Indeed, and this is what I get:

Code:
// Test2.cpp : Defines the entry point for the console application.
//

#include "stdafx.h"
#include <windows.h>
#include <iostream>
using namespace std;

int _tmain(int argc, _TCHAR* argv[])
{
	WORD test;
	cout << sizeof(test) << endl;
	return 0;
}

Output: 2
 

Cyberman

Moderator
Emulation suggestions

1) Think VIRTUALIZATION. Remember your emulator should not change the state of your CURRENT machine. If the program has a nasty bug... your emulator will suddenly have a nasty bug. In other words, don't let someone else's bad programming kill the machine.
2) It is always best to start with an interpreter. Try a few things as well. I recommend using an opcode matrix or the CPU developer's opcode encoding information. Much of this you can find on the manufacturer's web site: PPC (Freescale or IBM), ARM7/9 and various other instruction mixes (ARM Thumb, ARM9T, etc.) you might be able to get from ARM (though they seem to be tight-arsed about information... paranoid lot they are).
3) HLE - High Level Emulation... what you are discussing isn't HLE, it's translation/recompilation. :)
4) WORD... what, is Microsoft now screwing with the language? LOL

WORD does not inherently mean 16, 32, or 64 bits... none of the above. If you want to be precise (which I tend to be weird about), the original meaning comes from the DEC PDP-11: it meant 16 bits, since the byte (8 bits) was already in existence along with ye olde nybble (4 bits). To further complicate things, "word" was also used for the size of the opcode, i.e. PDP-11s were 16-bit machines (not 8-bit). So when 32-bit machines came along, they called those 32-bit words, or double words.

People who use WORD for 32 bits should in all cases say "double word" or "32-bit". However, taking into account just how damned lazy people really are... we now have 64-bit machines, and in a few years WORD may refer to 64 bits. Just how dumb are we human beings, really? Hmmmmm. :D I recommend sticking with WORD referring to 16 bits. Every time someone says "word" when discussing computers, ask "is that 16, 32, or 64 bits?" Because people are lazy, they tend to do what's easiest for them, not what is correct.

And now back to that emulation thing ;)

Cyb
 

ShizZy

Emulator Developer
Doom/Bcrew/Seth: Depends on the architecture :p but yeah, a word IS 16 bits on x86, but 32 on PPC (and many other CPUs). Nonetheless, you shouldn't really use that terminology in an emulator, simply because the target CPU and the host CPU are usually different.

Edit: what cyb said too :p
 
Last edited:

bcrew1375

New member
Yeah, I prefer to use the number of bits when I'm talking about things like that. It's too easy to misunderstand. Instead of saying word, long, int, etc, why not just say the number of bits? Come on, it can't be that hard :p.
 
