mirror of
				https://github.com/superseriousbusiness/gotosocial.git
				synced 2025-11-04 01:12:24 -06:00 
			
		
		
		
	
		
			
				
	
	
		
			244 lines
		
	
	
	
		
			9.6 KiB
		
	
	
	
		
			Go
		
	
	
	
	
	
			
		
		
	
	
			244 lines
		
	
	
	
		
			9.6 KiB
		
	
	
	
		
			Go
		
	
	
	
	
	
// Copyright 2019 The Go Authors. All rights reserved.
 | 
						|
// Use of this source code is governed by a BSD-style
 | 
						|
// license that can be found in the LICENSE file.
 | 
						|
 | 
						|
/*
 | 
						|
Package ppc64 implements a PPC64 assembler that assembles Go asm into
 | 
						|
the corresponding PPC64 instructions as defined by the Power ISA 3.0B.
 | 
						|
 | 
						|
This document provides information on how to write code in Go assembler
 | 
						|
for PPC64, focusing on the differences between Go and PPC64 assembly language.
 | 
						|
It assumes some knowledge of PPC64 assembler. The original implementation of
 | 
						|
PPC64 in Go defined many opcodes that are different from PPC64 opcodes, but
 | 
						|
updates to the Go assembly language used mnemonics that are mostly similar if not
 | 
						|
identical to the PPC64 mneumonics, such as VMX and VSX instructions. Not all detail
 | 
						|
is included here; refer to the Power ISA document if interested in more detail.
 | 
						|
 | 
						|
Starting with Go 1.15 the Go objdump supports the -gnu option, which provides a
 | 
						|
side by side view of the Go assembler and the PPC64 assembler output. This is
 | 
						|
extremely helpful in determining what final PPC64 assembly is generated from the
 | 
						|
corresponding Go assembly.
 | 
						|
 | 
						|
In the examples below, the Go assembly is on the left, PPC64 assembly on the right.
 | 
						|
 | 
						|
1. Operand ordering
 | 
						|
 | 
						|
  In Go asm, the last operand (right) is the target operand, but with PPC64 asm,
 | 
						|
  the first operand (left) is the target. The order of the remaining operands is
 | 
						|
  not consistent: in general opcodes with 3 operands that perform math or logical
 | 
						|
  operations have their operands in reverse order. Opcodes for vector instructions
 | 
						|
  and those with more than 3 operands usually have operands in the same order except
 | 
						|
  for the target operand, which is first in PPC64 asm and last in Go asm.
 | 
						|
 | 
						|
  Example:
 | 
						|
    ADD R3, R4, R5		<=>	add r5, r4, r3
 | 
						|
 | 
						|
2. Constant operands
 | 
						|
 | 
						|
  In Go asm, an operand that starts with '$' indicates a constant value. If the
 | 
						|
  instruction using the constant has an immediate version of the opcode, then an
 | 
						|
  immediate value is used with the opcode if possible.
 | 
						|
 | 
						|
  Example:
 | 
						|
    ADD $1, R3, R4		<=> 	addi r4, r3, 1
 | 
						|
 | 
						|
3. Opcodes setting condition codes
 | 
						|
 | 
						|
  In PPC64 asm, some instructions other than compares have variations that can set
 | 
						|
  the condition code where meaningful. This is indicated by adding '.' to the end
 | 
						|
  of the PPC64 instruction. In Go asm, these instructions have 'CC' at the end of
 | 
						|
  the opcode. The possible settings of the condition code depend on the instruction.
 | 
						|
  CR0 is the default for fixed-point instructions; CR1 for floating point; CR6 for
 | 
						|
  vector instructions.
 | 
						|
 | 
						|
  Example:
 | 
						|
    ANDCC R3, R4, R5		<=>	and. r5, r3, r4 (set CR0)
 | 
						|
 | 
						|
4. Loads and stores from memory
 | 
						|
 | 
						|
  In Go asm, opcodes starting with 'MOV' indicate a load or store. When the target
 | 
						|
  is a memory reference, then it is a store; when the target is a register and the
 | 
						|
  source is a memory reference, then it is a load.
 | 
						|
 | 
						|
  MOV{B,H,W,D} variations identify the size as byte, halfword, word, doubleword.
 | 
						|
 | 
						|
  Adding 'Z' to the opcode for a load indicates zero extend; if omitted it is sign extend.
 | 
						|
  Adding 'U' to a load or store indicates an update of the base register with the offset.
 | 
						|
  Adding 'BR' to an opcode indicates byte-reversed load or store, or the order opposite
 | 
						|
  of the expected endian order. If 'BR' is used then zero extend is assumed.
 | 
						|
 | 
						|
  Memory references n(Ra) indicate the address in Ra + n. When used with an update form
 | 
						|
  of an opcode, the value in Ra is incremented by n.
 | 
						|
 | 
						|
  Memory references (Ra+Rb) or (Ra)(Rb) indicate the address Ra + Rb, used by indexed
 | 
						|
  loads or stores. Both forms are accepted. When used with an update then the base register
 | 
						|
  is updated by the value in the index register.
 | 
						|
 | 
						|
  Examples:
 | 
						|
    MOVD (R3), R4		<=>	ld r4,0(r3)
 | 
						|
    MOVW (R3), R4		<=>	lwa r4,0(r3)
 | 
						|
    MOVWZU 4(R3), R4		<=>	lwzu r4,4(r3)
 | 
						|
    MOVWZ (R3+R5), R4		<=>	lwzx r4,r3,r5
 | 
						|
    MOVHZ  (R3), R4		<=>	lhz r4,0(r3)
 | 
						|
    MOVHU 2(R3), R4		<=>	lhau r4,2(r3)
 | 
						|
    MOVBZ (R3), R4		<=>	lbz r4,0(r3)
 | 
						|
 | 
						|
    MOVD R4,(R3)		<=>	std r4,0(r3)
 | 
						|
    MOVW R4,(R3)		<=>	stw r4,0(r3)
 | 
						|
    MOVW R4,(R3+R5)		<=>	stwx r4,r3,r5
 | 
						|
    MOVWU R4,4(R3)		<=>	stwu r4,4(r3)
 | 
						|
    MOVH R4,2(R3)		<=>	sth r4,2(r3)
 | 
						|
    MOVBU R4,(R3)(R5)		<=>	stbux r4,r3,r5
 | 
						|
 | 
						|
4. Compares
 | 
						|
 | 
						|
  When an instruction does a compare or other operation that might
 | 
						|
  result in a condition code, then the resulting condition is set
 | 
						|
  in a field of the condition register. The condition register consists
 | 
						|
  of 8 4-bit fields named CR0 - CR7. When a compare instruction
 | 
						|
  identifies a CR then the resulting condition is set in that field
 | 
						|
  to be read by a later branch or isel instruction. Within these fields,
 | 
						|
  bits are set to indicate less than, greater than, or equal conditions.
 | 
						|
 | 
						|
  Once an instruction sets a condition, then a subsequent branch, isel or
 | 
						|
  other instruction can read the condition field and operate based on the
 | 
						|
  bit settings.
 | 
						|
 | 
						|
  Examples:
 | 
						|
    CMP R3, R4			<=>	cmp r3, r4	(CR0 assumed)
 | 
						|
    CMP R3, R4, CR1		<=>	cmp cr1, r3, r4
 | 
						|
 | 
						|
  Note that the condition register is the target operand of compare opcodes, so
 | 
						|
  the remaining operands are in the same order for Go asm and PPC64 asm.
 | 
						|
  When CR0 is used then it is implicit and does not need to be specified.
 | 
						|
 | 
						|
5. Branches
 | 
						|
 | 
						|
  Many branches are represented as a form of the BC instruction. There are
 | 
						|
  other extended opcodes to make it easier to see what type of branch is being
 | 
						|
  used.
 | 
						|
 | 
						|
  The following is a brief description of the BC instruction and its commonly
 | 
						|
  used operands.
 | 
						|
 | 
						|
  BC op1, op2, op3
 | 
						|
 | 
						|
    op1: type of branch
 | 
						|
        16 -> bctr (branch on ctr)
 | 
						|
        12 -> bcr  (branch if cr bit is set)
 | 
						|
        8  -> bcr+bctr (branch on ctr and cr values)
 | 
						|
	4  -> bcr != 0 (branch if specified cr bit is not set)
 | 
						|
 | 
						|
	There are more combinations but these are the most common.
 | 
						|
 | 
						|
    op2: condition register field and condition bit
 | 
						|
 | 
						|
	This contains an immediate value indicating which condition field
 | 
						|
	to read and what bits to test. Each field is 4 bits long with CR0
 | 
						|
        at bit 0, CR1 at bit 4, etc. The value is computed as 4*CR+condition
 | 
						|
        with these condition values:
 | 
						|
 | 
						|
        0 -> LT
 | 
						|
        1 -> GT
 | 
						|
        2 -> EQ
 | 
						|
        3 -> OVG
 | 
						|
 | 
						|
	Thus 0 means test CR0 for LT, 5 means CR1 for GT, 30 means CR7 for EQ.
 | 
						|
 | 
						|
    op3: branch target
 | 
						|
 | 
						|
  Examples:
 | 
						|
 | 
						|
    BC 12, 0, target		<=>	blt cr0, target
 | 
						|
    BC 12, 2, target		<=>	beq cr0, target
 | 
						|
    BC 12, 5, target		<=>	bgt cr1, target
 | 
						|
    BC 12, 30, target		<=>	beq cr7, target
 | 
						|
    BC 4, 6, target		<=>	bne cr1, target
 | 
						|
    BC 4, 1, target		<=>	ble cr1, target
 | 
						|
 | 
						|
    The following extended opcodes are available for ease of use and readability:
 | 
						|
 | 
						|
    BNE CR2, target		<=>	bne cr2, target
 | 
						|
    BEQ CR4, target		<=>	beq cr4, target
 | 
						|
    BLT target			<=>	blt target (cr0 default)
 | 
						|
    BGE CR7, target		<=>	bge cr7, target
 | 
						|
 | 
						|
  Refer to the ISA for more information on additional values for the BC instruction,
 | 
						|
  how to handle OVG information, and much more.
 | 
						|
 | 
						|
5. Align directive
 | 
						|
 | 
						|
  Starting with Go 1.12, Go asm supports the PCALIGN directive, which indicates
 | 
						|
  that the next instruction should be aligned to the specified value. Currently
 | 
						|
  8 and 16 are the only supported values, and a maximum of 2 NOPs will be added
 | 
						|
  to align the code. That means in the case where the code is aligned to 4 but
 | 
						|
  PCALIGN $16 is at that location, the code will only be aligned to 8 to avoid
 | 
						|
  adding 3 NOPs.
 | 
						|
 | 
						|
  The purpose of this directive is to improve performance for cases like loops
 | 
						|
  where better alignment (8 or 16 instead of 4) might be helpful. This directive
 | 
						|
  exists in PPC64 assembler and is frequently used by PPC64 assembler writers.
 | 
						|
 | 
						|
  PCALIGN $16
 | 
						|
  PCALIGN $8
 | 
						|
 | 
						|
  Functions in Go are aligned to 16 bytes, as is the case in all other compilers
 | 
						|
  for PPC64.
 | 
						|
 | 
						|
6. Shift instructions
 | 
						|
 | 
						|
  The simple scalar shifts on PPC64 expect a shift count that fits in 5 bits for
 | 
						|
  32-bit values or 6 bit for 64-bit values. If the shift count is a constant value
 | 
						|
  greater than the max then the assembler sets it to the max for that size (31 for
 | 
						|
  32 bit values, 63 for 64 bit values). If the shift count is in a register, then
 | 
						|
  only the low 5 or 6 bits of the register will be used as the shift count. The
 | 
						|
  Go compiler will add appropriate code to compare the shift value to achieve the
 | 
						|
  the correct result, and the assembler does not add extra checking.
 | 
						|
 | 
						|
  Examples:
 | 
						|
 | 
						|
    SRAD $8,R3,R4		=>	sradi r4,r3,8
 | 
						|
    SRD $8,R3,R4		=>	rldicl r4,r3,56,8
 | 
						|
    SLD $8,R3,R4		=>	rldicr r4,r3,8,55
 | 
						|
    SRAW $16,R4,R5		=>	srawi r5,r4,16
 | 
						|
    SRW $40,R4,R5		=>	rlwinm r5,r4,0,0,31
 | 
						|
    SLW $12,R4,R5		=>	rlwinm r5,r4,12,0,19
 | 
						|
 | 
						|
  Some non-simple shifts have operands in the Go assembly which don't map directly
 | 
						|
  onto operands in the PPC64 assembly. When an operand in a shift instruction in the
 | 
						|
  Go assembly is a bit mask, that mask is represented as a start and end bit in the
 | 
						|
  PPC64 assembly instead of a mask. See the ISA for more detail on these types of shifts.
 | 
						|
  Here are a few examples:
 | 
						|
 | 
						|
    RLWMI $7,R3,$65535,R6 	=>	rlwimi r6,r3,7,16,31
 | 
						|
    RLDMI $0,R4,$7,R6 		=>	rldimi r6,r4,0,61
 | 
						|
 | 
						|
  More recently, Go opcodes were added which map directly onto the PPC64 opcodes. It is
 | 
						|
  recommended to use the newer opcodes to avoid confusion.
 | 
						|
 | 
						|
    RLDICL $0,R4,$15,R6		=>	rldicl r6,r4,0,15
 | 
						|
    RLDICR $0,R4,$15,R6		=>	rldicr r6.r4,0,15
 | 
						|
 | 
						|
Register naming
 | 
						|
 | 
						|
1. Special register usage in Go asm
 | 
						|
 | 
						|
  The following registers should not be modified by user Go assembler code.
 | 
						|
 | 
						|
  R0: Go code expects this register to contain the value 0.
 | 
						|
  R1: Stack pointer
 | 
						|
  R2: TOC pointer when compiled with -shared or -dynlink (a.k.a position independent code)
 | 
						|
  R13: TLS pointer
 | 
						|
  R30: g (goroutine)
 | 
						|
 | 
						|
  Register names:
 | 
						|
 | 
						|
  Rn is used for general purpose registers. (0-31)
 | 
						|
  Fn is used for floating point registers. (0-31)
 | 
						|
  Vn is used for vector registers. Slot 0 of Vn overlaps with Fn. (0-31)
 | 
						|
  VSn is used for vector-scalar registers. V0-V31 overlap with VS32-VS63. (0-63)
 | 
						|
  CTR represents the count register.
 | 
						|
  LR represents the link register.
 | 
						|
 | 
						|
*/
 | 
						|
package ppc64
 |