[Mono-dev] [PATCH] Enable TLS for PPC32/64

Paolo Molaro lupus at ximian.com
Wed Jan 28 10:46:44 EST 2009


On 01/25/09 Steven Munroe wrote:
> What you consider superfluous addresses an annoying bug where the
> mkbundle generates code that can never be assembled and results in a 2GB
> a.out file. This is a result of brain-dead code the than can't handle
> biarch builds (where the build is for the non-default arch).
> 
> Has far I can tell mkbundle is disabled for PPC32 and has been for some
> time ...

mkbundle used to work on linux and osx on ppc32. IIRC it was disabled
because for some reason that I didn't investigate, the pkg-config call
failed while running make test on the POS that is OSX, while it worked
when the test was executed standalone.
So there should be no reason to disable it on ppc64: if there is an
actual bug it needs to be fixed.

> 2009-01-24  Steven Munroe  <munroesj at us.ibm.com>
> 
> This patch is contributed under the terms of the MIT/X11 license
> 
> 	* exceptions-ppc.c (mono_arch_get_restore_context): Correct
> 	g_assert test.
> 	* tramp-ppc.c (mono_arch_create_trampoline_code): Use
> 	MONO_PPC_32_64_CASE to set tramp_size for 64-bit.
> 
> 	* Makefile.am: Disable mkbundle for POWERPC64.
> 
> diff -urN mono-svn0/mono/mono/mini/exceptions-ppc.c mono-svn-fix/mono/mono/mini/exceptions-ppc.c
> --- mono-svn0/mono/mono/mini/exceptions-ppc.c	2009-01-23 17:05:56.000000000 -0600
> +++ mono-svn-fix/mono/mono/mini/exceptions-ppc.c	2009-01-24 12:21:08.000000000 -0600
> @@ -218,7 +218,7 @@
>  	/* never reached */
>  	ppc_break (code);
>  
> -	g_assert ((code - start) < size);
> +	g_assert ((code - start) <= size);
>  	mono_arch_flush_icache (start, code - start);
>  	return start;
>  }
> diff -urN mono-svn0/mono/mono/mini/tramp-ppc.c mono-svn-fix/mono/mono/mini/tramp-ppc.c
> --- mono-svn0/mono/mono/mini/tramp-ppc.c	2009-01-23 17:05:56.000000000 -0600
> +++ mono-svn-fix/mono/mono/mini/tramp-ppc.c	2009-01-24 12:22:03.000000000 -0600
> @@ -504,7 +504,7 @@
>  	guint8 *jump;
>  	int tramp_size;
>  
> -	tramp_size = 32;
> +	tramp_size = MONO_PPC_32_64_CASE (32, 44);
>  
>  	code = buf = mono_global_codeman_reserve (tramp_size);
>  

This part is ok to commit.

> 009-01-24  Steven Munroe  <munroesj at us.ibm.com>
> 
> This patch is contributed under the terms of the MIT/X11 license
> 
> 	* ppc-codegen.h: Make operand order and case consistent
> 	(assembler order) for ppc_load_reg_update,
> 	ppc_load_multiple_regs, ppc_store_multiple_regs, ppc_lwz,
> 	ppc_lhz, ppc_lbz, ppc_stw,ppc_sth, ppc_stb, ppc_stwu, ppc_lbzu,
> 	ppc_lfdu, ppc_lfsu, ppc_lfsux, ppc_lfsx, ppc_lha, ppc_lhau,
> 	ppc_lhzu, ppc_lmw, ppc_lwzu, ppc_stbu, ppc_stfdu, ppc_stfsu,
> 	ppc_sthu, ppc_stmw.
> 	Use "i" or "ui" instead of "d" for immediated operands to
> 	immediate arthimetic and logical instructions in macros
> 	ppc_addi, ppc_addis, ppc_ori, ppc_addic, ppc_addicd, ppc_andid,
> 	ppc_andisd.
> 	[__mono_ppc64__]: Make operand order and case consistent
> 	(assembler order) for ppc_load_multiple_regs,
> 	ppc_store_multiple_regs.
> 	Simplify the DS form and make them consistent with D forms for
> 	ppc_load_reg, ppc_load_reg_update,  ppc_store_reg,
> 	ppc_store_reg_update. ppc_ld, ppc_lwa, ppc_ldu, ppc_std,
> 	ppc_stdu.
> 	Define ppc_lwax and ppc_lwaux.
> 
> 	* exceptions-ppc.c (restore_regs_from_context): Correct operand
> 	order (offset then base reg) for ppc_load_multiple_regs.
> 	(emit_save_saved_regs) Correct operand order for
> 	ppc_store_multiple_regs.
> 	(mono_arch_get_call_filter): Correct operand order for
> 	ppc_load_multiple_regs.
> 	* mini-ppc.c (emit_memcpy): Fix operand order for
> 	ppc_load_reg_update and ppc_store_reg_update.
> 	(mono_arch_output_basic_block): Correct operand order for
> 	ppc_lha.
> 	(mono_arch_emit_epilog): Correct operand order for
> 	ppc_load_multiple_regs.
> 	* tramp-ppc.c (mono_arch_create_trampoline_code): Correct
> 	operand order for ppc_store_multiple_regs and
> 	ppc_load_multiple_regs.

This patch is fine.

> 2009-01-24  Steven Munroe  <munroesj at us.ibm.com>
> 
> This patch is contributed under the terms of the MIT/X11 license
> 
> 	* ppc-codegen.h: Change ppc_is_imm16 and ppc_is_imm32 to avoid
> 	compiler warnings in 64-bit. 

Will this generate warnings on ppc32 instead?

> 	* mini-ppc.c (emit_memcpy): Use type long instead of int for
> 	values used to large immediate fields to eliminate range
> 	warnings.
> 	(mono_arch_emit_outarg_vt) [__APPLE__]: Declare size.
> 	(emit_float_to_int); Change vars offset and sub_offset to type
> 	long.
> 	(emit_load_volatile_arguments) [__APPLE__]: Declare size.
> 	(emit_reserve_param_area): 
> 	(emit_unreserve_param_area: Change var size to type long.
> 	(mono_arch_output_basic_block): Copy cfg->stack_usage to local
> 	type long to avoid warnings.  Also compute LR save offset
> 	in a local long ret_offset.
> 	[!__mono_ppc64__]: Use smaller ppc_load_sequence const value if
> 	32-bit.
> 	(mono_arch_emit_prolog): Change vars alloc_size, pos,
> 	max_offset to type long. Add "l" modifier to g_print for operand
> 	pos. 
> 	[!__mono_ppc64__]: Use smaller ppc_load_sequence const value if
> 	32-bit.
> 	(mono_arch_emit_epilog): Copy cfg->stack_usage to local
> 	type long to avoid warnings.
> 	(try_offset_access): #if 0 away.
> 
> diff -urN mono-svn64-gen/mono/mono/mini/mini-ppc.c mono-svn64-warn/mono/mono/mini/mini-ppc.c
> --- mono-svn64-gen/mono/mono/mini/mini-ppc.c	2009-01-24 14:24:34.000000000 -0600
> +++ mono-svn64-warn/mono/mono/mini/mini-ppc.c	2009-01-25 12:29:19.000000000 -0600
> @@ -137,9 +137,9 @@
>  {
>  	/* unrolled, use the counter in big */
>  	if (size > sizeof (gpointer) * 5) {
> -		int shifted = size >> MONO_PPC_32_64_CASE (2, 3);
> +		long shifted = size >> MONO_PPC_32_64_CASE (2, 3);
>  		guint8 *copy_loop_start, *copy_loop_jump;
> -
> +		

Please don't introduce whitespace damage.

>  		ppc_load (code, ppc_r0, shifted);
>  		ppc_mtctr (code, ppc_r0);
>  		g_assert (sreg == ppc_r11);
> @@ -1499,9 +1499,9 @@
>  	int i, soffset, dreg;
>  
>  	if (ainfo->regtype == RegTypeStructByVal) {
> -		guint32 size = 0;
>  		soffset = 0;
>  #ifdef __APPLE__
> +		guint32 size = 0;

This introduces declarations after code: while accepted by recent
compilers, such cases shouldn't be used in mono.

> @@ -3546,7 +3548,7 @@
>  			break;
>  		case OP_JMP: {
>  			int i, pos;
> -			
> +			long stack_usage = cfg->stack_usage;

It seems to me it would be more sensible to declare cfg->stack_usage as
long instead of adding the locals.


> 2009-01-24  Steven Munroe  <munroesj at us.ibm.com>
> 
> This patch is contributed under the terms of the MIT/X11 license
> 
> 	* ppc-codegen.h [__mono_ppc64__]: Define ppc_is_imm48 and
> 	ppc_load48. Update ppc_load to use ppc_is_imm48 and ppc_load48
> 	to allow for a shorter sequence for most address constants.
> 
> 	* mini-ppc.c: Add includes to get at Aux vector definitions..
> 	[__linux__]: Define functions linux_find_auxv, linux_get_auxv,
> 	linux_auxv_init_once, linux_query_auxv, linux_init_ppc_SMP,
> 	linux_init_ppc_platform.
> 	Define flags/values auxv_once_control, linux_ppc_hwcap,
> 	linux_ppc_platform, linux_ppc_ISA2x, linux_ppc_ISA2x_mask,
> 	linux_ppc_SMP, linux_ppc_LSUs, linux_ppc_FXUs, linux_ppc_FPUs.
> 	Define macros HAS_ICACHE_SNOOP.
> 	(emit_memcpy): Generate optimized code for longer moves where
> 	linux_ppc_LSUs is 2 or more.
> 	(mono_arch_cpu_init) [__linux__]: Add calls to linux_auxv_init_once
> 	and linux_init_ppc_platform.
> 	(mono_arch_flush_icache) [__linux__]: Use linux_query_auxv
> 	(AT_DCACHEBSIZE) and set cachelineinc and cachelinesize.
> 	Optimize using HAS_ICACHE_SNOOP, linux_ppc_SMP, and
> 	linux_ppc_ISA2x.
> 	(mono_arch_output_basic_block)[__mono_ppc64__]: Replace
> 	lwz/lwzx/extsw sequence with lwa/lwax.
> 	(mono_arch_output_basic_block): Replace ppc_addic with ppc_addi
> 	for case OP_JMP.
> 	(mono_arch_emit_epilog): Replace ppc_addic with ppc_addi for
> 	unstacking frames.
> 	* mini-ppc.h (MONO_ARCH_HAVE_TLS_GET): Defined.
> 	(MONO_ARCH_ENABLE_MONITOR_IL_FASTPATH): Likewise.
> 	(PPC_THREAD_PTR_REG): Likewise.

This patch mixes some optimizations with other stuff that is not needed,
so it should be split.

> diff -urN mono-svn64-warn/mono/mono/arch/ppc/ppc-codegen.h mono-svn64-opt/mono/mono/arch/ppc/ppc-codegen.h
> --- mono-svn64-warn/mono/mono/arch/ppc/ppc-codegen.h	2009-01-24 15:07:17.000000000 -0600
> +++ mono-svn64-opt/mono/mono/arch/ppc/ppc-codegen.h	2009-01-24 16:57:11.000000000 -0600
> @@ -719,12 +719,22 @@
>  #define PPC_LOAD_SEQUENCE_LENGTH	20
>  
>  #define ppc_is_imm32(val) (((((long)val)>> 31) == 0) || ((((long)val)>> 31) == -1))
> +#define ppc_is_imm48(val) (((((long)val)>> 47) == 0) || ((((long)val)>> 47) == -1))
>  
> +#define ppc_load48(c,D,v) G_STMT_START {	\
> +		ppc_li   ((c), (D), ((gint64)(v) >> 32) & 0xffff);	\
> +		ppc_sldi ((c), (D), (D), 32); \
> +		ppc_oris ((c), (D), (D), ((guint64)(v) >> 16) & 0xffff);	\
> +		ppc_ori  ((c), (D), (D),  (guint64)(v)        & 0xffff);	\
> +	} G_STMT_END
> +	
>  #define ppc_load(c,D,v) G_STMT_START {	\
>  		if (ppc_is_imm16 ((gulong)(v)))	{	\
>  			ppc_li ((c), (D), (guint16)(guint64)(v));	\
>  		} else if (ppc_is_imm32 ((gulong)(v))) {	\
>  			ppc_load32 ((c), (D), (guint32)(guint64)(v)); \
> +		} else if (ppc_is_imm48 ((gulong)(v))) {	\
> +			ppc_load48 ((c), (D), (guint64)(v)); \
>  		} else {	\
>  			ppc_load_sequence ((c), (D), (guint64)(v)); \
>  		}	\

This part is fine to commit.

> diff -urN mono-svn64-warn/mono/mono/mini/mini-ppc.c mono-svn64-opt/mono/mono/mini/mini-ppc.c
> --- mono-svn64-warn/mono/mono/mini/mini-ppc.c	2009-01-25 12:29:19.000000000 -0600
> +++ mono-svn64-opt/mono/mono/mini/mini-ppc.c	2009-01-25 16:24:57.000000000 -0600
> @@ -11,6 +11,17 @@
>   */
>  #include "mini.h"
>  #include <string.h>
> +#include <unistd.h>
> +#include <fcntl.h>
> +#include <errno.h>
> +#include <elf.h>
> +#include <link.h>
> +#include <pthread.h>

Headers included without proper checks, besides, pthread is not needed
here, elf.h can be simply avoided (there was already a structure to
represent the auxv data that you deleted).

> +volatile static ElfW(auxv_t) *linux_auxv_buf = NULL;

We don't want to keep this data duplicated in memory: there is no use
except using more memory.

> +static ElfW(auxv_t) * 
> +linux_get_auxv(void)
> +{
> +	ElfW(auxv_t) *auxv_temp = (ElfW(auxv_t) *)linux_auxv_buf;
> +	int auxv_f;
> +	size_t page_size = getpagesize();
> +	ssize_t bytes;
> +
> +	/* If the /proc/self/auxv file has not been copied into the heap
> +	   yet, then do it */
> +
> +	if(auxv_temp == NULL)
> +	{
> +		auxv_f = open("/proc/self/auxv", O_RDONLY);
> +
> +		if(auxv_f == -1) {

Mark already pointed out that you're not following the mono coding
sstyle.

> +		/* Older kernels did not support /proc/<PID>/auxv. But
> +		   the auvx table does exist in the process address
> +		   space following the env table. So try scanning over
> +		   the environment table to find the auxv. */
> +			if (errno == ENOENT) {
> +				auxv_temp = (ElfW(auxv_t)*) linux_find_auxv();
> +		/* If someone has done a setenv() the __environ pointer
> +		   may have been moved and the assumption that the auxv
> +		   follows is not true. So look at the first entry and
> +		   verify that it is an auxv entry. */

If this detection is not reliable we should simply not do it. We can
document that users should have a 2.6 kernel.

> +			} else {
> +				perror("Error opening file for reading");

We don't want to bother users with an error message that is both
non-informative and that they can't do anything about.

> +static pthread_once_t auxv_once_control = PTHREAD_ONCE_INIT;

There is no need to use pthread_once_t: just doing the setup in the init
call is ok.

> +static unsigned long	linux_ppc_hwcap;
> +static char 		*linux_ppc_platform;
> +static int		linux_ppc_ISA2x; /* PowerISA-2.0 or newer */
> +static unsigned long	linux_ppc_ISA2x_mask = 0
> +#ifdef PPC_FEATURE_POWER4 /* PowerISA-2.01 */
> +			| PPC_FEATURE_POWER4
> +#endif
> +#ifdef PPC_FEATURE_POWER5 /* PowerISA-2.03 */
> +			| PPC_FEATURE_POWER5
> +#endif
> +#ifdef PPC_FEATURE_POWER5_PLUS  /* PowerISA-2.04 */
> +			| PPC_FEATURE_POWER5_PLUS
> +#endif
> +#ifdef PPC_FEATURE_CELL
> +			| PPC_FEATURE_CELL
> +#endif
> +#ifdef PPC_FEATURE_PA6T
> +			| PPC_FEATURE_PA6T
> +#endif
> +#ifdef PPC_FEATURE_ARCH_2_05
> +			| PPC_FEATURE_ARCH_2_05
> +#endif
> +	; /* it took a while to figure out that the AT_HWCAP should represent
> +	     ISA versions and optional categories/features and the AT_PLATFORM 
> +	     should represent the CHIP design and specific micro-architecture.
> +	     Which explains the mess above.*/
> +
> +/* Default to SMP true in case we can't find out.  */
> +static int		linux_ppc_SMP = 1;
> +
> +/* Number of independent load store pipes in each core. */
> +static int		linux_ppc_LSUs = 0;
> +/* Number of independent fixed point pipes in each core. */
> +static int		linux_ppc_FXUs = 0;
> +/* Number of independent floating point pipes in each core. */
> +static int		linux_ppc_FPUs = 0;

A single uint bitmask can replace all of the above for the jit usage.
Besides, having one or more load/store units is not a linux property,
so the names of the variables is incorrect.

> +#ifdef PPC_FEATURE_ICACHE_SNOOP
> +#define HAS_ICACHE_SNOOP (linux_ppc_hwcap & PPC_FEATURE_ICACHE_SNOOP)
> +#else
> +#define HAS_ICACHE_SNOOP 0
> +#endif
> +
> +static int
> +linux_init_ppc_SMP(void)
> +{
> +	struct utsname u;
> +	/* FIXME For 2.6.26 kernels we can try to use sysfs
> +	   /sys/devices/system/cpu/possible,  but we would have to fall
> +	   back to uname for early kernels anyway. */
> +
> +	if (uname(&u) != 0) {
> +		perror("Error uname syscall failer\n");
> +		return 1;
> +	}
> +
> +	if (strstr(u.version, "SMP"))
> +		return 1;
> +	else {
> +		if (strstr(u.version, "smp"))
> +			return 1;
> +		else
> +			return 0;

This doesn't look like a reliable way. We have already utility calls to
get the number of processors and that should be enough.

> +static void
> +linux_init_ppc_platform(void)
> +{
> +	linux_ppc_hwcap = (unsigned long) linux_query_auxv (AT_HWCAP);
> +	linux_ppc_platform = (char*) linux_query_auxv (AT_PLATFORM);
> +	linux_ppc_ISA2x = (linux_ppc_hwcap & linux_ppc_ISA2x_mask) != 0L;
> +	linux_ppc_SMP = linux_init_ppc_SMP();
> +	
> +	if ((strcmp(linux_ppc_platform, "power4") >= 0)
> +	&&  (strcmp(linux_ppc_platform, "power6x") <= 0)) {
> +		linux_ppc_LSUs = 2;
> +		linux_ppc_FXUs = 2;
> +		linux_ppc_FPUs = 2;
> +	} else if  (strcmp(linux_ppc_platform, "ppc970") == 0) {
> +		linux_ppc_LSUs = 2;
> +		linux_ppc_FXUs = 2;
> +		linux_ppc_FPUs = 2;
> +	} else if  (strcmp(linux_ppc_platform, "cell") == 0) {
> +		linux_ppc_LSUs = 1;
> +		linux_ppc_FXUs = 1;
> +		linux_ppc_FPUs = 1;
> +	}
> +}
> +#endif
> +
>  /* this function overwrites r0, r11, r12 */
>  static guint8*
>  emit_memcpy (guint8 *code, int size, int dreg, int doffset, int sreg, int soffset)
> @@ -156,6 +367,20 @@
>  		dreg = ppc_r12;
>  	}
>  #ifdef __mono_ppc64__
> +	/* the hardware has multiple load/store units and the move is long
> +	   enough to use more then one regiester, then use load/load/store/store
> +	   to execute 2 instructions per cycle. */
> +	if ((linux_ppc_LSUs > 1) && (dreg != ppc_r12) && (sreg != ppc_r12)) { 
> +		while (size >= 16) {
> +			ppc_load_reg (code, ppc_r0, soffset, sreg);
> +			ppc_load_reg (code, ppc_r12, soffset+8, sreg);
> +			ppc_store_reg (code, ppc_r0, doffset, dreg);
> +			ppc_store_reg (code, ppc_r12, doffset+8, dreg);
> +			size -= 16;
> +			soffset += 16;
> +			doffset += 16; 
> +		}
> +	}
>  	while (size >= 8) {
>  		ppc_load_reg (code, ppc_r0, soffset, sreg);
>  		ppc_store_reg (code, ppc_r0, doffset, dreg);
> @@ -163,6 +388,18 @@
>  		soffset += 8;
>  		doffset += 8;
>  	}
> +#else
> +	if ((linux_ppc_LSUs > 1) && (dreg != ppc_r12) && (sreg != ppc_r12)) { 
> +		while (size >= 8) {
> +			ppc_load_reg (code, ppc_r0, soffset, sreg);
> +			ppc_load_reg (code, ppc_r12, soffset+4, sreg);
> +			ppc_store_reg (code, ppc_r0, doffset, dreg);
> +			ppc_store_reg (code, ppc_r12, doffset+4, dreg);
> +			size -= 8;
> +			soffset += 8;
> +			doffset += 8; 
> +		}
> +	}

This optimization should be in its own separate patch (together with
numbers that show it's worthwhile).

> @@ -664,21 +890,29 @@
>  		isync
>  	}
>  #else
> -	if (1) {
> -		for (p = start; p < endp; p += cachelineinc) {
> -			asm ("dcbf 0,%0;" : : "r"(p) : "memory");
> +	/* For POWER5/6 with ICACHE_SNOOP the dcbst/icbi is not required.  */
> +	if (!HAS_ICACHE_SNOOP) {
> +		if (linux_ppc_SMP) {
> +			for (p = start; p < endp; p += cachelineinc) {
> +				asm ("dcbf 0,%0;" : : "r"(p) : "memory");
> +			}
> +		} else {
> +			for (p = start; p < endp; p += cachelineinc) {
> +				asm ("dcbst 0,%0;" : : "r"(p) : "memory");
> +			}

The changes to icache flush should be a separate patch. As I said already
on irc, though, unless there is a bug in the code, changes to it should
be done only if there are hard numbers showing that the change is a
performance benefit in some real-world case. The code as it is written
fixed some random crashes happening a few years ago, so there must be
very strong reasons to change it even a single instruction.

> 2009-01-24  Steven Munroe  <munroesj at us.ibm.com>
> 
> This patch is contributed under the terms of the MIT/X11 license
> 
> 	* mini-ppc.c (emit_nptl_tls): Define for __mono_ppc64__ and
> 	!__mono_ppc64__.
> 	(emit_tls_access): Update to use emit_nptl_tls.
> 	(mono_arch_emit_prolog): Use G_STRUCT_OFFSET(MonoJitTlsData, lmf)
> 	only if !TLS_MODE_NPTL.
> 	(setup_tls_access)[__linux__ & _CS_GNU_LIBPTHREAD_VERSION]: 
> 	Use confstr to determine pthread implementation for TLS.
> 	Use mono_domain_get_tls_offset() to set monodomain_key.
> 	Use mono_get_lmf_addr_tls_offset() to set lmf_pthread_key.
> 	Use mono_thread_get_tls_offset() to set monothread_key.
> 	* mini-ppc.h (MONO_ARCH_HAVE_TLS_GET): Defined.
> 	(MONO_ARCH_ENABLE_MONITOR_IL_FASTPATH): Likewise.
> 	(PPC_THREAD_PTR_REG): Likewise.
> 
> 	* mono-compiler.h: Define MONO_TLS_FAST and
> 	MONO_THREAD_VAR_OFFSET.
> 
> diff -urN mono-svn64-opt/mono/mono/mini/mini-ppc.c mono-svn64-tls/mono/mono/mini/mini-ppc.c
> --- mono-svn64-opt/mono/mono/mini/mini-ppc.c	2009-01-25 16:24:57.000000000 -0600
> +++ mono-svn64-tls/mono/mono/mini/mini-ppc.c	2009-01-25 16:40:28.000000000 -0600
> @@ -92,9 +92,36 @@
>  		if ((dreg) != ppc_r3) ppc_mr ((code), ppc_r3, ppc_r11);	\
>  	} while (0);
>  
> +#ifdef __mono_ppc64__
> +#define emit_nptl_tls(code,dreg,key) do { \
> +		int off1 = key; \
> +		int off2 = key >> 15; \
> +		if ((off2 == 0) || (off2 == -1)) { \
> +			ppc_load_reg ((code), (dreg), off1, ppc_r13);	\
> +		} else { \
> +			int off3 = (off2 + 1) > 1; \
> +			ppc_addis ((code), ppc_r11, ppc_r13, off3); \
> +			ppc_load_reg ((code), (dreg), off1, ppc_r11);	\
> +		} \
> +	} while (0);
> +#else
> +#define emit_nptl_tls(code,dreg,key) do { \
> +		int off1 = key; \
> +		int off2 = key >> 15; \
> +		if ((off2 == 0) || (off2 == -1)) { \
> +			ppc_load_reg ((code), (dreg), off1, ppc_r2);	\
> +		} else { \
> +			int off3 = (off2 + 1) > 1; \
> +			ppc_addis ((code), ppc_r11, ppc_r2, off3); \
> +			ppc_load_reg ((code), (dreg), off1, ppc_r11);	\
> +		} \
> +	} while (0);
> +#endif

If I didn't miss something, the only difference between the two macros
is the register, so you should unify them and use the PPC_THREAD_PTR_REG
you introduce below.
Please incorporate also Mark's changes and suggestions (I didn't check
if he made changes to your code besides splitting it up).

Thanks, nice work!

lupus

-- 
-----------------------------------------------------------------
lupus at debian.org                                     debian/rules
lupus at ximian.com                             Monkeys do it better


More information about the Mono-devel-list mailing list