[Mono-devel-list] Alias analysis

Zoltan Varga vargaz at gmail.com
Wed Jan 26 18:44:54 EST 2005


                                               Hi,

 Wouldn't be easier to just disable optimizations on variables whose address is
taken ? It usually happens in unsafe code, and with taking addresses
of valuetypes.

                                                            Zoltan

On Thu, 27 Jan 2005 00:15:51 +0100, Massimiliano Mantione
<massi at ximian.com> wrote:
> 
> This is a possible plan to implement alias analysis.
> Those interested can read, and comment on it so that I have
> some feedback...
> 
> Purpose of alias analysis: understand the operations performed
> on local variables, so that SSA can be enabled, and liveness
> information will be not only "conservatively correct", but as
> accurate as possible.
> This means we are (in this step) only interested in the aliasing
> of local variables, and not of storage locations external to the
> current method (analyzing aliasing of those locations could lead
> to the elimination of redundant operations on them, or other
> optimizations, but this is not the goal now).
> 
> So, what should alias analysis do?
> 
> [1] Detect when local variables are aliased.
> Basically, this means watching when the address of a variable
> is taken.
> 
> [2] Understand where the address that has been taken goes.
> This is important for the effectiveness of the next step:
> 
> [3] Whenever an address is used, if the address is actually an
> alias, this means that the aliased variable is used. The use
> could be a load or a store (according to how the pointer is
> used, for reading or writing).
> The key point is that we should be accurate.
> Being conservative, one could say that whenever a pointer is
> used, all variables are potentially used. This would produce
> correct code, but would not help much in optimizing it.
> Instead, we should reason in the following way: every time a
> pointer is used, only the variables potentially aliased by that
> pointer are potentially used.
> In fact, a variable whose address is never taken is not aliased.
> And also a variable whose address *has* been taken is obviously
> unaffected by operations on *other* addresses: this is why step
> two is important, so that step three will be more accurate.
> 
> Now, how should this be implemented?
> This raises another question, which in fact is more important:
> how would the JIT code make use of the aliasing data?
> 
> Before answering this, let's look at what the code does now.
> 
> Currently there is an enum that states which kind of operation
> is performed on local variables, and every instruction is flagged
> with one of the values (MONO_SSA_NOP is the default):
> 
> /* values for MonoInst.ssa_op */
> enum {
> MONO_SSA_NOP,
> MONO_SSA_LOAD,
> MONO_SSA_STORE,
> MONO_SSA_MAYBE_LOAD,
> MONO_SSA_MAYBE_STORE
> };
> 
> The LOAD and STORE values are widely used, often (always?) with
> the assumption that "inst->inst_i0->inst_c0" is a valid variable
> index. IMHO, when reworking these flags, it would be wise not to
> break this assumption. Basically, these values flag pure "read"
> and "write" accesses to local variables or arguments.
> 
> The MONO_SSA_NOP value, as said, is the default.
> It is used explicitly only when making a MonoInst a NOP inside
> the ssa code, and it is tested only in one place, in local copy
> propagation. The test is the following: if a MonoInst is a store,
> but it is flagged MONO_SSA_NOP, assume that we do not know where
> it is stored, so invalidate the state of *all* local variables
> (because all of them have now a potentially unknown value).
> 
> The MONO_SSA_MAYBE_LOAD value has a misleading name.
> It is set in each MonoInst that takes the address of a local (or
> argument). Note that this operation is not really a load, the
> variable value is not used and the variable itself is unchanged.
> And, after such an instruction, the address could in fact be used
> to write to the variable instead of reading (this is why the name
> is misleading IMHO).
> This value is tested for in a couple of places in SSA construction
> and destruction code, simply because it points to a use of the
> variable (taking the address is a form of use after all), and SSA
> must rename all variables in all places where they are used.
> It is also used in SSA based copyprop, but IMHO in the wrong way
> (and SSA based copyprop seems broken anyway).
> What is important about MONO_SSA_MAYBE_LOAD is that it already
> flags *all* the places where an alias is created (because taking
> the address of a variable means creating an alias).
> 
> The MONO_SSA_MAYBE_STORE value is never used, so it can disappear.
> 
> OK, so this is the current state of the code, where do we go from
> here?
> 
> The idea is to work on the MonoInst.ssa_op values we have seen,
> making them represent also operations that happen through aliases
> so that we can properly track them.
> 
> Since MonoInst.ssa_op has three bits, we could use them fully in
> the following way:
> 
> /* values for MonoInst.ssa_op */
> enum {
> MONO_SSA_NOP = 0,
> MONO_SSA_ADDRESS_TAKEN = 1,
> MONO_SSA_LOAD = 2,
> MONO_SSA_STORE = 4,
> MONO_SSA_INDIRECT_LOAD = MONO_SSA_LOAD|MONO_SSA_ADDRESS_TAKEN,
> MONO_SSA_INDIRECT_STORE = MONO_SSA_STORE|MONO_SSA_ADDRESS_TAKEN,
> MONO_SSA_INDIRECT_LOAD_STORE =
> MONO_SSA_LOAD|MONO_SSA_STORE|MONO_SSA_ADDRESS_TAKEN
> };
> 
> In the existing code, all occurrences of MONO_SSA_LOAD and
> MONO_SSA_STORE would work unchanged, and MONO_SSA_MAYBE_LOAD would
> be replaced by MONO_SSA_ADDRESS_TAKEN.
> Also MONO_SSA_NOP would work as before, with the exception of the
> single place where it is tested... how this would change will be
> clear afterwords.
> 
> The new "*_INDIRECT_*" values would mark operations performed
> through aliases, which basically belong to one of the following
> categories:
> - CEE_STIND_* instructions that store to something that could be
>   an alias (instead of storing to a local or arg), these should
>   be marked MONO_SSA_INDIRECT_STORE.
> - CEE_LDIND_* instructions that read from something that could be
>   an alias (instead of reading from a local or arg), these should
>   be marked MONO_SSA_INDIRECT_LOAD.
> - CALL instructions that have at least one parameter that is passed
>   by reference (it could either point to a local, or be an alias),
>   these should be marked MONO_SSA_INDIRECT_LOAD_STORE.
> 
> The real issue, then is: which values are aliases?
> This is why step [2] above is so important: every value that is
> the address of a local variable is an alias, so we should track
> all MONO_SSA_ADDRESS_TAKEN instructions, and see where they put
> the address.
> 
> There are not many alternatives for this, the possible places
> where an alias can end up are the following:
> 
> [a] An OP_OUTARG instruction (so the alias is passed to a called
> method, which can use it for reading, writing, or both).
> [b] A local variable (or an argument).
> [c] A field of an object. The field could be static, and the object
> could be a value type (instead of a class instance), but in this
> first analysis we will ignore the differences between these cases.
> 
> Note that in verifiable code only [a] is allowed (see Partition I,
> section 12.4.1.5.2 of the ECMA standard for CIL).
> This is very important, because we expect the vast majority of
> the code we will run to be verifiable, and case [a] is very easy
> to handle.
> In contrast, case [b] is harder (we have full control of local
> variables, but tracking them is in any case nontrivial), and case
> [c] is likely to be unsolvable.
> 
> So, an initial approach would be to handle case [a] accurately,
> and cases [b] and [c] conservatively.
> 
> - Case [a] (verifiable code)
> 
> To handle case [a], we must examine the MonoCallInst, and follow
> in parallel the arguments (eventually skipping the "this") and
> their parameter declarations in the signature.
> For each "byref" parameter, we should see if we can determine
> which local or argument has been passed. For each of them, the
> combined "LOAD+STORE" operation applies.
> If what is passed to one byref parameter is not just the address
> of a local, we have some trouble... basically this method call
> behaves like an operation of kind [b] or [c] (we must handle it
> conservatively).
> 
> - Cases [b] and [c]
> 
> Basically, when flagging instructions as MONO_SSA_ADDRESS_TAKEN,
> if the generated alias is of kind [b] or [c] we should also flag
> the local (or argument), stating that its address has been taken.
> Since local variables have a MonoInst as well, and since their
> MonoInst.ssa_op makes no sense (they are not real operations), it
> could make sense to flag them exactly setting that field to
> MONO_SSA_ADDRESS_TAKEN :-)
> Then, we just know that each instruction flagged as
> MONO_SSA_INDIRECT_LOAD or MONO_SSA_INDIRECT_STORE that operates
> on an unknown value will potentially affect all variables flagged
> as MONO_SSA_ADDRESS_TAKEN that have a type which is compatible
> with the type of the operation.
> Maybe it would be worth adding a flag to MonoCompile that is true
> if at leas one variable is potentially aliased, this would allow
> to avoid scanning the whole array of variables to find the ones
> that are potentially aliased.
> Or maybe we could keep one list of MonoInst for each primitive
> type, and put in there the aliased variables... but these are just
> implementation details.
> 
> OK, this seems all...
> If you made it that far... congratulations ;-)
> 
> _______________________________________________
> Mono-devel-list mailing list
> Mono-devel-list at lists.ximian.com
> http://lists.ximian.com/mailman/listinfo/mono-devel-list
>



More information about the Mono-devel-list mailing list