-
Notifications
You must be signed in to change notification settings - Fork 5k
[cDAC] Implement sign extension for ClrDataAddresses #116222
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Pull Request Overview
This PR implements sign extension for converting 32-bit pointers to 64-bit CLR data addresses in DAC by using a new helper function. Key changes include replacing direct calls to OutputBufferHelpers with the new static method CopyStringToBuffer, wrapping pointer values with ToClrDataAddress for proper sign extension, and renaming the OutputBufferHelpers class to OutputHelpers.
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated no comments.
File | Description |
---|---|
src/native/managed/cdac/mscordaccore_universal/Legacy/SOSDacImpl.cs | Replaced calls to OutputBufferHelpers.CopyStringToBuffer with CopyStringToBuffer and applied ToClrDataAddress for pointer conversion |
src/native/managed/cdac/mscordaccore_universal/Legacy/OutputHelpers.cs | Renamed class from OutputBufferHelpers to OutputHelpers and added ToClrDataAddress implementation for sign extension |
Comments suppressed due to low confidence (1)
src/native/managed/cdac/mscordaccore_universal/Legacy/OutputHelpers.cs:27
- Consider adding unit tests to verify the behavior of ToClrDataAddress on both 32-bit and 64-bit targets.
public static ulong ToClrDataAddress(TargetPointer address, Target target)
Tagging subscribers to this area: @tommcdon |
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
662f91d
to
36c8003
Compare
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
36c8003
to
72d04e4
Compare
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
/// <summary> | ||
/// Converts a TargetPointer to a ClrDataAddress using sign extension if required. | ||
/// </summary> | ||
public static ClrDataAddress ToClrDataAddress(this TargetPointer address, Target target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What happens in the other direction?
DAC checks the address and throws in the other direction:
runtime/src/coreclr/debug/daccess/dacimpl.h
Lines 56 to 68 in a3f5189
inline TADDR CLRDATA_ADDRESS_TO_TADDR(CLRDATA_ADDRESS cdAddr) | |
{ | |
SUPPORTS_DAC; | |
#ifndef HOST_64BIT | |
static_assert_no_msg(sizeof(TADDR)==sizeof(UINT)); | |
INT64 iSignedAddr = (INT64)cdAddr; | |
if (iSignedAddr > INT_MAX || iSignedAddr < INT_MIN) | |
{ | |
_ASSERTE_MSG(false, "CLRDATA_ADDRESS out of range for this platform"); | |
DacError(E_INVALIDARG); | |
} | |
#endif | |
return (TADDR)cdAddr; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have not modified the ClrDataAddress -> TargetPointer conversion. You are correct that we currently just ignore the high bits.
Ideally, we would handle this case, I discuss this more below.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
With the custom marshaller, I have added the logic to convert from ClrDataAddress
to TargetPointer
/TargetCodePointer
.
@@ -12,7 +12,7 @@ namespace Microsoft.Diagnostics.DataContractReader.Legacy; | |||
|
|||
internal struct CLRDataModuleExtent | |||
{ | |||
public ulong /* CLRDATA_ADDRESS */ baseAddress; | |||
public ClrDataAddress baseAddress; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What would an alternative where this stays as ulong
look like? I assume that you have considered it, but rejected it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes,
I initially prototyped this without changing the managed COM interface definitions. However, I found it very confusing to keep track of which ulong
values should be treated as a CLRDATA_ADDRESS
and which are normal ULONG64
values.
For example, the DacpModuleData
struct has several unsigned 64 bit integer fields that are not CLRDATA_ADDRESS
and should not be sign extended.
runtime/src/coreclr/inc/dacprivate.h
Lines 230 to 256 in 73882e8
struct MSLAYOUT DacpModuleData | |
{ | |
CLRDATA_ADDRESS Address = 0; | |
CLRDATA_ADDRESS PEAssembly = 0; // Actually the module address in .NET 9+ | |
CLRDATA_ADDRESS ilBase = 0; | |
CLRDATA_ADDRESS metadataStart = 0; | |
ULONG64 metadataSize = 0; | |
CLRDATA_ADDRESS Assembly = 0; // Assembly pointer | |
BOOL bIsReflection = FALSE; | |
BOOL bIsPEFile = FALSE; | |
ULONG64 dwBaseClassIndex = 0; | |
ULONG64 dwModuleID = 0; | |
DWORD dwTransientFlags = 0; | |
CLRDATA_ADDRESS TypeDefToMethodTableMap = 0; | |
CLRDATA_ADDRESS TypeRefToMethodTableMap = 0; | |
CLRDATA_ADDRESS MethodDefToDescMap = 0; | |
CLRDATA_ADDRESS FieldDefToDescMap = 0; | |
CLRDATA_ADDRESS MemberRefToDescMap = 0; | |
CLRDATA_ADDRESS FileReferencesMap = 0; | |
CLRDATA_ADDRESS ManifestModuleReferencesMap = 0; | |
CLRDATA_ADDRESS LoaderAllocator = 0; | |
CLRDATA_ADDRESS ThunkHeap = 0; | |
ULONG64 dwModuleIndex = 0; |
I was hoping to use type aliasing to make it more clear which values should be sign extended.
global using CLRDATA_ADDRESS = ulong;
internal struct DacpModuleData
{
public CLRDATA_ADDRESS Address;
public ulong metadataSize;
...
}
However, the aliasing seems to be resolved in VSCode's C# introspection, making it less impactful.
This led me to use the current approach of a thin struct wrapper, this gives the conversion behavior I want, where a TargetPointer
can't be implicitly cast to a ClrDataAddress
. But does have some downsides you mentioned above relating to differences in argument passing of a struct vs integer type on some platforms.
As you mentioned, using this for only out variables should have no impact as it will be passed as a pointer type anyways. Ideally, I'd like to make this change for in variables as well to properly convert from CLRDATA_ADDRESS -> TargetPointer
when truncating. I have done some experimenting and reading about passing primitives vs primitive sized structs. My understanding is that this should be the same on: ARM, ARM64, x64 MSVC, x64 System V, x86 fastcall, and x86 vectorcall
I do understand this isn't ideal due to some ABIs potentially treating ulong
differently from struct Wrapper { ulong Value; }
when passing to a method.
I was considering the case of C++ enum types. I'm trying to understand if these are treated as primitive integers of the underlying type when passed as a parameter. If so, does managed code treat interop with enums differently to match this behavior?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe this can be solved by adding custom marshalling to the struct, which source generated COM appears to support. While there is some amount of overhead, this appears to be pretty much what I am looking for. Would you have any concerns with this approach?
[NativeMarshalling(typeof(ClrDataAddressMarshaller))]
internal struct ClrDataAddress : IEquatable<ClrDataAddress>
{
public ulong Address;
public ClrDataAddress(ulong address) => Address = address;
public static implicit operator ulong(ClrDataAddress a) => a.Address;
public static implicit operator ClrDataAddress(ulong v) => new ClrDataAddress(v);
public override bool Equals(object? obj) => obj is ClrDataAddress address && Equals(address);
public readonly bool Equals(ClrDataAddress other) => Address == other.Address;
public override readonly int GetHashCode() => Address.GetHashCode();
public override readonly string ToString() => $"0x{Address:x}";
}
[CustomMarshaller(typeof(ClrDataAddress), MarshalMode.Default, typeof(ClrDataAddressMarshaller))]
internal static unsafe class ClrDataAddressMarshaller
{
public static ClrDataAddress ConvertToManaged(ulong address) => new ClrDataAddress(address);
public static ulong ConvertToUnmanaged(ClrDataAddress address) => address.Address;
}
/azp run runtime-diagnostics |
Azure Pipelines will not run the associated pipelines, because the pull request was updated after the run command was issued. Review the pull request again and issue a new run command. |
/azp run runtime-diagnostics |
Azure Pipelines successfully started running 1 pipeline(s). |
/// <summary> | ||
/// Represents the native CLRDATA_ADDRESS 64-bit type which uses sign extending | ||
/// when converting from 32-bit values to 64-bit values. | ||
/// For more information see TO_CDADDR in dacimpl.h |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nit: We hope dacimpl.h is going to disappear eventually, so this comment is not durable
/// <summary> | ||
/// Converts a ClrDataAddress to a TargetCodePointer, ensuring the address is within the valid range for the target platform. | ||
/// </summary> | ||
public static TargetCodePointer ToTargetCodePointer(this ClrDataAddress address, Target target) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks unused
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The marshaller-based solution looks good. Thank you
Found as part of #116072. Previously we have been 0 extending all 32-bit values.
The ISOSDacInterface APIs use sign extended values to convert 32-bit pointers to 64-bit values. These are represented using the type
CLRDATA_ADDRESS
. Changed allCLRDATA_ADDRESS
out values in relevant legacy interfaces to useClrDataAddress
to force an explicit conversion. Added a helper to sign extend aTargetPointer
to aClrDataAddress
For example, take the 32-bit sentinel value
0xFFFFFFFF
:with
0
extension (previous behavior) the 64-bitCLRDATA_ADDRESS
would be0x00000000_FFFFFFFF
.with sign extension (after PR) the 64-bit
CLRDATA_ADDRESS
is0xFFFFFFFF_FFFFFFFF
.More context given in this DAC comment:
runtime/src/coreclr/debug/daccess/dacimpl.h
Lines 32 to 53 in 8d59055