
dotNET - Under the Hood, IL Assembly, MSIL, ILGenerator, MethodBuilder

IL (Intermediate Language) is also referred to as MSIL (Microsoft Intermediate Language); ILAsm is the textual assembly-language form of the same thing. IL is the code which your source code (C#, VB.NET etc.) is compiled into. When your EXE or DLL is run, this IL code is converted into machine code specific to the present machine, i.e. if the application is run on a Windows 2000 machine with an x86 processor, the IL will be converted into x86 instructions before it executes on that processor.

When you're developing in a .NET language with a development tool such as Visual Studio 2005, the compiler you use will generate correct, valid IL for you. But what if you want to create your own IL? One reason could be that you've developed your own language and would like it to run on .NET; in this case you'll need to write out an assembly of some sort, be it an EXE or a DLL, containing IL code. To do this you can use classes available in the .NET API, which reside within the System.Reflection.Emit namespace. I'll explain how this is achieved later.

There are also some tools for assembling and disassembling IL code. ILASM (all caps means the tool and not the language ILAsm) is the IL assembler, which turns ILAsm text into an assembly, and ILDASM is the IL disassembler, which does the reverse: it produces ILAsm text from a compiled assembly. Note that ILDASM gives you IL, not C# or VB.NET source code; recovering source in a .NET language needs a separate decompiler.

An IL assembly contains two things: metadata and managed code. Metadata is information which describes the structures (types) and methods within the assembly. The managed code is the actual IL code, stored in the assembly in binary form; "managed" means that the runtime controls its execution.

At runtime the assembly is loaded and the metadata is read first to find the descriptions of the structures and methods. The JIT (just-in-time) compiler then uses the metadata to compile the IL code into machine code, normally one method at a time, the first time each method is called. On later calls the machine code already generated for that method is executed directly (incidentally, this is what differentiates this from an interpreter).
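Both halves of an assembly can be looked at from C# through reflection. The following is only a minimal sketch: the class MetadataDemo and the method Twice are names made up for this illustration, and it reads the metadata and the raw IL bytes of one method without ever calling it.

```csharp
using System;
using System.Reflection;

public static class MetadataDemo
{
    // A trivial method whose metadata and IL are inspected below.
    public static int Twice(int x)
    {
        return x + x;
    }

    // Returns the raw IL bytes that the compiler stored for Twice.
    public static byte[] GetTwiceIL()
    {
        MethodInfo method = typeof(MetadataDemo).GetMethod("Twice");
        return method.GetMethodBody().GetILAsByteArray();
    }

    public static void Main()
    {
        MethodInfo method = typeof(MetadataDemo).GetMethod("Twice");

        // Metadata: name, return type and parameters, read without running Twice.
        Console.WriteLine(method.ReturnType.Name + " " + method.Name);
        foreach (ParameterInfo p in method.GetParameters())
        {
            Console.WriteLine("  " + p.ParameterType.Name + " " + p.Name);
        }

        // Managed code: the IL bytes. The final byte is 0x2A, the 'ret' opcode.
        Console.WriteLine(BitConverter.ToString(GetTwiceIL()));
    }
}
```

The exact bytes printed depend on whether you compile in debug or release, so treat the output as something to explore rather than a fixed value.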

PE (Portable Executable) and COFF (Common Object File Format) are the Windows file formats that wrap the assembly so the operating system can load it as an EXE or DLL; we'll not worry about them here.

Contents of the Assembly:
The assembly contains one or more modules. An assembly may or may not have multiple modules, but it must have at least one, i.e. the Prime Module. The Prime Module contains a Metadata section which describes the contents of the assembly, an Assembly Identity section (the manifest) and maybe some actual IL code. The assembly may also contain additional modules, each with its own Metadata and IL Code sections.
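The module structure can also be listed through reflection. This is a small sketch rather than anything definitive; ModuleDemo is a made-up class name, and the code simply inspects whichever assembly it was compiled into.

```csharp
using System;
using System.Reflection;

public static class ModuleDemo
{
    // Number of modules in the assembly that contains this class.
    public static int CountModules()
    {
        return typeof(ModuleDemo).Assembly.GetModules().Length;
    }

    public static void Main()
    {
        Assembly assembly = typeof(ModuleDemo).Assembly;

        // The module that carries the assembly manifest (the Prime Module above).
        Console.WriteLine("Manifest module: " + assembly.ManifestModule.Name);

        // Every module in the assembly; most assemblies have exactly one.
        foreach (Module module in assembly.GetModules())
        {
            Console.WriteLine("Module: " + module.Name);
        }
    }
}
```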


Boxed/Boxing and UnBoxed/UnBoxing:
You'll see these two terms appearing whenever you deal with IL; those of you who are C++ developers may have heard of them before. They relate to reference types and value types. Reference types are objects on the heap to which a variable points; a value-type variable instead contains the value in its own memory location (on the stack, for a local variable). This can be seen in the IL code. Reference types are generally more expensive than value types, in terms of processor and memory, because each one needs a heap allocation that must later be garbage collected.

Boxing is the conversion of a value type to a reference type. The compiler emits a box instruction and the conversion itself happens at runtime, i.e. the value is contained in a variable's memory location and, for some reason (such as assigning it to an Object variable or passing it to a method parameter of type Object), you want to treat your value type as a reference type. This means that a bit copy is made of the value and an object is created on the heap containing that copied value.
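Because boxing copies the value, the box and the original variable are independent afterwards. A minimal sketch of that (BoxingDemo and its method name are made up for this illustration):

```csharp
using System;

public static class BoxingDemo
{
    public static int BoxedValueAfterChange()
    {
        int value = 20;
        object boxed = value;   // box: copies 20 into a new heap object
        value = 30;             // changes only the original variable
        return (int)boxed;      // the box still holds the copy made earlier
    }

    public static void Main()
    {
        Console.WriteLine(BoxedValueAfterChange());   // prints 20
    }
}
```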

UnBoxing is the conversion of a reference type back to a value type, again performed at runtime by an instruction the compiler emits (unbox or unbox.any), i.e. in your source code a variable points to a boxed value on the heap and you cast it back to its value type. The runtime checks that the object really is a boxed instance of that value type and then copies the value out of the heap object into the value-type variable.
Int32 unBoxed = 20;//Unboxed
Object boxed = unBoxed;//Boxed
Int32 unBoxedBoxed = (Int32)boxed;

Here's the IL code generated for the C# source code above
.entrypoint
// Code size 19 (0x13)
.maxstack 1
.locals init ([0] int32 unBoxed,
[1] object boxed,
[2] int32 unBoxedBoxed)
IL_0000: nop
IL_0001: ldc.i4.s 20
IL_0003: stloc.0
IL_0004: ldloc.0
IL_0005: box [mscorlib]System.Int32
IL_000a: stloc.1
IL_000b: ldloc.1
IL_000c: unbox.any [mscorlib]System.Int32
IL_0011: stloc.2
IL_0012: ret
Here's a great article on Boxing from msdn magazine.
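One thing worth knowing about the unbox.any instruction in the listing above is that it names an exact type, and the runtime checks the boxed object against it. The sketch below (UnboxingDemo is a made-up class name) shows that a boxed Int32 cannot be unboxed straight to a long, even though an int converts to a long without trouble once it has been unboxed.

```csharp
using System;

public static class UnboxingDemo
{
    public static bool UnboxToWrongTypeThrows()
    {
        object boxed = 20;                 // a boxed System.Int32
        try
        {
            long wrong = (long)boxed;      // asks to unbox as Int64: fails
            Console.WriteLine(wrong);      // never reached
            return false;
        }
        catch (InvalidCastException)
        {
            return true;
        }
    }

    public static long UnboxThenConvert()
    {
        object boxed = 20;
        return (int)boxed;                 // unbox as Int32, then widen to long
    }

    public static void Main()
    {
        Console.WriteLine(UnboxToWrongTypeThrows());  // prints True
        Console.WriteLine(UnboxThenConvert());        // prints 20
    }
}
```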


When playing around with IL code and reflection you may want to generate a method within an assembly yourself. To write the method in IL you use the MethodBuilder class found in the System.Reflection.Emit namespace. Once you've created your MethodBuilder you can write IL code directly into that method using the ILGenerator class, and save everything to a DLL later. The ILGenerator can be thought of as the writer of the IL code, and a reference to it is obtained from the MethodBuilder instance you've just created:
using System;
using System.Reflection;
using System.Reflection.Emit;
using System.Threading;

// The statements below belong inside a method of your own program.
// AppDomain.DefineDynamicAssembly and AssemblyBuilder.Save are .NET Framework APIs.
//
// Example is assumed to look like this (the field is public so that the
// generated method, which lives in another assembly, may access it):
//     public class Example { public int id; }
// The names "ExampleHelper" and "ChangeID" are illustrative choices.

// create a dynamic assembly and a module that can be saved to disk
AssemblyName assemblyName = new AssemblyName();
assemblyName.Name = "HelloWorld";
AssemblyBuilder assemblyBuilder = Thread.GetDomain().DefineDynamicAssembly(
    assemblyName, AssemblyBuilderAccess.RunAndSave);

ModuleBuilder module;
module = assemblyBuilder.DefineDynamicModule("HelloWorld", "HelloWorld.dll");

// create a new type to hold our method
TypeBuilder typeBuilder = module.DefineType(
    "ExampleHelper",
    TypeAttributes.Public | TypeAttributes.Class);

// the field that the generated method reads and writes
FieldInfo fid = typeof(Example).GetField("id");

// create the method: static int ChangeID(Example instance, int newId)
MethodBuilder methodbuilder = typeBuilder.DefineMethod(
    "ChangeID",
    MethodAttributes.Public | MethodAttributes.Static,
    typeof(int),
    new Type[] { typeof(Example), typeof(int) });

// generate the IL for the method
ILGenerator ilGenerator = methodbuilder.GetILGenerator();

// Push the current value of the id field onto the
// evaluation stack. It's an instance field, so load the
// instance of Example before accessing the field.
ilGenerator.Emit(OpCodes.Ldarg_0);
ilGenerator.Emit(OpCodes.Ldfld, fid);

// Load the instance of Example again, load the new value
// of id, and store the new field value.
ilGenerator.Emit(OpCodes.Ldarg_0);
ilGenerator.Emit(OpCodes.Ldarg_1);
ilGenerator.Emit(OpCodes.Stfld, fid);

// The original value of the id field is now the only
// thing on the stack, so return from the call.
ilGenerator.Emit(OpCodes.Ret);

// bake it
Type helloWorldType = typeBuilder.CreateType();
assemblyBuilder.Save("HelloWorld.dll");
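If you only want to try ILGenerator out without writing a DLL to disk, the DynamicMethod class (also in System.Reflection.Emit) lets you emit a method in memory and call it through a delegate. This is a separate, smaller sketch and not part of the sample above; the class name DynamicAddDemo and the method name "Add" are made up for the illustration.

```csharp
using System;
using System.Reflection.Emit;

public static class DynamicAddDemo
{
    // Emits the IL for "int Add(int a, int b) { return a + b; }" and
    // returns a delegate that invokes it.
    public static Func<int, int, int> BuildAdd()
    {
        DynamicMethod method = new DynamicMethod(
            "Add", typeof(int), new Type[] { typeof(int), typeof(int) });

        ILGenerator il = method.GetILGenerator();
        il.Emit(OpCodes.Ldarg_0);   // push the first argument
        il.Emit(OpCodes.Ldarg_1);   // push the second argument
        il.Emit(OpCodes.Add);       // pop both, push their sum
        il.Emit(OpCodes.Ret);       // return the value left on the stack

        return (Func<int, int, int>)method.CreateDelegate(typeof(Func<int, int, int>));
    }

    public static void Main()
    {
        Func<int, int, int> add = BuildAdd();
        Console.WriteLine(add(2, 3));   // prints 5
    }
}
```

The IL is JIT-compiled the first time the delegate is invoked, just like IL loaded from an assembly on disk.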


A useful reference for a deeper understanding of how .NET works under the hood is Microsoft's "Shared Source Common Language Infrastructure", codenamed "Rotor". It's a shared-source implementation of the CLI and can be used for discovering more about .NET code. There's also a book about it, available from googledocs; see my bookmarks.

Comments

Anonymous said…
Hi Niall,

Thank you for the interesting post!

You or your readers might be interested to read about the project named 'Visual IL' created by Craig Skibo.

There are two related posts:
- Visual IL
- Visual IL source code now available
learnerplates said…
Dmitry,
Nice one, looks promising and I'd love to give it a go.
There's a problem with the download of the project from gotdotnet right now, hopefully Craig will sort out an alternative.
Good to see you're keeping an eye on me :).
