In this tutorial we will write a traditional helloworld program in C, then use it to generate LLVM IR and native executables.

Let's start by writing the helloworld program in C. Save the following file as helloworld.c:
#include <stdio.h>
int main()
{
   puts("hello world!");
   return 0;
}

Generating the LLVM IR

To generate the LLVM IR, type the following:
llvm-gcc -emit-llvm -S helloworld.c
A file called helloworld.s is created. If you open it in a text editor, you will see something similar to this:
; ModuleID = 'helloworld.c'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i486-linux-gnu"
@.str = internal constant [13 x i8] c"hello world!\00"		; <[13 x i8]*> [#uses=1]
 
define i32 @main() {
entry:
	%retval = alloca i32		;  [#uses=2]
	%tmp = alloca i32		;  [#uses=2]
	%"alloca point" = bitcast i32 0 to i32		;  [#uses=0]
	%tmp1 = getelementptr [13 x i8]* @.str, i32 0, i32 0		;  [#uses=1]
	%tmp2 = call i32 @puts( i8* %tmp1 ) nounwind 		;  [#uses=0]
	store i32 0, i32* %tmp, align 4
	%tmp3 = load i32* %tmp, align 4		;  [#uses=1]
	store i32 %tmp3, i32* %retval, align 4
	br label %return
 
return:		; preds = %entry
	%retval4 = load i32* %retval		;  [#uses=1]
	ret i32 %retval4
}
 
declare i32 @puts(i8*)
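
The `[13 x i8]` type of `@.str` in the listing above follows from the C source: the literal has 12 visible characters plus the terminating NUL byte that C appends. A minimal sketch that checks this arithmetic; the names `hello`, `hello_storage_size` and `hello_visible_length` are made up for illustration and are not part of the tutorial's code.

```c
#include <stddef.h>
#include <string.h>

/* The same literal as in helloworld.c. Leaving the array size unspecified
   lets the compiler size it to hold the characters plus the trailing '\0'. */
static const char hello[] = "hello world!";

/* Hypothetical helper: bytes stored for the literal, including '\0'. */
size_t hello_storage_size(void)
{
    return sizeof hello;
}

/* Hypothetical helper: visible characters only, as strlen() counts them. */
size_t hello_visible_length(void)
{
    return strlen(hello);
}
```

Calling `hello_storage_size()` should give the element count seen in the IR type, while `hello_visible_length()` gives one less.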

Generating the LLVM BitCode

LLVM bitcode can be generated from the IR with llvm-as, which writes helloworld.s.bc:
llvm-as -f helloworld.s

Executing the LLVM BitCode

To execute the LLVM bitcode that was just created, run it with lli:
lli helloworld.s.bc

Optimization

We will now optimize our code using the mem2reg pass.
llvm-as < helloworld.s | opt -mem2reg > helloworld.bc	
A file called helloworld.bc is created, which is the optimized version, while helloworld.s.bc is the unoptimized version of the same code. You can also see the decrease in file size.
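
One way to check the size difference is a small C helper that reports a file's length. This is only a sketch: `file_size_bytes` is a hypothetical name, and it assumes a platform where seeking to the end of a binary stream is supported (true on common POSIX systems and Windows).

```c
#include <stdio.h>

/* Hypothetical helper: returns the size of the file at `path` in bytes,
   or -1 if the file cannot be opened or measured. */
long file_size_bytes(const char *path)
{
    FILE *f = fopen(path, "rb");
    long size;

    if (f == NULL)
        return -1;

    /* Move to the end of the file; the resulting offset is its size. */
    if (fseek(f, 0L, SEEK_END) != 0) {
        fclose(f);
        return -1;
    }

    size = ftell(f);   /* ftell itself returns -1 on failure */
    fclose(f);
    return size;
}
```

Comparing `file_size_bytes("helloworld.s.bc")` with `file_size_bytes("helloworld.bc")` after running the commands above should show the smaller optimized file.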

Disassemble LLVM bitcode

To confirm that our LLVM bitcode is optimized, let us disassemble it.
llvm-dis -f helloworld.bc -o opthelloworld.ll
The optimized helloworld LLVM IR is written to a file called opthelloworld.ll. Open it in a text editor and compare it with the previous helloworld.s: the %retval and %tmp stack slots, together with their loads and stores, are gone.
opthelloworld.ll looks something like this:
; ModuleID = 'helloworld.bc'
target datalayout = "e-p:32:32:32-i1:8:8-i8:8:8-i16:16:16-i32:32:32-i64:32:64-f32:32:32-f64:32:64-v64:64:64-v128:128:128-a0:0:64-f80:32:32"
target triple = "i486-linux-gnu"
@.str = internal constant [13 x i8] c"hello world!\00"		; <[13 x i8]*> [#uses=1]
 
define i32 @main() {
entry:
	%"alloca point" = bitcast i32 0 to i32		;  [#uses=0]
	%tmp1 = getelementptr [13 x i8]* @.str, i32 0, i32 0		;  [#uses=1]
	%tmp2 = call i32 @puts(i8* %tmp1) nounwind		;  [#uses=0]
	br label %return
 
return:		; preds = %entry
	ret i32 0
}
 
declare i32 @puts(i8*)
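
The difference between the two IR listings can be pictured in C terms. mem2reg promotes stack slots created with alloca into registers, so the value no longer has to travel through memory. The sketch below is only an analogy: the pass works on LLVM IR, not on C source, and the function names `main_before_mem2reg` and `main_after_mem2reg` are invented for illustration.

```c
#include <stdio.h>

/* Mirrors the unoptimized IR: the result travels through two stack
   slots (%tmp and %retval) before it is returned. */
int main_before_mem2reg(void)
{
    int retval;
    int tmp;

    puts("hello world!");
    tmp = 0;           /* store i32 0, i32* %tmp               */
    retval = tmp;      /* load %tmp, then store it to %retval  */
    return retval;     /* load %retval, then ret               */
}

/* Mirrors the optimized IR: no stack slots remain and the constant
   is returned directly. */
int main_after_mem2reg(void)
{
    puts("hello world!");
    return 0;          /* ret i32 0 */
}
```

Both functions print the same line and return the same value; only the route the return value takes differs, which is the change visible between helloworld.s and opthelloworld.ll.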

Generating native assembly code

Before we generate the native assembly code, we need to generate the bitcode first.
llvm-as -f opthelloworld.ll
llc -f opthelloworld.bc
A file containing the native assembly code, called opthelloworld.s, is created. It looks similar to the following:
	.file	"opthelloworld.bc"

	.text
	.align	16
	.globl	main
	.type	main,@function
main:
.Leh_func_begin1:
.Llabel1:
	subl	$4, %esp
	movl	$.str, (%esp)
	call	puts
.LBB1_1:	# return
	xorl	%eax, %eax
	addl	$4, %esp
	ret
	.size	main, .-main
.Leh_func_end1:
	.type	.str,@object
	.section	.rodata.str1.1,"aMS",@progbits,1
.str:				# .str
	.size	.str, 13
	.asciz	"hello world!"
	.section	.eh_frame,"aw",@progbits
.LEH_frame0:
.Lsection_eh_frame:
.Leh_frame_common:
	.long	.Leh_frame_common_end-.Leh_frame_common_begin
.Leh_frame_common_begin:
	.long	0x0
	.byte	0x1
	.asciz	"zR"
	.uleb128	1
	.sleb128	-4
	.byte	0x8
	.uleb128	1
	.byte	0x1B
	.byte	0xC
	.uleb128	4
	.uleb128	4
	.byte	0x88
	.uleb128	1
	.align	4
.Leh_frame_common_end:
.Lmain.eh:
	.long	.Leh_frame_end1-.Leh_frame_begin1
.Leh_frame_begin1:
	.long	.Leh_frame_begin1-.Leh_frame_common
	.long	.Leh_func_begin1-.
	.long	.Leh_func_end1-.Leh_func_begin1
	.uleb128	0
	.byte	0xE
	.uleb128	8
	.byte	0x4
	.long	.Llabel1-.Leh_func_begin1
	.byte	0xD
	.uleb128	4
	.align	4
.Leh_frame_end1:
	.section	.note.GNU-stack,"",@progbits
The syntax generated above is the default GAS (GNU Assembler) format, which is also referred to as AT&T syntax. If you prefer Intel syntax (which NASM uses), it can be generated using
llc -f -x86-asm-syntax=intel opthelloworld.bc -o iopthelloworld.s
A file called iopthelloworld.s is created, which uses Intel-style assembler syntax.
	.686
	.model flat
 
	extern _puts:near
	extern _abort:near
 
_text	segment 'CODE'
	public _main
	align	16
_main	proc near
$label1:
	sub	ESP, 4
	mov	DWORD PTR [ESP], OFFSET __2E_str
	call	_puts
$BB1_1:	; return
	xor	EAX, EAX
	add	ESP, 4
	ret
_main	endp
_text	ends
 
_data	segment 'DATA'
__2E_str:				; .str
	db 'hello world!',0
_data	ends
 
	end

Generating executable file

After the native assembly code has been created, you can assemble and link it using gcc in the following way:
gcc -c opthelloworld.s
gcc opthelloworld.o
You will then get the appropriate executable file, a.exe or a.out, depending on your OS and gcc's default executable name.

You can download the source code of this tutorial from the attachments at the top right of the article. The Makefile for this program is as follows:
all: helloworld.bc opthelloworld.o
 
helloworld.bc: helloworld.s 
	llvm-as -f helloworld.s
	@echo Optimizing LLVM IR
	llvm-as < helloworld.s | opt -mem2reg > helloworld.bc	
 
opthelloworld.o: opthelloworld.s
	gcc -c opthelloworld.s
	gcc opthelloworld.o
 
opthelloworld.s: opthelloworld.bc
	llc -f opthelloworld.bc
	llc -f --x86-asm-syntax=intel opthelloworld.bc -o iopthelloworld.s
 
opthelloworld.bc: opthelloworld.ll
	llvm-as -f opthelloworld.ll
 
opthelloworld.ll: helloworld.bc
	llvm-dis -f helloworld.bc -o opthelloworld.ll
	
helloworld.s:
	llvm-gcc -emit-llvm -S helloworld.c
 
clean:
	rm *.s *.bc *.ll *.o

Attachments (name, size):
- llvmsharp-llvmHelloworldInC.zip, 514 B

ScrewTurn Wiki version 3.0.4.560. Maintained by Prabir Shrestha