
Programming - Speed Up Using Assembly Language

This guide gives tips and ideas on how to speed up your programs using assembly language. The example dims an image/screen in FreeBasic, making use of its inline assembler. The same idea of mixing in some assembly can apply to any language, though.

It is code I have used in various games, but the original version was too slow. In those games the function was called every time the screen was refreshed, so it needed to be as fast as possible.

Original Version

This is the original FB version:

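for j=0 to 479
	for i=0 to 639
		t=point(i,j)			'' Read the pixel colour from the screen
		r=(t and &H00ff0000) shr 16	'' Split it into red, green and blue
		g=(t and &H0000ff00) shr 8
		b=(t and &H000000ff)
		r=r shr 1			'' Halve each colour
		g=g shr 1
		b=b shr 1
		pset (i,j),rgb(r,g,b)		'' Plot the dimmed pixel back on the screen
	next i
next j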

Time taken : 0.076 seconds

It reads a point, gets each separate colour, halves the colour, then plots it back on the screen. It works, but it is not very quick: both 'point' and 'pset' are complex instructions, and each is called once per pixel (307,200 times per frame at 640x480).
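The guide does not say how its timings were measured. One way to take comparable measurements is FreeBasic's Timer function, sketched below. DimWork and the pixels array are hypothetical stand-ins (an off-screen buffer the same size as a 640x480 32-bit screen), not code from the games mentioned above.

```freebasic
'' Timing sketch. Assumption: Timer is used for measurement; the guide
'' does not state how its own figures were obtained.

'' Hypothetical stand-in for whichever dimming routine is being timed
Sub DimWork(buf() As UByte)
	For i As Integer = LBound(buf) To UBound(buf)
		buf(i) = buf(i) Shr 1
	Next i
End Sub

'' Off-screen buffer, same size as a 640x480 screen at 4 bytes per pixel
Dim Shared pixels(0 To (640*480*4)-1) As UByte

Dim As Double t0 = Timer            '' Timer returns seconds as a Double
DimWork(pixels())
Dim As Double elapsed = Timer - t0

Print "Time taken : "; elapsed; " seconds"
```

Results will vary between machines and runs, so it is worth timing each version several times and comparing the ratios rather than the absolute figures.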

Better Version

This version uses a pointer to work directly on the screen:

Dim buffer As UByte Ptr = ScreenPtr() 
Dim tmp As UByte

for i as integer=0 to (640*480*4)-1	'' 640x480 pixels, 4 bytes each
	tmp=*buffer shr 1	'' Read a byte from the screen, shift it and put it into tmp
	*buffer=tmp		'' Put our shifted byte back on screen
	buffer+=1		'' Move screen pointer to the next byte
next i

Time taken : 0.006 seconds

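It is over 10 times quicker! It reads one byte of a pixel from the screen, shifts it right by one bit (which halves it) and then puts it back. Each colour is stored in its own byte, so there is no need to separate the colours first.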
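To check that shifting every byte really does halve each colour, the same loop can be run on a small array instead of the screen. This is a minimal sketch: the pixels array and its values are made up for illustration, and the pointer into the array stands in for the one ScreenPtr() returns.

```freebasic
'' Four pixels in 32-bit colour; the values are illustrative only
Dim pixels(0 To 3) As ULong = { &H00FF8040, &H00FFFFFF, &H00000000, &H00102030 }

'' Byte pointer over the array, standing in for the ScreenPtr() pointer
Dim buffer As UByte Ptr = CPtr(UByte Ptr, @pixels(0))

For i As Integer = 0 To (4*4)-1		'' 4 pixels, 4 bytes each
	*buffer = *buffer Shr 1		'' Halve one colour byte in place
	buffer += 1			'' Move to the next byte
Next i

For i As Integer = 0 To 3
	Print Hex(pixels(i), 8)		'' Prints 007F4020, 007F7F7F, 00000000, 00081018
Next i
```

Every byte is halved independently, so the result does not depend on the order in which the colour bytes are stored in memory.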

Assembly Version

This version also directly manipulates the screen:

dim buffer As UByte Ptr = ScreenPtr()    
dim as uinteger c=(640*480*2)		'' number of 2-byte words on the screen
asm					'' 32-bit x86 only
	mov eax, [c]		''counter
	mov ebx, [buffer]	''screen buffer
	dtop:
		mov cx,[ebx]		''screen pos
		shr ch, 1		''shift colour bytes
		shr cl, 1
		mov [ebx],cx		''put shifted colours back
		add ebx,2	''move to next 2 bytes
		sub eax,1	'' decrease count
	jnz dtop	

end asm  

Time taken : 0.002 seconds

Not as big a speed increase, but still 3 times faster again! It is very similar to the previous version, except that this one reads the screen bytes two at a time, shifts both and then puts them back.

Note that there are probably assembly versions of this that are even faster but I think the above gets the point across.
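One direction that might be worth trying, in plain FreeBasic or in assembly, is to handle all four bytes of a pixel at once. This is a different technique from the loops above, not something taken from the original games: shift the whole 32-bit value right by one, then mask with &H7F7F7F7F to clear the bits that leaked down from the byte above. DimPixel and DimBuffer are hypothetical names, and whether this beats the assembly loop would need to be measured.

```freebasic
'' Halve all four bytes of a 32-bit pixel in one operation.
'' After the shift, the low bit of each byte ends up in the top bit of
'' the byte below it; the mask clears the top bit of every byte.
Function DimPixel(ByVal p As ULong) As ULong
	Return (p Shr 1) And &H7F7F7F7F
End Function

'' Apply it to a block of pixels (hypothetical helper).
'' For a 32-bit 640x480 screen this might be called as:
''   DimBuffer(CPtr(ULong Ptr, ScreenPtr()), 640*480)
Sub DimBuffer(ByVal buffer As ULong Ptr, ByVal pixelCount As Integer)
	For i As Integer = 0 To pixelCount - 1
		buffer[i] = DimPixel(buffer[i])
	Next i
End Sub

'' Small check on an off-screen array; the values are illustrative only
Dim pixels(0 To 1) As ULong = { &H00FF8141, &HFFFFFFFF }
DimBuffer(@pixels(0), 2)
Print Hex(pixels(0), 8)		'' Prints 007F4020
Print Hex(pixels(1), 8)		'' Prints 7F7F7F7F
```

The result matches shifting each byte separately, but the loop runs once per pixel instead of once per byte.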


This example is cherry-picked in that it compares a slow, complex function to an optimised, specific one. The main points are:

There are many advantages to optimising your programs:

Last updated 19/01/2019