SSJX.CO.UK

Converting Avi2Cvc from C to D

This is a summary of porting my Cybiko video conversion utility from C to the D Programming Language. The main reasons for this were to improve its memory safety (I had already found one memory issue) and to take advantage of any other benefits that using D provides.

Simple Stuff

In no particular order:

For loops to Foreach loops

Replaced many loops such as:

for(int x=0;x<width;x++)

with

foreach(x;0..width)

There are many advantages to doing this, some of them subtle.
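As a minimal sketch of one of the subtler differences: in a foreach range the two bounds are evaluated once, while the condition of a C style for loop is re-evaluated before every iteration. The type of the loop variable is also inferred from the bounds. The width() function and the calls counter below are hypothetical and exist only to make the difference visible; they are not part of Avi2Cvc.

```d
import std.stdio;

int calls;   // How many times the bound has been evaluated

// Hypothetical stand-in for a bound that is costly to compute
int width()
{
	calls++;
	return 4;
}

// Number of times width() runs in a C style loop
int countFor()
{
	calls=0;
	for(int x=0;x<width();x++){}
	return calls;
}

// Number of times width() runs in a foreach range loop
int countForeach()
{
	calls=0;
	foreach(x;0..width()){}
	return calls;
}

void main()
{
	writeln("for:     ",countFor());       // 5 (four passes plus the final test)
	writeln("foreach: ",countForeach());   // 1
}
```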

Mallocs to Arrays

From this:

char *pcyf,*cyf;
pcyf=(char *)malloc(4100);  // Previous frame
cyf=(char *)malloc(4100);   // Cybiko frame
...
memcpy(pcyf,cyf,4000);	// Make current frame the new previous

to this:

ubyte[4000] prev;
ubyte[4000] pic;
...
prev=pic;	// Make current frame the new previous

The D method is much clearer: no pointers need to be passed around to functions and, thanks to default bounds checking and other checks (via @safe), it is memory safe. Copying the array by assignment is also safer and clearer than memcpy(), and it lets the compiler work out the best way to do the copy.
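A small sketch of why the assignment is a real copy: fixed size arrays in D are value types, so prev=pic copies all 4000 bytes rather than making two names for the same memory. The keepFrame() function is a hypothetical wrapper added for illustration; the array names follow the snippet above.

```d
import std.stdio;

// Hypothetical wrapper: copies the current frame over the previous one.
// Both arrays are passed by reference, no pointers involved.
void keepFrame(ref ubyte[4000] prev,ref const ubyte[4000] pic) @safe
{
	prev=pic;
}

void main()
{
	ubyte[4000] prev;
	ubyte[4000] pic;

	pic[0]=200;
	keepFrame(prev,pic);

	pic[0]=50;           // Changing pic afterwards does not affect prev
	writeln(prev[0]);    // 200
	writeln(pic[0]);     // 50

	// pic[4000]=1;      // Rejected at compile time: index is out of bounds
}
```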

If to a Final Switch

The C version looks messy but it worked, though the memcmp() looks suspect and I should have lower-cased the compression name (as the D version does) rather than testing both cases:

int comp(char first[4],char second[4])
{
	return memcmp(first,second,4);
}
..
..
if (comp(vheader.shand,"msvc")==0 || comp(vheader.shand,"MSVC")==0)
{
	msvc(mem,img,width,height,header.size);
}

if (comp(vheader.shand,"dib ")==0 || comp(vheader.shand,"DIB ")==0)
{
	dib(mem,img,width,height,vformat.depth);
}

The D version looks a little different due to the way the file is read (raw frames are not passed to a decode function) but it is still much clearer. Also note the ref in the foreach, which lets me modify the array in place.

// Lower case the compression name (probably a function that does this better...)
foreach(ref s;vheader[0].shand)
{
	if (s>=65 && s<=90){s+=32;}
}
..
..
// If we got this far, the compression is one or the other
final switch(vheader[0].shand)
{
case "msvc":
	frame.length=header[0].size;
	file.rawRead(frame);
	decode_msvc(frame,gray8,vformat[0].xsize,vformat[0].ysize);
	
	gray=gray8;
break;

case "dib ":
	bmp.length=header[0].size;	// May include padding so best not use x*y*depth
	file.rawRead(bmp);
	grey(bmp,vformat[0].xsize,vformat[0].ysize);
break;
}
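On the comment about a better way to lower case the name: one option is toLower from std.ascii, which only changes 'A' to 'Z' and leaves everything else (including the space in "dib ") alone. This is a sketch rather than the code used in Avi2Cvc. The lowerFourCC() and describe() functions are hypothetical, and describe() uses an ordinary switch with a default instead of a final switch so that unknown names can be shown too.

```d
import std.ascii : toLower;
import std.stdio;

// Hypothetical helper: lower-cases a four character code in place
void lowerFourCC(ref char[4] code) @safe
{
	foreach(ref c;code){c=toLower(c);}
}

// Hypothetical helper: maps a compression name to a description.
// code is passed by value, so the caller's copy is left untouched.
string describe(char[4] code)
{
	lowerFourCC(code);

	switch(code[])
	{
	case "msvc": return "Microsoft Video 1";
	case "dib ": return "Uncompressed";
	default:     return "Unsupported";
	}
}

void main()
{
	char[4] a="MSVC";
	char[4] b="DIB ";
	char[4] c="cvid";

	writeln(describe(a));   // Microsoft Video 1
	writeln(describe(b));   // Uncompressed
	writeln(describe(c));   // Unsupported
}
```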

Extra Stuff

The original C version had support for the Microsoft Video 1 Codec (MSVC). The Pascal version did not, but it did have Error Diffusion as a colour reduction option. The D version has both!

The quick port of the MSVC decoder revealed an out-of-bounds error straight away, so that was fixed and the function had a general tidy up as well.
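To illustrate how D reports this kind of mistake, here is a minimal sketch of a read one element past the end of an array. It assumes a default build (no -release or -boundscheck=off). The overruns() function is hypothetical, and catching a RangeError like this is only done to show that the check fires; it is not how the decoder was fixed.

```d
import core.exception : RangeError;
import std.stdio;

// Hypothetical helper: reports whether reading data[index] trips
// the bounds check. Catching an Error is for demonstration only.
bool overruns(const ubyte[] data,size_t index)
{
	try
	{
		ubyte value=data[index];
		return false;
	}
	catch(RangeError e)
	{
		return true;
	}
}

void main()
{
	auto frame=new ubyte[16];

	writeln(overruns(frame,15));   // false: last valid index
	writeln(overruns(frame,16));   // true: one past the end
}
```

In C the same read would quietly return whatever happened to be in memory after the buffer.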

Garbage Collection

D uses a Garbage Collector (GC) to manage memory. It keeps track of allocations and frees them when they are no longer in use. This is great, but it means we need to pay a little more attention to our allocations so we do not overwork it. By running the built-in profiler we can get an idea of when the GC is being used and for what, and try to help it.

To get a profile log, compile with -profile=gc. Running the program then produces a surprisingly easy to read profilegc.log file.

Running the profile-enabled version and encoding a 240 frame video using ordered dithering gave this:

bytes allocated, allocations, type, function, file:line
         191376	             84	ubyte[] D main avi2cvc.d:515
          49152	              1	ubyte[] D main avi2cvc.d:462
          23040	            240	const(const(const(ubyte)[])[]) avi2cvc.ordered avi2cvc.d:166
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:167
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:168
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:169
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:170
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:171
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:172
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:173
           3840	            240	const(const(ubyte)[]) avi2cvc.ordered avi2cvc.d:174
             32	              1	immutable(char)[] D main avi2cvc.d:288

The lines with 240 allocations are the ordered dither pattern (a 64 byte array; the bytes allocated seem high, but I assume that includes size data and other overhead). Because the array was declared inside the function, it was allocated again on every call and then left for the GC to clean up. This is bad in general and even worse for a GC language! Luckily this was an easy fix: take the array out of the function!
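The fix can be sketched as follows. The 4x4 Bayer matrix and the function names are hypothetical stand-ins to keep the example short (the real pattern is 64 bytes). The first function builds its table from an array literal on every call, which is the kind of allocation the log above shows; the second reads a table declared once at module scope, and marking it @nogc makes the compiler confirm that it cannot allocate.

```d
import std.stdio;

// Before: the literal is built on the GC heap every time this runs
ubyte thresholdSlow(size_t x,size_t y)
{
	const ubyte[][] pattern=[
		[ 0, 8, 2,10],
		[12, 4,14, 6],
		[ 3,11, 1, 9],
		[15, 7,13, 5]];

	return pattern[y%4][x%4];
}

// After: one fixed size table at module scope, nothing allocated per call
immutable ubyte[4][4] bayer=[
	[ 0, 8, 2,10],
	[12, 4,14, 6],
	[ 3,11, 1, 9],
	[15, 7,13, 5]];

ubyte thresholdFast(size_t x,size_t y) @nogc
{
	return bayer[y%4][x%4];
}

void main()
{
	// Print the table via the allocation free version
	foreach(y;0..4)
	{
		foreach(x;0..4){write(thresholdFast(x,y)," ");}
		writeln();
	}
}
```

Both functions return the same values, so the change should not alter the dithered output.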

For a 356 frame video using error diffusion this was the result:

bytes allocated, allocations, type, function, file:line
         221520	             56	ubyte[] D main avi2cvc.d:522
          49152	              1	ubyte[] D main avi2cvc.d:469
           5696	            356	const(const(ubyte)[]) avi2cvc.diffuse avi2cvc.d:219
             32	              1	immutable(char)[] D main avi2cvc.d:295

The 356 allocations were a 4 byte colour array. Again, the simple solution was to move it out of the function.
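An alternative to moving a small table out of the function, not the approach taken here, is to declare it static immutable inside the function. It is then stored once in the program's data and stays private to the function. A minimal sketch, where the nearestLevel() function and the four grey values are assumptions for illustration:

```d
import std.math : abs;
import std.stdio;

// Hypothetical helper: returns the nearest of four grey levels.
// @nogc makes the compiler reject any GC allocation in here.
ubyte nearestLevel(int value) @nogc
{
	// Stored once in static data, not rebuilt on each call
	static immutable ubyte[4] levels=[0,85,170,255];

	ubyte best=levels[0];
	foreach(l;levels)
	{
		if (abs(value-l)<abs(value-best)){best=l;}
	}
	return best;
}

void main()
{
	writeln(nearestLevel(100));   // 85
	writeln(nearestLevel(200));   // 170
	writeln(nearestLevel(250));   // 255
}
```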

Conclusion

The result of all this is that I now have a fast, safe and easy to read program. The program may still have bugs but at least they will not be due to memory corruption. Also note that this program does not make use of any of D's more advanced features; I am just using D as a nicer C compiler with some useful extras included!

Last Updated 20/05/2024
Created 15/02/2020