AArch64 Optimization – Stage 1 – Benchmarks Redux

When I did the benchmark in the last post, I imagined that the compression would take longer. Benchmarking something that only takes a fifth of a second generally isn’t a great idea – it’s too easy for anything else going on in the system to disrupt the results of the benchmark. Running it a few thousand times does help, but in general it’s best to use data that takes at least a second or two to process.
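The repeated-trials idea above can be sketched roughly like this. It's a minimal illustration in Python, not the script behind the numbers in this post; the function names and the default of 100 trials are assumptions for the example. Reporting the median and minimum alongside the mean helps, since both are less affected by the occasional run that gets disturbed by something else on the system.

```python
import statistics
import subprocess
import time


def time_command(cmd, trials=100):
    """Run cmd `trials` times; return the wall-clock duration of each run in seconds."""
    durations = []
    for _ in range(trials):
        start = time.perf_counter()
        # check=True raises if the command fails, so a broken run is never timed silently.
        subprocess.run(cmd, check=True, stdout=subprocess.DEVNULL)
        durations.append(time.perf_counter() - start)
    return durations


def summarize(durations):
    """Return the mean, median and minimum of a list of durations."""
    return {
        "mean": statistics.mean(durations),
        "median": statistics.median(durations),
        "min": min(durations),
    }
```

Usage would look something like summarize(time_command(["./thing", "input.bin"])), where ./thing and input.bin are placeholder names for the program and the test file.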

So, I went and looked through an old hard drive for some video files I could use. The first one I tried was about 1.1GB. It took far longer to run than I expected, so I popped top open and had a look. I promptly discovered that the machine I’m testing on only has about 550MB of free RAM (and so was swapping heavily, which is no good). Fortunately, I also had a 514MB file lying around. That should work for now:

The average time for thing across 100 trials is: 1.440855900 seconds
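The "will this file fit in RAM" check that I did by eye in top could also be done up front. Here is a rough sketch, assuming a Linux box whose /proc/meminfo reports MemAvailable (kernels from 3.14 on do) and assuming the benchmark holds roughly the whole file in memory. The function names and the headroom parameter are made up for illustration.

```python
import os


def parse_mem_available_kib(meminfo_text):
    """Return the MemAvailable value (in KiB) from /proc/meminfo-style text, or None if absent."""
    for line in meminfo_text.splitlines():
        if line.startswith("MemAvailable:"):
            return int(line.split()[1])
    return None


def fits_in_memory(file_size_bytes, mem_available_kib, headroom=1.0):
    """True if the file is no larger than `headroom` times the available memory."""
    return file_size_bytes <= mem_available_kib * 1024 * headroom


def check_file(path):
    """Linux-only wrapper: compare a file's size with the current MemAvailable."""
    with open("/proc/meminfo") as f:
        available = parse_mem_available_kib(f.read())
    if available is None:
        raise RuntimeError("MemAvailable is not reported by this kernel")
    return fits_in_memory(os.path.getsize(path), available)
```

With roughly 550MB available, fits_in_memory accepts a 514MB file and rejects a 1.1GB one. Lowering headroom below 1.0 would leave some room for the program's own buffers.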

I also decided to get some preliminary benchmark results on an x86_64 box. The point isn’t to compare the two machines; it’s to have numbers to compare against later. I want to make sure any changes I make don’t result in a performance loss on x86_64.

The average time for thing across 100 trials is: .132044908 seconds

That’s definitely a bit awkward: at about 0.13 seconds, the x86_64 run is back in the range that’s too short to benchmark reliably. I can’t use a larger file because it’ll swap on the AArch64 box (which would distort the results), and testing with an entirely different file on x86_64 probably isn’t a good idea (since the effect of an optimization could vary depending on which file is used). Well, there’s not too much I can do about it at this point anyway. I may end up doing optimizations that don’t affect the code executed on x86_64 at all, so I’ll leave it at this for now. There’s also a chance I’ll have access to a better AArch64 box in the future.
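For the later comparison against these baselines, something like the following sketch could flag a slowdown. The two baseline values are the averages reported above. The 5% tolerance is an arbitrary assumption to absorb run-to-run noise, and the function names are invented for the example.

```python
# Baseline averages from this post, in seconds.
BASELINE_AARCH64 = 1.440855900
BASELINE_X86_64 = 0.132044908


def relative_change(baseline, current):
    """Fractional change versus the baseline; positive means the new run is slower."""
    return (current - baseline) / baseline


def is_regression(baseline, current, tolerance=0.05):
    """True if the new time is slower than the baseline by more than `tolerance`."""
    return relative_change(baseline, current) > tolerance
```

For example, is_regression(BASELINE_X86_64, new_average) would be checked after each change, with new_average measured on the same file and the same machine as the baseline.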

