A while ago I wrote an article about StringBuilder and a reader mailed me to ask about the efficiency of using String.Format instead. This reminded me of a bone I have to pick with the BCL.
Whenever we make a call to String.Format, it has to parse the format string. That doesn't sound too bad, but string formatting can be used a heck of a lot - and the format is almost always hard-coded in some way. It may be loaded from a resource file instead of being embedded directly in the source code, but it's not going to change after the application has started.
I put together a very crude benchmark which joins two strings together, separating them with just a space. The test uses String.Format first, and then concatenation. (I've tried it both ways round, however, and the results are the same.)
using System;
using System.Diagnostics;

public static class Test
{
    const int Iterations = 10000000;
    const int PieceSize = 10;

    static void Main()
    {
        string first = GenerateRandomString();
        string second = GenerateRandomString();
        // Accumulate the lengths so the results are actually used
        int total = 0;

        // Time String.Format
        Stopwatch sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            string x = String.Format("{0} {1}", first, second);
            total += x.Length;
        }
        sw.Stop();
        Console.WriteLine("Format: {0}", sw.ElapsedMilliseconds);

        // Clear out garbage from the first test before timing the second
        GC.Collect();

        // Time concatenation
        sw = Stopwatch.StartNew();
        for (int i = 0; i < Iterations; i++)
        {
            // Equivalent to first + " " + second
            string x = String.Concat(first, " ", second);
            total += x.Length;
        }
        sw.Stop();
        Console.WriteLine("Concat: {0}", sw.ElapsedMilliseconds);

        // Sanity check: each result is two pieces plus a space
        if (total != Iterations * 2 * (PieceSize * 2 + 1))
        {
            Console.WriteLine("Incorrect total: {0}", total);
        }
    }

    private static readonly Random rng = new Random();

    private static string GenerateRandomString()
    {
        char[] ret = new char[PieceSize];
        for (int j = 0; j < ret.Length; j++)
        {
            ret[j] = (char) ('A' + rng.Next(26));
        }
        return new string(ret);
    }
}
And the results (on my very slow Eee)...
Concat: 3567
As you can see, Format takes significantly longer than Concat. I strongly suspect that this is largely due to having to parse the format string on each iteration. That won't be the whole of the cost - String needs to examine the format specifier for each string as well, in case there's padding, etc - but again that could potentially be optimised.
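To make the "padding, etc" part concrete, here is a small illustration of the kind of per-argument detail a format item can carry: an alignment after a comma and a format specifier after a colon. The class and method names (AlignmentDemo, Padded, Hex) are made up for this example; only String.Format itself comes from the framework.

```csharp
using System;

public static class AlignmentDemo
{
    // A positive alignment right-aligns within the given width;
    // a negative alignment left-aligns.
    public static string Padded()
    {
        return String.Format("[{0,5}] [{1,-5}]", "ab", "cd");
    }

    // A format specifier after the colon is passed on to the argument.
    public static string Hex()
    {
        return String.Format("{0:X4}", 255);
    }

    public static void Main()
    {
        Console.WriteLine(Padded()); // [   ab] [cd   ]
        Console.WriteLine(Hex());    // 00FF
    }
}
```

All of this has to be discovered by reading the format string, which is work that gets repeated on every call even though the string itself never changes.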
I propose a FormatString class with a pair of Format methods, one of which takes a culture and one of which doesn't. We could then hoist our format strings out of the code itself and make them static readonly variables referring to format strings. I'm not saying it would do a huge amount to aid performance, but it could shave off a little time here and there, as well as making it even more obvious what the format string is used for.
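As a rough sketch of what such a class might look like: nothing below exists in the BCL. The name FormatString and the pair of Format methods come from the proposal above; everything else (the Segment helper, the FlushLiteral method, and the restriction to plain {n} placeholders with no alignment or format specifiers) is an assumption made purely for illustration. The point is only that the format string is parsed once, in the constructor, rather than on every call.

```csharp
using System;
using System.Collections.Generic;
using System.Globalization;
using System.Text;

// Hypothetical type: a format string that is parsed once and reused.
public sealed class FormatString
{
    // Literal text when ArgIndex < 0, otherwise a slot for an argument.
    private sealed class Segment
    {
        public readonly string Literal;
        public readonly int ArgIndex;
        public Segment(string literal, int argIndex)
        {
            Literal = literal;
            ArgIndex = argIndex;
        }
    }

    private readonly List<Segment> segments = new List<Segment>();

    public FormatString(string format)
    {
        if (format == null) throw new ArgumentNullException("format");
        StringBuilder literal = new StringBuilder();
        int i = 0;
        while (i < format.Length)
        {
            char c = format[i];
            bool doubled = i + 1 < format.Length && format[i + 1] == c;
            if ((c == '{' || c == '}') && doubled)
            {
                literal.Append(c); // "{{" or "}}" is an escaped brace
                i += 2;
            }
            else if (c == '{')
            {
                int close = format.IndexOf('}', i);
                int index;
                if (close < 0 ||
                    !int.TryParse(format.Substring(i + 1, close - i - 1),
                                  NumberStyles.None,
                                  CultureInfo.InvariantCulture, out index))
                {
                    throw new FormatException("Only simple {n} items are supported");
                }
                FlushLiteral(literal);
                segments.Add(new Segment(null, index));
                i = close + 1;
            }
            else if (c == '}')
            {
                throw new FormatException("Unmatched '}'");
            }
            else
            {
                literal.Append(c);
                i++;
            }
        }
        FlushLiteral(literal);
    }

    private void FlushLiteral(StringBuilder literal)
    {
        if (literal.Length > 0)
        {
            segments.Add(new Segment(literal.ToString(), -1));
            literal.Length = 0;
        }
    }

    // Formats using the current culture.
    public string Format(params object[] args)
    {
        return Format(CultureInfo.CurrentCulture, args);
    }

    // Formats using the given culture; no parsing happens here.
    public string Format(IFormatProvider provider, params object[] args)
    {
        if (args == null) throw new ArgumentNullException("args");
        StringBuilder sb = new StringBuilder();
        foreach (Segment s in segments)
        {
            if (s.ArgIndex < 0) { sb.Append(s.Literal); continue; }
            if (s.ArgIndex >= args.Length)
            {
                throw new FormatException("Not enough arguments");
            }
            object arg = args[s.ArgIndex];
            IFormattable formattable = arg as IFormattable;
            if (formattable != null) sb.Append(formattable.ToString(null, provider));
            else if (arg != null) sb.Append(arg.ToString());
        }
        return sb.ToString();
    }
}

public static class FormatStringDemo
{
    // The format is hoisted into a static readonly variable, as proposed.
    private static readonly FormatString FullName = new FormatString("{0} {1}");

    public static void Main()
    {
        Console.WriteLine(FullName.Format("Hello", "World")); // Hello World
    }
}
```

Whether this would actually beat String.Format in practice is untested here; a real implementation would also need to handle alignment and format specifiers, which this sketch deliberately leaves out.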