StringBuilders
Resources
Video Script
[Slide 1]
In the previous sections, we have talked about why we would need to use strings and how they are a natural choice. When looking into the implementation of strings, we also saw that they could be less than ideal.
[Slide 2]
We will now talk about how we can have the best of both worlds: using text-based data while avoiding repetitive character writes. Java has a specific class for this problem: StringBuilder. StringBuilders and Strings can be nearly interchangeable with only a few modifications. A StringBuilder is a mutable sequence of characters, whereas a String is an immutable sequence of characters. The StringBuilder class has built-in methods that we can greatly benefit from. First, let’s discuss initialization: a newly initialized StringBuilder has no characters and an initial capacity of 16 characters, and its capacity is automatically increased if needed. You can also specify a capacity upon initialization if you would like.
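As a small illustration of the initialization behavior described here, the sketch below creates one StringBuilder with the default constructor and one with an explicit capacity, then reports their length and capacity. The class and method names (StringBuilderInitDemo, report, capacityAfterGrowth) are made up for this example.

```java
// Sketch: inspecting freshly created StringBuilders (names are illustrative only).
class StringBuilderInitDemo {

    // Reports length and capacity for a default and an explicitly sized builder.
    static String report() {
        StringBuilder defaults = new StringBuilder();   // no characters, capacity 16
        StringBuilder sized = new StringBuilder(64);    // no characters, capacity 64

        return "default: length=" + defaults.length()
                + ", capacity=" + defaults.capacity()
                + "; sized: length=" + sized.length()
                + ", capacity=" + sized.capacity();
    }

    // Appends one more character than the default capacity can hold.
    static int capacityAfterGrowth() {
        StringBuilder sb = new StringBuilder();
        for (int i = 0; i < 17; i++) {
            sb.append('x');   // the 17th append no longer fits in 16 slots
        }
        // The builder grew on its own; the exact new capacity is an
        // implementation detail, so we only rely on it being at least 17.
        return sb.capacity();
    }

    public static void main(String[] args) {
        System.out.println(report());
        System.out.println("capacity after 17 appends: " + capacityAfterGrowth());
    }
}
```

Note that length() counts the characters actually stored, while capacity() is the space currently reserved for them.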
[Slide 3]
You can read more about StringBuilders in the Java documentation; here we will cover the methods you will most likely use. StringBuilders have an append method, which lets you add characters to the end of your StringBuilder. StringBuilders, like many other classes, also have a toString() method. This is the main difference you will notice when using StringBuilders instead of Strings: you need an extra step to get the resulting String from a StringBuilder.
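A minimal sketch of these two methods working together is shown here. The class name AppendToStringDemo and the method build are invented for illustration.

```java
// Sketch: append adds to the end, toString() produces the final String.
class AppendToStringDemo {

    static String build() {
        StringBuilder sb = new StringBuilder();
        sb.append('J');     // append a single character
        sb.append("ava");   // append also accepts a whole String

        // sb is still a StringBuilder at this point; the extra step
        // is calling toString() to obtain an actual String.
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(build());
    }
}
```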
[Slide 4]
Now that we have the basics of StringBuilders, let’s do a memory walkthrough of the same algorithm from the Strings section, modified to use StringBuilders. This implementation requires the same number of lines, so what’s different?
[Slide 5]
I have highlighted the minimal changes that were required. To change to a StringBuilder implementation, we only had to change 4 lines, which cover three kinds of change: initializing ENC, appending to ENC, and returning ENC. With these changes, we will now see a difference in both the number of operations that require new memory allocation and the number of character copies!
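The slide's code is not reproduced in this script, so the following is only a sketch of what the two versions might look like. It assumes the algorithm is a letter shift by a fixed, non-negative offset that leaves non-letters unchanged; the class name EncodeComparison and the helper shift are hypothetical, and the number of changed lines in this sketch need not match the slide.

```java
// Sketch: the same encoding loop with a String and with a StringBuilder.
// Assumption: the encoding shifts letters by a non-negative offset.
class EncodeComparison {

    // Hypothetical helper: shifts a letter within its own case, wrapping
    // around the alphabet; any other character is returned unchanged.
    static char shift(char c, int offset) {
        if (c >= 'A' && c <= 'Z') {
            return (char) ('A' + (c - 'A' + offset) % 26);
        }
        if (c >= 'a' && c <= 'z') {
            return (char) ('a' + (c - 'a' + offset) % 26);
        }
        return c;
    }

    // String accumulator: every += produces a brand-new String.
    static String encodeWithString(String input, int offset) {
        String ENC = "";
        for (int i = 0; i < input.length(); i++) {
            ENC += shift(input.charAt(i), offset);
        }
        return ENC;
    }

    // StringBuilder accumulator: only the marked lines differ.
    static String encodeWithBuilder(String input, int offset) {
        StringBuilder ENC = new StringBuilder();          // changed: initialization
        for (int i = 0; i < input.length(); i++) {
            ENC.append(shift(input.charAt(i), offset));   // changed: append
        }
        return ENC.toString();                            // changed: return
    }

    public static void main(String[] args) {
        System.out.println(encodeWithString("Dt", 8));
        System.out.println(encodeWithBuilder("Dt", 8));
    }
}
```

Both methods return the same result for the same input; the difference lies in how much copying and allocation happens along the way.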
[Slide 6]
We will now walk through this example as we did with the String implementation. We start with an empty heap and then add the StringBuilder in the first memory location. Variable ENC will be set to point at that location. For this implementation, the simplest way to visualize the StringBuilder is as a character array. It is important to note, however, that an array of characters and a StringBuilder are distinctly different in code.
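To make that last distinction concrete, here is a short sketch that stores the same character in a char array and in a StringBuilder. The class name CharArrayVsBuilder and the method demo are invented for this example.

```java
// Sketch: a char[] and a StringBuilder hold characters, but are used differently.
class CharArrayVsBuilder {

    static String demo() {
        char[] letters = new char[3];   // fixed size: exactly 3 slots, never grows
        letters[0] = 'L';               // characters are placed by index
        // Build a String from only the one slot we filled.
        String fromArray = new String(letters, 0, 1);

        StringBuilder sb = new StringBuilder();
        sb.append('L');                 // no index needed; the builder grows as required
        String fromBuilder = sb.toString();

        // letters.length is the number of slots; sb.length() is the number
        // of characters actually appended.
        return fromArray + "," + fromBuilder + "," + letters.length + "," + sb.length();
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```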
[Slide 7-10]
Like before, we enter the loop with the first character, D, which maps to L after offsetting by 8. This time, when we append L, a new memory location is not created. StringBuilders are mutable, so their state can change. We proceed to the next iteration of the loop and thus to the letter a. This maps to i after the offset; thus, we append i. Again, we do not create a new memory location! We can also note that, at this point, we have only written the character L one time. Continuing on, we go to the letter t and append the letter b to our StringBuilder. We will now fast forward to the end of the loop, and I pose the same questions to you as last time: How many character copies do you think we will make for this StringBuilder? How many memory entries? Take a moment to pause the video if you would like to determine the pattern for yourself.
[Slide 11-12]
Skipping ahead to the end of the loop, we have done 14 character copies when encoding the 14-character input string. The number of memory entries is slightly tricky in this context. Precisely at the end of the loop, we have one memory entry. Recall, however, that our return statement was slightly different: once we get to the return statement, we need to allocate a new space in memory for the String itself. Thus, by the very end of the program, we have used only two memory entries!
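The counts in this walkthrough can be checked with a small instrumented loop. The sketch below is not the slide's code: it simply appends each character of a 14-character input to one StringBuilder, counts the writes, and checks that append keeps handing back the same builder object. The input "Data Structure" is an assumption chosen only because it has 14 characters, and the names AppendCounter and walk are made up.

```java
// Sketch: counting character writes and confirming no new builder is created.
class AppendCounter {

    static String walk(String input) {
        StringBuilder ENC = new StringBuilder();   // first object: the builder
        int characterWrites = 0;
        boolean sameBuilder = true;

        for (int i = 0; i < input.length(); i++) {
            // append returns the builder it was called on, not a new object.
            sameBuilder &= (ENC.append(input.charAt(i)) == ENC);
            characterWrites++;                     // one character written per append
        }

        String result = ENC.toString();            // second object: the final String
        return "writes=" + characterWrites
                + ", sameBuilder=" + sameBuilder
                + ", resultLength=" + result.length();
    }

    public static void main(String[] args) {
        // Assumed 14-character input; it fits within the default capacity
        // of 16, so the builder never needs to grow during this loop.
        System.out.println(walk("Data Structure"));
    }
}
```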
[Slide 13]
Recall that when a String was used as our accumulator, an input string with one million characters filled one million and one memory addresses and required five hundred billion, five hundred thousand total character copies. With a StringBuilder as our accumulator, a string with one million characters requires one million character copies and only one memory entry! This does not take the final conversion to a String into account, since here we are only looking at the work done by the loop. We have a stark improvement!
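These totals follow from the counting model used in the walkthroughs: with a String accumulator, the k-th append rewrites all k characters, so n appends cost 1 + 2 + ... + n = n(n + 1) / 2 copies, while a StringBuilder writes each character once. The sketch below just evaluates that arithmetic; the class and method names are illustrative, and the model deliberately ignores any copying a StringBuilder does internally when its capacity grows.

```java
// Sketch: character-copy totals under the walkthrough's counting model.
class CharacterCopyModel {

    // String accumulator: 1 + 2 + ... + n copies in total.
    static long stringCopies(long n) {
        return n * (n + 1) / 2;
    }

    // StringBuilder accumulator: one write per appended character
    // (internal capacity growth is not counted in this model).
    static long builderCopies(long n) {
        return n;
    }

    public static void main(String[] args) {
        long n = 1_000_000L;
        System.out.println("String copies:        " + stringCopies(n));
        System.out.println("StringBuilder copies: " + builderCopies(n));
    }
}
```

For n = 1,000,000 the String total is 500,000,500,000, which matches the figure quoted above.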
Again, modern languages perform periodic cleanup, so you wouldn’t have one million memory locations filled in practice. In the next section, we will discuss a real-world comparison and the implications of using Strings over StringBuilders.