The rationale behind the Spartan desire to eliminate variables is that every variable places a burden on both the reader and the writer.
A variable that is used only once, is a perfect candidate for elimination. Simply replace the use of this variable with the expression that computes it. For example, instead of
double a = p.area();
return a;
you can write
return p.area();
In the following example, drawn from the source code of JFLEX, variable b
is used only once.
public int interpret(int[] in, int[] par) {
boolean b = boolexp.interpret(in,par);
if (b)
return exp1.interpret(in,par);
else
return exp2.interpret(in,par);
}
Eliminating this variable, we obtain
public int interpret(int[] in, int[] par) {
if (boolexp.interpret(in,par))
return exp1.interpret(in,par);
else
return exp2.interpret(in,par);
}
and by simple ternarization we get a single statement function
public int interpret(int[] in, int[] par) {
return boolexp.interpret(in,par) ? exp1.interpret(in,par) : exp2.interpret(in,par);
}
another ternarization steps yields
public int interpret(int[] in, int[] par) {
return (boolexp.interpret(in,par) ? exp1 : exp2).interpret(in,par);
}
Not all inlining opportunities should be exploited. Consider for example
for (a = 0; a < f(); a += g()) {
...
}
which may have resulted from inlining of a more explicit statement of the iteration boundaries
int f = f(), g = g();
for (a = 0; a f; a += g) {
...
}
The semantics of these two code snippets is different though. In the former, functions f
and g
are evaluated in every iteration, while in the latter, they are evaluated only once.
In the case that these functions' return value is not the same, the two loops will execute differently, and the variables should not be inlined.
Should we use the inlined version of the code in the case that the two functions are pure, that is return the same value in all calls? If the purity of the functions is not obvious, then inlining may confuse the reader, and should not be used.
But, what if purity is obvious? The inlined version is then at least as clear as the non-inlined, is shorter, and uses fewer variables. This version may also mean, at least with some compilers, slower execution. Hasting to optimize may not be a good idea though. Remember that premature optimization is a prime example of dangerous creativity and originality and that the loop may execute only a handful of times, or may not even be reached in most paths of execution.
Many routines define a variable to store their result. This variable can often be eliminated, by returning the result as soon as it is found.
Consider for example the following simple function, designed to check if a given token is in a list of keywords.
private boolean isKeyword(String token, String[] keywords) {
boolean result = false;
for (int i = 0; i < keywords.length; i++)
if (token.equals(keywords[i]))
result = true;
return result;
}
By returning the result as soon as we find it, we can eliminate variable result
. The resulting function is not only shorter but also a bit more efficient.
private boolean isKeyword(String token, String[] keywords) {
for (int i = 0; i < keywords.length; i++)
if (token.equals(keywords[i]))
return true;
return false;
}
Here is another example, taken from the code of the CLASSYCLE project.
private ResultRenderer getRenderer() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
ResultRenderer renderer = new DefaultResultRenderer();
if (_resultRenderer != null)
renderer = (ResultRenderer) Class.forName(_resultRenderer).newInstance();
return renderer;
}
which can be written simply as
private ResultRenderer getRenderer() throws InstantiationException, IllegalAccessException, ClassNotFoundException {
return _resultRenderer != null ? (ResultRenderer) Class.forName(_resultRenderer).newInstance() : DefaultResultRenderer();
}
A third example taken from the CLASSYCLE project is
public boolean isFulfilled(Vertex vertex) {
boolean result = false;
if (vertex != null) {
Attributes attributes = vertex.getAttributes();
if (attributes instanceof NameAttributes)
result = _pattern.matches(((NameAttributes) attributes).getName());
}
return result;
}
Returning the result as soon it is known also eliminates the first conditional
public boolean isFulfilled(Vertex vertex) {
if (vertex == null)
return false;
Attributes attributes = vertex.getAttributes();
return attributes instanceof NameAttributes ? _pattern.matches(((NameAttributes) attributes).getName()) : false;
}
Here is an even trickier example, drawn again from the CLASSYCLE project.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
boolean result = true;
if (indexInConstantParts < _constantParts.length) {
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
} else {
result = result && (_endsWithAnything || indexInString == string.length());
}
return result;
}
This example is quite tricky. The variable result
is used in several places, and the flow of control is a bit confusing. Let us therefore start by removing changing the order of the branches in the conditional statement following the shortest first technique.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
boolean result = true;
if (indexInConstantParts >= _constantParts.length)
result = result && (_endsWithAnything || indexInString == string.length());
} else {
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
}
return result;
}
It now becomes a bit clearer that we can return the correct value immediately in the first branch.
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
boolean result = true;
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
result = false;
break;
}
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
return result;
}
Examining now the inner conditional statement, we see that if the break is executed, then the returned value is false
, so we can rewrite the function as follows:
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
boolean result = true;
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
return false;
indexInString = index + constantPart.length();
result = matches(string, indexInString, indexInConstantParts + 1);
} while (result == false);
return result;
}
Now, we see that if the loop terminates normally, the value of result
must be true
. A closer inspection reveals that the value of result
is computed right before the loop ending condition. So, we can eliminate it altogether by writing
private boolean matches(String string, int indexInString, int indexInConstantParts) {
if (indexInConstantParts >= _constantParts.length)
return _endsWithAnything || indexInString == string.length();
String constantPart = _constantParts[indexInConstantParts];
do {
int index = string.indexOf(constantPart, indexInString);
if (index < 0 || (indexInString == 0 && !_startsWithAnything && index > 0)) {
return false;
indexInString = index + constantPart.length();
} while (!matches(string, indexInString, indexInConstantParts + 1));
return true;
}
A famous idiom for iterating over an array is
for (int i = 0; i < array.length; i++) {
... //Do something with array[i]
}
But, if the loop's body never uses variable i
nor modifies array[i]
, then the loop can be more concisely written as
for (T a: array) {
... //Do something with a
}
where T
is the type of elements of array array
.
The above example of searching through an array of keywords, can thus be rewritten to eliminate variable i
private boolean isKeyword(String token, String[] keywords) {
for (final String element : keywords)
if (element.equals(token))
return true;
return false;
}
The following represents a common structure of the main function of many file processing applications.
public static void main(String argv[]) {
for (int i = 0; i < argv.length; i++) {
try {
System.out.println("Processing ["+argv[i]+"]");
process(new FileReader(argv[i])));
System.out.println("No errors processing " + argv[i]);
} catch (Exception e) {
e.printStackTrace(System.out);
System.exit(1);
}
}
}
With this technique, it is simplified to the following
public static void main(String argv[]) {
for (final String f : argv) {
try {
System.out.println("Processing ["+f+"]");
process(new FileReader(f)));
System.out.println("No errors processing " + f);
} catch (Exception e) {
e.printStackTrace(System.out);
System.exit(1);
}
}
}
Old Java code iterating over collections uses an instance of class Iterator
to carry out the iteration. This iteration variable can be eliminated.
The following function is drawn from class SecureRandom
of package java.security
in the Java runtime library.
private static String getPrngAlgorithm() {
List provs = Providers.getProviderList().providers();
for (Iterator t = provs.iterator(); t.hasNext();) {
Provider p = (Provider) t.next();
for (Iterator u = p.getServices().iterator(); u.hasNext();) {
Service s = (Service) u.next();
if (s.getType().equals("SecureRandom")) {
return s.getAlgorithm();
}
}
}
return null;
}
In this function we have two iteration variables, t
and u
, both of which can be eliminated. Also, variable, provs
is used only once, so it is safe to eliminate as well.
The result is:
private static String getPrngAlgorithm() {
for (Provider p : Providers.getProviderList().providers())
for (Service s : p.getServices())
if (s.getType().equals("SecureRandom"))
return s.getAlgorithm();
return null;
}
The NOT metric is reduced by this simplification by 50% (from 113 to 56).
Not all opportunities for inlining should be exploited. Consider for example
for (a = 0; a < f(); a += g()) {
...
}
which may have resulted from inlining of a more explicit statement of the iteration boundaries
int bound = f(), step = g();
for (a = 0; a < bound; a += step) {
...
}
The semantics of these two code snippets is different though. In the former, functions f
and g
are evaluated in every iteration, while in the latter, they are evaluated only once.
In the case that these functions' return value is not the same, the two loops will execute differently, and the variables should not be inlined.
But, in the case that the two functions return the same value in all calls using the first form of the code may mean lost optimization opportunities. Choosing between the two versions is a matter of favoring between clarity of the design.
Sometimes, several steps are required to eliminate a variable. Consider for example the following constructor
protected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
Attribute[] temp = new Attribute[as.size()];
temp = as.toArray(temp);
for (int i = 0; i < temp.length; ++i)
temp[i] = as.get(i);
attributes = Sequence.make(temp);
}
We would like to eliminate both variable temp
and i
. This time, we cannot use an enhanced for loop
syntax, since this iteration mechanism does not allow one update the underlying collection.
However, the elimination of i
is rather easy if we remember that the List
interface offers a toArray
method, so we do not need variable i
to iterate over the array. The tricky part is that for reasons related to the implementation of generics in Java this method requires an array as a parameter. (There is also an overloaded version of this method which takes no parameters. Alas, this version returns an array of Object
s instead of an array of the right type.)
We thus write:
protected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
Attribute[] temp = as.toArray(new Attribute[as.size()]);
attributes = Sequence.make(temp);
}
Now, we can eliminate temp
altogether.
rotected Term(IndexFactory a, Predicate target, List<Attribute> as, Location loc) {
super(a, loc);
predicate = target;
this.loc = loc;
attributes = Sequence.make(as.toArray(new Attribute[as.size()]));
}
Finally, re-examining the specification of toArray
, we see that it does not need an array of the correct size, so instead
attributes = Sequence.make(as.toArray(new Attribute[as.size()]));
we can write
attributes = Sequence.make(as.toArray(new Attribute[0]));
The following example is taken from the Java run time library (class in package com.sun.corba.se.impl.activation
), and was obviously written before the new syntax of for
loops was introduced.
public int[] getActiveServers() {
ServerTableEntry entry;
int[] list = null;
synchronized (serverTable) {
// unlike vectors, list is not synchronized
ArrayList servers = new ArrayList(0);
Iterator serverList = serverTable.keySet().iterator();
try {
while (serverList.hasNext()) {
Integer key = (Integer) serverList.next();
// get an entry
entry = (ServerTableEntry) serverTable.get(key);
if (entry.isValid() && entry.isActive()) {
servers.add(entry);
}
}
} catch (NoSuchElementException e) {
// all done
}
// collect the active entries
list = new int[servers.size()];
for (int i = 0; i < servers.size(); i++) {
entry = (ServerTableEntry) servers.get(i);
list[i] = entry.getServerId();
}
}
if (debug) {
StringBuffer sb = new StringBuffer();
for (int ctr = 0; ctr < list.length; ctr++) {
sb.append(' ');
sb.append(list[ctr]);
}
System.out.println("ServerManagerImpl: getActiveServers returns" + sb.toString());
}
return list;
}
Concentrating on the first loop,
Iterator serverList = serverTable.keySet().iterator();
try {
while (serverList.hasNext()) {
Integer key = (Integer) serverList.next();
// get an entry
entry = (ServerTableEntry) serverTable.get(key);
if (entry.isValid() && entry.isActive()) {
servers.add(entry);
}
}
} catch (NoSuchElementException e) {
// all done
}
we see that both variable key
and variable serverList
can be eliminated
try {
for (ServerTableEntry e : serverTable.values())
if (e.isValid() && e.isActive())
servers.add(e);
} catch (NoSuchElementException e) {
// all done
}
In the second loop,
// collect the active entries
list = new int[servers.size()];
for (int i = 0; i < servers.size(); i++) {
entry = (ServerTableEntry) servers.get(i);
list[i] = entry.getServerId();
}
we cannot eliminate any variables. Yet, the iteration is simplified by using the new syntax.
// collect the active entries
list = new int[servers.size()];
int i = 0;
for (ServerTableEntry e : servers)
list[i++] = e.getServerId();
The third looping instruction
for (int ctr = 0; ctr < list.length; ctr++) {
sb.append(' ');
sb.append(list[ctr]);
}
makes it possible to eliminate variable ctr
:
for (int id : list)
sb.append(' ').append(id);
Applying all these modifications, and passing an appropriate generic parameter to the local variable servers, the function now looks as follows:
public int[] getActiveServers() {
int[] list;
synchronized (serverTable) {
// unlike vectors, list is not synchronized
ArrayList servers = new ArrayList(0);
try {
for (ServerTableEntry e : serverTable.values())
if (e.isValid() && e.isActive())
servers.add(e);
} catch (NoSuchElementException e) {
// all done
}
// collect the active entries
list = new int[servers.size()];
int i = 0;
for (ServerTableEntry e : servers)
list[i++] = e.getServerId();
}
if (debug) {
StringBuffer sb = new StringBuffer();
for (int id : list)
sb.append(' ').append(id);
System.out.println("ServerManagerImpl: getActiveServers returns" + sb.toString());
}
return list;
}
In total, the NOT metric is reduced from 235 to 164 (30% reduction).