JavaLib C++ Library
review and upgrades
by Curtis Krauskopf
Download the JavaLib Library (JavaLib_1_0_1.zip, 127k)
A long time ago, I had been working on a project that
was almost 100% Java for almost a year. During that
time, I became very familiar with the Java class hierarchy.
The people that wrote the Java programming language
used C and C++ as a reference. In doing so, they took
what they considered were the best features of C and
C++ while avoiding many of the pitfalls. [1]
When the Java project was finished and I resumed programming
in C++, I missed the Java class hierarchy that I had
become accustomed to. Standardized containers I had
taken for granted in Java, such as a hash table, were
not standardized in C++.
I searched the web for a Java-like library that was
written in C++. From the libraries I found, I liked
the ones at Michael Olivero's website (www.mike95.com)
best.
JavaLib
Mike95.com hosts four Java-like C++ libraries:
- JHashtable
- JString
- JVector
- JStack
The things I liked best about the libraries are that
the source code required very few changes to be Borland
C++ Builder compatible, the source files were small,
and the classes were independent instead of trying to
strictly follow the Java class hierarchy of deriving
objects from a single class called Object.
The Mike95.com classes allowed me to use both primitive
C++ types and my own classes without having to inherit
a common Object class.
Over the years, Michael has upgraded the library.
However, sometime in 2005 the library started to fall
into disrepair with version control problems. For example,
the source code shown in the code listings is different
from the source code that is downloaded. The downloadable
code has suffered too, because changes that had been
made to previous revisions were lost in subsequent versions.
The library is also starting to show its age. The
Java language deprecated some of the containers' method
names years ago but they were never changed in the Mike95.com
library. Even though it's trivial to change a class
method name, the anachronisms inhibit adoption of the
library by newcomers.
The value of the library remains, though, in providing
a very simple (and portable) implementation of hash
tables, vectors, stacks and memory-managed strings.
Changes
The changes I've implemented are based on previous versions
of the Mike95.com Java-like C++ libraries and upon the
current version. In some cases, I reverted to using the
previous implementation because of performance or simplicity
reasons.
Runtime errors were originally reported using cerr.
This wasn't suitable because I didn't always want errors
reported through cerr. Instead, I added a JAVALIB_USE_CERR
compile-time symbol. When JAVALIB_USE_CERR is defined,
errors are reported through cerr. When it is not
defined, the methods throw exceptions for
runtime errors.
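The compile-time switch could be sketched roughly as follows. This is only an illustration: the helper names report_error and checked_divide are hypothetical and are not taken from the library. Only the JAVALIB_USE_CERR symbol comes from the library itself.

```cpp
#include <cstdlib>
#include <iostream>

// Hypothetical helper (not the library's actual code): reports a runtime
// error through cerr when JAVALIB_USE_CERR is defined, otherwise throws.
inline void report_error(const char *message)
{
#ifdef JAVALIB_USE_CERR
    std::cerr << message << std::endl;
    std::exit(1);   // the cerr path ends the program
#else
    throw message;  // thrown as const char * in this sketch
#endif
}

// Hypothetical checked operation built on the helper.
inline int checked_divide(int numerator, int denominator)
{
    if (denominator == 0) {
        report_error("checked_divide: division by zero");
    }
    return numerator / denominator;
}
```

With the symbol left undefined, a caller wraps the call in try/catch and catches const char *. Defining JAVALIB_USE_CERR on the compiler command line switches the same code to the cerr path without touching the call sites.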
The library uses a header file, called m95_types.h,
to provide typedefs for often-used types: unsigned
int (UINT), false,
true and bool.
Because every modern C++ compiler now supports bool
and because
of the problems I've had using other libraries that
also define common types [2],
I've removed all of the Mike95.com-specific types from
the library. This means I removed the #includes
for the m95_types.h file
as well.
To prevent namespace collisions, I've wrapped the
four Java-like C++ libraries in a Java::
namespace and used the original Java class name.
A coding horror was also removed. [3]
Some of the .h files contained
using namespace std;
Oddly enough, removing the using
statement did not cause any of the libraries to fail
to compile.
JHashtable
The original Java-like C++ container was called JHashtable.
The namespace allowed me to use the simpler name: Hashtable.
Java::Hashtable is a hash table container that
uses a public interface that is similar to a Java Hashtable.
The Java language provides a built-in hashing function
for all of its objects. [4]
Because that feature doesn't exist in C++, the Java::Hashtable
implementation requires the user to provide a hash function.
The original JHashtable
implementation on Mike95.com required the user to call
setHashFunction() to provide
a hashing function. One problem I had when using the
library was that I would often forget to set the hashing
function, and that would lead to runtime disasters.
JHashtable also had a problem when the objects being
stored in the container were pointers, like this:
JHashtable < MyClass * > ...
Internally, when the library tried to detect if a hash
table entry didn't exist yet, it would try to compare
uninitialized pointers -- which of course causes problems.
Both of these problems were addressed in my new constructor
for Java::Hashtable:
template <class KeyType, class ObjType>
Java::Hashtable(
    UINT(*func)(const KeyType&),
    const ObjType &null_item,
    UINT initialCapacity = 101,
    float loadFactor = 0.5f
) throw (char *);
This implementation requires that the hashing function
and the definition of a null (undefined) item be passed
to the container. Since it's in the constructor, there
is no way to forget to provide either one.
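A hash function with the shape this constructor expects for char * keys might look like the sketch below. It is not part of the library: the name hash_c_string is hypothetical, and the algorithm (32-bit FNV-1a) is simply one well-known choice. The constants assume that unsigned int is 32 bits wide.

```cpp
// Hypothetical hash function for C-string keys (32-bit FNV-1a).
// The signature matches unsigned int (*)(const KeyType&) with
// KeyType = char *.
unsigned int hash_c_string(char * const &key)
{
    unsigned int hash = 2166136261u;             // FNV offset basis
    for (const char *p = key; *p != '\0'; ++p) {
        hash ^= static_cast<unsigned char>(*p);  // mix in one byte
        hash *= 16777619u;                       // FNV prime
    }
    return hash;
}
```

Because the function reads the characters rather than the pointer value, two separate arrays holding the same text hash to the same value.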
This constructor keeps two features of the original
JHashtable constructor:
the ability to specify the initial number of hash table
bins and to specify the load factor. The load factor
is defined as:
.. a measure of how full the hash table is
allowed to get before its capacity is automatically
increased. The initial capacity and load factor
parameters are merely hints to the implementation.
The exact details as to when and whether the rehash
method is invoked are implementation-dependent.
[5]
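As a rough illustration of the load factor, the check below treats the table as due for a rehash once the number of entries exceeds capacity times load factor. The function name is hypothetical, and the exact rule used inside Java::Hashtable may differ, since the quoted documentation leaves it implementation-dependent.

```cpp
// Hypothetical check, not the library's code: true when the entry count
// has grown past the fraction of the capacity allowed by the load factor.
inline bool exceeds_load_factor(unsigned int entries,
                                unsigned int capacity,
                                float loadFactor)
{
    return static_cast<float>(entries) >
           static_cast<float>(capacity) * loadFactor;
}
```

With the constructor defaults of 101 bins and a load factor of 0.5, this check first becomes true at the 51st entry.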
Listing A shows a typical
use of the JHashtable container.
Listing A: JHashtable Example
#include <assert.h>
#include <stdio.h>
#include "JHashTable.h"

typedef char * key_type;
typedef int value_type;

// Every key lands in the same bin, so this is for demonstration only.
unsigned int hash_char_ptr(const key_type& /* key */)
{
    return 5; // world's second worst hash algorithm
}

int main(int, char**)
{
    // The second argument is the null (undefined) item for the value type.
    Java::Hashtable <key_type, value_type> translator(hash_char_ptr, 0);
    assert(translator.isEmpty());
    translator.put("one", 1);
    assert(translator.containsKey("one"));
    assert(!translator.containsKey("three"));
    printf("one equals %d\n", translator.get("one"));
    return 0;
}
~\Regression\HashTable_Example.cpp
Pointer Failures
JHashtable's comparison function
(containsKey) used operator==()
to compare objects. This works fine most of the time.
One situation where it failed for me was when I created
a hash table in which the value being stored was a pointer
to a character array. This led to a situation in which
the pointer to the character array was being compared
for equality instead of what the pointer pointed at. A
simple example of this is:
const char *hello1 = "hello";
char hello2[] = "hello"; // a separate array holding the same characters
if (hello1 == hello2) {
    cout << "They are the same\n";
}
else {
    cout << "They are different\n";
}
In the above code sample, the comparison tests the two
addresses rather than the characters they point at, so "They
are different" will always be the output.
Solutions
I introduced a new method, IsEqual,
that compares Hashtable keys
for equality. Because of that, I was able to use template
specialization to create an instance of IsEqual
tailored for char * types.
Listing B contains the code for
the generic IsEqual template
and for the specialized char * template.
Listing B: IsEqual Implementation
// cdk: added these helper functions for testing when keys are equal.
#include <cstring>

// Function base template
template <typename T>
bool IsEqual(T const &first, T const &second)
{
    return first == second;
}

// Specialize for type 'char *': compare the characters, not the addresses.
// The parameter types must match the base template with T = char *.
template <>
inline bool IsEqual<char *>(char * const &first, char * const &second)
{
    return (std::strcmp(first, second) == 0);
}
Code snippet from JHashtable.cpp
Earlier versions of JHashtable
used a private variable, called hash,
to store the most recently hashed key value. This provided
convenience for the designer in not having to pass around
the hash value for various JHashtable
library functions.
Subsequent versions of JHashtable
implemented a stricter approach to handling the hash
value that required recalculating the hash value. In
some cases, the hash value would be calculated up to
three times for a single public method call. I chose
to revert to the previous functionality for performance
reasons. My profiling showed that the hashing algorithm
was the most expensive part of Java::Hashtable
-- so caching the hash value seemed to be the best solution.
Hashtable wish list
Despite its capabilities, Hashtable
is incomplete. I would like to see the following features
added to Hashtable:
- use operator[]() to allow syntax like:
translator["one"] = 1;
- provide default hash functions for char * string arrays
and Java::String objects
- mimic the Java Hashtable's toString() method, which
returns a string representation of the map with
entries enclosed in braces and separated by ", "
(comma and space). [5]
JString
Memory-managed char array classes are the centerpiece
of most C++ programming texts. JString
is similar to those introductory classes by providing
all of the expected operator overloading. Table
1 shows many of the Java-like methods available in
JString.
Table 1: Java::String methods
Method | Description
charAt | Returns the character at the specified location. Character indexes are 0-based (i.e., the first character in a string is at index 0).
compareTo | Lexicographically compares two strings.
endsWith | Tests if this string ends with the specified suffix.
equalsIgnoreCase | Compares strings while ignoring capitalization.
indexOf | Returns the offset of the first occurrence of the specified character or string.
lastIndexOf | Returns the offset of the last occurrence of the specified character or string.
length | Same as strlen().
startsWith | Tests if this string starts with the specified prefix.
substring | Creates a slice of the string using the specified starting point and range.
toLowerCase | Converts the string to lower case.
trim | Removes leading and trailing whitespace.
Changes
In keeping with the spirit of using the Java namespace,
the JString class was renamed
to String.
The JString class provides
a cstr() method that returns
a const char * to the internal
null-terminated array. I added a c_str()
method to Java::String
that does the same thing because that is the name
standardized by the std::string class in
the C++ Standard Library.
Another change I made was to add constructors that convert
native integer types to
Java::String when no ambiguity
was involved. Table 2 shows the
constructors that convert variations of
integers to Java::String.
Table 2: Java::String integer conversion constructors
explicit String( const unsigned char );
explicit String( const short );
explicit String( const unsigned short );
explicit String( const int );
explicit String( const unsigned int );
explicit String( const long );
explicit String( const unsigned long );
Good programming practices use explicit constructors
(Table 2) to prevent hidden type
conversions.
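The effect of explicit can be shown with a small stand-in class. Label and label_length are hypothetical names used only for this sketch; they are not part of JavaLib.

```cpp
#include <cstddef>
#include <sstream>
#include <string>

// Hypothetical stand-in for a string class with an integer constructor.
class Label
{
public:
    // explicit: an int is converted only when the caller asks for it.
    explicit Label(int value)
    {
        std::ostringstream out;
        out << value;
        text_ = out.str();
    }
    const std::string &text() const { return text_; }
private:
    std::string text_;
};

// Takes a Label, so an int argument would need a conversion.
inline std::size_t label_length(const Label &label)
{
    return label.text().size();
}

// label_length(42);         // does not compile: no hidden int -> Label
// label_length(Label(42));  // compiles: the conversion is spelled out
```

Without the explicit keyword, the first commented-out call would compile and quietly build a temporary Label, which is the kind of hidden conversion the constructors in Table 2 are meant to rule out.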
An exception is thrown in operator[]()
when the index is out of bounds for the character array.
When JAVALIB_USE_CERR
is defined, a message is sent to cerr
and the program immediately terminates.
The Java::String::substring
method can also emit a cerr
message, but when JAVALIB_USE_CERR
is not defined, the substring
method silently adjusts the out-of-bounds right-hand
index to the length of the string. I haven't decided
if this behavior is good or not; it deviates from the
Java standard [6] but the
one time I needed it saved me a lot of bookkeeping in
testing if the substring boundary was beyond the end
of the string. For consistency's sake, I should adopt
one behavior or the other.
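The silent adjustment can be sketched with std::string. The function below is hypothetical and only mirrors the behavior described above. It assumes Java-style begin and end indexes, and its handling of an out-of-range starting index (returning an empty string) is my assumption, not something the library documents.

```cpp
#include <cstddef>
#include <string>

// Hypothetical sketch: returns the slice [begin, end) of text, clamping
// an out-of-bounds end index to the length of the string.
inline std::string clamped_substring(const std::string &text,
                                     std::size_t begin,
                                     std::size_t end)
{
    if (end > text.size()) {
        end = text.size();  // silently adjust the right-hand index
    }
    if (begin > end) {
        begin = end;        // assumption: yields an empty string
    }
    return text.substr(begin, end - begin);
}
```

The convenience is that a caller can ask for "everything from index 2 onward" by passing any large end index, at the cost of hiding a mistake that Java would report with an exception.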
That’s not a bug… it’s a feature!
One noteworthy feature is that one variation of operator[]()
returns a reference. This allows us to use operator[]()
as the left-hand side of an expression, as in the
following code:
Java::String symbol = "Go";
symbol[0] = 'D';
assert(symbol == "Do");
Because the index value for operator[]()
is checked at runtime, the following code will
either throw an exception or emit a runtime error and
terminate the program (depending on the status of JAVALIB_USE_CERR):
Java::String symbol = "Go";
symbol[2] = 'X';
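A checked operator[]() that returns a reference can be sketched as follows. CheckedText is a hypothetical class built on std::string, and it throws std::out_of_range rather than the char * exception that the library uses.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

// Hypothetical sketch of a string wrapper with a bounds-checked,
// assignable operator[].
class CheckedText
{
public:
    explicit CheckedText(const std::string &text) : text_(text) {}

    // Returning char & lets the result appear on the left of '='.
    char &operator[](std::size_t index)
    {
        if (index >= text_.size()) {
            throw std::out_of_range("CheckedText: index out of bounds");
        }
        return text_[index];
    }

    const std::string &str() const { return text_; }

private:
    std::string text_;
};
```

The check runs before the reference is handed out, so an out-of-bounds write is rejected before any memory is touched.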
The lack of dependencies between the JavaLib libraries
is evident in that Java::String
does not provide a hash algorithm.
Another undocumented deficiency is that operator>>()
has a maximum buffer size of 2048 bytes. Users
who are concerned about buffer overruns should not use
operator>>() in
Java::String.
JString.cpp needs to
be compiled to generate an .obj
file. Most of the other Java libraries are template-based
and the .cpp is #included
by the .h file. The distribution
file (JavaLib_1_0_1.zip) contains a library file
for easily using the Java::String
in your own project.
JVector
JVector was renamed to Java::Vector
when the namespace was applied.
Java::Vectors use a template
parameter for the type being stored. The example program
in Listing C shows a quick
example of a Java::Vector
of char *.
Listing C: Java::Vector Example
#include "JVector.h"
#include <iostream>

int main(int, char**)
{
    typedef char * ptr_type;
    Java::Vector<ptr_type> pointers;
    pointers.addElement("hello");
    pointers.addElement("world");
    for(unsigned int i = 0; i < pointers.size(); ++i) {
        std::cout << pointers[i] << std::endl;
    }
    return 0;
}
~\Regression\Vector_Test.cpp
Java::Vectors default
to having an initial capacity of 10 elements and the
growth rate is also 10 elements. The user can specify
customized initial conditions in the constructor.
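The fixed-increment growth policy can be illustrated with a small calculation. The function below is a hypothetical model of the policy, not code from JVector: it returns the capacity a vector would reach in order to hold a given number of elements.

```cpp
#include <cstddef>

// Hypothetical model of fixed-increment growth: start at `initial`
// slots and add `increment` slots each time the vector is full.
inline std::size_t capacity_for(std::size_t count,
                                std::size_t initial = 10,
                                std::size_t increment = 10)
{
    if (increment == 0) {
        increment = 1;  // guard so the loop always terminates
    }
    std::size_t capacity = initial;
    while (capacity < count) {
        capacity += increment;
    }
    return capacity;
}
```

Under this model, the default settings reallocate once for every 10 elements added, so a vector that is expected to grow large is probably better served by passing a larger initial capacity or growth rate to the constructor.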
Table 3 shows the list of methods
I added to JVector to modernize
the Java::Vector interface.
Table 3: Java::Vector Additions
- add
- clear
- remove
- get
Many of the Java::Vector
methods throw exceptions for out-of-bound indexes or
emit error messages to cerr.
All of the methods that can throw an exception are documented
in the JVector.h file
with a throw(char*) modifier
in the function declaration.
JStack
The JStack collection implements
a standard Java-like stack container on a user-specified
data type. I renamed JStack
to Java::Stack to be consistent
with the other libraries.
Like JString, JStack.cpp
needs to be compiled to generate an
.obj file.
An interesting note is that JStack
does not use Java::Vector.
Instead, it uses a protected
class (Java::Stack::node)
to do the stack bookkeeping.
Two methods, peek() and
pop(), will throw a char
* exception when the stack is empty or they will
emit an error message when JAVALIB_USE_CERR
is defined.
The original implementation for JStack's
empty() method did the same thing as a modern clear()
method in Java containers. The Java 1.5.0 documentation
[7] uses empty()
to test if the container is empty. In that spirit, I've
renamed the original JStack
empty() to Java::Stack::clear()
and implemented Java::Stack::empty()
to do the same thing as JStack's
isEmpty() method.
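The pieces described in this section (a private node type, peek() and pop() that throw on an empty stack, and the empty() and clear() pair) fit together roughly as in the sketch below. NodeStack is a hypothetical class written for illustration. It is not the Java::Stack source, and it throws const char * where the library throws char *.

```cpp
// Hypothetical linked-node stack, loosely modeled on the description
// of Java::Stack in this article.
template <typename T>
class NodeStack
{
public:
    NodeStack() : top_(0) {}
    ~NodeStack() { clear(); }

    void push(const T &item) { top_ = new Node(item, top_); }

    // Returns the top item without removing it.
    T &peek()
    {
        if (top_ == 0) throw "NodeStack::peek: stack is empty";
        return top_->item;
    }

    // Removes and returns the top item.
    T pop()
    {
        if (top_ == 0) throw "NodeStack::pop: stack is empty";
        Node *old = top_;
        T item = old->item;
        top_ = old->next;
        delete old;
        return item;
    }

    bool empty() const { return top_ == 0; }  // tests, does not modify

    void clear()                              // removes every item
    {
        while (top_ != 0) {
            Node *old = top_;
            top_ = old->next;
            delete old;
        }
    }

private:
    struct Node
    {
        Node(const T &i, Node *n) : item(i), next(n) {}
        T item;
        Node *next;
    };
    Node *top_;

    NodeStack(const NodeStack &);             // copying disabled
    NodeStack &operator=(const NodeStack &);
};
```

Keeping the bookkeeping in a small node type is what lets a stack like this stand alone, with no dependency on a vector class.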
Documentation
The documentation for the JavaLib classes is poor. The
hyperlink provided on the Mike95.com site does not resolve
to the Java documentation on java.sun.com
anymore. Instead, I recommend using the hyperlinks in
the references section at the end of this article.
Installation
Installation of the new JavaLib is very simple. The library
is available in the file JavaLib_1_0_1.zip
(127k). Unzip it to a folder and add the folder to
the include path for your project.
Two of the libraries, Java::String
and Java::Stack, need to
be compiled to generate a library. A default library
using reasonable compiler settings is provided in the
distribution file but a project file is available to
customize the library.
Next, #include whichever
JavaLib containers you want to use. If you're using
either Java::String or
Java::Stack, you'll need
to add the JavaLib.lib
file to the project too.
The segregation (and lack of dependency) of the JavaLib
containers means that using one container does not pull in code from the others.
The Java::Stack implementation,
for example, is about as bare-bones as possible.
The lack of dependencies also aids in porting the library
to other platforms. Instead of having to port the entire
library at one time, you can concentrate on just the
parts of the library you need on the new platform.
Overall wishes
I wish that the JavaLib library had better conformance
to the modern Java API. I have modified the JavaLib library
to add modern method names to Java::String
and Java::Stack, but my work
is incomplete and I only implemented the method names
I use.
JavaLib could also be improved by implementing Set
and List classes. There
are other Java containers that would also be useful.
Implementing an exception class hierarchy that mimicked
the Java classes would be helpful too. Then all of the
throws could be changed
from char * to more meaningful
exceptions.
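One possible shape for such a hierarchy is sketched below. Everything here is hypothetical: the JavaSketch namespace, the class layout and the check_index helper are illustrations, and none of them exist in JavaLib. The class names follow their Java counterparts, and the base class derives from std::runtime_error so that ordinary C++ handlers still catch it.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace JavaSketch {  // hypothetical namespace for this sketch

class RuntimeException : public std::runtime_error
{
public:
    explicit RuntimeException(const std::string &message)
        : std::runtime_error(message) {}
};

class IndexOutOfBoundsException : public RuntimeException
{
public:
    explicit IndexOutOfBoundsException(const std::string &message)
        : RuntimeException(message) {}
};

class EmptyStackException : public RuntimeException
{
public:
    EmptyStackException() : RuntimeException("stack is empty") {}
};

// Hypothetical helper a container could call before indexing.
inline void check_index(std::size_t index, std::size_t size)
{
    if (index >= size) {
        throw IndexOutOfBoundsException("index out of bounds");
    }
}

} // namespace JavaSketch
```

A caller could then catch JavaSketch::IndexOutOfBoundsException specifically, or catch std::exception and read what(), instead of catching a bare char *.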
When JAVALIB_USE_CERR
is defined and an error message is emitted, the program
automatically terminates using exit(1).
This is probably not desirable in some situations even
when exceptions are not being used. Another preprocessor
symbol, called JAVALIB_EXIT_ON_ERROR
should probably be defined to provide more customization
to the library's behavior.
Casts are littered throughout the code. The library
should be recoded to use static_cast
or dynamic_cast for most
of the casts. While this won't affect performance, it
will affect maintainability.
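As a small illustration of the two named casts, consider the sketch below. The types and functions are hypothetical and are unrelated to the JavaLib source. Note that dynamic_cast performs a run-time type check, whereas static_cast is resolved at compile time.

```cpp
// Hypothetical types used only to demonstrate the casts.
struct Base { virtual ~Base() {} };
struct Derived : Base
{
    int value;
    Derived() : value(7) {}
};
struct Other : Base {};

// dynamic_cast yields a null pointer when the object is not a Derived.
inline int value_or_default(Base *base, int fallback)
{
    Derived *derived = dynamic_cast<Derived *>(base);
    return derived ? derived->value : fallback;
}

// static_cast documents an intentional numeric conversion.
inline int percent(int part, int whole)
{
    return static_cast<int>(100.0 * part / whole);
}
```

Named casts are also easy to find with a text search, which is much harder with parenthesized C-style casts.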
Update (April 18, 2007):
A JavaLib user named Pawel reported two problems
in the JVector class that caused compile-time problems.
One problem was a minor syntax issue and the other
was a missing namespace name. Both problems have
been fixed in the library. The JavaLib library's
version has been upgraded to 1.0.1.
Conclusion
JavaLib is a small library of Java-like classes written
in C++. The size of the library promotes portability and
maintainability by end-users. Even though the library
has not stayed in lock-step with the changes in the Java
API, those changes are trivial to implement.
The small size of the library also means that only
the most commonly used methods are implemented. For
the right project, JavaLib can be quite helpful in providing
containers that are much simpler to use than the STL
but provide all of the power of template programming.
References
1. http://java.sun.com/docs/white/langenv/Intro.doc2.html
2. Borland C++ Builder Developer’s Journal, “Opt 3.19”, November 2006.
3. http://www.parashift.com/c++-faq-lite/coding-standards.html#faq-27.5
4. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Object.html#hashCode()
5. http://java.sun.com/j2se/1.5.0/docs/api/java/util/Hashtable.html
6. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int,%20int)
7. http://java.sun.com/j2se/1.5.0/docs/api/java/util/Stack.html
Curtis Krauskopf is a software
engineer and the president of The Database Managers (www.decompile.com).
He has been writing code professionally for over 25 years. His prior projects
include multiple web e-commerce applications, decompilers
for the DataFlex language, aircraft simulators, an automated Y2K conversion
program for over 3,000,000 compiled DataFlex programs, and inventory control projects.
Curtis has spoken at many domestic and international DataFlex developer conferences
and has been published in FlexLines Online, JavaPro
Magazine, C/C++
Users Journal and C++ Builder Developer's Journal.