The Database Managers, Inc.


JavaLib C++ Library

review and upgrades
by Curtis Krauskopf

Download the JavaLib Library (JavaLib_1_0_1.zip 127k)

A long time ago, I worked on a project that was almost 100% Java for almost a year. During that time, I became very familiar with the Java class hierarchy. The people who designed the Java programming language used C and C++ as a reference. In doing so, they took what they considered to be the best features of C and C++ while avoiding many of the pitfalls. [1]

When the Java project was finished and I resumed programming in C++, I missed the Java class hierarchy that I had become accustomed to. Standardized containers I had taken for granted in Java, such as a hash table, were not standardized in C++.

I searched the web for a Java-like library that was written in C++. From the libraries I found, I liked the ones at Michael Olivero's website (www.mike95.com) best.

JavaLib

Mike95.com hosts four Java-like C++ libraries:
  • JHashtable
  • JString
  • JVector
  • JStack

The things I liked best about the libraries are that the source code required very few changes to be Borland C++ Builder compatible, the source files were small, and the classes were independent instead of trying to strictly follow the Java class hierarchy of deriving objects from a single class called Object. The Mike95.com classes allowed me to use both primitive C++ types and my own classes without having to inherit a common Object class.

Over the years, Michael has upgraded the library. However, sometime in 2005 the library started to fall into disrepair because of version control problems. For example, the source code shown in the code listings differs from the source code that is downloaded. The downloadable code has suffered too: changes that had been made in earlier revisions were lost in subsequent versions.

The library is also starting to show its age. The Java language deprecated some of the containers' method names years ago but they were never changed in the Mike95.com library. Even though it's trivial to change a class method name, the anachronisms inhibit adoption of the library by newcomers.

The value of the library remains, though, in providing a very simple (and portable) implementation of hash tables, vectors, stacks and memory-managed strings.

Changes

The changes I've implemented are based on previous versions of the Mike95.com Java-like C++ libraries and upon the current version. In some cases, I reverted to using the previous implementation because of performance or simplicity reasons.

Runtime errors were originally reported using cerr. This wasn't suitable because sometimes I didn't want to use cerr to report errors. The alternative I coded was to define a JAVALIB_USE_CERR compile-time symbol to report errors through cerr. When JAVALIB_USE_CERR was not defined, I coded the methods to throw exceptions for runtime errors.
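The switch described above can be sketched roughly as follows. Only the JAVALIB_USE_CERR symbol comes from the library; the sketch namespace and the report_error and checked_get helpers are hypothetical names used for illustration, and the real library may organize its error handling differently.

```cpp
#include <cstdlib>
#include <iostream>

namespace sketch {

// Hypothetical helper: one place that decides how a runtime error surfaces.
inline void report_error(const char *message)
{
#ifdef JAVALIB_USE_CERR
    // Symbol defined: print the message and stop the program.
    std::cerr << message << std::endl;
    std::exit(1);
#else
    // Symbol not defined: throw and let the caller decide what to do.
    throw message;
#endif
}

// Hypothetical caller: returns the element at index or reports an error.
inline int checked_get(const int *items, unsigned int count, unsigned int index)
{
    if (index >= count) {
        report_error("index out of bounds");
    }
    return items[index];
}

} // namespace sketch
```

Compiled without JAVALIB_USE_CERR, an out-of-range call to checked_get throws a const char * that the caller can catch. Compiled with the symbol defined, the same call prints the message and exits.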

The library uses a header file, called m95_types.h, to provide typedefs for often-used types: unsigned int (UINT), false, true and bool. Because none of the modern C++ compilers are ignorant of bool anymore and because of the problems I've had using other libraries that also define common types [2], I've removed all of the Mike95.com-specific types from the library. This means I removed the #includes for the m95_types.h file as well.

To prevent namespace collisions, I've wrapped the four Java-like C++ libraries in a Java:: namespace and used the original Java class name.

A coding horror was also removed. [3] Some of the .h files contained

using namespace std;

Oddly enough, removing the using statement did not cause any of the libraries to fail to compile.

JHashtable

The original Java-like C++ container was called JHashtable. The namespace allowed me to use the simpler name: Hashtable. Java::Hashtable is a hash table container that uses a public interface that is similar to a Java Hashtable.

The Java language provides a built-in hashing function for all of its objects. [4] Because that feature doesn't exist in C++, the Java::Hashtable implementation requires the user to provide a hash function. The original JHashtable implementation on Mike95.com required the user to call setHashFunction() to provide a hashing function. One problem I had when using the library was that I would often forget to set the hashing function, and that would lead to runtime disasters.

JHashtable also had a problem when the objects being stored in the container were pointers, like this:

JHashtable < MyClass * > ...

Internally, when the library tried to detect if a hash table entry didn't exist yet, it would try to compare uninitialized pointers -- which of course causes problems.

Both of these problems were addressed in my new constructor for Java::Hashtable:

template <class KeyType, class ObjType>
Java::Hashtable( 
    unsigned int (*func)(const KeyType&), 
    const ObjType &null_item, 
    unsigned int initialCapacity = 101, 
    float loadFactor = 0.5f 
) throw (char *);

This implementation requires that the hashing function and the definition of a null (undefined) item be passed to the container. Since it's in the constructor, there is no way to forget to provide either one.
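To illustrate why putting both arguments in the constructor removes the "forgot to set it" failure, here is a minimal sketch. TinyTable and hash_int are hypothetical names, and the chained std::vector storage is an assumption made for brevity. It is not the Java::Hashtable source.

```cpp
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a hash table; not the JavaLib implementation.
template <class KeyType, class ObjType>
class TinyTable {
public:
    typedef unsigned int (*HashFunc)(const KeyType &);

    // No default constructor: the hash function and the null item are mandatory.
    // Assumes capacity is greater than zero.
    TinyTable(HashFunc func, const ObjType &null_item, unsigned int capacity = 101)
        : hash_(func), null_item_(null_item), bins_(capacity) {}

    // Stores the value, replacing any value already stored under the key.
    void put(const KeyType &key, const ObjType &value)
    {
        Bin &bin = bins_[hash_(key) % bins_.size()];
        for (std::size_t i = 0; i < bin.size(); ++i) {
            if (bin[i].first == key) { bin[i].second = value; return; }
        }
        bin.push_back(std::make_pair(key, value));
    }

    // Returns the stored value, or the caller-supplied null item when absent.
    const ObjType &get(const KeyType &key) const
    {
        const Bin &bin = bins_[hash_(key) % bins_.size()];
        for (std::size_t i = 0; i < bin.size(); ++i) {
            if (bin[i].first == key) return bin[i].second;
        }
        return null_item_;
    }

private:
    typedef std::vector<std::pair<KeyType, ObjType> > Bin;

    HashFunc hash_;
    ObjType null_item_;
    std::vector<Bin> bins_;
};

// A trivial hash for int keys, used only by this example.
inline unsigned int hash_int(const int &key)
{
    return static_cast<unsigned int>(key);
}
```

Because the class has no other constructor, code such as TinyTable<int, int> table; does not compile, so the mistake is caught at build time instead of at run time. A lookup for a missing key returns the null item that the caller chose, so the table never has to compare against an uninitialized value.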

This constructor keeps two features of the original JHashtable constructor: the ability to specify the initial number of hash table bins and to specify the load factor. The load factor is defined as:

.. a measure of how full the hash table is allowed to get before its capacity is automatically increased. The initial capacity and load factor parameters are merely hints to the implementation. The exact details as to when and whether the rehash method is invoked are implementation-dependent. [5]
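One common reading of that definition is a simple threshold test. The sketch below is an assumption about how such a check might look; the helper name needs_rehash is hypothetical, and the quoted documentation explicitly leaves the exact policy to the implementation.

```cpp
// Hypothetical helper: true when adding one more entry would push the table
// past the fraction of its capacity that the load factor allows.
inline bool needs_rehash(unsigned int entries, unsigned int capacity, float load_factor)
{
    return static_cast<float>(entries + 1) > static_cast<float>(capacity) * load_factor;
}
```

With the constructor defaults shown earlier (101 bins and a load factor of 0.5), the threshold is 50.5 entries, so under this reading the 51st insertion is the first one that would trigger a rehash.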

Listing A shows a typical use of the JHashtable container.

Listing A: JHashtable Example

#include "assert.h"
#include "stdio.h"
#include "JHashTable.h"

typedef char * key_type;
typedef int value_type;

unsigned int hash_char_ptr(const key_type& key)
{
  return 5;   // world's second worst hash algorithm
}

int main(int, char**)
{
  Java::Hashtable <key_type, value_type> translator(hash_char_ptr, NULL);

  assert(translator.isEmpty());

  translator.put("one", 1);
  assert(translator.containsKey("one"));
  assert(!translator.containsKey("three"));

  printf("one equals %d\n", translator.get("one"));

  return 0;
}
~\Regression\HashTable_Example.cpp

Pointer Failures

JHashtable's comparison function (containsKey) used operator==() to compare objects. This works fine most of the time. One situation where it failed for me was when I created a hash table in which the key being stored was a pointer to a character array. This led to a situation in which the pointers themselves were being compared for equality instead of what they pointed at. A simple example of this is:
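char hello1[] = "hello";
char hello2[] = "hello";
char *first = hello1;     // two separate arrays with identical contents
char *second = hello2;
if (first == second) {    // compares the addresses, not the characters
  std::cout << "They are the same\n";
}
else {
  std::cout << "They are different\n";
}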

In the above code sample, "They are different" will always be the output because operator==() compares the two addresses, not the characters stored at them.

Solutions

I introduced a new helper function template, IsEqual, that compares Hashtable keys for equality. Because of that, I was able to use template specialization to create an instance of IsEqual tailored for char * types. Listing B contains the code for the generic IsEqual template and for the specialized char * template.

Listing B: IsEqual Implementation

// cdk: added these helper functions for testing when keys are equal.
#include <cstring>   // std::strcmp

// Function base template
template <typename T>
inline bool IsEqual(T const &first, T const &second)
{
    return first == second;
}

// Specialize for type 'char *'. The parameter list has to match the base
// template with T = char *, hence 'char * const &'.
template <>
inline bool IsEqual<char *>(char * const &first, char * const &second)
{
    return std::strcmp(first, second) == 0;
}

Code snippet from JHashtable.cpp

Earlier versions of JHashtable used a private variable, called hash, to store the most recently hashed key value. This provided convenience for the designer in not having to pass around the hash value for various JHashtable library functions.

Subsequent versions of JHashtable implemented a stricter approach to handling the hash value that required recalculating the hash of the key. In some cases, the hash would be calculated up to three times for a single public method call. I chose to revert to the previous functionality for performance reasons. My profiling showed that the hashing algorithm was the most expensive part of Java::Hashtable -- so caching the hash value seemed to be the best solution.
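The caching idea can be sketched as follows. This is an illustration under assumptions, not the Java::Hashtable source: CachedHashSet, counting_hash and g_hash_calls are hypothetical names, and the table stores only keys to keep the example short.

```cpp
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of caching a hash value in a private member.
class CachedHashSet {
public:
    typedef unsigned int (*HashFunc)(const std::string &);

    CachedHashSet(HashFunc func, unsigned int capacity)
        : func_(func), hash_(0), bins_(capacity) {}

    void put(const std::string &key)
    {
        hash_ = func_(key);               // the only hash computation in put()
        if (!findInBin(key)) {            // reuses hash_
            currentBin().push_back(key);  // reuses hash_ again
        }
    }

    bool containsKey(const std::string &key)
    {
        hash_ = func_(key);               // the only hash computation here
        return findInBin(key);
    }

private:
    std::vector<std::string> &currentBin() { return bins_[hash_ % bins_.size()]; }

    bool findInBin(const std::string &key)
    {
        const std::vector<std::string> &bin = currentBin();
        for (std::size_t i = 0; i < bin.size(); ++i) {
            if (bin[i] == key) return true;
        }
        return false;
    }

    HashFunc func_;
    unsigned int hash_;   // most recently computed hash, shared by the helpers
    std::vector<std::vector<std::string> > bins_;
};

// Counts calls so the saving can be observed.
unsigned int g_hash_calls = 0;

unsigned int counting_hash(const std::string &key)
{
    ++g_hash_calls;
    unsigned int h = 0;
    for (std::size_t i = 0; i < key.size(); ++i) {
        h = h * 31 + static_cast<unsigned char>(key[i]);
    }
    return h;
}
```

Each public call hashes the key exactly once, however many private helpers need the result. The price of this design is that even a lookup such as containsKey() writes to the object, so it cannot be a const method and is not safe to call from two threads at once.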

Hashtable wish list

Despite its capabilities, Hashtable is incomplete. I would like to see the following features added to Hashtable:
  • use operator[]() to allow syntax like:
    translator["one"] = 1;
  • provide default hash table functions for char * string arrays and Java::String objects
  • mimic the Java Hashtable's toString() method, which returns a string representation of the map with entries enclosed in braces and separated by ", " (comma and space). [5]
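Two of these wishes are small enough to sketch. The code below is an illustration only: hash_c_string uses the well-known djb2 string hash as a stand-in for a default hash function, and to_java_string renders a std::map in the {key=value, key=value} style of Java's Hashtable.toString(). Both function names are hypothetical and neither function exists in JavaLib.

```cpp
#include <map>
#include <sstream>
#include <string>

// djb2 string hash, used here as an illustrative default for C strings.
inline unsigned int hash_c_string(const char *const &key)
{
    unsigned int hash = 5381;
    for (const char *p = key; *p != '\0'; ++p) {
        hash = hash * 33 + static_cast<unsigned char>(*p);
    }
    return hash;
}

// Renders entries as {key=value, key=value}, in the map's iteration order.
inline std::string to_java_string(const std::map<std::string, int> &entries)
{
    std::ostringstream out;
    out << "{";
    for (std::map<std::string, int>::const_iterator it = entries.begin();
         it != entries.end(); ++it) {
        if (it != entries.begin()) out << ", ";
        out << it->first << "=" << it->second;
    }
    out << "}";
    return out.str();
}
```

The hash depends only on the characters, so two different buffers holding the same text hash to the same bin. Note that std::map iterates in sorted key order, whereas a hash table makes no promise about the order of its entries.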

JString

Memory managed char array classes are the centerpiece of most C++ programming texts. JString is similar to those introductory classes by providing all of the expected operator overloading. Table 1 shows many of the Java-like methods available in JString.

Table 1: Java::String methods

Method            Description
charAt            Returns the character at the specified location. Character indexes are 0-based (i.e., the first character in a string is at index 0).
compareTo         Lexicographically compares two strings.
endsWith          Tests if this string ends with the specified suffix.
equalsIgnoreCase  Compares strings while ignoring capitalization.
indexOf           Returns the offset of the first occurrence of the specified character or string.
lastIndexOf       Returns the offset of the last occurrence of the specified character or string.
length            Returns the number of characters in the string; same as strlen().
startsWith        Tests if this string starts with the specified prefix.
substring         Creates a slice of the string using the specified starting point and range.
toLowerCase       Converts the string to lower case.
trim              Removes leading and trailing whitespace.
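For readers who have not used the Java String API, the sketch below shows the behaviour of three of these methods, written as free functions over std::string. It assumes that the Java::String versions behave like their Java namesakes; the code is not taken from JString.cpp.

```cpp
#include <cstddef>
#include <string>

// True when text begins with prefix.
inline bool startsWith(const std::string &text, const std::string &prefix)
{
    return text.size() >= prefix.size() &&
           text.compare(0, prefix.size(), prefix) == 0;
}

// True when text finishes with suffix.
inline bool endsWith(const std::string &text, const std::string &suffix)
{
    return text.size() >= suffix.size() &&
           text.compare(text.size() - suffix.size(), suffix.size(), suffix) == 0;
}

// Returns a copy of text without leading and trailing whitespace.
inline std::string trim(const std::string &text)
{
    const char *whitespace = " \t\r\n";
    std::size_t first = text.find_first_not_of(whitespace);
    if (first == std::string::npos) return std::string();   // all whitespace
    std::size_t last = text.find_last_not_of(whitespace);
    return text.substr(first, last - first + 1);
}
```

For example, endsWith("JString.cpp", ".cpp") is true, and trim applied to a string that contains only spaces returns an empty string.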

Changes

In keeping with the spirit of using the Java namespace, the JString class was renamed to String.

The JString class provides a cstr() method that returns a const char * to the internal null-terminated array. I added a c_str() method to Java::String that does the same thing because that is the name standardized by the std::string class in the C++ Standard Library.

Another change I made was to add constructors that convert native integer types to Java::String when no ambiguity was involved. Table 2 shows the constructors that convert variations of integers to Java::String.

Table 2: Java::String integer conversion constructors
explicit String( const unsigned char );
explicit String( const short );
explicit String( const unsigned short );
explicit String( const int);
explicit String( const unsigned int);
explicit String( const long );
explicit String( const unsigned long );

The constructors in Table 2 are declared explicit, which is good programming practice because it prevents hidden type conversions.
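The effect of the explicit keyword can be checked at compile time. The sketch below uses a hypothetical Text class as a stand-in for a string class with an integer constructor; it needs a C++11 compiler for static_assert and <type_traits>, which is newer than the compilers the library was written for.

```cpp
#include <sstream>
#include <string>
#include <type_traits>

// Hypothetical stand-in for a string class with an explicit integer constructor.
class Text {
public:
    explicit Text(int value)
    {
        std::ostringstream out;
        out << value;          // format the integer as decimal text
        text_ = out.str();
    }

    const std::string &str() const { return text_; }

private:
    std::string text_;
};

// Direct construction is allowed...
static_assert(std::is_constructible<Text, int>::value, "Text(42) compiles");
// ...but an int never silently turns into a Text.
static_assert(!std::is_convertible<int, Text>::value, "Text t = 42; does not compile");
```

In practice this means that a function taking a Text parameter cannot be called with a bare integer by accident; the caller has to write Text(42) and make the conversion visible.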

When the index is out of bounds for the character array, operator[]() throws an exception. If JAVALIB_USE_CERR is defined, it instead sends a message to cerr and the program immediately terminates.

The Java::String::substring method can also emit a cerr message, but when JAVALIB_USE_CERR is not defined, the substring method silently adjusts the out-of-bounds right-hand index to the length of the string. I haven't decided if this behavior is good or not; it deviates from the Java standard [6], but the one time I needed it, it saved me a lot of bookkeeping in testing if the substring boundary was beyond the end of the string. For consistency's sake, I should adopt one way or the other.
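The clamping behaviour can be sketched over std::string as follows. The helper name substring_clamped is hypothetical, the Java-style (begin, end) index pair is an assumption, and the treatment of a begin index beyond the end is my own guess rather than something documented for Java::String.

```cpp
#include <cstddef>
#include <string>

// Hypothetical helper: returns the characters in [begin, end), pulling an
// end index that lies past the string back to the string length.
inline std::string substring_clamped(const std::string &text,
                                     std::size_t begin, std::size_t end)
{
    if (end > text.size()) end = text.size();   // silent adjustment
    if (begin > end) begin = end;               // assumption: yields an empty string
    return text.substr(begin, end - begin);
}
```

The caller can pass a generous end index without first checking the length, which is the bookkeeping saving mentioned above. The cost is that a genuinely wrong index is no longer reported.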

That’s not a bug… it’s a feature!

One noteworthy feature is that one variation of operator[]() returns a reference. This allows us to use operator[]() as the left-hand side of an expression, as in the following code:
Java::String symbol = "Go";
symbol[0] = 'D';
assert(symbol == "Do");

Because the index value for operator[]() is checked at runtime, the following code will either throw an exception or emit a runtime error and terminate the program (depending on the status of JAVALIB_USE_CERR):

Java::String symbol = "Go";
symbol[2] = 'X';
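A checked operator[]() that still works on the left-hand side of an assignment might look like the sketch below. CheckedText is a hypothetical stand-in backed by std::string, and it shows only the exception path, not the JAVALIB_USE_CERR path.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-in for a string class with a checked operator[].
class CheckedText {
public:
    explicit CheckedText(const std::string &text) : text_(text) {}

    // Non-const overload returns a reference, so it can appear on the left of '='.
    char &operator[](std::size_t index)
    {
        if (index >= text_.size()) throw "index out of bounds";
        return text_[index];
    }

    // Const overload returns a copy for read-only access.
    char operator[](std::size_t index) const
    {
        if (index >= text_.size()) throw "index out of bounds";
        return text_[index];
    }

    const std::string &str() const { return text_; }

private:
    std::string text_;
};
```

The index is tested before the reference is formed, so an out-of-range write is stopped before it can touch memory beyond the array.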

The lack of dependencies between the JavaLib libraries is evident in that Java::String does not provide a hash algorithm.

Another undocumented deficiency is that operator>>() has a maximum buffer size of 2048 bytes. Users who are concerned about buffer overruns should not use operator>>() in Java::String.

JString.cpp needs to be compiled to generate an .obj file. Most of the other JavaLib libraries are template-based and the .cpp is #included by the .h file. The distribution file (JavaLib_1_0_1.zip) contains a library file for easily using Java::String in your own project.

JVector

JVector was renamed to Java::Vector when the namespace was applied.

Java::Vectors use a template parameter for the type being stored. The example program in Listing C shows a quick example of a Java::Vector of char *.

Listing C: Java::Vector Example

#include "JVector.h"
#include <iostream>

int main(int, char**)
{
  typedef char * ptr_type;
  Java::Vector<ptr_type> pointers;

  pointers.addElement("hello");
  pointers.addElement("world");

  for(unsigned int i = 0; i < pointers.size(); ++i) {
    std::cout << pointers[i] << std::endl;
  }

  return 0;
}
~\Regression\Vector_Test.cpp

Java::Vectors default to having an initial capacity of 10 elements and the growth rate is also 10 elements. The user can specify customized initial conditions in the constructor.
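The growth rule can be expressed as a small calculation. The helper below is a hypothetical illustration of fixed-increment growth, assuming that the container grows by one increment each time it runs out of room; it is not code from JVector.h.

```cpp
// Hypothetical helper: capacity after storing `count` elements when the
// container starts with `initial` slots and grows by `increment` slots
// each time it is full. Assumes increment is greater than zero.
inline unsigned int capacity_for(unsigned int count,
                                 unsigned int initial = 10,
                                 unsigned int increment = 10)
{
    unsigned int capacity = initial;
    while (capacity < count) {
        capacity += increment;
    }
    return capacity;
}
```

With the defaults, the first ten elements fit in the initial allocation and the eleventh triggers growth to 20. Growth by a fixed increment means the number of reallocations rises in step with the number of elements, so a larger increment is worth specifying for big collections.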

Table 3 shows the list of methods I added to JVector to modernize the Java::Vector interface.

Table 3: Java::Vector Additions
add
clear
remove
get

Many of the Java::Vector methods throw exceptions for out-of-bounds indexes or emit error messages to cerr. All of the methods that can throw an exception are documented in the JVector.h file with a throw(char*) modifier in the function declaration.

JStack

The JStack collection implements a standard Java-like stack container on a user-specified data type. I renamed JStack to Java::Stack to be consistent with the other libraries.

Like JString, JStack.cpp needs to be compiled to generate an .obj file.

An interesting note is that JStack does not use Java::Vector. Instead, it uses a protected class (Java::Stack::node) to do the stack bookkeeping.

Two methods, peek() and pop(), will throw a char * exception when the stack is empty or they will emit an error message when JAVALIB_USE_CERR is defined.

The original implementation for JStack's empty() method did the same thing as a modern clear() method in Java containers. The Java 1.5.0 documentation [7] uses empty() to test if the container is empty. In that spirit, I've renamed the original JStack empty() to Java::Stack::clear() and implemented Java::Stack::empty() to do the same thing as JStack's isEmpty() method.
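The pieces described in this section (a node type for the bookkeeping, peek() and pop() that throw on an empty stack, and separate empty() and clear() methods) fit together roughly as in the sketch below. NodeStack is a hypothetical name and the code is an illustration, not the Java::Stack source; for instance, it makes the node type private where the library uses a protected class.

```cpp
#include <cstddef>

// Hypothetical sketch of a linked-node stack.
template <class T>
class NodeStack {
public:
    NodeStack() : top_(NULL) {}
    ~NodeStack() { clear(); }

    void push(const T &item) { top_ = new Node(item, top_); }

    // Returns the top item without removing it.
    const T &peek() const
    {
        if (empty()) throw "peek on empty stack";
        return top_->item;
    }

    // Removes the top item and returns it.
    T pop()
    {
        if (empty()) throw "pop on empty stack";
        Node *old = top_;
        T item = old->item;
        top_ = old->next;
        delete old;
        return item;
    }

    bool empty() const { return top_ == NULL; }   // tests, does not modify
    void clear() { while (!empty()) pop(); }      // removes every item

private:
    struct Node {
        Node(const T &i, Node *n) : item(i), next(n) {}
        T item;
        Node *next;
    };

    NodeStack(const NodeStack &);             // copying is disabled in this sketch
    NodeStack &operator=(const NodeStack &);

    Node *top_;
};
```

Keeping empty() as a pure test and clear() as the destructive operation matches the naming in the Java documentation, so code that reads stack.empty() can no longer wipe the container by surprise.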

Documentation

The documentation for the JavaLib classes is poor. The hyperlink provided on the Mike95.com site does not resolve to the Java documentation on java.sun.com anymore. Instead, I recommend using the hyperlinks in the references section at the end of this article.

Installation

Installation of the new JavaLib is very simple. The library is available in the file JavaLib_1_0_1.zip (127k). Unzip it to a folder and add the folder to the include path for your project.

Two of the libraries, Java::String and Java::Stack, need to be compiled to generate a library. A default library using reasonable compiler settings is provided in the distribution file but a project file is available to customize the library.

Next, #include whichever JavaLib containers you want to use. If you're using either Java::String or Java::Stack, you'll need to add the JavaLib.lib file to the project too.

The segregation (and lack of dependency) of the JavaLib containers means that the code compiles with no overhead. The Java::Stack implementation, for example, is about as bare-bones as possible.

The lack of dependencies also aids in porting the library to other platforms. Instead of having to port the entire library at one time, you can concentrate on just the parts of the library you need on the new platform.

Overall wishes

I wish that the JavaLib library had better conformance to the modern Java API. I have modified the JavaLib library to add modern method names to Java::String and Java::Stack, but my work is incomplete and I only implemented the method names I use.

JavaLib could also be improved by implementing Set and List classes. There are other Java containers that would also be useful.

Implementing an exception class hierarchy that mimicked the Java classes would be helpful too. Then all of the throws could be changed from char * to more meaningful exceptions.
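Such a hierarchy could start out as small as the sketch below. The class names echo java.lang and java.util, but none of these classes exist in JavaLib; the sketch namespace and the char_at helper are likewise hypothetical, and deriving from std::runtime_error is my own choice so that standard catch blocks keep working.

```cpp
#include <cstddef>
#include <stdexcept>
#include <string>

namespace sketch {

// Root of the hypothetical hierarchy.
class Exception : public std::runtime_error {
public:
    explicit Exception(const std::string &message) : std::runtime_error(message) {}
};

class RuntimeException : public Exception {
public:
    explicit RuntimeException(const std::string &message) : Exception(message) {}
};

class IndexOutOfBoundsException : public RuntimeException {
public:
    explicit IndexOutOfBoundsException(const std::string &message)
        : RuntimeException(message) {}
};

class EmptyStackException : public RuntimeException {
public:
    EmptyStackException() : RuntimeException("stack is empty") {}
};

// Example of a throw site that would previously have thrown a char *.
inline char char_at(const std::string &text, std::size_t index)
{
    if (index >= text.size()) {
        throw IndexOutOfBoundsException("charAt index out of range");
    }
    return text[index];
}

} // namespace sketch
```

A caller can then catch the specific IndexOutOfBoundsException, the broader RuntimeException, or plain std::exception, instead of a char * that says nothing about which kind of error occurred.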

When JAVALIB_USE_CERR is defined and an error message is emitted, the program automatically terminates using exit(1). This is probably not desirable in some situations even when exceptions are not being used. Another preprocessor symbol, called JAVALIB_EXIT_ON_ERROR, should probably be defined to provide more customization to the library's behavior.

C-style casts are littered throughout the code. The library should be recoded to use static_cast or dynamic_cast for most of the casts. While this won't affect performance, it will affect maintainability.

Update (April 18, 2007):

A JavaLib user named Pawel reported two problems in the JVector class that caused compile-time problems. One problem was a minor syntax issue and the other was a missing namespace name. Both problems have been fixed in the library. The JavaLib library's version has been upgraded to 1.0.1.

Conclusion

JavaLib is a small library of Java-like classes written in C++. The size of the library promotes portability and maintainability by end-users. Even though the library has not stayed in lock-step with the changes in the Java API, those changes are trivial to implement.

The small size of the library also means that only the most commonly used methods are implemented. For the right project, JavaLib can be quite helpful in providing containers that are much simpler to use than the STL but provide all of the power of template programming.

References

1. http://java.sun.com/docs/white/langenv/Intro.doc2.html
2. Borland C++ Builder Developer’s Journal, “Opt 3.19”, November, 2006.
3. http://www.parashift.com/c++-faq-lite/coding-standards.html#faq-27.5
4. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/Object.html#hashCode()
5. http://java.sun.com/j2se/1.5.0/docs/api/java/util/Hashtable.html
6. http://java.sun.com/j2se/1.5.0/docs/api/java/lang/String.html#substring(int,%20int)
7. http://java.sun.com/j2se/1.5.0/docs/api/java/util/Stack.html

Curtis Krauskopf is a software engineer and the president of The Database Managers (www.decompile.com). He has been writing code professionally for over 25 years. His prior projects include multiple web e-commerce applications, decompilers for the DataFlex language, aircraft simulators, an automated Y2K conversion program for over 3,000,000 compiled DataFlex programs, and inventory control projects. Curtis has spoken at many domestic and international DataFlex developer conferences and has been published in FlexLines Online, JavaPro Magazine, C/C++ Users Journal and C++ Builder Developer's Journal.