The Database Managers, Inc.


STLplus3 Review, Upgrades and Reengineering

by Curtis Krauskopf
Download STLplus3 version 3.40 update

An old friend emailed me out of the blue a few weeks ago. It was the first time I had heard from him in a long time. His life has changed a lot: he's living in a different part of the country, he's going to get married and he's on a fitness program to improve his life and his lifestyle.

Seeing an old friend improve over time is gratifying because I get to see the transitions. Improvement is rarely sudden; it's usually a slow transformation over a long time.

I have been using STLplus 2.5 for a very long time and the library had become an 'old friend' because I could rely on it so much. I used STLplus 2.5 in every program I wrote because of the STL helper functions and the STL-like containers it provided.

I recently discovered STLplus3 3.4 and, like my real-life friend, it has gone through some transformations over the years too.

The new STLplus3 version 3.4 code and documentation have been updated with a modern look that uses widely acknowledged best practices, such as namespaces.

In case you missed the first STLplus review, STLplus3 is a collection of C++ components and helper functions that follow the STL paradigm of containers, iterators, algorithms, function objects and adapters. STLplus3 expands upon the popular STL library by providing features the standard STL doesn't provide, such as:

  • Containers
    • A smarter smart pointer
    • Hash table
    • Directed graph
    • Bundling
  • Object and container persistence
  • Portability helpers
    • Cross platform functions
    • File system access
    • TCP classes
    • Safe printf-like string formatting
    • Simple wildcard pattern matching
    • Infinite precision integers
  • std::string helpers

Out of the box

STLplus3 is available from SourceForge.net. The .zip file is only 756k. It contains folders with the complete source code and documentation along with makefiles for the gcc compilers and solution files for Microsoft Visual Studio compilers.

The documentation is still in .html files and the root page is index.html. Some of the classes and functions I checked have better documentation than in STLplus 2.5 and there is an explanation of the item with a few lines of sample code. The documentation has hyperlinks to colorized header files (these are static HTML files and are not dynamically colorized from the actual source).

Installation

To install the library, I unzip the distribution file into c:\cpp\lib\STLplus3 on my development machine. This keeps it separate from my STLplus 2.5 installation in the ~\lib\STLplus\ folder.

Just like STLplus 2.5, STLplus3 3.4 doesn't have a Borland-compatible project file in the distribution. However, the Building and Using the STLplus Library Collection article in the documentation indicates that the library is portable to Borland compilers.

Going back in time

One of the first things I do when evaluating STLplus3 3.4 is to reread my original review because it had step-by-step instructions for installing STLplus 2.5.

In the original review, I manually created a project group and added a library to it. Next, I manually added the source for STLplus to the library.

The older STLplus had only one source folder, cleverly called ~\source\. The source code for STLplus3 is segregated across five folders: containers, persistence, portability, strings and subsystems. All of those folders contain header files and all except the containers folder have .cpp files.

I wonder why Andy Rushton, the creator and maintainer of STLplus, chose to break the library up like that, but I decide to worry about that later.

Still following my directions from the original review, I save the project group and the library but I put them in the root of the STLplus3 folder. I feel this is okay because the Visual Studio solution file and workspace file are there too.

When I compile the project, I get a compile-time error: Unable to open include file 'smart_ptr.hpp'.

The smart_ptr.hpp file is in the containers folder. I can blame myself for that compile-time error because I had not added that folder to the library's include path. A few mouse clicks later and I have the full path to the containers folder added to the include list.

The next compile also fails: library too large, please restart with library page size 32. I have seen that type of problem before in the old STLplus library. The solution is to change the TLIB page size from 0x0010 to 0x0080.

My next surprise is that the library compiles successfully. None of the source code changes I had to make to STLplus 2.5 were needed in STLplus 3.4.

Warnings

The ~\portability\portability_fixes.hpp file disables some Borland-specific warnings that I consider to be important. Lines 64-68 in portability_fixes.hpp very clearly document which warnings are disabled. However, the warnings don't need to be disabled. None of the four is triggered by the STLplus code anymore: W8022 (a function hides a virtual function in a base class), W8060 (possibly incorrect assignment), W8008 (condition is always true) and W8066 (unreachable code). I don't like headers that disable compiler warnings, so I've commented out the #pragmas that disable them.

Recompiling the library in release-mode results in just 11 warnings. About half are W8004, which is for an assigned value that is never used. Looking over each instance of W8004, I decided W8004 is a false-alarm because even though the variable is assigned a value at declaration, the variable is used as a parameter elsewhere in the function. The other warnings are W8027, which complains that functions containing do are not expanded inline.

In debug-mode, the library compiles with no warnings or errors.

Andy had clearly done a lot of work cleaning up the library because STLplus 2.5 compiled with dozens of warnings.

Hash example

My previous review provided an example of using the Hash container in STLplus. The only change I had to make to that program for STLplus3 was to qualify the use of the hash template with the stlplus:: namespace. The hash example is in listing A.

Listing A
#include <iostream>
#include <string>
#pragma hdrstop
#include <hash.hpp>

using std::string; // to make
using std::cout;   // this listing
using std::endl;   // narrower
using stlplus::hash;

// World's worst hashing algorithm
struct hash_string {
  unsigned operator()
    (const string & s) const
    { return 5; }
};

int main(int  argc, char* argv[]) {
  hash<string, string, hash_string>
    symbols;

  symbols["alpha"] = "a";
  symbols["bravo"] = "b";

  cout << "alpha = ";
  cout << symbols["alpha"] << endl;
  cout << "bravo = ";
  cout << symbols["bravo"] << endl;

  if (symbols.present("charlie"))
    cout << "found charlie" << endl;
  else
    cout << "charlie not found" << endl;

  cout << "charlie = ";
  cout << symbols["charlie"] << endl;

  if (symbols.present("charlie"))
    cout << "found charlie" << endl;
  else
    cout << "charlie not found" << endl;

  return 0; 
} 

// Expected output: 
// alpha = a 
// bravo = b 
// charlie not found
// charlie =
// found charlie
hash.cpp sample program.
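The hash_string functor in Listing A deliberately returns the same value for every key, so every key produces the same hash. As a hedged illustration of what a more useful functor could look like, here is a minimal sketch based on the well-known 32-bit FNV-1a string hash. The name fnv1a_hash_string is hypothetical and the code is my own illustration, not something shipped with STLplus3. It keeps the same call signature as hash_string (a const std::string& in, an unsigned out).

```cpp
#include <cassert>
#include <string>

// Hypothetical replacement for hash_string in Listing A.
// Implements 32-bit FNV-1a; this is an illustration, not STLplus3 code.
struct fnv1a_hash_string {
  unsigned operator()(const std::string& s) const {
    unsigned long h = 2166136261UL;              // FNV offset basis
    for (std::string::size_type i = 0; i < s.size(); ++i) {
      h ^= static_cast<unsigned char>(s[i]);     // mix in one byte
      h = (h * 16777619UL) & 0xFFFFFFFFUL;       // multiply by FNV prime, keep 32 bits
    }
    return static_cast<unsigned>(h);
  }
};

// Self-check against two published FNV-1a reference values.
inline void check_fnv1a_hash_string() {
  fnv1a_hash_string hasher;
  assert(hasher("") == 2166136261u);   // empty input yields the offset basis
  assert(hasher("a") == 0xe40c292cu);
}
```

Assuming the functor is accepted wherever hash_string is, the declaration in Listing A would become hash<string, string, fnv1a_hash_string> symbols; and the rest of the program would stay the same.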

The last three lines of the program's output show that referencing a hash element as an r-value causes it to be defined in the hash table. This is an expected behavior for this container and it's documented in STLplus3.
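The standard std::map container has the same insert-on-read behavior in its operator[], so the effect can be reproduced without STLplus3. The sketch below uses std::map purely as a stand-in (it is not stlplus::hash) and adds a hypothetical lookup helper that reads a value without creating the key, which is the same job that present() does for the existence test in Listing A.

```cpp
#include <cassert>
#include <map>
#include <string>

typedef std::map<std::string, std::string> symbol_table;

// Hypothetical read-only lookup: returns the stored value, or the
// fallback when the key is missing. It never inserts a new element.
inline std::string lookup(const symbol_table& symbols,
                          const std::string& key,
                          const std::string& fallback = "") {
  symbol_table::const_iterator it = symbols.find(key);
  return it == symbols.end() ? fallback : it->second;
}

inline void check_lookup() {
  symbol_table symbols;
  symbols["alpha"] = "a";

  std::string safe = lookup(symbols, "charlie");
  assert(safe.empty());
  assert(symbols.count("charlie") == 0);   // lookup did not insert

  std::string risky = symbols["charlie"];  // operator[] reads...
  assert(risky.empty());
  assert(symbols.count("charlie") == 1);   // ...and inserts the key
}
```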

The hash example doesn't test the newly created STLplus3 library because it only uses the hash template which is defined in the containers folder.

Listing B is an example of an STLplus program that uses the STLplus3 library. I used #pragma link to specify the library. Alternatively, I could have dragged the library into the project's definition in Project Manager.

In Listing B, stlplus::file_exists is used to verify that the source file exists. stlplus::folder_part leaves only the drive and full path of __FILE__. stlplus::folder_files creates a vector of the filenames in the same folder as the source code. The vector is then limited to 3 elements so that the vector is short enough to print nicely in this example.

True to its name, stlplus::vector_to_string converts the vector into a string and separates each vector element with a comma (specified in the third parameter). The second parameter of stlplus::vector_to_string requires a pointer to a function that takes the type being stored by the std::vector (in this case a string) and returns a string. STLplus3 provides a suitable function, called stlplus::string_to_string. If the vector had stored ints, the helper function would have been stlplus::int_to_string, as shown in this snippet:

  std::vector<int> test;
  test.push_back(5);
  test.push_back(7);
  cout << stlplus::vector_to_string(test, stlplus::int_to_string, ",") << endl;
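To make the role of the second parameter concrete, here is a minimal sketch of how a function of this shape could be written. The names join_vector and int_to_text are hypothetical stand-ins, and the signature is my assumption; the real stlplus::vector_to_string may be declared differently.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical stand-in for a helper such as stlplus::int_to_string.
inline std::string int_to_text(int value) {
  std::ostringstream out;
  out << value;
  return out.str();
}

// Hypothetical stand-in for stlplus::vector_to_string: converts each
// element with to_string_fn and joins the pieces with the separator.
template <typename T>
std::string join_vector(const std::vector<T>& values,
                        std::string (*to_string_fn)(T),
                        const std::string& separator) {
  std::string result;
  for (typename std::vector<T>::size_type i = 0; i < values.size(); ++i) {
    if (i != 0) result += separator;   // separator goes between elements only
    result += to_string_fn(values[i]);
  }
  return result;
}

inline void check_join_vector() {
  std::vector<int> test;
  test.push_back(5);
  test.push_back(7);
  assert(join_vector(test, int_to_text, ",") == "5,7");
}
```

Passing the conversion as a function pointer is what lets one joining routine serve any element type: only the small per-type helper changes.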

STLplus also provides a convenient function called stlplus::print_vector that routes the output to any std::ostream. It does the same thing as stlplus::vector_to_string except that it uses stlplus::print_string as its helper function. As would be expected, if the vector had held ints, the helper function would have been stlplus::print_int.

Finally, stlplus::extension_part extracts just the extension from the filename.
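For readers who want to see what this kind of path slicing involves, the following is a simplified sketch using only std::string. The sketch_ functions are hypothetical and are not the STLplus3 implementations, which may treat edge cases (such as names that begin with a dot) differently.

```cpp
#include <cassert>
#include <string>

// Hypothetical helper: everything after the last '/' or '\'.
inline std::string sketch_filename_part(const std::string& path) {
  std::string::size_type slash = path.find_last_of("/\\");
  return slash == std::string::npos ? path : path.substr(slash + 1);
}

// Hypothetical helper: everything after the last '.' in the file name,
// or an empty string when the name has no dot.
inline std::string sketch_extension_part(const std::string& path) {
  const std::string name = sketch_filename_part(path);
  std::string::size_type dot = name.rfind('.');
  if (dot == std::string::npos) return "";
  return name.substr(dot + 1);
}

inline void check_sketch_extension_part() {
  assert(sketch_extension_part("c:\\cpp\\lib\\File_System.cpp") == "cpp");
  assert(sketch_extension_part("dir.d/readme") == "");  // dot is in the folder name
}
```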

Listing B

#include <iostream>

#include <STLplus.hpp>

using std::cout;
using std::endl;

int main(int, char**)
{
  cout << __FILE__ << " exists: ";
  cout << stlplus::file_exists(__FILE__);
  cout << endl;

  std::string root = stlplus::folder_part(__FILE__);
  std::vector<std::string> files = stlplus::folder_files(root);

  if (files.size() > 3) files.resize(3);

//  cout << "Vector uses overloaded insertion operator\n";
//  cout << files << endl;

  cout << "Using vector_to_string...\n";
  cout << stlplus::vector_to_string(files, stlplus::string_to_string, ",") << endl;

  cout << "Using print_vector...\n";
  stlplus::print_vector(cout, files, stlplus::print_string, ",");
  cout << endl;

  cout << "Source file extension: ";
  cout << stlplus::extension_part(__FILE__);
  cout << endl;

  std::vector<int> test;
  test.push_back(5);
  test.push_back(7);
  cout << stlplus::vector_to_string(test, stlplus::int_to_string, ",") << endl;

  return 0;
}

// Sample output:
// c:\cpp\lib\STLPlus3\Examples\File_System.cpp exists: 1
// Using vector_to_string...
// Examples.bpg,File_System.bpf,File_System.bpr
// Using print_vector...
// Examples.bpg,File_System.bpf,File_System.bpr
// Source file extension: cpp
// 5,7
File_System.cpp sample program.

The program in listing B is remarkable in that it's both cross-platform and cross-compiler compatible. The code doesn't know or care if the operating system is Windows, Linux, Unix or even MacOS. The code [3] can be compiled using a Borland compiler (of course), or in Visual Studio or even gcc and the program would still work. What's even more remarkable is that there isn't a single #if anywhere in the code. All of the cross-platform and cross-compiler capabilities are hidden inside of STLplus3's library.
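The general technique is to confine the conditional compilation to small library functions so that calling code never sees it. Here is a minimal sketch of that idea. The sketch_ functions are hypothetical and are not taken from STLplus3, and the only platform test assumed is the _WIN32 macro.

```cpp
#include <cassert>
#include <string>

// Hypothetical portability layer: the one place that knows the platform.
inline char sketch_path_separator() {
#if defined(_WIN32)
  return '\\';
#else
  return '/';
#endif
}

// Platform-neutral caller-facing function: no #if needed from here on.
inline std::string sketch_join_path(const std::string& folder,
                                    const std::string& file) {
  if (folder.empty()) return file;
  const char last = folder[folder.size() - 1];
  if (last == '/' || last == '\\') return folder + file;  // already terminated
  return folder + sketch_path_separator() + file;
}

inline void check_sketch_join_path() {
  assert(sketch_join_path("", "a.txt") == "a.txt");
  assert(sketch_join_path("dir/", "a.txt") == "dir/a.txt");
}
```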

In STLplus 2.5, I was able to include just one file, STLplus.hpp, and everything in the library would be included. STLplus3 has moved away from that paradigm. The modules in STLplus3 are segregated even though some modules (such as persistence, subsystems and strings) depend on other modules (containers and portability). Out of the box, a program like Listing B needs a list of #includes that pulls in enough modules to let it compile. Listing B gets by with a single STLplus.hpp only because I recreated that header myself (see Listing D).

Pros and Cons

Andy has fixed some of the gripes I had with STLplus 2.5: CVS files are no longer in the distribution file and the library compiles without warnings (other than Borland-specific compile-time warnings about not being able to create a precompiled header).

Some of my gripes from STLplus 2.5 haven't been addressed yet: there are no unit tests and there are no examples, other than snippets and short programs in the documentation.

The documentation, by the way, is well-written with many examples. The documentation for smart pointers, in particular, has been significantly improved over the previous versions. Despite the massive refactoring of the library between STLplus 2.5 and STLplus3 3.4, the documentation is correct (in most cases) [4].

The code is also well written with copious helpful comments. STL vendors, such as STLport, oftentimes pride themselves on efficient algorithms that use advanced techniques. STLplus is an exception. One of the users politely asked Andy about optimization and Andy's response was:

"In my opinion, code should be written for readability first and only optimised if profiling shows a need. In my experience (and contrary to popular belief), most performance problems are caused by poor data-structure design, not by lack of optimisation of code. In fact, optimising every line of code tends to be counter-productive."

I tend to agree with Andy. I once took over the maintenance of a miserable C++ project. The original authors must have thought that the ?: operator performed better than an if/else statement because they used it often and in many clever ways. However, almost every use of the ?: operator in their code contained a subtle bug of some sort.

My point is that if you expect STLplus3 to use an optimal algorithm because STL is in the library's name, you'll be disappointed. Here's a typical example:

std::string trim(const std::string& val)
{
  std::string result = val;
  while (!result.empty() && isspace(result[0]))
    result.erase(result.begin());
  while (!result.empty() && isspace(result[result.size()-1]))
    result.erase(result.end()-1);
  return result;
}

The trim function in STLplus3 is short and easy to understand, and a desk-check of the code shows that it generates the right results for any string. The only problem is that it's slow, mostly because the while loops call std::string's erase method once for each leading and trailing blank, and every erase at the front of the string shifts all of the remaining characters. A project I had worked on required over 50,000,000 trim operations. When I tried to find out why the program was so slow, the STLplus3 trim() function showed up in the profiler as a bottleneck. Listing C shows my improved version.

Although I'm not professing that listing C is the best possible trim implementation, my testing shows that it has a 250% performance improvement. A desk-check of listing C is not nearly as easy as a desk-check of the original trim() function.

But that's also the advantage of using open-source libraries. If there's something in the library that doesn't work right or can be improved, the source code is there to make the changes.

Listing C
#include <cctype>
#include <string>

std::string trim(const std::string& input)
{
  // If string is empty, there is nothing to look for.
  if (input.empty()) return "";

  // Remove spaces at end
  int last_non_blank = int(input.length()-1);
  int last_character = last_non_blank;
  // Cast to unsigned char: passing a negative char value to isspace
  // is undefined behavior.
  while ((last_non_blank >= 0) &&
         std::isspace(static_cast<unsigned char>(input[last_non_blank]))) {
    --last_non_blank;
  }

  // string full of spaces: return nothing
  if (last_non_blank < 0) return "";

  // Remove spaces at beginning
  // Don't need to check for accessing beyond the string
  // because we've already removed blanks at the end of the
  // string. If we get this far, the string has non-blanks.
  int first_non_blank = 0;
  while (std::isspace(static_cast<unsigned char>(input[first_non_blank]))) {
      ++first_non_blank;
  }

  // Only build a new string when something was actually trimmed.
  if (last_character != last_non_blank || first_non_blank != 0) {
    return input.substr(first_non_blank, last_non_blank + 1 - first_non_blank);
  }
  else {
    return input;
  }
}
Reengineered trim function is 250% faster than the original trim function.
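Because an index-based trim is harder to desk-check than the erase-based one, a small equivalence check can stand in for the desk-check. The sketch below is a hedged illustration: trim_by_erase follows the shape of the original function, trim_by_index is my own compact restatement of the index-based idea (not Listing C verbatim), and the sample strings are arbitrary. It checks correctness only and makes no claim about speed.

```cpp
#include <cassert>
#include <cctype>
#include <string>

inline bool is_blank(char c) {
  return std::isspace(static_cast<unsigned char>(c)) != 0;
}

// Erase-based trim: simple, but each front erase shifts the whole string.
inline std::string trim_by_erase(const std::string& val) {
  std::string result = val;
  while (!result.empty() && is_blank(result[0]))
    result.erase(result.begin());
  while (!result.empty() && is_blank(result[result.size() - 1]))
    result.erase(result.end() - 1);
  return result;
}

// Index-based trim: find the kept range first, then copy once.
inline std::string trim_by_index(const std::string& input) {
  std::string::size_type first = 0;
  std::string::size_type last = input.size();  // one past the last kept character
  while (last > first && is_blank(input[last - 1])) --last;
  while (first < last && is_blank(input[first])) ++first;
  return input.substr(first, last - first);
}

// Both implementations must agree on every sample.
inline void check_trims_agree() {
  const char* samples[] = { "", " ", "   ", "abc", "  abc", "abc  ",
                            "  a b  ", "\t x \n" };
  const int count = sizeof(samples) / sizeof(samples[0]);
  for (int i = 0; i < count; ++i) {
    assert(trim_by_erase(samples[i]) == trim_by_index(samples[i]));
  }
}
```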

From a usability point of view, STLplus3 is more difficult to use than STLplus 2.5. The new library requires five folders to be added to the include path (containers, persistence, portability, strings and subsystems) and up to five #include directives in the code using the library. The previous library only required one path to be included and one #include directive would slurp in the entire library.

Reengineering STLplus3

Andy Rushton faces the same challenges, tradeoffs and questions as any creator of a cross-platform, cross-compiler-compatible library:

  • Should the library's source code be segregated or should there be a monolithic source folder?
  • Will the linker on the target system be smart enough to jettison unused parts of the library or will it try to import all two megabytes of object code?
  • Should test modules be included in the distribution source even though they are unsupported and might not even compile on the target system?
  • Will users who only need a small portion of the library be discouraged from using it if there are many subtle interdependencies?

When I interviewed Andy about his reason for modularizing the library's source, he responded:

"One reason for splitting it up was that the library was large and varied and quite a few people fed back that this made it hard to understand. The separation also means that people can cherry-pick parts and I've done some work on reorganising some of the code to reduce inter-dependencies so that most libraries can be used stand-alone. I'd say the main reason though was to clarify what STLplus is for - it extends the STL (containers), implements serialisation (persistence), provides platform independent layers to platform-specific features (portability) etc."

I understand the design decisions that Andy made. Because of my experience with STLplus 2.5 and because I will probably use STLplus3 in almost every program I write, I feel it's worthwhile to spend some time making STLplus3 easier for me (and others) to use.

The same, only different

I use a large number of third-party libraries. Over the years, I've standardized on a common folder structure for most of the libraries so that it's easier for me to use them. The standard folder structure looks like the one in Figure A.

Figure A: A standardized folder structure I use for most of the third-party libraries.

The IDE folder segregates the files used by individual compilers. In Figure A, the three compiler environments defined are BCB6, gcc [9] and VS7. The BCB6 folder contains three folders: lib, obj and objd. The lib folder contains all of the library permutations I need (debug, release, single-threaded, and multi-threaded). The obj and objd folders contain the .obj files for the single-threaded release and debug libraries respectively.

The stipulations of this structure are to:

  1. Organize the library so that source files are in specific folders and compiler-specific files are in other folders.
  2. Eliminate the pollination of compiler-specific files into the library's source files.
  3. Provide a consistent but flexible structure that can be used by multiple open-source libraries.
  4. Provide a way to segregate compiler-specific files based on the compiler's version.

The simplicity of the structure is in the include folder. It is the only one I have to examine when I use the library. It contains a header file with the same name as the library (STLplus3.hpp). For backwards compatibility, STLplus3.hpp includes STLplus.hpp.

When STLplus.hpp is compiled on a Borland compiler, a #pragma directive in the file automatically links the correct library from the ~/IDE/BCB6/lib folder. Listing D shows the contents of STLplus.hpp with the #pragma directive. The exact library chosen from the lib folder depends on the _DEBUG conditional define. When _DEBUG is defined, the STLplus3d.lib file is linked in. When _DEBUG is not defined, STLplus3.lib is linked in.

Listing D
#ifndef STLPLUS_HPP
#define STLPLUS_HPP

#include <containers.hpp>
#include <persistence.hpp>
#include <portability.hpp>
#include <strings.hpp>
#include <subsystems.hpp>

#ifdef __BORLANDC__
#  ifdef _DEBUG
#    pragma link "c:/cpp/lib/stlplus3/IDE/BCB6/lib/stlplus3d.lib"
#  else
#    pragma link "c:/cpp/lib/stlplus3/IDE/BCB6/lib/stlplus3.lib"
#  endif
#endif
#endif
STLplus.hpp file conditionally links in either the debug version of STLplus (stlplus3d.lib) or the release version of STLplus (stlplus3.lib).

The advantage of this is that I can choose a release build or a debug build and automatically have the correct library linked into the project without changing the library loaded in my project files.
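Microsoft compilers offer a comparable mechanism through #pragma comment(lib, ...), so the same idea could be extended to the VS7 environment. The fragment below is only a sketch of what such an addition to STLplus.hpp might look like. The library file names are assumptions, and unlike the Borland version it relies on the linker's library search path rather than a full path.

```cpp
// Hypothetical addition to STLplus.hpp for Microsoft compilers.
// The .lib names are assumed; the linker must be able to find them
// through its configured library directories.
#ifdef _MSC_VER
#  ifdef _DEBUG
#    pragma comment(lib, "stlplus3d.lib")
#  else
#    pragma comment(lib, "stlplus3.lib")
#  endif
#endif
```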

The programs that use the library only need to include one file (STLplus3.hpp) and only add one include folder (~/include) to the compiler's search path.

The examples folder in figure A could either be complete programs or they could be program snippets. The examples that I'm putting in my examples folder are the sample programs from this article.

The src folder contains all of the library's source in a monolithic structure (minus the header files that are in the include folder).

The docs and unitTest folders contain the files you think they do.

Getting Testy

While I was exploring in the STLplus CVS repository on SourceForge.net, I discovered a folder called tests/. I was surprised that this folder existed in the repository because it wasn't included in the distribution .zip file.

The tests folder on SourceForge.net contains unit tests for 18 parts of STLplus3.

These were precisely the unit tests and examples whose absence I complained about in the Pros and Cons section. The user interface for SourceForge.net's repository is stuck in the early 1980s so there is no way to easily download the entire folder. Instead, each file in the folder structure needs to be downloaded separately.

Searching for a better way, I discovered that SourceForge.net supports TortoiseCVS, a client that hooks into the Windows Explorer GUI. Unfortunately, all of my attempts to download files using TortoiseCVS were thwarted by an error message [10].

Frustrated, I told Andy of my problem and I asked him to email the test folder to me. He immediately obliged and I had all 18 unit tests available to me.

All of the tests follow the same pattern:

  • The source for each test is segregated into separate folders.
  • The .cpp file, the executable and the folder all share the same base name.
  • Unit test failures cause the program to return a non-0 value.
  • Unit test successes cause the program to return a 0.
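The return-code convention in the bullets above can be sketched as follows. This is an illustration of the pattern only, not one of Andy's tests; sketch_repeat and run_sketch_tests are hypothetical names and the function under test is a made-up stand-in for a library feature.

```cpp
#include <cassert>
#include <iostream>
#include <string>

// Hypothetical function under test; it stands in for a library feature.
inline std::string sketch_repeat(const std::string& text, int times) {
  assert(times >= 0);
  std::string result;
  for (int i = 0; i < times; ++i) result += text;
  return result;
}

// Runs every check, reports failures and returns how many there were.
inline int run_sketch_tests() {
  int failures = 0;
  if (sketch_repeat("ab", 3) != "ababab") {
    std::cerr << "repeat x3 failed\n";
    ++failures;
  }
  if (!sketch_repeat("ab", 0).empty()) {
    std::cerr << "repeat x0 failed\n";
    ++failures;
  }
  return failures;
}

// A test executable following the convention would finish with:
//   int main() { return run_sketch_tests() == 0 ? 0 : 1; }
```

Returning 0 for success lets a batch file or makefile detect a failing test from the exit code alone, without parsing any output.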

After some experimenting with different techniques, I decided that the best way to compile the unit tests was directly from the command line using make.

The only difference between each .mak file is in the first line. For example:

TARGET = bitset_test

is the first line of bitset_test.mak. All of the .mak files use relative paths and use the assumptions in the above bullets to find the source and create an executable. For example, OBJFILES is defined as:

OBJFILES = obj\$(TARGET).obj

which resolves to obj\bitset_test.obj at make-time.

Just like the files on SourceForge.net, the tests I received from Andy were in a tests/ folder. I renamed this to UnitTests (to be compatible with other projects) and put it in the root of my STLplus3 folder.

Most of the unit tests contained data files that were used to verify that structures were written and read correctly. I decided to put the unit test executables in the ~/UnitTests folder so that each one could access the specialized datafiles it needed. This violated stipulation #2 (eliminate compiler pollination into source files) but getting around it was not worthwhile because of the difficulty and the increased maintenance costs of any other solution I could think of. The lesson here is to know when to bend (or break) the rules.

All of the .mak files were put into the ~/IDE/BCB6/UnitTests folder. A script called ~/IDE/BCB6/make_unittest.bat runs make on each .mak file and creates the executables in the ~/UnitTests/X folder (where X is a specific test).

I created another script, ~/UnitTests/Run_Tests.bat, that executes each test and pauses if there is an error. Run_Tests.bat counts the successes and failures and reports the totals at the end of the run.

Of the 18 unit tests Andy provided me, the only one that fails is dynaload_test because it's trying to load a DLL that I don't have source code for. I felt this was okay because I never use the dynaload feature of STLplus.

Conclusion

If you do any STL programming, get STLplus3. It provides extensions and containers that will help you be more productive.

If you never do any STL programming because you don't like the STL syntax or you are discouraged by STL's inability to do simple things, like trimming strings, then get STLplus3. Its code examples will help you learn STL and become more comfortable with it.

STLplus3 will save you time doing the trivial things you need to do in almost every program and prevent you from reinventing the wheel.

The cross-platform and cross-compiler capabilities in the library allow your C++ programs to be reliably portable to environments you might not normally use, such as MacOS.

The library works. It's solidly built, it has years of history behind it and many diverse developers use it.

Andy Rushton, the creator and maintainer of STLplus, has a track record of quickly answering questions posted in the SourceForge.net forum. Throughout the development of this article and the testing of STLplus3, Andy was immediately responsive to all of my questions.

The ability to test the library and having examples from the library's author give me confidence to immediately start using the library on all of my new projects.

Footnotes

[3] The __FILE__ preprocessor macro might not be supported on each compiler but you could easily substitute a constant string with the same effect.

[4] The documentation says the print_X functions (print_vector, for example) are prototyped in strings/string_stl.hpp but they are really prototyped in strings/print_stl.hpp. My reengineering of STLplus3 makes this difference meaningless because I import all of the STLplus3 definitions.

[9] Yes, I know gcc is a command-line compiler and that it's not an IDE. If I were to start over, I suppose I would call the IDE folder 'compiler' but this folder structure is used by too many libraries to change anymore. Besides, IDE is shorter and we all know how much harder it is to type 8 characters instead of 3.

[10] "No connection could be made because the target machine actively refused it."

This article was written by Curtis Krauskopf.

