STLplus3 Review, Upgrades and Reengineering
by Curtis Krauskopf
An old friend emailed me out of the blue a few
weeks ago. It was the first time I'd heard from him
in a long time. His life has changed a lot: he's living
in a different part of the country, he's going to get
married and he's on a fitness program to improve his
life and his lifestyle.
Seeing an old friend improve over time is gratifying
because I get to see the transitions. Improvement is
rarely sudden; it's a slow transformation
over a long time.
I have been using STLplus 2.5 for a very long time
and the library had become an 'old friend' because I
could rely on it so much. I used STLplus 2.5 in every
program I wrote because of the STL helper functions
and the STL-like containers it provided.
I recently discovered STLplus3 3.4 and, similarly to
my real-life friend, it has gone through some transformations
over the years too.
The new STLplus3 version 3.4 code and documentation
have been updated with a modern look that uses widely
acknowledged best practices, such as namespaces.
In case you missed the first
STLplus review, STLplus3 is a collection of C++
components and helper functions that follow the STL
paradigm of containers, iterators, algorithms, function
objects and adapters. STLplus3 expands upon the popular
STL library by providing features the standard STL doesn't
provide, such as:
- Containers
- A smarter smart pointer
- Hash table
- Directed graph
- Bundling
- Object and container persistence
- Portability helpers
- Cross platform functions
- File system access
- TCP classes
- Safe printf-like string formatting
- Simple wildcard pattern matching
- Infinite precision integers
- std::string helpers
Out of the box
STLplus3 is available from SourceForge.net.
The .zip file is only 756k. It contains folders with
the complete source code and documentation along with
makefiles for the gcc compilers and solution files for
Microsoft Visual Studio compilers.
The documentation is still in .html files and the root
page is index.html. Some
of the classes and functions I checked have better documentation
than in STLplus 2.5, with an explanation of the
item and a few lines of sample code. The documentation
has hyperlinks to colorized header files (these are
static HTML files and are not dynamically colorized
from the actual source).
Installation
To install the library, I unzip the distribution file
into c:\cpp\lib\STLplus3
on my development machine. This keeps it separate from
my STLplus 2.5 installation in the ~\lib\STLplus\
folder.
Just like STLplus 2.5, STLplus3 3.4 doesn't have a
Borland-compatible project file in the distribution.
However, the Building and Using the STLplus
Library Collection article in the documentation
indicates that the library is portable to Borland compilers.
Going back in time
One of the first things I do when evaluating STLplus3
3.4 is to reread my original review because it had step-by-step
instructions for installing STLplus 2.5.
In the original review, I manually created a project
group and added a library to it. Next, I manually added
the source for STLplus to the library.
The older STLplus had only one source folder, cleverly
called ~\source\. The
source code for STLplus3 is segregated across five folders:
containers, persistence,
portability, strings
and subsystems. All
of those folders contain header files and all except
the containers folder have .cpp files.
I wonder why Andy Rushton, the creator and maintainer
of STLplus, chose to break the library up like that
but I decide to worry about that later.
Still following my directions from the original review,
I save the project group and the library but I put them
in the root of the STLplus3
folder. I feel this is okay because the Visual Studio
solution file and workspace file are there too.
When I compile the project, I get a compile-time error:
Unable to open include file 'smart_ptr.hpp'.
The smart_ptr.hpp file
is in the containers
folder. I can blame myself for that compile-time error
because I had not added that folder to the library's
include path. A few mouse clicks later and I have the
full path to the containers
folder added to the include list.
The next compile also fails: library too large, please
restart with library page size 32. I have seen that
type of problem before in the old STLplus library. The
solution is to change the TLIB page size from 0x0010
to 0x0080.
My next surprise is that the library compiles successfully.
None of the source code changes I had to make to STLplus
2.5 were needed in STLplus3 3.4.
Warnings
The ~\portability\portability_fixes.hpp
file disables some Borland-specific warnings that I
consider to be important. Lines 64-68 in portability_fixes.hpp
very clearly document which warnings are disabled. However,
the warnings don't need to be disabled. None of the four is
a problem in the STLplus code anymore: W8022 (a virtual function
in a base class is overridden), W8060 (possibly incorrect
assignment), W8008 (condition is always true) and W8066
(unreachable code). I
don't like headers that disable compiler warnings so
I've chosen to comment-out the #pragmas
that disable those warnings.
Recompiling the library in release-mode results in
just 11 warnings. About half are W8004, which is for
an assigned value that is never used. Looking over each
instance of W8004, I decided W8004 is a false-alarm
because even though the variable is assigned a value
at declaration, the variable is used as a parameter
elsewhere in the function. The other warnings are W8027,
which complains that functions containing do are not
expanded inline.
In debug-mode, the library compiles with no warnings
or errors.
Andy has clearly done a lot of work cleaning up the
library because STLplus 2.5 compiled with dozens of
warnings.
Hash example
My previous review provided an example of using the
Hash container in STLplus. The only change I had to
make to that program for STLplus3 was to qualify the
use of the hash template with the stlplus::
namespace. The hash example is in listing
A.
Listing A
#include <iostream>
#include <string>
#pragma hdrstop
#include <hash.hpp>

using std::string; // to make
using std::cout;   // this listing
using std::endl;   // narrower
using stlplus::hash;

// World's worst hashing algorithm
struct hash_string {
  unsigned operator()
    (const string & s) const
  { return 5; }
};

int main(int argc, char* argv[]) {
  hash<string, string, hash_string>
    symbols;
  symbols["alpha"] = "a";
  symbols["bravo"] = "b";
  cout << "alpha = ";
  cout << symbols["alpha"] << endl;
  cout << "bravo = ";
  cout << symbols["bravo"] << endl;
  if (symbols.present("charlie"))
    cout << "found charlie" << endl;
  else
    cout << "charlie not found" << endl;
  cout << "charlie = ";
  cout << symbols["charlie"] << endl;
  if (symbols.present("charlie"))
    cout << "found charlie" << endl;
  else
    cout << "charlie not found" << endl;
  return 0;
}
// Expected output:
// alpha = a
// bravo = b
// charlie not found
// charlie =
// found charlie
hash.cpp sample program.
The last three lines of the program's output show that
referencing a hash element as an r-value causes it to
be defined in the hash table. This is an expected behavior
for this container and it's documented in STLplus3.
The hash example doesn't test the newly created STLplus3
library because it only uses the hash template which
is defined in the containers folder.
Listing B is an example of
an STLplus program that uses the STLplus3 library. I
used #pragma link to specify
the library. Alternatively, I could have dragged the
library into the project's definition in Project Manager.
In Listing B, stlplus::file_exists
is used to verify that the source file exists. stlplus::folder_part
leaves only the drive and full path of __FILE__.
stlplus::folder_files creates
a vector of the filenames in the same folder as the
source code. The vector is then limited to 3 elements
so that the vector is short enough to print nicely in
this example.
True to its name, stlplus::vector_to_string
converts the vector into a string and separates each
vector element with a comma (specified in the third
parameter). The second parameter of stlplus::vector_to_string
requires a pointer to a function that takes the type
being stored by the std::vector
(in this case a string) and returns a string. STLplus3
provides a suitable function, called stlplus::string_to_string.
If the vector had stored ints,
the helper function would have been stlplus::int_to_string,
as shown in this snippet:
std::vector<int> test;
test.push_back(5);
test.push_back(7);
cout << stlplus::vector_to_string(test, stlplus::int_to_string, ",") << endl;
STLplus provides a convenient function called stlplus::print_vector
that routes the output to any std::ostream.
The stlplus::print_vector
does the same thing as stlplus::vector_to_string
except it uses stlplus::print_string
as a helper function. As would be expected, if the vector
had been of ints, the helper
function would have been stlplus::print_int.
Finally, the stlplus::extension_part
extracts just the extension from the filename.
Listing B
#include <iostream>
#include <string>
#include <vector>
#include <STLplus.hpp>

using std::cout;
using std::endl;

int main(int, char**)
{
  cout << __FILE__ << " exists: ";
  cout << stlplus::file_exists(__FILE__);
  cout << endl;

  std::string root = stlplus::folder_part(__FILE__);
  std::vector<std::string> files = stlplus::folder_files(root);
  if (files.size() > 3) files.resize(3);

  // cout << "Vector uses overloaded extraction operator\n";
  // cout << files << endl;

  cout << "Using vector_to_string...\n";
  cout << stlplus::vector_to_string(files, stlplus::string_to_string, ",") << endl;

  cout << "Using print_vector...\n";
  stlplus::print_vector(cout, files, stlplus::print_string, ",");
  cout << endl;

  cout << "Source file extension: ";
  cout << stlplus::extension_part(__FILE__);
  cout << endl;

  std::vector<int> test;
  test.push_back(5);
  test.push_back(7);
  cout << stlplus::vector_to_string(test, stlplus::int_to_string, ",") << endl;
  return 0;
}
// Sample output:
// c:\cpp\lib\STLPlus3\Examples\File_System.cpp exists: 1
// Using vector_to_string...
// Examples.bpg,File_System.bpf,File_System.bpr
// Using print_vector...
// Examples.bpg,File_System.bpf,File_System.bpr
// Source file extension: cpp
// 5,7
File_System.cpp sample program.
The program in listing
B is remarkable in that it's both cross-platform
and cross-compiler compatible. The code doesn't know
or care if the operating system is Windows, Linux, Unix
or even MacOS. The code [3]
can be compiled using a Borland compiler (of course),
or in Visual Studio or even gcc and the program would
still work. What's even more remarkable is that there
isn't a single #if anywhere
in the code. All of the cross-platform and cross-compiler
capabilities are hidden inside of STLplus3's library.
In STLplus 2.5, I was able to include just one file,
STLplus.hpp, and everything
in the library would be included. STLplus3 has moved
away from that paradigm. The modules in STLplus3 are
segregated even though some modules (such as persistence,
subsystems and strings)
depend on other modules (containers
and portability). This
leads to the list of #includes
shown at the top of Listing B
that include enough modules to let the program compile.
Pros and Cons
Andy has fixed some of the gripes I had with STLplus
2.5: CVS files are no longer in the distribution file
and the library does not compile with any warnings (other
than Borland-specific compile-time warnings about not
being able to create a precompiled header).
Some of my gripes from STLplus 2.5 haven't been addressed
yet: there are no unit tests and there are no examples,
other than snippets and short programs in the documentation.
The documentation, by the way, is well-written with
many examples. The documentation for smart pointers,
in particular, has been significantly improved over
the previous versions. Despite the massive refactoring
of the library between STLplus 2.5 and STLplus3 3.4,
the documentation is correct (in most cases) [4].
The code is also well written with copious helpful
comments. STL vendors, such as STLport, oftentimes pride
themselves on efficient algorithms that use advanced
techniques. STLplus is an exception. One of the
users politely asked Andy about optimization and Andy's
response was:
"In my opinion, code should be written
for readability first and only optimised if profiling
shows a need. In my experience (and contrary to popular
belief), most performance problems are caused by poor
data-structure design, not by lack of optimisation of
code. In fact, optimising every line of code tends to
be counter-productive."
I tend to agree with Andy. At one point I had taken
over the maintenance of a miserable C++ project. The
original authors must have thought that the ?:
operator provided superior performance over an if/then
statement because they used it often and in many clever
ways. However, in almost every case, every use of the
?: operator in their code
contained a subtle bug of some sort.
My point is that if you're expecting STLplus3 to use an
optimum algorithm because STL is in the library's name,
you'll be disappointed. Here's a typical example:
std::string trim(const std::string& val)
{
std::string result = val;
while (!result.empty() && isspace(result[0]))
result.erase(result.begin());
while (!result.empty() && isspace(result[result.size()-1]))
result.erase(result.end()-1);
return result;
}
The trim function in
STLplus3 is short, it's easy to understand and a desk-check
of the code shows that it generates the right results
for any string. The only problem is that it's slow,
mostly because the while loops call std::string's
erase method once for each leading and trailing blank
of a string, and every erase at the front of the string
shifts all of the remaining characters down by one. A project I worked on required over
50,000,000 trim operations. When I tried to find out
why the program was so slow, the STLplus3 trim() function
showed up in the profiler as a bottleneck. Listing
C shows my improved version.
Although I'm not professing that listing
C is the best possible trim implementation, my testing
shows that it has a 250% performance improvement. A
desk-check of listing C is
not nearly as easy as a desk-check of the original trim()
function.
But that's also the advantage of using open-source
libraries. If there's something in the library that
doesn't work right or can be improved, the source code
is there to make the changes.
Listing C
std::string trim(const std::string& input)
{
  // If string is empty, there is nothing to look for.
  if (input.empty()) return "";

  // Remove spaces at end
  int last_non_blank = int(input.length()-1);
  int last_character = last_non_blank;
  while ((last_non_blank >= 0) && (isspace(input[last_non_blank]))) {
    --last_non_blank;
  }

  // string full of spaces: return nothing
  if (last_non_blank < 0) return "";

  // Remove spaces at beginning
  // Don't need to check for accessing beyond the string
  // because we've already removed blanks at the end of the
  // string -- if we get this far, the string has non-blanks.
  int first_non_blank = 0;
  while (isspace(input[first_non_blank])) {
    ++first_non_blank;
  }

  if (last_character != last_non_blank || first_non_blank != 0) {
    return input.substr(first_non_blank, last_non_blank + 1 - first_non_blank);
  }
  else {
    return input;
  }
}
Reengineered trim function is
250% faster than the original trim function.
From a usability point of view, STLplus3 is more difficult
to use than STLplus 2.5. The new library requires five folders
to be added to the include path (containers,
persistence, portability,
strings and subsystems)
and up to five #include
directives in the code using the library. The previous
library only required one path to be included and one
#include directive would
slurp in the entire library.
Reengineering STLplus3
Andy Rushton has the same challenges, tradeoffs and
questions that any library creator faces for a cross-platform,
cross-compiler-compatible library:
- Should the library's source code be segregated or
should there be a monolithic source folder?
- Will the linker on the target system be smart enough
to jettison unused parts of the library or will it
try to import all two megabytes of object code?
- Should test modules be included in the distribution
source even though they are unsupported and might
not even compile on the target system?
- Will users who only need a small portion of the
library be discouraged from using it if there are
many subtle interdependencies?
When I interviewed Andy about his reason for modularizing
the library's source, he responded:
"One reason for splitting it up was
that the library was large and varied and quite a few
people fed back that this made it hard to understand.
The separation also means that people can cherry-pick
parts and I've done some work on reorganising some of
the code to reduce inter-dependencies so that most libraries
can be used stand-alone. I'd say the main reason though
was to clarify what STLplus is for - it extends the
STL (containers), implements serialisation (persistence),
provides platform independent layers to platform-specific
features (portability) etc."
I understand the design decisions that Andy made. Because
of my experience with STLplus 2.5 and because I will
probably use STLplus3 in almost every program I write,
I feel it's worthwhile to spend some time making STLplus3
easier for me (and others) to use.
The same, only different
I use a large number of third-party libraries. Over
the years, I've standardized on a common folder structure
for most of the libraries so that it's easier for me
to use them. The standard folder structure looks like
the one in Figure A.
[Figure A: a standardized folder structure I use for most of the third-party libraries.]
The IDE folder segregates the files used by individual
compilers. In Figure A, the
three compiler environments defined are BCB6, gcc [9]
and VS7. The BCB6 folder contains three folders: lib,
obj and objd.
The lib folder contains
all of the library permutations I need (debug, release,
single-threaded, and multi-threaded). The obj
and objd folders contain
the .obj files for the single-threaded release and debug
libraries respectively.
The stipulations of this structure are to:
- Organize the library so that source files are in
specific folders and compiler-specific files are in
other folders.
- Eliminate the pollination of compiler-specific files
into the library's source files.
- Provide a consistent but flexible structure that
can be used by multiple open-source libraries.
- Provide a way to segregate compiler-specific files
based on the compiler's version.
The simplicity of the structure is in the include
folder. It is the only one I have to examine when I
use the library. It contains a header file with the
same name as the library (STLplus3.hpp).
For backwards compatibility, STLplus3.hpp
includes STLplus.hpp.
When STLplus.hpp is compiled on a Borland compiler,
a #pragma directive in
the file automatically links the correct library from
the ~/IDE/BCB6/lib folder.
Listing D shows the contents
of STLplus.hpp with the
#pragma directive. The
exact library chosen from the lib
folder depends on the _DEBUG conditional
define. When _DEBUG is
defined, the STLplus3d.lib
file is linked in. When _DEBUG
is not defined, STLplus3.lib
is linked in.
Listing D
#ifndef STLPLUS_HPP
#define STLPLUS_HPP

#include <containers.hpp>
#include <persistence.hpp>
#include <portability.hpp>
#include <strings.hpp>
#include <subsystems.hpp>

#ifdef __BORLANDC__
#  ifdef _DEBUG
#    pragma link "c:/cpp/lib/stlplus3/IDE/BCB6/lib/stlplus3d.lib"
#  else
#    pragma link "c:/cpp/lib/stlplus3/IDE/BCB6/lib/stlplus3.lib"
#  endif
#endif

#endif
STLplus.hpp file conditionally
links in either the debug version of STLplus (stlplus3d.lib)
or the release version of STLplus (stlplus3.lib).
The advantage of this is that I can choose a release
build or a debug build and automatically have the correct
library linked into the project without changing the
library loaded in my project files.
The programs that use the library only need to include
one file (STLplus3.hpp)
and only add one include folder (~/include)
to the compiler's search path.
The examples folder in figure
A could either be complete programs or they could
be program snippets. The examples that I'm putting in
my examples folder are
the sample programs from this article.
The src folder contains
all of the library's source in a monolithic structure
(minus the header files that are in the include
folder).
The docs and unitTest
folders contain the files you think they do.
Getting Testy
While I was exploring in the STLplus CVS repository
on SourceForge.net, I discovered a folder called tests/.
I was surprised that this folder existed in the repository
because it wasn't included in the distribution .zip
file.
The tests folder on
SourceForge.net contains unit tests for 18 parts of
STLplus3.
These were precisely the unit tests and examples that
I complained about in the Pros
and Cons section. The user interface for SourceForge.net's
repository is stuck in the early 1980s so there is no
way to easily download the entire folder. Instead, each
file in the folder structure needs to be downloaded
separately.
Searching for a better way, I discovered that SourceForge.net
supports TortoiseCVS, a client that hooks into the Windows
Explorer GUI. Unfortunately, all of my attempts to download
files using TortoiseCVS were thwarted by an error message
[10].
Frustrated, I told Andy of my problem and I asked him
to email the test folder to me. He immediately obliged
and I had all 18 unit tests available to me.
All of the tests follow the same pattern:
- The source for each test is segregated into separate
folders.
- The cpp, executable and folder all use the same
name.
- Unit test failures cause the program to return a
non-0 value.
- Unit test successes cause the program to return
a 0.
After some experimenting with different techniques,
I decided that the best way to compile the unit tests
was directly from the command line using make.
The only difference between each .mak file is in the
first line. For example:
TARGET = bitset_test
is the first line of bitset_test.mak.
All of the .mak files use relative paths and use the
assumptions in the above bullets to find the source
and create an executable. For example, OBJFILES
is defined as:
OBJFILES = obj\$(TARGET).obj
which resolves to obj\bitset_test.obj
at make-time.
Just like the files on SourceForge.net, the tests I
received from Andy were in a tests/
folder. I renamed this to UnitTests
(to be compatible with other projects) and put it in
the root of my STLplus3
folder.
Most of the unit tests contained data files that were
used to verify that structures were written and read
correctly. I decided to put the unit test executables
in the ~/UnitTests folder
so that each one could access the specialized datafiles
it needed. This violated stipulation #2 (eliminate compiler
pollination into source files) but getting around it
was not worthwhile because of the difficulty and the
increased maintenance costs of any other solution I
could think of. The lesson here is to know when to bend
(or break) the rules.
All of the .mak files were put into the ~/IDE/BCB6/UnitTests
folder. A script called ~/IDE/BCB6/make_unittest.bat
runs make on each of the .mak files and creates executables in the
~/UnitTests/X folder
(where X is a specific test).
I created another script in ~/UnitTests/Run_Tests.bat
that executes each test and pauses if there is an error.
Run_Tests.bat counts the
number of successes and failures and reports them at the
end of the run.
Of the 18 unit tests Andy provided me, the only one
that fails is dynaload_test
because it's trying to load a DLL that I don't have
source code for. I felt this was okay because I never
use the dynaload feature of STLplus.
Conclusion
If you do any STL programming, get STLplus3. It provides
extensions and containers that will help you be more productive.
If you never do any STL programming because you don't
like the STL syntax or you are discouraged by STL's
inability to do simple things, like trimming strings,
then get STLplus3. Its code examples will help you learn
STL and become more comfortable with it.
STLplus3 will save you time doing the trivial things
you need to do in almost every program and prevent you
from reinventing the wheel.
The cross-platform and cross-compiler capabilities
in the library allow your C++ programs to be reliably
portable to environments you might not normally use,
such as MacOS.
The library works, it's solidly built with years of
history and with many diverse developers using it.
Andy Rushton, the creator and maintainer of STLplus,
has a track record of quickly answering questions posted
in the SourceForge.net forum. Throughout the development
of this article and the testing of STLplus3, Andy was
immediately responsive to all of my questions.
The ability to test the library and having examples
from the library's author give me confidence to immediately
start using the library on all of my new projects.
Footnotes
[3] The __FILE__ preprocessor
macro might not be supported on each compiler but you
could easily substitute a constant string with the same
effect.
[4] The documentation says
the print_X functions (print_vector, for example) are
prototyped in strings/string_stl.hpp but they are really
prototyped in strings/print_stl.hpp. My reengineering
of STLplus3 makes this difference meaningless because
I import all of the STLplus3 definitions.
[9] Yes, I know gcc is a command-line
compiler and that it's not an IDE. If I were to start
over, I suppose I would call the IDE folder 'compiler'
but this folder structure is used by too many libraries
to change anymore. Besides, IDE is shorter and we all
know how much harder it is to type 8 characters instead
of 3.
[10] "No connection could
be made because the target machine actively refused
it."
This article was written by Curtis Krauskopf.