July, 2008
This is a string library that is intended to be compatible with the class string library in the C++ standard. My version is for strings of characters of type char only.
It is for people who do not have access to an official version of the string library or wish to use a version without templates.
It follows the standard class string as I understand it, except that a few functions that are relevant only to the template version are omitted, and all the functions involving iterators are omitted.
I use the name String rather than string to prevent conflicts with other string libraries (as in BC 5.0).
The initial version was taken from Tony Hansen's book The C++ answer book, but very little of Tony's code remains.
Permission is granted to use/modify/distribute this. If you distribute it or put it on your web site please include a link to my site. If you distribute a modified version please make it clear which bits are mine and which are yours. I take no responsibility for errors, omissions etc, but please tell me about them.
This library links into my exception package. If you are using a very old compiler, you may need to edit the file include.h to determine whether to use simulated exceptions or compiler supported exceptions or simply to disable exceptions. More information on the exception package is given in the documentation for my matrix library, newmat11.
The package uses a limited form of copy-on-write (see Tony Hansen's book for more details) and also attempts to avoid repeated reallocation of the string storage during a multiple sum. This results in some saving in space and time for some operations at the expense of an increase in the complexity of the program and an increase in the time used by a few operations. As with newmat it is still an open question whether, and under what circumstances, the extra complexity is really warranted.
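For readers unfamiliar with the term, here is a minimal sketch of the general copy-on-write idea. It is not the code used in this package: the class name CowText and its members are hypothetical, it is built on std::string and std::shared_ptr, and it assumes a C++11 compiler. Copies share one buffer, and a copy gets its own buffer only when it is about to be modified.

```cpp
#include <cassert>
#include <memory>
#include <string>

// Hypothetical illustration of copy-on-write; not the library's own code.
class CowText {
public:
    explicit CowText(const char* s) : rep_(std::make_shared<std::string>(s)) {}

    // The implicitly generated copy operations copy only the pointer,
    // so copies share one buffer until one of them is modified.
    const std::string& read() const { return *rep_; }

    void append(char c) {
        detach();
        rep_->push_back(c);
    }

    bool shares_with(const CowText& other) const { return rep_ == other.rep_; }

private:
    // Take a private copy of the buffer if another object still uses it.
    void detach() {
        if (rep_.use_count() > 1)
            rep_ = std::make_shared<std::string>(*rep_);
    }

    std::shared_ptr<std::string> rep_;
};

void cow_demo() {
    CowText a("abc");
    CowText b = a;               // cheap: no characters are copied
    assert(a.shares_with(b));
    b.append('d');               // b detaches before writing
    assert(!a.shares_with(b));
    assert(a.read() == "abc");
    assert(b.read() == "abcd");
}
```

The extra test in detach() is the kind of cost referred to above: every modifying operation pays a little so that plain copies can be cheap.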
This package includes simple functions for manipulating strings and a class for extracting information from the command line.
It also includes class libraries to help format numerical output and to edit ASCII files. They are documented in separate files.
The following files are included in this package:
str.h | header file for the string library |
str.cpp | function bodies |
str_fns.h | header file for string functions |
str_fns.cpp | string functions bodies |
commline.h | command line class header |
commline.cpp | command line bodies |
myexcept.h | header for the exceptions simulator |
myexcept.cpp | bodies for the exceptions simulator |
include.h | options header file (see documentation in newmat11) |
strtst.cpp | test program |
strtst.dat | data file used by test program |
strtst.txt | output from the test program |
test_exs.cpp | test exceptions |
test_exs.txt | output from test_exs |
readme.txt | readme file |
string.htm | this file |
rbd.css | style sheet for use with htm files |
st_gnu.mak | make file for gnu c++ |
st_cc.mak | make file for CC compiler |
st_b55.mak | make file for Borland C++ 5.5 |
st_b56.mak | make file for Borland C++ 5.6 |
st_b58.mak | make file for Borland C++ 5.8 |
st_m6.mak | make file for Visual C++ version 6 or 7 |
st_m8.mak | make file for Visual C++ version 8 |
st_i8.mak | make file for Intel compiler for Windows, v8,9 |
st_i10.mak | make file for Intel compiler for Windows, v10 |
st_il8.mak | make file for Intel compiler for Linux, v8,9,10 |
st_ow.mak | make file for Open Watcom compiler |
str.lfl | library file list for make file generator |
st_targ.txt | target file for make file generator |
format.h | header file for format program |
format.cpp | bodies for format program |
formtest.cpp | test program for format program |
formtest.txt | output from test program |
format.htm | documentation for format program |
gstring.h | header file for gstring ascii file editor |
gstring.cpp | bodies for gstring program |
liststr.cpp | bodies for gstring program |
lstst.cpp | test program |
fox.dat | test data file |
lstst.dat | test data file |
lstst.txt | output from test program |
gstring.htm | documentation for gstring program |
I have tested this program on recent versions of the Borland, Microsoft, Gnu, Intel, Sun and Open Watcom compilers.
You may need to edit include.h - but it will probably work for you as is. See the newmat documentation for more information about editing include.h.
Activate the _STANDARD_ option to use the form of include statements used in standard C++ (automatic for recent versions of Borland, Microsoft, Gnu and Intel compilers).
Activate the use_namespace option to put the string library in namespace RBD_STRING.
The GString library which is included in this package uses nested classes and will not compile under older compilers.
Some CC compilers generate 33 error messages when running the strtst test program. I suspect these are due to a slightly different convention in deleting temporaries and don't matter.
For the indexes, lengths etc I use unsigned int (typedefed to uint). This is instead of size_type in the official package.
You will need to #include files include.h and str.h in your programs that use this package. Don't forget to edit include.h to determine whether exceptions are to be used, simulated or disabled. If you use the simulated exceptions you should turn off the exception capability of a compiler that does support exceptions.
I have included make files for a variety of compilers for compiling the test programs. Make files for some other compilers can be generated using my genmake utility. The file st_targ.txt gives the list of targets for genmake and str.lfl has the list of names of the libraries. See the genmake documentation for more details about the make files.
static uint npos | String::npos is the largest possible value of uint and is used to indicate that a find function has failed to find its target. All Strings must have length strictly less than String::npos |
String() | construct a String of zero length |
String(const String&str) | copy constructor (not explicitly in standard) |
String(const String&str, uint pos, uint n = npos) | construct a String from str starting at location pos (first location = 0) and continuing for the length of the String or for n characters, whichever occurs first |
String(const char* s, uint n) | construct a String from s taking a maximum of n characters or the length of the String |
String(const char* s) | construct a String from s |
String(uint n, char c) | construct a String consisting of n copies of the character c |
~String() | the destructor |
String& operator=(const String& str) | copy a String (except that it may be able to avoid copying) |
String& operator=(const char* s) | set a String equal to a c-style character string pointed to by s |
String& operator=(const char c) | set a String equal to a character |
uint size() const | the length of the String (does not include a trailing zero - in most cases there isn't one) |
uint length() const | same as size |
uint max_size() const | the maximum size of a String, I have set it to npos-1 |
void resize(uint n, char c = 0) | change the size of a String, either by truncating or filling out with copies of character c (std does default separately) |
uint capacity() const | the total space allocated for a String (always >= size()) |
void reserve(uint res_arg = 0) | change the capacity of a String to the maximum of res_arg and size(). This may be an increase or a decrease in the capacity. |
void clear() | erase the contents of the string |
bool empty() const | true if the String is empty; false otherwise |
char operator[](uint pos) const | return the pos-th character; return 0 if pos = size() |
char& operator[](uint pos) | return a reference to the pos-th character; undefined if pos>=size() - I throw an exception. This reference may become invalid after almost any manipulation of the String |
char at(uint n) const | same as operator[] const |
char& at(uint n) | same as operator[]. Throw an exception if n >= size() |
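The following sketch shows how the constructors, resize and at() behave. It uses std::string as a stand-in, on the assumption that String follows the standard class here; it has not been run against this package. Two differences are worth noting: std::string(s, n) requires that s really has at least n characters, and the const version of std::string::at also throws, whereas the table above says the const at in this package behaves like the const operator[]. The helper names at_throws and constructor_demo are hypothetical.

```cpp
#include <cassert>
#include <stdexcept>
#include <string>

// True if the non-const s.at(pos) throws std::out_of_range.
bool at_throws(std::string& s, std::string::size_type pos) {
    try {
        char& ref = s.at(pos);
        (void)ref;
    } catch (const std::out_of_range&) {
        return true;
    }
    return false;
}

void constructor_demo() {
    const std::string digits("0123456789");
    assert(std::string(digits, 7, 100) == "789");    // stops at the end of digits
    assert(std::string(digits, 2, 3) == "234");      // 3 characters from position 2
    assert(std::string("0123456789", 4) == "0123");  // first 4 characters of a c-string
    assert(std::string(3, 'x') == "xxx");            // 3 copies of 'x'
    assert(std::string().empty());

    std::string s("abc");
    s.resize(5, '-');
    assert(s == "abc--");                            // filled out with '-'
    s.resize(2);
    assert(s == "ab");                               // truncated
    assert(s.capacity() >= s.size());
    assert(!at_throws(s, 1));
    assert(at_throws(s, 2));                         // pos == size() is out of range
}
```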
For conditions under which references and pointers to data are invalidated by these functions see policy on reallocation.
String& operator+=(const String& rhs) | append rhs to a String |
String& operator+=(const char* s) | append the c-string defined by s to a String |
String& operator+=(char c) | append the character c to a String |
String& append(const String& str) | append str to a String |
String& append(const String& str, uint pos, uint n) | append String(str,pos,n) |
String& append(const char* s, uint n) | append String(s,n) |
String& append(const char* s) | append String(s) |
String& append(uint n, char c) | append String(n,c), i.e. n copies of the character c |
void push_back(char c) | operator+=(c) |
String& assign(const String& str) | replace the String by str (this function is not explicitly in the standard) |
String& assign(const String& str, uint pos, uint n) | replace the String by String(str,pos,n) |
String& assign(const char* s, uint n) | replace the String by String(s, n) |
String& assign(const char* s) | replace the String by String(s) |
String& assign(uint n, char c) | replace the String by String(n,c) |
String& insert(uint pos1, const String& str) | insert str before character pos1 |
String& insert(uint pos1, const String& str, uint pos2, uint n) | insert String(str,pos2,n) before character pos1 |
String& insert(uint pos, const char* s, uint n = npos) | insert String(s,n) before character pos (std does default separately) |
String& insert(uint pos, uint n, char c) | insert String(n,c) before character pos |
String& erase(uint pos = 0, uint n = npos) | erase characters starting at pos and continuing for n characters or till the end of the String. This was originally called remove |
String& replace(uint pos1, uint n1, const String& str) | erase(pos1,n1); insert(pos1,str) |
String& replace(uint pos1, uint n1, const String& str, uint pos2, uint n2) | erase(pos1,n1); insert(pos1,str,pos2,n2) |
String& replace(uint pos, uint n1, const char* s, uint n2 = npos) | erase(pos,n1); insert(pos,s,n2); (std does default separately) |
String& replace(uint pos, uint n1, uint n2, char c) | erase(pos,n1); insert(pos,n2,c) |
uint copy(char* s, uint n, uint pos = 0) const | copy a maximum of n characters from a string starting at position pos to memory starting at location given by s. Return the number of characters copied. I assume that the program has already allocated space for the characters |
void swap(String&) | a.swap(b) swaps the contents of Strings a and b. The standard also provides for a function swap(a,b) - see binary operators |
const char* c_str() const | return a pointer to the contents of a String after appending (char)0 to the String. This pointer will be invalidated by almost any operation on the String |
const char* data() const | return a pointer to the contents of a String. This pointer will be invalidated by almost any operation on the String |
uint find(const String& str, uint pos = 0) const | find the first location of str in a String starting at position pos. The location is relative to the beginning of the parent String. Return String::npos if not found |
uint find(const char* s, uint pos, uint n) const | find(String(s,n),pos) |
uint find(const char* s, uint pos = 0) const | find(String(s),pos) |
uint find(const char c, uint pos = 0) const | find(String(1,c),pos) |
uint rfind(const String& str, uint pos = npos) const | find the last location of str in a String at or before position pos, i.e. begin the search with the first character of str at position pos of the target String and work backwards. The location is relative to the beginning of the parent String. Return String::npos if not found |
uint rfind(const char* s, uint pos, uint n) const | rfind(String(s,n),pos) |
uint rfind(const char* s, uint pos = npos) const | rfind(String(s),pos) |
uint rfind(const char c, uint pos = npos) const | rfind(String(1,c),pos) |
uint find_first_of(const String& str, uint pos = 0) const | find the first character, at or after position pos, that matches any character in str. Return String::npos if not found |
uint find_first_of(const char* s, uint pos, uint n) const | find_first_of(String(s,n),pos) |
uint find_first_of(const char* s, uint pos = 0) const | find_first_of(String(s),pos) |
uint find_first_of(const char c, uint pos = 0) const | find_first_of(String(1,c),pos) |
uint find_last_of(const String& str, uint pos = npos) const | find the last character, at or before position pos, that matches any character in str. Return String::npos if not found |
uint find_last_of(const char* s, uint pos, uint n) const | find_last_of(String(s,n),pos) |
uint find_last_of(const char* s, uint pos = npos) const | find_last_of(String(s),pos) |
uint find_last_of(const char c, uint pos = npos) const | find_last_of(String(1,c),pos) |
uint find_first_not_of(const String& str, uint pos = 0) const | find the first character, at or after position pos, that does not match any character in str. Return String::npos if not found |
uint find_first_not_of(const char* s, uint pos, uint n) const | find_first_not_of(String(s,n),pos) |
uint find_first_not_of(const char* s, uint pos = 0) const | find_first_not_of(String(s),pos) |
uint find_first_not_of(const char c, uint pos = 0) const | find_first_not_of(String(1,c),pos) |
uint find_last_not_of(const String& str, uint pos = npos) const | find the last character, at or before position pos, that does not match any character in str. Return String::npos if not found |
uint find_last_not_of(const char* s, uint pos, uint n) const | find_last_not_of(String(s,n),pos) |
uint find_last_not_of(const char* s, uint pos = npos) const | find_last_not_of(String(s),pos) |
uint find_last_not_of(const char c, uint pos = npos) const | find_last_not_of(String(1,c),pos) |
String substr(uint pos = 0, uint n = npos) const | return String(*this, pos, n) |
int compare(const String& str) const | a.compare(b) compares a and b in normal sort order. Return -1, 0 or 1 |
int compare(uint pos, uint n, const String& str) const | a.compare(pos,n,b) compares String(a,pos,n) and b in normal sort order. Return -1, 0 or 1 |
int compare(uint pos1, uint n1, const String& str, uint pos2, uint n2) const | a.compare(pos1,n1,b,pos2,n2) compares String(a,pos1,n1) and String(b,pos2,n2) in normal sort order. Return -1, 0 or 1 |
int compare(const char* s) const | return compare(String(s)) |
int compare(uint pos1, uint n1, const char* s, uint n2 = npos) const | return compare(pos1, n1, String(s,n2)) |
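The search and compare functions are easiest to follow on a concrete string. The sketch below uses std::string as a stand-in, assuming String gives the same positions; it has not been run against this package. One known difference: std::string::compare promises only a negative, zero or positive value, so the hypothetical helper sign() reduces it to the -1, 0 or 1 described above.

```cpp
#include <cassert>
#include <string>

// Reduce any int to -1, 0 or 1.
int sign(int v) { return (v > 0) - (v < 0); }

void find_demo() {
    // positions:        0123456789 10
    const std::string s("one two one");

    assert(s.find("one") == 0);               // first occurrence
    assert(s.find("one", 1) == 8);            // first occurrence at or after 1
    assert(s.rfind("one") == 8);              // last occurrence
    assert(s.rfind("one", 7) == 0);           // last occurrence starting at or before 7
    assert(s.find("three") == std::string::npos);

    assert(s.find_first_of("wt") == 4);       // 't'
    assert(s.find_last_of("wt") == 5);        // 'w'
    assert(s.find_first_not_of("one ") == 4); // 't' is the first character outside the set
    assert(s.find_last_not_of("one ") == 5);  // 'w' is the last character outside the set

    assert(s.substr(4, 3) == "two");

    assert(sign(std::string("abc").compare("abd")) == -1);
    assert(sign(std::string("abd").compare("abc")) == 1);
    assert(std::string("xxabc").compare(2, 3, "abc") == 0);
    assert(std::string("xxabc").compare(2, 3, std::string("abcd"), 0, 3) == 0);
}
```

Always test the result of a find function against npos before using it as an index.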
+ means concatenate, otherwise the meanings are obvious.
String operator+(const String& lhs, const String& rhs)
String operator+(const char* lhs, const String& rhs)
String operator+(char lhs, const String& rhs)
String operator+(const String& lhs, const char* rhs)
String operator+(const String& lhs, char rhs)
bool operator==(const String& lhs, const String& rhs)
bool operator==(const char* lhs, const String& rhs)
bool operator==(const String& lhs, const char* rhs)
bool operator!=(const String& lhs, const String& rhs)
bool operator!=(const char* lhs, const String& rhs)
bool operator!=(const String& lhs, const char* rhs)
bool operator<(const String& lhs, const String& rhs)
bool operator<(const char* lhs, const String& rhs)
bool operator<(const String& lhs, const char* rhs)
bool operator>(const String& lhs, const String& rhs)
bool operator>(const char* lhs, const String& rhs)
bool operator>(const String& lhs, const char* rhs)
bool operator<=(const String& lhs, const String& rhs)
bool operator<=(const char* lhs, const String& rhs)
bool operator<=(const String& lhs, const char* rhs)
bool operator>=(const String& lhs, const String& rhs)
bool operator>=(const char* lhs, const String& rhs)
bool operator>=(const String& lhs, const char* rhs)
void swap(const String& A, const String& B)
The stream functions - slightly rough implementation as yet:
istream& operator>>(istream& is, String& str)
... read token from istream
ostream& operator<<(ostream& os, const String& str)
... output a String
istream& getline(istream& is, String& str, char delim = '\n')
... read a line
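A short sketch of how the three stream functions are typically used, written against std::string and the standard string streams as a stand-in. It assumes the String versions read tokens and lines in the same way, which has not been checked against this package. The helper names read_tokens, read_lines and stream_demo are hypothetical.

```cpp
#include <cassert>
#include <sstream>
#include <string>
#include <vector>

// Read whitespace-separated tokens with operator>>.
std::vector<std::string> read_tokens(std::istream& is) {
    std::vector<std::string> tokens;
    std::string tok;
    while (is >> tok) tokens.push_back(tok);
    return tokens;
}

// Read delimiter-separated pieces with getline; the delimiter is discarded.
std::vector<std::string> read_lines(std::istream& is, char delim = '\n') {
    std::vector<std::string> lines;
    std::string line;
    while (std::getline(is, line, delim)) lines.push_back(line);
    return lines;
}

void stream_demo() {
    std::istringstream words("alpha  beta\ngamma");
    std::vector<std::string> t = read_tokens(words);
    assert(t.size() == 3 && t[0] == "alpha" && t[2] == "gamma");

    std::istringstream text("first line\nsecond line\n");
    std::vector<std::string> l = read_lines(text);
    assert(l.size() == 2 && l[1] == "second line");

    std::istringstream fields("a;b;c");
    assert(read_lines(fields, ';').size() == 3);

    std::ostringstream out;
    out << t[1];                     // operator<< writes the characters as they are
    assert(out.str() == "beta");
}
```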
This section discusses under what circumstances the String data in a String object will be moved. It is unclear to me what the standard allows. Moving the String data invalidates the const char* returned by .data() and .c_str() and any reference returned by the non-const versions of .at() or operator[] (and any iterators referring to the string).
I describe here what my program does. Another standard String package may (and probably does) follow different rules.
The value returned by .c_str() will most likely become invalid under almost any operation on the String which changes the value of the String. Also a call to .c_str() will invalidate a const char* returned by .data() and any reference returned by .at() or operator[].
If A is a String that has been assigned a capacity with the reserve function then the following functions will not cause a reallocation (so the value returned by .data() etc. will remain valid)
A += ...
A.assign(...)
A.append(...)
A.insert(...)
A.erase(...)
A.replace(...)
where ... denotes a legitimate argument, providing the resulting String will fit in the assigned capacity (as set by a call to reserve).
If the resulting String will not fit into the assigned capacity the String data will be moved (so the value returned by .data() etc. will not remain valid). Also the String will no longer be regarded as having an assigned capacity.
The concept of having an assigned capacity is important in considering the behaviour of assign, erase and replace when the parameters are such that length of the String is reduced. For example
String A = "0123456789";
A.reserve(1);                  // will set capacity to A.size() = 10
const char* d = A.data();
A.erase(1,9);
will leave a valid value in d whereas
String A = "0123456789";
const char* d = A.data();
A.erase(1,9);
will not leave a valid value in d since the storage of the String data will have been moved.
The operator= does not conform to these rules. A = something will always remove any assigned capacity for A (and will not pick up any capacity from the something).
In this package A.reserve() or A.reserve(0) will remove any assigned capacity. i.e. it will be as though no capacity had ever been assigned. So an erase or a replace that changes a length will cause a reallocation.
But don't expect anyone else's package to follow these rules.
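The rules in this section can be restated as a small bookkeeping model. This is only a sketch of the policy as described in the text, not the package's code: the struct ReallocModel and its members are hypothetical, it tracks sizes rather than characters, it assumes the storage is refitted exactly when the data is moved, and it does not count any move made by reserve itself.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of the reallocation rules; sizes only, no characters.
struct ReallocModel {
    std::size_t size;
    std::size_t capacity;
    bool assigned;   // true while a capacity set by reserve(n), n > 0, is in force
    int moves;       // number of times a length change has moved the data

    explicit ReallocModel(std::size_t n)
        : size(n), capacity(n), assigned(false), moves(0) {}

    // capacity becomes max(n, size); reserve(0) removes the assigned capacity.
    void reserve(std::size_t n) {
        assigned = (n != 0);
        capacity = (n > size) ? n : size;
    }

    // Stands for +=, assign, append, insert, erase or replace
    // leaving a String of length new_size.
    void change_length(std::size_t new_size) {
        if (!(assigned && new_size <= capacity)) {
            ++moves;              // data moved: old pointers and references are invalid
            capacity = new_size;  // assumption: storage is refitted exactly
            assigned = false;     // the assigned capacity is lost
        }
        size = new_size;
    }
};

void realloc_demo() {
    ReallocModel a(10);      // String A = "0123456789";
    a.reserve(1);            // capacity = max(1, 10) = 10, now assigned
    a.change_length(1);      // A.erase(1,9): fits, so the data stays put
    assert(a.moves == 0 && a.capacity == 10);

    ReallocModel b(10);      // same String, no reserve
    b.change_length(1);      // A.erase(1,9): the data is moved
    assert(b.moves == 1);

    a.change_length(11);     // does not fit in the assigned capacity
    assert(a.moves == 1 && !a.assigned);
}
```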
The evaluation of the concatenation expression A+B is delayed until the expression is used or until the value is referred to twice. This means that expressions such as A+B+C are evaluated in one sweep rather than having A+B formed as a temporary before evaluating A+B+C.
Unfortunately, this means that in expressions such as A + c_string the c-string c_string will be converted to a String object, before the overall String is formed. Since c-strings will usually be small I don't see this as a serious problem.
Likewise A+=X or A.append(X) will not be evaluated until the result is used (unless A has been assigned a capacity that is large enough to accommodate X). This means that sequences like
A += X1; A += X2; ...
will not cause repeated reallocations of the space used by the String data.
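The benefit of delaying the evaluation can be seen in a small sketch. This is not how the package implements it: the class LazyConcat is hypothetical and is built on std::string. It only records the operands, then builds the result with a single allocation once the total length is known.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical sketch of delayed concatenation; not the library's code.
// Operands are recorded by address, so they must outlive the call to str().
class LazyConcat {
public:
    LazyConcat& add(const std::string& s) {
        parts_.push_back(&s);
        return *this;
    }

    std::size_t total_length() const {
        std::size_t total = 0;
        for (std::size_t i = 0; i < parts_.size(); ++i) total += parts_[i]->size();
        return total;
    }

    // Build the result in one sweep.
    std::string str() const {
        std::string out;
        out.reserve(total_length());   // one allocation for the whole result
        for (std::size_t i = 0; i < parts_.size(); ++i) out += *parts_[i];
        return out;
    }

private:
    std::vector<const std::string*> parts_;
};

void lazy_concat_demo() {
    const std::string a("delayed "), b("concat"), c("enation");
    LazyConcat expr;
    expr.add(a).add(b).add(c);         // nothing is copied yet
    assert(expr.total_length() == 21);
    assert(expr.str() == "delayed concatenation");
}
```

Do not pass a temporary to add(), since the sketch keeps only its address.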
These are a set of simple functions for manipulating strings. You need the header file str_fns.h and body file str_fns.cpp.
String ToString(int i) | Convert int to string |
String ToString(long i) | Convert long to string |
String ToString(double f, int ndec = 4) | Convert double to string; ndec determines the number of decimal places |
void UpperCase(String& S) | Convert string to upper case |
void LowerCase(String& S) | Convert string to lower case |
bool IsInt(const String& S) | Does a string represent an integer? |
bool IsFloat(const String& S) | Does a string represent a floating point number (includes integer, does allow for E format)? |
inline bool Contains(const String& S, const String& str) | Does S contain str? |
inline bool Contains(const String& S, const char* s) | Does S contain s? |
inline bool Contains(const String& S, char c) | Does S contain c? |
inline bool ContainsAnyOf(const String& S, const String& str) | Does S contain any of the characters of str? |
inline bool ContainsAnyOf(const String& S, const char* s) | Does S contain any of the characters of s? |
inline bool ContainsAnyOf(const String& S, char c) | Does S contain c? |
inline bool ContainsOnly(const String& S, const String& str) | Does S contain only characters of str? |
inline bool ContainsOnly(const String& S, const char* s) | Does S contain only characters of s? |
inline bool ContainsOnly(const String& S, char c) | Does S contain only the character c? |
int sf(String& S, const String& s1, const String& s2) | Replace the first copy of s1 in S by s2. Return the number of changes (0 or 1) |
int sl(String& S, const String& s1, const String& s2) | Replace the last copy of s1 in S by s2. Return the number of changes (0 or 1) |
int sa(String& S, const String& s1, const String& s2) | Replace all copies of s1 in S by s2. Return the number of changes |
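The three substitution functions can be pictured in terms of find, rfind and replace. The sketch below is an illustration only, written with std::string and the hypothetical names replace_first, replace_last and replace_all. It assumes that an empty s1 means no change and that the scan for all copies resumes after each inserted s2; the text above does not say how sf, sl and sa handle these cases.

```cpp
#include <cassert>
#include <string>

// Replace the first copy of s1 in S by s2; return the number of changes.
int replace_first(std::string& S, const std::string& s1, const std::string& s2) {
    if (s1.empty()) return 0;
    const std::string::size_type pos = S.find(s1);
    if (pos == std::string::npos) return 0;
    S.replace(pos, s1.size(), s2);
    return 1;
}

// Replace the last copy of s1 in S by s2; return the number of changes.
int replace_last(std::string& S, const std::string& s1, const std::string& s2) {
    if (s1.empty()) return 0;
    const std::string::size_type pos = S.rfind(s1);
    if (pos == std::string::npos) return 0;
    S.replace(pos, s1.size(), s2);
    return 1;
}

// Replace every copy of s1 in S by s2, scanning left to right.
int replace_all(std::string& S, const std::string& s1, const std::string& s2) {
    if (s1.empty()) return 0;
    int changes = 0;
    std::string::size_type pos = 0;
    while ((pos = S.find(s1, pos)) != std::string::npos) {
        S.replace(pos, s1.size(), s2);
        pos += s2.size();            // skip the text just inserted
        ++changes;
    }
    return changes;
}

void replace_demo() {
    std::string s("a-b-c");
    assert(replace_first(s, "-", "+") == 1 && s == "a+b-c");
    s = "a-b-c";
    assert(replace_last(s, "-", "+") == 1 && s == "a-b+c");
    s = "a-b-c";
    assert(replace_all(s, "-", "--") == 2 && s == "a--b--c");
    assert(replace_all(s, "x", "y") == 0 && s == "a--b--c");
}
```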
This is a simple class for extracting the information from the command line (when you call a program from a text window). See the genmake program as an example. I assume you call your program with a command like
program -options A B C
where program is the name of the program, options is a sequence of single letter options with no spaces and A B C is a sequence of names separated by spaces.
Start your main program with
#include "str.h"
#include "commline.h"
int main(int argc, char** argv)
{
   CommandLine CL(argc, argv);
   ...
Here are the member functions for the CommandLine class.
CommandLine(int argc, char** argv) | Constructor: argc, argv from main(int argc, char** argv) |
int argc() | Return argc |
char** argv() | Return argv |
String GetArg(int i) | Get the i-th name; i=1 for first name after options |
String GetOptions() | Get option sequence |
int NumberOfArgs() | Return number of arguments excluding options |
bool Options() | True if there are options |
bool HasOption(const String& s) | True if options has any character in s |
bool HasOptionCI(const String& s) | Case independent version of HasOption |
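To make the calling convention concrete, here is a small stand-in that splits a command line of the form program -options A B C in the way described above. It is not the CommandLine class from commline.h: the class MiniCommandLine and its member names are hypothetical and it uses std::string. It assumes that options are present only when the first argument starts with '-', and that asking for a name that does not exist gives an empty string.

```cpp
#include <cassert>
#include <string>
#include <vector>

// Hypothetical stand-in for illustration; not the package's CommandLine class.
class MiniCommandLine {
public:
    MiniCommandLine(int argc, const char* const* argv) {
        int first = 1;
        if (argc > 1 && argv[1][0] == '-') {
            options_ = argv[1] + 1;          // drop the leading '-'
            first = 2;
        }
        for (int i = first; i < argc; ++i) args_.push_back(argv[i]);
    }

    bool options() const { return !options_.empty(); }
    std::string get_options() const { return options_; }

    // True if the options include any character of s.
    bool has_option(const std::string& s) const {
        return options_.find_first_of(s) != std::string::npos;
    }

    int number_of_args() const { return static_cast<int>(args_.size()); }

    // i = 1 is the first name after the options; out of range gives "".
    std::string get_arg(int i) const {
        return (i >= 1 && i <= number_of_args()) ? args_[i - 1] : std::string();
    }

private:
    std::string options_;
    std::vector<std::string> args_;
};

void command_line_demo() {
    const char* argv[] = {"program", "-vq", "A", "B", "C"};
    MiniCommandLine cl(5, argv);
    assert(cl.options());
    assert(cl.get_options() == "vq");
    assert(cl.has_option("v"));
    assert(cl.has_option("xq"));             // 'q' is one of the options
    assert(!cl.has_option("z"));
    assert(cl.number_of_args() == 3);
    assert(cl.get_arg(1) == "A" && cl.get_arg(3) == "C");
    assert(cl.get_arg(4).empty());
}
```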