C++ FAQ (part 06 of 14)

Archive-name: C++-faq/part06
Posting-Frequency: monthly
Last-modified: Jun 17, 2002
URL: http://www.parashift.com/c++-faq-lite/

AUTHOR: Marshall Cline / cline@parashift.com / 972-931-9470

COPYRIGHT: This posting is part of "C++ FAQ Lite."  The entire "C++ FAQ Lite"
document is Copyright(C)1991-2002 Marshall Cline, Ph.D., cline@parashift.com.
All rights reserved.  Copying is permitted only under designated situations.
For details, see section [1].

NO WARRANTY: THIS WORK IS PROVIDED ON AN "AS IS" BASIS.  THE AUTHOR PROVIDES NO
WARRANTY WHATSOEVER, EITHER EXPRESS OR IMPLIED, REGARDING THE WORK, INCLUDING
WARRANTIES WITH RESPECT TO ITS MERCHANTABILITY OR FITNESS FOR ANY PARTICULAR
PURPOSE.

C++-FAQ-Lite != C++-FAQ-Book: This document, C++ FAQ Lite, is not the same as
the C++ FAQ Book.  The book (C++ FAQs, Cline and Lomow, Addison-Wesley) is 500%
larger than this document, and is available in bookstores.  For details, see
section [3].

==============================================================================

SECTION [11]: Destructors


[11.1] What's the deal with destructors?

A destructor gives an object its last rites.

Destructors are used to release any resources allocated by the object.  E.g.,
class Lock might lock a semaphore, and the destructor will release that
semaphore.  The most common example is when the constructor uses new, and the
destructor uses delete.

Destructors are a "prepare to die" member function.  They are often abbreviated
"dtor".

==============================================================================

[11.2] What's the order that local objects are destructed?

In reverse order of construction: First constructed, last destructed.

In the following example, b's destructor will be executed first, then a's
destructor:

 void userCode()
 {
   Fred a;
   Fred b;
   // ...
 }
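
To watch the order for yourself, here is a minimal sketch.  It assumes a
hypothetical Fred that takes a one-character name and appends that name to a
global string (destructionLog, also hypothetical) from its destructor:

```cpp
#include <cassert>
#include <string>

// Hypothetical log: each destructor appends its object's name here.
static std::string destructionLog;

class Fred {
public:
  explicit Fred(char name) : name_(name) { }
  ~Fred() { destructionLog += name_; }
private:
  char name_;
};

void userCode()
{
  Fred a('a');   // constructed first, destructed last
  Fred b('b');   // constructed last, destructed first
}

// Runs userCode() once and returns the order in which destructors ran.
std::string localDestructionOrder()
{
  destructionLog.clear();
  userCode();
  assert(destructionLog.size() == 2);   // both destructors have run
  return destructionLog;
}
```

With this sketch, localDestructionOrder() yields "ba": b dies before a.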

==============================================================================

[11.3] What's the order that objects in an array are destructed?

In reverse order of construction: First constructed, last destructed.

In the following example, the order for destructors will be a[9], a[8], ...,
a[1], a[0]:

 void userCode()
 {
   Fred a[10];
   // ...
 }
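
The same kind of experiment works for arrays.  In this sketch (an
illustration, not part of the original example) each hypothetical Fred grabs
the next id from a counter when it is constructed and logs that id when it is
destructed.  The array is shortened to 4 elements so each id is one digit:

```cpp
#include <cassert>
#include <string>

static std::string arrayLog;   // hypothetical log of destructed ids
static int nextId = 0;         // hypothetical counter handing out ids

class Fred {
public:
  Fred() : id_(nextId++) { }   // a[0] gets id 0, a[1] gets id 1, ...
  ~Fred() { arrayLog += static_cast<char>('0' + id_); }
private:
  int id_;
};

// Builds and tears down a 4-element array; returns the ids in the
// order their destructors ran.
std::string arrayDestructionOrder()
{
  arrayLog.clear();
  nextId = 0;
  {
    Fred a[4];   // constructed in the order a[0], a[1], a[2], a[3]
  }              // destructed in the order a[3], a[2], a[1], a[0]
  assert(arrayLog.size() == 4);   // all four destructors have run
  return arrayLog;
}
```

Here arrayDestructionOrder() yields "3210".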

==============================================================================

[11.4] Can I overload the destructor for my class?

No.

You can have only one destructor for a class Fred.  It's always called
Fred::~Fred().  It never takes any parameters, and it never returns anything.

You can't pass parameters to the destructor anyway, since you never explicitly
call a destructor[11.5] (well, almost never[11.10]).

==============================================================================

[11.5] Should I explicitly call a destructor on a local variable?

No!

The destructor will get called again at the close } of the block in which the
local was created.  This is a guarantee of the language; it happens
automagically; there's no way to stop it from happening.  But you can get
really bad results from calling a destructor on the same object a second time!
Bang! You're dead!

==============================================================================

[11.6] What if I want a local to "die" before the close } of the scope in which
       it was created? Can I call a destructor on a local if I really want to?

No! [For context, please read the previous FAQ[11.5]].

Suppose the (desirable) side effect of destructing a local File object is to
close the File.  Now suppose you have an object f of class File and you want
f to be closed before the end of the scope (i.e., the }) in which f was
created:

 void someCode()
 {
   File f;

   // ... [This code should execute while f is still open] ...

   // <-- We want the side-effect of f's destructor here!

   // ... [This code should execute after f is closed] ...
 }

There is a simple solution to this problem[11.7].  But in the meantime,
remember: Do not explicitly call the destructor![11.5]

==============================================================================

[11.7] OK, OK already; I won't explicitly call the destructor of a local; but
       how do I handle the above situation?

[For context, please read the previous FAQ[11.6]].

Simply wrap the extent of the lifetime of the local in an artificial block
{...}:

 void someCode()
 {
   {
     File f;
     // ... [This code will execute when f is still open] ...
   }
 // ^-- f's destructor will automagically be called here!

   // ... [This code will execute after f is closed] ...
 }
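
A runnable variant of the above, as a sketch: the File here is a stand-in that
touches no real file.  It merely records "open" and "close" events in a
hypothetical global string so the ordering is visible:

```cpp
#include <cassert>
#include <string>

static std::string eventLog;   // hypothetical trace of what happened when

class File {                   // stand-in: records events, opens nothing
public:
  File()  { eventLog += "open;"; }
  ~File() { eventLog += "close;"; }
};

void someCode()
{
  {
    File f;
    eventLog += "work-while-open;";
  }   // f's destructor runs here, at the close of the artificial block

  eventLog += "work-after-close;";
}

// Runs someCode() once and returns the recorded sequence of events.
std::string traceSomeCode()
{
  eventLog.clear();
  someCode();
  assert(!eventLog.empty());
  return eventLog;
}
```

The trace comes out as "open;work-while-open;close;work-after-close;", i.e.,
the close happens at the inner }, not at the end of someCode().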

==============================================================================

[11.8] What if I can't wrap the local in an artificial block?

Most of the time, you can limit the lifetime of a local by wrapping the local
in an artificial block ({...})[11.7].  But if for some reason you can't do
that, add a member function that has a similar effect as the destructor.  But
do not call the destructor itself!

For example, in the case of class File, you might add a close() method.
Typically the destructor will simply call this close() method.  Note that the
close() method will need to mark the File object so a subsequent call won't
re-close an already-closed File.  E.g., it might set the fileHandle_ data
member to some nonsensical value such as -1, and it might check at the
beginning to see if the fileHandle_ is already equal to -1:

 class File {
 public:
   void close();
   ~File();
   // ...
 private:
   int fileHandle_;   // fileHandle_ >= 0 if/only-if it's open
 };

 File::~File()
 {
   close();
 }

 void File::close()
 {
   if (fileHandle_ >= 0) {
     // ... [Perform some operating-system call to close the file] ...
     fileHandle_ = -1;
   }
 }

Note that the other File methods may also need to check if the fileHandle_ is
-1 (i.e., check if the File is closed).

Note also that any constructors that don't actually open a file should set
fileHandle_ to -1.
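
Putting those notes together, here is one way the class might look in full.
This is only a sketch: fakeOsOpen() and fakeOsClose() are hypothetical
stand-ins for real operating-system calls, and closeCalls is a counter added
so you can see how many times the "file" really got closed:

```cpp
#include <cassert>

// Hypothetical stand-ins for the operating-system calls.
static int closeCalls = 0;
int  fakeOsOpen()     { return 3; }      // pretend handle
void fakeOsClose(int) { ++closeCalls; }

class File {
public:
  File() : fileHandle_(-1) { }           // opens nothing, so mark it closed
  explicit File(const char* /*name*/) : fileHandle_(fakeOsOpen()) { }
  ~File() { close(); }
  void close();
  bool isOpen() const { return fileHandle_ >= 0; }
private:
  File(const File&);                     // copying disabled: declared only
  File& operator= (const File&);
  int fileHandle_;                       // >= 0 if/only-if it's open
};

void File::close()
{
  if (fileHandle_ >= 0) {
    fakeOsClose(fileHandle_);
    fileHandle_ = -1;                    // a later close() becomes a no-op
  }
}

// Closes one File three ways and returns how many real closes happened.
int closesForOneFile()
{
  closeCalls = 0;
  {
    File f("data.txt");
    assert(f.isOpen());
    f.close();                           // early, explicit close
    assert(!f.isOpen());
    f.close();                           // harmless: already closed
  }                                      // destructor calls close(): harmless
  return closeCalls;
}
```

Even though close() is invoked three times on the same object (twice by hand,
once by the destructor), the underlying close happens exactly once.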

==============================================================================

[11.9] But can I explicitly call a destructor if I've allocated my object with
       new?

Probably not.

Unless you used placement new[11.10], you should simply delete the object
rather than explicitly calling the destructor.  For example, suppose you
allocated the object via a typical new expression:

 Fred* p = new Fred();

Then the destructor Fred::~Fred() will automagically get called when you delete
it via:

 delete p;  // Automagically calls p->~Fred()

You should not explicitly call the destructor, since doing so won't release the
memory that was allocated for the Fred object itself.  Remember: delete p does
two things[16.8]: it calls the destructor and it deallocates the memory.

==============================================================================

[11.10] What is "placement new" and why would I use it?

There are many uses of placement new.  The simplest use is to place an object
at a particular location in memory.  This is done by supplying the place as a
pointer parameter to the new part of a new expression:

 #include <new>        // Must #include this to use "placement new"
 #include "Fred.h"     // Declaration of class Fred

 void someCode()
 {
   char memory[sizeof(Fred)];     // Line #1
   void* place = memory;          // Line #2

   Fred* f = new(place) Fred();   // Line #3 (see "DANGER" below)
   // The pointers f and place will be equal

   // ...
 }

Line #1 creates an array of sizeof(Fred) bytes of memory, which is big enough
to hold a Fred object.  Line #2 creates a pointer place that points to the
first byte of this memory (experienced C programmers will note that this step
was unnecessary; it's there only to make the code more obvious).  Line #3
essentially just calls the constructor Fred::Fred().  The this pointer in the
Fred constructor will be equal to place.  The returned pointer f will therefore
be equal to place.

ADVICE: Don't use this "placement new" syntax unless you have to.  Use it only
when you really care that an object is placed at a particular location in
memory.  For example, when your hardware has a memory-mapped I/O timer device,
and you want to place a Clock object at that memory location.

DANGER: You are taking sole responsibility that the pointer you pass to the
"placement new" operator points to a region of memory that is big enough and is
properly aligned for the object type that you're creating.  Neither the
compiler nor the run-time system makes any attempt to check whether you did this
right.  If your Fred class needs to be aligned on a 4 byte boundary but you
supplied a location that isn't properly aligned, you can have a serious
disaster on your hands (if you don't know what "alignment" means, please don't
use the placement new syntax).  You have been warned.

You are also solely responsible for destructing the placed object.  This is
done by explicitly calling the destructor:

 void someCode()
 {
   char memory[sizeof(Fred)];
   void* p = memory;
   Fred* f = new(p) Fred();
   // ...
   f->~Fred();   // Explicitly call the destructor for the placed object
 }

This is about the only time you ever explicitly call a destructor.

Note: there is a much cleaner but more sophisticated[11.14] way of handling the
destruction / deletion situation.
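
If your compiler supports C++11, one way to address the alignment DANGER above
is to ask for an aligned buffer with alignas.  The following is a sketch under
that assumption (alignas is not in the original example), using a hypothetical
Fred that logs its constructor and destructor calls:

```cpp
#include <cassert>
#include <new>       // Must #include this to use "placement new"
#include <string>

static std::string trace;   // hypothetical log of ctor/dtor calls

class Fred {
public:
  Fred()  { trace += "ctor;"; }
  ~Fred() { trace += "dtor;"; }
};

// Constructs a Fred in a local buffer, destructs it by hand, and
// returns the sequence of calls that were logged.
std::string placeAndDestroy()
{
  trace.clear();

  // alignas(Fred) (C++11) makes the buffer suitably aligned for a Fred
  alignas(Fred) unsigned char memory[sizeof(Fred)];

  Fred* f = new(memory) Fred();   // runs Fred::Fred() with this == memory
  assert(static_cast<void*>(f) == static_cast<void*>(memory));

  f->~Fred();                     // you placed it, so you destruct it
  return trace;
}
```

The log shows exactly one constructor call followed by exactly one destructor
call; nothing is destructed automagically when memory goes out of scope.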

==============================================================================

[11.11] When I write a destructor, do I need to explicitly call the destructors
        for my member objects?

No.  You never need to explicitly call a destructor (except with placement
new[11.10]).

A class's destructor (whether or not you explicitly define one) automagically
invokes the destructors for member objects.  They are destroyed in the reverse
order they appear within the declaration for the class.

 class Member {
 public:
   ~Member();
   // ...
 };

 class Fred {
 public:
   ~Fred();
   // ...
 private:
   Member x_;
   Member y_;
   Member z_;
 };

 Fred::~Fred()
 {
   // Compiler automagically calls z_.~Member()
   // Compiler automagically calls y_.~Member()
   // Compiler automagically calls x_.~Member()
 }
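
Here is a version you can run, as a sketch.  It assumes each hypothetical
Member is given a one-character name that it appends to a global string when
it dies, and that Fred's own destructor body appends "F":

```cpp
#include <cassert>
#include <string>

static std::string trace;   // hypothetical log of destructor calls

class Member {
public:
  explicit Member(char name) : name_(name) { }
  ~Member() { trace += name_; }
private:
  char name_;
};

class Fred {
public:
  Fred() : x_('x'), y_('y'), z_('z') { }
  ~Fred() { trace += 'F'; }   // the body runs first, then z_, y_, x_
private:
  Member x_;
  Member y_;
  Member z_;
};

// Creates and destroys one Fred; returns the order of destructor calls.
std::string memberDestructionOrder()
{
  trace.clear();
  { Fred f; }
  assert(trace.size() == 4);   // Fred's body plus three members
  return trace;
}
```

The result is "Fzyx": the body of Fred::~Fred() executes first, and only then
are the members destroyed in reverse order of declaration.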

==============================================================================

[11.12] When I write a derived class's destructor, do I need to explicitly call
        the destructor for my base class?

No.  You never need to explicitly call a destructor (except with placement
new[11.10]).

A derived class's destructor (whether or not you explicitly define one)
automagically invokes the destructors for base class subobjects.  Base classes
are destructed after member objects.  In the event of multiple inheritance,
direct base classes are destructed in the reverse order of their appearance in
the inheritance list.

 class Member {
 public:
   ~Member();
   // ...
 };

 class Base {
 public:
   virtual ~Base();     // A virtual destructor[20.5]
   // ...
 };

 class Derived : public Base {
 public:
   ~Derived();
   // ...
 private:
   Member x_;
 };

 Derived::~Derived()
 {
   // Compiler automagically calls x_.~Member()
   // Compiler automagically calls Base::~Base()
 }

Note: Order dependencies with virtual inheritance are trickier.  If you are
relying on order dependencies in a virtual inheritance hierarchy, you'll need a
lot more information than is in this FAQ.
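
For the non-virtual multiple inheritance case, the ordering can be checked
with a sketch like this one.  Base1 and Base2 are hypothetical names, and
every destructor simply appends its class name to a global string:

```cpp
#include <cassert>
#include <string>

static std::string trace;   // hypothetical log of destructor calls

class Member {
public:
  ~Member() { trace += "Member;"; }
};

class Base1 {
public:
  virtual ~Base1() { trace += "Base1;"; }
};

class Base2 {
public:
  virtual ~Base2() { trace += "Base2;"; }
};

class Derived : public Base1, public Base2 {
public:
  ~Derived() { trace += "Derived;"; }
private:
  Member x_;
};

// Creates and destroys one Derived; returns the order of destructor calls.
std::string derivedDestructionOrder()
{
  trace.clear();
  { Derived d; }
  assert(!trace.empty());
  return trace;
}
```

The log reads "Derived;Member;Base2;Base1;": the derived destructor's body,
then the member, then the direct bases in reverse order of the inheritance
list.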

==============================================================================

[11.13] Should my destructor throw an exception when it detects a problem?

Beware!!! See this FAQ[17.3] for details.

==============================================================================

[11.14] Is there a way to force new to allocate memory from a specific memory
        area? [UPDATED!]

[Recently two typos ("myPool" vs. "pool") in the code were fixed thanks to
Randy Sherman (in 5/02).]

Yes.  The good news is that these "memory pools" are useful in a number of
situations.  The bad news is that I'll have to drag you through the mire of how
it works before we discuss all the uses.  But if you don't know about memory
pools, it might be worthwhile to slog through this FAQ -- you might learn
something useful!

First of all, recall that a memory allocator is simply supposed to return
uninitialized bits of memory; it is not supposed to produce "objects." In
particular, the memory allocator is not supposed to set the virtual-pointer or
any other part of the object, as that is the job of the constructor which runs
after the memory allocator.  Starting with a simple memory allocator function,
allocate(), you would use placement new[11.10] to construct an object in that
memory.  In other words, the following is morally equivalent to "new Foo()":

 void* raw = allocate(sizeof(Foo));  // line 1
 Foo* p = new(raw) Foo();            // line 2

Okay, assuming you've used placement new[11.10] and have survived the above two
lines of code, the next step is to turn your memory allocator into an object.
This kind of object is called a "memory pool" or a "memory arena." This lets
your users have more than one "pool" or "arena" from which memory will be
allocated.  Each of these memory pool objects will allocate a big chunk of
memory using some specific system call (e.g., shared memory, persistent memory,
stack memory, etc.; see below), and will dole it out in little chunks as
needed.  Your memory-pool class might look something like this:

 #include <stddef.h>   // declares size_t

 class Pool {
 public:
   void* alloc(size_t nbytes);
   void dealloc(void* p);
 private:
   // ...data members used in your pool object...
 };

 void* Pool::alloc(size_t nbytes)
 {
   // ...your algorithm goes here...
 }

 void Pool::dealloc(void* p)
 {
   // ...your algorithm goes here...
 }

Now one of your users might have a Pool called pool, from which they could
allocate objects like this:

 Pool pool;
 // ...
 void* raw = pool.alloc(sizeof(Foo));
 Foo* p = new(raw) Foo();

Or simply:

 Foo* p = new(pool.alloc(sizeof(Foo))) Foo();

The reason it's good to turn Pool into a class is because it lets users create
N different pools of memory rather than having one massive pool shared by all
users.  That allows users to do lots of funky things.  For example, if they
have a chunk of the system that allocates memory like crazy then goes away,
they could allocate all their memory from a Pool, then not even bother doing
any deletes on the little pieces: just deallocate the entire pool at once.  Or
they could set up a "shared memory" area (where the operating system
specifically provides memory that is shared between multiple processes) and
have the pool dole out chunks of shared memory rather than process-local
memory.  Another angle: many systems support a non-standard function often
called alloca() which allocates a block of memory from the stack rather than
the heap.  Naturally this block of memory automatically goes away when the
function returns, eliminating the need for explicit deletes.  Someone could use
alloca() to give the Pool its big chunk of memory, then all the little pieces
allocated from that Pool act like they're local: they automatically vanish when
the function returns.  Of course the destructors don't get called in some of
these cases, and if the destructors do something nontrivial you won't be able
to use these techniques, but in cases where the destructor merely deallocates
memory, these sorts of techniques can be useful.

Okay, assuming you survived the 6 or 8 lines of code needed to wrap your
allocate function as a method of a Pool class, the next step is to change the
syntax for allocating objects.  The goal is to change from the rather clunky
syntax new(pool.alloc(sizeof(Foo))) Foo() to the simpler syntax[26.13]
new(pool) Foo().  To make this happen, you need to add the following few lines
of code just below the definition of your Pool class:

 inline void* operator new(size_t nbytes, Pool& pool)
 {
   return pool.alloc(nbytes);
 }

Now when the compiler sees new(pool) Foo(), it calls the above operator new and
passes sizeof(Foo) and pool as parameters, and the only function that ends up
using the funky pool.alloc(nbytes) method is your own operator new.

Now to the issue of how to destruct/deallocate the Foo objects.  Recall that
the brute force approach sometimes used with placement new[11.10] is to
explicitly call the destructor then explicitly deallocate the memory:

 void sample(Pool& pool)
 {
   Foo* p = new(pool) Foo();
   // ...
   p->~Foo();  // explicitly call dtor
   pool.dealloc(p);  // explicitly release the memory
 }
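
To make the brute force approach concrete, here is a small sketch you can
compile.  The Pool below is an assumption made purely for illustration: it
forwards to malloc()/free() and counts outstanding blocks, whereas a real pool
would carve pieces out of its own big chunk.  Foo and its value() member are
likewise hypothetical:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical Pool: forwards to malloc/free and counts live blocks.
class Pool {
public:
  Pool() : outstanding_(0) { }
  void* alloc(std::size_t nbytes)
  {
    void* p = std::malloc(nbytes == 0 ? 1 : nbytes);
    if (p == NULL) throw std::bad_alloc();
    ++outstanding_;
    return p;
  }
  void dealloc(void* p)
  {
    if (p != NULL) { std::free(p); --outstanding_; }
  }
  int outstanding() const { return outstanding_; }
private:
  int outstanding_;
};

inline void* operator new(std::size_t nbytes, Pool& pool)
{
  return pool.alloc(nbytes);
}

class Foo {
public:
  Foo() : value_(42) { }
  int value() const { return value_; }
private:
  int value_;
};

int sample(Pool& pool)
{
  Foo* p = new(pool) Foo();          // calls operator new(sizeof(Foo), pool)
  assert(pool.outstanding() == 1);   // the pool handed out one block
  int v = p->value();
  p->~Foo();                         // explicitly call dtor
  pool.dealloc(p);                   // explicitly release the memory
  return v;
}
```

After sample(pool) returns, pool.outstanding() is back to 0, but only because
the caller remembered both steps and Foo::Foo() didn't throw.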

This has several problems, all of which are fixable:

 1. The memory will leak if Foo::Foo() throws an exception.

 2. The destruction/deallocation syntax is different from what most programmers
    are used to, so they'll probably screw it up.

 3. Users must somehow remember which pool goes with which object.  Since the
    code that allocates is often in a different function from the code that
    deallocates, programmers will have to pass around two pointers (a Foo* and
    a Pool*), which gets ugly fast (for example, what if they had an array of
    Foos each of which potentially came from a different Pool; ugh).

We will fix them in the above order.

Problem #1: plugging the memory leak. When you use the "normal" new operator,
e.g., Foo* p = new Foo(), the compiler generates some special code to handle
the case when the constructor throws an exception.  The actual code generated
by the compiler is functionally similar to this:

 // This is functionally what happens with Foo* p = new Foo()

 Foo* p;

 // don't catch exceptions thrown by the allocator itself
 void* raw = operator new(sizeof(Foo));

 // catch any exceptions thrown by the ctor
 try {
   p = new(raw) Foo();  // call the ctor with raw as this
 }
 catch (...) {
   // oops, ctor threw an exception
   operator delete(raw);
   throw;  // rethrow the ctor's exception
 }

The point is that the compiler deallocates the memory if the ctor throws an
exception.  But in the case of the "new with parameter" syntax (commonly called
"placement new"), the compiler won't know what to do if the exception occurs so
by default it does nothing:

 // This is functionally what happens with Foo* p = new(pool) Foo():

 Foo* p;
 void* raw = operator new(sizeof(Foo), pool);
 // the above function simply returns "pool.alloc(sizeof(Foo))"
 p = new(raw) Foo();
 // if the above line "throws", pool.dealloc(raw) is NOT called

So the goal is to force the compiler to do something similar to what it does
with the global new operator.  Fortunately it's simple: when the compiler sees
new(pool) Foo(), it looks for a corresponding operator delete.  If it finds
one, it does the equivalent of wrapping the ctor call in a try block as shown
above.  So we would simply provide an operator delete with the following
signature (be careful to get this right; if the second parameter has a
different type from the second parameter of the operator new(size_t, Pool&),
the compiler doesn't complain; it simply bypasses the try block when your users
say new(pool) Foo()):

 void operator delete(void* p, Pool& pool)
 {
   pool.dealloc(p);
 }

After this, the compiler will automatically wrap the ctor calls of your new
expressions in a try block:

 // This is functionally what happens with Foo* p = new(pool) Foo()

 Foo* p;

 // don't catch exceptions thrown by the allocator itself
 void* raw = operator new(sizeof(Foo), pool);
 // the above simply returns "pool.alloc(sizeof(Foo))"

 // catch any exceptions thrown by the ctor
 try {
   p = new(raw) Foo();  // call the ctor with raw as this
 }
 catch (...) {
   // oops, ctor threw an exception
   operator delete(raw, pool);  // that's the magical line!!
   throw;  // rethrow the ctor's exception
 }

In other words, the one-liner function operator delete(void* p, Pool& pool)
causes the compiler to automagically plug the memory leak.  Of course that
function can be, but doesn't have to be, inline.
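
You can verify the plugged leak with a sketch like the following.  As an
assumption for illustration, the Pool forwards to malloc()/free() and counts
outstanding blocks, and the hypothetical Foo constructor throws on request:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>
#include <stdexcept>

// Hypothetical Pool: forwards to malloc/free and counts live blocks.
class Pool {
public:
  Pool() : outstanding_(0) { }
  void* alloc(std::size_t nbytes)
  {
    void* p = std::malloc(nbytes == 0 ? 1 : nbytes);
    if (p == NULL) throw std::bad_alloc();
    ++outstanding_;
    return p;
  }
  void dealloc(void* p)
  {
    if (p != NULL) { std::free(p); --outstanding_; }
  }
  int outstanding() const { return outstanding_; }
private:
  int outstanding_;
};

inline void* operator new(std::size_t nbytes, Pool& pool)
{
  return pool.alloc(nbytes);
}

// Matching form: same second parameter type as the operator new above.
inline void operator delete(void* p, Pool& pool)
{
  pool.dealloc(p);
}

class Foo {
public:
  explicit Foo(bool fail)
  {
    if (fail) throw std::runtime_error("Foo ctor failed");
  }
};

// Returns how many blocks the pool still holds after a ctor throws.
int outstandingAfterFailedConstruction()
{
  Pool pool;
  bool threw = false;
  try {
    Foo* p = new(pool) Foo(true);    // the ctor throws
    (void)p;                         // never reached
  }
  catch (const std::runtime_error&) {
    threw = true;   // operator delete(raw, pool) has already been called
  }
  assert(threw);
  return pool.outstanding();
}
```

The function returns 0.  If you remove the operator delete(void*, Pool&)
overload, the block allocated for the failed Foo is never returned to the
pool.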

Problems #2 ("ugly therefore error prone") and #3 ("users must manually
associate pool-pointers with the objects allocated from them, which is error
prone") are solved simultaneously with an additional 10-20 lines of code in one
place.  In other words, we add 10-20 lines of code in one place (your Pool
header file) and simplify an arbitrarily large number of other places (every
piece of code that uses your Pool class).

The idea is to implicitly associate a Pool* with every allocation.  The Pool*
associated with the global allocator would be NULL, but at least conceptually
you could say every allocation has an associated Pool*.  Then you replace the
global operator delete so it looks up the associated Pool*, and if non-NULL,
calls that Pool's deallocate function.  For example, if(!)[16.2] the normal
deallocator used free(), the replacement for the global operator delete would
look something like this:

 void operator delete(void* p)
 {
   if (p != NULL) {
     Pool* pool = <somehow get the associated 'Pool*'>;
     if (pool == NULL)
       free(p);
     else
       pool->dealloc(p);
   }
 }

If you're not sure if the normal deallocator was free()[16.2], the easiest
approach is to also replace the global operator new with something that uses
malloc().  The replacement for the global operator new would look something
like this (note: this definition ignores a few details such as the new_handler
loop and the throw std::bad_alloc() that happens if we run out of memory):

 void* operator new(size_t nbytes)
 {
   if (nbytes == 0)
     nbytes = 1;  // so all alloc's get a distinct address
   void* raw = malloc(nbytes);
   <somehow associate the NULL 'Pool*' with 'raw'>
   return raw;
 }

The only remaining problem is to associate a Pool* with an allocation.  One
approach, used in at least one commercial product, is to use a
std::map<void*,Pool*>.  In other words, build a look-up table whose keys are
the allocation-pointer and whose values are the associated Pool*.  For reasons
I'll describe in a moment, it is essential that you insert a key/value pair
into the map only in operator new(size_t,Pool&).  In particular, you must not
insert a key/value pair from the global operator new (e.g., you must not say,
poolMap[p] = NULL in the global operator new).  Reason: doing that would create
a nasty chicken-and-egg problem -- since std::map probably uses the global
operator new, it ends up inserting a new entry every time it inserts a new
entry, leading to infinite recursion -- bang you're dead.

Even though this technique requires a std::map look-up for each deallocation,
it seems to have acceptable performance, at least in many cases.

Another approach that is faster but might use more memory and is a little
trickier is to prepend a Pool* just before all allocations.  For example, on a
machine with 4-byte pointers, if nbytes was 24, meaning the caller was asking
to allocate 24 bytes, we would allocate 28 (or 32 if you think the machine
requires 8-byte alignment for things like doubles and/or long longs), stuff the
Pool* into the first 4 bytes, and return the pointer 4 (or 8) bytes from the
beginning of what you allocated.
Then your global operator delete backs off the 4 (or 8) bytes, finds the Pool*,
and if NULL, uses free() otherwise calls pool->dealloc().  The parameter passed
to free() and pool->dealloc() would be the pointer 4 (or 8) bytes to the left
of the original parameter, p.  If(!) you decide on 4 byte alignment, your code
would look something like this (although as before, the following operator new
code elides the usual out-of-memory handlers):

 void* operator new(size_t nbytes)
 {
   if (nbytes == 0)
     nbytes = 1;                    // so all alloc's get a distinct address
   void* ans = malloc(nbytes + 4);  // overallocate by 4 bytes
   *(Pool**)ans = NULL;             // use NULL in the global new
   return (char*)ans + 4;           // don't let users see the Pool*
 }

 void* operator new(size_t nbytes, Pool& pool)
 {
   if (nbytes == 0)
     nbytes = 1;                    // so all alloc's get a distinct address
   void* ans = pool.alloc(nbytes + 4); // overallocate by 4 bytes
   *(Pool**)ans = &pool;            // put the Pool* here
   return (char*)ans + 4;           // don't let users see the Pool*
 }

 void operator delete(void* p)
 {
   if (p != NULL) {
     p = (char*)p - 4;              // back off to the Pool*
     Pool* pool = *(Pool**)p;
     if (pool == NULL)
       free(p);                     // note: 4 bytes left of the original p
     else
       pool->dealloc(p);            // note: 4 bytes left of the original p
   }
 }
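
The hard-coded 4 above is only right when pointers are 4 bytes and nothing
needs stricter alignment.  Here is a sketch of the same prepend-a-Pool* trick
with the header size computed from the platform instead.  Several things are
assumptions made for illustration: it relies on C++11 (alignof and
std::max_align_t), it uses the hypothetical named functions taggedAlloc() and
taggedFree() rather than replacing the global operators (so it can be tried in
isolation), and the Pool simply forwards to malloc()/free() while counting
outstanding blocks:

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <new>

// Hypothetical Pool: forwards to malloc/free and counts live blocks.
class Pool {
public:
  Pool() : outstanding_(0) { }
  void* alloc(std::size_t nbytes)
  {
    void* p = std::malloc(nbytes);
    if (p == NULL) throw std::bad_alloc();
    ++outstanding_;
    return p;
  }
  void dealloc(void* p) { std::free(p); --outstanding_; }
  int outstanding() const { return outstanding_; }
private:
  int outstanding_;
};

// Header size: the smallest multiple of the strictest fundamental
// alignment that is big enough to hold a Pool*.
const std::size_t kAlign  = alignof(std::max_align_t);
const std::size_t kHeader = (sizeof(Pool*) + kAlign - 1) / kAlign * kAlign;

// Stand-in for operator new: pass NULL for "no pool, use malloc".
void* taggedAlloc(std::size_t nbytes, Pool* pool)
{
  if (nbytes == 0)
    nbytes = 1;                               // distinct address per alloc
  void* raw = (pool != NULL) ? pool->alloc(nbytes + kHeader)
                             : std::malloc(nbytes + kHeader);
  if (raw == NULL) throw std::bad_alloc();
  *static_cast<Pool**>(raw) = pool;           // put the Pool* in the header
  return static_cast<char*>(raw) + kHeader;   // don't let users see it
}

// Stand-in for operator delete.
void taggedFree(void* p)
{
  if (p != NULL) {
    void* raw = static_cast<char*>(p) - kHeader;   // back off to the Pool*
    Pool* pool = *static_cast<Pool**>(raw);
    if (pool == NULL)
      std::free(raw);
    else
      pool->dealloc(raw);
  }
}
```

Because the header is a multiple of the strictest fundamental alignment, the
pointer handed to the user stays as well aligned as the block that the
underlying allocator returned.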

Naturally the last few paragraphs of this FAQ are viable only when you are
allowed to change the global operator new and operator delete.  If you are not
allowed to change these global functions, the first three quarters of this FAQ
is still applicable.

==============================================================================

SECTION [12]: Assignment operators


[12.1] What is "self assignment"?

Self assignment is when someone assigns an object to itself.  For example,

 #include "Fred.hpp"    // Declares class Fred

 void userCode(Fred& x)
 {
   x = x;   // Self-assignment
 }

Obviously no one ever explicitly does a self assignment like the above, but
since more than one pointer or reference can point to the same object
(aliasing), it is possible to have self assignment without knowing it:

 #include "Fred.hpp"    // Declares class Fred

 void userCode(Fred& x, Fred& y)
 {
   x = y;   // Could be self-assignment if &x == &y
 }

 int main()
 {
   Fred z;
   userCode(z, z);
 }

==============================================================================

[12.2] Why should I worry about "self assignment"?

If you don't worry about self assignment[12.1], you'll expose your users to
some very subtle bugs that have very subtle and often disastrous symptoms.  For
example, the following class will cause a complete disaster in the case of
self-assignment:

 class Wilma { };

 class Fred {
 public:
   Fred()                : p_(new Wilma())      { }
   Fred(const Fred& f)   : p_(new Wilma(*f.p_)) { }
  ~Fred()                { delete p_; }
   Fred& operator= (const Fred& f)
     {
       // Bad code: Doesn't handle self-assignment!
       delete p_;                // Line #1
       p_ = new Wilma(*f.p_);    // Line #2
       return *this;
     }
 private:
   Wilma* p_;
 };

If someone assigns a Fred object to itself, line #1 deletes both this->p_ and
f.p_ since *this and f are the same object.  But line #2 uses *f.p_, which is
no longer a valid object.  This will likely cause a major disaster.

The bottom line is that you the author of class Fred are responsible to make
sure self-assignment on a Fred object is innocuous[12.3].  Do not assume that
users won't ever do that to your objects.  It is your fault if your object
crashes when it gets a self-assignment.

Aside: the above Fred::operator= (const Fred&) has a second problem: If an
exception is thrown[17] while evaluating new Wilma(*f.p_) (e.g., an
out-of-memory exception[16.5] or an exception in Wilma's copy
constructor[17.2]), this->p_ will be a dangling pointer -- it will point to
memory that is no longer valid.  This can be solved by allocating the new
objects before deleting the old objects.

==============================================================================

[12.3] OK, OK, already; I'll handle self-assignment.  How do I do it?

You should worry about self assignment every time you create a class[12.2].
This does not mean that you need to add extra code to all your classes: as long
as your objects gracefully handle self assignment, it doesn't matter whether
you had to add extra code or not.

If you do need to add extra code to your assignment operator, here's a simple
and effective technique:

 Fred& Fred::operator= (const Fred& f)
 {
   if (this == &f) return *this;   // Gracefully handle self assignment[12.1]

   // Put the normal assignment duties here...

   return *this;
 }

This explicit test isn't always necessary.  For example, if you were to fix the
assignment operator in the previous FAQ[12.2] to handle exceptions thrown by
new[16.5] and/or exceptions thrown by the copy constructor[17.2] of class
Wilma, you might produce the following code.  Note that this code has the
(pleasant) side effect of automatically handling self assignment as well:

 Fred& Fred::operator= (const Fred& f)
 {
   // This code gracefully (albeit implicitly) handles self assignment[12.1]
   Wilma* tmp = new Wilma(*f.p_);   // It would be OK if an exception[17] got thrown here
   delete p_;
   p_ = tmp;
   return *this;
 }
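
To convince yourself that the copy-first version really does survive self
assignment, here is a complete sketch.  The int payload inside Wilma, the
value() accessor, and the helper assignThrough() are hypothetical additions so
there is something to observe:

```cpp
#include <cassert>

class Wilma {
public:
  explicit Wilma(int v) : value(v) { }
  int value;   // hypothetical payload so the copy can be observed
};

class Fred {
public:
  explicit Fred(int v)  : p_(new Wilma(v))     { }
  Fred(const Fred& f)   : p_(new Wilma(*f.p_)) { }
  ~Fred()               { delete p_; }
  Fred& operator= (const Fred& f)
    {
      Wilma* tmp = new Wilma(*f.p_);   // copy first; if this throws,
                                       // *this is left untouched
      delete p_;
      p_ = tmp;
      return *this;
    }
  int value() const { return p_->value; }
private:
  Wilma* p_;
};

// x and y might refer to the same object (aliasing)
void assignThrough(Fred& x, Fred& y)
{
  x = y;
}

int valueAfterSelfAssignment()
{
  Fred z(7);
  assert(z.value() == 7);
  assignThrough(z, z);   // self assignment in disguise
  return z.value();
}
```

The value is still 7 afterwards, since the old Wilma is deleted only after the
copy has been made from it.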

In cases like the previous example (where self assignment is harmless but
inefficient), some programmers want to improve the efficiency of self
assignment by adding an otherwise unnecessary test, such as
"if (this == &f) return *this;".  It is generally the wrong tradeoff to make
self assignment more efficient by making the non-self assignment case less
efficient.  For example, adding the above if test to the Fred assignment
operator would make the non-self assignment case slightly less efficient (an
extra (and unnecessary) conditional branch).  If self assignment actually
occurred once in a thousand times, the if would waste cycles 99.9% of the time.

==============================================================================

SECTION [13]: Operator overloading


[13.1] What's the deal with operator overloading?

It allows you to provide an intuitive interface to users of your class, plus
makes it possible for templates[33.5] to work equally well with classes and
built-in/intrinsic types.

Operator overloading allows C/C++ operators to have user-defined meanings on
user-defined types (classes).  Overloaded operators are syntactic sugar for
function calls:

 class Fred {
 public:
   // ...
 };

 #if 0

   // Without operator overloading:
   Fred add(Fred, Fred);
   Fred mul(Fred, Fred);

   Fred f(Fred a, Fred b, Fred c)
   {
     return add(add(mul(a,b), mul(b,c)), mul(c,a));    // Yuk...
   }

 #else

   // With operator overloading:
   Fred operator+ (Fred, Fred);
   Fred operator* (Fred, Fred);

   Fred f(Fred a, Fred b, Fred c)
   {
     return a*b + b*c + c*a;
   }

 #endif
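
To see that the two spellings really are the same function calls, here is a
sketch with a concrete class.  Number is a hypothetical wrapper around an int,
and its operators simply forward to add() and mul():

```cpp
#include <cassert>

class Number {   // hypothetical value class wrapping an int
public:
  explicit Number(int v) : v_(v) { }
  int value() const { return v_; }
private:
  int v_;
};

Number add(Number a, Number b) { return Number(a.value() + b.value()); }
Number mul(Number a, Number b) { return Number(a.value() * b.value()); }

// The overloaded operators are just other names for the same functions
Number operator+ (Number a, Number b) { return add(a, b); }
Number operator* (Number a, Number b) { return mul(a, b); }

Number withFunctions(Number a, Number b, Number c)
{
  return add(add(mul(a,b), mul(b,c)), mul(c,a));    // Yuk...
}

Number withOperators(Number a, Number b, Number c)
{
  return a*b + b*c + c*a;
}

void checkBothSpellingsAgree()
{
  Number a(2), b(3), c(4);
  // 2*3 + 3*4 + 4*2 == 26 either way
  assert(withFunctions(a, b, c).value() == 26);
  assert(withOperators(a, b, c).value() == 26);
}
```

Both functions compute the same thing; the only difference is how much effort
it takes a human to read them.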

==============================================================================

[13.2] What are the benefits of operator overloading?

By overloading standard operators on a class, you can exploit the intuition of
the users of that class.  This lets users program in the language of the
problem domain rather than in the language of the machine.

The ultimate goal is to reduce both the learning curve and the defect rate.

==============================================================================

[13.3] What are some examples of operator overloading?

Here are a few of the many examples of operator overloading:
 * myString + yourString might concatenate two std::string objects
 * myDate++ might increment a Date object
 * a * b might multiply two Number objects
 * a[i] might access an element of an Array object
 * x = *p might dereference a "smart pointer" that actually "points" to a disk
   record -- it could actually seek to the location on disk where p "points"
   and return the appropriate record into x

==============================================================================

[13.4] But operator overloading makes my class look ugly; isn't it supposed to
       make my code clearer?

Operator overloading makes life easier for the users of a class[13.2], not for
the developer of the class!

Consider the following example.

 class Array {
 public:
   int& operator[] (unsigned i);      // Some people don't like this syntax
   // ...
 };

 inline
 int& Array::operator[] (unsigned i)  // Some people don't like this syntax
 {
   // ...
 }

Some people don't like the keyword operator or the somewhat funny syntax that
goes with it in the body of the class itself.  But the operator overloading
syntax isn't supposed to make life easier for the developer of a class.  It's
supposed to make life easier for the users of the class:

 int main()
 {
   Array a;
   a[3] = 4;   // User code should be obvious and easy to understand...
 }

Remember: in a reuse-oriented world, there will usually be many people who use
your class, but there is only one person who builds it (yourself); therefore
you should do things that favor the many rather than the few.

==============================================================================

[13.5] What operators can/cannot be overloaded?

Most can be overloaded. The only C operators that can't be are . and ?: (and
sizeof, which is technically an operator).  C++ adds a few of its own
operators, most of which can be overloaded except :: and .*.

Here's an example of the subscript operator (it returns a reference).  First
without operator overloading:

 class Array {
 public:
   int& elem(unsigned i)        { if (i > 99) error(); return data[i]; }
 private:
   int data[100];
 };

 int main()
 {
   Array a;
   a.elem(10) = 42;
   a.elem(12) += a.elem(13);
 }

Now the same logic is presented with operator overloading:

 class Array {
 public:
   int& operator[] (unsigned i) { if (i > 99) error(); return data[i]; }
 private:
   int data[100];
 };

 int main()
 {
   Array a;
   a[10] = 42;
   a[12] += a[13];
 }

==============================================================================

[13.6] Can I overload operator== so it lets me compare two char[] using a
       string comparison?

No: at least one operand of any overloaded operator must be of some
user-defined type[25.10] (most of the time that means a class).

But even if C++ allowed you to do this, which it doesn't, you wouldn't want to
do it anyway since you really should be using a std::string-like class rather
than an array of char in the first place[17.5] since arrays are evil[33.1].
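
To make the difference concrete, here is a small sketch.  The helper names
(sameAddress, sameText, sameString) are made up for this example.  The built-in
== on two char arrays compares addresses, std::strcmp() compares characters,
and std::string already provides the == you probably wanted.

```cpp
#include <cassert>
#include <cstring>
#include <string>

// What the built-in == does once the arrays decay to pointers.
inline bool sameAddress(const char* a, const char* b)
{
  return a == b;
}

// The C-style way to compare the characters themselves.
inline bool sameText(const char* a, const char* b)
{
  return std::strcmp(a, b) == 0;
}

// std::string already overloads == as a string comparison.
inline bool sameString(const std::string& a, const std::string& b)
{
  return a == b;
}

void userCode()
{
  char a[] = "hello";
  char b[] = "hello";
  assert(!sameAddress(a, b));   // Two distinct arrays, so two distinct addresses
  assert(sameText(a, b));       // Same characters
  assert(sameString(a, b));     // Same characters, with the natural syntax
}
```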

==============================================================================

[13.7] Can I create an operator** for "to-the-power-of" operations?

Nope.

The names of, precedence of, associativity of, and arity of operators are fixed
by the language.  There is no operator** in C++, so you cannot create one for a
class type.

If you're in doubt, consider that x ** y is the same as x * (*y) (in other
words, the compiler assumes y is a pointer).  Besides, operator overloading is
just syntactic sugar for function calls.  Although this particular syntactic
sugar can be very sweet, it doesn't add anything fundamental.  I suggest you
overload pow(base,exponent) (a double precision version is in <cmath>).

By the way, operator^ can work for to-the-power-of, except it has the wrong
precedence and associativity.
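
Here is a minimal sketch of both options, using a hypothetical Number class
that merely wraps a double.  Number, withPow(), and withCaret() are invented
for this example.  It shows an overloaded pow() next to an overloaded operator^
so the precedence problem is visible.

```cpp
#include <cassert>
#include <cmath>

// Hypothetical Number class wrapping a double.
class Number {
public:
  Number(double v) : v_(v) { }
  double value() const { return v_; }
private:
  double v_;
};

inline Number operator* (Number a, Number b)
{
  return Number(a.value() * b.value());
}

// Suggested approach: overload pow() for the class type.
inline Number pow(Number base, Number exponent)
{
  return Number(std::pow(base.value(), exponent.value()));
}

// Legal, but ^ keeps its built-in precedence, which is lower than that of *.
inline Number operator^ (Number base, Number exponent)
{
  return pow(base, exponent);
}

inline Number withPow()
{
  return Number(2) * pow(Number(3), Number(2));   // 2 * (3 squared)
}

inline Number withCaret()
{
  return Number(2) * Number(3) ^ Number(2);       // Parsed as (2 * 3) ^ 2
}

void userCode()
{
  assert(std::fabs(withPow().value()   - 18.0) < 1e-9);
  assert(std::fabs(withCaret().value() - 36.0) < 1e-9);   // Probably not what was meant
}
```

A reader of withCaret() would likely expect 18, which is why pow() is the safer
choice.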

==============================================================================

[13.8] How do I create a subscript operator for a Matrix class?

Use operator() rather than operator[].

When you have multiple subscripts, the cleanest way to do it is with operator()
rather than with operator[].  The reason is that operator[] always takes
exactly one parameter, but operator() can take any number of parameters (in the
case of a rectangular matrix, two parameters are needed).

For example:

 class Matrix {
 public:
   Matrix(unsigned rows, unsigned cols);
   double& operator() (unsigned row, unsigned col);
   double  operator() (unsigned row, unsigned col) const;
   // ...
  ~Matrix();                              // Destructor
   Matrix(const Matrix& m);               // Copy constructor
   Matrix& operator= (const Matrix& m);   // Assignment operator
   // ...
 private:
   unsigned rows_, cols_;
   double* data_;
 };

 inline
 Matrix::Matrix(unsigned rows, unsigned cols)
   : rows_ (rows),
     cols_ (cols),
     data_ (0)
 {
   if (rows == 0 || cols == 0)
     throw BadIndex("Matrix constructor has 0 size");
   data_ = new double[rows * cols];   // Allocate only after the check, so a throw can't leak
 }

 inline
 Matrix::~Matrix()
 {
   delete[] data_;
 }

 inline
 double& Matrix::operator() (unsigned row, unsigned col)
 {
   if (row >= rows_ || col >= cols_)
     throw BadIndex("Matrix subscript out of bounds");
   return data_[cols_*row + col];
 }

 inline
 double Matrix::operator() (unsigned row, unsigned col) const
 {
   if (row >= rows_ || col >= cols_)
     throw BadIndex("const Matrix subscript out of bounds");
   return data_[cols_*row + col];
 }

Then you can access an element of Matrix m using m(i,j) rather than m[i][j]:

 int main()
 {
   Matrix m(10,10);
   m(5,8) = 106.15;
   std::cout << m(5,8);
   // ...
 }

==============================================================================

[13.9] Why shouldn't my Matrix class's interface look like an array-of-array?

Here's what this FAQ is really all about: Some people build a Matrix class that
has an operator[] that returns a reference to an Array object, and that Array
object has an operator[] that returns an element of the Matrix (e.g., a
reference to a double).  Thus they access elements of the matrix using syntax
like m[i][j] rather than syntax like m(i,j)[13.8].

The array-of-array solution obviously works, but it is less flexible than the
operator() approach[13.8].  Specifically, there are easy performance tuning
tricks that can be done with the operator() approach that are more difficult in
the [][] approach, and therefore the [][] approach is more likely to lead to
bad performance, at least in some cases.

For example, the easiest way to implement the [][] approach is to use a
physical layout of the matrix as a dense matrix that is stored in row-major
form (or is it column-major; I can't ever remember).  In contrast, the
operator() approach[13.8] totally hides the physical layout of the matrix, and
that can lead to better performance in some cases.

Put it this way: the operator() approach is never worse than, and sometimes
better than, the [][] approach.
 * The operator() approach is never worse because it is easy to implement the
   dense, row-major physical layout using the operator() approach, so when that
   configuration happens to be the optimal layout from a performance
   standpoint, the operator() approach is just as easy as the [][] approach
   (perhaps the operator() approach is a tiny bit easier, but I won't quibble
   over minor nits).
 * The operator() approach is sometimes better because whenever the optimal
   layout for a given application happens to be something other than dense,
   row-major, the implementation is often significantly easier using the
   operator() approach compared to the [][] approach.

As an example of when a physical layout makes a significant difference, a
recent project happened to access the matrix elements in columns (that is, the
algorithm accesses all the elements in one column, then the elements in
another, etc.), and if the physical layout is row-major, the accesses can
"stride the cache".  For example, if the rows happen to be almost as big as the
processor's cache size, the machine can end up with a "cache miss" for almost
every element access.  In this particular project, we got a 20% improvement in
performance by changing the mapping from the logical layout (row,column) to the
physical layout (column,row).

Of course there are many examples of this sort of thing from numerical methods,
and sparse matrices are a whole other dimension on this issue.  Since it is, in
general, easier to implement a sparse matrix or swap row/column ordering using
the operator() approach, the operator() approach loses nothing and may gain
something -- it has no down-side and a potential up-side.
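
As a sketch of the row/column swap described above, here is a Matrix whose
operator() maps (row,col) onto a column-major layout.  This is an illustration
under assumptions, not the code from the project mentioned: it uses
std::vector<double> and std::out_of_range in place of the new[]/BadIndex of
[13.8], and the rows(), cols(), index(), and sumColumn() names are invented.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>
#include <vector>

class Matrix {
public:
  Matrix(unsigned rows, unsigned cols)
    : rows_(rows), cols_(cols),
      data_(static_cast<std::size_t>(rows) * cols, 0.0)
  {
    if (rows == 0 || cols == 0)
      throw std::out_of_range("Matrix constructor has 0 size");
  }

  double& operator() (unsigned row, unsigned col)       { return data_[index(row, col)]; }
  double  operator() (unsigned row, unsigned col) const { return data_[index(row, col)]; }

  unsigned rows() const { return rows_; }
  unsigned cols() const { return cols_; }

private:
  // The only place that knows the physical layout.
  std::size_t index(unsigned row, unsigned col) const
  {
    if (row >= rows_ || col >= cols_)
      throw std::out_of_range("Matrix subscript out of bounds");
    return static_cast<std::size_t>(rows_) * col + row;   // Column-major
    // Row-major would be: static_cast<std::size_t>(cols_) * row + col
  }

  unsigned rows_, cols_;
  std::vector<double> data_;
};

// User code: walks one column; it never sees the physical layout.
double sumColumn(const Matrix& m, unsigned col)
{
  double sum = 0.0;
  for (unsigned row = 0; row < m.rows(); ++row)
    sum += m(row, col);
  return sum;
}

void userCode()
{
  Matrix m(3, 4);
  m(0,1) = 1.5;
  m(1,1) = 2.5;
  m(2,1) = 3.0;
  assert(sumColumn(m, 1) == 7.0);
}
```

Switching between the two layouts changes one line inside index(); sumColumn()
and every other caller of m(i,j) compile unchanged.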

Use the operator() approach[13.8].

==============================================================================

[13.10] Should I design my classes from the outside (interfaces first) or from
        the inside (data first)?

From the outside!

A good interface provides a simplified view that is expressed in the vocabulary
of a user[7.3].  In the case of OO software, the interface is normally the set
of public methods of either a single class or a tight group of classes[14.2].

First think about what the object logically represents, not how you intend to
physically build it.  For example, suppose you have a Stack class that will be
built by containing a LinkedList:

 class Stack {
 public:
   // ...
 private:
   LinkedList list_;
 };

Should the Stack have a get() method that returns the LinkedList? Or a set()
method that takes a LinkedList? Or a constructor that takes a LinkedList?
Obviously the answer is No, since you should design your interfaces from the
outside-in.  I.e., users of Stack objects don't care about LinkedLists; they
care about pushing and popping.

Now for another example that is a bit more subtle.  Suppose class LinkedList is
built using a linked list of Node objects, where each Node object has a pointer
to the next Node:

 class Node { /*...*/ };

 class LinkedList {
 public:
   // ...
 private:
   Node* first_;
 };

Should the LinkedList class have a get() method that will let users access the
first Node? Should the Node object have a get() method that will let users
follow that Node to the next Node in the chain? In other words, what should a
LinkedList look like from the outside? Is a LinkedList really a chain of Node
objects? Or is that just an implementation detail? And if it is just an
implementation detail, how will the LinkedList let users access each of the
elements in the LinkedList one at a time?

The key insight is the realization that a LinkedList is not a chain of Nodes.
That may be how it is built, but that is not what it is.  What it is is a
sequence of elements.  Therefore the LinkedList abstraction should provide a
"LinkedListIterator" class as well, and that "LinkedListIterator" might have an
operator++ to go to the next element, and it might have a get()/set() pair to
access its value stored in the Node (the value in the Node element is solely
the responsibility of the LinkedList user, which is why there is a get()/set()
pair that allows the user to freely manipulate that value).

Starting from the user's perspective, we might want our LinkedList class to
support operations that look similar to accessing an array using pointer
arithmetic:

 void userCode(LinkedList& a)
 {
   for (LinkedListIterator p = a.begin(); p != a.end(); ++p)
     std::cout << *p << '\n';
 }

To implement this interface, LinkedList will need a begin() method and an end()
method.  These return a "LinkedListIterator" object.  The "LinkedListIterator"
will need a method to go forward, ++p; a method to access the current element,
*p; and a comparison operator, p != a.end().

The code follows.  The important thing to notice is that LinkedList does not
have any methods that let users access Nodes.  Nodes are an implementation
technique that is completely buried.  This makes the LinkedList class safer (no
chance a user will mess up the invariants and linkages between the various
nodes), easier to use (users don't need to expend extra effort keeping the
node-count equal to the actual number of nodes, or any other infrastructure
stuff), and more flexible (by changing a single typedef, users could change
their code from using LinkedList to some other list-like class and the bulk of
their code would compile cleanly and hopefully with improved performance
characteristics).

 #include <cassert>    // Poor man's exception handling
 #include <cstddef>    // NULL

 class LinkedListIterator;
 class LinkedList;

 class Node {
   // No public members; this is a "private class"
   friend class LinkedListIterator;   // A friend class[14]
   friend class LinkedList;
   Node* next_;
   int elem_;
 };

 class LinkedListIterator {
 public:
   bool operator== (LinkedListIterator i) const;
   bool operator!= (LinkedListIterator i) const;
   void operator++ ();   // Go to the next element
   int& operator*  ();   // Access the current element
 private:
   LinkedListIterator(Node* p);
   Node* p_;
   friend class LinkedList;  // so LinkedList can construct a LinkedListIterator
 };

 class LinkedList {
 public:
   void append(int elem);    // Adds elem after the end
   void prepend(int elem);   // Adds elem before the beginning
   // ...
   LinkedListIterator begin();
   LinkedListIterator end();
   // ...
 private:
   Node* first_;
 };

Here are the methods that are obviously inlinable (probably in the same header
file):

 inline bool LinkedListIterator::operator== (LinkedListIterator i) const
 {
   return p_ == i.p_;
 }

 inline bool LinkedListIterator::operator!= (LinkedListIterator i) const
 {
   return p_ != i.p_;
 }

 inline void LinkedListIterator::operator++()
 {
   assert(p_ != NULL);  // or if (p_==NULL) throw ...
   p_ = p_->next_;
 }

 inline int& LinkedListIterator::operator*()
 {
   assert(p_ != NULL);  // or if (p_==NULL) throw ...
   return p_->elem_;
 }

 inline LinkedListIterator::LinkedListIterator(Node* p)
   : p_(p)
 { }

 inline LinkedListIterator LinkedList::begin()
 {
   return first_;
 }

 inline LinkedListIterator LinkedList::end()
 {
   return NULL;
 }

Conclusion: The linked list had two different kinds of data.  The values of the
elements stored in the linked list are the responsibility of the user of the
linked list (and only the user; the linked list itself makes no attempt to
prohibit users from changing the third element to 5), and the linked list's
infrastructure data (next pointers, etc.), whose values are the responsibility
of the linked list (and only the linked list; e.g., the linked list does not
let users change (or even look at!) the various next pointers).

Thus the only get()/set() methods were to get and set the elements of the
linked list, but not the infrastructure of the linked list.  Since the linked
list hides the infrastructure pointers/etc., it is able to make very strong
promises regarding that infrastructure (e.g., if it was a doubly linked list,
it might guarantee that every forward pointer was matched by a backwards
pointer from the next Node).

So, we see here an example of where the values of some of a class's data are
the responsibility of users (in which case the class needs to have get()/set()
methods for that data) but the data that the class wants to control does not
necessarily have get()/set() methods.

Note: the purpose of this example is not to show you how to write a linked-list
class.  In fact you should not "roll your own" linked-list class since you
should use one of the "container classes" provided with your compiler.  Ideally
you'll use one of the standard container classes[34.1] such as the std::list<T>
template.
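
For comparison, here is roughly what the earlier user code looks like with
std::list<int>.  The IntList typedef and the printAll() and userCode() names
are made up for this sketch; push_back() and push_front() stand in for the
append() and prepend() methods above.

```cpp
#include <cassert>
#include <list>
#include <ostream>
#include <sstream>
#include <string>

typedef std::list<int> IntList;   // Change this one typedef to switch containers

// Same loop shape as the LinkedListIterator version; no Node is visible.
void printAll(const IntList& a, std::ostream& out)
{
  for (IntList::const_iterator p = a.begin(); p != a.end(); ++p)
    out << *p << '\n';
}

std::string userCode()
{
  IntList a;
  a.push_back(2);     // Plays the role of append()
  a.push_back(3);
  a.push_front(1);    // Plays the role of prepend()
  assert(a.size() == 3);

  std::ostringstream out;
  printAll(a, out);
  return out.str();
}
```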

==============================================================================

SECTION [14]: Friends


[14.1] What is a friend?

Something to allow your class to grant access to another class or function.

Friends can be either functions or other classes.  A class grants access
privileges to its friends.  Normally a developer has political and technical
control over both the friend and member functions of a class (else you may need
to get permission from the owner of the other pieces when you want to update
your own class).

==============================================================================

[14.2] Do friends violate encapsulation?

No! If they're used properly, they actually enhance encapsulation.

You often need to split a class in half when the two halves will have different
numbers of instances or different lifetimes.  In these cases, the two halves
usually need direct access to each other (the two halves used to be in the same
class, so you haven't increased the amount of code that needs direct access to
a data structure; you've simply reshuffled the code into two classes instead of
one).  The safest way to implement this is to make the two halves friends of
each other.

If you use friends as just described, you'll keep private things private.
People who don't understand this often make naive efforts to avoid using
friendship in situations like the above, and often they actually destroy
encapsulation.  They either use public data (grotesque!), or they make the data
accessible between the halves via public get() and set() member functions.
Having a public get() and set() member function for a private datum is OK only
when the private datum "makes sense" from outside the class (from a user's
perspective).  In many cases, these get()/set() member functions are almost as
bad as public data: they hide (only) the name of the private datum, but they
don't hide the existence of the private datum.

Similarly, if you use friend functions as a syntactic variant of a class's
public access functions, they don't violate encapsulation any more than a
member function violates encapsulation.  In other words, a class's friends
don't violate the encapsulation barrier: along with the class's member
functions, they are the encapsulation barrier.

(Many people think of a friend function as something outside the class.
Instead, try thinking of a friend function as part of the class's public
interface.  A friend function in the class declaration doesn't violate
encapsulation any more than a public member function violates encapsulation:
both have exactly the same authority with respect to accessing the class's
non-public parts.)

==============================================================================

[14.3] What are some advantages/disadvantages of using friend functions?

They provide a degree of freedom in the interface design options.

Member functions and friend functions are equally privileged (100% vested).
The major difference is that a friend function is called like f(x), while a
member function is called like x.f().  Thus the ability to choose between
member functions (x.f()) and friend functions (f(x)) allows a designer to
select the syntax that is deemed most readable, which lowers maintenance costs.

The major disadvantage of friend functions is that they require an extra line
of code when you want dynamic binding.  To get the effect of a virtual friend,
the friend function should call a hidden (usually protected) virtual[20] member
function.  This is called the Virtual Friend Function Idiom[15.9].  For
example:

 class Base {
 public:
   friend void f(Base& b);
   // ...
 protected:
   virtual void do_f();
   // ...
 };

 inline void f(Base& b)
 {
   b.do_f();
 }

 class Derived : public Base {
 public:
   // ...
 protected:
   virtual void do_f();  // "Override" the behavior of f(Base& b)
   // ...
 };

 void userCode(Base& b)
 {
   f(b);
 }

The statement f(b) in userCode(Base&) will invoke b.do_f(), which is
virtual[20].  This means that Derived::do_f() will get control if b is actually
an object of class Derived.  Note that Derived overrides the behavior of the
protected virtual[20] member function do_f(); it does not have its own
variation of the friend function, f(Base&).

==============================================================================

[14.4] What does it mean that "friendship isn't inherited, transitive, or
       reciprocal"?

Just because I grant you friendship access to me doesn't automatically grant
your kids access to me, doesn't automatically grant your friends access to me,
and doesn't automatically grant me access to you.
 * I don't necessarily trust the kids of my friends.  The privileges of
   friendship aren't inherited.  Derived classes of a friend aren't necessarily
   friends.  If class Fred declares that class Base is a friend, classes
   derived from Base don't have any automatic special access rights to Fred
   objects.
 * I don't necessarily trust the friends of my friends.  The privileges of
   friendship aren't transitive.  A friend of a friend isn't necessarily a
   friend.  If class Fred declares class Wilma as a friend, and class Wilma
   declares class Betty as a friend, class Betty doesn't necessarily have any
   special access rights to Fred objects.
 * You don't necessarily trust me simply because I declare you my friend.  The
   privileges of friendship aren't reciprocal.  If class Fred declares that
   class Wilma is a friend, Wilma objects have special access to Fred objects
   but Fred objects do not automatically have special access to Wilma objects.
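
The three rules can be sketched in code.  The members shown (secret_, diary_,
peek(), ask()) and the derived class Pebbles are invented for illustration;
the commented-out lines are the ones a compiler would be expected to reject.

```cpp
#include <cassert>

class Wilma;
class Betty;

class Fred {
public:
  Fred() : secret_(42) { }
  // int peek(const Wilma& w) const { return w.diary_; }    // Error: not reciprocal
private:
  int secret_;
  friend class Wilma;        // Fred grants access to Wilma (and only Wilma)
};

class Wilma {
public:
  Wilma() : diary_(7) { }
  int peek(const Fred& f) const { return f.secret_; }       // OK: Wilma is Fred's friend
private:
  int diary_;
  friend class Betty;        // Wilma grants access to Betty
};

class Betty {
public:
  int peek(const Wilma& w) const { return w.diary_; }       // OK: Betty is Wilma's friend
  // int peek(const Fred& f) const { return f.secret_; }    // Error: not transitive
};

class Pebbles : public Wilma {
public:
  // int peekToo(const Fred& f) const { return f.secret_; } // Error: not inherited
  int ask(const Fred& f) const { return peek(f); }          // Must go through Wilma::peek()
};

void userCode()
{
  Fred f;
  Wilma w;
  Betty b;
  Pebbles p;
  assert(w.peek(f) == 42);
  assert(b.peek(w) == 7);
  assert(p.ask(f) == 42);
}
```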

==============================================================================

[14.5] Should my class declare a member function or a friend function?

Use a member when you can, and a friend when you have to.

Sometimes friends are syntactically better (e.g., in class Fred, friend
functions allow the Fred parameter to be second, while members require it to be
first).  Another good use of friend functions is the binary infix arithmetic
operators.  E.g., aComplex + aComplex should be defined as a friend rather than
a member if you want to allow aFloat + aComplex as well (member functions don't
allow promotion of the left hand argument, since that would change the class of
the object that is the recipient of the member function invocation).

In other cases, choose a member function over a friend function.
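
A minimal sketch of the aFloat + aComplex case follows.  This Complex class is
a stripped-down assumption written for the example (in real code you would
normally reach for std::complex<double>); the point is only that the friend
form lets the left-hand operand be promoted.

```cpp
#include <cassert>

class Complex {
public:
  // Not explicit, so a float or double can be promoted to a Complex
  Complex(double re, double im = 0.0) : re_(re), im_(im) { }

  double real() const { return re_; }
  double imag() const { return im_; }

  // Friend rather than member: both operands are ordinary parameters,
  // so either one may be converted from a floating-point value.
  friend Complex operator+ (const Complex& a, const Complex& b);

private:
  double re_, im_;
};

inline Complex operator+ (const Complex& a, const Complex& b)
{
  return Complex(a.re_ + b.re_, a.im_ + b.im_);
}

void userCode()
{
  Complex aComplex(1.0, 2.0);
  float aFloat = 3.0f;

  Complex x = aComplex + aFloat;   // Would also work with a member operator+
  Complex y = aFloat + aComplex;   // Works only because operator+ is a non-member

  assert(x.real() == 4.0 && x.imag() == 2.0);
  assert(y.real() == 4.0 && y.imag() == 2.0);
}
```

Had operator+ been declared as a member, aFloat + aComplex would not compile,
since a float has no member functions to call.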

==============================================================================


Send corrections/additions to the FAQ Maintainer:
cline@parashift.com (Marshall Cline)





Last Update March 27 2014 @ 02:11 PM