Perl 5 Internals
Chapter 6. Fundamental Operations

6.2. PP Code

We know the order in which the operations execute, and what some of them do. Now it's time to look at how they're implemented - the source code inside the interpreter that carries out print, +, and other operations.

The functions which implement operations are known as "PP Code" - "Push / Pop Code" - because most of their work involves popping elements off a stack, performing some operation on them, and then pushing the result back. PP code can be found in several files: pp_hot.c contains frequently used code, put into a single object to encourage CPU caching; pp_ctl.c contains operations related to flow control; pp_sys.c contains the system-specific operations such as file and network handling; pack and unpack recently moved to pp_pack.c; and pp.c contains everything else.

6.2.1. The argument stack

We've already talked a little about the argument stack. The Perl interpreter makes use of several stacks, but the argument stack is the main one.

The best way to see how the argument stack is used is to watch it in operation. With a debugging build of Perl, the -Ds command line switch prints out the contents of the stack in symbolic format between operations. Here is a portion of the output of running $a=5; $b=10; print $a+$b;:

(-e:1)  nextstate
    =>
(-e:1)  pushmark
    =>  *
(-e:1)  gvsv(main::a)
    =>  *  IV(5)
(-e:1)  gvsv(main::b)
    =>  *  IV(5)  IV(10)
(-e:1)  add
    =>  *  IV(15)
(-e:1)  print
    =>  SV_YES

At the beginning of a statement, the stack is typically empty. First, Perl pushes a mark onto the stack so that print will know when to stop popping off arguments. Next, the values of $a and $b are retrieved and pushed onto the stack.

The addition operator is a binary operator, and hence, logically, it takes two values off the stack, adds them together and puts the result back onto the stack. Finally, print takes all of the values off the stack down to the mark pushed earlier and prints them out. Let's not forget that print itself has a return value, the true value SV_YES, which it pushes back onto the stack.
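The trace above can be imitated with a small stand-alone model. This is only a sketch under loud assumptions: the names ToyStack, toy_pushmark, toy_add and toy_print are invented for illustration, plain long values stand in for SVs, and the stacks are fixed-size arrays, which is not how the interpreter stores them.

```c
#include <assert.h>
#include <stddef.h>
#include <stdio.h>

/* Toy model only: the real stacks hold SV pointers and grow on demand. */
#define TOY_STACK_MAX 32

typedef struct {
    long   values[TOY_STACK_MAX]; /* the argument stack */
    size_t sp;                    /* number of values currently on it */
    size_t marks[TOY_STACK_MAX];  /* the mark stack: saved stack depths */
    size_t nmarks;
} ToyStack;

void toy_init(ToyStack *st) { st->sp = 0; st->nmarks = 0; }

/* pushmark: remember the current depth so a list operator
 * can later tell where its arguments begin. */
void toy_pushmark(ToyStack *st) {
    assert(st->nmarks < TOY_STACK_MAX);
    st->marks[st->nmarks++] = st->sp;
}

void toy_push(ToyStack *st, long v) {
    assert(st->sp < TOY_STACK_MAX);
    st->values[st->sp++] = v;
}

long toy_pop(ToyStack *st) {
    assert(st->sp > 0);
    return st->values[--st->sp];
}

/* add: a binary operator takes two values off and puts one back. */
void toy_add(ToyStack *st) {
    long right = toy_pop(st);
    long left  = toy_pop(st);
    toy_push(st, left + right);
}

/* print: consume everything above the most recent mark, print it,
 * then push 1 as a stand-in for the true return value.
 * Returns the number of arguments consumed. */
size_t toy_print(ToyStack *st) {
    assert(st->nmarks > 0);
    size_t mark = st->marks[--st->nmarks];
    size_t n = st->sp - mark;
    for (size_t i = mark; i < st->sp; i++)
        printf("%ld", st->values[i]);
    st->sp = mark;      /* drop the arguments */
    toy_push(st, 1);    /* the operator's own return value */
    return n;
}
```

Calling toy_pushmark, toy_push twice (for 5 and 10), toy_add and toy_print in that order walks through the same stack states as the -Ds output: empty, mark, two values, one value, and finally the single true value.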

6.2.2. Stack manipulation

Let's now take a look at one of the PP functions, the integer addition function pp_i_add. The code may look formidable, but it's a good example of how the PP functions manipulate values on the stack.

PP(pp_i_add)                                               
{
    dSP; dATARGET; tryAMAGICbin(add,opASSIGN);             
    {
      dPOPTOPiirl_ul;                                      
      SETi( left + right );                                
      RETURN;                                              
    }
}

PP(pp_i_add): In case you haven't guessed, everything in this function is a macro. This first line declares the function pp_i_add to be the appropriate type for a PP function.
dSP; dATARGET; tryAMAGICbin(add,opASSIGN): Since the following macros will need to manipulate the stack, the first thing we need is a local copy of the stack pointer, SP. And since this is C, we need to declare this in advance: dSP declares a stack pointer. Then we need an SV to hold the return value, a "target". This is declared with dATARGET; see Section 6.4 for more on how targets work. Finally, there is a chance that the addition operator has been overloaded using the overload pragma. The tryAMAGICbin macro tests to see if it is appropriate to perform "A" (overload) magic on either of the scalars in a binary operation, and if so, does the addition using a magic method call.
dPOPTOPiirl_ul: We will deal with two values, left and right. The dPOPTOPiirl_ul macro pops the right operand off the stack and reads the left operand from the new top of the stack (hence POPTOP), converts them to two integers (hence ii) and stores them in the automatic variables right and left (hence rl). The _ul? Look up the definition in pp.h and work it out...
SETi( left + right ): We add the two values together and set the integer value of the target to the result; SETi then stores the target in the top slot of the stack, replacing the left operand.
RETURN: As mentioned above, operators are expected to return the next op to be executed, and in most cases this is simply the value of op_next. Hence RETURN performs a normal return, copying our local stack pointer SP which we obtained above back into the global stack pointer variable, and then returning the op_next.
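The shape of such a function, with the macros written out by hand, might look roughly like the following. This is a simplified imitation rather than the interpreter's code: MiniOp, mini_stack, mini_stack_sp, mini_curop and mini_runops are hypothetical names, plain long values replace SVs, and overloading and targets are left out entirely. The net effect on the stack is the same as taking two values off and putting one back.

```c
#include <assert.h>
#include <stddef.h>

typedef struct mini_op MiniOp;
struct mini_op {
    MiniOp *(*op_ppaddr)(void); /* function that implements this op */
    MiniOp *op_next;            /* op to execute afterwards */
};

/* Hypothetical stand-ins for the interpreter's global state.
 * mini_stack[0] is an unused base slot; the stack pointer always
 * points at the current top element. */
long    mini_stack[16];
long   *mini_stack_sp = mini_stack;
MiniOp *mini_curop;

MiniOp *mini_pp_i_add(void) {
    long *sp = mini_stack_sp;       /* in the spirit of dSP: a local copy */
    assert(sp - mini_stack >= 2);   /* two operands must be present */

    long right = *sp--;             /* pop the right operand */
    long left  = *sp;               /* read the left one, leaving its slot */

    *sp = left + right;             /* overwrite the top slot with the result */

    mini_stack_sp = sp;             /* in the spirit of RETURN: write back... */
    return mini_curop->op_next;     /* ...and hand over the next op */
}

/* A minimal run loop: keep calling ops until one returns NULL. */
void mini_runops(MiniOp *start) {
    mini_curop = start;
    while (mini_curop != NULL)
        mini_curop = mini_curop->op_ppaddr();
}
```

Working on a local copy of the stack pointer and writing it back once at the end is the design point worth noticing here; whether that saves anything in this toy is beside the point.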

As you might have guessed, there are a number of macros for controlling what happens to the stack; these can be found in pp.h. The more common of these are:

POPs

Pop an SV off the stack and return it.

POPpx

Pop a string off the stack and return it. (Note: requires a variable "STRLEN n_a" to be in scope.)

POPn

Pop an NV off the stack.

POPi

Pop an IV off the stack.

TOPs

Return the top SV on the stack, but do not pop it. (The macros TOPpx, TOPn, etc. are analogous)

TOPm1s

Return the penultimate SV on the stack. (There is no TOPm1px, etc.)

PUSHs

Push the scalar onto the stack; you must ensure that the stack has enough space to accommodate it.

PUSHn

Set the NV of the target to the given value, and push it onto the stack. PUSHi, etc. are analogous.

There are also XPUSHs, XPUSHn, etc., which extend the stack if necessary before pushing.

SETs

This sets the top element of the stack to the given SV. SETn, etc. are analogous.

dTOPss, dPOPss

These declare a variable called sv and initialise it: dTOPss sets it to the top entry of the stack without popping, while dPOPss pops an entry off the stack and sets sv to it.

dTOPnv, dPOPnv

These are similar, but declare a variable called value of the appropriate type. dTOPiv and so on are analogous.
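To make the difference between the push, extended push, pop, top and set families concrete, here is a rough stand-alone analogue. Everything in it is an assumption made for illustration: DemoStack and the demo_* functions are invented names, long values stand in for SVs, and the growth policy is arbitrary; the real macros in pp.h are defined quite differently.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

typedef struct {
    long  *base; /* bottom of the stack; base[0] is an unused slot */
    long  *sp;   /* points at the current top element */
    size_t cap;  /* number of slots allocated */
} DemoStack;

int demo_init(DemoStack *st, size_t cap) {
    assert(cap >= 1);
    st->base = malloc(cap * sizeof *st->base);
    if (st->base == NULL) return -1;
    st->sp = st->base;
    st->cap = cap;
    return 0;
}

void demo_free(DemoStack *st) {
    free(st->base);
    st->base = st->sp = NULL;
    st->cap = 0;
}

/* Make sure at least n more slots are available, growing if needed. */
int demo_extend(DemoStack *st, size_t n) {
    size_t used = (size_t)(st->sp - st->base) + 1; /* includes base slot */
    if (used + n <= st->cap) return 0;
    size_t newcap = (used + n) * 2;
    long *p = realloc(st->base, newcap * sizeof *p);
    if (p == NULL) return -1;
    st->base = p;
    st->sp = p + (used - 1); /* re-point sp into the new block */
    st->cap = newcap;
    return 0;
}

/* PUSHs-like: the caller must already have ensured there is room. */
void demo_push(DemoStack *st, long v) {
    assert((size_t)(st->sp - st->base) + 1 < st->cap);
    *++st->sp = v;
}

/* XPUSHs-like: extend first, then push. */
int demo_xpush(DemoStack *st, long v) {
    if (demo_extend(st, 1) != 0) return -1;
    *++st->sp = v;
    return 0;
}

/* POPs-like: remove and return the top element. */
long demo_pop(DemoStack *st) { assert(st->sp > st->base); return *st->sp--; }

/* TOPs-like: return the top element without removing it. */
long demo_top(const DemoStack *st) { assert(st->sp > st->base); return *st->sp; }

/* TOPm1s-like: return the element just below the top. */
long demo_topm1(const DemoStack *st) {
    assert(st->sp - st->base >= 2);
    return st->sp[-1];
}

/* SETs-like: overwrite the top element in place. */
void demo_set(DemoStack *st, long v) { assert(st->sp > st->base); *st->sp = v; }
```

With a stack created by demo_init(&st, 2) there is room for exactly one value, so a second demo_push would trip the assertion, whereas demo_xpush grows the allocation first.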

In some cases, the PP code is purely concerned with rearranging the stack, and the PP function will call out to another function in doop.c to actually perform the relevant operation.
