Using WithClass to Reverse PHP
by Michael Gold

Figure 1 - PHP Add-In Created in VBA
WithClass lets you create Add-Ins for reverse engineering code written in languages of your choice. It can even be used to reverse scripting languages that have some object-oriented functionality, such as Perl or PHP. This article steps you through creating an Add-In for reverse engineering your PHP classes.
The first step in creating any Add-In is to go to the File menu and choose New AddIn...

Figure 2 - Creating a New Add-In from WithClass
This will bring up the AddIn creation dialog. If you type on the first line, all the other lines will fill in automatically. You can customize the content of any of the fields in the dialog. There are two additional checkboxes for customizing your add-in. One box tells the add-in to shut down automatically after running. You can uncheck this to keep the add-in running while WithClass is running. This is useful if your add-in is a modeless form or an application that traps events in WithClass. There is also a checkbox that allows you to run the add-in automatically when WithClass starts. You can use this feature to customize the behavior of WithClass itself. For example, you can change the menus that appear when WithClass starts up.

Figure 3 - New Add-In Dialog
When you have finished filling in the fields and click OK, the program will create the Add-In, register it with WithClass in the Add-In menu, and bring you into the VBA editor for editing your Add-In. The Launching Procedure of the Add-In (the same one we typed into the Launching Procedure field of the Add-In Wizard, shown in Figure 3) is called PHPReverseStart. Below is the procedure created by the Add-In Wizard:
Listing 1 - Launching Procedure
    Sub PHPReverseStart()
    End Sub
Any VBA code we add to this procedure will be run first. For this add-in, we will start it up by running our user form. By right-clicking on our VBA project, we can insert a new user form into the project:

Figure 4 - Inserting a UserForm into our project
The UserForm menu brings up a resource editor in which we can put buttons, edit boxes, images and other controls. Below is the UserForm we created for the PHPReverse Add-In to retrieve the directory path we want to reverse:

Figure 5 - Directory Entry Screen
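As a minimal sketch of how the launching procedure could start the form, the wizard-generated stub can be filled in with a single call to the form's Show method. This assumes the inserted form kept the default name UserForm1; the article does not state the form's name, so substitute your own.

```vba
Sub PHPReverseStart()
    ' Assumption: the inserted form kept its default name, UserForm1
    UserForm1.Show
End Sub
```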
Double-clicking on the GO button creates the command event handler we will use to launch our PHP reversing code. We just need to type in the name of the routine we are calling, along with the string parameter retrieved from the TextBox contained in the UserForm:
Listing 2 - Go Button Event handler to call ReversePHPDirectory
    Private Sub CommandButton1_Click()
        AddIn.ReversePHPDirectory TextBox1.Text
    End Sub
The ReversePHPDirectory routine, called after the GO button is pressed, loops through each php file in the directory and calls PHPParse. When it finishes, it arranges the classes and relationships that have been reversed:
Listing 3 - Method to reverse engineer the directory of php files
    Public Sub ReversePHPDirectory(MyDir As String)
        Dim theFile As String
        Dim MyPath As String
        Dim MyName As String

        ' Check for an exception
        On Error GoTo BadDir

        ' theFile = VBA.InputBox("Enter the PHP directory name to reverse ")

        ' Display the names in C:\ that represent directories.
        If (Right(MyDir, 1) <> "\") Then
            MyDir = MyDir + "\" ' add a slash if it's not there
        End If

        ' Create the path for retrieving each php file for reversing
        ' Parse the file and create a class or classes from it
        ' Create inheritance relationships and arrange classes accordingly
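The loop over the directory is only described by comments in the listing above. One plausible way to write it, sketched here with the standard VBA Dir function, is to collect the path of every .php file first and then hand each path to the parser. ListPHPFiles is a hypothetical helper name, not part of the original add-in.

```vba
Option Explicit

' Hypothetical helper: returns the full paths of all *.php files in MyDir.
Public Function ListPHPFiles(ByVal MyDir As String) As Collection
    Dim result As New Collection
    Dim MyName As String

    ' Add a trailing backslash if it is not there
    If Right$(MyDir, 1) <> "\" Then MyDir = MyDir & "\"

    ' Dir with a pattern returns the first match;
    ' Dir with no arguments returns the next one, or "" when none remain
    MyName = Dir(MyDir & "*.php")
    Do While MyName <> ""
        result.Add MyDir & MyName
        MyName = Dir
    Loop

    Set ListPHPFiles = result
End Function
```

A caller could then loop with For Each over the returned collection and pass each path to PHPParse, followed by one call to CreateInheritanceRelationships once every file has been parsed.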
PHP Classes
A PHP class looks like a cross between a Java class and a JavaScript script. Below is a simple set of PHP classes that we will reverse engineer into a UML class diagram with our add-in:
Listing 4 - Simple PHP classes representing airplanes
    class Airplane {
        var $tirePressure;
        var $fuelLevel;
        var $passengerLimit;

        function takeOff() { }
        function land() { }
        function preFlightCheck() { }
    }

    class sevenFortySeven extends Airplane {
        function preFlightCheck() { }
    }

    class biplane extends Airplane {
        function preFlightCheck() { }
    }
Here are a few keywords that will help us out with our reverse engineering strategy:
Classes are found using the class keyword.
Inheritance is accomplished in PHP through the extends keyword.
Class operations are identified with the help of the function keyword.
The var keyword indicates the presence of an attribute.
To parse the PHP code and create our classes, we will create a simple state machine. The state machine uses the Select Case statement of the VBA language. Below are the states we've chosen to represent in our code. Each state serves a purpose in extracting a different component of our class.
Listing 5 - States of the PHP Parsing state engine
    Const IdleState = 1
    Const OperationState = 2
    Const BaseState = 3
    Const CodeState = 4
    Const ClassNameState = 5
    Const ClassFinishedState = 6
    Const PackageNameState = 7
    Const AttributeState = 8
The state machine is a simple loop that continues to parse out the next keyword using the built-in WithClass Parser class. Below is the structure of our state machine subroutine:
Listing 6 - VBA State Machine Shell for Reverse Engineering
    Sub PHPPARSE(theFile As String)
        ....
        ' Get the Active Document we are reversing into
        ' Get a Built-In Parsing object from WithClass
        ' Get the first token (word or symbol)
Inside each Case statement are the conditions and actions that occur in the current state. Let's look at one of the states, the Idle State. This is the initial state of the state machine. Each condition is handled by an If statement that compares the current token against a symbol or keyword in PHP. If the condition is met, an action is taken, such as creating a class, creating an operation, creating a relationship, or simply transitioning to the next state. State transitions are handled by setting the state variable equal to the next state. The idle state looks for an initial class keyword so it can transition to the ClassNameState. It also filters out comments and finds functions (operations) and vars (attributes):
Listing 7 - Idle State in the PHP Parsing state machine
    Case IdleState
        ' if it's a comment, skip it
Note: The PHP state machine assumes that every file contains only classes. This is why functions and vars are assumed to be members of the class in the idle state. If you want to alter the state machine to handle files that mix classes and PHP script, you would need to move the function and var conditions into an InsideClass state.
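The state-transition pattern described above can be tried outside WithClass with a small, self-contained sketch. Everything in it is an assumption made for illustration: Tokenize is a crude stand-in for the WithClass Parser (it pads punctuation with spaces and splits on spaces), SketchParse records plain strings instead of creating model objects, the extends keyword is handled directly in the idle state for brevity, and comments are not filtered at all.

```vba
Option Explicit

Const IdleState = 1
Const OperationState = 2
Const BaseState = 3
Const ClassNameState = 5
Const AttributeState = 8

' Hypothetical stand-in for the WithClass parser: pads punctuation with
' spaces so that Split yields one token per word or symbol.
Private Function Tokenize(ByVal source As String) As Variant
    Dim symbols As Variant, i As Long
    symbols = Array("{", "}", "(", ")", ";")
    For i = LBound(symbols) To UBound(symbols)
        source = Replace(source, symbols(i), " " & symbols(i) & " ")
    Next i
    source = Replace(source, vbCrLf, " ")
    source = Replace(source, vbLf, " ")
    source = Replace(source, vbTab, " ")
    Tokenize = Split(source, " ")
End Function

' Walks the tokens and records what a real add-in would create in the model.
Public Function SketchParse(ByVal source As String) As Collection
    Dim found As New Collection
    Dim tokens As Variant, i As Long
    Dim token As String
    Dim state As Long

    tokens = Tokenize(source)
    state = IdleState

    For i = LBound(tokens) To UBound(tokens)
        token = tokens(i)
        If Len(token) > 0 Then          ' skip empty strings left by Split
            Select Case state
                Case IdleState
                    ' a keyword decides which state reads the next token
                    If token = "class" Then state = ClassNameState
                    If token = "extends" Then state = BaseState
                    If token = "function" Then state = OperationState
                    If token = "var" Then state = AttributeState
                Case ClassNameState
                    found.Add "class " & token
                    state = IdleState
                Case BaseState
                    found.Add "extends " & token
                    state = IdleState
                Case OperationState
                    found.Add "operation " & token
                    state = IdleState
                Case AttributeState
                    found.Add "attribute " & Mid$(token, 2) ' drop the leading $
                    state = IdleState
            End Select
        End If
    Next i

    Set SketchParse = found
End Function
```

Each state consumes exactly one token and then returns to the idle state, which keeps the sketch short. The add-in in this article has additional states, such as CodeState, that consume more than one token.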
WithClass has many objects that allow you to create classes, relationships, operations, attributes, states, and other shapes on the fly. Most objects are contained in collections that allow you to traverse each object contained within another object. For example, there are Attribute and Operation collections inside a class object. An operation collection contains one or more operation objects, and an attribute collection contains one or more attribute objects. The class object can create an operation with the NewOperation method, for example. Below is the OperationState of our state machine. It creates a new operation named after the current token:
Listing 8 - OperationState of the State Machine for creating an operation inside the class
    Case OperationState
        ' Below is the case when there is no class in the php file; we need to pretend the file is a class
        If (theClass Is Nothing) Then
            Set theClass = wcDocument.NewClass(xSpace, 100, GetClassFromFile(theFile))
        End If
        ' Create a new operation in the operation collection of the current class
        Set theOperation = theClass.NewOperation(nextToken)
        ' If the class name and the operation name are the same, then the operation is a constructor
PHP uses braces to enclose a block of code or a block definition. The block can belong to an if statement, a function, or a whole class. We've written a VBA function that matches braces and extracts the string that resides between them. This is useful for pulling out the code in a function, for example, as in the CodeState shown below:
Listing 9 - CodeState in the State Machine for extracting code into the current operation
    Case CodeState
        ' go to the first bracket in the function
The code for matching the braces is shown below in Listing 10. It uses a counter to determine when the braces are matched. If another opening brace is found, braceCount is incremented. If a closing brace is found, the count is decremented. When the count reaches 0, the braces have been matched:
Listing 10 - MatchBrace function for matching braces in a PHP function
    Function MatchBrace(aParser As With_Class.Parser) As String
        Dim startParse As Long
        Dim endParse As Long
        Dim braceCount As Long
        Dim nextToken As String

        startParse = aParser.Ptr

        ' initialize the bracket count for matching
        braceCount = 1
        While ((braceCount > 0) And (aParser.IsLastToken = False))
            nextToken = aParser.GetNextNonSpaceToken
            If (nextToken = "{") Then
                braceCount = braceCount + 1 ' Begin bracket found, increment count
            End If
            If (nextToken = "}") Then
                braceCount = braceCount - 1 ' End bracket found, decrement count
            End If
        Wend

        startParse = startParse + 1 ' get rid of first brace
        endParse = aParser.Ptr

        ' use the located bracket positions to extract the string between them
        MatchBrace = Mid(aParser.Buffer, startParse, endParse - startParse)
    End Function
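The same counting idea can be tried on a plain string, without the WithClass Parser. The sketch below is a character-based variant, not the add-in's code: ExtractBlock is a hypothetical name, and it assumes the caller already knows the 1-based position of the opening brace.

```vba
Option Explicit

' Hypothetical helper: given a string and the 1-based position of an opening
' brace, returns the text between that brace and its matching closing brace.
' Returns an empty string if the braces never balance.
Public Function ExtractBlock(ByVal src As String, ByVal openPos As Long) As String
    Dim braceCount As Long
    Dim pos As Long
    Dim ch As String

    braceCount = 1                  ' the brace at openPos is already open
    pos = openPos
    Do While braceCount > 0 And pos < Len(src)
        pos = pos + 1
        ch = Mid$(src, pos, 1)
        If ch = "{" Then
            braceCount = braceCount + 1
        ElseIf ch = "}" Then
            braceCount = braceCount - 1
        End If
    Loop

    If braceCount = 0 Then
        ' pos now sits on the matching closing brace
        ExtractBlock = Mid$(src, openPos + 1, pos - openPos - 1)
    End If
End Function
```

For example, ExtractBlock(s, InStr(s, "{")) returns the body of the first block in s. Because this sketch counts raw characters, a brace inside a PHP string literal or comment would throw the count off.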
The last subroutine we will talk about is run after the state machine is finished. This routine goes through each class and checks whether the name in its LibraryBaseClass field is the name of one of the classes in our generated class collection. If we find a class that matches the name in LibraryBaseClass, we can create an inheritance relationship between the two classes. After it has finished creating any new relationships, the routine also uses some of the built-in functions of the WithClass Document to arrange the classes and relationships.
Listing 11 - Creates the inheritance relationships using the LibraryBaseClass field inside the class
    Sub CreateInheritanceRelationships()
        ' Declare the objects needed to work with classes and relationships
        Dim theClasses As With_Class.Classes
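The matching step described above can be sketched independently of the WithClass object model. In this sketch the reversed classes are represented by two parallel arrays, one of class names and one of base class names (empty where a class has no extends clause). FindInheritancePairs is a hypothetical name, and the function only reports the pairs that a real add-in would turn into inheritance relationships.

```vba
Option Explicit

' Hypothetical helper: classNames(i) is a reversed class and baseNames(i) is
' the name recorded from its "extends" clause ("" when there is none).
' Returns one "derived -> base" entry for each base found among the classes.
Public Function FindInheritancePairs(classNames As Variant, baseNames As Variant) As Collection
    Dim pairs As New Collection
    Dim i As Long, j As Long

    For i = LBound(classNames) To UBound(classNames)
        If Len(baseNames(i)) > 0 Then
            For j = LBound(classNames) To UBound(classNames)
                ' PHP class names are case-insensitive, so compare as text
                If StrComp(classNames(j), baseNames(i), vbTextCompare) = 0 Then
                    pairs.Add classNames(i) & " -> " & classNames(j)
                    Exit For
                End If
            Next j
        End If
    Next i

    Set FindInheritancePairs = pairs
End Function
```

A base class that is not among the reversed classes, such as one defined in a library, simply produces no pair.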
The result of running this add-in from the WithClass menu on the sample PHP code we looked at is shown below:

Figure 6 - PHP Diagram produced by running the reverse add-in
Conclusion
WithClass's Add-In and Automation Architecture makes it fairly straightforward to create add-ins that generate code, generate reports, import comments, and even reverse engineer existing code. In this example we saw how the WithClass Add-In utility can be used to create an Add-In for reverse engineering PHP classes. The principles in this article can be applied to reverse engineering any OO-like language. What's nice about the Add-In architecture is that it gives WithClass users control over how their code is reversed. For example, one could add parameter reverse engineering to this state engine by creating a ParameterState and parsing the variables between the parentheses that follow the operation name. If you are curious about experimenting with Add-Ins, download the enterprise demo of WithClass 2000 and follow the steps in this article.