Bush Guide: Part 5 - BUSH Hackers Guide

This part of the guide contains detailed descriptions of how BUSH is structured, how to add packages and how to port BUSH.  The information in this section is subject to change between different versions of BUSH (and may be out of date).

BUSH Structure

BUSH is written in GCC Ada (GNAT) and GCC C.

Files ending in ".ads" are Ada package specifications.  They declare the subprograms a package provides (like a C header file) and document them.

Files ending in ".adb" are Ada package bodies.  They contain the actual implementation of the package (like a C ".c" file).

Except for the main program (bush.adb), every package has both a specification and a body.

The BUSH project files can be divided into four layers, each built on top of the previous one:

Utility Packages

Scanner Packages

Parser Packages

Main Program

Porting BUSH to UNIX

You will need a basic knowledge of Ada.  For example, read chapter 10 of the Big Online Book of Linux Ada Programming (http://www.pegasoft.ca/homes/book.html).

Install a GCC with the Ada language enabled.  Either compile GCC 3.x with Ada or download a version of GNAT for your computer from New York University (ftp://cs.nyu.edu/pub/gnat).  The NYU site has a Solaris version.  An SCO UNIX version is available at http://www.gnuada.org/sco.html.  A NetBSD version is available at http://www.gnuada.org/netbsd.html.  Older versions for other operating systems are lurking around on the web, but GCC 3.x might be your best solution.

The bush_os packages (bush_linux, etc.) contain bindings to UNIX/Linux system calls and other operating system constants.  To port BUSH to a UNIX-based operating system, make a copy of bush_linux and rename it for your operating system.  Then edit the file and make sure all the system calls and constants are updated to reflect your version of UNIX.  This is (theoretically) the only BUSH package that needs to be modified.

You will have to check the man pages and the C include files to find the necessary information.  For example, to get tty driver ioctl() values (TCGETATTR, etc.) you may have to read the C include files looking for TCGETATTR (or the equivalent).

Your version of UNIX may not have a particular system call.  If so, you may have to fake the call using other operating system calls.  For example, "htons" is a macro on HP-UX, not a function call.  Since htons does nothing on HP-UX, write an htons function for the bush_hp package that returns unchanged whatever parameter it is given.
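As a rough sketch of the htons workaround, the following small Ada program declares a pass-through htons and calls it once.  The procedure name htons_sketch and the use of Interfaces.C.unsigned_short are assumptions made for illustration; the real bush_os packages may use different type names, and in a real port the function would live in the bush_hp package rather than in a test program.

```ada
with Ada.Text_IO;
with Interfaces.C;

procedure htons_sketch is
   use type Interfaces.C.unsigned_short;

   -- Hypothetical stand-in for the C htons.  On a platform where
   -- htons is a real function, a binding would import it with
   -- pragma import( C, htons ) instead.  Here it is written in Ada
   -- and simply hands back its argument.
   function htons( hostshort : Interfaces.C.unsigned_short )
      return Interfaces.C.unsigned_short is
   begin
      return hostshort;
   end htons;

begin
   -- 16#1F90# is port 8080; the pass-through must not alter it.
   if htons( 16#1F90# ) = 16#1F90# then
      Ada.Text_IO.Put_Line( "htons returned its argument unchanged" );
   end if;
end htons_sketch;
```

Because the rest of BUSH only sees the name htons, the callers do not need to know whether the operating system package imports the function from C or fakes it in Ada.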

Run the BUSH test suite and try the command line features to make sure your bush_os bindings are correct.

Finally, send Ken your bush_os file and he will add it to the BUSH source code.

Porting BUSH to MS-DOS or Windows

BUSH was designed to run on UNIX/Linux operating systems.  It was not designed to run on Windows.  It is, of course, possible to port BUSH to Windows or MS-DOS, but it will take more work than porting it to a UNIX variation.

You will need to understand the Ada language since you'll probably have to write functions which simulate UNIX system calls.  For example, read chapters 10 and 11 of the Big Online Book of Linux Ada Programming (http://www.pegasoft.ca/homes/book.html).

To port BUSH to MS-DOS, download the GNAT compiler for DOS available at http://www.gnuada.org/dos.html.  This compiler is based on DJGPP which includes libraries that implement many UNIX functions on DOS.  Ken has not used DJGPP so he cannot tell you how many required UNIX functions are missing.

The bush_os packages (bush_linux, etc.) contain bindings to UNIX/Linux system calls and other operating system constants.  To port BUSH to MS-DOS, make a copy of bush_linux and rename it to bush_dos.  You may have to fake some UNIX calls using MS-DOS calls.

You may also have to modify user_io which is UNIX-specific.  (Someday Ken intends to move the operating system specific parts of user_io into bush_os, but he hasn't gotten around to it yet.  If you're undertaking a MS-DOS or Windows port, you may want to do this as well.)

To port BUSH to Windows, you need to download GNAT for Windows from New York University (ftp://cs.nyu.edu/pub/gnat).  This version of GNAT is based on mingw (not Cygwin, the POSIX layer which sits on top of the Win32 API to enable porting of UNIX software).  GNAT for Windows produces native applications which run directly on the Win32 API.  Ken has no knowledge of mingw so don't ask him any questions.  If mingw doesn't emulate any UNIX system calls, then most of bush_os will have to be written with bindings to fake the UNIX calls using Windows calls.

Ken has received a number of requests for a Windows port.  Although it will be a challenge, there is apparently a lot of interest in it and your work will not be wasted.

Run the BUSH test suite and try the command line features to make sure your bush_os bindings are correct.

Finally, send Ken your bush_os file and he will add it to the BUSH source code.

Adding New Built-in Packages

This is not a discussion on language design.  This is a quick overview on adding new built-in packages.  We'll use the Ada.Calendar package, and its Year function in particular, as an example binding.

In the current version of BUSH, packages are hard-coded into the parser.  There's no ability to create separate files that can be loaded on demand.  Some day this ability may be added.

If you're adding a third-party Ada project (like AdaCGI), create a subdirectory to hold the project and modify the main BUSH makefile to compile (and make clean) the project.  Test your changes by making a clean rebuild of BUSH.

Now you need to declare your identifiers in the BUSH scanner.

Add the identifier variables needed by your binding to the scanner.ads file.  For example, if you were implementing the Ada.Calendar package, create a cal_clock_t, cal_year_t and so on for each identifier you'll need.  Include any types.  By convention these end in "_t" (for "token").

-- Built-in Calendar package identifiers

cal_clock_t : identifier; -- "calendar.clock"

Add declaration calls in scanner.adb's resetScanner procedure to declare the identifier variables (that is, add them to the symbol table and identify what types they are).  Copy some of the other declarations that are similar to the ones you are doing.  For example, to declare Ada.Calendar's time type,

declareIdent( cal_time_t, "calendar.time", variable_t, typeClass );

There are several declare calls.  declareIdent is a general purpose declaration that uses the identifier variable, the name of the identifier (as the user would type it), what root type it's derived from (variable_t is used for private types, integer_t for integers, string_t for strings, and so forth), and the class of identifier (typeClass for a type declaration, subClass for a subtype declaration, and so forth).

Recompile BUSH and check for errors.

The BUSH scanner now recognizes your identifiers.  Move on to the BUSH parser.

Create an Ada package  to contain your new BUSH built-in package.  You can copy one of the existing packages and edit it.  Each package consists of two files: a package specification and a package body.  For now, create a package specification.  For example, for Ada.Calendar you could create a file called "parser_cal.ads".

The package specification should contain a series of "Parse" procedures.  These will be called by BUSH when it needs to run subprograms in the built-in package.  If a parse procedure defines a function, it should have one out unbounded_string parameter (to return the result of the function).  Procedures, which return nothing, have no parameters.  For the Ada.Calendar package, you might have:

procedure ParseCalClock( result : out unbounded_string );  -- Ada.Calendar.Clock function
procedure ParseCalYear( result : out unbounded_string );  -- Ada.Calendar.Year function
procedure ParseCalMonth( result : out unbounded_string ); -- Ada.Calendar.Month function
procedure ParseCalSplit; -- Ada.Calendar.Split procedure

Follow each of these procedures with a stub pragma:

procedure ParseCalClock( result : out unbounded_string ); -- Ada.Calendar.Clock function
pragma import( stubbed, ParseCalClock );

This GCC Ada pragma indicates that ParseCalClock has not been completed and it will raise a PROGRAM_ERROR exception if the procedure is called.  When you complete the ParseCalClock procedure, remove the stub pragma.

Compile the package specification with gcc -c to make sure there are no obvious errors.

Now tie your package specification into the parser.  Edit the parser.adb file and add the name of your package with the "with" and "use" statements at the top of the file.

Get BUSH to take action when it sees a procedure or function.  (This is the purpose of those identifier variables you declared earlier.)  BUSH checks for built-in package procedures in ParseGeneralStatement.  It checks for built-in package functions in ParseFactor.

For an Ada.Calendar package, add the check for the Ada.Calendar.Split procedure in ParseGeneralStatement:

elsif token = cal_split_t then -- are we looking at "calendar.split"?
   ParseCalSplit;             -- then process a calendar.split call

Add functions like Ada.Calendar.Clock to ParseFactor.  The parameter is always "f" (the value of the "factor"), and the variable "kind" must be set to the identifier variable for the type of result (e.g. an integer result has a kind of integer_t).

elsif token = cal_clock_t then -- are we looking at "calendar.clock"?
  ParseCalClock( f );
  kind := cal_time_t;          -- calendar.clock returns a calendar.time

Recompile BUSH to check for errors.  Try using the procedures and functions.  Each should raise a PROGRAM_ERROR exception but should have no other errors.

The only thing left to do is to check the parameters to the subprograms and execute them.  Create a package body file and begin implementing the Parse procedures one at a time.  As you implement each, remove the corresponding stub pragma from the specification file.

The variable "token" represents the current item in the source file.  To move to the next item in the source file, use the expect procedure.  Typically, you are only looking for an identifier or a punctuation mark.  For example,

   expect( cal_clock_t ); -- expect the identifier "calendar.clock"
   expect( symbol_t, "(" ); -- expect the punctuation mark "("

The parser has some parsing procedures that automatically process and report errors.  An important one is:

   ParseExpression( val, kind ); -- interpret any kind of expression.  Return the value and the type.

Using expect and ParseExpression you can read through the parameters for most functions.  For example, Ada.Calendar.Year has one parameter:

  -- in the declarations of the Parse procedure
  year_value : unbounded_string;
  year_type : identifier;
  -- in the body of the Parse procedure
  expect( cal_year_t );
  expect( symbol_t, "(" );
  ParseExpression( year_value, year_type );
  expect( symbol_t, ")" );

Don't check for a semi-colon.  BUSH will do that later.

Now add the type checks.

The scanner has several functions to check the type of an identifier.  The main one is baseTypesOK.  This compares two type identifiers and verifies they are compatible with one another.  You don't have to report the error: baseTypesOK will do this for you.  baseTypesOK also handles derived types and subtypes.

expect( cal_year_t );
expect( symbol_t, "(" );
ParseExpression( year_value, year_type );
if baseTypesOK( year_type, cal_time_t ) then  -- year should be a calendar.time type or compatible
   null;                                      -- do nothing special if type is OK
end if;
expect( symbol_t, ")" );

Recompile BUSH again and check your work.  Although calendar.year does nothing yet, BUSH should understand the parameters.  Using an integer or a character parameter instead of a calendar.time parameter should cause an error.  Leaving out a "(" or ")" should also cause an error.  Check your definition of types using the env command:

=> env calendar.time
calendar.time = ( private type )
=> env calendar.year
calendar.year = ( identifier of type keyword )

BUSH has no formal representation for functions and procedures at this time.  They will be reported as keywords.

Finally, you need to actually execute the subprogram.  Before you execute anything, check to see if you should execute the function with the BUSH isExecutingCommand function.  If BUSH is doing a syntax check of a script, or if an error was previously encountered, isExecutingCommand will be false.

It is also a good idea to wrap the function or procedure you are calling in an Ada declare block to catch and report any exceptions.  Otherwise, BUSH will crash because of the exception.
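A minimal sketch of this pattern in plain Ada follows.  It does not use any BUSH code: the procedure name guard_sketch is made up for illustration, and Ada.Calendar.Time_Of with an impossible date stands in for a built-in call that fails.  Note that the call which might fail is placed in the statement part of the block, because a handler does not catch exceptions raised while elaborating the block's own declarations.

```ada
with Ada.Text_IO;
with Ada.Calendar;

procedure guard_sketch is
begin
   declare
      t : Ada.Calendar.Time;
   begin
      -- February 30th is not a proper date, so Time_Of raises
      -- Ada.Calendar.Time_Error here.
      t := Ada.Calendar.Time_Of( 2000, 2, 30 );
      Ada.Text_IO.Put_Line( Integer'Image( Ada.Calendar.Year( t ) ) );
   exception when others =>
      -- Report the problem instead of letting it propagate.
      Ada.Text_IO.Put_Line( "exception raised" );
   end;
   -- Execution continues here whether or not the block failed.
   Ada.Text_IO.Put_Line( "still running" );
end guard_sketch;
```

In a Parse procedure, the handler would call the BUSH err procedure rather than Put_Line, so that the failure is reported as a script error.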

All the parameters are unbounded strings and will have to be converted to the appropriate Ada type needed for the parameters.  In the case of Ada.Calendar.Year,

if isExecutingCommand then
   begin   -- a block is needed so the exception handler has something to attach to
      result := to_unbounded_string( year( time( to_numeric( year_value ) ) )'img );
   exception when others =>
      err( "exception raised" );
   end;
end if;

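An explanation of the conversions, reading the expression from the inside out:

to_numeric converts the unbounded string year_value to a number

time converts that number to a time value

year returns the year for that time value

'img converts the year to a string

to_unbounded_string converts the string to the unbounded string returned in result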
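The same string-to-value-and-back round trip can be tried outside of BUSH with only the standard library.  This is a sketch under assumptions: long_float'value stands in for the BUSH to_numeric function, the procedure name convert_sketch is invented, and adding one to the number stands in for whatever the built-in subprogram would really compute.

```ada
with Ada.Text_IO;
with Ada.Strings.Unbounded; use Ada.Strings.Unbounded;

procedure convert_sketch is
   -- The parameter arrives as an unbounded string.
   param  : constant unbounded_string := to_unbounded_string( "1999" );
   result : unbounded_string;
   n      : long_float;
begin
   -- unbounded string -> string -> number
   n := long_float'value( to_string( param ) );
   -- number -> integer -> string -> unbounded string.  'image
   -- puts a leading blank in front of non-negative numbers.
   result := to_unbounded_string( integer'image( integer( n ) + 1 ) );
   Ada.Text_IO.Put_Line( to_string( result ) );
end convert_sketch;
```

If the string does not hold a valid number, 'value raises Constraint_Error, which is one reason to wrap the conversions in a block with an exception handler.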

Rebuild BUSH and test the function to make sure it works.  You've completed the implementation of "calendar.year".  Complete and test the rest of the bindings.

The actual ParseCalYear procedure is in the parser_cal.adb file.  The only difference between the real calendar package and what you did here is that Ada.Calendar.Time is a private type, so BUSH implements its own calendar package using a normal type so that time values can be converted to strings.

End of Document