Created by Daniel Hellerstein,  Last updated June 2006.

                The GRBL2 Batch Procedures. 

                -------------------

Contents:

1. Introduction
2. Loading the GRBL2 Batch Procedures
3. Calling G_BATCH
3a. Globals set by G_BATCH
3b. Structure of GRBL2 command files (that are used by G_BATCH)
4a. G_BATCH keywords
4b. Detailed description of GRBL2 command-file keywords
5. Using a G_BATCH2 procedure to specify additional keywords
6. The G_MKDATA procedure
6a. The G_MKXUSE procedure

                -------------------

1. Introduction 

The G_BATCH procedure allows you to implement a user interface to your GAUSS 
programs.  Specifically, G_BATCH allows you to process "GRBL2 command files".
GRBL2 command files can be used to specify which dataset 
to use, what variables to include, and a variety of other modeling options. 


GRBL2 command files consist of statements, with each statement ending with 
a semi-colon (;).

Each statement has the form:
     keyword [val1 ... valn] ;
where:
     keyword: a generic, or model specific, option or parameter
     val1 ... valn : (optional) values specific to the keyword


Note that in addition to model specification, GRBL2 command files support a simple
internal logic that permits some programming. For example, you can specify 
"subroutines" that can be called several times using different arguments.
Or, you can include IF and GOTO statements to control what GRBL2 commands
are processed.

                -------------------

2. Loading the GRBL2 Batch Procedures 

To use the GRBL2 batch procedures, include at the top of your main program file 
(that implements your model):
           #include grbl2_procs.inc;
           #include grbl2_file.inc ;
           #include grbl2_batch.inc ;

These files should be either in the current directory, or in GAUSS's path. 


Notes:
   *  The first two .INC files contain procedures used by GRBL2_BATCH. See GRBL2_PROCS.TXT for a 
      description of their contents.
      Be sure to use the above order -- grbl2_procs before grbl2_file before grbl2_batch!
   
   * Depending on your model, you may also #include grbl2_mle.inc and grbl2_math.inc

   *  you can run the GRBL2_INI.PRG program when you first start up GAUSS. This will
      change the GAUSS path to include directories you specify.

The G_BATCH procedure does the work. G_BATCH requires a text string that contains a GRBL2 
command file. Typically, your program will:

   1) ask the user for the name of a parameters file (a GRBL2 command file)
   2) read this file
   3) send the contents of this file to G_BATCH
   
G_BATCH sets a number of global scalars and matrices. These include a variety of 
generic values (such as the dataset name).

Furthermore, it is straightforward to extend G_BATCH -- so that it will also
read/parse options specific to your model.

                -------------------

3. Calling G_BATCH

G_BATCH takes 3 arguments, and returns two values. It can also set a number
of globals:

     {new_string,status}=G_BATCH(text_string,reset,use_custom_proc)

where:
        text_string = a text string containing the contents of a 
                      GRBL2 command file
        reset = if 1, then reset the generic global parameters
        use_custom_proc = if 1, then call G_BATCH2 if an unknown (to G_BATCH)
                          keyword is encountered.
                          Or, a 2-column matrix of "start" and "end" block keywords (see below)

        new_string = the portion of text_string that was not processed.
                     If status=0, this will always be empty (since 0 means "text_string
                     was completely processed").
                     If status=1, this will contain the portion of text_string following
                     a RUN command.
                     Note that text_string can contain several RUN commands.
                     Thus, by iteratively calling G_BATCH, and setting
                     text_string=new_string, you can successively run several
                     models.

                      Or, a 3 row string vector (if a block is returned).

        status = if 0, then all commands in text_string were processed, or
                       a "STOP" command was encountered.
                 if 1, then a "RUN" command was encountered.
                 if 3, then a "block" is returned (see "Using blocks").
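
        The iterative use of new_string and status described above can be sketched as
        follows. This is an illustration written in Python, not GAUSS code: toy_batch is
        a hypothetical stand-in that only recognizes RUN (it ignores STOP, blocks, and
        every other keyword), so that the shape of the calling loop is visible.

```python
def toy_batch(text):
    """Hypothetical stand-in for G_BATCH: consume statements up to the first RUN.

    Returns (new_string, status): status is 1 if a RUN was found (new_string is
    the text following it), or 0 if the whole text was consumed.
    """
    statements = text.split(";")
    for i, stmt in enumerate(statements):
        if stmt.strip().upper() == "RUN":
            return ";".join(statements[i + 1:]), 1
    return "", 0


def run_all(text):
    """Call toy_batch repeatedly, counting how many models would be run."""
    models_run = 0
    while True:
        text, status = toy_batch(text)   # text_string = new_string
        if status == 0:
            break
        models_run += 1                  # the calling program would run a model here
    return models_run


commands = "FILE d1 ; Y income ; RUN ; Y wages ; RUN ; TITLE done ;"
print(run_all(commands))
```

        In this sketch the two RUN commands cause two passes through the loop body; the
        trailing TITLE statement is consumed by the final call, which returns status 0.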


Using blocks:
  For added flexibility, you can tell G_BATCH to return "blocks" of commands that will be handled
  by the calling program. This is used, for example, to process "CREATE" ... "RUN" blocks.

  To specify blocks, the use_custom_proc argument should be a 2-column matrix of "instructions":
     The first column contains a "start keyword".
     The second column contains an "end keyword".
  Whenever a keyword matches a "start keyword", all content between (and including) the
  keyphrase containing this keyword and the "end keyword" (specified in column 2) is returned.

  If you also want G_BATCH2 to be called, include a row that has "1" in both the first and second columns.


  If a block is found, then
    i) The returned status is set to 3
   ii) NEW_STRING will be a 3 row string vector:
      row 1 : The string containing the REMAINING commands to be executed.
      row 2:  The block returned (including start and end keyphrases)
      row 3:  The keyword (basically, the first word in the block)


  Example:
   {stuff,iact}=g_batch(cmd_string,0,("CREATE"~"RUN")|("1"~"1"));
   will return a "create file" block if CREATE ; .... ; RUN ; is encountered (which can then be fed to
   GRBL2_CREATE).  Note the use of the "1"~"1" to signal "call g_batch2 if unknown keyword encountered".


Notes:
    *  the G_ZAPCMT procedure can be used to remove comments from the
       text_string. You should call G_ZAPCMT before calling G_BATCH.

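    * Example of usage:
        pfile="my_params.in";
        istuf=getf(pfile,0);               /* read the command file into a string */
        istuf=g_zapcmt(istuf,"@");         /* strip comments before parsing */
        {stuf2,astat}=g_batch(istuf,1,1);  /* reset globals; call G_BATCH2 on unknown keywords */
        "Parameter file processed. ";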

                -------------------

3a. Globals set by G_BATCH

The following global variables (they are CLEARed when GRBL2_BATCH.INC is #included) can be
set by commands in a GRBL2 command file. You can use these global variables directly.
In addition, many of them are used by the various GRBL2 procedures (such as G_MKDATA and G_MKXUSE).

   bstart:    vector of starting values (primary)
   bstart1:   vector of starting values (first)
   bstart2:   vector of starting values (second)
   baux:      vector of starting values (say, for AUX variables)
   baux1:      vector of starting values (say, for AUX1 variables)
   baux2:      vector of starting values (say, for AUX2 variables)

   defines,definew: strings used for replacements

   filename:  the name of a GAUSS dataset
   fixname:   used to rename variables

   i_svlist:   number of SAVEVARS global matrices to save to .FMT files

   IDindex,IDname:     an "ID" variable
   ID1index,ID1name:   an "ID1" variable
   ID2index,ID2name:   an "ID2" variable
   ID3index,ID3name:   an "ID3" variable

   ndefines:      number of "DEFINES"  replacement strings

   outfile:   name to use for an output file
   prntit:    controls how much intermediate output is written

   savevars,savefils: global matrices to save to .FMT files
   slctlist: a matrix of "observation selection criteria". Each row is a criterion,
        with 4 columns of info.  slctlist is used by g_vuslct and g_slctid

   title:     a title

   varlist:   the variables in filename
   verbose:   controls extent of status reporting

   WTindex,WTname,WTnorm,WTrep:  a "WEIGHT" variable
   WT1index,WT1name,WT1norm,WT1rep:  a "WEIGHT1" variable
   WT2index,WT2name,WT2norm,WT2rep:  a "WEIGHT2" variable


   Xindex,Xname,nx:  a vector of "X" variables
   X1index,X1name,nx1:  a vector of "X1" variables
   X2index,X2name,nx2:  a vector of "X2" variables
   X3index,X3name,nx3:  a vector of "X3" variables

   XDUMMY,nXDUMMY : a matrix of DUMMY variables (defined using the DUMMY GRBL2 command)
   XNEW,nXNEW    : a matrix of XNEW variables (defined using the XNEW GRBL2 command)

   Yindex,Yname:       a "Y" variable
   Y1index,Y1name:     a "Y1" variable
   Y2index,Y2name:     a "Y2" variable
   Y3index,Y3name:     a "Y3" variable

   Zindex,Zname,nz:     a vector of "Z" variables
   Z1index,Z1name,nz1:  a vector of "Z1" variables
   Z2index,Z2name,nz2:  a vector of "Z2" variables
   Z3index,Z3name,nz3:  a vector of "Z3" variables

   Aindex,auxname,naux:     a vector of "Auxiliary" variables
   A1index,aux1name,naux1:  a vector of "AUX1" variables
   A2index,aux2name,naux2:  a vector of "AUX2" variables

  


Note: the reset argument to G_BATCH, if set, will CLEARG these matrices.
      Also, the matrices from IDindex to Z3 are cleared whenever a new FILE
      is opened.

3b. Structure of GRBL2 command files (that are used by G_BATCH)

GRBL2 command files are simple text files. 
Each GRBL2 command has the following structure:
    keyword options ; 
Note that a semi-colon (a ;) completes the command. Thus, you can have multiple commands
on a single line, or one command can span multiple lines.

Similar to GAUSS, the @ key can be used for comments.
For example:
    FILE mydata ;
    @ the next lines define the required variables @
    X education age height ;
    Y income ;
    run ;
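
The structure described above (comments between @ characters, statements ended by
semi-colons, free use of line breaks) can be illustrated with a small sketch. It is
written in Python purely for illustration and is not how G_BATCH or G_ZAPCMT are
implemented; split_statements is a hypothetical helper, and it assumes that comments
always come as a matched pair of @ characters and that no value contains a quoted
semi-colon or space.

```python
import re


def split_statements(text):
    """Split GRBL2-style command text into (keyword, values) pairs."""
    # Assumption: a comment is everything between a pair of @ characters.
    no_comments = re.sub(r"@[^@]*@", " ", text)
    statements = []
    for chunk in no_comments.split(";"):
        words = chunk.split()      # collapses line breaks and repeated spaces
        if words:                  # skip empty statements
            # Keywords are case-insensitive, so normalize them to upper case.
            statements.append((words[0].upper(), words[1:]))
    return statements


example = """
FILE mydata ;
@ the next lines define the required variables @
X education age height ;
Y income ;
run ;
"""
for keyword, values in split_statements(example):
    print(keyword, values)
```

Because only the semi-colon ends a statement, the same result is obtained when all
four commands are written on a single line.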

4a. G_BATCH keywords

The following keywords are understood by G_BATCH. Note that you can 
add more keywords by writing a procedure (named G_BATCH2), and telling
G_BATCH to call G_BATCH2 if an unknown keyword is encountered.


Summary of GRBL2 command-file keywords:

 BSTART     : Assign a vector of numeric values to the BSTART global vector
 BSTART1    : Assign a vector of numeric values to the BSTART1 global vector
 BSTART2    : Assign a vector of numeric values to the BSTART2 global vector

 BAUX       : Assign a vector of numeric values to the BAUX global vector
 BAUX1      : Assign a vector of numeric values to the BAUX1 global vector
 BAUX2      : Assign a vector of numeric values to the BAUX2 global vector


 COMMENT    : A text string written to the screen

 DELETE     : Delete a file

 DUMMY      : Define sets of dummy variables
 DEFINE     : Define a GRBL2 variable (used as a shorthand for longer strings)
 ENDBLOCK   : End of a "gosub callable" routine

 GOTO       : Skip to a label in the batch file  
 GOSUB      : Replace with labeled block of code.
 FILE       : Input dataset
 FIXNAMES   : Transform names of displayed variables

 IF and #IF : Conditional inclusion of command (#IF for numeric comparisons)
 ID         : Specify an ID variable
 ID1        : Specify an ID1 variable
 ID2        : Specify an ID2 variable
 ID3        : Specify an ID3 variable

 INFO       : Info on currently selected dataset
 INPUT      : Input dataset (synonym for FILE)

 OUTPUT     : Name of output file.
 PAUSE      : Pause for a few seconds, with optional input
 QUERY      : Ask the user for input (assign this input to a GRBL2 variable)

 RESET      : Reset (to defaults) the global variables (set in G_BATCH)
 RUN        : Return to calling program, with status=1

 SAVE       : Save matrices to .FMT files.
 SELECT     : Select which observations to retain
 SET        : Set values of global variables.
 STOP       : Stop the program.
 STARTBLOCK : Start of a "gosub callable" routine

 TITLE      : Title to display in output file

 VERBOSE    : Status message verbosity.
 VIEW       : View a text file using a simple-but-navigable viewer.

 WEIGHT       : Define a weight variable
 WEIGHT1      : Define a 1st weight variable
 WEIGHT2      : Define a 2nd weight variable

 WORK_DIR    : Define a working directory

 X          : Define a list of "X" variables.
 X1         : Define a list of "X1" variables.
 X2         : Define a list of "X2" variables.
 X3         : Define a list of "X3" variables.

 XNEW       : Define new variables as arithmetic combinations of two variables in dataset

 Y          : Define a "Y" variable.
 Y1         : Define a "Y1" variable.
 Y2         : Define a "Y2" variable.
 Y3         : Define a "Y3" variable.

 Z          : Define a list of "Z" variables.
 Z1         : Define a list of "Z1" variables.
 Z2         : Define a list of "Z2" variables.
 Z3         : Define a list of "Z3" variables.


 AUX        : Define a list of "Auxiliary" variables.
 AUX1       : Define a list of "AUX1" variables.
 AUX2       : Define a list of "AUX2" variables.


Note that each keyword of the form An (A=X, Y, Z, AUX, or ID; and n='', 1, 2, or 3) creates its own set of global matrices.
Similarly, WEIGHT, WEIGHT1, and WEIGHT2 create different global matrices.

4b. Detailed description of GRBL2 command-file keywords:

Note.   Unless otherwise specified:
        all keywords, and their options, are case INsensitive.



* a_comment:
     Same as COMMENT, but without replacing GRBL2 substitution variables.

! a_comment:
     Same as COMMENT, but only writes to the screen.


AUX: 
  Define a list of "Auxiliary" variables. The AUXNAME, AINDEX, and NAUX global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.


AUX1: 
  Define a list of "Auxiliary" variables. The AUX1NAME, A1INDEX, and NAUX1 global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.

AUX2: 
  Define a list of "Auxiliary" variables. The AUX2NAME, A2INDEX, and NAUX2 global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.

BSTART: Define a vector of starting values.

   Define a vector of starting values. The results are stored in the 
   BSTART global matrix.

   Syntax:
        BSTART yes/no  val1 ... valk ;
    Where yes/no is either YES or NO.
    If YES: the values of val1 to valk are stored in BSTART.
    If NO : BSTART is set to missing

    Examples:
         BSTART NO ;
         BSTART YES 12.5 661 -3 ;


BSTART1: Define a vector of starting values.

   Define a vector of starting values. The results are stored in the 
   BSTART1 global matrix.

   The syntax is the same as BSTART


BSTART2: Define a vector of starting values.

   Define a vector of starting values. The results are stored in the 
   BSTART2 global matrix.

   The syntax is the same as BSTART


BAUX: Define a vector of "auxiliary" values.

   Define a vector of auxiliary values. The results are stored in the 
   BAUX global matrix.

   The syntax is the same as BSTART

BAUX1: Define a vector of "auxiliary" values.

   Define a vector of auxiliary values. The results are stored in the 
   BAUX1 global matrix.

   The syntax is the same as BSTART

BAUX2: Define a vector of "auxiliary" values.

   Define a vector of auxiliary values. The results are stored in the 
   BAUX2 global matrix.

   The syntax is the same as BSTART

COMMENT  a_comment:
    A_comment is a text string. It will be written to your screen, and
    to the output file.

    A comment can span many lines.

    Example: COMMENT ===== Is there life after grad school? ;

    Note that GRBL2 substitution-variables can be used in a_comment (see DEFINE 
    for the details)


DELETE:  Delete a  file
    Delete is used to delete a file on your hard drive. This can be useful if you create temporary
    files.
  
    Syntax:  DELETE filename ;

    If no extension is given, a .DAT extension is assumed.



DEFINE:  Define a GRBL2 variable.
    DEFINE is used to define strings that can be used elsewhere in
    your command file. These "GRBL2 variables" can be referenced
    by using a $NAME syntax -- whenever a $NAME is found, it is replaced
    by the DEFINEd value associated with NAME.

    Syntax:
         DEFINE  varname  the string to use ;

    Example:
         DEFINE BaseTitle The year 2000 bug runs: ;
         DEFINE TCost Travel cost related regressions ;
         DEFINE DefModel MBL ;
         DEFINE Records  The $STATE datum ;

    Examples of use of $variables:
          Model $DEFMODEL ;
          If $USEFILE eq FOOBAR then comment Using FOOBAR ! ;
          TITLE  $basetitle Best case scenarios ;

     Notes:
        * QUERY offers another means of defining "GRBL2 variables"

        * Do NOT put a $ before the varname in DEFINE!

        *  You CAN place previously defined $VARIABLES in the
          "string to use" portion. For example: DEFINE Records The $STATE datum ;

        * There are several predefined $vars:
          $FILENAME is the current input file.
          $VARLIST yields a space delimited list of variables (in the current input file)
          $DATE is the current date/time, in 8/25/2005 13:40:44 format.
          $PROGRAM is the name of the GRBL2 program currently running

       * The following $words are reserved (for use with GOTO commands) -- do NOT use them in DEFINE or QUERY!
           $RUN, $NEXT, and $PAST

      *  GRBL2 variables are stored in the DEFINES and DEFINEW global
         matrices.

      *  DEFINE -DROP will remove all current GRBL2-variable definitions (except for
         $DATE, $PROGRAM, $FILENAME and $VARLIST).
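
      The $NAME replacement described in this section can be illustrated with the
      following sketch. It is written in Python for illustration only and is not the
      GRBL2 implementation; substitute is a hypothetical helper. Matching is treated
      as case-insensitive because the examples above use $DEFMODEL and $basetitle for
      DefModel and BaseTitle. Leaving an undefined $NAME unchanged is an assumption.

```python
import re


def substitute(text, defines):
    """Replace each $NAME in text with its DEFINEd value (case-insensitive)."""
    lookup = {name.upper(): value for name, value in defines.items()}

    def replace(match):
        name = match.group(1).upper()
        # Assumption: a $NAME with no definition is left as it is.
        return lookup.get(name, match.group(0))

    return re.sub(r"\$(\w+)", replace, text)


defines = {"BaseTitle": "The year 2000 bug runs:", "DefModel": "MBL"}
print(substitute("TITLE  $basetitle Best case scenarios ;", defines))
print(substitute("Model $DEFMODEL ;", defines))
```

      The second call prints "Model MBL ;", which is the statement that would then be
      interpreted.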


DUMMY: define sets of dummy variables

  Create vectors of 0/1 (or 0/value) dummies, using an "ordinal variable" to define the levels,
  and (optionally) a cardinal variable to define the "values".

  Syntax:

    DUMMY ORDINAL_VAR CARDINAL_VAR_OR_1 -DROP_or_-KEEP  -BASE=ABCDE ~  def2 ...  , ;

    Where:

   ORDINAL_VAR : an "ordinal variable", that contains a limited set of values.
             A separate dummy variable will be created for each possible value.

   CARDINAL_VAR : instead of using "1" for the (single) non-zero dummy value, you
         can use this observation's value of cardinal_var. Cardinal_var
         can be one of the variables in the dataset, OR it can be
              one of the XNEW variables (but you MUST define XNEW before DUMMY)

   -DROP   : Optional. If -DROP, then drop the first level
             If -KEEP, then keep the first level
            -KEEP is the default

   -BASE=bname : Optional. Base name for dummies. Must be no more than 5 characters.
            It's a good idea to end with underscore ("_").
            If not specified, "Dnn_" is used, where nn starts with "01" (for the
            first dummy specified), etc.

       ~ (or | ) are used to separate dummy specifications 




    Examples: DUMMY REGION 1 -DROP  ; if 5 different regions, makes a vector of four 0/1 dummies (first level is dropped)
              DUMMY RACE INCOME -KEEP ;  if 4 races, makes a 4x1 vector of zeros, with one of the zeros replaced by INCOME

   Notes
    * If you are using "1" as the cardinal variable, and you are including a constant, we highly recommend
      specifying -DROP.

    *   You can specify several dummies. For example:
       DUMMY REGION , 1 -DROP  ~
           RACE , INCOME -KEEP -BASE=INC_ ;

         (note that the commas in the above example are for readability; they are ignored by GRBL2)

    *  To display a description of the dummies, you can use G_VUDUM.
       For example:     call g_Vudum(dinfo,filename,0,1) ;

    *   To NOT have dummies, you can use:   DUMMY NO ;

    *  Example: to use log of XINCOME in a dummy on RACE :
          Xnew LNINC = xincome ( ln ;
           dummy RACE LNINC -DROP ;

    *   Global DINFO matrix:

       The final result is the DINFO global matrix.
       If no dummies are specified, DINFO will be missing, or equal to 0.
       Otherwise, each row in DINFO contains instructions for creating a set of dummy variables.

       The structure of a row is:
          OINDEX~CINDEX~BASENAME~NLEVELS~Level1~...~LevelN
       If no cardinal variable was specified (0/1 dummies are desired), then CINDEX will be 0.

       If more than one DUMMY set is being created, the number of useful columns in each row
       will not be the same (zeros are used to pad each row to the length
       of the longest row, where the longest row is associated with
       the ordinal variable with the most levels).


   Examples:
        DUMMY REGION 1 -DROP  ; if 5 different regions, makes a vector of four 0/1 dummies (first region has no dummy)
        DUMMY RACE INCOME -KEEP ;  if 4 races, makes a 4x1 vector of zeros, with one of the zeros replaced by INCOME
        DUMMY NO  ;

       If
         *  VERSION has 4 distinct values: 1, 2, 3, and 4
         *  The variables you wish to "expand" are a 0/1 dummy, and AGE.
       then:
          DUMMY VERSION 1 -DROP ~  VERSION AGE -KEEP ;
       will yield two sets of dummies.
       For example:
            For observations    /\ the following variables are
                with values:    /\ created
            VERSION and AGE     ||
                   4  35        ||  0 0 1    0   0   0 35
                   2  50        ||  1 0 0    0  50   0  0
                   1  12        ||  0 0 0   12   0   0  0
            where the first three columns correspond to the 0/1 dummy (note the effect
            of -DROP when VERSION=1), and the last four correspond to AGE
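
       The table above can be reproduced with a short sketch. It is written in Python
       for illustration only and is not the GRBL2 implementation; expand_dummy is a
       hypothetical helper, and it assumes the levels are taken in ascending order so
       that the "first level" dropped by -DROP is the smallest value.

```python
def expand_dummy(levels, level, value=1, drop_first=False):
    """Return one observation's dummy columns for an ordinal variable.

    levels     -- distinct values of the ordinal variable, in ascending order
    level      -- this observation's value of the ordinal variable
    value      -- 1 for 0/1 dummies, or the observation's cardinal value
    drop_first -- True mimics -DROP (no column for the first level)
    """
    used = levels[1:] if drop_first else levels
    return [value if lv == level else 0 for lv in used]


levels = [1, 2, 3, 4]                    # the four distinct values of VERSION
for version, age in [(4, 35), (2, 50), (1, 12)]:
    # DUMMY VERSION 1 -DROP ~ VERSION AGE -KEEP
    row = (expand_dummy(levels, version, 1, drop_first=True)
           + expand_dummy(levels, version, age, drop_first=False))
    print(version, age, row)
```

       For VERSION=1 the three 0/1 columns are all zero, because the column for the
       first level was dropped, while AGE still appears in the first of its four columns.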

ENDBLOCK:
    Signals end of a STARTBLOCK "GOSUB callable" routine.
    Syntax:
      ENDBLOCK ;

    (ENDBLOCK does not take any arguments)



FILE: Input dataset
   Name of a GAUSS data file. Do NOT include the .DAT (or .DHT)
   extension. You can include path information; if you don't, the current
   directory is used.

   Syntax:  File a_filename ;

   The FILENAME global will contain the value of a_filename.
   The VARLIST global matrix will contain a list of the variables in a_filename.

   Example: FILE recnat4s ;

   Note: to include spaces in your filename, use quotes.
   For example:
      FILE  "my file" ;


FIXNAMES: Transform names of displayed variables

     FIXNAMES should be called with one, or many, "oldname newname" pairs.
     These pairs can be spread over several lines.

     FIXNAMES stores results to the FIXNAMES global matrix, which is used by the
     G_FIXNAM procedure.

     Note that the oldname(s) and newname(s) must be 8 characters or less.

     FIXNAMES is meant to be used to replace the parameter names displayed in
     your output file. For example, it can be used to replace automatically generated
     names (such as the D02_001 and D02_003 names in the example that follows).

     Example:
        fixnames
            D02_001  alt_1
            D02_003  last_alt ;


GOTO: Skip to a label

  Skip to a label in the program. Labels can be any length, but
  must end with a colon.

  Example:
        GOTO FOO1 ;
         ....
        FOO0: some stuff ;
        FOO1: other stuff ;

  GOTO is often used with IF to allow conditional execution of models.

  Note that labels are case-insensitive.

  Special options:
     You can skip TO or PAST the next RUN command by using $RUN or $NEXT respectively.
     Thus:
       * GOTO $RUN ;
          means ...
         "skip to the next  RUN command, ignoring all commands in between the present location and this
         next RUN command, and then run the model."
       * GOTO $NEXT ;
          means ...
         "skip past the next  RUN command, ignoring all commands in between the present location and this
         next RUN command, and do NOT run the model."
    

GOSUB:  Replace with labeled block of code.

   GOSUB is used to insert a "block of code" at the current location.
   These "blocks of code" are defined using
   the STARTBLOCK ... ENDBLOCK options.

   You can specify "arguments" in a GOSUB; all %n strings in the
   STARTBLOCK ...  ENDBLOCK block will be replaced by these arguments.

   Examples:
        GOSUB BLOCK1 ;
        GOSUB MyWay2  X1 X2 , samples\test1 ;

   Notes:
      *  Arguments are separated by commas.
      *  When evaluated, spaces and ' are stripped from ends of
         each argument
      *  You can have up to 50 arguments

      * If you use GOSUB ...
           You MUST call G_GOSUB0 before calling G_BATCH.
           If G_BATCH finds a GOSUB, STARTBLOCK, or ENDBLOCK, an error message
           will be displayed and processing will stop!
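
   The %n argument replacement can be illustrated with the following sketch. It is
   written in Python for illustration only and is not how G_GOSUB0 is implemented;
   expand_block and the contents of the block string are hypothetical. Arguments are
   split on commas, spaces and ' are stripped from the ends of each one, and a %n with
   no matching argument becomes an empty string (the behaviour described in the notes
   on IF).

```python
import re


def expand_block(block_text, arg_string):
    """Replace %1, %2, ... in block_text with the comma-separated arguments."""
    # Spaces and single quotes are stripped from the ends of each argument.
    if arg_string.strip():
        args = [a.strip(" '") for a in arg_string.split(",")]
    else:
        args = []

    def replace(match):
        n = int(match.group(1))
        # A %n with no corresponding argument becomes an empty string.
        return args[n - 1] if 1 <= n <= len(args) else ""

    return re.sub(r"%(\d+)", replace, block_text)


# Hypothetical body of a STARTBLOCK ... ENDBLOCK block.
block = 'X %1 ; FILE %2 ; IF "TESTVAL" eq "%3" then comment hi ;'
print(expand_block(block, r"X1 X2 , samples\test1"))
```

   Matching the digits with a single pattern means that %10 is treated as the tenth
   argument rather than as %1 followed by a 0.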


IF, #IF, and EXISTS:     Conditional inclusion of command

   IF and #IF compare two terms; if the comparison is true, then either interpret a
   command, or go to a label.

   IF will do string comparisons.
      The two terms can be strings (with no spaces), or can be
      $variables (created with DEFINE or QUERY).

   #IF will do numeric comparisons -- both sides must be
   numeric values (after converting $variables).

   EXISTS checks for a file name -- if the file exists, then either interpret a
   command, or go to a label.

      The filename can be a string, or can be a $variable (created with DEFINE or QUERY).

   For IF and #IF, six comparison types are allowed: EQ NE GT GE LT and LE.

   Examples:
         if 1 eq 2 goto  noway ;
         if $ONE eq $TWO then comment Var One is equal to Var Two ;
         if $OPT gt 1 goto step2 ;
         #if $MININC gt 10000 then select or rich eq 1 ;
         EXISTS NEW_DATA1 goto step2 ;
    

   Notes:
     * You can NOT have a GOSUB xxx following a then.  That is,
         IF $VAR1 eq 2 then GOSUB proc2 ;
       is NOT allowed (use GOTO's instead).
     * Before comparison, spaces and ' are stripped from each
       term.
     * When IF is used in a STARTBLOCK ... ENDBLOCK block, a nice trick is
       to use:
           IF "TESTVAL" eq "%2" then xxxx ;
       If there is no 2nd argument, this yields:
           If TESTVAL eq "" then xxxx
       which is false, so xxxx is not attempted.  However, if this
       is used in a STARTBLOCK ... ENDBLOCK block that is called using:
            GOSUB myblock  v1 , TESTVAL ;
       then the %2 argument will be "TESTVAL", and a match will occur
       (note that spaces are stripped from the arguments listed in a GOSUB).

     * IF_EXISTS is a synonym for EXISTS.

ID:
   Define an identifier variable. The IDNAME, IDINDEX, and IDTYPE global 1x1 scalars are set.

   ID is often used to create "panels" (subsets of observations; such as multiple years
   of observation for one individual).

  Syntax:
       ID  varname -type ;
   where:
         varname is one of the variables contained in the current datafile
        -type is optional.
   or
      ID n   ;
   where:
      n is an integer > 0

   -type can be:
      -ID  : the variable identifies an ID that is unique (this is the default)
      -COUNT : the variable identifies the number of rows grouped together

  For the first method (ID varname -type):
    IDNAME is the VARNAME,
    IDINDEX points to the row of VARLIST (and the column of the dataset) corresponding to varname.
    IDTYPE is either "ID" or "COUNT".

    Exactly how the variable is used depends on the program. However, typically:
      ID) The rows belonging to a panel are contiguous (in adjacent rows of the dataset),
          The ID Var contains a panel-specific, unique value.
     COUNT) The rows belonging to a panel are contiguous (in adjacent rows of the dataset),
            The ID Var contains the count of how many rows are in this panel.

          Each panel has a unique value for the ID variable.

  For the second method (ID n):
    This method is used to specify "balanced" panels, with each panel containing "n" rows.
     IDNAME="#"
     IDINDEX=n
     IDTYPE="BAL"


  Example: in the following example...
       ID ID_ID -ID and ID ID_CT -COUNT would yield the same panels:
       Panel 1: rows 1,2,3
       Panel 2: rows 4,5,6
       Panel 3: rows 7,8,9,10
       Panel 4: rows 11,12
  Note that typically, the value of ID_CT from the first row in a panel is used -- the others
  are ignored (thus, ID_CT for rows 2 and 3 could be set to 0).

    ROW  ID_ID  ID_CT     X1     X2
   ---------------------------------
      1      1      3    515     15
      2      1      3    263    516
      3      1      3    262    677
      4     12      3    775     63
      5     12      3     12     66
      6     12      3      6      7
      7     33      4    678    886
      8     33      4     12     63
      9     33      4     64     77
     10     33      4    997  82349
     11     22      2     44    778
     12     22      2      3    754
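
  The equivalence of the -ID and -COUNT readings of this table can be checked with a
  short sketch. It is written in Python for illustration only and is not how the GRBL2
  procedures build panels; panels_from_id and panels_from_count are hypothetical
  helpers. Following the note above, only the count in the first row of each panel
  is used.

```python
def panels_from_id(ids):
    """-ID: group contiguous rows that share the same ID value (1-based rows)."""
    panels = []
    for row, value in enumerate(ids, start=1):
        if panels and ids[row - 2] == value:
            panels[-1].append(row)       # same ID as the previous row
        else:
            panels.append([row])         # ID changed: start a new panel
    return panels


def panels_from_count(counts):
    """-COUNT: the first row of each panel says how many rows the panel has."""
    panels, row = [], 1
    while row <= len(counts):
        n = counts[row - 1]
        if n < 1:
            raise ValueError("the first row of a panel needs a count of at least 1")
        panels.append(list(range(row, row + n)))
        row += n                         # later counts inside the panel are ignored
    return panels


id_id = [1, 1, 1, 12, 12, 12, 33, 33, 33, 33, 22, 22]
id_ct = [3, 3, 3, 3, 3, 3, 4, 4, 4, 4, 2, 2]
print(panels_from_id(id_id))
print(panels_from_count(id_ct))
```

  Both calls give the four panels listed above: rows 1-3, 4-6, 7-10, and 11-12.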


  Notes:
     * You MUST define a dataset (using FILE) before using ID.
     * VARNAME MUST be a variable in the selected dataset.


ID1:
   Define a second identifier variable. The ID1NAME, ID1TYPE, and ID1INDEX global 1x1 matrices are set.
   The syntax is the same as ID.


ID2:
   Define another identifier variable. The ID2NAME, ID2TYPE, and ID2INDEX global 1x1 matrices are set.
   The syntax is the same as ID.
   Note: in contrast to ID, the typical usage of ID2 does not require contiguous rows.
         For example, ID may define a "panel" of contiguous rows, while ID2
         defines non-contiguous subsets (within the panel's rows).

ID3:
   Define a third identifier variable. The ID3NAME, ID3TYPE, and ID3INDEX global 1x1 matrices are set.
   The syntax is the same as ID.
   Note: in contrast to ID, the typical usage of ID3 does not require contiguous rows.
         For example, ID may define a "panel" of contiguous rows, while ID3
         defines non-contiguous subsets (within the panel's rows).


INFO:
   Produces a short report listing the name of the currently selected GAUSS dataset, the number of
   rows in this dataset, and a list of variable names.

   Alternatively:

       INFO other_file ;

   will display basic information on other_file (rather than the most recent FILENAME file).


INPUT: Input dataset. This is a synonym for FILE -- see the description of FILE for the
       details.


OUTPUT: Name of output file.
    Name of file to write results to.

    Syntax:
         OUTPUT output_filename -OVERWRITE ;
     or
         OUTPUT file="output_filename" -OVERWRITE ;

     The -OVERWRITE is optional. If you specify -OVERWRITE, any pre-existing version of
     output_filename is deleted.

     Otherwise, new output is appended to the output_filename file.

     Notes:
        * the output_filename is stored in the OUTFILE global variable.
        * if you use the file="output_filename" format, output_filename may contain spaces.
        * -RESET is synonymous with -OVERWRITE
        * Older versions of GRBL2 (pre Nov '05) used:
              OUTPUT output_filename RESET ;
          This syntax no longer works.

    Examples:
          OUTPUT all.out ;
          OUTPUT nuresult.out -overwrite ;
          OUTPUT file="this is my results.out"  ;


PAUSE:   Pause for a few seconds, with optional input
   Pause and wait for user input (any single key). If no input
   after a designated time, continue.

   Example:
            PAUSE 10 This is a prompt ;

   The "10" is a 10 second delay,
   The "This is a prompt" will be displayed.
   The key pressed will be stored in a $PAUSE variable.
   If no key is pressed (i.e., no answer after 10 seconds),
   the $PAUSE variable is set to "".


QUERY: Ask the user for input
   QUERY is used to ask the user for input (from the keyboard). The
   results are then saved into a "GRBL2 variable" (see DEFINE for a
   description of GRBL2 variables).

   Syntax:
      QUERY varname  multi-word prompt;
   If the prompt is not included, a simple generic prompt is used.

   Example:
      QUERY outfil Select output file ;
      QUERY BidType ;

    Notes:

      * the prompt in QUERY (as well as the "value" in DEFINE), may contain
        $variables -- they will be replaced by the current value
        of the corresponding GRBL2 variable (as set in prior DEFINEs and QUERYs).

      * The varname in a QUERY must NOT start with a $.

      * GRBL2 variables are stored in the DEFINES and DEFINEW global matrices.      


RESET:  Reset variables
   Resets the global variables to their defaults.
   Note: this sets them to the values they would have if G_BATCH is called with G_BATCH(..,1,..)


RUN:
    Return to calling program, with status=1


SELECT: select which observations to retain

    You can enter several observation selection criteria.
    Each criterion has 4 elements:
       OR/AND Variable Condition Value
    where
       OR/AND: Either OR or AND. To be selected, an observation must satisfy at least
               one of the OR criteria, and ALL of the AND criteria.
       Variable : The name of a variable
       Condition: One of GT LT LE GE EQ or NE
       Value : A numeric value

       Examples:
          select or q2 eq 2 ,
                 or q2 eq 1 ;

         select or q2 eq 2   and q7_1b eq 2
              and version ne 2  and version ne 3 ;

       Notes:
        *  For readability purposes: linefeeds, spaces, and commas can be freely intermixed
           in the SELECT list -- they are ignored.
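
       The OR/AND rule can be sketched outside of GAUSS. The following Python
       sketch is not part of GRBL2: the function name is_selected and the tuple
       layout are hypothetical, and it assumes that a SELECT list with no OR
       criteria places no OR restriction on the observation.

```python
import operator

# Map the GRBL2 condition codes to Python comparisons.
CONDITIONS = {
    "GT": operator.gt, "LT": operator.lt, "LE": operator.le,
    "GE": operator.ge, "EQ": operator.eq, "NE": operator.ne,
}

def is_selected(row, criteria):
    """row: dict of variable name -> numeric value.
    criteria: list of (joiner, variable, condition, value) tuples."""
    or_hits, and_hits = [], []
    for joiner, variable, condition, value in criteria:
        hit = CONDITIONS[condition.upper()](row[variable], value)
        (or_hits if joiner.upper() == "OR" else and_hits).append(hit)
    # Assumption: with no OR criteria, the OR test passes trivially.
    or_ok = any(or_hits) if or_hits else True
    return or_ok and all(and_hits)

# The second SELECT example, written as tuples.
criteria = [
    ("or", "q2", "eq", 2), ("and", "q7_1b", "eq", 2),
    ("and", "version", "ne", 2), ("and", "version", "ne", 3),
]
print(is_selected({"q2": 2, "q7_1b": 2, "version": 1}, criteria))  # True
print(is_selected({"q2": 2, "q7_1b": 2, "version": 3}, criteria))  # False
```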

SAVE: Save matrices to .FMT files.
  Specify a list of matrices and .FMT files. These are stored in the
  i_svlist, savevars, and savefils global matrices, which are used by
  the G_SAVE procedure.
  
  Syntax:
     SAVE FILE=VAR ;
   or
     SAVE VAR ;
   where
      file - a file name (do NOT include an extension)
       var - a global variable

   If you do not include FILE=, then a file with the same name as
   the variable will be created.
   
   Example:
      SAVE X ;               --  save X to X.FMT
      SAVE X1F=x1 X2G=X2 ;   --  save x1 to X1F.FMT, and x2 to X2G.FMT

SET: Set values of global variables.
     Syntax:
         SET  varname=newvalue varname=newvalue ;

     This is especially useful for changing the defaults of model specific
     parameters.

    Examples:
        set ditype=2 ;
        set ditype=1 steptype=1 ;

    Note: this sets GAUSS variables. It does NOT set the "GRBL2  string 
          replacement" variables.


STARTBLOCK: Start of a "gosub callable" block
  Syntax:
   STARTBLOCK blockname ;
  Blockname should be a label (like with GOTO).
  It MUST be unique (don't define multiple GOTO or GOSUB labels using the same name)

  Note: code between STARTBLOCK and ENDBLOCK will be used to replace GOSUB commands.

  Note2: the G_GOSUB0 procedure does this replacement. G_GOSUB0 must be called before G_BATCH.

         If G_BATCH finds a GOSUB, STARTBLOCK, or ENDBLOCK, an error message
         will be displayed and processing will stop!
         

STOP how :
EXIT how ;

  Stop the program.

  how is optional; it can be SYSTEM, DOS, or GAUSS.

    how not specified:  G_BATCH returns with status=0. This is usually interpreted as
                        "ask user for new input"
    how="GAUSS" : exit to GAUSS prompt
    how="DOS"  : exit to operating system after 4 second delay (close GAUSS down)
    how="SYSTEM"  : exit to operating system immediately (close GAUSS down)

    Hint:
       if you use the GRBL2 "script" facility, you should end each of the input
       files (that are listed in the GRBL2 script file) with EXIT SYSTEM.

TITLE:   
    Create a "title".

    Results are stored to the TITLE global variable.
        
    TITLE can contain  $xxx variables -- they will be replaced by the
    corresponding GRBL2 variables.

   Example:
      TITLE This is the b4 regression ;
      Title $Atitle :  apples ;

     Note:
       if you earlier specified:
           DEFINE Atitle Fruit demand curves ;
       then the latter example yields:
           Fruit demand curves : apples ;
  
   Special syntax -- replacing lines of a multi line title.
   
   If you use the following syntax:
      TITLEn  a_string ;
   where n is an integer between 1 and 10 (inclusive), then the "nth" line of the
   current title will be replaced by a_string. If the current title does not have
   n lines, then the appropriate number of blank lines will be added.

   The use of TITLEn allows you to change one line in long titles -- which can be useful if you
   are running many different models, with each model differing in one of several respects (so
   that most of the title remains the same, and you change just the line that contains the
   description of what changes).

   Example:
          TITLE This is my model.
             Using Logit
             Using basic dataset
             Using core variables. ;

              ... run some models

          TITLE2  Using Probit ;

              ..... run some models

          TITLE2 Using Logit ;
          TITLE3 Using revised dataset ;
     
              ..... run some models

     etc 
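
   The TITLEn replacement rule (replace line n, padding with blank lines when
   the title is too short) can be sketched as follows. This is a Python
   illustration only, not GRBL2 code; the function name replace_title_line and
   the representation of a title as a list of lines are assumptions.

```python
def replace_title_line(title_lines, n, a_string):
    """Return a copy of title_lines with line n (1-based) set to a_string."""
    if not 1 <= n <= 10:
        raise ValueError("n must be between 1 and 10 (inclusive)")
    lines = list(title_lines)
    # Pad with blank lines when the current title has fewer than n lines.
    while len(lines) < n:
        lines.append("")
    lines[n - 1] = a_string
    return lines

title = ["This is my model.", "Using Logit",
         "Using basic dataset", "Using core variables."]
title = replace_title_line(title, 2, "Using Probit")
print(title[1])                                              # Using Probit
print(replace_title_line(["Only one line"], 3, "Third line"))
# ['Only one line', '', 'Third line']
```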


WEIGHT: 
   Define a weighting variable. The WTNAME, WTNORM, WTREP, and WTINDEX global 1x1 scalars are set.

  Syntax:
       WEIGHT varname -NORM -REP ;
   or 
       WEIGHT 0 ; 
  to suppress the weight variable.

  WTNAME is the VARNAME, WTINDEX points to the row of VARLIST (and the column of
  the dataset) corresponding to varname.  If WEIGHT 0 is used, WTINDEX is set to missing.

   The -NORM is optional. If included, WTNORM=1 (otherwise, WTNORM=0).

      If -NORM is specified, GRBL2 programs will normalize the WEIGHT variable
      (so that the weights sum to the number of observations).

   The -REP is optional. If included, WTREP=1 (otherwise, WTREP=0).

      Typically, this signals that the weights should be treated as  "replications", rather 
      than multiplicative weights. However, WTREP is NOT used by any of the GRBL2 procedures
      (it is used by a few GRBL2 programs).


  Notes:
     * You MUST define a dataset (using FILE) before using WEIGHT.

     * VARNAME MUST be a variable in the selected dataset.
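
   As an illustration of the -NORM option, normalization rescales the weights
   so that they sum to the number of observations. The Python sketch below
   shows only that arithmetic; the function name normalize_weights is
   hypothetical and this is not the code used by any GRBL2 program.

```python
def normalize_weights(weights):
    """Rescale weights so that they sum to the number of observations."""
    n = len(weights)
    total = sum(weights)
    if total == 0:
        raise ValueError("weights sum to zero; cannot normalize")
    return [w * n / total for w in weights]

w = normalize_weights([1.0, 2.0, 3.0, 2.0])
print(w)        # [0.5, 1.0, 1.5, 1.0]
print(sum(w))   # 4.0, the number of observations
```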


WEIGHT1: 
   Define a first weight variable. The WT1NAME, WT1NORM, WT1REP, and WT1INDEX global 1x1 scalars are set.
   The syntax is exactly the same as WEIGHT.

WEIGHT2: 
   Define a second weight variable. The WT2NAME, WT2NORM, WT2REP, and WT2INDEX global 1x1 scalars are set.
   The syntax is exactly the same as WEIGHT.



WORK_DIR: 
   Specify the working directory: where programs should look for, and write, files.

   If not specified, the current working directory is used.
  
   If the specified directory does not exist, an error is reported and the program will stop.

  Example:
         WORK_DIR  e:\stuff\data\project1 ;


  WORK_DIR is only used if a relative file name is specified (a file name without disk and directory information).

  GRBL2 programs typically will look for a file by:
    a) If it is a qualified filename (with drive and directory), look for the file using the path information
       in the filename. If not found, report an error
    b) If a relative file name
         i) First check the WORK_DIR
         ii) If not found in the WORK_DIR, check the current working directory
         If not found in either of these, report an error.

    Note: g_fspec3 procedure is used to do the above.
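
  The lookup order in a) and b) can be sketched as below. This is a Python
  illustration, not the g_fspec3 procedure; the function name find_file is
  hypothetical, and "qualified" is approximated here as "the name contains
  directory information".

```python
import os
import tempfile

def find_file(filename, work_dir=None):
    """Return the path where filename is found, or raise FileNotFoundError."""
    # a) Qualified name: use only the path information in the file name.
    if os.path.dirname(filename):
        if os.path.isfile(filename):
            return filename
        raise FileNotFoundError(filename)
    # b) Relative name: i) WORK_DIR first, ii) then the current directory.
    candidates = []
    if work_dir:
        candidates.append(os.path.join(work_dir, filename))
    candidates.append(os.path.join(os.getcwd(), filename))
    for candidate in candidates:
        if os.path.isfile(candidate):
            return candidate
    raise FileNotFoundError(filename)

# Usage: a file that exists only in the working directory is found there.
with tempfile.TemporaryDirectory() as work_dir:
    path = os.path.join(work_dir, "mydata.dat")
    open(path, "w").close()
    print(find_file("mydata.dat", work_dir) == path)   # True
```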
  

VIEW: 
     View a text file using a simple-but-navigable viewer.
     Syntax:
         VIEW filename prompt ;
     Prompt will be displayed at the top of the screen.
     If no filename specified, then view the current output file (as defined by the
     OUTPUT command), using a generic prompt.

     Examples:
         VIEW ;
         VIEW FOOBAR.OUT This is foobar ;
         VIEW  .  Output from the $MYMODEL is: ;

    Note the use of . as a placeholder for the file name -- it means
    "use current output file". Also note that $variables are replaced by
    values set using the GRBL2 DEFINE command.


VERBOSE or VERBOSE2: Status message verbosity.
   If specified, then more status messages are written.

   Example:
         Verbose ;


X: 
  Define a list of "X" variables. The XNAME, XINDEX, and NX global matrices are set.

  Syntax:
        X NO ;
   or
        X VAR1 ... VARN ;
   or
        X * ;

  Option: if you start the list with "-CONST", a constant will be added (as the first variable).

  Examples:
        X NO ;
        X V1 VX1  INCOME ;
        X -CONST V1 VX1 INCOME ;
        X *  ;

   X NO means "no independent variables". This causes:
        NX=0
        XINDEX=miss(1,1)
        XNAME=0
   Otherwise,
        NX = number of variables specified, possibly including a constant
        XNAME = vector containing a list of the specified variables    
        XINDEX = vector of indices; row i of XINDEX points to the column
                 of the dataset containing the variable specified in row i of
                 XNAME. That is, XINDEX is an index of XNAME's positions in
                 the VARLIST global matrix.
       For the constant, the corresponding row of XINDEX will be 0.

  Notes:

     * You MUST define a dataset (using FILE) before using X.

     * The VAR1 ... VARN variables MUST be variables in the selected dataset.

     *  X *  means "all the variables" (sometimes this is useful).
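
  The relationship between XNAME, XINDEX, NX and VARLIST can be sketched as
  follows. This is a Python illustration, not GRBL2 code: the function name
  build_x_lists, the label "CONST" for the constant, and the case-insensitive
  name matching are all assumptions. Indices are 1-based, as in GAUSS.

```python
def build_x_lists(varlist, requested, add_const=False, const_name="CONST"):
    """Return (xname, xindex, nx) for the requested variable names.

    const_name is a hypothetical label for the constant."""
    upper = [v.upper() for v in varlist]
    xname, xindex = [], []
    if add_const:
        xname.append(const_name)
        xindex.append(0)                      # the constant has index 0
    for name in requested:
        key = name.upper()
        if key not in upper:
            raise ValueError(name + " is not in the selected dataset")
        xname.append(key)
        xindex.append(upper.index(key) + 1)   # 1-based position in VARLIST
    return xname, xindex, len(xname)

varlist = ["ID", "V1", "VX1", "AGE", "INCOME"]
# Corresponds to: X -CONST V1 VX1 INCOME ;
print(build_x_lists(varlist, ["V1", "VX1", "INCOME"], add_const=True))
# (['CONST', 'V1', 'VX1', 'INCOME'], [0, 2, 3, 5], 4)
```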

X1: 
  Define a list of "X1" variables. The X1NAME, X1INDEX, and NX1 global matrices are set.
  The syntax is exactly the same as X (including the -CONST option).


X2: 
  Define a list of "X2" variables. The X2NAME, X2INDEX, and NX2 global matrices are set.
  The syntax is exactly the same as X (including the -CONST option).

X3: 
  Define a list of "X3" variables. The X3NAME, X3INDEX, and NX3 global matrices are set.
  The syntax is exactly the same as X  (including the -CONST option).


XNEW: define new variables as arithmetic combinations of two (or more) variables in dataset

    XNEW's primary purpose is to create polynomial variables (X**2, X1 * X2, etc).
    However, it can be used for some other, more complex, mathematical transforms.
 
    Syntax:
      XNEW  newname1 = oldvar1a action oldvar2a ~
            newname2 = oldvar1b action oldvar2b action oldvar3b ~
            ... ~
            newnamex = oldvar1x action oldvar2x ;

    where
      newnameX is a (8 character maximum) "variable name" that will be displayed

      oldvar1.. are variables that exist in the datafile
            OR
         numeric values
            OR
         one of the previously defined new variables.

      action is either ^, *, /, +, -, \, %, =, <, >, or func(..)
            ^ is exponentiation
            \ is division with truncation, e.g. 20 \ 6 = 3 (=trunc(20/6))
            % is remainder, e.g. 20 % 6 = 2
            = is equality -- a 0/1 dummy is created (1 if oldvar1a = oldvar2a)
            < and > are similar to =, but use lesser and greater comparisons

      func(..) signifies that func is a function, with .. the argument of the function.

    ~ (or |) are used to separate equations (thus, a single equation can span multiple lines).

    Note: when you specify an XNEW keyword, all other previously defined "new variables"
     are erased.

         
    You can specify up to 20 new variables, with up to 5 terms per equation.
   
    Examples:
      XNEW X11 = X1 * X1 ;
      XNEW X11=  X1 * X1 ~   X12=X1 * X2  ~
                      X2_1 = X2 - X11 / 100 ~
                 YBIG = Y1 > 100 ;

    Note: if you have several terms on the line, processing goes from left to right -- there is NO
      operator precedence. 
       Thus: 
       X2_1 = X2 - X11 / 100 
            means 
       X2_1 = (X2 - X11) / 100 
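
    The left-to-right rule can be checked with a small evaluator. The Python
    sketch below is an illustration only (it is not how GRBL2 parses XNEW): the
    name eval_left_to_right and the list-of-terms input are hypothetical, it
    works on a single observation, and the behavior of % for negative values
    is an assumption.

```python
import math

# Binary actions, applied strictly left to right (no operator precedence).
ACTIONS = {
    "+": lambda a, b: a + b,
    "-": lambda a, b: a - b,
    "*": lambda a, b: a * b,
    "/": lambda a, b: a / b,
    "^": lambda a, b: a ** b,
    "\\": lambda a, b: math.trunc(a / b),   # division with truncation
    "%": lambda a, b: math.fmod(a, b),      # remainder
    "=": lambda a, b: 1 if a == b else 0,   # 0/1 dummy
    "<": lambda a, b: 1 if a < b else 0,
    ">": lambda a, b: 1 if a > b else 0,
}

def eval_left_to_right(terms, values):
    """terms alternates operands and actions, e.g. ["X2", "-", "X11", "/", 100].

    Operands are variable names (looked up in values) or numbers."""
    def operand(term):
        return values[term] if isinstance(term, str) else term
    result = operand(terms[0])
    for i in range(1, len(terms), 2):
        result = ACTIONS[terms[i]](result, operand(terms[i + 1]))
    return result

obs = {"X2": 500.0, "X11": 100.0}
print(eval_left_to_right(["X2", "-", "X11", "/", 100], obs))  # 4.0 = (X2 - X11) / 100
print(eval_left_to_right([20, "\\", 6], obs))                 # 3
print(eval_left_to_right([20, "%", 6], obs))                  # 2.0
```

    With conventional operator precedence the first result would be
    500 - (100 / 100) = 499, which is why the order of terms matters.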

    Functions:
        You can use several functions, such as LN and EXP, but only in a simple fashion.
        Specifically, the equation can start with a function(, followed by multiple terms,
        followed by a ), and then followed by other terms.

        Examples:
            LY  =  ln(y2 ) ;
            Y2  =  ln( x1 + x2 )  ;
            Y23  =  exp(Y2 + Z1 )  /  Z2  ;

        The following  is NOT permitted:
            Y  =  x3 + ln(x1 + y2 )        
        (perhaps in later versions of GRBL2 we will loosen this syntactic stricture)

      Supported functions:
      LN    - natural log. Values <=0 cause a fatal error.
      LN_m  - natural log. If value is <=0, use ln(10^-m)
      EXP   - exponential
      SIN   - sine (radians)
      COS   - cosine (radians)
      TAN   - tangent (radians)

    Notes:
       * Use XNEW NO to have no new variables created.
 
       * XNEW will set the XNEWinfo global "structure" variable

       * For details on the structure of the XNEWinfo "structure" variable, see the description
         of G_VUXNEW.

       * XNEW V3 = ln_3(VOLD1) ;
         means: "compute the log of VOLD1; if VOLD1<=0, use ln(0.001)"

       * There can NOT be any spaces between func_name and (

       * Hint: IF like constructions can be implemented with the following type of syntax:
             XNEW DUM1 = YN =  1 ~              @ assuming YN is a 0/1 dummy, where 1 signifies "YES" @
             TMP1 = YN * VAL1 ~
             TMP0 = 1 - YN * VAL0 ~
             NEWVAL = TMP1 + TMP0 ;        @ NEWVAL equals VAL1 if YN=1, else NEWVAL equals VAL0 @
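
    The difference between LN and LN_m can be sketched in Python (an
    illustration of the rule only, not GRBL2 code; the names ln_strict and
    ln_m are hypothetical):

```python
import math

def ln_strict(value):
    """LN: natural log; values <= 0 are treated as a fatal error."""
    if value <= 0:
        raise ValueError("LN of a value <= 0")
    return math.log(value)

def ln_m(value, m):
    """LN_m: natural log; for values <= 0, return ln(10^-m) instead."""
    if value <= 0:
        return math.log(10.0 ** -m)
    return math.log(value)

# ln_3(VOLD1): non-positive values are replaced by ln(0.001).
print(round(ln_m(-5.0, 3), 4))   # -6.9078
print(ln_m(1.0, 3))              # 0.0
```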


Y:
   Define a dependent variable. The YNAME and YINDEX global 1x1 matrices are set.

  Syntax:
        Y varname ;

  YNAME is the VARNAME, YINDEX points to the row of VARLIST (and the column of
  the dataset) corresponding to varname. 

  Notes:
     * You MUST define a dataset (using FILE) before using Y.

     * VARNAME MUST be a variable in the selected dataset.


Y1: 
   Define a Y1 dependent variable. The Y1NAME and Y1INDEX global 1x1 matrices are set.
   The syntax is exactly the same as Y.


Y2: 
   Define a Y2 dependent variable. The Y2NAME and Y2INDEX global 1x1 matrices are set.
   The syntax is exactly the same as Y.

Y3: 
   Define a Y3 dependent variable. The Y3NAME and Y3INDEX global 1x1 matrices are set.
   The syntax is exactly the same as Y.


Z: 
  Define a list of "Z" variables. The ZNAME, ZINDEX, and NZ global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.

Z1: 
  Define a list of "Z1" variables. The Z1NAME, Z1INDEX, and NZ1 global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.


Z2:
  Define a list of "Z2" variables. The Z2NAME, Z2INDEX, and NZ2 global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.


Z3: 
  Define a list of "Z3" variables. The Z3NAME, Z3INDEX, and NZ3 global matrices are set.
  The syntax is exactly the same as X, except the -CONST option is NOT supported.




                -------------------

5. Using a G_BATCH2 procedure to specify additional keywords

As noted in section 3, you can tell G_BATCH to call a custom-defined procedure if need be.
More precisely, you can tell G_BATCH to call a G_BATCH2 procedure whenever an unrecognized
"keyword" is encountered.

To do this, set the "use_custom_proc" argument in
     {new_string,status}=G_BATCH(text_string,reset,use_custom_proc) ;
to 1.

For example:
  {string1,status}=g_batch(a_string,1,1) ;
will process the contents of the a_string "string", will reset all the globals, and will call
the G_BATCH2 procedure if an unknown keyword is encountered.

It is your responsibility, as the programmer, to provide a specific version of G_BATCH2.
Obviously, this can change from program to program. In fact, for simple enough models, you probably will not
need a G_BATCH2 procedure (the G_BATCH options often suffice).

What G_BATCH2 does is up to you -- typically, you will use it to set globals. Or, you can
use it to start model execution.


The syntax of G_BATCH2 should be:

proc (2)=g_batch2(keywurd,acommand,ie) ;
 Where:
   keywurd:  the unrecognized  (by G_BATCH) "keyword" -- a single upper case word (no embedded spaces)
   acommand: the full command (including the keywurd)
   ie: the character location (in acommand) of the end of the "keywurd".

 And you should return
    {ienew,status}

  The ienew should be the location of the last read character (in acommand). However, it's not currently
  used, so you can return a 0.

  Status should take one of the following values:
     0 = OK
     3 = No such  Command

  
  Hints:
     You can use the following syntax to repetitively read words out of the acommand:
        do until ie==0 ;
             {aword,ie}=g_getwrd(acommand,ie+1);
            ... process aword ...
        endo ;


6. The G_MKDATA procedure

G_MKDATA: construct global matrices (X,X1,....) using info specified in GRBL2  command files

   okindex=g_mkdata(which,dta,selects,removes)

  where:

  which: ALL - make all defined variables 
         or a vector containing any of the following:
            X,X1,X2,X3,Y,Y1,Y2,Y3,Z,Z1,Z2,Z3,ID,ID1,ID2,ID3,WEIGHT,WEIGHT2,DUMMY,XNEW,AUX,AUX1,AUX2 
        to make the corresponding variables

   dta: the original (from data file) data to extract these variables from
        Or, missing, in which case the entire current dataset is read (as set by
        the FILE keyphrase)

  selects: an index of observations to start with. This is ONLY used if  dta=miss(1,1) --
      that is, when all the data from the currently chosen dataset is to be read.
     SELECTS is typically found using: selects=G_DOSLCT(slclist,filename)
     If SELECTS is a scalar 0, then use all observations.

  removes: how to deal with missings, etc
      0=  retain all rows
      1 =  remove rows if ANY of the matrices have missing values
      2 =  same as 1, but also treat nans and infs as missing values
       Note that SELECTS is first used to discard unwanted observations,
       and REMOVES is then used to discard observations with unusable values.

   okindex: an index (into dta)  of the observations used to create the various
       globals. If removes=0, this will just point to all the rows in dta.


  Notes:
  *  G_MKDATA Uses:
       xindex, x1index, X2index, x3index, yindex, y1index, y2index, y3index,
       zindex, z1index, z2index, z3index, idindex, id1index, id2index, id3index, 
       wtindex, wt2index,
       aindex,a1index,a2index,naux,naux1,naux2
       dinfo, and XNEWinfo; and NX, NX1 ,NX2, NXNEW,NXDUMMY, NZ, NZ1 and NZ2.
     G_MKDATA creates the global matrices (or vectors):
       x, x1, X2, x3, y, y1, y2, y3, z, z1, z2, z3, id, id1, id2, id3, weight, weight2, 
       aux, aux1, aux2,
       XDUMMY, and XNEW.

  * There will be the same number of rows in all matrices created.

  * If you are not reading the entire file (if you explicitly specify a dta), then
    observation selection  (say, using G_DOSLCT) should be performed on dta before calling this
    procedure.
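
  The effect of the REMOVES argument on OKINDEX can be sketched as follows.
  This is a Python illustration, not the G_MKDATA code: the function name
  ok_rows is hypothetical, a missing value is modeled here as None so that it
  can be told apart from NaN and infinity, and the returned indices are
  1-based as in GAUSS.

```python
import math

def ok_rows(matrices, removes):
    """matrices: row-aligned matrices, each a list of rows (lists of values).

    Returns the 1-based indices of the rows that are kept."""
    def bad(value):
        if value is None:                       # missing value
            return True
        # removes=2: NaN and infinite values also count as missing.
        return (removes == 2 and isinstance(value, float)
                and (math.isnan(value) or math.isinf(value)))

    keep = []
    for i in range(len(matrices[0])):
        row_values = [v for matrix in matrices for v in matrix[i]]
        if removes == 0 or not any(bad(v) for v in row_values):
            keep.append(i + 1)
    return keep

x = [[1.0, 2.0], [None, 3.0], [4.0, float("inf")]]
y = [[1.0], [0.0], [1.0]]
print(ok_rows([x, y], 0))   # [1, 2, 3]
print(ok_rows([x, y], 1))   # [1, 3]
print(ok_rows([x, y], 2))   # [1]
```

  Because a row is dropped when ANY matrix has an unusable value in it, all
  of the matrices built from the kept rows have the same number of rows.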


6a. The G_MKXUSE procedure

   The G_MKXUSE procedure can be used to create a single global matrix (XUSE) that contains the X, XNEW,
   and DUMMY variables (if they've been specified). Thus, G_MKXUSE is meant to be called after
   G_MKDATA.

        {norms,parnames}=g_mkxuse(do_norm)

    where:
         XUSE will be a N x J matrix constructed from the X, XNEW, and DUMMY variables for N observations

          do_norm : If 1, then normalize XUSE. That is, each column of XUSE will be normalized by the
                  sd of its values. However, dummy columns (such as columns whose only values are 0 and 1)
                  will NOT be changed.
          norms :   If do_norm=1, a J x 1 vector of "normalization" parameters.
         PARNAMES : a vector of variable names (each row of PARNAMES corresponds to a column of XUSE).

     Note that G_MKDATA removes all bad observations (with missing values, infs, etc) when it creates
      X, XNEW, and DUMMY. Thus, XUSE will not contain any "bad" observations.
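
     The do_norm=1 behavior can be sketched as below. This is a Python
     illustration, not the G_MKXUSE code, and it rests on several assumptions:
     the sample standard deviation is used, a column counts as a dummy when its
     only values are 0 and 1, and unchanged columns get a norm of 1. The
     function name normalize_columns is hypothetical.

```python
import statistics

def normalize_columns(xuse):
    """xuse: list of rows (at least two). Returns (scaled copy, norms)."""
    n_cols = len(xuse[0])
    norms = []
    for j in range(n_cols):
        column = [row[j] for row in xuse]
        if set(column) <= {0, 1}:
            norms.append(1.0)            # dummy column: left unchanged
        else:
            sd = statistics.stdev(column)
            # Guard (assumption): a column with zero spread is left unchanged.
            norms.append(sd if sd > 0 else 1.0)
    scaled = [[row[j] / norms[j] for j in range(n_cols)] for row in xuse]
    return scaled, norms

xuse = [[1, 2.0, 0],
        [1, 4.0, 1],
        [1, 6.0, 0]]
scaled, norms = normalize_columns(xuse)
print(norms)    # [1.0, 2.0, 1.0]
print(scaled)   # [[1.0, 1.0, 0.0], [1.0, 2.0, 1.0], [1.0, 3.0, 0.0]]
```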
