STATS: Compute statistics on GAUSS datasets


  The STATS program is a simple utility for computing statistics on GAUSS datasets.
  It can also be used to compute frequencies.


			----------------------------------
I. Synopsis

To use STATS, you must create a "GRBL2-style" input file.

These input files consist of "keyphrases".
 
  The syntax of these keyphrases is:

     KEYWORD option_list ;

  where:
    KEYWORD: one of several keywords understood by STATS
    option_list: a list of one or more space delimited options.
		 The syntax of these options depends on the keyword.
  Notes: 
     * comments are enclosed between ampersand characters.
     * each keyphrase MUST end with a semi-colon 
     * keywords and options are case-INsensitive
     * For a complete description of GRBL2 input files, see GRBL2_BATCH.TXT
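
The syntax rules above can be illustrated with a short Python sketch. This is
not the parser STATS uses; it only shows one way to read the rules (ampersand
comments, semicolon terminators, case-insensitive keywords). The function name
parse_keyphrases and the file names in the sample are made up for illustration.

```python
import re

def parse_keyphrases(text):
    """Split GRBL2-style input text into (KEYWORD, [options]) pairs."""
    # Drop comments: anything enclosed between two ampersand characters.
    text = re.sub(r"&[^&]*&", " ", text)
    phrases = []
    # Each keyphrase ends with a semicolon, so split on it.
    for chunk in text.split(";"):
        tokens = chunk.split()
        if not tokens:
            continue
        # Keywords are case-insensitive, so normalise them to upper case.
        phrases.append((tokens[0].upper(), tokens[1:]))
    return phrases

# A small, made-up input file (file names are hypothetical).
sample = """
& compute a few statistics &
input  mydata ;
Output report.txt ;
stats mean sd $ median ;
"""
for keyword, options in parse_keyphrases(sample):
    print(keyword, options)
```

Note that this sketch does not reject a final keyphrase that lacks its
semicolon; a stricter reader would.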

STATS supports two actions using the keyphrases:
   STATS options ;   : Basic statistics
   FREQ  options ;   : Frequency table

In addition, the following keyphrases are supported (see GRBL2_BATCH.TXT for
the details).

   OUTPUT out_file  ; Name of a file to write a report of results to 
   INPUT  in_file   ; Name of GAUSS dataset 

A number of additional keyphrases are used with STATS and FREQ; they are
described in Section II.

	--------------------------------

II. Description of STATS and FREQ

	--------------------------------


IIa. STATS

STATS will generate a variety of statistics for variables in a GAUSS dataset.
You can also generate statistics for subsets of the data, using a BY variable.


The following keyphrases are required (see GRBL2_BATCH.TXT for the details):

FILENAME : Name of the input (gauss data) file. 

 X      : variables to compute statistics for. 


The following are optional. 

BY : Break dataset into subsets

   The BY variable tells STATS to produce a separate set of statistics for each
    subset of the data, where a subset is all observations that share the same
    value of the BY variable.

   Syntax:
       BY varname ;		

   Careful: if the BY variable has many distinct values, a lot of tables will
            be produced.
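
   As a rough illustration of what BY processing means (a sketch only, not the
   STATS implementation), the following Python groups observations on the BY
   variable and computes N and MEAN within each group. The function name
   stats_by and the variable names "region" and "income" are hypothetical.

```python
from collections import defaultdict
from statistics import mean

def stats_by(rows, by, x):
    """Compute N and MEAN of variable x separately for each value of `by`."""
    groups = defaultdict(list)
    for row in rows:
        groups[row[by]].append(row[x])
    # One "table" of statistics per distinct BY value.
    return {value: {"N": len(vals), "MEAN": mean(vals)}
            for value, vals in sorted(groups.items())}

# Made-up data: three observations, two regions.
rows = [
    {"region": 1, "income": 10.0},
    {"region": 1, "income": 20.0},
    {"region": 2, "income": 5.0},
]
for value, table in stats_by(rows, "region", "income").items():
    print("region =", value, table)
```

   The number of tables equals the number of distinct BY values, which is why
   a BY variable with many values produces a lot of output.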


ID : A panel identifier.
   If specified, basic statistics on the number and size of the panels are reported.

   
OUTPUT  : Name of output file


SELECT :  Observation selection criteria. Use 0 to select all observations.
         See GRBL2_BATCH.TXT for the details on observation selection.


STATLIST : Space delimited string of statistics to be generated.

     The following statistics are supported:
	  MEAN MEDIAN MIN MAX N NMISS SD SUM 1 5 10 25 50 75 90 95 99 

      Or
	   *  
      to display all the above statistics.

      Or, if blank (not specified)
	  MEAN SD MIN MAX N 


      Notes:
        *  The 1 ... 99 refer to percentiles.
           I.e., 5 means the value below which 5% of the observations fall.
        *  Statistics are displayed in order entered in STATLIST.
        *  Only 5 items are displayed per table (hence, if you specify 
	   everything, 4 tables are created).
        *  Include a "$" (set off by spaces) to force a "table break".
             For example:
                STATS MEAN SD SUM $ MEDIAN 1 5 95 99 ;
             would produce 2 tables, the first with three columns of statistics, the
             second with five.
         * The variable name is always the first column of a table.
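
      The table layout rules above (at most 5 statistics per table, "$" forces
      a break) can be sketched in Python as follows. This is an illustration
      of the rules as stated, not the code STATS uses; the function name
      split_into_tables is made up.

```python
def split_into_tables(statlist, max_cols=5):
    """Group statistic names into tables of at most max_cols columns.

    A "$" token forces a table break; it is not itself a statistic.
    """
    tables, current = [], []
    for token in statlist.split():
        if token == "$":
            # Forced break: close the current table if it has any columns.
            if current:
                tables.append(current)
            current = []
            continue
        current.append(token.upper())
        if len(current) == max_cols:
            # Table is full: start a new one.
            tables.append(current)
            current = []
    if current:
        tables.append(current)
    return tables

ALL_STATS = "MEAN MEDIAN MIN MAX N NMISS SD SUM 1 5 10 25 50 75 90 95 99"

for table in split_into_tables("MEAN SD SUM $ MEDIAN 1 5 95 99"):
    print(table)
print(len(split_into_tables(ALL_STATS)), "tables when everything is requested")
```

      With all 17 statistics requested, the sketch yields tables of 5, 5, 5,
      and 2 columns, matching the "4 tables" mentioned in the notes.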

TITLE : A title, of one or more lines, to display.

WEIGHT  :  Weight variable.
     Weight can either be a "scale" or "replication" type.  
     Weights affect the mean, sd, and sum.

     The scale type multiplies each value by its weight, and then computes statistics.
     The replication type replicates each value according to its weight, and then
     computes statistics.

     Replication will tend to have less of an impact on the mean and sd.

     Note that the sum of weights is reported. However, depending on what
     observations are missing, the actual sum-of-weights for a given variable
     will vary.
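
     The difference between the two weight types can be sketched in Python.
     This is one plausible reading of the descriptions above, not the formulas
     STATS is known to use: in particular, the use of an n-1 denominator for
     the sd, and treating the sum of weights as the replicated sample size,
     are assumptions. The function names are made up.

```python
from math import sqrt

def scale_stats(values, weights):
    """'Scale' reading: multiply each value by its weight, then compute."""
    scaled = [w * x for x, w in zip(values, weights)]
    n = len(scaled)
    mean = sum(scaled) / n
    # Assumption: sample standard deviation (n - 1 denominator).
    sd = sqrt(sum((s - mean) ** 2 for s in scaled) / (n - 1))
    return {"SUM": sum(scaled), "MEAN": mean, "SD": sd}

def replication_stats(values, weights):
    """'Replication' reading: each value counts as if it occurred w times."""
    n = sum(weights)  # assumed size of the replicated sample
    total = sum(w * x for x, w in zip(values, weights))
    mean = total / n
    # Assumption: n - 1 denominator, with n the sum of weights.
    sd = sqrt(sum(w * (x - mean) ** 2 for x, w in zip(values, weights)) / (n - 1))
    return {"SUM": total, "MEAN": mean, "SD": sd}

values = [1.0, 2.0, 3.0]
weights = [2, 2, 2]
print("scale      :", scale_stats(values, weights))
print("replication:", replication_stats(values, weights))
```

     With equal weights of 2, the scale reading doubles the mean (4 instead
     of 2), while the replication reading leaves the mean at 2. Under these
     assumptions only the sum is the same for both types.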




	--------------------------------

IIb. FREQ


FREQ will compute a frequency table for a variable in a GAUSS dataset.
You can specify the categories, or let the program compute them.
Or you can compute a count for all unique values of this variable.


The following keyphrases are required (see GRBL2_BATCH.TXT for the details):

FILENAME : Name of the input (gauss data) file. 

 Y      : variable to compute a frequency for. 



The following are optional. 


RANGE : The categories to compute frequencies within
    	 RANGE -DEFAULT  :   Use the default ranges
         RANGE AUTO=N    :   Program divides into N ranges
         RANGE -ALL      :   Use all values (yields a count of occurrences of each value)
         RANGE n1 n2 n3  :   n1, n2, n3, ... are the category values
	     
	Note:
	     RANGE -DEFAULT ;
	and
	     RANGE 0 1 2 3 4 5 10 25 50 100 250 500 1000 2000 1000000 ;
	are the same thing.
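
	The RANGE -ALL and RANGE n1 n2 n3 forms can be sketched in Python as
	follows. The text above does not say how the category values bound each
	category, so this sketch ASSUMES each category value is an inclusive
	upper bound; STATS may define the categories differently. The function
	names and the "above" label are made up for illustration.

```python
from bisect import bisect_left
from collections import Counter

# The category values listed above as equivalent to RANGE -DEFAULT.
DEFAULT_RANGE = [0, 1, 2, 3, 4, 5, 10, 25, 50, 100,
                 250, 500, 1000, 2000, 1000000]

def freq_all(values):
    """RANGE -ALL: count occurrences of each distinct value."""
    return sorted(Counter(values).items())

def freq_ranges(values, cuts=DEFAULT_RANGE):
    """RANGE n1 n2 ...: count values per category.

    Assumption: a value falls in the first category whose value is >= it.
    Values larger than the last category are labelled "above".
    """
    counts = Counter()
    for v in values:
        i = bisect_left(cuts, v)  # index of first cut >= v
        label = cuts[i] if i < len(cuts) else "above"
        counts[label] += 1
    return counts

data = [0, 1, 7, 30, 30]
print(freq_all(data))
print(sorted(freq_ranges(data).items()))
```

	Under the stated assumption, the value 7 is counted in the category
	labelled 10, and both 30s in the category labelled 50.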


OUTPUT  : Name of output file


SELECT :  Observation selection criteria. Use 0 to select all observations.
         See GRBL2_BATCH.TXT for the details on observation selection.


TITLE : A title, of one or more lines, to display.

