Copyright (c) Hyperion Entertainment and contributors.

AmigaOS Manual: ARexx Elements of ARexx

From AmigaOS Documentation Wiki

This chapter introduces the rules and concepts that make up the ARexx programming language and explains how ARexx interprets the characters and words used in programs. The different elements that are explained include:

  • Tokens - the smallest element of the ARexx language
  • Clauses - the smallest executable unit, similar to a sentence
  • Expressions - a group of evaluated tokens
  • The Command Interface - the process by which ARexx programs communicate with ARexx-compatible applications

This chapter also includes a discussion of the ARexx execution environment. This is intended for more advanced Amiga users and includes technical details on interprocess communication.

Tokens

Tokens, the smallest distinct entities of the ARexx language, may be a single character or a series of characters. There are five categories of tokens:

  • comments
  • symbols
  • strings
  • operators
  • special characters

Comments

A comment is any group of characters beginning with the sequence /* (slash asterisk) and ending with */ (asterisk slash). Each ARexx program must begin with a comment, and each /* must have a matching */. For example:

/*This is an ARexx comment*/

Comments may be placed anywhere in a program and can even be nested within one another. For example:

/*A /*nested*/ comment*/

Insert comments throughout your program. Comments remind you and others of the program's intentions. Because the interpreter ignores comments when it scans your programs, comments do not slow down the execution of your program.

Symbols

A symbol is any group of the characters a-z, A-Z, 0-9, period (.), exclamation point (!), question mark (?), dollar sign ($), and underscore (_). Symbols are translated to uppercase as the interpreter scans the program, so the symbol MyName is equivalent to MYNAME. The four types of recognized symbols are:

Fixed symbols
A series of numeric characters that begins with a digit (0-9) or a period (.). The value of a fixed symbol is always the symbol name itself, translated to uppercase. 12345 is an example of a fixed symbol.
Simple symbols
A series of alphabetic characters that begins with a letter A-Z. "MyName" is an example of a simple symbol.
Stem symbols
A series of alphanumeric characters that ends with one period. "A." and "Stem9." are examples of stem symbols.
Compound symbols
A series of alphanumeric characters that includes one or more periods within the characters. "A.1.Index" is an example of a compound symbol.

Simple, stem, and compound symbols are called variables and may be assigned a value during the course of the program execution. If a variable has not yet been assigned a value, it is uninitialized. The value for an uninitialized variable is the variable name itself (translated to uppercase, if applicable).
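The rule for uninitialized variables can be seen in a short sketch. The variable name is illustrative only:

```rexx
/* Uninitialized symbols evaluate to their own name in uppercase */
SAY MyName            /* not yet assigned: displays MYNAME */
MyName = 'Fred'
SAY MyName            /* assigned: displays Fred */
SAY 12345             /* a fixed symbol always displays itself */
```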

Stems and compound symbols have special properties that make them useful for building arrays and lists. Stem symbols provide a way to initialize a whole class of compound symbols. A compound symbol can be regarded as having the structure stem.n1.n2...nk, where the leading name is a stem symbol and each node, n1...nk, is a fixed or simple symbol.

When an assignment is made to a stem symbol, it assigns that value to all possible compound symbols derived from the stem. Thus, the value of a compound symbol depends on the prior assignments made to itself or its associated stem.
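As a minimal sketch of stem assignment, the following program gives every element of a hypothetical stem named count. a default value and then overrides one element:

```rexx
/* One assignment to the stem initializes every compound symbol */
count. = 0
count.2 = 5
DO i = 1 TO 3
   SAY 'count.'i '=' count.i
END
```

The loop should display count.1 = 0, count.2 = 5, and count.3 = 0. Only count.2 was assigned individually; the other two take their value from the stem.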

Whenever a compound symbol appears in a program, its name is expanded by replacing each node with its current value. The value string may consist of any characters, including embedded blanks, and will not be converted to uppercase. The result of the expansion is a new name that is used in place of the compound symbol. For example, if J has the value 3 and K has the value 7, then the compound symbol A.J.K will expand to A.3.7.
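The expansion described above can be tried directly. This sketch uses the same J and K values, with an illustrative string stored in A.3.7:

```rexx
/* Compound symbol expansion: each node is replaced by its value */
j = 3
k = 7
a.3.7 = 'found it'
SAY a.j.k             /* A.J.K expands to A.3.7: displays found it */
```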

Compound symbols can be regarded as a form of associative or content-addressable memory. For example, suppose that you needed to store and retrieve a set of names and telephone numbers. The conventional approach would be to set up two arrays, NAME and NUMBER, each indexed by an integer running from one to the number of entries. A number would be looked up by scanning the name array until the given name was found, say in NAME.12, and then retrieving NUMBER.12. With compound symbols, the symbol NAME could hold the name to be retrieved, and NUMBER.NAME would then expand to the corresponding number, for example, NUMBER.CBM.

Compound symbols can also be used as conventional indexed arrays, with the added convenience that only a single assignment (to the stem) is required to initialize the entire array.

For instance, the program below uses the stems "number." and "addr." to create a computerized telephone directory.

Program 8. Phone.rexx

/* A telephone book to show compound variables. */
IF ARG() ~= 1 THEN DO
   SAY "USAGE: rx phone name"
   EXIT 5
END
/* Open window to display phone nos/addresses. */
CALL OPEN out, "con:0/0/640/60/ARexx Phonebook"
IF ~result THEN DO
   SAY "Open failure ... sorry"
   EXIT 10
END
/* Number definitions */
number. = '(not found)'
number.wsh = '(555) 001-0001'
addr. = '(not found)'
number.CBM = '(555) 002-0002'
addr.CBM = '1200 Wilson Dr., West Chester, PA, 19380'
/* (Work is done here) */
ARG name /* The name */
CALL WRITELN out, name || "'s number is" number.name
CALL WRITELN out, name || "'s address is" addr.name
CALL WRITELN out, "Press Return to exit."
CALL READLN out
EXIT

To execute the program, activate a Shell window and enter:

RX Phone cbm

A window will display the number and address assigned to CBM.

Strings

A string is any group of characters beginning and ending with a single quote (') or double quote (") delimiter. The same delimiter must be used at both ends of the string. To include the delimiter character in the string, use a double-delimiter sequence ('' or ""). For example:

"Now is the time." An example of a normal string.

'Can''t you see?' An example of a string using a double-delimiter sequence.

The value of a string is the string itself. The number of characters in the string is called its length. If the string does not contain any characters, it is called a null string.

Strings that are followed by an X or B character are classified as hex or binary strings, respectively, and must be composed of hexadecimal digits (0-9, A-F) or binary digits (0,1). For example:

'4A 3B C0'X
'00110111'B

Blanks are permitted at byte boundaries to improve readability. Hex and binary strings are convenient for specifying non-ASCII characters and machine-specific information, like addresses. They are converted immediately to the packed (machine-compressed) internal form.
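The string forms described in this section can be sketched as follows. The hex and binary values are illustrative and assume the usual ASCII character codes:

```rexx
/* String tokens */
SAY 'Can''t you see?'     /* doubled delimiter inside the string */
SAY "Can't you see?"      /* other delimiter, no doubling needed */
SAY LENGTH('')            /* a null string has length 0 */
SAY '41 42 43'X           /* hex string for the characters ABC */
SAY '01000001'B           /* binary string for the character A */
```

The first two clauses should display the same text, which shows that the choice of delimiter does not affect the value of the string.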

Operators

Operators are a combination of the following characters: ~ + - * / % = > < & | ^, as explained in this section. There are four types of operators:

  • Arithmetic operators require one or two numeric operands and produce a numeric result.
  • Concatenation operators join two strings into a single string.
  • Comparison operators require two operands and produce a Boolean (0 or 1) result.
  • Logical operators require one or two Boolean operands and produce a Boolean result.

Each operator has an associated priority that determines the order in which operations will be performed in an expression. Operators with higher priorities (8) are performed before those with lower priorities (1).

Arithmetic Operators

An important class of operands are those representing numbers. Numbers consist of the characters 0-9, a period (.), plus sign (+), minus sign (-), and blanks. To indicate exponential notation, a number may be followed by an "e" or "E" and a (signed) integer.

Both strings and symbols may be used to specify numbers. Since the language is typeless, variables do not have to be declared as numeric before use in an arithmetic operation. Instead, each value string is examined when it is used in order to verify that it represents a number. The following examples are all valid numbers:

33
" 12.3 "
0.321e12
' + 15. '

Leading and trailing blanks are permitted. Blanks may be embedded between a plus (+) or minus (-) sign and the number, but not within the number itself.

You can modify the basic precision used for arithmetic calculations while a program is executing. The number of significant figures used in arithmetic operations is determined by the Numeric Digits setting and may be modified using the NUMERIC instruction described in Chapter 4.

The number of decimal places used for a result depends on the operation and the number of decimal places in the operands. ARexx preserves trailing zeroes to indicate the precision of the result. If the total number of digits required to express a value exceeds the current Numeric Digits setting, the number is formatted in exponential notation. Two forms of exponential notation are available:

  • Scientific notation - the exponent is adjusted so that a single digit is placed to the left of the decimal point.
  • Engineering notation - the number is scaled so that the exponent is a multiple of 3 and the digits to the left of the decimal point range from 1 to 999.

Table 3-1. Arithmetic Operators
Operator                 Priority   Example     Result
+ (prefix conversion)    8          +'3.12'     3.12
- (prefix negation)      8          -"3.12"     -3.12
** (exponentiation)      7          0.5**3      0.125
* (multiplication)       6          1.5*1.50    2.250
/ (division)             6          6 / 3       2
% (integer division)     6          -8 % 3      -2
// (remainder)           6          5.1//0.2    0.1
+ (addition)             5          3.1+4.05    7.15
- (subtraction)          5          5.55 - 1    4.55
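
The operators in Table 3-1 can be tried with a short program. This is a sketch that reuses operands from the table. The last two clauses are added here to show how priority and parentheses affect the result:

```rexx
/* Arithmetic operators, using operands from Table 3-1 */
SAY 0.5 ** 3        /* exponentiation */
SAY 1.5 * 1.50      /* multiplication */
SAY -8 % 3          /* integer division */
SAY 5.1 // 0.2      /* remainder */
SAY 2 + 3 * 4       /* multiplication is performed first: 14 */
SAY (2 + 3) * 4     /* parentheses override the priority: 20 */
```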

Concatenation Operators

ARexx defines two concatenation operators. The first, identified by the operator sequence || (two vertical bars), joins two strings into a single string with no intervening blank. This type of concatenation can also be specified implicitly. When a symbol and a string are typed without any intervening spaces, ARexx behaves as if the || operator had been specified. The second concatenation operation is identified by the blank operator and joins the two operand strings with one intervening blank.

The priority of all concatenation operations is 4. Table 3-2 summarizes the different operations.

Table 3-2. Concatenation Operators
Operator   Operation               Example               Result
||         Concatenation           'why me, '||'Mom?'    why me, Mom?
Blank      Blank Concatenation     'good' 'times'        good times
none       Implied Concatenation   one'two'three         ONEtwoTHREE
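
The three forms of concatenation can be compared side by side. This sketch uses a hypothetical variable named one:

```rexx
/* The three forms of concatenation */
one = 'why me,'
SAY one || 'Mom?'       /* no blank: why me,Mom? */
SAY one 'Mom?'          /* one blank: why me, Mom? */
SAY one'Mom?'           /* implied, same as ||: why me,Mom? */
```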

Comparison Operators

ARexx supports three types of comparisons:

  • Exact comparisons - character-by-character comparison.
  • String comparisons - ignore leading blanks and add blanks to the shorter string.
  • Numeric comparisons - convert the operands to an internal numeric form using the current Numeric Digits setting and then run an arithmetic comparison.

Comparisons always result in a Boolean value. The numbers 0 and 1 are used to represent the Boolean values false and true. The use of a value other than 0 or 1 when a Boolean operand is expected will generate an error. Any number equivalent to 0 or 1, for example 0.000 or 0.1E1, is also acceptable as a Boolean value.

Except for the exact equality (==) and exact inequality (~==) operators, all comparison operators dynamically determine whether a string or numeric comparison is to be performed. A numeric comparison is performed if both operands are valid numbers. Otherwise, the operands are compared as strings.

All comparisons have a priority of 3. Table 3-3 lists the acceptable comparison operators.

Table 3-3. Comparison Operators
Operator    Operation                   Mode
==          Exact Equality              Exact
~==         Exact Inequality            Exact
=           Equality                    String/Numeric
~=          Inequality                  String/Numeric
>           Greater Than                String/Numeric
>= or ~<    Greater Than or Equal To    String/Numeric
<           Less Than                   String/Numeric
<= or ~>    Less Than or Equal To       String/Numeric
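
The difference between the three comparison modes shows up when the same operands are compared with different operators. The values in this sketch are illustrative:

```rexx
/* Exact, string and numeric comparisons */
SAY ' abc' == 'abc'     /* exact: the leading blank matters, 0 */
SAY ' abc' = 'abc'      /* string: the leading blank is ignored, 1 */
SAY '1.0' == '1'        /* exact: the characters differ, 0 */
SAY '1.0' = '1'         /* numeric: both are valid numbers, 1 */
SAY 'abc' < 'abd'       /* string comparison, 1 */
```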

Logical (Boolean) Operators

ARexx defines the four logical operations, NOT, AND, OR, and Exclusive OR, all of which require Boolean operands and produce a Boolean result. An attempt to perform a logical operation on a non-Boolean operand will generate an error. Table 3-4 shows the acceptable logical operators.

Table 3-4. Logical Operators
Operator   Priority   Operation
~          8          NOT (Inversion)
&          2          AND
|          1          OR
^ or &&    1          Exclusive OR
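
A brief sketch of the logical operators follows, using two hypothetical Boolean variables. The final clause relies on the priorities listed in Table 3-4:

```rexx
/* Logical operators work on the Boolean values 0 and 1 */
a = 1
b = 0
SAY ~a            /* NOT: 0 */
SAY a & b         /* AND: 0 */
SAY a | b         /* OR: 1 */
SAY a ^ b         /* Exclusive OR: 1 */
SAY a && b        /* Exclusive OR, alternate form: 1 */
SAY ~a | b & a    /* ~ first, then &, then |: 0 */
```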

Special Characters

A few punctuation characters have special meanings within ARexx, as shown in Table 3-5.

Table 3-5. Special Characters
Special Character   Definition
(:) Colon           A colon defines a label when preceded by a symbol token (any alphanumeric character or . ! ? $ _).
( ) Parentheses     Parentheses are used to group operators and operands into subexpressions to override the normal operator priorities. An open parenthesis also serves to identify a function call within an expression. A symbol or string followed immediately by an open parenthesis defines a function name. Parentheses must always be balanced within a statement.
(;) Semicolon       A semicolon acts as a statement terminator. Several statements may be placed on one line if they are separated by semicolons.
(,) Comma           A comma acts as the continuation character for statements broken into several lines and as a separator of argument expressions in a function call.

Clauses

Clauses, the smallest language unit that can be executed as a statement, are formed from token groupings.

As the program is read, the language interpreter splits it into groups of clauses. Each group of one or more clauses is then broken down into tokens, and each clause is classified as a particular type. Seemingly small syntactic differences may completely change the semantic content of a statement. For example:

SAY 'Hello, Bill'

is an instruction clause and will display "Hello, Bill" on the console, but:

''SAY 'Hello, Bill'

is a command clause, and will issue "SAY Hello, Bill" as a command to an external program. The presence of the leading null string ('') changes the classification from an instruction clause to a command clause.

The end of a line normally acts as the implicit end of a clause. A clause can be continued on the next line by ending the line with a comma. The comma is ignored by the program, and the next line is considered as a continuation of the clause. There is no limit to the number of continuations that may occur (except for those limits imposed by the command buffer).
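
The following sketch shows both ways of changing where a clause ends: semicolons to place several clauses on one line, and a trailing comma to continue one clause across two lines. The variable names are illustrative:

```rexx
/* Semicolons separate clauses; a trailing comma continues one */
a = 1; b = 2; SAY a + b          /* three clauses on one line: 3 */
SAY 'This clause is continued',
    'on the next line'
```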

String and comment tokens are automatically continued if a line ends before the closing delimiter has been found, and the newline (i.e., enter) character is not considered to be part of the token.

Null Clauses

Null clauses are lines of blanks or comments and may appear anywhere in a program. They have no function in the execution of a program, except to aid its readability and to increment the line count.

Label Clauses

A label clause is a symbol followed by a colon (:). A label acts as a place marker in the program, but no action occurs with the execution of a label. The colon is considered as an implicit clause terminator, so each label stands as a separate clause. Label clauses may appear anywhere in a program. For example:

start: /*Begin execution*/
syntax: /*Error processing*/

Assignment Clauses

Assignment clauses are identified by a variable symbol followed by an = operator. (In this context the = operator's normal definition of equality comparison is overridden.) The tokens to the right of the = are evaluated as an expression and the result is assigned to the variable. For example:

when = 'Now is the time'
answ = 3.14 * fact(5)

The equal sign (=) assigns the value 'Now is the time' to the variable 'when', and assigns the result of 3.14 * fact(5) to the variable 'answ'.
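
The manual does not define fact. The sketch below supplies a hypothetical internal function of that name, a simple factorial loop, so that the second assignment can be run. Note that the open parenthesis follows the function name immediately, with no blank in between:

```rexx
/* Assignment clause; fact is a hypothetical internal function */
answ = 3.14 * fact(5)
SAY answ
EXIT

/* Returns the factorial of its argument (assumed implementation) */
fact: PROCEDURE
   ARG n
   product = 1
   DO i = 2 TO n
      product = product * i
   END
   RETURN product
```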

Instruction Clauses

Instruction clauses begin with the name of the instruction and tell ARexx to perform an action. Instruction names are described in Chapter 4. For example:

DROP a b c
SAY 'please'
IF j > 5 THEN LEAVE;

Command Clauses

Command clauses are any ARexx expression that cannot be classified as one of the preceding types of clauses. The expression is evaluated and the result is issued as a command to an external host. For example:

'delete' 'myfile' /*AmigaDOS command*/
'jump' current+10 /*An editor command*/

The delete command is not recognized as an ARexx instruction, so it is sent to the external host, in this case AmigaDOS. The jump command in the second example is presumably understood by an external text editor.
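
As a hedged sketch of a command clause, the program below builds an AmigaDOS command from a string and a variable. It assumes that the ADDRESS instruction (see Chapter 4) is used to select AmigaDOS as the host and that the RC variable holds the return code of the command. The directory name is illustrative:

```rexx
/* Command clauses are sent to the current host address */
ADDRESS COMMAND           /* direct commands to AmigaDOS */
dirname = 'ram:'          /* hypothetical directory to list */
'list' dirname            /* evaluates to: list ram: */
SAY 'return code:' rc     /* RC is set by the command */
```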

Expressions

Expressions are a group of evaluated tokens. Most statements contain at least one expression. Expressions are composed of:

  • Strings - The value of a string is the string itself.
  • Symbols - The value of a fixed symbol is the symbol itself, translated to uppercase. Symbols may be used as variables and may have an assigned value.
  • Operators - Operators have a priority order that determines when each will be performed.
  • Parentheses - Parentheses may be used to alter the normal order of evaluation in the expression or to identify function calls. A symbol or string followed immediately by an open parenthesis defines the function name, and the tokens between the opening and closing parenthesis form the argument list for the function.

For example, the expression:

J 'factorial is' fact(J)

is composed of:

  • a symbol - J
  • a blank operator
  • a string - factorial is
  • another blank operator
  • a symbol - fact
  • an open parenthesis
  • a symbol - J
  • a closing parenthesis

In this example, FACT is a function name and (J) is its argument list, the single expression J.

Before the evaluation of an expression proceeds, ARexx must obtain a value for each symbol in the expression. For fixed symbols the value is the symbol name itself, but variable symbols must be looked up in the current symbol table. In the example above, if the symbol J was assigned the value 3, the expression after symbol resolution would be:

3 'factorial is' FACT(3)

To avoid ambiguities in the values assigned to symbols during the resolution process, ARexx guarantees a strict left-to-right resolution order. Symbol resolution proceeds irrespective of operator priority or parenthetical grouping. If a function call is found, the resolution is suspended while the function is evaluated. It is possible for the same symbol to have more than one value in an expression.

If the previous example was rearranged to read:

FACT(J) 'is' J 'factorial'

would the second occurrence of symbol J still resolve to 3? In general, function calls may have side effects that include altering the values of variables. If the example was rearranged, the value of J might have been changed by the call to FACT.
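
This effect can be made visible with a sketch. The function bump is hypothetical: it is written without PROCEDURE so that it shares the variable J with its caller and changes it as a side effect:

```rexx
/* A function call can change a variable that is resolved later */
j = 3
SAY j 'then' bump()     /* J is resolved before the call */
j = 3
SAY bump() 'then' j     /* J is resolved after the call */
EXIT

/* Hypothetical function that increments the shared variable J */
bump:
   j = j + 1
   RETURN j
```

Under the left-to-right resolution rule, the first SAY should display 3 then 4 and the second should display 4 then 4.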

After all symbol values have been resolved, the expression is evaluated based on operator priority and subexpression grouping. ARexx does not guarantee an order of evaluation among operators of equal priority and does not employ a "fast path" evaluation of Boolean operations. For example, in the expression:

(1 = 2) & (FACT(3) = 6)

the call to the FACT function will be made even though the first term of the AND (&) operation is 0. ARexx evaluates both operands from left to right, although the first operand alone already determines that the result is 0.
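
A minimal sketch of this behavior follows. The function check is hypothetical and announces when it is called:

```rexx
/* Both operands of & are evaluated, even when the first is 0 */
SAY (1 = 2) & check()
EXIT

/* Hypothetical function that reports being called */
check:
   SAY 'check was called'
   RETURN 1
```

The message from check should appear before the result 0 is displayed. To skip an expensive or side-effecting call, test the first condition in its own IF instruction and make the call only inside it.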