Copyright (c) 2012-2016 Hyperion Entertainment and contributors.
AmigaOS Manual: ARexx Elements of ARexx
This chapter introduces the rules and concepts that make up the ARexx programming language and explains how ARexx interprets the characters and words used in programs. The different elements that are explained include:
- Tokens - the smallest element of the ARexx language
- Clauses - the smallest executable unit, similar to a sentence
- Expressions - a group of evaluated tokens
- The Command Interface - the process by which ARexx programs communicate with ARexx-compatible applications
This chapter also includes a discussion of the ARexx execution environment. This is intended for more advanced Amiga users and includes technical details on interprocess communication.
- 1 Tokens
- 2 Clauses
- 3 Expressions
- 4 The Command Interface
- 5 The Execution Environment
Tokens, the smallest distinct entities of the ARexx language, may be a single character or a series of characters. There are five categories of tokens:
- comments
- symbols
- strings
- operators
- special characters
A comment is any group of characters beginning with the sequence /* (slash asterisk) and ending with */ (asterisk slash). Each /* must have a matching */. For example:
/*This is an ARexx comment*/
Comments may be placed anywhere in a program and can even be nested within one another. For example:
/*A /*nested*/ comment*/
Insert comments throughout your program. Comments remind you and others of the program's intentions. Because the interpreter ignores comments when it scans your programs, comments do not slow down the execution of your program.
|Each ARexx program must begin with a comment.|
A symbol is any group of the characters a-z, A-Z, 0-9, and period (.), exclamation point (!), question mark (?), dollar sign ($), and underscore (_). Symbols are translated to uppercase as the interpreter scans the program, so the symbol MyName is equivalent to MYNAME. The four types of recognized symbols are:
- Fixed symbols
- A series of numeric characters that begins with a digit (0-9) or a period (.). The value of a fixed symbol is always the symbol name itself, translated to uppercase. 12345 is an example of a fixed symbol.
- Simple symbols
- A series of alphanumeric characters that begins with a letter A-Z and contains no periods. "MyName" is an example of a simple symbol.
- Stem symbols
- A series of alphanumeric characters that ends with one period. "A." and "Stem9." are examples of stem symbols.
- Compound symbols
- A series of alphanumeric characters that includes one or more periods within the characters. "A.1.Index" is an example of a compound symbol.
Simple, stem, and compound symbols are called variables and may be assigned a value during the course of the program execution. If a variable has not yet been assigned a value, it is uninitialized. The value for an uninitialized variable is the variable name itself (translated to uppercase, if applicable).
Stems and compound symbols have special properties that make them useful for building arrays and lists. Stem symbols provide a way to initialize a whole class of compound symbols. A compound symbol can be regarded as having the structure stem.n1.n2...nk, where the leading name is a stem symbol and each node, n1...nk, is a fixed or simple symbol.
When an assignment is made to a stem symbol, it assigns that value to all possible compound symbols derived from the stem. Thus, the value of a compound symbol depends on the prior assignments made to itself or its associated stem.
Whenever a compound symbol appears in a program, its name is expanded by replacing each node with its current value. The value string may consist of any characters, including embedded blanks, and will not be converted to uppercase. The result of the expansion is a new name that is used in place of the compound symbol. For example, if J has the value 3 and K has the value 7, then the compound symbol A.J.K will expand to A.3.7.
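The stem assignment and name expansion described above can be sketched in a few lines. The variable names are illustrative only, and the comments show the output these rules imply:

```rexx
/* Sketch: stem defaults and compound-symbol expansion */
a. = 'empty'        /* assigning to the stem gives every A.* this value */
j = 3
k = 7
a.j.k = 'found'     /* nodes J and K are replaced, so the name becomes A.3.7 */
SAY a.3.7           /* found */
SAY a.1.1           /* empty (never assigned, so the stem value applies) */
```

Because the expansion uses the current value of each node, changing J or K afterwards makes A.J.K refer to a different entry.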
Compound symbols can be regarded as a form of associative or content-addressable memory. For example, suppose that you needed to store and retrieve a set of names and telephone numbers. The conventional approach would be to set up two arrays, NAME and NUMBER, each indexed by an integer running from one to the number of entries. A number would be looked up by scanning the name array until the given name was found, say in NAME.12, and then retrieving NUMBER.12. With compound symbols, the symbol NAME could hold the name to be retrieved, and NUMBER.NAME would then expand to the corresponding number, for example, NUMBER.CBM.
Compound symbols can also be used as conventional indexed arrays, with the added convenience that only a single assignment (to the stem) is required to initialize the entire array.
For instance, the program below uses the stems "number." and "addr." to create a computerized telephone directory.
Program 8. Phone.rexx
/*A telephone book to show compound variables.*/
IF ARG() ~= 1 THEN DO
    SAY "USAGE: rx phone name"
    EXIT 5
END
/*Open window to display phone nos/addresses.*/
CALL OPEN out, "con:0/0/640/60/ARexx Phonebook"
IF ~result THEN DO
    SAY "Open failure ... sorry"
    EXIT 10
END
/*Number definitions*/
number. = '(not found)'
number.wsh = '(555) 001-0001'
number.CBM = '(555) 002-0002'
/*Address definitions*/
addr. = '(not found)'
addr.CBM = '1200 Wilson Dr., West Chester, PA, 19380'
/*(Work is done here)*/
ARG name /*The name*/
CALL WRITELN out, name || "'s number is" number.name
CALL WRITELN out, name || "'s address is" addr.name
CALL WRITELN out, "Press Return to exit."
CALL READLN out
EXIT
To execute the program, activate a Shell window and enter:
RX Phone cbm
A window will display the number and address assigned to CBM.
A string is any group of characters beginning and ending with a quote (') or double quote (") delimiter. The same delimiter must be used at both ends of the string. To include the delimiter character in the string, use a double-delimiter sequence ('' or ""). For example:
|"Now is the time."||An example of a normal string.|
|'Can''t you see?'||An example of a string using a double-delimiter sequence.|
The value of a string is the string itself. The number of characters in the string is called its length. If the string does not contain any characters, it is called a null string.
Strings that are followed by an X or B character are classified as hex or binary strings, respectively, and must be composed of hexadecimal digits (0-9, A-F) or binary digits (0,1). For example:
'4A 3B C0'X '00110111'B
Blanks are permitted at byte boundaries to improve readability. Hex and binary strings are convenient for specifying non-ASCII characters and machine-specific information, like addresses. They are converted immediately to the packed (machine-compressed) internal form.
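As a minimal sketch of the string forms described above, the lines below use only SAY and the built-in functions LENGTH() and C2X(). The values are chosen for illustration and assume the ASCII character set:

```rexx
/* Sketch: delimiters, null strings, hex and binary strings */
SAY 'Can''t you see?'     /* Can't you see? */
SAY LENGTH('')            /* 0: a null string has no characters */
SAY '48 69'X              /* Hi (hex 48 and 69 are ASCII "H" and "i") */
SAY '01000001'B           /* A (the same byte as '41'X) */
SAY C2X('Hi')             /* 4869 */
```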
Operators are a combination of the following characters: ~ + - * / = > < & | ^, as explained in this section. There are four types of operators:
- Arithmetic operators require one or two numeric operands and produce a numeric result.
- Concatenation operators join two strings into a single string.
- Comparison operators require two operands and produce a Boolean (0 or 1) result.
- Logical operators require one or two Boolean operands and produce a Boolean result.
Each operator has an associated priority that determines the order in which operations will be performed in an expression. Operators with higher priorities (8) are performed before those with lower priorities (1).
An important class of operands is that representing numbers. Numbers consist of the characters 0-9, a period (.), plus sign (+), minus sign (-), and blanks. To indicate exponential notation, a number may be followed by an "e" or "E" and a (signed) integer.
Both strings and symbols may be used to specify numbers. Since the language is typeless, variables do not have to be declared as numeric before use in an arithmetic operation. Instead, each value string is examined when it is used in order to verify that it represents a number. The following examples are all valid numbers:
33 " 12.3 " 0.321e12 ' + 15. '
Leading and trailing blanks are permitted. Blanks may be embedded between a plus (+) or minus (-) sign and the number, but not within the number itself.
You can modify the basic precision used for arithmetic calculations while a program is executing. The number of significant figures used in arithmetic operations is determined by the Numeric Digits setting and may be modified using the NUMERIC instruction described in Chapter 4.
The number of decimal places used for a result depends on the operation and the number of decimal places in the operands. ARexx preserves trailing zeroes to indicate the precision of the result. If the total number of digits required to express a value exceeds the current Numeric Digits setting, the number is formatted in exponential notation. Two exponential formats are available:
- Scientific notation - the exponent is adjusted so that a single digit is placed to the left of the decimal point.
- Engineering notation - the number is scaled so that the exponent is a multiple of 3 and the digits to the left of the decimal point range from 1 to 999.
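The following sketch shows how a program might change the precision and the exponential format while it runs. It assumes the NUMERIC DIGITS and NUMERIC FORM instructions covered in Chapter 4. The exact output format is not shown because it depends on the interpreter:

```rexx
/* Sketch: changing precision and exponential format at run time */
NUMERIC DIGITS 3            /* keep only 3 significant figures */
SAY 12345678 + 0            /* needs more than 3 digits, so exponential notation is used */
NUMERIC FORM ENGINEERING    /* switch from scientific to engineering notation */
SAY 12345678 + 0            /* exponent is now a multiple of 3 */
NUMERIC FORM SCIENTIFIC
NUMERIC DIGITS 9            /* a wider setting for ordinary arithmetic */
SAY 12345678 + 0            /* fits within 9 digits, so no exponent is needed */
```

The `+ 0` forces an arithmetic operation. A value that is only displayed, never computed, is not reformatted.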
|Operator||Priority||Example||Result|
|+ (prefix conversion)||8||'3.12'||3.12|
|- (prefix negation)||8||-"3.12"||-3.12|
|/ (division)||6||6 / 3||2|
|% (integer division)||6||-8 % 3||-2|
|- (subtraction)||5||5.55 - 1||4.55|
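To see how priorities decide the order of evaluation, here is a short sketch that uses only the operators listed above. The operand values are arbitrary:

```rexx
/* Sketch: operator priority in arithmetic expressions */
SAY 10 - 4 / 2       /* 8: division (priority 6) is performed before subtraction (5) */
SAY (10 - 4) / 2     /* 3: parentheses override the normal priorities */
SAY -8 % 3           /* -2: prefix negation (8) is applied before integer division (6) */
SAY -"3.12"          /* -3.12: a string that represents a number can be an operand */
```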
ARexx defines two concatenation operators. The first, identified by the operator sequence || (two vertical bars), joins two strings into a single string with no intervening blank. This type of concatenation can also be specified implicitly. When a symbol and a string are typed without any intervening spaces, ARexx behaves as if the || operator had been specified. The second concatenation operation is identified by the blank operator and joins the two operand strings with one intervening blank.
The priority of all concatenation operations is 4. Table 3-2 summarizes the different operations.
|Operator||Operation||Example||Result|
| || ||Concatenation||'why me, ' || 'Mom?'||why me, Mom?|
|Blank||Blank Concatenation||'good' 'times'||good times|
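A brief sketch of the three ways to concatenate (explicit, implicit, and blank), using a made-up variable named who:

```rexx
/* Sketch: explicit, implicit, and blank concatenation */
who = 'Mom'
SAY 'why me, ' || who     /* why me, Mom  (explicit, no blank added) */
SAY who'?'                /* Mom?         (implicit: symbol directly followed by a string) */
SAY 'good' 'times'        /* good times   (blank operator adds one blank) */
SAY 'total:' 2 + 3        /* total: 5     (addition has a higher priority than concatenation) */
```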
ARexx supports three types of comparisons:
- Exact comparisons - character-by-character comparison.
- String comparisons - ignore leading blanks and add blanks to the shorter string.
- Numeric comparisons - convert the operands to an internal numeric form using the current Numeric Digits setting and then run an arithmetic comparison.
Comparisons always result in a Boolean value. The numbers 0 and 1 are used to represent the Boolean values false and true. The use of a value other than 0 or 1 when a Boolean operand is expected will generate an error. Any number equivalent to 0 or 1, for example 0.000 or 0.1E1, is also acceptable as a Boolean value.
Except for the exact equality (==) and exact inequality (~==) operators, all comparison operators dynamically determine whether a string or numeric comparison is to be performed. A numeric comparison is performed if both operands are valid numbers. Otherwise, the operands are compared as strings.
All comparisons have a priority of 3. Table 3-3 lists the acceptable comparison operators.
|Operator||Operation||Mode|
|>= or ~<||Greater Than or Equal To||String/Numeric|
|<= or ~>||Less Than or Equal To||String/Numeric|
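The difference between exact, string, and numeric comparisons is easiest to see side by side. This sketch uses literal operands only. The results follow from the rules above:

```rexx
/* Sketch: how the comparison mode is chosen */
SAY 5 = 5.0             /* 1: both operands are numbers, so the comparison is numeric */
SAY 5 == 5.0            /* 0: exact comparison, and the characters differ */
SAY 'abc' = 'abc  '     /* 1: string comparison pads the shorter operand with blanks */
SAY 'abc' == 'abc  '    /* 0: exact comparison sees the trailing blanks */
SAY '10' > '9'          /* 1: both are valid numbers, so 10 is compared with 9 */
SAY '10a' > '9a'        /* 0: not numbers, so compared as strings ("1" sorts before "9") */
```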
Logical (Boolean) Operators
ARexx defines the four logical operations, NOT, AND, OR, and Exclusive OR, all of which require Boolean operands and produce a Boolean result. An attempt to perform a logical operation on a non-Boolean operand will generate an error. Table 3-4 shows the acceptable logical operators.
|Operator||Priority||Operation|
|^ or &&||1||Exclusive OR|
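A minimal sketch of the four logical operations on Boolean values, with parentheses used so that the grouping is explicit:

```rexx
/* Sketch: logical operators on Boolean (0 or 1) operands */
SAY ~1                      /* 0: NOT */
SAY 1 & 0                   /* 0: AND */
SAY 1 | 0                   /* 1: OR */
SAY 1 ^ 1                   /* 0: Exclusive OR */
SAY 1 && 0                  /* 1: Exclusive OR, alternate spelling */
SAY (3 > 2) & ~(1 = 2)      /* 1: comparisons supply the Boolean operands */
```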
A few punctuation characters have special meanings within ARexx, as shown in Table 3-5.
|(:) Colon||A colon defines a label when preceded by a symbol token (any alphanumeric character or . ! ? $).|
|( ) Parentheses||Parentheses are used to group operators and operands into subexpressions to override the normal operator priorities. An open parenthesis also serves to identify a function call within an expression. A symbol or string followed immediately by an open parenthesis defines a function name. Parentheses must always be matched within a statement.|
|(;) Semicolon||A semicolon acts as a statement terminator. Several statements that fit on one line may be separated by semicolons.|
|(,) Comma||A comma acts as the continuation character for statements broken into several lines and as a separator of argument expressions in a function call.|
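The special characters can be seen together in a short sketch. The variable and label names are invented for the example. MAX() is an ARexx built-in function:

```rexx
/* Sketch: special characters in use */
a = 1; b = 2             /* a semicolon separates two clauses on one line */
/* a trailing comma continues the clause onto the next line */
total = a + b +,
        3
SAY MAX(a, b, total)     /* 6: parentheses mark a function call, commas separate arguments */
done:                    /* a colon after a symbol defines a label */
EXIT
```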
Clauses, the smallest language unit that can be executed as a statement, are formed from token groupings.
As the program is read, the language interpreter splits the program into groups of clauses. These groups of one or more clauses are then broken down into tokens and each clause is classified as a particular type. Seemingly small syntactic differences may completely change the semantic content of a statement. For example:
SAY 'Hello, Bill'
is an instruction clause and will display "Hello, Bill" on the console, but:
''SAY 'Hello, Bill'
is a command clause, and will issue "SAY Hello, Bill" as a command to an external program. The presence of the leading null string ('') changes the classification from an instruction clause to a command clause.
The end of a line normally acts as the implicit end of a clause. A clause can be continued on the next line by ending the line with a comma. The comma is ignored by the program, and the next line is considered as a continuation of the clause. There is no limit to the number of continuations that may occur (except for those limits imposed by the command buffer).
String and comment tokens are automatically continued if a line ends before the closing delimiter has been found, and the newline (i.e., Enter) character is not considered to be part of the token.
Null clauses are lines of blanks or comments and may appear anywhere in a program. They have no function in the execution of a program, except to aid its readability and to increment the line count.
A label clause is a symbol followed by a colon (:). A label acts as a place marker in the program, but no action occurs with the execution of a label. The colon is considered as an implicit clause terminator, so each label stands as a separate clause. Label clauses may appear anywhere in a program. For example:
start: /*Begin execution*/
syntax: /*Error processing*/
Assignment clauses are identified by a variable symbol followed by an = operator. (In this context the = operator's normal definition of equality comparison is overridden.) The tokens to the right of the = are evaluated as an expression and the result is assigned to the variable. For example:
When = 'Now is the time'
answ = 3.14 * fact(5)
The equal sign (=) assigns the value 'Now is the time' to the variable 'when', and assigns the result of 3.14 * fact(5) to the variable 'answ'.
Instruction clauses begin with the name of the instruction and tell ARexx to perform an action. Instruction names are described in Chapter 4. For example:
DROP a b c
SAY 'please'
IF j > 5 THEN LEAVE;
Command clauses are any ARexx expression that cannot be classified as one of the preceding types of clauses. The expression is evaluated and the result is issued as a command to an external host. For example:
'delete' 'myfile' /*AmigaDOS command*/
'jump' current+10 /*An editor command*/
The delete command is not recognized as an ARexx command, so it is sent to the external host, in this case AmigaDOS. The jump command in the second example is assumed to be understood by an external text editor.
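Putting the clause types side by side, a small program might look like the sketch below. The label and variable names are invented for illustration, and the last clause assumes the AmigaDOS echo command is available:

```rexx
/* Sketch: one clause of each type (this comment line is a null clause) */
start:                      /* label clause */
count = 2 + 3               /* assignment clause */
SAY 'count is' count        /* instruction clause: displays "count is 5" */
ADDRESS COMMAND             /* instruction clause: later commands go to AmigaDOS */
'echo' count                /* command clause: "echo 5" is sent to the host */
EXIT
```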
Expressions are a group of evaluated tokens. Most statements contain at least one expression. Expressions are composed of:
- Strings - The value of a string is the string itself.
- Symbols - The value of a fixed symbol is the symbol itself, translated to uppercase. Symbols may be used as variables and may have an assigned value.
- Operators - An operator has a priority order that determines when it will be performed.
- Parentheses - Parentheses may be used to alter the normal order of evaluation in the expression or to identify function calls. A symbol or string followed immediately by an open parenthesis defines the function name, and the tokens between the opening and closing parenthesis form the argument list for the function. For example, the expression:
J 'factorial is' fact(J)
is composed of:
- a symbol - J
- a blank operator
- a string - factorial is
- another blank
- a symbol - fact
- an open parenthesis
- a symbol - J
- a closing parenthesis
In this example, FACT is a function name and (J) is its argument list, the single expression J.
Before the evaluation of an expression proceeds, ARexx must obtain a value for each symbol in the expression. For fixed symbols the value is the symbol name itself, but variable symbols must be looked up in the current symbol table. In the example above, if the symbol J was assigned the value 3, the expression after symbol resolution would be:
3 'factorial is' FACT(3)
To avoid ambiguities in the values assigned to symbols during the resolution process, ARexx guarantees a strict left-to-right resolution order. Symbol resolution proceeds irrespective of operator priority or parenthetical grouping. If a function call is found, the resolution is suspended while the function is evaluated. It is possible for the same symbol to have more than one value in an expression.
If the previous example was rearranged to read:
FACT(J) 'is' J 'factorial'
would the second occurrence of symbol J still resolve to 3? In general, function calls may have side effects that include altering the values of variables. If the example was rearranged, the value of J might have been changed by the call to FACT.
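A sketch of such a side effect follows. The function bump is hypothetical. Because it is declared without PROCEDURE, it shares the caller's variables and can change J. The comments describe the output expected under the left-to-right resolution rule stated above:

```rexx
/* Sketch: a function call that changes a variable during symbol resolution */
j = 3
SAY j 'then' bump()      /* J is resolved first: should display "3 then 4" */
j = 3
SAY bump() 'then' j      /* bump() runs first: should display "4 then 4" */
EXIT

bump:                    /* no PROCEDURE, so J here is the caller's J */
    j = j + 1
    RETURN j
```

Copying a variable to a temporary before the expression avoids depending on this ordering.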
After all symbol values have been resolved, the expression is evaluated based on operator priority and subexpression grouping. ARexx does not guarantee an order of evaluation among operators of equal priority and does not employ a "fast path" evaluation of Boolean operations. For example, in the expression:
(1 = 2) & (FACT(3) = 6)
the call to the FACT function will be made even though the first term of the AND (&) operation is 0. ARexx evaluates both operands regardless: the first comparison is already false, so the expression returns 0 whatever FACT returns.
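Because both operands are always evaluated, a test that must be skipped when an earlier one fails has to be written as a nested IF. A minimal sketch, with a hypothetical function named noisy that announces each call:

```rexx
/* Sketch: Boolean operators do not short-circuit */
IF (1 = 2) & noisy() THEN SAY 'not reached'   /* noisy() is still called */

/* Nested IF: the inner test runs only when the outer one is true */
IF 1 = 2 THEN
    IF noisy() THEN SAY 'not reached'         /* noisy() is not called here */
EXIT

noisy:
    SAY 'noisy() was called'
    RETURN 1
```

Run as written, the message should appear once, from the first IF only.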
The Command Interface
The ARexx command interface is a public message port. ARexx-compatible applications must have this message port. ARexx programs issue commands by placing the command string in a message packet and sending the packet to the host's message port. The program suspends operation while the host processes the commands and resumes when the message packet returns.
The Host Address
ARexx maintains two implicit host addresses, a current and a previous value, as part of the program's storage environment. These values can be changed at any time using the ADDRESS instruction (or its synonym, SHELL). The current host address can be inspected with the ADDRESS() built-in function. The default host address string is REXX, but this can be overridden when a program is invoked. Most host applications will supply the name of their public port when they invoke a macro program, so that the macro can automatically issue commands back to the host.
One special host address is recognized. The string COMMAND indicates that the command should be issued directly to AmigaDOS. All other host addresses are assumed to refer to a public message port. An attempt to send a command to a nonexistent message port will generate the syntax error "Host environment not found".
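The sketch below shows one way to inspect and switch host addresses. It assumes the program is started from a Shell, and the port name Ed is only an example of an application port that may or may not exist on a given system:

```rexx
/* Sketch: inspecting and changing the host address */
SAY ADDRESS()                        /* current host; REXX unless the caller supplied one */
ADDRESS COMMAND                      /* make AmigaDOS the current host */
'echo "sent to AmigaDOS"'
ADDRESS                              /* swap the current and previous host addresses */
SAY ADDRESS()                        /* back to the original host */
ADDRESS COMMAND 'echo "one-off"'     /* send a single command without changing the host */

port = 'Ed'                          /* example port name, case sensitive */
IF SHOW('P', port) THEN ADDRESS VALUE port
ELSE SAY 'No port named' port
```

Checking with SHOW('P', ...) before addressing a port avoids the "Host environment not found" error.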
Program 9 shows the interaction between ARexx and the AmigaDOS editor, ED. The program sees if ED is running, determines the name of the message port, and sets up some stem variables.
Program 9. ED-status.rexx
/*Prints status of ED. ED must be running before this program is started.*/
/*ED ports are named 'Ed', 'Ed_1', 'Ed_2', and so on.*/
DEFAULT_ED = "Ed" /*This name is case sensitive*/
/*Procedure to follow if ED isn't running, or if only a second*/
/*(or later) instance of ED is running.*/
DO WHILE ~SHOW( 'p', DEFAULT_ED ) /*Look for port*/
    SAY "Cannot find port named" DEFAULT_ED
    SAY "Available ports:"
    SAY SHOW( 'P' ) '0a'X
    /*Let user choose port if we can't find it*/
    SAY "Enter different name for port, or QUIT to quit "
    DEFAULT_ED = READLN( stdout )
    IF STRIP( UPPER( DEFAULT_ED ) ) = 'QUIT' THEN EXIT 10 /*Let user quit*/
END
SAY "Using ED port" DEFAULT_ED
/*Now that port is found, have ARexx address it.*/
ADDRESS VALUE DEFAULT_ED
/*Set up some useful stem variables*/
STEM.0 = 15 /*Number of ED ARexx variables*/
STEM.1 = 'LEFT' /*Left margin (SL)*/
STEM.2 = 'RIGHT' /*Right margin (SR)*/
STEM.3 = 'TABSTOP' /*Tab stop setting (ST)*/
STEM.4 = 'LMAX' /*Max visible line on screen*/
STEM.5 = 'WIDTH' /*Width of screen in chars*/
STEM.6 = 'X' /*Physical X pos. on screen-from 1*/
STEM.7 = 'Y' /*Physical Y pos. on screen-from 1*/
STEM.8 = 'BASE' /*Window base*/
/*(Base is 0 unless screen is shifted right)*/
STEM.9 = 'EXTEND' /*Extended margin value (EX)*/
STEM.10 = 'FORCECASE' /*Case sensitivity flag*/
STEM.11 = 'LINE' /*Current line number*/
STEM.12 = 'FILENAME' /*File being edited*/
STEM.13 = 'CURRENT' /*Text of current line*/
STEM.14 = 'LASTCMD' /*Last extended command*/
STEM.15 = 'SEARCH' /*Last search string*/
/*Ask ED to put values into stem variable 'STEM.'*/
'RV' '/STEM/' /*RV is an ED command used to send info from ED to ARexx*/
/*STEM.1 is LEFT, and STEM.LEFT now holds a value from ED.*/
/*Here is a way to print that information.*/
DO i = 1 TO STEM.0
    ED_VAR = STEM.i
    SAY STEM.i "=" STEM.ED_VAR /*Print ED variable/value*/
END
Creating a Macro
ARexx can be used to write programs for any host application that includes a compatible command interface. Some application programs are designed with an embedded macro language and may include many pre-defined macro commands.
Check your macro program for "shortcut" commands. Some programs may include powerful functions that were implemented specifically for use in macro programs.
The interpretation of the received commands depends entirely on the host application. In the simplest case, the command strings will correspond exactly to commands that could be entered directly by a user. For example, positional control (up/down) commands for a text editor would probably have identical interpretations. Other commands may be valid only when issued from a macro program. A command to simulate a menu operation would probably not be entered by the user. In Program 10, the ARexx program is called by ED to transpose two characters.
Program 10. Transpose.rexx
/*Given string '123', if cursor is on 3, macro converts string to '213'.*/
HOST = ADDRESS() /*Find out which ED called us*/
ADDRESS VALUE HOST /*. . . and talk to it.*/
'rv' '/CURR/' /*Have ED put info in stem CURR*/
/*We'll need two pieces of information:*/
currpos = CURR.X /*Position of cursor on line*/
currlin = CURR.CURRENT /*Contents of current line*/
IF ( currpos > 2 ) THEN
    currpos = currpos - 1 /*Must work on current line*/
ELSE DO /*Report error and exit*/
    'sm /Cursor must be at pos. 2 or more to the right/'
    EXIT 10
END
/*Need to reverse the CURRPOSth and CURRPOSth-1 chars and*/
/*replace the current line with the new one.*/
DROP CURR. /*STEM variable CURR is no longer needed; save some memory*/
'd' /*Tell ED to delete current line*/
currlin = swapch( currpos, currlin ) /*Swap 2 chars*/
'i /'||currlin||'/' /*Insert modified line*/
DO i = 1 TO currpos /*Place cursor back at start*/
    'cr' /*ED's 'cursor right' command*/
END
EXIT /*All done*/
/*Function to swap two characters*/
swapch: procedure
    PARSE ARG cpos,clin
    chl = SUBSTR( clin, cpos, 1 ) /*Get character*/
    clin = DELSTR( clin, cpos, 1 ) /*Delete it from string*/
    clin = INSERT( chl, clin, cpos - 2, 1 ) /*Insert to create transposition*/
    RETURN clin /*Return modified string*/
To execute this example from ED, press ESC then enter:
You can also assign this string to a function key.
After it finishes processing a command, the host replies with a return code to indicate the status of the command. The documentation for the host application should describe the possible return codes for each command. These codes can be used to determine whether the operation performed by the command was successful.
This return code is placed in the ARexx special variable RC so that it can be examined by the macro. A value of zero means that the command was successful. A return of a positive integer indicates an error condition. The higher the integer, the more severe the error. The return code allows the macro program to determine whether the command succeeded and to take action if it failed.
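A macro might test RC immediately after each command, as in this sketch. The file name is hypothetical, and the meaning of each nonzero code depends on the host, here AmigaDOS:

```rexx
/* Sketch: checking the return code of a command */
ADDRESS COMMAND
'delete ram:no_such_file'             /* hypothetical file that may not exist */
IF RC = 0 THEN SAY 'File deleted.'
ELSE SAY 'Delete failed, RC =' RC     /* higher values indicate more severe errors */
```

RC is overwritten by the next command, so copy it to another variable if it is needed later.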
Although ARexx was designed to work most effectively with programs that support its specific command interface, it can be used with any command shell program that uses standard I/O mechanisms to obtain its input stream. One way to use ARexx is to create an actual command file on the Ram Disk, then pass it directly to the Shell. Program 11 opens a new Shell to run a standard EXECUTE script.
Program 11. Shell.rexx
/*Launch a new Shell*/
ADDRESS command
conwindow = "CON:0/0/640/100/NewOne/Close"
/*Create a command file*/
CALL OPEN out, "ram:temp", write
CALL WRITELN out, 'echo "This is a test"'
CALL CLOSE out
/*Open the new Shell window*/
'newshell' conwindow "ram:temp"
EXIT
The Execution Environment
|The material in this section is intended for advanced Amiga users. This information assumes a working knowledge of the Amiga operating system and a familiarity with the reference material on this wiki.|
The ARexx interpreter, RexxMast, provides a uniform execution environment by running each program as a separate process in the Amiga's multitasking operating system. This allows for a flexible interface between an external host program and RexxMast. The host program can proceed concurrently with its operations or can wait for the interpreted ARexx program to finish. Each ARexx program has both an external and internal environment.
The External Environment
The external environment includes its process structure, input and output streams, and current directory. When each ARexx process is created, it inherits the input and output streams and current directory from its client, the external program that invoked the ARexx program. For example, if an ARexx program was started from a Shell, the ARexx program will inherit the input and output stream and current directory of that Shell. The current directory is used as the starting point in any search for a program or data file. External functions are limited to a maximum of 15 arguments.
The Internal Environment
The internal environment of an ARexx program consists of a static global structure and one or more storage environments. The global data values are fixed (static) at the time the program is invoked. These values include the program source code, static data strings, and argument strings. Once the program is running, these values cannot be changed.
ARexx programs invoked as commands usually have only one argument string, although the command tokenization option may provide more than one. A program invoked as an internal function can have any number of arguments. These arguments persist for the duration of the program.
The storage environment includes the symbol table used for variable values, numeric options, trace options, and host address strings. While the global environment is unique, there may be many storage environments during the course of the program execution. Each time an internal function is called, a new storage environment is activated and initialized. The initial values for most fields are inherited from the previous environment, but values may be changed afterwards without affecting the caller's environment. The new environment persists until control returns from the function.
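The effect of a new storage environment can be sketched with an internal function that changes its numeric options and host address. The routine name inner is invented, and the comments describe the behavior expected from the rules above rather than verified output:

```rexx
/* Sketch: changes made inside a function do not affect the caller */
NUMERIC DIGITS 5
x = 'outer'
CALL inner
SAY DIGITS() x ADDRESS()     /* expected: the caller's settings are unchanged (5 outer ...) */
EXIT

inner: PROCEDURE             /* PROCEDURE also gives the routine its own symbol table */
    NUMERIC DIGITS 12        /* applies only to this storage environment */
    ADDRESS COMMAND          /* likewise the host address */
    x = 'inner'              /* a different X from the caller's */
    RETURN
```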
Every storage environment includes a symbol table to store the value strings that have been assigned to variables. This symbol table is organized as a two-level binary tree. The primary level stores entries for simple and stem symbols. The secondary level is used for compound symbols. All of the compound symbols associated with a particular stem are stored in one tree, with the entry for the stem being the root of the tree.
Symbols are not entered into the table until an assignment is made to the symbol. Once created, entries at the primary level are never removed, even if the symbol subsequently becomes uninitialized. Secondary trees are released whenever a new assignment is made to the stem associated with that tree.
ARexx provides complete tracking for all of the dynamically allocated resources that it uses to execute a program. These resources include memory space, DOS files and related structures, and the message port structure. The tracking system was designed to allow a program to shut down at any point without leaving any resources hanging.
It is possible to go outside of the interpreter's resource tracking net by making calls directly to the Amiga operating system from within an ARexx program. It is the programmer's responsibility to track and return any resources allocated outside of the ARexx resource tracking system. ARexx provides a special interrupt facility so that a program can retain control after an execution error, perform the required cleanup, and exit.
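One way to use the interrupt facility is sketched below, assuming the SIGNAL ON instruction and the SIGL, RC, and ERRORTEXT() facilities described elsewhere in this manual. The temporary file stands in for any resource the interpreter does not reclaim by itself, and its name is hypothetical:

```rexx
/* Sketch: keep control after an error, clean up, and exit */
SIGNAL ON SYNTAX                       /* trap execution errors */
SIGNAL ON BREAK_C                      /* trap Ctrl-C as well */
tempfile = 'ram:work.tmp'              /* hypothetical scratch file */
IF ~OPEN('work', tempfile, 'W') THEN EXIT 10
CALL WRITELN 'work', 'scratch data'
x = 1 / 0                              /* deliberate error to trigger the trap */
CALL CLOSE 'work'
EXIT 0

syntax:
    SAY 'Error' RC 'in line' SIGL':' ERRORTEXT(RC)
break_c:                               /* both traps share the same cleanup */
    CALL CLOSE 'work'
    ADDRESS COMMAND 'delete' tempfile  /* the file on disk is not removed automatically */
    EXIT 20
```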