CHAPTER 3   OPERATION AND REQUIREMENTS

Creating Programs to Assemble

Before you invoke A86 you must have an assembly-language source program to assemble. A source program is an ASCII text file, created with the text editor of your choice. The editor must produce a file that is free of internal records known only to the editor. Some of the fancier word processors will require you to use a "plain text" mode to ensure that the file is free of such records.

This manual will fully explain to you the correct syntax of an A86 program, but it is not intended to teach you about the 86-family instruction set, or about assembly-language interfacing to your computer or your operating system. The instruction set charts in Chapters 6 and 7 give concise, one-line descriptions of each instruction, but they don't go into any detail about instruction usage. For such detail, I recommend either of two books: The 8086/8088 Primer by Stephen P. Morse, or The 80286 Architecture by Morse and Albert. The latter book covers the 8087/287 and is recommended if you have a floating-point coprocessor, or if you wish to explore the expanded capabilities of the 286. (My 386 book is the latest in a series in which those two books are predecessors.)

To learn how to make system calls to input from keyboard or disk, output to screen, printer or disk, etc., you need a book that covers the MS-DOS operating system and the BIOS for the IBM-PC, or whatever computer you have if it's non-IBM-BIOS-compatible (if you don't know whether or not it's compatible, it probably is). I used Peter Norton's Programmer's Guide to the IBM-PC. If you're less familiar with assembly language, you will probably want his Assembly Language Guide to the IBM-PC instead.

Program Invocation

To invoke A86, you must provide a program invocation line, either typed to the console when the DOS command prompt appears, or included in a batch file. The program invocation line consists of the following:

1. The program name A86.

2. The names of the source files you want to assemble. You may use the wild card delimiters * and ? if you wish, to denote a group of source files to be assembled. A86 will sort all matching names into alphabetical order for each wild card specification, so the files will be assembled in the same order even if they get jumbled up within a directory. A86 identifies the end of the source file names when it sees a name with no extension, or a name with the default object extension (COM, BIN or OBJ, as described shortly). Sorry, you cannot have a source file with the default object extension.

3. You may optionally provide the word TO, to separate the source file names from the output file names.

4. The name of the output program. If you do not provide an extension, A86 will assume one of the following extensions:

   a. .OBJ if you invoked the +O switch, for linkable object file production.
   b. .BIN if there is no +O switch, but there is an ORG 0 in your program.
   c. .COM otherwise.

   If you want your program file to have no extension, you end the file name with a period.

   You have the option to omit both the program file name and the symbol table file name from the invocation. If you do so, A86 will output the program source.COM (or source.OBJ or source.BIN) and the symbol table source.SYM, where "source" is a name derived from the list of source files, according to the rules described in the section "Strategies for Source File Maintenance" later in this chapter.

5. The name of the symbol table file. You do not need to give the .SYM extension: A86 will produce a file with extension .SYM in any case. In earlier versions of A86 I had allowed other extensions to be specified, but this meant that by carelessly permuting names on the command line, you could destroy a source file-- not good!

   You can omit the name of the symbol table file. If you do so, A86 will use the same root as the output program name.
   If you desire no symbol table file, specify the +S switch in your invocation line or A86 environment variable (described later in this chapter).

Assembler Switches

In addition to input and output file names, you may intersperse assembler switch settings anywhere after the A86 program name. They are all acted upon immediately, no matter where they are on the command line. Some of the switches are discussed in more detail elsewhere; I'll summarize them here:

+C causes the assembler to output symbol names with lower case letters to its OBJ and SYM files. The case of letters is still ignored during assembly. I output the name as it appears in the last PUBLIC or EXTRN directive containing it; if there is no such directive, I use the first occurrence of the symbol to control which letters are output lower case. (+C duplicates Microsoft MASM's /mx switch.)

+c causes the assembler to consider the case of letters within all non-built-in symbols as significant both during assembly and for output. Thus, for example, you can define different symbols X and x. (+c duplicates MASM's /ml switch.)

+D causes the default base for numeric constants to be decimal, even if the constants have leading zeroes.

-D causes the default base to be hexadecimal if there is a leading zero; decimal otherwise.

+E causes the error-message-augmented source file to be written to yourname.ERR within the current directory, in all cases. With +E, A86 will never rewrite your original source file.

-E causes A86 to insert error messages into your source file, whenever the file is in the current directory. If the file is not in the current directory, A86 writes an ERR file no matter what the E switch setting is.

+F causes A86 to generate the 287 form of floating point instructions (no implicit FWAIT bytes are generated before the instructions). This mode can also be specified in the program with the .287 directive.

+f causes A86 to support emulation of the 8087.
When A86 sees a floating point instruction, it generates external references to be resolved by the standard emulation library (provided by Microsoft, Borland, etc.). When you LINK your program to the emulation library, the floating point instructions are emulated by software. NOTE: you must be assembling to a linkable OBJ file for this mode to have effect; otherwise, +f is ignored.

-F causes emulation and default-287 to be disabled. You'll still get 287 generation if there is a .287 directive in your program.

+Ln causes A86 to implement one or more of the following minor options for code generation. All these options enhance MASM compatibility. The first three do so at the expense of program size. The number n should be the sum of the numbers for each of the options selected. For example, +L10 will select the options numbered 2 and 8.

   1 causes A86 to generate a longer (3-byte) instruction form for an unconditional JMP instruction to a forward reference local label, e.g. JMP >L1. A86 normally assumes that since it's a local label, it will be nearby and the short, 2-byte form will work. With this option your code will usually be longer than necessary, but you'll be spared having to occasionally go back and code an explicit JMP LONG >L1.

   2 causes A86 to refrain from optimizing the LEA instruction. Without this option A86 will replace an LEA with a shorter, equivalent MOV when it sees the chance.

   4 causes A86 to generate a slightly less efficient internal format for memory references within an OBJ file. The Power C compiler's MIX utility requires the less efficient form. The makers of Power C refused to support their customers on this by enhancing MIX, so I am forced to offer this option.

   8 causes A86 to assume that all ambiguous forward reference operands to instructions other than jumps or calls refer to memory variables and not offsets or constant values. You can override this on a one-by-one basis, with the OFFSET operator.
-L causes A86 to revert to its default for all the above options.

+O causes A86 to produce a linkable .OBJ file when the output file name extension is not explicitly given.

-O causes A86 to produce an executable .COM file when the output file name extension is not explicitly given.

+S suppresses the creation of the symbol table (.SYM) file in normal (no errors) assembly. This is overridden if you give an explicit symbols file name in the invocation line.

-S causes the symbol table file to be created in all cases.

+X causes A86 to require that undefined names be explicitly declared with an EXTRN when A86 is producing a linkable .OBJ file. The X switch has no effect when A86 is making a .COM file.

-X causes A86 to quietly assume that all undefined names are valid external references.

The default setting for all the switches is "minus". Multiple switches can be specified with a single sign; e.g. +OX is the same as +O+X.

The A86 Environment Variable

To allow you to customize A86, the assembler examines the MS-DOS environment variable named "A86" when it is invoked. If there is such a variable, its contents are inserted before the invocation command tail, as if you had typed them yourself. For example, if you execute the command

   SET A86=+OX

while in DOS (typically in the AUTOEXEC.BAT file run when the computer is started), then the O and X switches will be "plus", unless overridden with a "minus" setting in the command line.

You may also include one or more file names in the A86 environment variable. Those files will always be assembled first, before the files you specify on the command line. This allows you to set up a library of macro definitions, which will always be automatically available to your programs. Thus, for example, the DOS command

   SET A86=C:\A86\MACDEF.8 +OX

will cause both the O and X switches to default ON, but will also cause the file MACDEF.8 of subdirectory A86 of drive C to always be assembled.
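The way switch settings combine can be modeled in a few lines of Python. This is only a sketch of the behavior described above, not A86's actual implementation. The function names effective_switches and decode_l_options, and the source file name PROG.8, are made up for illustration. The sketch assumes that a later setting of a switch replaces an earlier one, which is inferred from the SET A86=+OX example; it does not handle the word TO, output file names, or the special cases noted for +S and +f.

```python
import re

def decode_l_options(n):
    """Split the number given to +Ln into the option numbers it sums (1, 2, 4, 8)."""
    return [v for v in (1, 2, 4, 8) if n & v]

def effective_switches(env_value, command_tail):
    """Model the combined settings: the A86 variable's contents come first,
    then the command tail, as if both had been typed on one line."""
    settings = {}   # switch letter -> True for "plus", False for "minus"
    files = []      # everything that is not a switch, in order
    for token in f"{env_value} {command_tail}".split():
        if token[0] not in "+-":
            files.append(token)
            continue
        plus = True
        # A token is a run of signs and switch letters; L may carry a number.
        for sign, letter, digits in re.findall(r"([+-])|([A-Za-z])(\d*)", token):
            if sign:
                plus = (sign == "+")
            elif letter == "L":
                # -L reverts to the defaults, modeled here as no options selected
                settings["L"] = decode_l_options(int(digits or 0)) if plus else []
            else:
                # Assumption: the most recent setting of a switch wins.
                settings[letter] = plus
    return settings, files

# Example: environment variable from the text, plus a hypothetical command tail
settings, files = effective_switches(r"C:\A86\MACDEF.8 +OX", "PROG.8 -X +L10")
# settings == {"O": True, "X": False, "L": [2, 8]}
# files == [r"C:\A86\MACDEF.8", "PROG.8"]
```

Any switch letter missing from the returned settings is at its default, "minus". Switch letters are kept case-sensitive because +C and +c are different switches.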
Using Standard Input as a Command Tail

The following feature is a bit advanced. If you're not familiar with the practice of redirecting standard input, you may safely skip this section.

A86 can also be configured to take its command arguments from standard input, in addition to the invocation command tail or the A86 environment variable. This allows A86 to be used in those menu-driven systems that don't generate command tails for programs. It also allows other programs to create lists of files to be assembled, then "pipe" the list to A86.

Here's how the feature works: when the command argument to A86 is an ampersand &, A86 will prompt for standard input. If the ampersand is seen but there are other things following it, the ampersand is ignored. For example, you can place a list of file names and switch settings into a file called FILELIST. You can then invoke the assembler via A86