Chapter 9

How Common Gateway Interface (CGI) Works



In general, Web software is not particularly database friendly. As a result, valuable corporate information is often inaccessible to many of the people who need it.

In the earliest days of the Web, it was exceedingly difficult for anyone to reach corporate databases through it. As technology developed and database access became easier, a growing number of databases became available, first on the Internet and later on intranets as well. Web server software was developed to make databases easier to reach from the Web.

There are a number of ways in which someone on an intranet can tap into corporate databases. An exceedingly popular one is called the Common Gateway Interface (CGI). CGI allows any executable to run and have its output sent back to the requesting client. It therefore lets intranet programmers write programs and scripts that allow people on an intranet to use their Web browsers to query databases by filling out forms on Web pages. Those programs then send the results back as HTML that the browsers can understand.

Essentially, CGI is an interface that delivers information from the server to your program and from your program back to the requesting client. It is not a programming language. The program does all the processing; CGI only carries data to and from the program. On an intranet, the data accessed is often from a database of some sort. You've no doubt used CGI many times without knowing it. On the Internet, for example, if you've filled out a form on a Web page in order to register to use a site, and then later received an e-mail notification with a password for you to use, you've probably used CGI. A CGI program most likely took the information you filled in on the form, performed several actions on it (including putting the information in a database), automatically created a password, and then automatically sent you mail.
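To make the division of labor concrete, here is a minimal sketch of a CGI program in Python. It assumes the server follows the usual CGI convention of passing request details in environment variables such as REQUEST_METHOD and QUERY_STRING, and of reading the program's standard output as the response. The function name build_response is purely illustrative.

```python
import html
import os
import sys


def build_response(environ):
    """Return the full CGI output: a header, a blank line, then the HTML body."""
    # The server hands request details to the program in environment variables.
    method = environ.get("REQUEST_METHOD", "GET")
    query = environ.get("QUERY_STRING", "")
    body = (
        "<html><body>\n"
        f"<p>Request method: {html.escape(method)}</p>\n"
        f"<p>Query string: {html.escape(query)}</p>\n"
        "</body></html>\n"
    )
    # The blank line after the header tells the server where the body begins.
    return "Content-Type: text/html\n\n" + body


if __name__ == "__main__":
    # Whatever the program writes to standard output goes back to the client.
    sys.stdout.write(build_response(os.environ))
```

Nothing here is specific to Python. Any executable that reads its environment and writes a header plus a body to standard output can play the same role.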

On your corporate intranet, if you have one, you may well have used CGI programs or scripts as well. If you've queried a corporate database from a Web page and gotten information from it, then there's a good chance that a CGI program or script is what did the work for you.

Intranet programmers can use a variety of technologies to make use of CGI. One of the simplest is to use what's called an interpreted language. An interpreted language, such as the popular Perl used on UNIX systems, is often favored because scripts written in it are easy to debug, modify, and maintain.

CGI programs can also be written in compiled languages, such as C, C++, or Fortran. When a programmer writes a program in a language such as C to be accessed by CGI, the program must first be compiled. That means running it through a program called a compiler, which translates the source code into machine code that the computer can execute directly.

How Common Gateway Interface (CGI) Works

CGI is a standard that allows programmers to write programs that can access information servers and databases, and then send the information to users on an intranet. Using CGI, Web-based intranet technologies can communicate with non-TCP/IP resources and databases. For example, an intranet programmer can write an application that searches a database and displays the result in HTML format. CGI is also used to let people fill out corporate forms on an intranet and have that information entered into a database.
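As a rough illustration of the form-to-database case, the sketch below decodes a submitted form and stores it as one row. Everything specific in it is an assumption made for the example: the requests table, the name and dept fields, and the use of an in-memory SQLite database as a stand-in for a corporate database.

```python
import sqlite3
from urllib.parse import parse_qs


def save_form(conn, encoded_form):
    """Decode URL-encoded form data and store it as one row."""
    fields = parse_qs(encoded_form)
    # parse_qs returns a list of values per field; take the first of each.
    name = fields.get("name", [""])[0]
    dept = fields.get("dept", [""])[0]
    # Parameterized query: form values are never pasted into the SQL text.
    conn.execute("INSERT INTO requests (name, dept) VALUES (?, ?)", (name, dept))
    conn.commit()


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # hypothetical stand-in database
    conn.execute("CREATE TABLE requests (name TEXT, dept TEXT)")
    save_form(conn, "name=Ann+Lee&dept=Sales")
    print(conn.execute("SELECT name, dept FROM requests").fetchall())
```

A real intranet program would connect to whatever database the organization actually runs. The decoding and insert steps would look much the same.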

  1. For CGI to be used, an intranet programmer first writes a program or a script. Because scripts in interpreted languages are easier to debug, modify, and maintain than compiled programs, they are used more frequently. Perl is probably the most popular language for writing scripts. Programs written in languages such as C, C++, or Fortran can also be used with CGI once they have been compiled.
  2. Next, the script or compiled program is put on an intranet Web server in a special directory (usually called /cgi-bin) which holds all the CGI programs and scripts. For security reasons, programs put in other directories won't run. If programs could be placed in many directories, it would be difficult for intranet maintainers to track them all and to notice if an unauthorized user had posted a rogue CGI program.
  3. After the CGI program is posted to /cgi-bin, a link to its URL is placed in the HTML of an intranet Web page.
  4. When someone clicks on the URL, the server launches the CGI program residing on the Web server in response to the browser's HTTP GET or POST request. If, for example, the CGI program's function is to search a database, the CGI program could send an HTML form to the client. From the client, the data on the completed query form is sent (with HTTP request headers) back to the server, which passes it to the CGI program. Data sent with POST arrives on STDIN, and data sent with GET arrives in an environment variable (QUERY_STRING). In both cases the data is formatted as encoded name/value pairs.
  5. The CGI program contacts the database and requests the information. The database sends the information to the CGI program. The information can be in a variety of formats, such as text, graphics, sound and video files, and URLs. The CGI program returns the results to the server (via STDOUT), which in turn sends them on to the browser.
  6. The CGI program also formats the data, for example, taking the information and putting it into HTML format so that the user can read it using a Web browser. The user can use that HTML page as they can any other.
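Steps 4 through 6 above can be sketched in Python as three small functions. This is only an illustration under stated assumptions: the staff table, its name, dept, and phone columns, the dept form field, and all function names are hypothetical, and an in-memory SQLite database stands in for the corporate database. Reading CONTENT_LENGTH characters from a text stream is adequate here because URL-encoded form data is plain ASCII.

```python
import html
import io
import sqlite3
from urllib.parse import parse_qs


def read_form(environ, stdin):
    """Step 4: collect the encoded name/value pairs from the request."""
    if environ.get("REQUEST_METHOD", "GET") == "POST":
        length = int(environ.get("CONTENT_LENGTH") or 0)
        raw = stdin.read(length)               # POST: data arrives on STDIN
    else:
        raw = environ.get("QUERY_STRING", "")  # GET: data arrives in an environment variable
    # Keep the first value of each field and decode '+' and %XX escapes.
    return {name: values[0] for name, values in parse_qs(raw).items()}


def search(conn, dept):
    """Step 5: ask the database for the matching rows."""
    cur = conn.execute(
        "SELECT name, phone FROM staff WHERE dept = ? ORDER BY name", (dept,)
    )
    return cur.fetchall()


def render(rows):
    """Step 6: format the rows as HTML, preceded by the CGI header."""
    items = "".join(
        f"<li>{html.escape(name)}: {html.escape(phone)}</li>\n" for name, phone in rows
    )
    return "Content-Type: text/html\n\n<html><body><ul>\n" + items + "</ul></body></html>\n"


def handle(environ, stdin, conn):
    """Run steps 4 to 6 in order and return the text to write to STDOUT."""
    form = read_form(environ, stdin)
    return render(search(conn, form.get("dept", "")))


if __name__ == "__main__":
    conn = sqlite3.connect(":memory:")  # hypothetical stand-in database
    conn.execute("CREATE TABLE staff (name TEXT, dept TEXT, phone TEXT)")
    conn.executemany(
        "INSERT INTO staff VALUES (?, ?, ?)",
        [("Ann Lee", "Sales", "x1234"), ("Raj Patel", "IT", "x5678")],
    )
    body = "dept=Sales"  # what a browser would send for a one-field form
    env = {"REQUEST_METHOD": "POST", "CONTENT_LENGTH": str(len(body))}
    print(handle(env, io.StringIO(body), conn))
```

In a live deployment the script would pass os.environ and sys.stdin to handle instead of the simulated request, and the server would relay the printed output to the browser.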