Svend's CGI Script Page.

(Stolen by Zrajm C Akfohg 20 december 2001 from <http://www.gbar.dtu.dk/~c958468/computer/cgi.html>.)



If you have questions, corrections or things to add then please send me an e-mail.

What is a CGI script?

A CGI script is, like a JavaScript script, a program that runs when it is activated by a client browser. The difference is that a CGI script runs on the web server rather than on the client computer. This allows the CGI script to read and write data on the web server itself. CGI scripts are used for many purposes, including counters, guest books, databases and mail forms.

Programming languages

A CGI script can be written in any language. The only requirement is that the final program can be executed on the web server where it is installed.

Most people choose Perl (Practical Extraction and Report Language). Perl was not invented for writing CGI scripts, but over the years many libraries have been developed for the purpose. Perl does not need to be compiled and can be moved from one server type to another without problems.

I myself prefer C. Unlike Perl, it is a good, powerful and flexible programming language, but it has the disadvantage that it needs to be compiled for every new server type. This is no problem, though, if GNU C is used, since GNU C is available on almost every server. A script written on one server using GNU C should therefore compile with GNU C on another server without trouble. CGI libraries for C are available on the Internet.

It is also possible to write command shell scripts. This is the simplest way to write a script, but the things you can do are limited.

How do CGI scripts work?

The first thing a web server does when it gets a request for a file is to look at the file's attributes. If the file isn't accessible to "the world", the server returns an error. If the file is accessible, the server looks at the execute attribute, which states whether the file can be executed. If it can't, the file is loaded as a normal file and sent to the client. If it can, the server tries to execute it and sends the output from the program to the client.

The first thing a script must return is a header. The header states which type of data is returned; this is called a content type specifier. It can also state an execution result code or a URL that the client should jump to.

A script can be invoked by several different methods, depending on the link to the script. Some methods are GET, POST and HEAD. The most used methods, GET and POST, are described in this document.
The following two HTML code examples both cause the script to be executed by the GET method:

<a href="http://www.john.doe.com/cgi-bin/script.cgi">

<FORM METHOD="GET" ACTION="http://john.doe.com/cgi-bin/script.cgi">
  <INPUT TYPE="TEXT" size=32 NAME="NAME">
  <INPUT TYPE="NUMBER" size=10 NAME="VALUE">
  <INPUT TYPE=submit value="Order">
</FORM>
The following HTML code causes the script to be executed by the POST method:
<FORM METHOD="POST" ACTION="http://john.doe.com/cgi-bin/script.cgi">
  <INPUT TYPE="TEXT" size=32 NAME="NAME">
  <INPUT TYPE="NUMBER" size=10 NAME="VALUE">
  <INPUT TYPE=submit value="Order">
</FORM>
The difference between the GET and POST methods is the following:
  • GET: Data from a form is passed to the script in an environment variable. The size of an environment variable is limited, and therefore so is the amount of data that can be passed to the script.
  • POST: The data from a form must be read by the script from stdin. This way large blocks of data can be passed to the script.
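
As a rough illustration of the difference, the following C sketch decides where a script should look for form data, based on the REQUEST_METHOD environment variable (listed under Environment variables). The function name form_data_source is hypothetical and chosen only for this example.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: tells the caller where the form data is expected,
   based on the request method the server passed in REQUEST_METHOD.
   Returns "QUERY_STRING" for GET, "stdin" for POST and "none" otherwise. */
const char *form_data_source(void)
{
  const char *method = getenv("REQUEST_METHOD");  /* NULL if not set */

  if (method == NULL) return "none";
  if (strcmp(method, "GET") == 0)  return "QUERY_STRING";
  if (strcmp(method, "POST") == 0) return "stdin";
  return "none";                                   /* HEAD and other methods */
}
```

A script would typically call this once at startup and then either parse QUERY_STRING or read CONTENT_LENGTH bytes from stdin.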

    Script parameters

    Parameters can be passed to the script after a ?. You can write anything in the parameter string except spaces, " and ?. Most people use + to separate parameters and %xx (where xx is the hexadecimal ASCII code) to encode special characters. Here are some examples of script links with parameters:
    <a href="http://www.john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
    
    <FORM METHOD="GET" ACTION="http://john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
      <INPUT TYPE="TEXT" size=32 NAME="NAME">
      <INPUT TYPE="NUMBER" size=10 NAME="VALUE">
      <INPUT TYPE=submit value="Order">
    </FORM>
    The script can read the parameters from an environment variable called QUERY_STRING. When the script is invoked by a form using the GET method, the form data is appended to the QUERY_STRING. The example above would result in the following QUERY_STRING:
    Parameter%201+Parameter%202+NAME=text+VALUE=number
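
A minimal sketch of how a script might take such a parameter string apart, assuming the convention described above (+ between parameters, %xx for special characters). The function names decode_percent and split_parameters are made up for this example and are not part of any CGI library.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Replaces every %xx sequence in s (xx = two hex digits) with the
   character it encodes. Works in place; the result is never longer. */
void decode_percent(char *s)
{
  char *out = s;

  while (*s) {
    if (s[0] == '%' && isxdigit((unsigned char)s[1])
                    && isxdigit((unsigned char)s[2])) {
      char hex[3] = { s[1], s[2], '\0' };
      *out++ = (char)strtol(hex, NULL, 16);
      s += 3;
    } else {
      *out++ = *s++;
    }
  }
  *out = '\0';
}

/* Splits query at each + into at most max parameters, decodes each one
   and stores pointers into query in params. Returns the number found.
   The buffer query is modified, so pass a copy of QUERY_STRING. */
int split_parameters(char *query, char *params[], int max)
{
  int count = 0;
  char *p = strtok(query, "+");

  while (p != NULL && count < max) {
    decode_percent(p);
    params[count++] = p;
    p = strtok(NULL, "+");
  }
  return count;
}
```

Given a writable copy of Parameter%201+Parameter%202, split_parameters yields the two strings "Parameter 1" and "Parameter 2".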

    The GET method

    Parameters and form data can be found, as described above, in the QUERY_STRING environment variable.

    The POST method

    Parameters can be found in the QUERY_STRING environment variable. Form data is available from stdin. The size of the form data can be read from the environment variable CONTENT_LENGTH. Some C code that reads the form data could look like this:
      const char *len=getenv("CONTENT_LENGTH");  /* NULL if the server did not set it */
      size_t content_length=len?strtoul(len,NULL,10):0;
      char *input=malloc(content_length+1);      /* one extra byte for the terminating '\0' */
      if(input) input[fread(input,1,content_length,stdin)]='\0';
    
    The fields in the form data are separated by & and spaces are replaced with +. The following is an HTML code example and the resulting form data read from stdin:
    <FORM METHOD="POST" ACTION="http://john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
      <INPUT TYPE="TEXT" size=32 NAME="NAME">
      <INPUT TYPE="NUMBER" size=10 NAME="VALUE">
      <INPUT TYPE=submit value="Order">
    </FORM>
    
    NAME=This+is+the+text+from+the+input+field+of+type+TEXT&VALUE=number
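
The following C sketch shows one way to pick a single field out of form data like the line above. It assumes that fields are separated by &, that each field has the form name=value, and that only the value needs decoding (+ to space, %xx to a character). The names hex_digit, decode_value and form_field are hypothetical.

```c
#include <stdlib.h>
#include <string.h>

/* Returns the value of one hex digit, or -1 if c is not a hex digit. */
static int hex_digit(char c)
{
  if (c >= '0' && c <= '9') return c - '0';
  if (c >= 'a' && c <= 'f') return c - 'a' + 10;
  if (c >= 'A' && c <= 'F') return c - 'A' + 10;
  return -1;
}

/* Decodes one form value in place: + becomes a space and %xx becomes
   the character with hex code xx. */
void decode_value(char *s)
{
  char *out = s;

  while (*s) {
    if (*s == '+') {
      *out++ = ' ';
      s++;
    } else if (*s == '%' && hex_digit(s[1]) >= 0 && hex_digit(s[2]) >= 0) {
      *out++ = (char)(hex_digit(s[1]) * 16 + hex_digit(s[2]));
      s += 3;
    } else {
      *out++ = *s++;
    }
  }
  *out = '\0';
}

/* Looks up the field called name in form data such as
   "NAME=some+text&VALUE=42". Returns a newly allocated, decoded copy of
   its value (release it with free), or NULL if the field is missing. */
char *form_field(const char *data, const char *name)
{
  size_t name_len = strlen(name);

  while (*data) {
    const char *end = strchr(data, '&');
    size_t len = end ? (size_t)(end - data) : strlen(data);

    if (len > name_len && strncmp(data, name, name_len) == 0
                       && data[name_len] == '=') {
      size_t value_len = len - name_len - 1;
      char *value = malloc(value_len + 1);

      if (value == NULL) return NULL;
      memcpy(value, data + name_len + 1, value_len);
      value[value_len] = '\0';
      decode_value(value);
      return value;
    }
    data += len;               /* skip this field ...            */
    if (*data == '&') data++;  /* ... and the separator after it */
  }
  return NULL;
}
```

Decoding only after the fields have been split matters: a value may itself contain an encoded & (%26), which must not be mistaken for a separator.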
    

    Script data headers

    The first thing a script must return is a header. The header can consist of the following header items:
  • Content-type
  • Location
  • Status
    A Content-type is required whenever the script returns data. A blank line indicates the end of the header.

    A script that returns data must send a header specifying the type of that data. This is called a content type specifier. The following is an example of a C statement returning a content type specifier:

      printf("Content-type: text/html\n");
    
    The following is a selection of content type specifiers, each with the file extensions it is normally used for:
    Content-type: text/html                  html;htm
    Content-type: text/plain                 txt;pl
    Content-type: image/gif                  gif
    Content-type: image/jpeg                 jpg;jpeg;jpe
    Content-type: video/mpeg                 mpg;mpeg;mpe
    Content-type: application/postscript     ps;ai;eps
    Content-type: application/mac-binhex40   hqx
    Content-type: application/octet-stream   bin
    Content-type: application/oda            oda
    Content-type: application/pdf            pdf
    Content-type: application/rtf            rtf
    Content-type: application/x-mif          mif
    Content-type: application/x-maker        fm
    Content-type: application/x-csh          csh
    Content-type: application/x-dvi          dvi
    Content-type: application/hdf            hdf
    Content-type: application/x-latex        latex
    Content-type: application/x-netcdf       nc;cdf
    Content-type: application/x-sh           sh
    Content-type: application/x-tcl          tcl
    Content-type: application/x-tex          tex
    Content-type: application/x-texinfo      texinfo;texi
    Content-type: application/x-troff        t;tr;roff
    Content-type: application/x-troff-man    man
    Content-type: application/x-troff-me     me
    Content-type: application/x-troff-ms     ms
    Content-type: application/x-wais-source  src
    Content-type: application/zip            zip
    Content-type: application/x-bcpio        bcpio
    Content-type: application/x-cpio         cpio
    Content-type: application/x-gtar         gtar
    Content-type: application/x-shar         shar
    Content-type: application/x-sv4cpio      sv4cpio
    Content-type: application/x-sv4crc       sv4crc
    Content-type: application/x-tar          tar
    Content-type: application/x-ustar        ustar
    Content-type: audio/basic                au;snd
    Content-type: audio/x-aiff               aif;aiff;aifc
    Content-type: audio/x-wav                wav
    Content-type: image/ief                  ief
    Content-type: image/tiff                 tiff;tif
    Content-type: image/x-cmu-raster         ras
    Content-type: image/x-portable-anymap    pnm
    Content-type: image/x-portable-bitmap    pbm
    Content-type: image/x-portable-graymap   pgm
    Content-type: image/x-portable-pixmap    ppm
    Content-type: image/x-rgb                rgb
    Content-type: image/x-xbitmap            xbm
    Content-type: image/x-xpixmap            xpm
    Content-type: image/x-xwindowdump        xwd
    Content-type: text/richtext              rtx
    Content-type: text/tab-separated-values  tsv
    Content-type: text/x-setext              etx
    Content-type: video/quicktime            qt;mov
    Content-type: video/x-msvideo            avi
    Content-type: video/x-sgi-movie          movie
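
A script that sends files of different kinds has to choose the right content type itself. The sketch below shows one possible approach: a small lookup table keyed on the file extension. Only a handful of entries from the list above are included, and the helper name content_type_for is hypothetical.

```c
#include <string.h>

/* A few entries from the list above; extend as needed. */
static const struct { const char *ext; const char *type; } content_types[] = {
  { "html", "text/html"       },
  { "htm",  "text/html"       },
  { "txt",  "text/plain"      },
  { "gif",  "image/gif"       },
  { "pdf",  "application/pdf" },
  { "zip",  "application/zip" },
};

/* Hypothetical helper: returns the content type for a file name based on
   its extension, or application/octet-stream when the extension is
   missing or not in the table. The comparison is case sensitive. */
const char *content_type_for(const char *filename)
{
  const char *dot = strrchr(filename, '.');
  size_t i;

  if (dot != NULL) {
    for (i = 0; i < sizeof content_types / sizeof content_types[0]; i++) {
      if (strcmp(dot + 1, content_types[i].ext) == 0)
        return content_types[i].type;
    }
  }
  return "application/octet-stream";
}
```

The result can be passed straight to printf, for example printf("Content-type: %s\n\n", content_type_for(name)).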

    Another header item you can send is a location header. A location header specifies that another document should be loaded instead of data from the script. The format of the location header is:

    Location: URL
    where URL is a valid URL.
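
In C, sending a location header could look roughly like the sketch below. The function name send_redirect is made up for this example; it writes to a FILE pointer so that a script can pass stdout.

```c
#include <stdio.h>

/* Writes a header that tells the client to load url instead of reading
   data from this script. The empty line ends the header.
   Returns a negative value if writing failed. */
int send_redirect(FILE *out, const char *url)
{
  return fprintf(out, "Location: %s\n\n", url);
}
```

A script would call it as send_redirect(stdout, "http://www.john.doe.com/index.html"), where the target URL is only a placeholder, and then exit without writing any further data.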

    The status header is used to return the status of the script. The following status codes are available:
    Status: 200 OK                   The request was fulfilled.
    Status: 201 Created              Following a POST command.
    Status: 202 Accepted             Accepted for processing, but processing not completed.
    Status: 203 Partial information  The returned information is only partial.
    Status: 204 No response          The request was received but there is no information to send back.
    Status: 301 Moved                The data requested has a new location and the change is permanent.
    Status: 302 Found                The data requested has a different URL temporarily.
    Status: 303 See other            A suggestion for the client to try another location.
    Status: 304 Not modified         The document has not been modified since the client last fetched it.
    Status: 400 Bad request          Syntax problem in the request, or it could not be satisfied.
    Status: 401 Unauthorized         The client is not authorized to access the data.
    Status: 402 Payment required     Indicates a charging scheme is in effect.
    Status: 403 Forbidden            Access not granted even with authorization.
    Status: 404 Not found            The server could not find the given resource.
    Status: 500 Internal error       The server could not fulfill the request because of an unexpected condition.
    Status: 501 Not implemented      The server does not support the facility requested.
    Status: 502 Bad gateway          The server, acting as a gateway, received an invalid response from another server.
    Status: 503 Service unavailable  High load (or servicing) in progress.
    Status: 504 Gateway timeout      The server waited for another service that did not complete in time.
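
A minimal sketch of a script reporting an error with a status header, assuming the status line is sent before the content type and that a short HTML body follows. The function name send_error is hypothetical.

```c
#include <stdio.h>

/* Writes a complete header for an HTML error page followed by a short
   body. code and reason are supplied by the caller, for example
   404 and "Not found". */
void send_error(FILE *out, int code, const char *reason)
{
  fprintf(out, "Status: %d %s\n", code, reason);
  fprintf(out, "Content-type: text/html\n\n");   /* blank line ends the header */
  fprintf(out, "<HEAD><TITLE>%d %s</TITLE></HEAD>\n", code, reason);
  fprintf(out, "<BODY>%d %s</BODY>\n", code, reason);
}
```

Calling send_error(stdout, 404, "Not found") produces a header starting with the line Status: 404 Not found.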

    Environment variables

    The following environment variables are standard and should always be available (they can be empty though):
    AUTH_TYPE          The protocol-specific authentication method used to validate the user. It is set when the server supports user authentication.
    CONTENT_LENGTH     The length of the content as given by the client.
    CONTENT_TYPE       The content type of the data for queries that have attached information (for example, HTTP POST and PUT).
    GATEWAY_INTERFACE  The CGI specification revision of the server. Format: CGI/revision.
    PATH_INFO          Path information, as given by the user request.
    PATH_TRANSLATED    The translated version of PATH_INFO, with any virtual-to-physical mapping applied to it.
    QUERY_STRING       The information following the ? in the URL when referencing the script (using GET).
    REMOTE_ADDR        The IP address of the remote (user's) host making the request.
    REMOTE_HOST        The name of the host making the request.
    REMOTE_IDENT       The remote username as retrieved from the server (if the HTTP server supports RFC 931 identification).
    REMOTE_USER        The username the user has authenticated as, if the server supports user authentication and the script is protected.
    REQUEST_METHOD     The method by which the request was made (for example, GET, HEAD, POST, and so on).
    SCRIPT_NAME        A pathname of the script to execute.
    SERVER_NAME        The server's hostname, DNS alias, or IP address as it would appear in self-referencing URLs.
    SERVER_PORT        The port number where the request was sent.
    SERVER_PROTOCOL    The name/revision of the information protocol.
    SERVER_SOFTWARE    The name/version of the information server software that answered the request.
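
Because any of these variables may be missing when the script is started outside a web server, it is safer not to hand the result of getenv directly to printf. The sketch below wraps the lookup and prints a few of the variables as plain text. The function names env_or_empty and print_environment are hypothetical, and the selection of variables is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Returns the value of an environment variable, or "" when it is not
   set, so the result can always be printed with %s. */
const char *env_or_empty(const char *name)
{
  const char *value = getenv(name);
  return value != NULL ? value : "";
}

/* Writes a plain-text report of some of the standard variables. */
void print_environment(FILE *out)
{
  static const char *names[] = {
    "REQUEST_METHOD", "QUERY_STRING", "CONTENT_LENGTH",
    "REMOTE_ADDR", "SERVER_NAME", "SERVER_PORT"
  };
  size_t i;

  fprintf(out, "Content-type: text/plain\n\n");
  for (i = 0; i < sizeof names / sizeof names[0]; i++)
    fprintf(out, "%s=%s\n", names[i], env_or_empty(names[i]));
}
```

Installed as a script and called with print_environment(stdout), this gives a quick way to see what a particular server actually passes on.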

    Simple script example using GET method

    The following is a simple CGI script written in C. The script writes back its QUERY_STRING.
    #include <stdio.h>
    #include <stdlib.h>
    
    int main(void)
    {
      const char *query=getenv("QUERY_STRING");  /* NULL if the variable is not set */
    
      printf("Content-type: text/html\n\n");
    
      printf("<HEAD><TITLE>Write back</TITLE></HEAD>\n");
      printf("<BODY>Parameters: %s</BODY>",query?query:"");
      return 0;
    }
    
