(Stolen by Zrajm C Akfohg 20 december 2001 from <http://www.gbar.dtu.dk/~c958468/computer/cgi.html>.)
Most people choose a language called Perl (Practical Extraction and Report Language). Perl was not invented for writing CGI scripts, but over the years many libraries have been developed for that purpose. Perl does not need to be compiled, so a script can be moved from one server type to another without problems.
I myself prefer C. It is, unlike Perl, a good, powerful and flexible programming language, but it has the disadvantage that it must be compiled for every new server type. This is not a problem, though, if GNU C is used. GNU C is available on almost every server, so a script written on one server using GNU C should compile with GNU C on another server without trouble. CGI libraries for C are available on the Internet.
It is also possible to write command shell scripts. This is the simplest way to write a script, but what you can do with it is limited.
The first thing a script must return is a header. The header states which type of data is returned; this is called a content type specifier. It can also state an execution result code or a URL that the client should jump to.
A script can be invoked by several different methods depending on the link to the script. Some methods are GET, POST and HEAD. The most commonly used methods, GET and POST, will be described in this document.
A script is required to return a header specifying the type of the returned data. This is called a content type specifier. The following is an example of a C program returning a content type specifier:
Another header item you can have is a location header. A location header specifies that another document should be loaded rather than data from the script. The format of the location header is:
The status header is used to return the status of the script. The following status codes are available:
How do CGI scripts work?
The first thing a web server does when it gets a request for a file is to look at the file's attributes. If the file isn't accessible to "the world", the server returns an error. If the file is accessible, the server looks at the execute attribute, which states whether the file can be executed. If it can't, the file is loaded as a normal file and sent to the client. If it can, the server tries to execute it, and the output from the program is sent to the client.
Both of the following HTML code examples cause the script to be executed by the GET method:
<a href="http://www.john.doe.com/cgi-bin/script.cgi">
<FORM METHOD="GET" ACTION="http://john.doe.com/cgi-bin/script.cgi">
<INPUT TYPE="TEXT" size=32 NAME="NAME">
<INPUT TYPE="NUMBER" size=10 NAME="VALUE">
<INPUT TYPE=submit value="Order">
</FORM>
The following HTML code causes the script to be executed by the POST method:
<FORM METHOD="POST" ACTION="http://john.doe.com/cgi-bin/script.cgi">
<INPUT TYPE="TEXT" size=32 NAME="NAME">
<INPUT TYPE="NUMBER" size=10 NAME="VALUE">
<INPUT TYPE=submit value="Order">
</FORM>
The difference between the POST and GET methods is the following:
Script parameters
Parameters can be passed to the script after a ?. You can write anything in the parameter string except spaces, " and ?. Most people use + to separate parameters and %xx (where xx is the hexadecimal ASCII code) to encode special characters. Here are some examples of script links with parameters:
<a href="http://www.john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
<FORM METHOD="GET" ACTION="http://john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
<INPUT TYPE="TEXT" size=32 NAME="NAME">
<INPUT TYPE="NUMBER" size=10 NAME="VALUE">
<INPUT TYPE=submit value="Order">
</FORM>
The script can read the parameters from an environment variable called QUERY_STRING. When the script is invoked from a form using the GET method, the form data is appended to QUERY_STRING. The example above would result in the following QUERY_STRING:
Parameter%201+Parameter%202+NAME=text+VALUE=number
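The decoding implied by these conventions can be sketched in C. This is only a minimal sketch, assuming parameters are separated by + and special characters are written as %xx, as described above. The names url_decode and split_parameters are hypothetical helpers, not part of any CGI library, and both modify the string they are given, so a script should work on a copy of the value returned by getenv.

```c
#include <ctype.h>
#include <stdlib.h>
#include <string.h>

/* Decode %xx escapes in place and return the argument.
   Hypothetical helper, not a standard library function. */
char *url_decode(char *s)
{
    char *src = s, *dst = s;
    while (*src) {
        if (src[0] == '%' && isxdigit((unsigned char)src[1])
                          && isxdigit((unsigned char)src[2])) {
            char hex[3] = { src[1], src[2], '\0' };
            *dst++ = (char)strtol(hex, NULL, 16);   /* two hex digits -> one byte */
            src += 3;
        } else {
            *dst++ = *src++;                        /* ordinary character */
        }
    }
    *dst = '\0';
    return s;
}

/* Split a writable query string on '+' and decode each piece.
   Stores at most max pointers (into query) in params; returns the count. */
int split_parameters(char *query, char *params[], int max)
{
    int count = 0;
    char *piece = strtok(query, "+");
    while (piece != NULL && count < max) {
        params[count++] = url_decode(piece);
        piece = strtok(NULL, "+");
    }
    return count;
}
```

Given the string Parameter%201+Parameter%202, split_parameters returns 2 and the two pieces decode to "Parameter 1" and "Parameter 2".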
The GET method
Parameters and form data can be found, as described above, in the QUERY_STRING environment variable.
The POST method
Parameters can be found in the QUERY_STRING environment variable. Form data will be available from stdin. The size of the form data can be read from the environment variable called CONTENT_LENGTH. Some C code that reads the form data could look like this:
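const char *len=getenv("CONTENT_LENGTH");      /* NULL if the variable is unset */
int content_length=(len && atoi(len)>0)?atoi(len):0;
char *input=malloc(content_length+1);          /* one extra byte for the terminating '\0' */
size_t got=input?fread(input,1,content_length,stdin):0;
if(input) input[got]='\0';                     /* make the data a proper C string */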
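Since the two methods deliver form data in different places, a script that accepts both can pick the source by looking at REQUEST_METHOD, one of the standard CGI environment variables. The following is a rough sketch under that assumption. The function name read_form_data and its parameter list are hypothetical; the caller passes in the environment values so that the function itself stays easy to try out.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

/* Return the request's form data as a newly allocated, NUL-terminated
   string, or NULL if memory runs out. The caller frees the result.
   method, query and content_length are the values of REQUEST_METHOD,
   QUERY_STRING and CONTENT_LENGTH (any of them may be NULL).
   read_form_data is a hypothetical helper name. */
char *read_form_data(const char *method, const char *query,
                     const char *content_length, FILE *in)
{
    char *data;

    if (method != NULL && strcmp(method, "POST") == 0) {
        /* POST: the form data arrives on the input stream */
        long n = content_length ? strtol(content_length, NULL, 10) : 0;
        size_t got;
        if (n < 0)
            n = 0;
        data = malloc((size_t)n + 1);
        if (data == NULL)
            return NULL;
        got = fread(data, 1, (size_t)n, in);
        data[got] = '\0';
    } else {
        /* GET (and anything else): the form data is in the query string */
        if (query == NULL)
            query = "";
        data = malloc(strlen(query) + 1);
        if (data == NULL)
            return NULL;
        strcpy(data, query);
    }
    return data;
}
```

In a real script the call would look like read_form_data(getenv("REQUEST_METHOD"), getenv("QUERY_STRING"), getenv("CONTENT_LENGTH"), stdin).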
The fields in the form data will be separated by & and spaces will be replaced with +. The following is an HTML code example and the resulting form data read from stdin:
<FORM METHOD="POST" ACTION="http://john.doe.com/cgi-bin/script.cgi?Parameter%201+Parameter%202">
<INPUT TYPE="TEXT" size=32 NAME="NAME">
<INPUT TYPE="NUMBER" size=10 NAME="VALUE">
<INPUT TYPE=submit value="Order">
</FORM>
NAME=This+is+the+text+from+the+input+field+of+type+TEXT&VALUE=number
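Splitting such form data into fields can be sketched as follows. This is a minimal sketch that only handles the two conventions just described (fields separated by &, spaces sent as +); it does not decode %xx escapes. The name parse_form_data is a hypothetical helper, and the function works in place, so the buffer it is given must be writable.

```c
#include <string.h>

/* Replace every '+' with a space, in place. */
static void plus_to_space(char *s)
{
    for (; *s; s++)
        if (*s == '+')
            *s = ' ';
}

/* Split form data such as "NAME=a+b&VALUE=1" in place.
   names[i] and values[i] point into data; returns the number of fields.
   parse_form_data is a hypothetical helper name. */
int parse_form_data(char *data, char *names[], char *values[], int max)
{
    int count = 0;
    char *field = strtok(data, "&");
    while (field != NULL && count < max) {
        char *eq = strchr(field, '=');
        if (eq != NULL) {
            *eq = '\0';                              /* terminate the name */
            values[count] = eq + 1;
        } else {
            values[count] = field + strlen(field);   /* no '=': empty value */
        }
        names[count] = field;
        plus_to_space(names[count]);
        plus_to_space(values[count]);
        count++;
        field = strtok(NULL, "&");
    }
    return count;
}
```

For the form data shown above, the function returns 2, with the field NAME holding the text with its spaces restored and the field VALUE holding "number".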
Script data headers
The first thing a script must return is a header. The header can consist of the following header items: Content-type, Location and Status.
The Content-type is always required. A blank line indicates the end of the header.
printf("Content-type: text/html\n");
The following is a selection of content type specifiers, each followed by the file extensions it is commonly used for:
Content-type: text/html                      html;htm
Content-type: text/plain                     txt;pl
Content-type: image/gif                      gif
Content-type: image/jpeg                     jpg;jpeg;jpe
Content-type: video/mpeg                     mpg;mpeg;mpe
Content-type: application/postscript         ps;ai;eps
Content-type: application/mac-binhex40       hqx
Content-type: application/octet-stream       bin
Content-type: application/oda                oda
Content-type: application/pdf                pdf
Content-type: application/rtf                rtf
Content-type: application/x-mif              mif
Content-type: application/x-maker            fm
Content-type: application/x-csh              csh
Content-type: application/x-dvi              dvi
Content-type: application/hdf                hdf
Content-type: application/x-latex            latex
Content-type: application/x-netcdf           nc;cdf
Content-type: application/x-sh               sh
Content-type: application/x-tcl              tcl
Content-type: application/x-tex              tex
Content-type: application/x-texinfo          texinfo;texi
Content-type: application/x-troff            t;tr;roff
Content-type: application/x-troff-man        man
Content-type: application/x-troff-me         me
Content-type: application/x-troff-ms         ms
Content-type: application/x-wais-source      src
Content-type: application/zip                zip
Content-type: application/x-bcpio            bcpio
Content-type: application/x-cpio             cpio
Content-type: application/x-gtar             gtar
Content-type: application/x-shar             shar
Content-type: application/x-sv4cpio          sv4cpio
Content-type: application/x-sv4crc           sv4crc
Content-type: application/x-tar              tar
Content-type: application/x-ustar            ustar
Content-type: audio/basic                    au;snd
Content-type: audio/x-aiff                   aif;aiff;aifc
Content-type: audio/x-wav                    wav
Content-type: image/ief                      ief
Content-type: image/tiff                     tiff;tif
Content-type: image/x-cmu-raster             ras
Content-type: image/x-portable-anymap        pnm
Content-type: image/x-portable-bitmap        pbm
Content-type: image/x-portable-graymap       pgm
Content-type: image/x-portable-pixmap        ppm
Content-type: image/x-rgb                    rgb
Content-type: image/x-xbitmap                xbm
Content-type: image/x-xpixmap                xpm
Content-type: image/x-xwindowdump            xwd
Content-type: text/richtext                  rtx
Content-type: text/tab-separated-values      tsv
Content-type: text/x-setext                  etx
Content-type: video/quicktime                qt;mov
Content-type: video/x-msvideo                avi
Content-type: video/x-sgi-movie              movie

A Location header has the following format:
Location: URL
Where URL is a valid URL.
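As an illustration, a script written in C might build the Location header like this. It is a sketch only: the function name format_location_header is hypothetical, and the check for line breaks in the URL is an added precaution rather than something the header format itself requires.

```c
#include <stdio.h>
#include <string.h>

/* Write "Location: <url>" followed by the blank line that ends the header
   into buf. Returns 0 on success, -1 if the URL contains a line break
   or the result does not fit. Hypothetical helper name. */
int format_location_header(char *buf, size_t size, const char *url)
{
    int n;
    if (strpbrk(url, "\r\n") != NULL)
        return -1;                 /* a line break would end the header early */
    n = snprintf(buf, size, "Location: %s\n\n", url);
    return (n < 0 || (size_t)n >= size) ? -1 : 0;
}
```

A script would then send the result with fputs(buf, stdout) and exit without printing any document data.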
The Status header can carry the following status codes:
Status: 200 OK - The request was fulfilled.
Status: 201 OK - Following a POST command.
Status: 202 OK - Accepted for processing, but processing not completed.
Status: 203 OK - Partial information; the returned information is only partial.
Status: 204 OK - No response; the request was received but no information exists to send back.
Status: 301 Moved - The data requested has a new location and the change is permanent.
Status: 302 Found - The data requested has a different URL temporarily.
Status: 303 Method - Under discussion, a suggestion for the client to try another location.
Status: 304 Not Modified - The document has not been modified as expected.
Status: 400 Bad request - Syntax problem in the request or it could not be satisfied.
Status: 401 Unauthorized - The client is not authorized to access data.
Status: 402 Payment required - Indicates a charging scheme is in effect.
Status: 403 Forbidden - Access not granted even with authorization.
Status: 404 Not found - Server could not find the given resource.
Status: 500 Internal Error - The server could not fulfill the request because of an unexpected condition.
Status: 501 Not implemented - The server does not support the facility requested.
Status: 502 Server overloaded - High load (or servicing) in progress.
Status: 503 Gateway timeout - Server waited for another service that did not complete in time.

Environment variables
The following environment variables are standard and should always be available (they can be empty though):
AUTH_TYPE
    The protocol-specific authentication method used to validate the user. It is set when the server supports user authentication.
CONTENT_LENGTH
    The length of the content as given by the client.
CONTENT_TYPE
    The content type of the data for queries that have attached information (for example, HTTP POST and PUT).
GATEWAY_INTERFACE
    The CGI specification revision of the server. Format: CGI/revision.
PATH_INFO
    Path information, as given by the user request.
PATH_TRANSLATED
    The translated version of PATH_INFO, with any virtual-to-physical mapping applied to the path.
QUERY_STRING
    The information following the ? in the URL when referencing the script (using GET).
REMOTE_ADDR
    The IP address of the remote (user's) host making the request.
REMOTE_HOST
    The name of the host making the request.
REMOTE_IDENT
    The remote username as retrieved from the server (if the HTTP server supports RFC 931 identification).
REMOTE_USER
    The username the client has authenticated as, if the HTTP server supports user authentication and the script is protected.
REQUEST_METHOD
    The method by which the request was made (for example, GET, HEAD, POST, and so on).
SCRIPT_NAME
    A pathname of the script to execute.
SERVER_NAME
    The server's hostname, DNS alias, or IP address as it would appear in self-referencing URLs.
SERVER_PORT
    The port number where the request was sent.
SERVER_PROTOCOL
    The name/revision of the information protocol.
SERVER_SOFTWARE
    The name/version of the information server software that answered the request.

Simple script example using GET method
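The following is a simple CGI script written in C. The script writes back its QUERY_STRING.
#include <stdio.h>    /* printf */
#include <stdlib.h>   /* getenv */
int main(void)
{
    const char *query=getenv("QUERY_STRING");   /* NULL if the server did not set it */
    printf("Content-type: text/html\n\n");      /* header, ended by a blank line */
    printf("<HEAD><TITLE>Write back</TITLE></HEAD>\n");
    /* the value is echoed without HTML escaping, which is acceptable only in a demo */
    printf("<BODY>Parameters: %s</BODY>",query?query:"");
    return 0;
}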
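In the same spirit, a script can report some of the standard environment variables listed earlier, which is handy when trying out a new server. The following is a small sketch, not a definitive implementation: the function name print_cgi_environment is hypothetical, and the choice of which variables to show is arbitrary.

```c
#include <stdio.h>
#include <stdlib.h>

/* Print a plain-text report of a few CGI environment variables to out.
   Unset variables are shown as "(not set)". Hypothetical helper name. */
void print_cgi_environment(FILE *out)
{
    static const char *names[] = {
        "REQUEST_METHOD", "QUERY_STRING", "CONTENT_LENGTH",
        "SCRIPT_NAME", "SERVER_NAME", "REMOTE_ADDR"
    };
    size_t i;

    /* header first, ended by a blank line */
    fprintf(out, "Content-type: text/plain\n\n");
    for (i = 0; i < sizeof names / sizeof names[0]; i++) {
        const char *value = getenv(names[i]);
        fprintf(out, "%s=%s\n", names[i], value ? value : "(not set)");
    }
}
```

Called as print_cgi_environment(stdout) from main, this produces one NAME=value line per variable after the header.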