Web browsers communicate with web servers using the HTTP protocol. The formal definition is given by RFC 2616: Hypertext Transfer Protocol -- HTTP/1.1. This protocol follows the client/server model, where the browser is the client making a request and the web server is the server providing the reply.
An example protocol exchange made with telnet www.mun.ca 80:
GET / HTTP/1.0

HTTP/1.1 200 OK
Date: Mon, 14 Nov 2007 17:20:20 GMT
Server: Apache
X-Powered-By: PHP/4.4.0
Content-Length: 5429
Connection: close
Content-Type: text/html; charset=ISO-8859-1

... actual html content ...
The request is the line GET / HTTP/1.0 (red in the original notes); the reply is everything from HTTP/1.1 200 OK onward (blue). The server closes the connection after sending the reply.
The following request is also made with the command telnet www.mun.ca 80:
HEAD /foo/bar/xx HTTP/1.0

HTTP/1.1 404 Not Found
Date: Mon, 14 Nov 2007 18:15:25 GMT
Server: Apache
Connection: close
Content-Type: text/html; charset=iso-8859-1
The user types the following line, followed by a blank line, after running telnet:
HEAD /foo/bar/xx HTTP/1.0
The blank line indicates the end of the request header. Each line is terminated by a CR and a LF (abbreviated as CRLF).
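As a minimal sketch of the framing just described, the following builds a header-only request in which every line, including the terminating blank line, ends with CRLF. The class name RequestBuilder and the method buildRequest are hypothetical names chosen for this illustration.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical helper: builds a header-only HTTP/1.0 request as text.
class RequestBuilder {

    // The request line is followed by an empty line; both end with CRLF.
    static String buildRequest(String method, String path) {
        String requestLine = method + " " + path + " HTTP/1.0";
        return requestLine + "\r\n" + "\r\n";
    }

    public static void main(String[] args) {
        String request = buildRequest("HEAD", "/foo/bar/xx");
        // These are the bytes that would be written to the socket.
        byte[] bytes = request.getBytes(StandardCharsets.US_ASCII);
        System.out.println(bytes.length + " bytes to send");
    }
}
```

Telnet sends the same characters when the user presses Enter twice after the request line.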
A request contains a series of lines. The first line follows the form:
method path protocol version
where protocol version specifies the HTTP version (1.0 or 1.1), path is the rest of the URL after the hostname, and method is the request method, such as GET, HEAD, or POST.
A series of header lines follows the request line. More content can follow the header lines. This content is separated by a blank line.
The reply line follows the form:
protocol version status code text description of status
For example, the reply line for the HEAD request above is: HTTP/1.1 404 Not Found
Header lines can follow the reply line. The content is separated from the header lines by a blank line.
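As a rough illustration of the reply line format, the sketch below splits a reply line into protocol version, status code, and text description. StatusLine is a hypothetical class written for these notes; it is not part of any library.

```java
// Hypothetical value class for the first line of an HTTP reply.
class StatusLine {
    final String protocol;
    final int code;
    final String text;

    private StatusLine(String protocol, int code, String text) {
        this.protocol = protocol;
        this.code = code;
        this.text = text;
    }

    // Splits on the first two spaces only, so the description may contain spaces.
    static StatusLine parse(String line) {
        String[] parts = line.trim().split(" ", 3);
        if (parts.length < 2) {
            throw new IllegalArgumentException("bad reply line: " + line);
        }
        int code = Integer.parseInt(parts[1]);
        String text = (parts.length == 3) ? parts[2] : "";
        return new StatusLine(parts[0], code, text);
    }

    public static void main(String[] args) {
        StatusLine s = parse("HTTP/1.1 404 Not Found");
        System.out.println(s.protocol + " | " + s.code + " | " + s.text);
    }
}
```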
Common codes include 200 (OK), 404 (Not Found), and 405 (Method Not Allowed).
A header line contains an attribute name, a colon (:), and an attribute value terminated by a CRLF. An example of the reply header lines is:
Date: Mon, 14 Nov 2007 18:15:25 GMT
Server: Apache
Connection: close
Content-Type: text/html; charset=iso-8859-1
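A header line can be taken apart at its first colon, since the value itself may contain further colons (as the Date header does). The following is a sketch only; HeaderLine is a hypothetical name.

```java
import java.util.AbstractMap.SimpleEntry;
import java.util.Map;

// Hypothetical helper: splits "Name: value" at the first colon.
class HeaderLine {

    static Map.Entry<String, String> parse(String line) {
        int colon = line.indexOf(':');
        if (colon <= 0) {
            throw new IllegalArgumentException("not a header line: " + line);
        }
        String name = line.substring(0, colon).trim();
        // Whitespace around the value is not significant, so it is trimmed.
        String value = line.substring(colon + 1).trim();
        return new SimpleEntry<>(name, value);
    }

    public static void main(String[] args) {
        Map.Entry<String, String> h = parse("Date: Mon, 14 Nov 2007 18:15:25 GMT");
        System.out.println("[" + h.getKey() + "] [" + h.getValue() + "]");
    }
}
```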
Some of the more common header lines are:
MIME: Multipurpose Internet Mail Extensions
MIME was initially created for the transmission and reception of non-text e-mail (e.g., images, programs ...).
When a web server transmits a file to a browser using the HTTP protocol, it also sends the MIME type of the file. The MIME type controls how the browser handles the file. All files must have an associated MIME type; the browser will prompt the user if the MIME type is unknown.
A list of registered MIME types is maintained by the Internet Assigned Numbers Authority (IANA). A MIME type is composed of two parts separated by a slash. The first part gives the general media type (e.g., text, image) and the second part gives the specific format (e.g., text/html, text/plain, image/jpeg, image/png).
The MIME type for an HTML file/document is:
text/html
A plain text file is:
text/plain
A file containing HTML tags but sent as text/plain would be displayed as plain text, with the tags visible.
The XHTML MIME type is
application/xhtml+xml
Any XML document, including XHTML, can also be described with these MIME types.
application/xml or text/xml
MIME types for common image formats are:
image/bmp
image/gif
image/jpeg
image/png
image/tiff
When a browser reads a local file, it uses the file extension to determine its MIME type. Under Linux, the file /etc/mime.types specifies the mapping between file extensions and MIME types. A subset of the contents is:
application/ogg                          ogg
application/x-tex                        tex
application/xhtml+xml                    xhtml xht
application/xml
application/xml-dtd
application/xml-external-parsed-entity
application/zip                          zip
audio/mpeg                               mpga mp2 mp3
audio/x-pn-realaudio                     ram rm
audio/x-wav                              wav
image/bmp                                bmp
image/gif                                gif
image/jpeg                               jpeg jpg jpe
image/png                                png
image/tiff                               tiff tif
model/vrml                               wrl vrml
multipart/form-data
multipart/mixed
text/css                                 css
text/html                                html htm
text/plain                               asc txt
text/sgml                                sgml sgm
text/x-setext                            etx
text/xml                                 xml xsl
video/mpeg                               mpeg mpg mpe
video/quicktime                          qt mov
video/x-msvideo                          avi
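The mapping can be turned into a lookup table keyed by extension. The sketch below assumes the layout shown above (a MIME type followed by zero or more extensions, separated by whitespace) and additionally assumes that lines starting with # are comments. MimeTypes, parse, and lookup are hypothetical names.

```java
import java.util.HashMap;
import java.util.Locale;
import java.util.Map;

// Hypothetical helper: maps file extensions to MIME types.
class MimeTypes {

    // Builds an extension -> MIME type table from mime.types style text.
    static Map<String, String> parse(String text) {
        Map<String, String> byExtension = new HashMap<>();
        for (String line : text.split("\n")) {
            line = line.trim();
            if (line.isEmpty() || line.startsWith("#")) continue;  // assumed comment syntax
            String[] fields = line.split("\\s+");
            // fields[0] is the MIME type; any remaining fields are extensions.
            for (int i = 1; i < fields.length; i++) {
                byExtension.put(fields[i], fields[0]);
            }
        }
        return byExtension;
    }

    // Returns the MIME type for a file name, or null if it is unknown.
    static String lookup(Map<String, String> byExtension, String filename) {
        int dot = filename.lastIndexOf('.');
        if (dot < 0) return null;
        String ext = filename.substring(dot + 1).toLowerCase(Locale.ROOT);
        return byExtension.get(ext);
    }

    public static void main(String[] args) {
        Map<String, String> table =
            parse("text/html html htm\nimage/jpeg jpeg jpg jpe\napplication/xml\n");
        System.out.println(lookup(table, "index.htm"));
        System.out.println(lookup(table, "photo.JPG"));
    }
}
```

A type listed without extensions, such as application/xml, adds nothing to the table, so it can never be chosen from a file name alone.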
Typing the following into the browser's location line
http://localhost:12345/foo/bar
yields
GET /foo/bar HTTP/1.1
Host: localhost:12345
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7
Accept: text/xml,application/xml,application/xhtml+xml,text/html;q=0.9,text/plain;q=0.8,image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive

GET /favicon.ico HTTP/1.1
Host: localhost:12345
User-Agent: Mozilla/5.0 (X11; U; Linux i686; en-US; rv:1.7.12) Gecko/20050922 Fedora/1.0.7-1.1.fc4 Firefox/1.0.7
Accept: image/png,*/*;q=0.5
Accept-Language: en-us,en;q=0.5
Accept-Encoding: gzip,deflate
Accept-Charset: ISO-8859-1,utf-8;q=0.7,*;q=0.7
Keep-Alive: 300
Connection: keep-alive
Where does the second request come from?
Write a Java program to output the HTTP request from a browser. The program can be written using a server socket.
Socket sock = listen.accept();
BufferedReader rd = new BufferedReader(
    new InputStreamReader( sock.getInputStream() ));
BufferedWriter bw = new BufferedWriter(
    new OutputStreamWriter( sock.getOutputStream() ));

// get request
String line = rd.readLine();
if ( line == null ) {
    sock.close();
    continue;
}
String[] words = line.split("\\s+");
System.out.println( line );
do {
    line = rd.readLine();
    if ( line == null ) break;
    System.out.println( line );
} while ( line.length() > 0 );

if ( words[1].equals("/favicon.ico") ) {
    sendNotFound( bw );
}
else {
    sendReply( words[1], bw );
}
rd.close();
bw.close();
sock.close();
This code is almost identical to the LineServer.
The not found reply is generated by:
public static void sendNotFound( BufferedWriter out ) {
    PrintWriter pr = new PrintWriter( out );
    pr.println("HTTP/1.0 404 Not Found");
    pr.println("Connection: close");
    pr.println();
    pr.flush();
}
An HTTP response containing an HTML page is generated by:
public static void sendReply( String path, BufferedWriter out ) {
    PrintWriter pr = new PrintWriter( out );
    pr.println("HTTP/1.0 200 OK");
    pr.println("Connection: close");
    pr.println("Content-Type: text/html");
    pr.println();
    pr.println("<html>");
    pr.println("<body>");
    pr.println("<pre>" + "path: " + path + "</pre>");
    pr.println("<pre>" + new Date() + "</pre>");
    pr.println("</body>");
    pr.println("</html>");
    pr.flush();
}
The HTTP protocol has not been correctly implemented by the above code. What is missing? Why does it still work?
The complete source of this very simple web server is:
import java.io.*;
import java.net.*;
import java.util.Date;

public class HttpRequestViewer {

    public static void sendNotFound( BufferedWriter out ) {
        PrintWriter pr = new PrintWriter( out );
        pr.println("HTTP/1.0 404 Not Found");
        pr.println("Connection: close");
        pr.println();
        pr.flush();
    }

    public static void sendReply( String path, BufferedWriter out ) {
        PrintWriter pr = new PrintWriter( out );
        pr.println("HTTP/1.0 200 OK");
        pr.println("Connection: close");
        pr.println("Content-Type: text/html");
        pr.println();
        pr.println("<html>");
        pr.println("<body>");
        pr.println("<pre>" + "path: " + path + "</pre>");
        pr.println("<pre>" + new Date() + "</pre>");
        pr.println("</body>");
        pr.println("</html>");
        pr.flush();
    }

    public static void main( String[] args ) {
        try {
            int port = Integer.parseInt( args[0] );
            ServerSocket listen = new ServerSocket( port );
            System.out.println
                ("Accepting HTTP request from port: " + listen.getLocalPort());
            while ( true ) {
                Socket sock = listen.accept();
                BufferedReader rd = new BufferedReader(
                    new InputStreamReader( sock.getInputStream() ));
                BufferedWriter bw = new BufferedWriter(
                    new OutputStreamWriter( sock.getOutputStream() ));

                // get request
                String line = rd.readLine();
                if ( line == null ) {
                    sock.close();
                    continue;
                }
                String[] words = line.split("\\s+");
                System.out.println( line );
                do {
                    line = rd.readLine();
                    if ( line == null ) break;
                    System.out.println( line );
                } while ( line.length() > 0 );

                if ( words[1].equals("/favicon.ico") ) {
                    sendNotFound( bw );
                }
                else {
                    sendReply( words[1], bw );
                }
                rd.close();
                bw.close();
                sock.close();
            }
        }
        catch( IOException e ) {
            System.out.println("error: " + e );
        }
    }
}
The interaction between a web browser and a server can be summarized as:
Simple web servers are fairly easy to build. The server waits for an HTTP request, decodes the request, and sends a reply.
The example HTTP server implementation should be critiqued on the following:
The main method for a simple server is:
public static void main( String[] args ) {
    if ( args.length != 1 ) {
        System.out.println("java HttpRequestDispatcher handler");
        System.exit( 1 );
    }
    Runtime.getRuntime().addShutdownHook(
        new Thread() {
            public void run() {
                logger.info("shutting down");
            }
        }
    );
    HttpRequestHandler handler = getHandler( args[0] );
    if ( handler == null ) {
        System.out.println("bad handler");
        System.exit( 1 );
    }
    logger.info("start");
    try {
        ServerSocket listen = new ServerSocket( 8888 );
        while ( true ) {
            // each connection is served by its own dispatcher thread
            Socket sock = listen.accept();
            new HttpRequestDispatcher( sock, handler);
        }
    }
    catch( IOException e ) {
        logger.warning( e.getMessage() );
    }
}
The HttpRequestHandler interface decouples the server from the code that handles the request.
package httpserver;

public interface HttpRequestHandler {
    void handleRequest( HttpRequest request, HttpResponse response )
        throws Exception;
}
HttpResponse and HttpRequest are responsible for the details of the HTTP responses and requests.
The HTTP request handler is configured at run time with:
public static HttpRequestHandler getHandler( String className ) {
    try {
        // load the class by name and call its no-argument constructor
        Class<?> c = Class.forName( className );
        Constructor<?> cons = c.getConstructor();
        Object obj = cons.newInstance();
        return (HttpRequestHandler)obj;
    }
    catch( Exception ex ) {
        logger.warning( ex.getMessage() );
        return null;
    }
}
The name of any class that implements the HttpRequestHandler interface and has a public no-argument constructor can be passed as an argument.
What are the advantages or disadvantages to this approach?
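To help weigh the trade-offs, here is a self-contained sketch of loading a class by name, with the type check made explicit through Class.asSubclass. The names HandlerLoader, Handler, and EchoHandler are hypothetical stand-ins and are not part of the httpserver package.

```java
// Hypothetical, simplified version of run-time handler configuration.
class HandlerLoader {

    // Stand-in for the real handler interface.
    interface Handler {
        String handle(String path);
    }

    // A sample implementation with an implicit no-argument constructor.
    static class EchoHandler implements Handler {
        public String handle(String path) {
            return "echo " + path;
        }
    }

    // Returns a new handler, or null if the class cannot be loaded,
    // does not implement Handler, or cannot be constructed.
    static Handler load(String className) {
        try {
            Class<? extends Handler> c =
                Class.forName(className).asSubclass(Handler.class);
            return c.getDeclaredConstructor().newInstance();
        } catch (ReflectiveOperationException | ClassCastException ex) {
            System.err.println("cannot load " + className + ": " + ex);
            return null;
        }
    }

    public static void main(String[] args) {
        Handler h = load(EchoHandler.class.getName());
        System.out.println(h.handle("/foo/bar"));
    }
}
```

Errors such as a misspelled class name appear only when the program runs, because the compiler cannot check a class that is named in a string.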
The HTTP protocol is decoded with HttpRequest, and an HttpResponse is created to handle the response.
The handler is invoked with:
public void run() {
    BufferedReader rd = null;
    try {
        // XXX assumes a character encoding
        rd = new BufferedReader(
            new InputStreamReader( sock.getInputStream() ));
        HttpRequest request = new HttpRequest( rd );
        HttpResponse response = new HttpResponse(sock.getOutputStream());
        handler.handleRequest( request, response );
    }
    catch( Exception e ) {
        logger.warning( e.getMessage() );
    }
    finally {
        try {
            if ( rd != null ) rd.close();
            sock.close();
        }
        catch( Exception e ) {
            logger.warning( e.getMessage() );
        }
    }
}
package httpserver;

import java.io.*;
import java.net.*;
import java.lang.reflect.*;
import java.util.logging.Logger;

public class HttpRequestDispatcher extends Thread {
    static private Logger logger = Logger.getLogger("http");

    private Socket sock;
    private HttpRequestHandler handler;

    public HttpRequestDispatcher( Socket sock, HttpRequestHandler handler ) {
        this.sock = sock;
        this.handler = handler;
        start();
    }

    public void run() {
        BufferedReader rd = null;
        try {
            // XXX assumes a character encoding
            rd = new BufferedReader(
                new InputStreamReader( sock.getInputStream() ));
            HttpRequest request = new HttpRequest( rd );
            HttpResponse response = new HttpResponse(sock.getOutputStream());
            handler.handleRequest( request, response );
        }
        catch( Exception e ) {
            logger.warning( e.getMessage() );
        }
        finally {
            try {
                if ( rd != null ) rd.close();
                sock.close();
            }
            catch( Exception e ) {
                logger.warning( e.getMessage() );
            }
        }
    }

    public static HttpRequestHandler getHandler( String className ) {
        try {
            // load the class by name and call its no-argument constructor
            Class<?> c = Class.forName( className );
            Constructor<?> cons = c.getConstructor();
            Object obj = cons.newInstance();
            return (HttpRequestHandler)obj;
        }
        catch( Exception ex ) {
            logger.warning( ex.getMessage() );
            return null;
        }
    }

    public static void main( String[] args ) {
        if ( args.length != 1 ) {
            System.out.println("java HttpRequestDispatcher handler");
            System.exit( 1 );
        }
        Runtime.getRuntime().addShutdownHook(
            new Thread() {
                public void run() {
                    logger.info("shutting down");
                }
            }
        );
        HttpRequestHandler handler = getHandler( args[0] );
        if ( handler == null ) {
            System.out.println("bad handler");
            System.exit( 1 );
        }
        logger.info("start");
        try {
            ServerSocket listen = new ServerSocket( 8888 );
            while ( true ) {
                // each connection is served by its own dispatcher thread
                Socket sock = listen.accept();
                new HttpRequestDispatcher( sock, handler);
            }
        }
        catch( IOException e ) {
            logger.warning( e.getMessage() );
        }
    }
}
Most of the information about an HTTP request can be represented by:
public class HttpRequest {
    private String method;
    private String protocol;
    private String path;
    private BufferedReader content;
    private HashMap<String,String> headers = new HashMap<String,String>();
The request is partially decoded by:
public HttpRequest( BufferedReader rd ) throws IOException {
    content = rd;
    String line = rd.readLine();
    if ( line == null ) {
        throw new RuntimeException("empty request");
    }
    // request line: method path protocol
    String[] words = line.split("\\s+", 3);
    if ( words.length != 3 ) {
        throw new RuntimeException("bad request");
    }
    method = words[0];
    path = words[1];
    protocol = words[2];
    // header lines end at the first empty line
    while( (line=rd.readLine()) != null ) {
        if ( line.length() == 0 ) break;
        String[] s = line.split(":", 2);
        // skip malformed lines that have no colon
        if ( s.length != 2 ) continue;
        // drop the optional whitespace around the value
        headers.put( s[0], s[1].trim() );
    }
}
Any content in the request is currently not handled.
package httpserver;

import java.util.HashMap;
import java.util.Iterator;
import java.io.IOException;
import java.io.BufferedReader;

public class HttpRequest {
    private String method;
    private String protocol;
    private String path;
    private BufferedReader content;
    private HashMap<String,String> headers = new HashMap<String,String>();

    public HttpRequest( BufferedReader rd ) throws IOException {
        content = rd;
        String line = rd.readLine();
        if ( line == null ) {
            throw new RuntimeException("empty request");
        }
        // request line: method path protocol
        String[] words = line.split("\\s+", 3);
        if ( words.length != 3 ) {
            throw new RuntimeException("bad request");
        }
        method = words[0];
        path = words[1];
        protocol = words[2];
        // header lines end at the first empty line
        while( (line=rd.readLine()) != null ) {
            if ( line.length() == 0 ) break;
            String[] s = line.split(":", 2);
            // skip malformed lines that have no colon
            if ( s.length != 2 ) continue;
            // drop the optional whitespace around the value
            headers.put( s[0], s[1].trim() );
        }
    }

    public String getMethod() {
        return method;
    }

    public String getPath() {
        return path;
    }

    public String getProtocol() {
        return protocol;
    }

    public String getHeader( String key ) {
        return headers.get( key );
    }

    public Iterator<String> getHeaderNames() {
        return headers.keySet().iterator();
    }
}
Is a HashMap a good choice to store headers (i.e., what assumptions are being made)?
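One assumption worth examining is that a header is always looked up with exactly the spelling the client used. HTTP header names are case-insensitive, so the sketch below shows one possible alternative based on a TreeMap ordered by String.CASE_INSENSITIVE_ORDER. HeaderMap is a hypothetical class, and it still keeps only one value per name.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical header store that ignores the case of header names.
class HeaderMap {
    private final Map<String, String> headers =
        new TreeMap<>(String.CASE_INSENSITIVE_ORDER);

    // A later value for the same name replaces the earlier one.
    void put(String name, String value) {
        headers.put(name, value);
    }

    // Returns the value for the name in any letter case, or null.
    String get(String name) {
        return headers.get(name);
    }

    int size() {
        return headers.size();
    }

    public static void main(String[] args) {
        HeaderMap h = new HeaderMap();
        h.put("Content-Type", "text/html");
        System.out.println(h.get("content-type"));
    }
}
```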
The state of an HTTP response can be represented by:
private OutputStream out;
private String contentType = "text/html";
private byte[] headers = null;
Some HTTP protocol information is handled with:
public static final int RESP_OK = 200;
public static final int RESP_FILE_NOT_FOUND = 404;
public static final int RESP_METHOD_NOT_ALLOWED = 405;
Normal and error HTTP responses are generated with:
private void sendResponseStart() {
    try {
        String s = "HTTP/1.0 " + responseCode + " see body\n";
        out.write( s.getBytes() );
        s = "Connection: close\n";
        out.write( s.getBytes() );
        s = "Content-Type: " + contentType + "\n";
        out.write( s.getBytes() );
        if ( headers != null ) {
            out.write( headers );
        }
        headers = null; // why?
        out.write( '\n' ); // empty line ends the headers
        out.flush();
    }
    catch( IOException e ) {
        throw new RuntimeException( e.getMessage() );
    }
}

public void sendError( int code, String message ) {
    // getWriter() sends the status line and headers, so the
    // status and content type must be set before it is called
    responseCode = code;
    contentType = "text/html";
    PrintWriter pw = getWriter();
    pw.println("<html><body><pre>");
    pw.println( message );
    pw.println("</pre></body></html>");
    pw.flush();
}
The response is controlled with:
public void setHeaders( byte[] headers ) {
    this.headers = headers;
}

public void setContentType( String t ) {
    contentType = t;
}

public void setStatus( int code ) {
    responseCode = code;
}
How can this class's methods be misused? Give a failure scenario.
package httpserver;

import java.util.ArrayList;
import java.io.PrintWriter;
import java.io.BufferedWriter;
import java.io.OutputStream;
import java.io.OutputStreamWriter;
import java.io.IOException;

public class HttpResponse {
    public static final int RESP_OK = 200;
    public static final int RESP_FILE_NOT_FOUND = 404;
    public static final int RESP_METHOD_NOT_ALLOWED = 405;

    private OutputStream out;
    private String contentType = "text/html";
    private byte[] headers = null;
    private int responseCode = RESP_OK;
    private boolean headerSent = false;

    public HttpResponse( OutputStream out ) {
        this.out = out;
    }

    public void setHeaders( byte[] headers ) {
        this.headers = headers;
    }

    public void setContentType( String t ) {
        contentType = t;
    }

    public void setStatus( int code ) {
        responseCode = code;
    }

    public OutputStream getOutputStream() {
        if ( !headerSent ) {
            headerSent = true;
            sendResponseStart();
        }
        return out;
    }

    public PrintWriter getWriter() {
        if ( !headerSent ) {
            headerSent = true;
            sendResponseStart();
        }
        return new PrintWriter(
            new BufferedWriter( new OutputStreamWriter( out )));
    }

    private void sendResponseStart() {
        try {
            String s = "HTTP/1.0 " + responseCode + " see body\n";
            out.write( s.getBytes() );
            s = "Connection: close\n";
            out.write( s.getBytes() );
            s = "Content-Type: " + contentType + "\n";
            out.write( s.getBytes() );
            if ( headers != null ) {
                out.write( headers );
            }
            headers = null;
            out.write( '\n' ); // empty line ends the headers
            out.flush();
        }
        catch( IOException e ) {
            throw new RuntimeException( e.getMessage() );
        }
    }

    public void sendError( int code, String message ) {
        // getWriter() sends the status line and headers, so the
        // status and content type must be set before it is called
        responseCode = code;
        contentType = "text/html";
        PrintWriter pw = getWriter();
        pw.println("<html><body><pre>");
        pw.println( message );
        pw.println("</pre></body></html>");
        pw.flush();
    }
}
FileHandler extends BaseHandler and implements the HTTP request handler interface, HttpRequestHandler. The BaseHandler provides common functionality for all HTTP request handlers.
The request handler first ensures that the request is either a GET or POST request. Can the condition be rewritten for clarity?
public void handleRequest( HttpRequest request, HttpResponse response )
    throws Exception
{
    String method = request.getMethod();
    if ( !method.equals("GET") && !method.equals("POST") ) {
        String msg = "only handle GET and POST";
        log( msg );
        response.sendError(HttpResponse.RESP_METHOD_NOT_ALLOWED, msg);
        return;
    }
The log method is inherited from BaseHandler. Inheritance is used here to aid software reuse, not generalization.
A file can be served either as a plain text file or as an HTML file. The suffix -plain on the requested path indicates that the file should be served as plain text.
    File cwd = new File(".");
    String path = request.getPath();
    // remove leading '/'
    String filename = path.substring(1);
    String suffix = "-plain";
    if ( filename.endsWith( suffix ) ) {
        int len = filename.length() - suffix.length();
        filename = filename.substring(0, len);
        response.setContentType("text/plain");
    }
The file is processed and sent to the browser with:
    log( request.getMethod() + " " + path );
    File f = new File( cwd, filename );
    if ( ! f.exists() ) {
        log( f + " not found" );
        String msg = path + " not found";
        response.sendError(HttpResponse.RESP_FILE_NOT_FOUND, msg);
        return;
    }
    PrintWriter pw = response.getWriter();
    BufferedReader rd = new BufferedReader( new FileReader( f ));
    String line;
    while( (line=rd.readLine()) != null ) {
        pw.println( line );
    }
    pw.flush();
    rd.close(); // clean up resources
package httpserver;

import java.io.File;
import java.io.FileReader;
import java.io.PrintWriter;
import java.io.BufferedReader;

public class FileHandler extends BaseHandler implements HttpRequestHandler {

    public FileHandler() {
    }

    public void handleRequest( HttpRequest request, HttpResponse response )
        throws Exception
    {
        String method = request.getMethod();
        if ( !method.equals("GET") && !method.equals("POST") ) {
            String msg = "only handle GET and POST";
            log( msg );
            response.sendError(HttpResponse.RESP_METHOD_NOT_ALLOWED, msg);
            return;
        }
        File cwd = new File(".");
        String path = request.getPath();
        // remove leading '/'
        String filename = path.substring(1);
        String suffix = "-plain";
        if ( filename.endsWith( suffix ) ) {
            int len = filename.length() - suffix.length();
            filename = filename.substring(0, len);
            response.setContentType("text/plain");
        }
        log( request.getMethod() + " " + path );
        File f = new File( cwd, filename );
        if ( ! f.exists() ) {
            log( f + " not found" );
            String msg = path + " not found";
            response.sendError(HttpResponse.RESP_FILE_NOT_FOUND, msg);
            return;
        }
        PrintWriter pw = response.getWriter();
        BufferedReader rd = new BufferedReader( new FileReader( f ));
        String line;
        while( (line=rd.readLine()) != null ) {
            pw.println( line );
        }
        pw.flush();
        rd.close(); // clean up resources
    }
}
The BaseHandler provides logging functionality for its subclasses.
package httpserver;

import java.util.logging.Logger;

/**
 * BaseHandler should be extended by all HTTP request handlers.
 * It provides some common utility methods.
 */
public class BaseHandler {
    static private Logger logger = Logger.getLogger("http");

    public void log( String message ) {
        logger.info( message );
    }
}
BaseHandler does not implement the HttpRequestHandler interface. Should it? Are there any alternatives?
Logging is done with the java.util.logging.Logger class. Notice that only one logging object should be used by the program. The singleton design pattern is illustrated by the getLogger method.
Assuming that the httpserver code lives in the httpserver directory and that you have created a directory named classes therein, the server can be compiled and tested with:
javac -d classes -cp .. *.java
The -d classes switch sends all the generated class files to the classes directory.
java -cp classes httpserver.HttpRequestDispatcher httpserver.FileHandler
The handleRequest method of httpserver.ImageCountHandler generates a page-counting image.
public void handleRequest( HttpRequest request, HttpResponse response )
    throws Exception
{
    String method = request.getMethod();
    String path = request.getPath();
    log( method + " " + path );
    if ( !method.equals("GET") && !method.equals("POST") ) {
        String msg = "only handle GET and POST";
        log( msg );
        response.sendError(HttpResponse.RESP_METHOD_NOT_ALLOWED, msg);
        return;
    }
    if ( path.equals( "/no" ) ) {
        response.setHeaders( nocache.getBytes() );
    }
    response.setContentType("image/png");
    String c = nextCount() + "";
    try {
        writePNGImageString( 70, 20, c, response.getOutputStream());
    }
    catch( IOException ex ) {
        //XXX fix me
    }
}
The log method reports the HTTP request.
The count is maintained by:
private int count = 0;

// synchronized: one handler instance is shared by all dispatcher threads
private synchronized int nextCount() {
    return count++;
}
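The dispatcher starts one thread per connection and hands the same handler object to each of them, so a counter like this can be reached from several threads at once. As a sketch of one way to keep such a counter consistent, java.util.concurrent.atomic.AtomicInteger can be used. HitCounter is a hypothetical class written for this illustration.

```java
import java.util.concurrent.atomic.AtomicInteger;

// Hypothetical thread-safe counter.
class HitCounter {
    private final AtomicInteger count = new AtomicInteger(0);

    // Returns the current value and then increments it, as one atomic step.
    int nextCount() {
        return count.getAndIncrement();
    }

    int current() {
        return count.get();
    }

    public static void main(String[] args) throws InterruptedException {
        final HitCounter counter = new HitCounter();
        Thread[] threads = new Thread[4];
        for (int i = 0; i < threads.length; i++) {
            threads[i] = new Thread(() -> {
                for (int j = 0; j < 1000; j++) counter.nextCount();
            });
            threads[i].start();
        }
        for (Thread t : threads) t.join();
        System.out.println(counter.current());  // prints 4000
    }
}
```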
package httpserver;

import java.io.RandomAccessFile;
import java.io.IOException;
import java.io.OutputStream;
import java.awt.image.BufferedImage;
import java.awt.Graphics2D;
import java.awt.Color;
import java.awt.Font;
import java.awt.FontMetrics;
import javax.imageio.ImageIO;
import java.awt.geom.Rectangle2D;

public class ImageCountHandler extends BaseHandler
    implements HttpRequestHandler
{
    public ImageCountHandler() {
    }

    private static String nocache =
        "Cache-control: no-cache\n" +
        "Cache-control: no-store\n" +
        "Pragma: no-cache\n" +
        "Expires: 0\n";

    private int count = 0;

    // synchronized: one handler instance is shared by all dispatcher threads
    private synchronized int nextCount() {
        return count++;
    }

    private void writePNGImageString(
        int imgWidth, int imgHeight, String str, OutputStream out )
        throws IOException
    {
        BufferedImage image = new BufferedImage(
            imgWidth, imgHeight, BufferedImage.TYPE_4BYTE_ABGR);
        Graphics2D g2d = image.createGraphics();
        Font f = new Font( "Monospaced", Font.BOLD, 14 );
        g2d.setFont( f );
        FontMetrics fm = g2d.getFontMetrics();
        // white background
        g2d.setColor( Color.white );
        g2d.fillRect(0, 0, imgWidth, imgHeight );
        // red text, centred in the image
        g2d.setColor( Color.red );
        Rectangle2D bounds = fm.getStringBounds( str, g2d );
        int w = (int)bounds.getWidth();
        int h = (int)bounds.getHeight();
        int x = (imgWidth - w)/2;
        int y = imgHeight - (imgHeight - h)/2;
        g2d.drawString( str, x, y );
        ImageIO.write(image, "png", out );
    }

    public void handleRequest( HttpRequest request, HttpResponse response )
        throws Exception
    {
        String method = request.getMethod();
        String path = request.getPath();
        log( method + " " + path );
        if ( !method.equals("GET") && !method.equals("POST") ) {
            String msg = "only handle GET and POST";
            log( msg );
            response.sendError(HttpResponse.RESP_METHOD_NOT_ALLOWED, msg);
            return;
        }
        if ( path.equals( "/no" ) ) {
            response.setHeaders( nocache.getBytes() );
        }
        response.setContentType("image/png");
        String c = nextCount() + "";
        try {
            writePNGImageString( 70, 20, c, response.getOutputStream());
        }
        catch( IOException ex ) {
            //XXX fix me
        }
    }
}
This server is run with the command:
java -cp classes httpserver.HttpRequestDispatcher httpserver.ImageCountHandler
Logging is provided by the java.util.logging package from the Java API.
A logger instance is created with:
static private Logger logger = Logger.getLogger("http");
A static variable is used since only one instance per program is necessary. The Java API will only create one instance, even if Logger.getLogger is invoked again.
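This behaviour can be checked directly: two calls to Logger.getLogger with the same name return the same object. The class name LoggerDemo and the method sameInstance are hypothetical names for this small check.

```java
import java.util.logging.Logger;

// Hypothetical check that getLogger returns one shared instance per name.
class LoggerDemo {

    static boolean sameInstance(String name) {
        Logger first = Logger.getLogger(name);
        Logger second = Logger.getLogger(name);
        // Reference equality: both variables refer to one Logger object.
        return first == second;
    }

    public static void main(String[] args) {
        System.out.println(sameInstance("http"));  // prints true
    }
}
```

Loggers obtained with different names are different objects, which is why every class in the server asks for the logger named "http".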
Messages can be placed in the following categories, from most to least severe: severe, warning, info, config, fine, finer, and finest. Each of these categories has a matching method that accepts a String argument. For example:
logger.warning( e.getMessage() );
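The log can be saved to a file with the following code. The fully qualified class name avoids a clash with httpserver.FileHandler, and the logger name matches the "http" logger used by the server. The FileHandler constructor can throw an IOException.
java.util.logging.FileHandler handler =
    new java.util.logging.FileHandler("server.log");
Logger logger = Logger.getLogger("http");
logger.addHandler( handler );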