Web Site Stacks

Software Engineering Large Practical

I really want to develop an app?

  • I think I'm being more than flexible anyway, but fine
  • Justify this choice in your report
  • Certainly nothing to stop you developing a browser-based interface and a mobile app
  • Make really sure it can be tested (i.e. emulated) on DiCE
  • I highly recommend submitting a proposal to make sure
  • The app must at least connect to a server, so you will still need to write that
  • This is your own choice and hence own risk

Can I use third-party libraries/frameworks?

  • Yes, I highly encourage you to do so
  • Just make sure everything works on DiCE so that I can test it
  • The point of specifying DiCE is simply so that I can be sure of being able to test your code
  • It is not to force some specific environment with a limited set of dependencies

Installation to the local directory?

  • “I cannot install things I need because I have no root access”
  • Install to the local directory; how you do this generally depends upon what it is you are installing

$ ./configure --prefix=${HOME}/my-dir
$ make && make install
  

Installation to the local directory?


$ pip install -t ${HOME}/my-dir <package-name>
  

$ gem install --user-install <gem-name>
  
In all cases make sure this will work for any directory rather than specifically the one you are developing in
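
For Python packages installed with pip install -t, the target directory also has to be on the import path. A minimal sketch, assuming the libraries live in a directory called my-dir next to your script (the function names here are hypothetical), which resolves that directory relative to the script rather than hard-coding your own development path:

```python
import sys
from pathlib import Path


def local_lib_dir(script_file: str, dirname: str = "my-dir") -> Path:
    """Return the local install directory, resolved relative to the script."""
    return Path(script_file).resolve().parent / dirname


def add_local_libs(script_file: str, dirname: str = "my-dir") -> Path:
    """Put the local install directory at the front of the import path."""
    lib_dir = local_lib_dir(script_file, dirname)
    if str(lib_dir) not in sys.path:
        sys.path.insert(0, str(lib_dir))
    return lib_dir
```

Calling add_local_libs(__file__) at the top of the main script should then work wherever the project is checked out.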

Can I use an existing project?

  • See the clarifications answer here
  • Basically, yes if you clearly mark which parts were done before the start of this project
    • Start source code control now and tag the current state as the start
    • But still clearly state what you have done for this project in your report
  • Mistakes you have made before now will not be judged
    • But they may make it more difficult not to make further mistakes which will be judged

Clarifications

  • All of these questions have been clarified on the SELP web page
  • See here

Today's Lecture

  • I will talk about your web site stack

How a web page is served

  • Your browser makes HTTP requests to a web server
  • For a static page the server locates the page on disk and sends that to the client browser
  • For a dynamic page, the server must compute the page to serve to the client browser:
    • That computation may be based on the URL used to make the request
    • It may be based on the content of the request
    • It is likely based on the persistent state of the server

HyperText Transfer Protocol

  • HTTP is a client-server request-response protocol
  • It is connection oriented, using the Transmission Control Protocol (TCP)
  • Usually a browser is the client, but also web-crawlers and mobile apps
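
To make the request-response idea concrete, here is a minimal sketch of what travels over the TCP connection. The helper names build_request and parse_response are my own, and the parsing is deliberately simplified (it assumes a well-formed status line with a reason phrase):

```python
def build_request(method: str, path: str, host: str) -> bytes:
    """Build the bytes of a bare HTTP/1.1 request with no body."""
    lines = [
        f"{method} {path} HTTP/1.1",
        f"Host: {host}",
        "Connection: close",
        "",  # blank line ends the headers
        "",
    ]
    return "\r\n".join(lines).encode("ascii")


def parse_response(raw: bytes):
    """Split a raw response into (status code, headers, body)."""
    head, _, body = raw.partition(b"\r\n\r\n")
    status_line, *header_lines = head.decode("iso-8859-1").split("\r\n")
    _version, status, _reason = status_line.split(" ", 2)
    headers = {}
    for line in header_lines:
        name, _, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return int(status), headers, body
```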

9 possible requests

  1. GET
  2. HEAD
  3. POST
  4. PUT
  5. DELETE
  6. TRACE
  7. OPTIONS
  8. CONNECT
  9. PATCH

But you only really care about 2

  1. GET - Requests that the web-server sends the resource specified by the given URI. This should not modify the state of the server
  2. HEAD
  3. POST - Requests that the server accepts the data provided, usually to modify the state of the server (or even external state)
  4. PUT
  5. DELETE
  6. TRACE
  7. OPTIONS
  8. CONNECT
  9. PATCH

Safe

  • GET, HEAD, OPTIONS, and TRACE
  • Only information retrieval, although safe side effects such as logging and counting are fine.

Unsafe

  • POST, PUT, DELETE, and PATCH
  • Should not be used by conforming web crawlers for example

Nothing actually enforces either of these constraints though
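
Since nothing enforces the distinction, a well-behaved client has to check for itself. A small sketch of how a conforming crawler might do so, using the two lists above (the function names are hypothetical):

```python
# The safe and unsafe methods exactly as listed above
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})
UNSAFE_METHODS = frozenset({"POST", "PUT", "DELETE", "PATCH"})


def is_safe(method: str) -> bool:
    """True if the method is meant for information retrieval only."""
    return method.upper() in SAFE_METHODS


def crawler_may_send(method: str) -> bool:
    """A conforming crawler restricts itself to safe methods."""
    return is_safe(method)
```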

If you fancy implementing this

A better alternative

  • Is to use an existing web server
  • To do this, you need some way for the web server and your code to talk to each other
  • This may well sound like you are simply moving the problem
    • Instead of talking to the client browser you now need to talk to the web server
    • There are several good reasons for this, including:
      • Talking across a network is much harder than talking to an entity on the same machine
      • The server can worry about load balancing, concurrent requests etc.

The anatomy of a web server

Modifying Web Server Code

  • You could take an existing web server and modify the source code
  • Unfortunately, accepting requests and performing remote communication are different kinds of task from preparing web pages
    • Preparing web pages often involves a lot of string manipulation
  • Additionally many people had existing applications that they simply wanted to provide a web interface for
  • Many people simply wanted to write their web applications in Perl
  • Many web servers host several applications

Common Gateway Interface

  • Allows the web server to call a command-line program
  • The command-line program may be written in any programming language
  • Has the drawback that a new process must be created for each request
    • Some techniques are available to get around this such as “pre-forking” the process
    • But ultimately the limitation is there
  • An alternative is to run the code for your web application inside the process of the web-server
  • This involves extending the web-server, usually through a module system
  • Some kind of binding will be required if you wish to do so in a language other than the one in which the web server itself is written
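
A minimal sketch of a CGI program, assuming the server passes the query string in the QUERY_STRING environment variable and reads the response from standard output, as CGI specifies. The name parameter and the function name cgi_response are illustrative only:

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ) -> str:
    """Compute the full CGI output: headers, a blank line, then the body."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["Stranger"])[0]
    body = f"<p>Hello {html.escape(name)}!</p>"
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server starts a fresh process running this for every request
    sys.stdout.write(cgi_response(os.environ))
```

The one-process-per-request drawback is visible here: everything, including interpreter start-up, happens again for each request.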

Apache

  • The Apache web server software can run CGI web applications
  • However, this is simply a module within Apache: mod_cgi
  • There are many modules allowing you to write your web application code in many languages
  • Each module exports its own API to the desired language, for example:
    • mod_mruby
    • mod_php

Nginx

  • Addresses the problem of high load; more than 10k concurrent requests
  • Also has a module system with modules for:
    • FastCGI - CGI with pre-forking
    • WSGI - A standard for Python web applications
    • Clojure, Java, Groovy
    • Many others, including third-party addons

Which server to choose

  • Depends upon your development language
    • Which in turn depends on what you are developing
  • Many languages have their own web server

How a web application is developed

  • It is a complex business to write a fast web-server
  • This involves balancing the number of threads used for concurrent connections
  • In development however, you may be sure of only a single connection
  • So you do not need a full-blown web server
  • You only require something that will:
    • accept HTTP requests on a given port
    • translate those requests and forward them to your web application code
    • accept the responses from your web application code and forward those back to the original requester (browser/test code)

Web Application Frameworks

  • Typically include a simple local web server
  • Assist the developer in sticking to the protocol demanded of one or many production web servers
  • Hence your web application is developed using the simple local server but can be deployed without further modification using a production web server
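
For Python the shared protocol is WSGI. A minimal sketch, assuming only the standard library: the application callable below is what a production server would load, while serve_locally (a hypothetical helper, with an assumed port of 8080) runs the same callable on the simple reference server for development:

```python
from wsgiref.simple_server import make_server


def application(environ, start_response):
    """A WSGI application: takes the request environment, returns the body."""
    path = environ.get("PATH_INFO", "/")
    if path == "/hello":
        status, body = "200 OK", b"Hello World!"
    else:
        status, body = "404 Not Found", b"Not found"
    headers = [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ]
    start_response(status, headers)
    return [body]


def serve_locally(port: int = 8080) -> None:
    """Run the application on the simple development server (blocks)."""
    with make_server("localhost", port, application) as httpd:
        httpd.serve_forever()
```

Nothing in application refers to the development server, which is why it can be deployed unchanged.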

Do not worry about deployment

  • For this project you need not worry about deployment to production
  • Provided you have used a similar setup I can test your web server locally using the development setup

URLs and Requests to Pages

  • When designing your web application you do not wish to be concerned with connections, threads or protocols
  • Whichever framework you ultimately decide to use will provide a means such that all you need to provide is some function:
    • relative URL + request → String (usually HTML)
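
A rough sketch of what the framework's side of this bargain looks like, assuming exact-match URLs only. This is my own illustration rather than how any particular framework is implemented; ROUTES, route and dispatch are hypothetical names:

```python
from typing import Callable, Dict

Handler = Callable[[dict], str]

# Maps each relative URL to the function that produces its page
ROUTES: Dict[str, Handler] = {}


def route(path: str):
    """Decorator that registers a handler for a relative URL."""
    def register(handler: Handler) -> Handler:
        ROUTES[path] = handler
        return handler
    return register


@route("/hello")
def hello(request: dict) -> str:
    return "Hello World!"


def dispatch(path: str, request: dict) -> str:
    """relative URL + request -> String: look up the handler and call it."""
    handler = ROUTES.get(path)
    if handler is None:
        return "<p>404: no such page</p>"
    return handler(request)
```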

URLs and Requests to Pages

This just returns a simple string, which the browser is of course capable of rendering.

from bottle import route, run

@route('/hello')
def hello():
    return "Hello World!"

# Start bottle's built-in development server
run(host='localhost', port=8080)

URLs and Requests to Pages

Generally you would wish to return HTML formatted output rather than a simple string
from bottle import route, run

@route('/hello')
def hello():
    return """<!doctype html>
<html>
    <head>
        <title>Hello</title>
    </head>
    <body>
        Hello World!
    </body>
</html>
"""

URLs and Requests to Pages

I may want some dynamic routes, such that part of the URL is a parameter that is used to generate the output:
from bottle import route, run

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    prefix =  """<!doctype html>
<html>
    <head>
        <title>Hello</title>
    </head>
    <body>
        Hello """
    suffix = """
    </body>
</html>
"""
    return prefix + name + suffix
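
One caveat with concatenating a URL parameter straight into the page: the name is chosen by whoever wrote the URL, so it may itself contain HTML. A minimal sketch of the same page with the parameter escaped first, using only the standard library (render_hello is a hypothetical helper, independent of any framework):

```python
import html


def render_hello(name: str = "Stranger") -> str:
    """Build the hello page, escaping the user-supplied part."""
    safe_name = html.escape(name)  # turns < > & and quotes into entities
    return (
        "<!doctype html>\n"
        "<html>\n"
        "    <head><title>Hello</title></head>\n"
        f"    <body>Hello {safe_name}</body>\n"
        "</html>\n"
    )
```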

URLs and Requests to Pages

Of course you may also examine the actual request content. This is most obvious in a POST request: usually some kind of form asks the user for bits of input and is submitted as a POST request:
from bottle import route, request

@route('/login')
def login():
    return '''
        <form action="/login" method="post">
            Username: <input name="username" type="text" />
            Password: <input name="password" type="password" />
            <input value="Login" type="submit" />
        </form>
    '''
...

URLs and Requests to Pages

The form's POST request is then handled by a second function, registered for the same URL with method='POST', which reads the submitted fields from the request:
from bottle import route, request
...

@route('/login', method='POST')
def do_login():
    username = request.forms.get('username')
    password = request.forms.get('password')
    if check_login(username, password):
        return "<p>Your login information was correct.</p>"
    else:
        return "<p>Login failed.</p>"

Server State

Database

  • For most of your applications you will not need sophisticated database operations
  • Simple Create Read Update and Delete (CRUD) operations will be sufficient for many
  • In a production web site you would look at using a full-blown database server
  • You may use the school's PostgreSQL installation:
    • See instructions here
    • If you are not taking a Database related course you will need to ask for an account using the support form

URLs and Requests to Pages

So our previous login might look up the user in a database:
from bottle import route, request
...

def check_login(username, password):
    user_password = look_up_password_in_database(username)
    return user_password == password

@route('/login', method='POST')
def do_login():
    # As before .... including if check_login (..)

Of course, it would be horribly insecure to store the passwords as plain text
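
A minimal sketch of one common alternative: store a salted hash and compare hashes at login. It uses only the standard library; the USERS dictionary is a hypothetical stand-in for the database, and the iteration count is an illustrative assumption rather than a recommendation:

```python
import hashlib
import hmac
import os
from typing import Optional, Tuple

ITERATIONS = 100_000  # assumed value, for illustration only

# Hypothetical stand-in for the database: username -> (salt, digest)
USERS = {}


def hash_password(password: str, salt: Optional[bytes] = None) -> Tuple[bytes, bytes]:
    """Derive a digest from the password with PBKDF2 and a per-user salt."""
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def register(username: str, password: str) -> None:
    """Store only the salt and digest, never the password itself."""
    USERS[username] = hash_password(password)


def check_login(username: str, password: str) -> bool:
    """Re-derive the digest with the stored salt and compare."""
    record = USERS.get(username)
    if record is None:
        return False
    salt, stored = record
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored)
```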

SQLite and similar

  • Since you are not necessarily making a production server a server-free database is also fine
  • For example SQLite is a self-contained, serverless, zero-configuration, transactional SQL database engine
  • It allows only one writer at a time (concurrent reads are fine), but unless concurrency forms part of your proposal you are unlikely to need more than that
  • You could also simply store state in a local file
    • However, if you do not use a standard database you should detail how you would scale this
    • In particular how easy is it for someone to modify your source to use a proper database?
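
A minimal sketch of the four CRUD operations using the sqlite3 module from the Python standard library. The notes table and the function names are made up for illustration; pass a file name instead of ":memory:" to keep the state between runs:

```python
import sqlite3


def open_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open (or create) the database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(
        "CREATE TABLE IF NOT EXISTS notes (id INTEGER PRIMARY KEY, body TEXT NOT NULL)"
    )
    return conn


def create_note(conn, body):
    cur = conn.execute("INSERT INTO notes (body) VALUES (?)", (body,))
    conn.commit()
    return cur.lastrowid


def read_note(conn, note_id):
    row = conn.execute("SELECT body FROM notes WHERE id = ?", (note_id,)).fetchone()
    return row[0] if row else None


def update_note(conn, note_id, body):
    conn.execute("UPDATE notes SET body = ? WHERE id = ?", (body, note_id))
    conn.commit()


def delete_note(conn, note_id):
    conn.execute("DELETE FROM notes WHERE id = ?", (note_id,))
    conn.commit()
```

Using ? placeholders rather than string concatenation keeps user input out of the SQL text.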

ORMs

  • ORMs relate objects in your programming language to database entities
  • This is great as it means you do not need to learn much about databases
  • A good ORM will also provide mappings to different databases meaning that you could switch between them if needed
  • For example, switching between an SQLite database and a full MySQL server can be a simple configuration option/parameter
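
To show the idea only, here is a toy sketch of the mapping an ORM performs, built on sqlite3 and dataclasses from the standard library. It is not a real ORM and the Table class is entirely hypothetical; a real one also handles types, keys, queries and multiple database back ends:

```python
import sqlite3
from dataclasses import astuple, dataclass, fields


@dataclass
class User:
    username: str
    email: str


class Table:
    """Maps one dataclass onto one table with matching column names."""

    def __init__(self, conn, model):
        self.conn = conn
        self.model = model
        self.name = model.__name__.lower()
        self.columns = [f.name for f in fields(model)]
        column_list = ", ".join(self.columns)
        conn.execute(f"CREATE TABLE IF NOT EXISTS {self.name} ({column_list})")

    def add(self, obj) -> None:
        """Store one object as one row."""
        marks = ", ".join("?" for _ in self.columns)
        self.conn.execute(f"INSERT INTO {self.name} VALUES ({marks})", astuple(obj))

    def all(self):
        """Load every row back as an object of the model class."""
        column_list = ", ".join(self.columns)
        rows = self.conn.execute(f"SELECT {column_list} FROM {self.name}")
        return [self.model(*row) for row in rows]
```

The application code only ever sees User objects; the SQL stays inside Table.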

Templating

  • Typically displaying dynamic web pages results in a lot of string manipulation
  • Some pages will have small “holes” in them which are filled in by dynamic content
  • Often you will have a number of elements which must be displayed in HTML

Templating

Recall our previous example of saying hello to someone:
from bottle import route, run

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    prefix =  """<!doctype html>
<html>
    <head>
        <title>Hello</title>
    </head>
    <body>
        Hello """
    suffix = """
    </body>
</html>
"""
    return prefix + name + suffix

Templating

It is hard to disagree that this is more readable using a template:
from bottle import route, run, template

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    output_string =  """<!doctype html>
<html>
    <head>
        <title>Hello</title>
    </head>
    <body>
        Hello {{name}}
    </body>
</html>
"""
    return template(output_string, name=name)

Templating

Search Results

  • You often have many more than one parameterised part of your document
  • For example, suppose you have a page that accepts some kind of query, when the query is made you want to return the results

StringBuilder formatted = new StringBuilder();
formatted.append("<ul>");
for (Result result : search_results) {
    formatted.append("<li>");
    formatted.append(result.format());
    formatted.append("</li>");
}
formatted.append("</ul>");
return formatted.toString();

Templating

Search Results

  • With a templating engine the string concatenation is done for you, but you have to escape to the code part

<ul>
    <% for (Result result : search_results) { %>
        <li>
            <%= result.format() %>
        </li>
    <% } %>
</ul>

Templating vs Non-templating

  • Using a templating engine you have to escape the code
  • Without one you are escaping the output
  • General rule is that if you have more output than code use a templating engine
  • Templating engines exist for a reason
    • If you find you have more code than output that is probably a sign that you could better separate the formatting from calculation of the results
    • Think how difficult it would be to re-design the look of your web site

Templating and Inheritance

  • Templating engines usually have some form of inheritance
  • This is useful, it allows you to define a “root” template used by all pages on your site
  • This gives them all the same header/footer, navigation etc.
  • This means that you can easily change this for all pages at once
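
A minimal sketch of the “root” template idea using string.Template from the Python standard library. It has no real inheritance mechanism, so this only emulates it by substituting each page's content into one shared layout; the layout text and render_page are invented for illustration:

```python
from string import Template

# The shared layout: every page gets the same navigation and footer
BASE = Template("""<!doctype html>
<html>
    <head><title>$title</title></head>
    <body>
        <nav>Home | About</nav>
        $content
        <footer>Example site</footer>
    </body>
</html>
""")


def render_page(title: str, content: str) -> str:
    """Fill the page-specific holes in the shared layout."""
    return BASE.substitute(title=title, content=content)
```

Changing the navigation in BASE changes it on every page rendered through render_page.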

Web 1.0

  • Web 1.0 is characterised by a few features, such as gif navigation buttons, proprietary browser extensions, lack of CSS
  • The most characteristic element though is the use of static HTML
  • Every time the content on the screen must change, a request is made to the web server and an entirely new web page is sent back to the browser
  • This can actually work quite well for certain domains, for example some adventure games such as Kingdom of Loathing
  • But it leads to rather unresponsive and constricted application design

Web 1.0

  • Simple examples of wishing to change the display of the content without referring back to the web server include:
    • Validating form input
    • Changing the sorting order of a list of items, e.g. by price, by relevance etc.
    • Expanding/collapsing hideable elements such as in a tree-view
    • Notebooks with tabs

Javascript & Web 1.0

  • Therefore, depending on your domain, there is a strong chance you will want some dynamic HTML
  • This likely means some Javascript
  • There are alternatives, these tend to fall into four categories:
    1. Mild transformations which translate to Javascript
    2. Strong/statically typed (new or existing) languages which compile to Javascript
    3. Interpreters for existing languages written in Javascript
    4. Entirely new languages which require some extension to the browser
  • It is up to you which of these you go for, but I recommend that at least for this project you avoid the last category
  • I recommend trying CoffeeScript which is in category one

CSS

  • Cascading Style Sheets allow you to separate your content from its display
  • This is very useful, particularly because often the person skilled in creating a pleasing display is not the same person as the one skilled in creating/computing the dynamic content
  • Of course in an individual practical you will perform both roles
  • But proper use of CSS will allow you to demonstrate that your graphical design can be modified without need to modify the content generation code

HTML5 + CSS + Javascript

  • This may seem like a cumbersome method to produce a front-end user interface
  • In some ways it is: HTML was originally the display mechanism and hence is not perfect as a vehicle for content
  • However, if you try writing a UI for a non-web application, you will quickly find that it is difficult to get the layout correct
  • HTML has evolved over time to make the layout work on many different screens etc.
  • So actually you find that this separation works pretty well

Summary of your Stack

  1. A web framework or library which provides:
    1. A means for you to write: URL + Request → HTML
    2. A local web server to locally test your web application
    3. This will have the same API as a production-grade web server
  2. A means to produce HTML, probably a templating language
  3. A means to Create, Read, Update and Delete persistent state, i.e. a database
    • Hopefully an ORM to assist and abstract from this
  4. Some form of dynamic HTML manipulation, which means some form of Javascript

Summary of Steps

  • A request is made to the web server, which translates it into a call to some method/function of your web application
  • Depending on the request, your application will retrieve/update some state on the database
  • Then produce a string, which will likely be some HTML, which in turn is likely produced via a templating engine
  • The HTML is then returned by the web server to the client browser and displayed there
  • That HTML may well include some Javascript which will allow modification of the displayed page without referring back to the web server

Which Framework Should You Use?

  • First you have to decide which language to use
  • Once done, unless you have significant web development experience, it will be difficult for you to make a decision, so just try something
  • This is why web development job descriptions ask for experience
  • At least this project will give you some
  • Wikipedia gives you a pretty decent comparison to get you started
  • Generally two categories:
    • Heavyweight: Those that include everything
    • Lightweight: Those that do just the routing part and let you choose libraries for the rest
  • Both are reasonable choices

Testing

  • I have not yet said anything about testing
  • You should have some
  • How and what you test will be very dependent on what you are developing
  • A general tip: Try to keep as much of the ‘logic’ of your application separate from the page generation
    • This way you should be able to at least test your logic
    • Most web frameworks provide (or prescribe) some method for testing requests
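
A minimal sketch of the general tip, assuming a page that lists items by price. The logic (cheapest_first) is kept apart from the page generation (render_items), so it can be tested without any web server; both functions and the sample data are invented for illustration:

```python
import unittest


def cheapest_first(items):
    """Application logic: sort (name, price) pairs by price."""
    return sorted(items, key=lambda item: item[1])


def render_items(items):
    """Page generation: turn the already-sorted items into HTML."""
    rows = "".join(f"<li>{name}: {price:.2f}</li>" for name, price in items)
    return f"<ul>{rows}</ul>"


class CheapestFirstTest(unittest.TestCase):
    def test_orders_by_price(self):
        items = [("tea", 2.5), ("milk", 0.9)]
        self.assertEqual(cheapest_first(items), [("milk", 0.9), ("tea", 2.5)])

    def test_empty_list(self):
        self.assertEqual(cheapest_first([]), [])
```

Saved in a file, tests like these can be run with python -m unittest.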

Load Testing

  • You need not do load testing
  • But you may choose to

Any Questions