Web Site Stacks
Software Engineering Large Practical
I really want to develop an app?
- I think I'm being more than flexible anyway, but fine
- Justify this choice in your report
- Certainly nothing to stop you developing a browser-based interface and a mobile app
- Make really sure it can be tested (i.e. emulated) on DiCE
- I highly recommend submitting a proposal to make sure
- The app must at least connect to a server, so you will still need to write that
- This is your own choice and hence own risk
Can I use third-party libraries/frameworks?
- Yes, I highly encourage you to do so
- Just make sure everything works on DiCE so that I can test it
- The point of specifying DiCE is simply so that I can be sure of
being able to test your code
- It is not to force some specific environment with a limited set of dependencies
Installation to the local directory?
- “I cannot install things I need because I have no root access”
- Install to the local directory; how you do this generally depends upon what you are installing
$ ./configure --prefix=${HOME}/my-dir
$ make && make install
Installation to the local directory?
$ pip install -t ${HOME}/my-dir <package>
$ gem install --user-install <gem>
In all cases make sure this will work for any directory
rather than specifically the one you are developing in
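Packages installed with `pip install -t` are only importable if that directory is on the import path. One way to avoid hard-coding the development directory is to compute the path from a base directory at start-up. This is a minimal sketch, assuming the packages live in a sub-directory called `my-dir`; the function name `add_local_packages` is hypothetical.

```python
import sys
from pathlib import Path


def add_local_packages(base_dir, subdir="my-dir"):
    """Put base_dir/subdir at the front of the import path and return it.

    base_dir is passed in rather than hard-coded, so the same code works
    wherever the project is unpacked.
    """
    package_dir = str(Path(base_dir).resolve() / subdir)
    if package_dir not in sys.path:
        sys.path.insert(0, package_dir)
    return package_dir


if __name__ == "__main__":
    # Assumption: the project is started from its own top-level directory.
    print(add_local_packages(Path.cwd()))
```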
Can I use an existing project?
- See the clarifications answer here
- Basically, yes if you clearly mark which parts were done before the start of this project
- Start source code control now and tag the current state as the start
- But still clearly state what you have done for this project in your report
- Mistakes you have made before now will not be judged
- But they may make it more difficult not to make further mistakes which will be judged
Clarifications
- All of these questions have been clarified on the SELP web page
- See here
Today's Lecture
- I will talk about your web site stack
How a web page is served
- Your browser makes HTTP requests to a web server
- For a static page the server locates the page on disk and sends that to the client browser
- For a dynamic page, the server must compute the page to serve to the client browser:
- That computation may be based on the URL used to make the request
- It may be based on the content of the request
- It is likely based on the persistent state of the server
HyperText Transfer Protocol
- HTTP is a client-server request-response protocol
- It is connection-oriented, using the Transmission Control Protocol (TCP)
- The client is usually a browser, but web crawlers and mobile apps are clients too
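To make the request-response idea concrete, the sketch below builds the text of an HTTP/1.1 request and picks apart the first line of a response. It is only an illustration of the message format. The helper names `build_request` and `parse_status_line` are hypothetical, and a real client would use a library rather than formatting strings by hand.

```python
def build_request(method, path, host, body=""):
    """Return the text of an HTTP/1.1 request as it is sent over TCP."""
    lines = [f"{method} {path} HTTP/1.1", f"Host: {host}"]
    if body:
        lines.append(f"Content-Length: {len(body.encode('utf-8'))}")
    # A blank line separates the headers from the (possibly empty) body.
    return "\r\n".join(lines) + "\r\n\r\n" + body


def parse_status_line(response_text):
    """Split the first line of a response into (version, code, reason)."""
    status_line = response_text.split("\r\n", 1)[0]
    version, code, reason = status_line.split(" ", 2)
    return version, int(code), reason


if __name__ == "__main__":
    print(build_request("GET", "/hello", "localhost:8080"))
    print(parse_status_line("HTTP/1.1 200 OK\r\nContent-Length: 0\r\n\r\n"))
```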
9 possible request methods
- GET
- HEAD
- POST
- PUT
- DELETE
- TRACE
- OPTIONS
- CONNECT
- PATCH
But you only really care about 2
- GET - Requests that the web server send the resource specified by
the given URI. This should not modify the state of the server
- HEAD
- POST - Requests that the server accept the data provided, usually
to modify the state of the server (or even external state)
- PUT
- DELETE
- TRACE
- OPTIONS
- CONNECT
- PATCH
Safe
- GET, HEAD, OPTIONS, and TRACE
- Only information retrieval, although safe side effects such
as logging and counting are fine.
Unsafe
- POST, PUT, DELETE, and PATCH
- Should not be used by conforming web crawlers for example
Nothing actually enforces either of these constraints though
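Since nothing enforces the distinction, it is up to your own code to respect it. The sketch below simply records the two groups listed above and shows how a client such as a crawler might consult them. The names `is_safe` and `crawler_may_send` are hypothetical. CONNECT appears in neither group, as on the slides.

```python
SAFE_METHODS = frozenset({"GET", "HEAD", "OPTIONS", "TRACE"})
UNSAFE_METHODS = frozenset({"POST", "PUT", "DELETE", "PATCH"})


def is_safe(method):
    """True if the method is meant for information retrieval only."""
    return method.upper() in SAFE_METHODS


def crawler_may_send(method):
    """A conforming crawler restricts itself to safe methods."""
    return is_safe(method)
```

On the server side the matching obligation is yours: a handler for GET should not change persistent state, because clients will assume that it does not.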
If you fancy implementing this
A better alternative
- Is to use an existing web server
- To do this, you need some way for the web server and your code to talk to each other
- This may well sound like you are simply moving the problem
- Instead of talking to the client browser you now need to talk to the web server
- There are several good reasons for this, including:
- Talking across a network is much harder than talking to an entity on the same machine
- The server can worry about load balancing, concurrent requests etc.
The anatomy of a web server
Modifying Web Server Code
- You could take an existing web server and modify the source code
- Unfortunately, accepting requests and performing remote communication are
different kinds of task from preparing web pages
- Preparing web pages often involves a lot of string manipulation
- Additionally many people had existing applications that they
simply wanted to provide a web interface for
- Many people simply wanted to write their web applications in Perl
- Many web servers host several applications
Common Gateway Interface
- Allows the web server to call a command-line program
- The command-line program may be written in any programming language
- Has the drawback that a new process must be created for each request
- Some techniques are available to get around this such as “pre-forking” the process
- But ultimately the limitation is there
- An alternative is to run the code for your web application inside the process of the web-server
- This involves extending the web-server, usually through a module system
- Some kind of binding will be required if you wish to do so in a language other than the one the web server itself is written in
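As a rough illustration of the CGI approach: the web server starts the program once per request, passes the request details in environment variables such as `QUERY_STRING`, and sends whatever the program writes to standard output back to the client. The sketch below assumes a query parameter called `name`, which is made up for the example.

```python
import html
import os
import sys
from urllib.parse import parse_qs


def cgi_response(environ):
    """Build the complete CGI output (headers, blank line, body)."""
    query = parse_qs(environ.get("QUERY_STRING", ""))
    name = query.get("name", ["Stranger"])[0]
    body = f"<p>Hello {html.escape(name)}</p>\n"
    # The header block is terminated by an empty line.
    return "Content-Type: text/html\r\n\r\n" + body


if __name__ == "__main__":
    # The web server starts one such process per request and passes the
    # request details in environment variables.
    sys.stdout.write(cgi_response(os.environ))
```

Keeping the work in a function that takes the environment as an argument means it can be exercised without a web server at all.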
Apache
- The Apache web server software can run CGI web applications
- However, this is simply a module within Apache
mod_cgi
- There are many modules allowing you to write your web application code in many languages
- Each module exports its own API to the desired language, for example:
Nginx
- Addresses the problem of high load; more than 10k concurrent requests
- Also has a module system with modules for:
- FastCGI - CGI with pre-forking
- WSGI - A standard for Python web applications
- Clojure, Java, Groovy
- Many others, including third-party addons
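To give a feel for what a standard such as WSGI asks of your code: the application is a callable that the server invokes once per request with a dictionary describing the request and a function for starting the response. This is a minimal sketch; the response text is made up for the example.

```python
def application(environ, start_response):
    """A WSGI application: called by the server once per request."""
    path = environ.get("PATH_INFO", "/")
    body = f"You asked for {path}\n".encode("utf-8")
    start_response("200 OK", [
        ("Content-Type", "text/plain; charset=utf-8"),
        ("Content-Length", str(len(body))),
    ])
    # The return value is an iterable of byte strings.
    return [body]
```

Any server that speaks WSGI can host this function unchanged, which is what lets you develop on one server and deploy on another.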
Which server to choose
- Depends upon your development language
- Which in turn depends on what you are developing
- Many languages have their own web server
How a web application is developed
- It is a complex business to write a fast web-server
- This involves balancing the number of threads used for concurrent connections
- In development however, you may be sure of only a single connection
- So you do not need a full-blown web server
- You only require something that will:
- accept HTTP requests on a given port
- translate those requests and forward them to your web application code
- accept the responses from your web application code and forward those back to the original requester (browser/test code)
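Python's standard library contains a server of exactly this simple kind. The sketch below uses `wsgiref.simple_server` to listen on a port and pass each request to an application function. The function `app` is a stand-in for your own web application code, and port 8080 is an arbitrary choice.

```python
from wsgiref.simple_server import make_server


def app(environ, start_response):
    """Stand-in for your web application code (hypothetical)."""
    body = b"Hello from the development server\n"
    start_response("200 OK", [("Content-Type", "text/plain"),
                              ("Content-Length", str(len(body)))])
    return [body]


def serve(port=8080):
    """Accept requests on localhost, one at a time, until interrupted."""
    with make_server("localhost", port, app) as server:
        print(f"Serving on http://localhost:{port}/ (Ctrl-C to stop)")
        server.serve_forever()
```

Calling `serve()` starts the server. It handles one connection at a time, which is fine for development and no good for production.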
Web Application Frameworks
- Typically include a simple local web server
- Assist the developer in sticking to the protocol demanded of one or many production web servers
- Hence your web application is developed using the simple local server but can be deployed without further modification using a production web server
Do not worry about deployment
- For this project you need not worry about deployment to production
- Provided you have used a similar setup I can test your web server locally using the development setup
URLs and Requests to Pages
- When designing your web application you do not wish to be concerned with connections, threads or protocols
- Whichever framework you ultimately decide to use will provide a means such that all you need to provide is some function:
- relative URL + request → String (usually HTML)
URLs and Requests to Pages
This just returns a simple string, which the browser is of course capable
of rendering.
from bottle import route, run

@route('/hello')
def hello():
    return "Hello World!"

# Start bottle's built-in development server
run(host='localhost', port=8080)
URLs and Requests to Pages
Generally you would wish to return HTML formatted output rather than a
simple string
from bottle import route, run

@route('/hello')
def hello():
    return """<!doctype html>
<html>
<head>
<title>Hello</title>
</head>
<body>
Hello World!
</body>
</html>
"""
URLs and Requests to Pages
I may want some dynamic routes, such that part of the URL is a parameter
that is used to generate the output:
from bottle import route, run

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    prefix = """<!doctype html>
<html>
<head>
<title>Hello</title>
</head>
<body>
Hello """
    suffix = """
</body>
</html>
"""
    # Note: name comes straight from the URL and is inserted unescaped
    return prefix + name + suffix
URLs and Requests to Pages
Of course you may also examine the actual request content. This is most
obvious in a POST request: usually some kind of form asks the user
for bits of input and is submitted as a POST request:
from bottle import route, request

@route('/login')
def login():
    return '''
<form action="/login" method="post">
Username: <input name="username" type="text" />
Password: <input name="password" type="password" />
<input value="Login" type="submit" />
</form>
'''
...
URLs and Requests to Pages
The submitted form arrives as a POST request to the same URL, and a second
function reads the form fields from it:
from bottle import route, request
...
@route('/login', method='POST')
def do_login():
    username = request.forms.get('username')
    password = request.forms.get('password')
    if check_login(username, password):
        return "<p>Your login information was correct.</p>"
    else:
        return "<p>Login failed.</p>"
Server State
Database
- For most of your applications you will not need sophisticated database operations
- Simple Create, Read, Update and Delete (CRUD) operations will be sufficient for many
- In a production web site you would look at using a full-blown database server
- You may use the school's PostgreSQL installation:
- See instructions here
- If you are not taking a Database related course you will need to ask for an account using the support form
URLs and Requests to Pages
So our previous login might look up the user in a database:
from bottle import route, request
...
def check_login(username, password):
    user_password = look_up_password_in_database(username)
    return user_password == password

@route('/login', method='POST')
def do_login():
    ...  # As before, including the "if check_login(...)" test
Of course, storing the passwords as plain text like this would be
horribly insecure
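A common alternative is to store a salted hash of each password and compare hashes at login. The following is a minimal sketch using only the Python standard library. The function names and the iteration count are assumptions made for illustration, not a vetted security recommendation.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # assumption: tune this to your hardware


def hash_password(password, salt=None):
    """Return (salt, digest) for storage instead of the plain password."""
    if salt is None:
        salt = os.urandom(16)  # a fresh random salt per user
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"),
                                 salt, ITERATIONS)
    return salt, digest


def verify_password(password, salt, stored_digest):
    """Recompute the digest and compare it in constant time."""
    _, digest = hash_password(password, salt)
    return hmac.compare_digest(digest, stored_digest)
```

With this in place, a `check_login` function would look up the stored salt and digest for the user and call `verify_password`, rather than comparing passwords directly.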
SQLite and similar
- Since you are not necessarily making a production server a server-free database is also fine
- For example SQLite is a self-contained, serverless, zero-configuration, transactional SQL database engine
- It allows many concurrent readers but only one writer at a time, so unless concurrency forms part of your proposal this is unlikely to limit you
- You could also simply store state in a local file
- However, if you do not use a standard database you should detail how you would scale this
- In particular how easy is it for someone to modify your source to use a proper database?
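Python ships with an `sqlite3` module, so the four CRUD operations can be sketched without installing anything. The table `notes` and the function names below are hypothetical. Keeping all SQL behind a handful of functions like these is one way to make a later switch to a database server easier.

```python
import sqlite3


def open_db(path=":memory:"):
    """Open the database and make sure the (hypothetical) table exists."""
    conn = sqlite3.connect(path)
    conn.execute("CREATE TABLE IF NOT EXISTS notes ("
                 "id INTEGER PRIMARY KEY, text TEXT NOT NULL)")
    return conn


def create_note(conn, text):
    with conn:  # commits on success, rolls back on error
        cur = conn.execute("INSERT INTO notes (text) VALUES (?)", (text,))
    return cur.lastrowid


def read_note(conn, note_id):
    row = conn.execute("SELECT text FROM notes WHERE id = ?",
                       (note_id,)).fetchone()
    return row[0] if row else None


def update_note(conn, note_id, text):
    with conn:
        conn.execute("UPDATE notes SET text = ? WHERE id = ?", (text, note_id))


def delete_note(conn, note_id):
    with conn:
        conn.execute("DELETE FROM notes WHERE id = ?", (note_id,))
```

The `?` placeholders let the database driver handle quoting, which avoids SQL injection from user-supplied values.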
ORMs
- ORMs relate objects in your programming language to database entities
- This is great as it means you do not need to learn much about databases
- A good ORM will also provide mappings to different databases meaning that
you could switch between them if needed
- For example, switching between SQLite and a full MySQL server
can be a simple configuration option/parameter
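The core idea of an ORM can be shown with a toy: a class whose instances correspond to rows of a table, plus code that moves data between the two. The sketch below is hand-written for illustration and is not a real ORM. The names `User` and `UserStore` are hypothetical. A real ORM generates the SQL for you and hides the differences between database back ends.

```python
import sqlite3
from dataclasses import dataclass
from typing import Optional


@dataclass
class User:
    name: str
    email: str
    id: Optional[int] = None  # filled in once the row has been stored


class UserStore:
    """Toy object-relational mapping: User objects <-> rows in 'users'."""

    def __init__(self, conn):
        self.conn = conn
        conn.execute("CREATE TABLE IF NOT EXISTS users ("
                     "id INTEGER PRIMARY KEY, name TEXT, email TEXT)")

    def save(self, user):
        """Insert the object as a new row and record its generated id."""
        with self.conn:
            cur = self.conn.execute(
                "INSERT INTO users (name, email) VALUES (?, ?)",
                (user.name, user.email))
        user.id = cur.lastrowid
        return user

    def get(self, user_id):
        """Load a row and turn it back into a User, or None if absent."""
        row = self.conn.execute(
            "SELECT name, email, id FROM users WHERE id = ?",
            (user_id,)).fetchone()
        return User(*row) if row else None
```

The rest of the application only sees `User` objects, so the storage behind `UserStore` could change without touching it.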
Templating
- Typically displaying dynamic web pages results in a lot of string manipulation
- Some pages will have small “holes” in them which are filled in by dynamic content
- Often you will have a number of elements which must be displayed in HTML
Templating
Recall our previous example of saying hello to someone:
from bottle import route, run

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    prefix = """<!doctype html>
<html>
<head>
<title>Hello</title>
</head>
<body>
Hello """
    suffix = """
</body>
</html>
"""
    return prefix + name + suffix
Templating
It is hard to disagree that this is more readable using a template:
from bottle import route, run, template

@route('/')
@route('/hello/<name>')
def hello(name='Stranger'):
    output_string = """<!doctype html>
<html>
<head>
<title>Hello</title>
</head>
<body>
Hello {{name}}
</body>
</html>
"""
    return template(output_string, name=name)
Templating
Search Results
- You often have many more than one parameterised part of your document
- For example, suppose you have a page that accepts some kind of query,
when the query is made you want to return the results
StringBuilder formatted = new StringBuilder();
formatted.append("<ul>");
for (Result result : search_results) {
    formatted.append("<li>");
    formatted.append(result.format());
    formatted.append("</li>");
}
formatted.append("</ul>");
return formatted.toString();
Templating
Search Results
- With a templating engine the string concatenation is done for you,
but you have to escape to the code part
<ul>
<% for (Result result : search_results) { %>
  <li>
    <%= result.format() %>
  </li>
<% } %>
</ul>
Templating vs Non-templating
- Using a templating engine you have to escape the code
- Without one you are escaping the output
- General rule is that if you have more output than code use a templating engine
- Templating engines exist for a reason
- If you find you have more code than output that is probably a sign
that you could better separate the formatting from calculation of the results
- Think how difficult it would be to re-design the look of your web site
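Separating calculation from formatting can be as simple as two functions: one that returns plain data and one that turns data into HTML. This is a minimal sketch in Python. The function names are hypothetical and the "search" is just a substring match.

```python
import html


def search(items, query):
    """Calculation only: return the matching items, no HTML involved."""
    needle = query.lower()
    return [item for item in items if needle in item.lower()]


def render_results(results):
    """Formatting only: turn a list of strings into an HTML list."""
    rows = "".join(f"<li>{html.escape(r)}</li>" for r in results)
    return f"<ul>{rows}</ul>"
```

Re-designing the look of the results then means changing `render_results` (or replacing it with a template) while `search` stays untouched.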
Templating and Inheritance
- Templating engines usually have some form of inheritance
- This is useful, it allows you to define a “root” template used by all pages on your site
- This gives them all the same, header/footer, navigation etc.
- This means that you can easily change this for all pages at once
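The effect of a root template can be imitated with `string.Template` from the Python standard library. Templating engines usually offer dedicated syntax for this, so treat the following only as a sketch of the idea. The navigation and footer text are made up.

```python
from string import Template

# The "root" template: shared header, navigation and footer.
BASE = Template(
    "<!doctype html>\n"
    "<html><head><title>$title</title></head>\n"
    "<body><nav>Home | About</nav>\n"
    "$content\n"
    "<footer>SELP example</footer></body></html>\n"
)


def render_page(title, content):
    """Each page supplies only what differs; the rest comes from BASE."""
    return BASE.substitute(title=title, content=content)
```

Changing the navigation in `BASE` changes it on every page produced by `render_page`.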
Web 1.0
- Web 1.0 is characterised by a few features, such as gif navigation buttons, proprietary browser extensions, lack of CSS
- The most characteristic element though is the use of static HTML
- Every time the content on the screen must change, a request is made to the web server and an entirely new web page is sent back to the browser
- This can actually work quite well for certain domains, for example some adventure games such as Kingdom of Loathing
- But it leads to rather unresponsive and constricted application design
Web 1.0
- Simple examples of wishing to change the display of the content without referring back to the web server include:
- Validating form input
- Changing the sorting order of a list of items, e.g. by price, by relevance etc.
- Expanding/collapsing hideable elements such as in a tree-view
- Notebooks with tabs
Javascript & Web 1.0
- Therefore, depending on your domain, there is a strong chance you will want some dynamic HTML
- This likely means some Javascript
- There are alternatives, these tend to fall into four categories:
- Mild transformations which translate to Javascript
- Strong/statically typed (new or existing) languages which compile to Javascript
- Interpreters for existing languages written in Javascript
- Entirely new languages which require some extension to the browser
- It is up to you which of these you go for, but I recommend that at least for this project you avoid the last category
- I recommend trying CoffeeScript which is in category one
CSS
- Cascading Style Sheets allows you to separate your content from its display
- This is very useful, particularly because often the person skilled in creating a pleasing display is not the same
person as the one skilled in creating/computing the dynamic content
- Of course in an individual practical you will perform both roles
- But proper use of CSS will allow you to demonstrate that your graphical design can be modified without need to modify the content generation code
HTML5 + CSS + Javascript
- This may seem like a cumbersome method to produce a front-end user interface
- In some ways it is, HTML was originally the display mechanism and hence is not perfect as a vehicle for content
- However, if you try writing a UI for a non-web application, you will quickly find that it is difficult to get the layout correct
- HTML has evolved over time to make the layout work on many different screens etc.
- So actually you find that this separation works pretty well
Summary of your Stack
- A web framework or library which provides:
- A means for you to write: URL + Request → HTML
- A local web server to locally test your web application
- This will have the same API as a production-grade web server
- A means to produce HTML, probably a templating language
- A means to Create, Read, Update and Delete persistent state, i.e. a database
- Hopefully an ORM to assist and abstract from this
- Some form of dynamic HTML manipulation, which means some form of Javascript
Summary of Steps
- A request is made to the web server, which translates it into a call to some method/function of your web application
- Depending on the request, your application will retrieve/update some state on the database
- Then produce a string, which will likely be some HTML, which in turn is likely produced via a templating engine
- The HTML is then returned by the web server to the client browser and displayed there
- That HTML may well include some Javascript which will allow modification of the displayed page without referring back to the web server
Which Framework Should You Use?
- First you have to decide which language to use
- Once done, unless you have significant web development experience,
it will be difficult for you to make a decision, so just try something
- This is why web development job descriptions ask for experience
- At least this project will give you some
- Wikipedia gives you a pretty decent comparison to get you started
- Generally two categories:
- Heavyweight: Those that include everything
- Lightweight: Those that do just the routing part and let you choose libraries for the rest
- Both are reasonable choices
Testing
- I have not yet said anything about testing
- You should have some
- How and what you test will be very dependent on what you are developing
- A general tip: Try to keep as much of the ‘logic’ of your
application separate from the page generation
- This way you should be able to at least test your logic
- Most web frameworks provide (or prescribe) some method for testing requests
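As an example of the general tip above, the sketch below keeps a piece of application logic in a plain function that knows nothing about HTML, so it can be tested with the standard `unittest` module and no web server. The shopping-basket domain and all names are made up for illustration.

```python
import unittest


def basket_total(prices, discount_percent=0):
    """Application logic: no HTML, so it can be tested directly."""
    subtotal = sum(prices)
    return round(subtotal * (100 - discount_percent) / 100, 2)


def render_total(total):
    """Page generation: kept thin so little is left untested."""
    return f"<p>Total: {total:.2f}</p>"


class BasketTotalTest(unittest.TestCase):
    def test_no_discount(self):
        self.assertEqual(basket_total([1.50, 2.50]), 4.0)

    def test_discount(self):
        self.assertEqual(basket_total([10, 10], discount_percent=25), 15.0)
```

Saved in a file, the tests can be run with `python -m unittest` followed by the file name.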
Load Testing
- You need not do load testing
- But you may choose to