Optimisation & Testing

Software Engineering Large Practical

Optimisation

  • Re-usability can conflict heavily with readability
  • Similarly, optimised or fast code can conflict with readability
  • You are writing a web application which may, ultimately, have to service millions of requests quickly
  • Optimised code is generally the opposite of reusable code
  • It is optimised for its particular assumptions which cannot be violated

Premature Optimisation

  • The notion of optimising code before it is required
  • The downside is that code becomes less adaptable
  • Because the requirements on your optimised piece of code may change, you may have to throw away your specialised code and all its optimisations
  • Note: I do not mean the requirements of the project
    • Of course they may also change
    • But here I refer to the requirements of a particular portion of your code which may change

Premature Optimisation

  • Worse than throwing away your own optimisations, you may instead elect to work around your specialised and optimised section of code
  • Thus your premature optimisation has negatively affected other parts of your code

Timely Optimisation

  • So when is the correct time to optimise?
  • Refactoring is done in between development of new functionality
    • Recall this makes it easier to test that your refactoring has not changed the behaviour of your code
  • This is also a good time to do some optimisation
    • You should be in a good position to test that your optimisations have not negatively impacted correctness
    • This has the additional bonus that, since you are refactoring at a similar time, you should already be considering the adaptability and readability of your code

Timely Optimisation

  • The absolute best time to optimise code is when you discover that it is not running fast enough
  • Often this will come towards the end of the project
  • It should certainly be after you have something deployable
  • After you have developed and tested some major portion of functionality

A Plausible Strategy

  • Perform no optimisation until the end of the project once all functionality is complete and tested
  • You may well find at that point that no optimisation is necessary
  • This is a reasonable stance to take, however:
  • During development, you may find that your test suite takes days to run
  • Even one simple run to test the functionality you are currently developing may take minutes or hours
  • This can seriously hamper development, so it may be best to do some optimisation at that point

How to Optimise

  • The very first thing you need before you could possibly optimise code is a benchmark
  • This can be as simple as timing how long it takes to run your test suite
  • O(n²) solutions can beat O(n log n) solutions on sufficiently small inputs, so your benchmarks must not be too small
    • Of course equally small inputs might be the only kind of input you can expect
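
As a minimal sketch of such a benchmark, assuming Python's standard timeit module is an acceptable timer: the two sorting functions and the input sizes below are hypothetical stand-ins for your own code and workload, chosen only to contrast an O(n²) and an O(n log n) solution at a small and a larger size.

```python
import random
import timeit


def insertion_sort(items):
    """O(n^2) sort: very little work per step, so it can win on tiny inputs."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        # Shift larger elements one place right to make room for current.
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def merge_sort(items):
    """O(n log n) sort: more overhead per call, but scales better."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    left, right = merge_sort(items[:middle]), merge_sort(items[middle:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def benchmark(sort_function, size, repeats=5):
    """Return the best wall-clock time in seconds over several runs."""
    rng = random.Random(0)  # fixed seed so every run sorts the same data
    data = [rng.random() for _ in range(size)]
    timings = timeit.repeat(lambda: sort_function(data), number=1, repeat=repeats)
    return min(timings)


if __name__ == "__main__":
    # Hypothetical sizes: one tiny input and one larger one.
    for size in (8, 1000):
        print(size, benchmark(insertion_sort, size), benchmark(merge_sort, size))
```

The actual timings depend on your machine, which is exactly why the benchmark has to be run rather than guessed.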

How to Optimise

  • Once you have a suitable benchmark then you can:
    1. Run your benchmark on a build from a clean repository recording the run time
    2. Perform what you think is an optimisation on your source code
    3. Re-run your benchmark and compare the run times
    4. If you have successfully improved the performance of your code commit your changes, otherwise revert them
    5. Do one optimisation at a time

Interacting Optimisations

  • Word of caution: some optimisations may interact with each other, so you may wish to evaluate them independently as well as in conjunction
    • As always, source code control can empower you to do this

High-level vs Low-level Optimisations

  • It is usually more productive to consider high-level optimisations
  • The compiler is often good at low-level optimisations
  • It is often better to call a method fewer times, than to optimise the code within a method
  • For a web application if you can minimise the number of requests, that may increase the overall performance
    • Common example: validating form input on the client-side

Profiling

  • Profiling is not the same as benchmarking
  • Benchmarking:
    • determines how quickly your program runs
    • is to performance what testing is to correctness
  • Profiling:
    • is used after benchmarking has determined that your program is running too slowly
    • is used to determine which parts of your program are causing it to run slowly
    • is to performance what debugging is to correctness
  • Without benchmarking you risk making your program worse
  • Without profiling you risk wasting effort optimising a part of your program which is either already fast or rarely executed
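
A minimal profiling sketch, assuming Python's standard cProfile and pstats modules; handle_request, slow_part and fast_part are hypothetical functions standing in for your own application code.

```python
import cProfile
import io
import pstats


def slow_part(n):
    """Deliberately does most of the work."""
    return sum(i * i for i in range(n))


def fast_part(n):
    """Cheap by comparison; optimising this would be wasted effort."""
    return n * 2


def handle_request(n):
    return slow_part(n) + fast_part(n)


def profile_report(function, *args):
    """Run function under the profiler; return its result and a text report."""
    profiler = cProfile.Profile()
    result = profiler.runcall(function, *args)
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    # Sort by cumulative time so the most expensive call chains come first.
    stats.sort_stats("cumulative").print_stats(5)
    return result, stream.getvalue()


if __name__ == "__main__":
    _, report = profile_report(handle_request, 100000)
    print(report)
```

The report lists where the time went per function, which tells you which part is worth optimising before you change anything.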

Documenting Optimisations

  • Source code comments are a good place to explain why the code is the way it is
  • Source code control commits are a good place to document why you performed the optimisations including benchmark/profiler results etc.

Summary

  • As with refactoring, do not optimise and develop new code at the same time
  • Do not optimise blindly; instead, benchmark and profile
  • Source code control will make everything easier

Testing

Course: Software Testing

  • There is an entire course in the second semester dedicated to software testing
  • I am therefore providing a brief overview of the basics you will need for this practical

Famous Quote

“...testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence. The only effective way to raise the confidence level of a program significantly is to give a convincing proof of its correctness.” -- Edsger Dijkstra

Types and Testing

  • Static types give you automatic proofs for some desirable properties
  • Rarely is correctness one of them
  • For correctness:
    • Proving correctness is often infeasible, and in any case how do you know your proof is correct?
    • Tests can at least increase confidence

Kinds of Testing

  • There are many kinds of tests; here are but a few loosely defined categories:
    1. Functional Tests
    2. Regression Tests
    3. Unit Tests
    4. Mutation Testing

Quick Quiz

  • What are the most important kinds of tests for you?
    1. Functional Tests
    2. Regression Tests
    3. Unit Tests
    4. Mutation Testing

Functional Tests

  • Functional testing is rather vaguely defined
  • Here I refer to testing the correct behaviour based on user actions
  • For a web application this fits rather neatly into testing that the correct response follows from a given request

Functional Tests

Advantages

  • Observable behaviour is what you really care about
  • Known input and expected outputs can often be defined by the user
    • Which in addition may help tackle requirements ambiguity

Functional Tests

Disadvantages

  • When tests fail there may be little indication as to where the flaws are
  • Early testing before all components are ready may be difficult

Regression Tests

  • Regression testing is a subset of functional testing, and again the term refers to many things
  • However, I will use this term to mean testing non-changing behaviour

Regression Tests

The Oracle Problem

  • In all testing, you have the problem of determining what the correct answer should be
  • For example, your software may involve some complex calculation that is infeasible to do by hand

Regression Tests

The Oracle Problem

  • You can at least:
    • Generate some input, by hand or perhaps randomly
    • Run your software to generate some possibly correct output
    • Assume that that output is correct and test future versions against this
    • If the output for the given input changes, then you should be able to explain it
      • Either you have introduced a bug
      • Or you have fixed one
      • Occasionally both outputs are valid, e.g. they differ only in formatting
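
The steps above can be sketched as a small baseline check. This is only an illustration: compute_report is a hypothetical stand-in for your complex calculation, and storing the baseline as a JSON file is an assumption rather than a requirement.

```python
import json
import tempfile
from pathlib import Path


def compute_report(values):
    """Hypothetical stand-in for a calculation too complex to check by hand."""
    return {"total": sum(values), "largest": max(values)}


def check_against_baseline(baseline_path, values):
    """Record the output on the first run; compare against it afterwards."""
    output = compute_report(values)
    if not baseline_path.exists():
        # No baseline yet: assume this output is correct and store it.
        baseline_path.write_text(json.dumps(output, sort_keys=True))
        return "recorded"
    expected = json.loads(baseline_path.read_text())
    # A change must be explained: either a new bug or a fix.
    return "unchanged" if output == expected else "changed"


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as directory:
        path = Path(directory) / "baseline.json"
        print(check_against_baseline(path, [1, 2, 3]))
        print(check_against_baseline(path, [1, 2, 3]))
```

In real use the baseline file would be committed to source code control, so that a change in output shows up alongside the change in code that caused it.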

Regression Tests

Advantages

  • Typically easy to set up
  • Helps to avoid breaking functionality that was working previously
  • Can help with the oracle problem in a way that has obvious limitations
  • Increases confidence when refactoring and optimising

Regression Tests

Disadvantages

  • You may simply be confirming that you have not fixed broken behaviour

Unit Tests

  • Testing a whole program at once can be difficult or infeasible
  • The combination of paths can mean the total number of paths to test explodes
  • One solution to this is to test each component separately for correctness
  • This assumes that if all components are correct, then the combination of those correct components is also correct
  • Here the term component is flexible, it could refer to:
    • Modules
    • Methods
    • Classes
    • Phases
    • Layers etc.

Unit Tests

Advantages

  • You may not have the whole program available, but testing smaller units can help find problems before integration is possible
  • May help locate bugs as you can ascertain which particular module is failing

Unit Tests

Disadvantages

  • Can create a lot of extra code
  • Can make the design less flexible, if only because you have to change all of your tests
    • Or even throw them all away and start again
  • At some point you have to do integration testing

Mutation Tests

  • Mutation testing does not test your application at all
  • It tests your tests
  • Mutation testing is the act of semi-randomly changing lines of code
  • If you do this and all of your tests pass, then your tests are not very comprehensive

Mutation Testing

An Example


if (a && b){ return "Go Ahead"; }
else { return "Abort"; }
Mutated to become:

if (a || b){ return "Go Ahead"; }
else { return "Abort"; }
If your test suite still returns all tests passing after this mutation, then either your test suite is not comprehensive, or your code is illogical
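
The same mutation can be tried out in Python. In this sketch the two test suites are hypothetical: the weak one only uses inputs on which "and" and "or" agree, so it cannot tell the original from the mutant.

```python
def decide(a, b):
    """Original code."""
    return "Go Ahead" if a and b else "Abort"


def decide_mutant(a, b):
    """Mutant: 'and' replaced by 'or'."""
    return "Go Ahead" if a or b else "Abort"


def weak_suite(function):
    """Only checks inputs where both operands agree."""
    return (function(True, True) == "Go Ahead"
            and function(False, False) == "Abort")


def strong_suite(function):
    """Also checks the mixed inputs, where 'and' and 'or' differ."""
    return (weak_suite(function)
            and function(True, False) == "Abort"
            and function(False, True) == "Abort")


if __name__ == "__main__":
    print("weak suite passes on mutant:", weak_suite(decide_mutant))
    print("strong suite passes on mutant:", strong_suite(decide_mutant))
```

Both suites pass on the original code, but only the strong suite fails on the mutant.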

Mutation Testing

Conditions

  • A mutant will cause a test to fail if:
    1. The test reaches the mutant statement
    2. The input data causes the program state to differ between the original and the mutant program
    3. The incorrect value propagates to the output and is checked by the test

Mutation Testing

  • With lots of compute power you can run lots of mutants
  • Each mutant that survives (causes no test to fail) generally must be inspected manually
  • Automatically you can at least give your test suite a score:
    • number of mutants that fail tests / number of mutants created
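
A minimal sketch of computing that score, assuming hand-written mutants rather than a real mutation testing tool; the mutants and test cases below are hypothetical and only illustrate the ratio.

```python
import operator


def make_decider(combine):
    """Build a decision function from a way of combining the two flags."""
    def decide(a, b):
        return "Go Ahead" if combine(a, b) else "Abort"
    return decide


# Hand-written mutants of the original 'a and b' condition.
MUTANTS = {
    "and -> or": make_decider(lambda a, b: a or b),
    "and -> xor": make_decider(operator.xor),
    "and -> first operand": make_decider(lambda a, b: a),
    "and -> always true": make_decider(lambda a, b: True),
}

# A deliberately small test suite: (arguments, expected result).
TEST_CASES = [((True, True), "Go Ahead"), ((False, False), "Abort")]


def suite_passes(decide, cases):
    return all(decide(*args) == expected for args, expected in cases)


def mutation_score(mutants, cases):
    """Return (mutants that fail tests / mutants created, names of those mutants)."""
    killed = [name for name, mutant in mutants.items()
              if not suite_passes(mutant, cases)]
    return len(killed) / len(mutants), killed


if __name__ == "__main__":
    print(mutation_score(MUTANTS, TEST_CASES))
```

Adding a test case with mixed inputs, such as (True, False) expecting "Abort", makes the remaining two mutants fail as well and raises the score.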

Mutation Testing

Advantages and Disadvantages

  • It is not so appropriate to discuss advantages and disadvantages here, since mutation testing is only used in conjunction with other testing
  • One advantage: it is fairly easy to set up

Common Approach

  • There is a common approach to developing applications
    1. Start with the main method
    2. Write some code, for example to parse the input
    3. Write (or update) a test input file
    4. Run your current application
    5. See if the output is what you expect
    6. Go back to step 2.

Common Approach

  • For a web application, this can be even worse:
    1. Write some functionality
    2. Start the web server locally
    3. Manually navigate to your local web site
    4. Manually exercise your new functionality

Start with a Test Suite

  • A much better place to start is your test suite
  • A web application is already well set up for testing
  • You can easily make known requests and check for expected responses
  • Note: This does not mean you cannot start coding right away

Request/Response Testing

An Example


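import os
import tempfile
import unittest

import flaskr  # the example application module under test


class FlaskrTestCase(unittest.TestCase):

    def setUp(self):
        # Create a temporary file to act as an empty database for this test.
        self.db_fd, flaskr.app.config['DATABASE'] = tempfile.mkstemp()
        self.app = flaskr.app.test_client()
        flaskr.init_db()

    def tearDown(self):
        # Close and delete the temporary database, whether the test passed or not.
        os.close(self.db_fd)
        os.unlink(flaskr.app.config['DATABASE'])

    def test_empty_db(self):
        rv = self.app.get('/')
        # rv.data is bytes under Python 3, so compare against a bytes literal.
        assert b'No entries here so far' in rv.data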
Taken from the Flask documentation

Request/Response Testing

An Example


    def setUp(self):
        self.db_fd, flaskr.app.config['DATABASE'] = tempfile.mkstemp()
        self.app = flaskr.app.test_client()
        flaskr.init_db()
The setup simply creates a temporary empty database

Request/Response Testing

An Example


    def tearDown(self):
        os.close(self.db_fd)
        os.unlink(flaskr.app.config['DATABASE'])
The teardown simply states that after this test has run, whether the test succeeded or failed, the temporary database is deleted.

Request/Response Testing

An Example


    def test_empty_db(self):
        rv = self.app.get('/')
        assert b'No entries here so far' in rv.data
The actual test:
  1. Asks the framework to resolve the route / and return the HTML page response
    • That is the page generated from your method linked to that route
  2. Checks that that page contains the ‘No entries ..’ string

XUnit

  • For this to work you do not even need to use a test framework
  • But of course, one will help
  • XUnit is the collective name for a family of unit testing frameworks with a common design, ported to many languages
  • Although it has unit in the name, it is not really specific to unit testing
  • See the Wikipedia page

Testing Requests

  • Your web application will have a front-end which makes requests to your back-end
  • To ensure correctness:
    1. Ensure the front-end generates correct requests
    2. Ensure the back-end processes requests correctly
    3. Ensure that these are integrated so that the front-end is not generating requests that the back-end cannot process
  • However, this is often complicated by the fact that some of your front-end is actually produced by your back-end
  • For example, the HTML page response you produce may contain an input form on it, or even simply a link

For this Practical

  • Because this is a relatively short practical, and I'm aware that some of you have had quite a lot to learn:
  • This practical will not require you to test your front-end code
  • That does not mean it might not help you to do so, only that it will not be reflected in your grade if you elect not to
  • JavaScript certainly has an XUnit implementation
  • In summary: All I wish for you to do is write some kind of automated tests checking that your application makes the correct responses to known requests
    • Alternatively you can of course prove your application correct

Testing Requests

  • Testing your responses to requests is further complicated because you should also make sure that you correctly handle requests that your front-end does not produce
  • For example, an input POST request with missing fields
  • Of course the correct response may simply be an error response
    • But you should at least not have your web application crash
  • If you are successful enough, other developers will use your API, whether you want them to or not
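
As a sketch of testing such a request without any web framework, the example below calls a tiny WSGI application directly using only the Python standard library. The application, the /entries path and the title and text field names are all hypothetical; your own back-end and framework will differ.

```python
import io
from urllib.parse import parse_qs
from wsgiref.util import setup_testing_defaults


def application(environ, start_response):
    """Hypothetical WSGI back-end accepting a POST with title and text fields."""
    length = int(environ.get("CONTENT_LENGTH") or 0)
    fields = parse_qs(environ["wsgi.input"].read(length).decode("utf-8"))
    missing = [name for name in ("title", "text") if name not in fields]
    if missing:
        # A request the front-end would never send: answer with an error, do not crash.
        status, body = "400 Bad Request", "missing fields: " + ", ".join(missing)
    else:
        status, body = "201 Created", "entry added"
    start_response(status, [("Content-Type", "text/plain; charset=utf-8")])
    return [body.encode("utf-8")]


def post(body):
    """Send a form-encoded POST to the application without starting a server."""
    environ = {}
    setup_testing_defaults(environ)
    payload = body.encode("utf-8")
    environ.update({
        "REQUEST_METHOD": "POST",
        "PATH_INFO": "/entries",
        "CONTENT_LENGTH": str(len(payload)),
        "wsgi.input": io.BytesIO(payload),
    })
    captured = {}

    def start_response(status, headers):
        captured["status"] = status

    response = b"".join(application(environ, start_response))
    return captured["status"], response.decode("utf-8")


if __name__ == "__main__":
    print(post("title=Hello&text=World"))
    print(post("title=Hello"))
```

The second request is one a well-behaved front-end would not produce, and the test checks that it yields an error response rather than an exception.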

Known Requests - Expected Responses

  • This is a very test-friendly environment
  • However, many of your known requests should also be expected to update the persistent state
  • Still, this is relatively easy to test: you have known preconditions and expected postconditions
    • For example you can test that a request that should insert something into the database, really does do that
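
A minimal sketch of such a precondition and postcondition test, assuming an SQLite database accessed through Python's standard sqlite3 module; add_entry and the entries table are hypothetical stand-ins for whatever your request handler does.

```python
import sqlite3
import unittest


def add_entry(connection, title, text):
    """Hypothetical body of a request handler: insert one entry."""
    connection.execute(
        "INSERT INTO entries (title, text) VALUES (?, ?)", (title, text)
    )
    connection.commit()


class AddEntryTest(unittest.TestCase):

    def setUp(self):
        # Known precondition: a fresh, empty in-memory database.
        self.connection = sqlite3.connect(":memory:")
        self.connection.execute("CREATE TABLE entries (title TEXT, text TEXT)")

    def tearDown(self):
        self.connection.close()

    def test_insert_updates_database(self):
        add_entry(self.connection, "Hello", "First post")
        rows = self.connection.execute(
            "SELECT title, text FROM entries"
        ).fetchall()
        # Expected postcondition: exactly the one row that was inserted.
        self.assertEqual(rows, [("Hello", "First post")])
```

Saved in a file, this can be run with python -m unittest, which keeps running the tests down to one command.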

Quick Quiz

  • What are the most important kinds of tests for you?
    1. Functional Tests
    2. Regression Tests
    3. Unit Tests
    4. Mutation Testing
  • Likely Functional Tests
    • if you get functional testing right, then you’re doing enough for this practical
  • Regression testing is at least easy, and a subset of functional testing

General Advice

  • You are technically doing unit testing, as you are already testing the back-end separately from the front-end
  • However, I recommend you do not decompose further, as you will likely solidify your design
    • With the possible exception of database access, which may be tested as an isolated component
  • Though of course this will depend largely on your particular application

Enough Testing

  • How do you know when you have tested enough?
  • We would like to test for all inputs, but that is generally infeasible at best
  • Instead we divide our input space into equivalence classes
  • But how do we know we have divided our input space into the correct equivalence classes?
    • How do we know we have tested all of the classes?
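
To illustrate equivalence classes, the sketch below tests a hypothetical form validator with one or two representatives per class. The classes themselves are an assumption about how the input space divides, which is precisely the thing that is hard to be sure of.

```python
def validate_quantity(raw):
    """Hypothetical form validator: accept whole numbers from 1 to 99."""
    try:
        value = int(raw)
    except (TypeError, ValueError):
        return "not a number"
    if value < 1:
        return "too small"
    if value > 99:
        return "too large"
    return "ok"


# Representatives of each assumed equivalence class, including boundaries.
EQUIVALENCE_CLASSES = [
    ("abc", "not a number"),
    (None, "not a number"),
    ("-5", "too small"),
    ("0", "too small"),
    ("1", "ok"),
    ("50", "ok"),
    ("99", "ok"),
    ("100", "too large"),
]


def failing_cases(validator, cases):
    """Return the cases whose actual result differs from the expected one."""
    return [(raw, expected) for raw, expected in cases
            if validator(raw) != expected]


if __name__ == "__main__":
    print("failing cases:", failing_cases(validate_quantity, EQUIVALENCE_CLASSES))
```

Every input within a class is assumed to behave like its representative, so a handful of cases stands in for the whole input space.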

Coverage Analysis

  • For this you can use a coverage analyser
  • Run your test-suite using the coverage analyser which will tell you which entities have been exercised
  • The entities might be: Functions, Statements, Branches, Conditions
  • This is a large topic for another day, but briefly:
    • Ideally: check every possible path has been executed
    • Generally infeasible, so we can at least check that every statement has been executed
    • This will give you some confidence that your test suite is quite comprehensive
  • Works well with mutation testing since reaching the mutant statement is the first condition
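
To show what statement coverage measures, here is a toy sketch built on Python's sys.settrace hook. It is not a substitute for a real coverage analyser; the sign function and the executed_offsets helper are hypothetical and exist only for this illustration.

```python
import sys


def sign(n):
    if n < 0:                   # offset 1 from the def line
        return "negative"       # offset 2
    return "non-negative"       # offset 3


def executed_offsets(function, *args):
    """Return the line offsets (relative to the def line) run by one call."""
    code = function.__code__
    seen = set()

    def tracer(frame, event, arg):
        # Record 'line' events that belong to the function under test.
        if frame.f_code is code and event == "line":
            seen.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(*args)
    finally:
        sys.settrace(previous)  # always restore the previous trace hook
    return seen


if __name__ == "__main__":
    print("sign(5) ran offsets:", sorted(executed_offsets(sign, 5)))
    print("sign(-1) ran offsets:", sorted(executed_offsets(sign, -1)))
```

A test suite that only ever calls sign with non-negative numbers never executes the statement at offset 2, which is the kind of gap a coverage report makes visible.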

Too Much Testing

  • This is possible
  • Tests are code; code is hard to maintain, and tests are no exception
  • I think it is unlikely this will be a problem for you, but note that unit tests are particularly susceptible to over testing

Psychology and Testing

Confirmation Bias

Often used quote:
“I will look at any additional evidence to confirm the opinion to which I have already come” - Lord Molson, British Politician (1903-1991)
Funny, but entirely misleading

Psychology and Testing

Confirmation Bias

  • The confirmation bias is a well documented psychological effect
  • It is not the idea that people believe what they wish to believe
  • It is the idea that: when evaluating any hypothesis, we are predisposed to search for evidence to confirm it

Confirmation Bias

Country Test

  • A study, carried out in the USA, and actually seeking to investigate perceived similarity, concerned two pairs of countries:
    1. East Germany and West Germany
    2. Sri Lanka and Nepal
  • The study asked two groups of participants respectively:
    1. Which two countries are more similar to one another?
    2. Which two countries are more different to one another?
  • In the first group the majority said that East and West Germany were more similar to each other
  • In the second group the majority likewise said that East and West Germany were more different from each other.

Confirmation Bias

Country Test Explanation

  • The study was carried out in the USA, during the cold war
  • Participants routinely consumed news stories regarding East and West Germany
  • Only very rarely heard anything mentioning Sri Lanka and Nepal

Confirmation Bias

Country Test Explanation

  • Hence when asked which two countries are more similar:
    • Participants evaluated the hypotheses:
      1. East and West Germany are similar, and
      2. Sri Lanka and Nepal are similar.
    • Knowing more about East and West Germany, participants could recall far more similarities than they could for Sri Lanka and Nepal
    • They hence concluded that East and West Germany were indeed more similar

Confirmation Bias

Country Test Explanation

  • When asked which two countries are more different:
    • Participants evaluated the hypotheses:
      1. East and West Germany are different, and
      2. Sri Lanka and Nepal are different.
    • Knowing more about East and West Germany, participants could recall far more differences than they could for Sri Lanka and Nepal
    • They hence concluded that East and West Germany were indeed more different

Confirmation Bias

General Hypothesis Testing


My program is correct on all inputs

Psychology and Testing

Confirmation Bias

  • In summary: your tests should evaluate the hypothesis:
    • “My web application is incorrect and contains flaws”
    • rather than:
    • “My web application is correct and flawless”
  • Tests can only ever show the presence of flaws never their absence
  • This is a difficult philosophy to follow, because you would rather believe the second hypothesis
    • but you can at least try

Recall Famous Quote

“...testing can be a very effective way to show the presence of bugs, but is hopelessly inadequate for showing their absence. The only effective way to raise the confidence level of a program significantly is to give a convincing proof of its correctness.” -- Edsger Dijkstra

New Functionality

  • Before you start new functionality:
    1. Write at least one test for that new functionality
    2. Run your test suite and make sure your new tests fail
      • It is very easy to write tests which never fail
  • Then:
    1. Code until those new tests turn green
    2. Then refactor and make sure the tests are still green
    3. Commit to your git repository as you go
    4. Return to step 1
  • This assists in avoiding the confirmation bias because when you write the tests your hypothesis is that your current code will fail them

New Functionality

  • Always run your full test suite
  • This ensures that you do not break something that was previously working
    • If your test suite becomes too long to run during a development cycle:
      • Create a subset that exercises most functionality
      • Make sure you run your full test suite before a commit
      • Too long is generally: more than a few seconds

Fixing Bugs

  • In recent years many developers have begun to ignore the difference between a feature request and bug report
  • This is why bug trackers have been renamed as issue trackers
  • When you find a new flaw in your application, treat it exactly the same as a missing feature

Fixing Bugs

  • Before you start to debug your flaw:
    1. Write at least one test that currently fails due to that flaw
    2. Run your test suite and make sure it fails
  • Then:
    1. Debug until those new tests turn green
    2. Then refactor and make sure the tests are still green
    3. Commit to your git repository as you go
    4. Return to step 1
  • This helps to avoid re-introducing a bug once you have fixed it, because you now have a specific test for it

Testing - Summary

  • You should never be manually exercising your web application to determine correctness
  • You should do so only to determine usability
    • and who knows, you might actually build something you wish to use yourself
  • Running your test suite should be one command
  • Try to demonstrate your web application has flaws

Refactoring, Optimisation & Testing

Automation

  • Tests are really just a special case of the maxim:
    • “Never do what the computer can do for you”
  • Any activity that requires more than one step should be reduced to exactly one step
  • Code as if you have anterograde amnesia
    • Everything you do today, you will forget, so it must be recorded
    • If there is specific software for recording it, such as an issue tracker, then use it

Instructions

  • If you have to write down instructions, such as installation instructions:
    • Ask yourself whether or not you can simply make those instructions executable

Any Questions?