Optimisation & Testing
Software Engineering Large Practical
Optimisation
- Re-usability can conflict heavily with readability
- Similarly, optimised or fast code can conflict with readability
- You are writing a web application which may, ultimately, have to service millions of requests quickly
- Optimised code is generally the opposite of reusable code
- It is optimised for its particular assumptions which cannot be violated
Premature Optimisation
- The notion of optimising code before it is required
- The downside is that code becomes less adaptable
- Because the requirements on your optimised piece of code may change,
you may have to throw away your specialised code and all its optimisations
- Note: I do not mean the requirements of the project
- Of course they may also change
- But here I refer to the requirements of a particular portion
of your code which may change
Premature Optimisation
- Worse than throwing away your own optimisations, you may instead elect
to work around your specialised and optimised section of code
- Thus your premature optimisation has negatively affected other parts of
your code
Timely Optimisation
- So when is the correct time to optimise?
- Refactoring is done in between development of new functionality
- Recall this makes it easier to test that your refactoring has not
changed the behaviour of your code
- This is also a good time to do some optimisation
- You should be in a good position to test that your optimisations
have not negatively impacted correctness
- This has the additional bonus that, since you are refactoring
at a similar time, you should already be considering the adaptability
and readability of your code
Timely Optimisation
- The absolute best time to optimise code is when you discover that
it is not running fast enough
- Often this will come towards the end of the project
- It should certainly be after you have something deployable
- After you have developed and tested some major portion of functionality
A Plausible Strategy
- Perform no optimisation until the end of the project once all
functionality is complete and tested
- You may well find at that point that no optimisation is necessary
- This is a reasonable stance to take, however:
- During development, you may find that your test suite takes days to run
- Even one simple run to test the functionality you are currently developing
may take minutes or hours
- This can seriously hamper development, so it may be best to do some
optimisation at that point
How to Optimise
- The very first thing you need before you could possibly optimise
code is a benchmark
- This can be as simple as timing how long it takes to
run your test suite
- O(n^2) solutions will beat O(n log n) solutions on sufficiently
small inputs, so your benchmarks must not be too small
- Of course, equally, small inputs might be the only kind of
input you can expect
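A minimal sketch of such a benchmark, using only `timeit` from the standard library. The names `insertion_sort`, `merge_sort`, `benchmark` and `report` are hypothetical, chosen for illustration; which sort wins at each size depends on your machine, so run it on inputs of the size you actually expect.

```python
import random
import timeit


def insertion_sort(items):
    """O(n^2) sort: simple, with little overhead per element."""
    result = list(items)
    for i in range(1, len(result)):
        current = result[i]
        j = i - 1
        while j >= 0 and result[j] > current:
            result[j + 1] = result[j]
            j -= 1
        result[j + 1] = current
    return result


def merge_sort(items):
    """O(n log n) sort: better asymptotically, more overhead per call."""
    if len(items) <= 1:
        return list(items)
    middle = len(items) // 2
    left, right = merge_sort(items[:middle]), merge_sort(items[middle:])
    merged, i, j = [], 0, 0
    while i < len(left) and j < len(right):
        if left[i] <= right[j]:
            merged.append(left[i])
            i += 1
        else:
            merged.append(right[j])
            j += 1
    return merged + left[i:] + right[j:]


def benchmark(sort_function, size, repeats=5, number=20):
    """Return the best time in seconds for `number` calls on a fixed input."""
    rng = random.Random(0)  # fixed seed so every run sorts the same data
    data = [rng.random() for _ in range(size)]
    timer = timeit.Timer(lambda: sort_function(data))
    return min(timer.repeat(repeat=repeats, number=number))


def report(sizes=(8, 64, 512)):
    """Print one timing per (sort, input size) pair."""
    for size in sizes:
        for sort_function in (insertion_sort, merge_sort):
            seconds = benchmark(sort_function, size)
            print(f"{sort_function.__name__:>15} n={size:<5} {seconds:.6f}s")
```

Calling `report()` prints the timings side by side; taking the minimum over several repeats reduces noise from other processes.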
How to Optimise
- Once you have a suitable benchmark then you can:
- Run your benchmark on a build from a clean repository
recording the run time
- Perform what you think is an optimisation on your source code
- Re-run your benchmark and compare the run times
- If you have successfully improved the performance of your code,
commit your changes; otherwise revert them
- Do one optimisation at a time
Interacting Optimisations
- Word of caution: some optimisations may interact with each other, so you
may wish to evaluate them independently as well as in conjunction
- As always, source code control can empower you to do this
High-level vs Low-level Optimisations
- It is usually more productive to consider high-level optimisations
- The compiler is often good at low-level optimisations
- It is often better to call a method fewer times than to optimise
the code within a method
- For a web application, if you can minimise the number of
requests, that may increase the overall performance
- Common example: validating form input on the client-side
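One way to call a method fewer times is to remember results you have already computed. The sketch below uses `functools.lru_cache` from the standard library; `slow_lookup`, `cached_lookup` and `render_page` are hypothetical names, and the approach assumes the result for a given argument does not change between calls.

```python
from functools import lru_cache

calls = {"count": 0}  # counts how often the expensive function really runs


def slow_lookup(user_id):
    """Stand-in for an expensive call, e.g. a database query."""
    calls["count"] += 1
    return f"user-{user_id}"


@lru_cache(maxsize=None)
def cached_lookup(user_id):
    """Same result as slow_lookup, but each distinct argument is computed once."""
    return slow_lookup(user_id)


def render_page(user_ids):
    return [cached_lookup(user_id) for user_id in user_ids]
```

Rendering a page for the ids `[1, 2, 1, 1, 2]` then runs `slow_lookup` twice rather than five times. `cached_lookup.cache_info()` reports the hits and misses if you want to check this in a benchmark.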
Profiling
- Profiling is not the same as benchmarking
- Benchmarking:
- determines how quickly your program runs
- is to performance what testing is to correctness
- Profiling:
- is used after benchmarking has determined that your program is
running too slowly
- is used to determine which parts of your program are causing
it to run slowly
- is to performance what debugging is to correctness
- Without benchmarking you risk making your program worse
- Without profiling you risk wasting effort optimising a part of your
program which is either already fast or rarely executed
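A minimal profiling sketch using `cProfile` and `pstats` from the standard library. The functions `rarely_called`, `hot_loop` and `handle_request` are hypothetical stand-ins for parts of your application; `profile_report` is likewise an illustrative helper, not part of any library.

```python
import cProfile
import io
import pstats


def rarely_called():
    return sum(range(10))


def hot_loop():
    total = 0
    for i in range(200_000):
        total += i * i
    return total


def handle_request():
    rarely_called()
    return hot_loop()


def profile_report(function, limit=5):
    """Run `function` under the profiler and return the top entries as text."""
    profiler = cProfile.Profile()
    profiler.enable()
    function()
    profiler.disable()
    stream = io.StringIO()
    stats = pstats.Stats(profiler, stream=stream)
    stats.sort_stats("cumulative").print_stats(limit)
    return stream.getvalue()
```

`print(profile_report(handle_request))` lists the functions by cumulative time, which should point at `hot_loop` rather than `rarely_called` as the place worth optimising.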
Documenting Optimisations
- Source code comments are a good place to explain
why the code is the way it is
- Source code control commits are a good place to document why you performed the
optimisations including benchmark/profiler results etc.
Summary
- As with refactoring, do not optimise and develop new code at the same time
- Do not optimise blindly; instead, benchmark and profile
- Source code control will make everything easier
Testing
Course: Software Testing
- There is an entire course
in the second semester dedicated to software testing
- I am therefore providing a brief overview of the basics you will need for this practical
Famous Quote
“...testing can be a very effective way to show the presence of
bugs, but is hopelessly inadequate for showing their absence. The only
effective way to raise the confidence level of a program significantly is
to give a convincing proof of its correctness.” -- Edsger Dijkstra
Types and Testing
- Static types give you automatic proofs for some desirable properties
- Rarely is correctness one of them
- For correctness:
- Proving correctness is often infeasible, and in any case how do you know your proof is correct?
- Tests can at least increase confidence
Kinds of Testing
- There are many kinds of tests; here are but a few loosely defined categories:
- Functional Tests
- Regression Tests
- Unit Tests
- Mutation Testing
Quick Quiz
- What are the most important kinds of tests for you?
- Functional Tests
- Regression Tests
- Unit Tests
- Mutation Testing
Functional Tests
- Functional testing is rather vaguely defined
- Here I refer to testing the correct behaviour based on user actions
- For a web application this fits rather neatly: test that the correct response follows from given requests
Functional Tests
Advantages
- Observable behaviour is what you really care about
- Known input and expected outputs can often be defined by the user
- Which in addition may help tackle requirements ambiguity
Functional Tests
Disadvantages
- When tests fail there may be little indication as to where the flaws are
- Early testing before all components are ready may be difficult
Regression Tests
- A subset of functional testing, and again refers to many things
- However, I will use this term to mean testing non-changing behaviour
Regression Tests
The Oracle Problem
- In all testing, you have the problem of determining what the correct answer should be
- For example, your software may involve some complex calculation that is infeasible to do by hand
Regression Tests
The Oracle Problem
- You can at least:
- Generate some input, by hand or perhaps randomly
- Run your software to generate some possibly correct output
- Assume that that output is correct and test future versions against this
- If the output for the given input changes, then you should be able to explain it
- Either you have introduced a bug
- Or you have fixed one
- Occasionally both are valid, e.g. different formatting
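The steps above can be sketched as a small "record, then compare" helper. Everything here is illustrative: `compute_report` stands in for a calculation that is hard to check by hand, and `check_against_recorded` is a hypothetical helper that stores the first output as JSON and compares later runs against it.

```python
import json
import tempfile
from pathlib import Path


def compute_report(values):
    """Hypothetical calculation whose correct output is hard to derive by hand."""
    mean = sum(values) / len(values)
    variance = sum((v - mean) ** 2 for v in values) / len(values)
    return {"mean": round(mean, 6), "variance": round(variance, 6)}


def check_against_recorded(name, output, directory):
    """Record `output` on the first run; afterwards compare against the record."""
    path = Path(directory) / f"{name}.json"
    if not path.exists():
        path.write_text(json.dumps(output, sort_keys=True))
        return True  # nothing to compare with yet: output is assumed correct
    return json.loads(path.read_text()) == output


def demo():
    with tempfile.TemporaryDirectory() as directory:
        output = compute_report([1.0, 2.0, 4.0])
        first = check_against_recorded("report", output, directory)   # records
        second = check_against_recorded("report", output, directory)  # compares
        return first, second
```

In a real project the recorded files would live in the repository rather than a temporary directory, so that a change in output shows up as a failing test you then have to explain.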
Regression Tests
Advantages
- Typically easy to set up
- Helps to avoid breaking functionality that was working previously
- Can help with the oracle problem in a way that has obvious limitations
- Increases confidence when refactoring and optimising
Regression Tests
Disadvantages
- You may simply be confirming that you have not fixed broken behaviour
Unit Tests
- Testing a whole program at once can be difficult or infeasible
- The combination of paths can mean the total number of paths to test explodes
- One solution to this is to test each component separately for correctness
- Assuming that if all modules are correct then the sum of those correct modules is also correct
- Here the term component is flexible, it could refer to:
- Modules
- Methods
- Classes
- Phases
- Layers etc.
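As an illustration, here the component is a single function. `parse_quantity` is a hypothetical piece of an application, tested in isolation with the standard library's `unittest`.

```python
import unittest


def parse_quantity(text):
    """Hypothetical component: parse a positive whole number from form input."""
    value = int(text.strip())  # raises ValueError for non-numeric text
    if value <= 0:
        raise ValueError("quantity must be positive")
    return value


class ParseQuantityTest(unittest.TestCase):
    def test_plain_number(self):
        self.assertEqual(parse_quantity("3"), 3)

    def test_surrounding_whitespace(self):
        self.assertEqual(parse_quantity("  12 "), 12)

    def test_rejects_zero(self):
        with self.assertRaises(ValueError):
            parse_quantity("0")

    def test_rejects_non_numeric(self):
        with self.assertRaises(ValueError):
            parse_quantity("three")
```

Saved in a file, these tests can be run with `python -m unittest`. A failure points directly at `parse_quantity` rather than at the application as a whole.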
Unit Tests
Advantages
- You may not have the whole program available, but testing smaller units can help find problems before integration is possible
- May help locate bugs as you can ascertain which particular module is failing
Unit Tests
Disadvantages
- Can create a lot of extra code
- Can make the design less flexible, if only because you have to change all of your tests
- Or even throw them all away and start again
- At some point you have to do integration testing
Mutation Tests
- Mutation testing does not test your application at all
- It tests your tests
- Mutation testing is the act of semi-randomly changing lines of code
- If you do this and all of your tests pass, then your tests are not very comprehensive
Mutation Testing
An Example
if (a && b){ return "Go Ahead"; }
else { return "Abort"; }
Mutated to become:
if (a || b){ return "Go Ahead"; }
else { return "Abort"; }
If your test suite still returns all tests passing after this mutation,
then either your test suite is not comprehensive, or your code is illogical
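The same example can be sketched in Python to show which tests notice the mutation. The names `decide`, `decide_mutant`, `weak_suite` and `stronger_suite` are made up for illustration.

```python
def decide(a, b):
    return "Go Ahead" if a and b else "Abort"


def decide_mutant(a, b):
    # The mutation: `and` replaced by `or`.
    return "Go Ahead" if a or b else "Abort"


def weak_suite(function):
    """Passes for both versions: these inputs cannot tell them apart."""
    return (function(True, True) == "Go Ahead"
            and function(False, False) == "Abort")


def stronger_suite(function):
    """Adds the mixed inputs, where `and` and `or` disagree."""
    return (weak_suite(function)
            and function(True, False) == "Abort"
            and function(False, True) == "Abort")
```

`weak_suite` passes for both `decide` and `decide_mutant`, so the mutation goes unnoticed; `stronger_suite` passes for `decide` and fails for `decide_mutant`.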
Mutation Testing
Conditions
- A mutant will cause a test to fail if:
- The test reaches the mutant statement
- The input data causes distinct mutant states for the original and mutant program
- The incorrect value propagates to the output and is checked by the test
Mutation Testing
- With lots of compute power you can run lots of mutants
- Each mutant that survives (no test fails) generally must be inspected manually
- Automatically you can at least give your test suite a score:
- number of mutants that fail tests / number of mutants created
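A small sketch of computing that score. The mutants here are written by hand as stand-ins; a real mutation testing tool would generate them by rewriting your source code. All names (`make_decide`, `mutants`, `test_cases`, `mutation_score`) are hypothetical.

```python
import operator


def make_decide(combine):
    """Build a version of the decision function from a boolean operator."""
    def decide(a, b):
        return "Go Ahead" if combine(a, b) else "Abort"
    return decide


original = make_decide(lambda a, b: a and b)

# Hand-written stand-ins for automatically generated mutants.
mutants = {
    "and -> or": make_decide(lambda a, b: a or b),
    "and -> xor": make_decide(operator.xor),
    "drop b": make_decide(lambda a, b: a),
    "swap operands": make_decide(lambda a, b: b and a),  # behaves like the original
}

test_cases = [
    ((True, True), "Go Ahead"),
    ((False, False), "Abort"),
    ((True, False), "Abort"),
]


def passes_all(function):
    return all(function(*args) == expected for args, expected in test_cases)


def mutation_score():
    """Return (fraction of mutants that fail a test, names of those mutants)."""
    killed = [name for name, mutant in mutants.items() if not passes_all(mutant)]
    return len(killed) / len(mutants), killed
```

With these three test cases the score is 3 out of 4. The "swap operands" mutant passes every test, but on boolean inputs it behaves exactly like the original, so no test could ever catch it; deciding that requires a human to look at it.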
Mutation Testing
Advantages and Disadvantages
- Not so appropriate to discuss advantages and disadvantages here
since it is only used in conjunction with other testing
- Advantage: pretty easy to set up
Common Approach
- There is a common approach to developing applications
1. Start with the main method
2. Write some code, for example to parse the input
3. Write (or update) a test input file
4. Run your current application
5. See if the output is what you expect
6. Go back to step 2.
Common Approach
- For a web application, this can be even worse:
- Write some functionality
- Start the web server locally
- Manually navigate to your local web site
- Manually exercise your new functionality
Start with a Test Suite
- A much better place to start is your test suite
- A web application is already well set-up for testing
- You can easily make known requests and check for expected responses
- Note: This does not mean you cannot start coding right away
Request/Response Testing
An Example
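import os
import tempfile
import unittest
import flaskr  # the example application from the Flask tutorial

class FlaskrTestCase(unittest.TestCase):
    def setUp(self):
        # Point the application at a fresh temporary database file
        self.db_fd, flaskr.app.config['DATABASE'] = tempfile.mkstemp()
        self.app = flaskr.app.test_client()
        flaskr.init_db()

    def tearDown(self):
        os.close(self.db_fd)
        os.unlink(flaskr.app.config['DATABASE'])

    def test_empty_db(self):
        rv = self.app.get('/')
        # rv.data is bytes under Python 3, hence the bytes literal
        assert b'No entries here so far' in rv.data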
Taken from the Flask documentation
Request/Response Testing
An Example
    def setUp(self):
        self.db_fd, flaskr.app.config['DATABASE'] = tempfile.mkstemp()
        self.app = flaskr.app.test_client()
        flaskr.init_db()
The setup simply creates a temporary empty database
Request/Response Testing
An Example
    def tearDown(self):
        os.close(self.db_fd)
        os.unlink(flaskr.app.config['DATABASE'])
The teardown simply states that after each test has run, whether the test
succeeded or failed, the temporary database is deleted.
Request/Response Testing
An Example
    def test_empty_db(self):
        rv = self.app.get('/')
        # rv.data is bytes under Python 3, hence the bytes literal
        assert b'No entries here so far' in rv.data
The actual test:
- Asks the framework to resolve the route / and return the HTML page response
- That is the page generated from your method linked to that route
- Checks that that page contains the ‘No entries ..’ string
XUnit
- For this to work you do not even need to use a test framework
- But of course, one will help
- XUnit is the collective name for a unit testing framework that has been ported to many languages
- Although it has unit in the name, it is not really specific to unit testing
- See the Wikipedia page
Testing Requests
- Your web application will have a front-end which makes
requests to your back-end
- To ensure correctness:
- Ensure the front-end generates correct requests
- Ensure the back-end processes requests correctly
- Ensure that these are integrated so that the front-end is
not generating requests that the back-end cannot process
- However, this is often complicated by the fact that some of your front-end is actually produced by your back-end
- For example, the HTML page response you produce may contain an input form on it, or even simply a link
For this Practical
- Because this is a relatively short practical, and I'm aware that some of you have had quite a lot to learn:
- This practical will not require you to test your front-end code
- That does not mean it might not help you to do so, only that it will not be reflected in your grade if you elect not to
- JavaScript certainly has an XUnit implementation
- In summary: all I wish for you to do is write some kind of automated tests that check your application makes the correct responses to known requests
- Alternatively you can of course prove your application correct
Testing Requests
- Testing your responses to requests is further complicated because you
should also make sure that you correctly handle requests that your
front-end does not produce
- For example, an input POST request with missing fields
- Of course the correct response may simply be an error response
- But you should at least not have your web application crash
- If you are successful enough, other developers will use your API,
whether you want them to or not
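A framework-independent sketch of testing such a request. `handle_add_entry` is a hypothetical back-end handler that takes the decoded POST data as a dict and returns a status code and body; the field names `title` and `text` are assumptions for illustration.

```python
def handle_add_entry(form):
    """Hypothetical handler: returns (status_code, body) for a POST request."""
    missing = [field for field in ("title", "text") if not form.get(field)]
    if missing:
        # An error response is a correct response; crashing is not.
        return 400, {"error": "missing fields: " + ", ".join(missing)}
    return 201, {"title": form["title"], "text": form["text"]}


def test_well_formed_request():
    status, body = handle_add_entry({"title": "Hello", "text": "World"})
    assert status == 201
    assert body["title"] == "Hello"


def test_missing_field_is_an_error_not_a_crash():
    status, body = handle_add_entry({"title": "Hello"})
    assert status == 400
    assert "text" in body["error"]


def test_empty_request():
    status, _ = handle_add_entry({})
    assert status == 400
```

The last two tests send requests that a correct front-end would never generate, which is exactly why they are worth having.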
Known Requests - Expected Responses
- This is a very test-friendly environment
- However, many of your known requests should also be expected
to update the persistent state
- Still, this is relatively easy to test: you have
known preconditions and expected postconditions
- For example you can test that a request that should insert something into the database, really does do that
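A minimal sketch of such a precondition/postcondition test, using an in-memory SQLite database from the standard library. The table layout and the functions `create_schema`, `add_entry` and `count_entries` are assumptions made for the example, not part of any particular application.

```python
import sqlite3


def create_schema(connection):
    connection.execute(
        "CREATE TABLE entries (id INTEGER PRIMARY KEY, title TEXT NOT NULL)"
    )


def add_entry(connection, title):
    """Hypothetical body of a request handler: insert one entry."""
    with connection:  # commits on success, rolls back on error
        connection.execute("INSERT INTO entries (title) VALUES (?)", (title,))


def count_entries(connection):
    return connection.execute("SELECT COUNT(*) FROM entries").fetchone()[0]


def test_add_entry_inserts_one_row():
    connection = sqlite3.connect(":memory:")
    try:
        create_schema(connection)
        assert count_entries(connection) == 0  # known precondition
        add_entry(connection, "First post")
        assert count_entries(connection) == 1  # expected postcondition
        row = connection.execute("SELECT title FROM entries").fetchone()
        assert row == ("First post",)
    finally:
        connection.close()
```

Because each test creates its own empty database, the precondition is always known and tests cannot interfere with one another.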
Quick Quiz
- What are the most important kinds of tests for you?
- Functional Tests
- Regression Tests
- Unit Tests
- Mutation Testing
- Likely Functional Tests
- If you get functional testing right, then you’re
doing enough for this practical
- Regression testing is at least easy, and a subset of functional testing
General Advice
- You are technically doing unit testing, as you are already
testing the back-end separately from the front-end
- However, I recommend you do not decompose further, as you will
likely solidify your design
- With the possible exception of database access, which may be tested as an isolated component
- Though of course this will depend largely on your particular application
Enough Testing
- How do you know when you have tested enough?
- We would like to test for all inputs but that is generally
at best infeasible
- Instead we divide our input space into equivalence classes
- But how do we know we have divided our input space into the
correct equivalence classes?
- How do we know we have tested all of the classes?
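A sketch of what dividing the input space can look like in a test. `shipping_cost` and its rules are entirely hypothetical; the point is that each class of input gets one representative, plus the values on the boundaries between classes.

```python
def shipping_cost(weight_kg):
    """Hypothetical rule: free under 1 kg, flat fee up to 20 kg, else refused."""
    if weight_kg <= 0:
        raise ValueError("weight must be positive")
    if weight_kg < 1:
        return 0
    if weight_kg <= 20:
        return 5
    raise ValueError("too heavy")


# One representative per class, plus the boundaries between classes.
VALID_CASES = [(0.5, 0), (1, 5), (10, 5), (20, 5)]
INVALID_CASES = [-1, 0, 20.5]


def test_equivalence_classes():
    for weight, expected in VALID_CASES:
        assert shipping_cost(weight) == expected
    for weight in INVALID_CASES:
        try:
            shipping_cost(weight)
        except ValueError:
            continue  # rejected, as expected
        raise AssertionError(f"{weight} should have been rejected")
```

Whether these are the right classes is a judgement about the requirements; the code cannot answer that for you.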
Coverage Analysis
- For this you can use a coverage analyser
- Run your test-suite using the coverage analyser which will tell you which entities have been exercised
- The entities might be: Functions, Statements, Branches, Conditions
- This is a large topic for another day, but briefly:
- Ideally: check every possible path has been executed
- Generally infeasible, so we can at least check that every statement has been executed
- This will give you some confidence that your test suite is quite comprehensive
- Works well with mutation testing since reaching the mutant statement is the first condition
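To make the idea of statement coverage concrete, here is a toy line tracer built on `sys.settrace` from the standard library. It is only a sketch: `classify` and `executed_lines` are hypothetical names, and a real project would use a dedicated coverage tool rather than this.

```python
import sys


def classify(n):
    if n < 0:
        return "negative"
    if n == 0:
        return "zero"
    return "positive"


def executed_lines(function, *args):
    """Return the lines of `function` that ran, as offsets from its `def` line."""
    code = function.__code__
    hit = set()

    def tracer(frame, event, arg):
        if frame.f_code is not code:
            return None  # do not trace other functions
        if event == "line":
            hit.add(frame.f_lineno - code.co_firstlineno)
        return tracer

    previous = sys.gettrace()
    sys.settrace(tracer)
    try:
        function(*args)
    finally:
        sys.settrace(previous)  # restore whatever tracer was active before
    return hit


def covered_by(inputs):
    """Union of the lines executed over a collection of test inputs."""
    lines = set()
    for value in inputs:
        lines |= executed_lines(classify, value)
    return lines
```

A test suite that only ever passes positive numbers never executes the `return "negative"` or `return "zero"` statements; adding a negative input and zero brings every statement of `classify` into the covered set.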
Too Much Testing
- This is possible
- Tests are code, code is hard to maintain, tests are no exception
- I think it is unlikely this will be a problem for you, but note that unit tests are particularly susceptible to over-testing
Psychology and Testing
Confirmation Bias
Often used quote:
“I will look at any additional evidence to confirm the
opinion to which I have already come”
- Lord Molson, British Politician (1903-1991)
Funny, but entirely misleading
Psychology and Testing
Confirmation Bias
- The confirmation bias is a well documented psychological effect
- It is not the idea that people believe what they wish to believe
- It is the idea that: when evaluating any hypothesis, we are
predisposed to search for evidence to confirm it
Confirmation Bias
Country Test
- A study, carried out in the USA, and actually seeking to investigate
perceived similarity, concerned two pairs of countries:
- East Germany and West Germany
- Sri Lanka and Nepal
- The study asked two groups of participants respectively:
- Which two countries are more
similar to one another?
- Which two countries are more
different to one another?
- In the first group the majority said that
East and West Germany were more similar
to each other
- In the second group the majority likewise said that
East and West Germany were more
different from each other.
Confirmation Bias
Country Test Explanation
- The study was carried out in the USA, during the cold war
- Participants routinely consumed news stories regarding East and West Germany
- Only very rarely heard anything mentioning Sri Lanka and Nepal
Confirmation Bias
Country Test Explanation
- Hence when asked which two countries are more similar:
- Participants evaluated the hypotheses:
- East and West Germany are similar. and
- Sri Lanka and Nepal are similar.
- Knowing more about East and West Germany participants could
recall far more similarities
than they could for Sri Lanka and Nepal
- They hence concluded that East and West Germany were indeed more similar
Confirmation Bias
Country Test Explanation
- When asked which two countries are more different:
- Participants evaluated the hypotheses:
- East and West Germany are different. and
- Sri Lanka and Nepal are different.
- Knowing more about East and West Germany participants could
recall far more differences
than they could for Sri Lanka and Nepal
- They hence concluded that East and West Germany were indeed more different
Confirmation Bias
General Hypothesis Testing
My program is correct on all inputs
Psychology and Testing
Confirmation Bias
- In summary: your tests should evaluate the hypothesis:
- “My web application is incorrect and contains flaws”
- rather than:
- “My web application is correct and flawless”
- Tests can only ever show the presence of flaws never their absence
- This is a difficult philosophy to follow, because you would rather believe the second hypothesis
Recall Famous Quote
“...testing can be a very effective way to show the presence of
bugs, but is hopelessly inadequate for showing their absence. The only
effective way to raise the confidence level of a program significantly is
to give a convincing proof of its correctness.” -- Edsger Dijkstra
New Functionality
- Before you start new functionality:
- Write at least one test for that new functionality
- Run your test suite and make sure your new tests fail
- It is very easy to write tests which never fail
- Then:
- Code until those new tests turn green
- Then refactor and make sure the tests are still green
- Commit to your git repository as you go
- Return to the first step (write a test) for the next piece of functionality
- This assists in avoiding the confirmation bias because
when you write the tests your hypothesis is that your current
code will fail them
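The cycle above can be sketched with `unittest`. Here `slugify_stub` plays the role of your code before the feature exists and `slugify` the code afterwards; both names, and the helpers `make_test_case` and `run`, are hypothetical.

```python
import unittest


def slugify_stub(title):
    """Placeholder: the state of the code before the feature is written."""
    raise NotImplementedError


def slugify(title):
    """The implementation written afterwards, to turn the test green."""
    return "-".join(title.lower().split())


def make_test_case(function):
    """Build the same test against whichever version of the code is given."""
    class SlugifyTest(unittest.TestCase):
        def test_spaces_become_hyphens(self):
            self.assertEqual(function("Hello Big World"), "hello-big-world")
    return SlugifyTest


def run(test_case):
    """Run a test case and report whether every test passed."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(test_case)
    result = unittest.TestResult()
    suite.run(result)
    return result.wasSuccessful()
```

`run(make_test_case(slugify_stub))` is false and `run(make_test_case(slugify))` is true. Seeing the test fail first is what tells you it is capable of failing at all.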
New Functionality
- Always run your full test suite
- This ensures that you do not break something that was previously working
- If your test suite becomes too long to run during a development cycle:
- Create a subset that exercises most functionality
- Make sure you run your full test suite before a commit
- Too long is generally: more than a few seconds
Fixing Bugs
- In recent years many developers have begun to ignore the difference between a feature request and bug report
- This is why bug trackers have been renamed as issue trackers
- When you find a new flaw in your application, treat it exactly the same as a missing feature
Fixing Bugs
- Before you start to debug your flaw:
- Write at least one test that currently fails due to that flaw
- Run your test suite and make sure it fails
- Then:
- Debug until those new tests turn green
- Then refactor and make sure the tests are still green
- Commit to your git repository as you go
- Return to the first step (write a failing test) for the next flaw
- This helps to avoid re-introducing a bug once you have fixed it,
because you now have a specific test for it
Testing - Summary
- You should never be manually exercising your web application
to determine correctness
- You should do so only to determine usability
- And who knows, you might actually build something you wish to use yourself
- Running your test suite should be one command
- Try to demonstrate your web application has flaws
Refactoring, Optimisation & Testing
Automation
- Tests are really just a special case of the maxim:
- “Never do what the computer can do for you”
- Any activity that requires more than one step should be
reduced to exactly one step
- Code as if you have anterograde amnesia
- Everything you do today, you will forget, so it must be
recorded
- If there is specific software for recording it, such as an
issue tracker then use it
Instructions
- If you have to write down instructions, such as installation instructions:
- Ask yourself whether or not you can simply make those instructions executable