Note: This page refers to a past version of the course. You can also
consult the current Inf1-DA course web
pages.
- People
- Sample Exam
- Course description
- Lecture slides and notes
- Tutorials and Lab Exercises
- Assessed assignments
- General Information & Other Resources
Lecturer: Helen Pain (helen AT inf DOT ed DOT ac DOT uk)
Teaching Assistants:
- Kate Byrne (k DOT byrne AT ed DOT ac DOT uk), Structured Data
- Gail Sinclair (csincla1 AT inf DOT ed DOT ac DOT uk), Semi-structured Data
- Gaya Nadarajan (gaya DOT n AT ed DOT ac DOT uk), Unstructured Data
A mock exam for the structured data and semi-structured data
components of the course can be found
here;
the answers are
here.
The goal of this strand is to provide an introduction to collecting,
representing and interpreting data across the range of
Informatics. Students will learn the different perspectives from which
data is used, the different terminology used when referring to it, and
a number of representation and manipulation methods. A small
number of running, illustrative examples will be presented, from the
perspectives of hypothesis testing and query formation and answering.
After completing the course successfully, students should be able to:
- Demonstrate knowledge of the terminology and paradigms used in
different areas of informatics for collecting, representing and
interpreting data, by being able to apply them to sample problems.
- Demonstrate understanding of the different types of data
(structured/unstructured, observational/experimental,
quantitative/qualitative), by being able to identify the correct type
of data for a given application.
- Demonstrate proficiency with the entity/relationship model by being
able to specify appropriate representations and queries for simple
examples.
- Show awareness of the importance of logic for the representation of
data by being able to design a simple logical representation of a given
data set.
- Present data in a variety of forms (textual, graphical, quantitative), across a range of data types.
- Show awareness of the distinction between object data and
meta-data, by being able to apply it to a number of applications across
informatics (e.g., databases, corpora).
- Demonstrate knowledge of the basic algorithms for interpreting and
processing data, by being able to demonstrate how these algorithms work
for simple data sets.
Lecture Notes
The pdf document available
here
has now been updated with the first and second chapters of the lecture
notes for the Data and Analysis Module. The Unstructured Data section
is still to be added.
Background reading materials
Apart from the lecture notes, which you can download from the link above
(or get as hard copies from the Teaching office), there are also a number
of reading packs that are ONLY AVAILABLE from the Teaching office.
For the Structured Data component of the course there is only one background reading pack:
BP01
Ramakrishnan, R. and Gehrke, J. (2002).
Database Management Systems, McGraw-Hill, 3rd Edition. (Chapter 4), pp 100-126.
ISBN: 0072465638
For the Semi-Structured Data component there are two packs:
BP02
McEnery, Tony and Andrew Wilson. (2001). What is a corpus and what is in it?
Chapter 2 of:
Corpus Linguistics: an Introduction. Edinburgh University Press, 2nd Edition. pp 29-74.
BP03
Manning, Christopher D. and Hinrich Schütze. (1999). Topics in Information Retrieval. Chapter 15 of:
Foundations of Statistical Natural Language Processing. MIT Press, Cambridge, MA. pp 529-571.
For the Unstructured Data component:
BP04
Dix, A., Finlay, J., Abowd, G. and Beale, R. (2004). Evaluation Techniques. Chapter 9 of:
Human-Computer Interaction (3rd edition). Pearson/Prentice Hall, Harlow, England. pp 318-364.
Slides
- First set (Structured Data)
All sets of slides here (Printer friendly).
- Second set (Semi-structured Data)
Lecture 6 slides here (Printer friendly).
Lecture 7 slides here (Printer friendly).
Lecture 8 slides here (Printer friendly).
Lecture 9 slides here (Printer friendly).
Lecture 10 slides here (Printer friendly).
- Third set (Unstructured Data)
Note: Open these using Acrobat Reader only! Other readers might not work.
Lecture 11 (Requirements) slides here (Printer friendly).
Lecture 12 (Methods) slides here (Printer friendly).
Lecture 13 (Visualising data) slides here (Printer friendly).
Lecture 14 (Use Cases) slides here (Printer friendly).
Lecture 14 (Evaluation) slides here (Printer friendly).
Lecture 15 (Methods) slides here (Printer friendly).
Tutorials
DA Tutorial 1, Week 3:
Handout here (pdf format) with appendix.
DA Tutorial 2, Week 5:
Handout here (pdf format).
DA Tutorial 3, Week 6:
Handout here (pdf format).
DA Tutorial 4, Week 8:
Handout here (pdf format).
DA Tutorial 5, Week 9:
Handout here (pdf format).
DA Tutorial 6, Week 11:
Handout here (pdf format).
Lab Sessions
DA Lab 1, Week 4:
Handout here (pdf format).
DA Lab 2, Week 7:
Handout here (pdf format).
DA Lab 3, Week 10:
Handout here (pdf format). MATLAB Tutorial here (pdf format).
ASN-02, First DA Coursework:
Handout here (pdf format) with answer template file.
Due Friday 9th Feb 2007, 12 noon.
ASN-05, Second DA Coursework:
Handout here (pdf format).
Due Friday 9th March 2007, 12 noon.
ASN-06, Joint DA/OOP Coursework:
Handout here (pdf format). You will also need this zip file.
Due Friday 23rd March 2007, 5:00pm.
Some useful pages on Use Cases:
- Resources for writing use cases, Alistair Cockburn
- Structuring Use Cases with Goals
- Use Case Fundamentals, a summary