Graham Cormode has been awarded a prestigious ERC Consolidator Grant worth 1.5M euro to support his research. The grant is for a project entitled “Small Summaries for Big Data”. The project focuses on the design and analysis of compact summaries: data structures that capture key features of the data and that can be created effectively over distributed data sets. The project will substantially advance the state of the art in data summarization, to the point where accurate and effective summaries are available for a wide array of problems and can be used seamlessly in applications that process big data.
The Department of Computer Science at the University of Warwick was rated second in the country in the recent REF exercise run by the UK government.
The University is also one of five selected to take part in the Alan Turing Institute, a new venture designed to lead British research in Computer Science. The Department of Computer Science, along with Mathematics and Statistics, will lead the University’s involvement.
This position has been filled.
A Microsoft Research scholarship place is available to study algorithms for massive data analysis, leading to a PhD in Computer Science. Increasingly, we are faced with larger and larger volumes of data from which to extract insights and intelligence. Of particular interest is data that can be represented as a graph or (adjacency) matrix. A promising approach is to look for ways to “sketch” such structures: to build a representation that is much more compact than the input, but which allows some function of interest on the original data to be approximated accurately. Such sketches are well known and widely used for data that can be represented as a vector (for example, to identify the most frequent elements or to count the number of distinct items). The goal of this scholarship project is to develop new algorithms for sketching massive graphs and matrices, and to demonstrate their usefulness via theoretical analysis and empirical evaluation. The hope is to advance our knowledge of the theory in this area and to design algorithms that can be used in practice, such as for querying data represented as a (massive) graph, clustering or partitioning graph-structured data, and solving optimization problems over large graphs.
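To make the idea of a vector sketch concrete, here is a minimal illustrative example of one well-known summary, the Count-Min sketch, which approximates item frequencies in far less space than an exact table. It is a sketch under stated assumptions and not a description of the methods the scholarship project would develop: the class name, the default `width` and `depth`, and the use of salted BLAKE2b hashing are all choices made here for illustration.

```python
import hashlib


class CountMinSketch:
    """Illustrative Count-Min sketch: approximate counts in depth x width counters."""

    def __init__(self, width=256, depth=4):
        # width and depth are illustrative defaults, not tuned values.
        self.width = width
        self.depth = depth
        self.table = [[0] * width for _ in range(depth)]

    def _index(self, item, row):
        # One hash function per row, obtained by salting BLAKE2b with the row number.
        digest = hashlib.blake2b(
            str(item).encode("utf-8"),
            digest_size=8,
            salt=row.to_bytes(8, "little"),
        ).digest()
        return int.from_bytes(digest, "little") % self.width

    def add(self, item, count=1):
        # Increment one counter in every row.
        for row in range(self.depth):
            self.table[row][self._index(item, row)] += count

    def estimate(self, item):
        # Collisions can only inflate a counter, so the minimum over rows
        # is an estimate that never falls below the true count.
        return min(
            self.table[row][self._index(item, row)] for row in range(self.depth)
        )


sketch = CountMinSketch()
for word in ["graph", "matrix", "graph", "sketch", "graph"]:
    sketch.add(word)
print(sketch.estimate("graph"))
```

The estimate is one-sided: it may overcount when items collide, but it never undercounts, and making the table wider or deeper reduces the chance of a large error.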
The scholarship will support tuition fees and a stipend to study at the University of Warwick, under the guidance of Professor Graham Cormode and Dr. Milan Vojnovic of Microsoft Research. The Microsoft PhD Scholarship Programme recognises and supports exceptional students who show the potential to make an outstanding contribution to science and computing. Each Microsoft scholarship consists of an annual bursary for up to three years.
During the course of their PhD, Scholars are invited to Microsoft Research in Cambridge for an annual Summer School that includes a series of talks of academic interest and poster sessions, which give Scholars the opportunity to present their work to Microsoft researchers and a number of Cambridge academics. There is also the possibility of internships at Microsoft Research.
Applicants require a first-class Honours degree or equivalent in Computer Science, Mathematics or Computer Engineering, experience in programming, and an aptitude for mathematical subjects. Knowledge of algorithms, linear algebra, graph theory and probability is essential. A Master's degree is desirable. Before the Scholarship can be awarded, the candidate must also complete the formal admission procedure of the University of Warwick and be approved by Microsoft Research. The scholarship covers fees for students from European Union countries. In exceptional cases, it may be possible to support students from outside the EU.
To apply, please contact Graham Cormode or Milan Vojnovic directly with a CV and a description of your experience relevant to this project. Please apply by January 31, 2015 for full consideration. Further details and suggested reading are available from Prof. Graham Cormode (G.Cormode@Warwick.ac.uk).
Nick Duffield (Texas A&M University) and Graham Cormode presented their tutorial on Sampling for Big Data at KDD 2014. The abstract is as follows:
One response to the proliferation of large datasets has been to develop ingenious ways to throw resources at the problem, using massive fault-tolerant storage architectures and parallel and graphical computation models such as MapReduce, Pregel and Giraph. However, not all environments can support this scale of resources, and not all queries need an exact response. This motivates the use of sampling to generate summary datasets that support rapid queries and prolong the useful life of the data in storage. To be effective, sampling must mediate the tensions between resource constraints, data characteristics, and the required query accuracy. The state of the art in sampling goes far beyond simple uniform selection of elements, in order to maximize the usefulness of the resulting sample. This tutorial reviews progress in sample design for large datasets, including streaming and graph-structured data. Applications to sampling network traffic and social networks are discussed.
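As a point of reference for the "simple uniform selection" baseline the abstract mentions, the following is a minimal sketch of reservoir sampling, a standard way to draw a uniform sample of fixed size from a stream whose length is not known in advance. It is an illustration only and is not taken from the tutorial: the function name `reservoir_sample` and its parameters are assumptions made here.

```python
import random


def reservoir_sample(stream, k, rng=None):
    """Return a uniform random sample of up to k items from an iterable.

    Uses constant memory beyond the k-item reservoir and reads the
    stream exactly once.
    """
    if rng is None:
        rng = random.Random()
    reservoir = []
    for i, item in enumerate(stream):
        if i < k:
            # Fill the reservoir with the first k items.
            reservoir.append(item)
        else:
            # Item i (0-based) is kept with probability k / (i + 1);
            # if kept, it replaces a uniformly chosen reservoir slot.
            j = rng.randint(0, i)  # inclusive on both ends
            if j < k:
                reservoir[j] = item
    return reservoir


# Example: sample 5 records from a stream of 10,000 without storing the stream.
sample = reservoir_sample(range(10_000), 5, random.Random(42))
print(sample)
```

The sample designs surveyed in the tutorial go beyond this baseline, for instance by selecting items non-uniformly so that the sample is more useful for the queries of interest.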