[USC logo]

 

CSCI 599 (Fall 2008)
Evolution of Database Research
Highlighting Jim Gray's Work

Course Summary

Reading List

Schedules

Presentations

Projects

Project Reports

Related Web Sites

Academic Integrity Policy


Instructor

Prof. Cyrus Shahabi

 

University of Southern California
Computer Science Department
SAL 300
Los Angeles, CA 90089-0781
Office (PHE-306a): (213) 740-8162
Lab (RTH-323): (213) 821-1462
Office Hours:  MW 11-12, PHE -306a

 

 

Course Summary

Description:

 

Jim Gray has received ACM Turing award in 1998 for his "seminal contributions to database and transaction processing research and technical leadership in system implementation." Jim has several publications in the area of databases (see http://research.microsoft.com/~gray/JimGrayPublications.htm ) with his major contributions being the granular database locking, two-tier transaction commit semantics, and the data cube operator. He has also had a founding role in the development of the following systems: System R (while at IBM), TerraServer-USA and Skyserver (at Microsoft).

The purpose of this seminar course is to study some of the timeless contributions of Jim Gray to the field of databases. We will start by focusing on his classic work in the area of transaction processing, in particular transaction locking and recovery. His contributions to this area are the basis of online transaction processing on the Internet (e.g., online shopping, online airline reservation). Next, we study his proposed benchmarks to measure the performance of commercial databases. These benchmarks led to an industry-wide competition that resulted in the fast databases of today. We continue by studying his proposed "cube" operation, which has been referenced by almost all the later papers on OLAP and datawarehousing. Finally, we cover some of his recent work in the area of eScience, which many consider him to be its founder.

Jim's papers are easy to follow and he has a gift in explaining complicated concepts in simple words and abstracts them out intuitively and elegantly. Hence, the best way to understand his major contributions is by covering his own papers. During the course of this seminar, we study some of the most well-known papers of Jim and meanwhile I try to share some of my personal memories of Jim as well as those I heard from other colleagues. Finally, given Jim's recent interest in eScience applications, the course includes several development projects (one project per team) for the two ongoing eScience applications at InfoLab in the areas of Earth Observation (JPL) and Oil Reservoir Management (Chevron).

Class Format and Evaluation Method:

In general, a class will consist of 2 paper presentations, each lasting 45-60 minutes followed by discussions, and a 30-minute implementation project update. Each paper will be presented by one student. The student is expected to go beyond the paper to seek online resources and examples that illustrate the principles and algorithms introduced in the paper. Furthermore, each paper presentation should include 10-15 minutes coverage of more recent papers by other authors that either builds on Jim Gray's paper or exposes new ideas related to the theme of Jim's paper. Every student is expected to complete the assigned reading, be prepared to discuss the articles in class, and to write a short critical summary of the presentations. The implementation of selected algorithms will be done in assigned teams of no more than four. Evaluation is based on: the team project (50%), the individual paper presentation(s) (30%) and the written paper summaries (20%).

Time and Location:

 

Wednesdays 3:30-5:50pm

Pre-requisite:

 

CSCI-585 and Instructor Permission

 

Projects

Title

Project Lead

Team Members

Project Description

  Moving Object Databases

Ugur Demiryurek

  Ugur, Dafei, Parixit

 

  GeoSim

 Houtan Shirani-Mehr

  Houtan, Penny

 

  Geodec

  Songhua Xing

  Songhua, Ali, Jeff

 

  Zumberge

  Leyla Kazemi

  Leyla, Elnaz

 

  Brain Project

  Hyunjin Yoon

  Hyunjin

 


Schedules

Week

Date

Papers presented

Presenters

1

08/27/08

No Class

 

1

09/03/08

Course introduction, Paper assignment, Project groups , a-1

 Cyrus Shahabi, Ugur Demiryurek

2

09/10/08

a-2, a-3

Ali Khodaei, Pan Bei

3

09/17/08

Presentation of project proposals

 

4

09/24/08

a-4, b-1

, Ali Khodaei

5

10/01/08

b-2, b-3

Houtan Shirani-Mehr,Parixit Kaira

6

10/08/08

b-4, c-1

Leyla Kazemi , Jeff Khoshgozaran

7

10/15/08

c-2i, c-2ii, c-3i, c-3ii

Songhua Xing, Songhua Xing

8

10/22/08

c-4i, c-4ii, c5

Parixit Kaira , Elnaz

9

10/29/08

d-1, e-3

Jeff Khoshgozaran,  Dafei Yin

10

11/05/08

No Class

11

11/12/08

e-1ii

Houtan Shirani-Mehr

12

11/19/08

e-4i, e-4ii , e-5

Elnaz, Hyunjin Yoon 

13

11/26/08

Project presentation

 

14

12/3/08

Project presentation

 


Reference Book:

TBD.

Reading List

 a) Database Locking

1) Locking, J. Gray, Proc. Woods Hole Conference on Concurrent Systems and Parallel Computation, J. Dennis Ed., ACM 1970 Conference Record. 1970. pp. 169-176

2) Granularity of Locks and Degrees of Consistency , J. Gray, R. Lorie, G.F. Putzolu, and I.L. Traiger, Modeling in Data Base Management Systems , G.M. Nijssen ed., North Holland Pub., 1976, pp. 364-394.

3) Granularity of Locks in a Shared Database , J. Gray, R. Lorie, G.F. Putzolu, Proceedings of International Conference on Very Large Databases, ACM Conference Record, 1975. pp. 231-248.

4) Views, Authorization and Locking in a Relational Database System , J. Gray, D.D. Chamberlin, I.L. Traiger, National Computer Conference Proceedings, Spartan Press, 1975, pp. 425-430.

 

b) Database Recovery

1) A Transaction Model, J. Gray, Lecture Notes in Computer Science, V. 85, Springer Verlag, 1980, pp. 282-298.

2) The Transaction Concept, Virtues And Limitations, J. Gray, Proceedings of 7th VLDB, Cannes , France , 1981, pp. 144-154.

3) The Recovery Manager of a Data Management System, J. Gray, P.R. McJones, M.W. Blasgen, R.A. Lorie, T.G. Price, G.R Putzolu, I.L. Traiger, ACM Computing Surveys, V. 13.2, 1982, pp. 223-242.

4) Consensus on Transaction Commit, Jim Gray, Leslie Lamport, MSR-TR-2003-96, January 2004, ACM TODS Vol. 31, No. 1, March 2006, pp. 133-160

 

c) Database Performance and Benchmarking

1) One Thousand Transactions Per Second, J. Gray, B. Good, P.W. Homan, D.E. Gawlick, H. Sammer, IEEE Compcon Proceedings, San Francisco, IEEE Press, 1985.

2) i. A Measure of Transaction Processing Power with 25 others Datamation, V 31.7, April 1985, pp 112-118.

ii. A Measure of Transaction Processing 20 Years Later, Jim Gray, MSR-TR-2005-57, April 2005. IEEE Data Engineering Bulletin, V.28.2 , pp. 3-4, June 2005

3) i. The Five Minute Rule for Trading Memory for Disc Accesses, and the 10 Byte Rule for Trading Memory for CPU Time , J. Gray, F. Putzolu, Proceedings of SIGMOD 87, June 1987, pp. 395-398.

ii. The Five-Minute Rule Ten Years Later, and Other Computer Storage Rules of Thumb, J. Gray, G. Graefe, SIGMOD Record 26(4) 1997: pp. 63-68 MSR TR 97 33

4) i Performance / Price Sort and PennySort , J. Gray, C. Nyberg, MSR-TR-98-23, April 1998

ii. Thousands of DebitCredit Transactions-Per-Second: Easy and Inexpensive, Jim Gray, MSR-TR-2005-39, April 2005.

5) Parallel Database Systems: The Future of Database Processing , David J. DeWitt,Jim Gray : Communications of the ACM, Vol. 36, No. 6, June 1992

 

d) Datacube

1) MapReduce: Simplifieed Data Processing on Large Clusters, J.Dean and S.Ghemawat, Google, Inc.

e) eScience

1) i. Scientific Data Management in the Coming Decade, Jim Gray, David T. Liu, Maria A. Nieto-Santisteban, Alexander S. Szalay, Gerd Heber, David DeWitt, MSR-TR-2005-10, January 2005. ACM SIGMOD Record, V 34.4, pp 35-41

ii. Data Cube: A Relational Aggregation Operator Generalizing Group-by, Cross-Tab, and Sub Totals, J. Gray, S. Chaudhuri, A. Bosworth, A. Layman, D. Reichart, M. Venkatrao, F. Pellow, H. Pirahesh: Data Mining and Knowledge Discovery 1(1), 1997, 29-53.

2) Cross-Matching Multiple Spatial Observations and Dealing with Missing Data , J. Gray, A. Szalay, T. Budavári, R. Lupton, M. Nieto-Santisteban, A. Thakar, MSR-TR-2006-175, December 2006

3) Using Table Valued Functions in SQL Server 2005 to Implement a Spatial Data Library, Jim Gray, Alex Szalay, Gyorgy Fekete, MSR-TR-2005-122, September 2005.

4) i.The Revolution in Database Architecture, Jim Gray, MSR-TR-2004-31, March 2004.

ii. When Database Systems Meet The Grid, Maria A. Nieto-Santisteban, Alexander S. Szalay, Aniruddha R. Thakar, William J. O'Mullane, Jim Gray, James Annis, MSR-TR-2004-81, August 2004. Proceedings of ACM CIDR 2005, Asilomar , CA , Jan 2005.

5) Data Mining the SDSS SkyServer Database, J. Gray, A.S. Szalay, A. Thakar, P. Kunszt, C. Stoughton, D. Slutz, J. vandenBerg Distributed Data & Structures 4: Records of the 4th International Meeting , pp 189-210 W. Litwin, G. Levy (eds), Paris France March 2002, Carleton Scientific 2003, ISBN 1-894145-13-5, also MSR-TR-2002-01, Jan. 200

 

 


Related Web Sites


Academic Integrity Policy

Academic Integrity

All homeworks must be solved and written independently, or you will be penalized for cheating. The USC Student Conduct Code prohibits plagiarism. All USC students are responsible for reading and following the Student Conduct Code, which appears on pp. 73-78 of the 1999-2000 SCampus.

In this course we encourage students to study together. This includes discussing general strategies to be used on individual assignments. However, all work submitted for the class is to be done individually.

Some examples of what is not allowed by the conduct code: copying all or part of someone else's work (by hand or by looking at others' files, either secretly or if shown), and submitting it as your own; giving another student in the class a copy of your assignment solution; consulting with another student during an exam. If you have questions about what is allowed, please discuss it with the instructor.

Students who violate University standards of academic integrity are subject to disciplinary sanctions, including failure in the course and suspension from the University. Since dishonesty in any form harms the individual, other students, and the University, policies on academic integrity will be strictly enforced.

USC seeks to maintain an optimal learning environment. General principles of academic honesty include the concept of respect for the intellectual property of others, the expectation that individual work will be submitted unless otherwise allowed by an instructor, and the obligations both to protect one's own academic work from misuse by others as well as to avoid using another's work as one's own. All students are expected to understand and abide by these principles. Scampus, the Student Guidebook, contains the Student Conduct Code in Section 11.00, while the recommended sanctions are located in Appendix A: http://www.usc.edu/dept/publications/SCAMPUS/gov/. Students will be referred to the Office of Student Judicial Affairs and Community Standards for further review, should there be any suspicion of academic dishonesty. The Review process can be found at: http://www.usc.edu/student-affairs/SJACS/ .

Violations of the Student Conduct Code will be filed with the Office of Student Conduct, and appropriate sanctions will be given.

Statement for Students with Disabilities

Any student requesting academic accommodations based on a disability is required to register with Disability Services and Programs (DSP) each semester. A letter of verification for approved accommodations can be obtained from DSP. Please be sure the letter is delivered to me (or to TA) as early in the semester as possible. DSP is located in STU 301 and is open 8:30 a.m.–5:00 p.m., Monday through Friday. The phone number for DSP is (213) 740-0776.