Database management systems

Course Purpose

1) The aim is to present the general principles of the database systems with a practical focus, and some implementation assignments. Those assignments may or may not include some programming

2)The course assumes that students have some programming or strong logical reasoning skills in programming languages like C or C++. Some programing data structures such as (heap files, buffer manager, B+ trees, hash indexes, varoius goin methodes) are used in the course

3) The course concentrates on issues of the design, tuning , and implementation of the databases

The object of Database

Describing and storing data in the database

Chapter 1: Relational Model

Data is formatted in tables, where each row represents an entry. For example, in a table of salespeople each row represents information about each particular salesperson. This data includes his/her name, age, address, etc. Such a single row of information is called a record.

The record can be treated as an array of an information, with its components treated as fields.

Relation is rule which relates different records of different tables. For example: records exist in a table of salespeople, as well as in a table of products. So it is possible to assign each salesperson as a specialist in each product. Such an assignment can be characterized as a relation.

Relational Model Concepts

Relation is often represented as a table. Each table consists of several rows with many related values. One such row is called a tuple. In formal relational model a table is called table, the column headers are called attributes.

Chapter 2: File Organization and Indexing

Even though the database shows us data in the form of relations we must understand that it is just a logical representation. Finally all the data needs to be stored on the hard disk as files. It is very important in performance point of view that this organization should allow the database software to access data quickly and in an efficient way.

Files on Disk

Records

Data is stored in files in the form of records. Records are related type of data. Ex. Details of one employee. However the data types too are important while storing data. The various data types are integer, text, date and time etc. Once they are recorded to a file while retrieving them back we must know their correct data type. With the latest databases we have a huge number of data types. To store binary data like program files and images we have a data type called BLOBs.

Fixed Length v/s Variable Length Records

Files are accessed sequentially hence to retrieve each record we must know its size. If the records are known to be of fixed size the work is very simple. Consider a list of mobile numbers where each number is exactly 10 digits long. Retrieving number will be simple here because we can read 10 characters at a time. But consider a list of names. How do we know the size of names? They can vary. To make matters more complex we can have records of certain number of fixed length entries and some variable length entries. Variable length records can be separated by a delimiter or separator such as a comma ',' or new line. Comma Separated Values or CVS is one such file based database.

   If every record in the file has exactly the same size (in bytes) the file is said to be made up of Fixed length records.

if different records in the file have different sizes,the file is made up of variable length records.

Records on Disk: Spanned v/s Unspanned

Records need to be stored on a disk. Read and Write operations on disk are usually done in disk blocks. If the block size is greater than on record we might record many records in a single block and vice versa. In any case if the block size is not exactly divisible by the size of each records or to put in better terms the number of records that can be stored in a block is not a natural number a part of record will get stored on one block and remaining on some other disk block. In such a case there will be a pointer at the end of the disk block that will point to the next block. Such a organization is called Spanned. If Records are not allowed to cross block boundaries we call them unspanned.

Related materials

School:Computer Science:

Bibliography

Database Management Systems, second edition by Raghu Ramakrishnan(University of Wisconsin) and Johannes Gerkhe (Cornell University)

This course is a part of the computer science program.

This article is issued from Wikiversity - version of the Saturday, September 15, 2012. The text is available under the Creative Commons Attribution/Share Alike but additional terms may apply for the media files.