DBMS-BBA SEM 3rd



 

Topic 1.1 Traditional file oriented approach


1.    In traditional approach, information is stored in flat files which are maintained by the file system under the operating system’s control.
2.    Application programs go through the file system in order to access these flat files.
How data is stored in flat files
·        Data is stored in flat files as records.
·        Records consist of fields, which are delimited by a space, comma, pipe, or another special character.
·        The end of each record and the end of the file are marked with a predetermined character or character sequence so that they can be identified.
Example:  Storing employee data in flat files
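As a rough sketch of what such a file might look like, the Python snippet below parses a small pipe-delimited employee file. The field names, the pipe delimiter and the sample rows are assumptions chosen for illustration; a newline marks the end of each record.

```python
import csv
import io

# Hypothetical flat file: one record per line, fields separated by "|".
FLAT_FILE = (
    "101|Asha|Verma|12000\n"
    "102|Ravi|Kumar|9500\n"
)

# The field order is known only to the program; the file itself does not describe it.
FIELDS = ["employee_id", "firstname", "lastname", "salary_amount"]

def read_records(text):
    """Split each line on the delimiter and pair the values with field names."""
    reader = csv.reader(io.StringIO(text), delimiter="|")
    return [dict(zip(FIELDS, row)) for row in reader]

records = read_records(FLAT_FILE)
for record in records:
    print(record["employee_id"], record["firstname"], record["salary_amount"])
```

Note that every value comes back as text. The program, not the file, decides that salary_amount is a number.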

Topic 1.2 Disadvantages of simple file system



1.    Data Security
The data stored in flat files is easily accessible and hence not secure.
Example: Consider an online banking application where the account-related information of all customers is stored in flat files. A customer should have access only to his own account details. However, with a flat file it is difficult to enforce such constraints, which is a serious security issue.
2.    Data Redundancy
In this storage model, the same information may be duplicated in two or more files. This leads to higher storage and access costs, and it may also lead to data inconsistency.
For example, assume the same data is repeated in two or more files. If a change is made to the data stored in one file, the other files also need to be changed accordingly.
Example: Assume employee details such as firstname, lastname and emailid are stored in both the employee_details file and the employee_salary file. If a change needs to be made to emailid, both the employee_details file and the employee_salary file need to be updated; otherwise the data becomes inconsistent.
3.    Data Isolation
Data isolation means that related data is not all available in one file. Usually the data is scattered across various files with different formats. Hence, writing new application programs to retrieve the appropriate data is difficult.
4.    Program and Data Dependence
In the traditional file approach, application programs are closely dependent on the files in which data is stored. If we make any change in the physical format of the file(s), such as adding a data field, all application programs need to be changed accordingly. Consequently, for each of the application programs that a programmer writes or maintains, the programmer must be concerned with data management. There is no centralized execution of the data management functions. Data management is scattered among all the application programs.
Example: Consider the banking system. An employee_salary file exists which has details about the salary of employees. An employee_salary record is described by
employee_id
firstname
lastname
salary_amount
An application program is available to display all the details about the salary of all employees. Assume a new data field, date_of_joining, is added to the employee_salary file. Since the application program depends on the layout of the file, it also needs to be altered.
5.    Lack of Flexibility
Traditional systems can retrieve information only for predetermined requests for data. If we need unanticipated data, a huge programming effort is needed to make the information available, provided the information is there in the files. By the time the information is made available, it may no longer be required or useful.
Example: Consider a software application which is able to generate an employee salary report. Assume that all the data is stored in flat files. Suppose we now have a requirement to retrieve the details of all employees whose salary is greater than Rs. 10000. It is not easy to generate such on-demand reports, and a lot of time is needed for application developers to modify the application to meet such requirements.
6.    Concurrent Access Anomalies
Many traditional systems allow multiple users to access and update the same piece of data simultaneously. However, these concurrent updates may result in inconsistent data. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult because data may be accessed by many different application programs, and these application programs may not have been coordinated previously.
Example: Consider a personal information system which has the data of all employees. Now there may be an employee updating his address details in the system and at the same time, an administrator may be taking a report containing the data of all employees. This is called concurrent access. Since the employee's address is being updated at the same time, there is a possibility of the administrator reading an incorrect address.
These difficulties led to the development of database systems.
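As a rough illustration of the program and data dependence problem from point 4 above, the sketch below shows a report routine written for the original employee_salary layout. It assumes a pipe-delimited record and assumes that date_of_joining is inserted before salary_amount; the notes do not say where the new field goes, so both are illustrative choices.

```python
# Hypothetical record layouts; both are assumptions made for illustration.
OLD_LINE = "101|Asha|Verma|12000"
NEW_LINE = "101|Asha|Verma|2015-06-01|12000"  # date_of_joining added before salary

def salary_from_line(line):
    """Report routine written for the old layout: salary is the 4th field."""
    fields = line.split("|")
    return fields[3]

old_salary = salary_from_line(OLD_LINE)
new_salary = salary_from_line(NEW_LINE)
print(old_salary)  # 12000
print(new_salary)  # 2015-06-01, the routine now reads the wrong field
```

The routine does not fail. It silently reports the wrong value, which is why every program that reads the file has to be found and changed.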
Topic 2.1 Database approach

To overcome the limitations of the file-based approach, a new and more effective approach was required, known as the database approach.
The Database is a shared collection of logically related data, designed to meet the information needs of an organization. A database is a computer-based record-keeping system whose overall purpose is to record and maintain information. The database is a single, large repository of data, which can be used simultaneously by many departments and users. Instead of disconnected files with redundant data, all data items are integrated with a minimum amount of duplication.
The database is no longer owned by one department but is a shared corporate resource. The database holds not only the organization's operational data but also a description of this data. For this reason, a database is also defined as a self-describing collection of integrated records. The description of the data is known as the Data Dictionary or Meta Data (the 'data about data'). It is the self-describing nature of a database that provides program-data independence.
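The self-describing idea can be seen in a small sketch using Python's built-in sqlite3 module. SQLite is used here only as a convenient example of a DBMS, and the table and column names are assumptions. The point is that the description of the data can be queried from the database itself.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_details ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, emailid TEXT)"
)

# sqlite_master is SQLite's built-in catalog of tables, views and indexes.
tables = [row[0] for row in conn.execute(
    "SELECT name FROM sqlite_master WHERE type = 'table'")]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk).
columns = [(row[1], row[2]) for row in conn.execute(
    "PRAGMA table_info(employee_details)")]

print(tables)
print(columns)
conn.close()
```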
A database implies separation of physical storage from use of the data by an application program to achieve program/data independence. Using a database system, the user or programmer or application specialist need not know the details of how the data are stored and such details are "transparent to the user". Changes (or updating) can be made to data without affecting other components of the system. These changes include, for example, change of data format or file structure or relocation from one device to another.
In the DBMS approach, an application program written in a programming language or tool such as Java, Visual Basic .NET or Developer 2000 uses database connectivity to access the database stored on disk, with the help of the operating system's file management system.

Topic 2.2 Advantages and Characteristics of Database approach

1.    Self-Describing Nature of a Database System
A Database System contains not only the database itself but also the descriptions of data structure and constraints (meta-data). This information is used by the DBMS software or database users if needed. This separation makes a database system totally different from the traditional file-based system in which the data definition is a part of application programs.
2.    Insulation between Program and Data
In the file based system, the structure of the data files is defined in the application programs so if a user wants to change the structure of a file, all the programs that access that file might need to be changed as well. On the other hand, in the database approach, the data structure is stored in the system catalog not in the programs.  Therefore, one change is all that’s needed.
3.    Support multiple views of data
A view is a subset of the database which is defined and dedicated for particular users of the system. Multiple users in the system might have different views of the system. Each view might contain only the data of interest to a user or a group of users.
4.    Sharing of data and Multiuser system
A multiuser database system must allow multiple users to access the database at the same time. As a result, a multiuser DBMS must have concurrency control strategies so that several users can access the same data item at the same time in a manner that keeps the data correct.
5.    Control Data Redundancy
In the Database approach, ideally each data item is stored in only one place in the database.  In some cases redundancy still exists so as to improve system performance, but such redundancy is controlled and kept to minimum.
6.    Data Sharing
The integration of the whole data in an organization leads to the ability to produce more information from a given amount of data.
7.    Enforcing Integrity Constraints
DBMSs provide capabilities to define and enforce certain constraints such as data type, data uniqueness, etc.
8.    Restricting Unauthorized Access
Not all users of the system have the same access privileges.  DBMSs provide a security subsystem to create and control user accounts.
9.    Data Independence
System data (Meta Data) descriptions are separated from the application programs.  Changes to the data structure are handled by the DBMS and are not embedded in the program.
10. Transaction Processing
The DBMS must include concurrency control subsystems to ensure that several users trying to update the same data do so in a controlled manner.  The results of any updates to the database must maintain consistency and validity.
11. Providing multiple views of data
A view may be a subset of the database. Various users may have different views of the database itself.  Users may not need to be aware of how and where the data they refer to is stored.
12. Providing backup and recovery facilities
If the computer system fails in the middle of a complex update process, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the process started executing.
13. Managing information
Managing information means taking care of it so that it works for us, and is useful for the work we are doing. The information we collect is no longer subject to “accidental disorganization” and becomes more easily accessible and integrated with the rest of our work. Managing information using a database allows us to become strategic users of the data we have.
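To make the idea of multiple views (points 3 and 11 above) more concrete, here is a minimal sketch using Python's built-in sqlite3 module. The table, the view names and the sample rows are all assumptions. Each view exposes only the columns that one group of users needs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        firstname     TEXT,
        emailid       TEXT,
        salary_amount INTEGER
    );
    INSERT INTO employee VALUES (101, 'Asha', 'asha@example.com', 12000);
    INSERT INTO employee VALUES (102, 'Ravi', 'ravi@example.com', 9500);

    -- View for a staff directory: contact details only, no salary.
    CREATE VIEW directory_view AS
        SELECT employee_id, firstname, emailid FROM employee;

    -- View for the accounts office: salary only, no contact details.
    CREATE VIEW payroll_view AS
        SELECT employee_id, salary_amount FROM employee;
""")

directory_rows = conn.execute(
    "SELECT * FROM directory_view ORDER BY employee_id").fetchall()
payroll_rows = conn.execute(
    "SELECT * FROM payroll_view ORDER BY employee_id").fetchall()
print(directory_rows)
print(payroll_rows)
conn.close()
```

Both views read from the same stored table, so there is still only one copy of each data item.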








Topic 3.1 Database Management Systems (DBMS)

A database management system (DBMS) is a collection of programs that enables you to store, modify, and extract information from a database. There are many different types of database management systems, ranging from small systems that run on personal computers to huge systems that run on mainframes.
A database management system (DBMS) is system software for creating and managing databases. The DBMS provides users and programmers with a systematic way to create, retrieve, update and manage data.
A DBMS makes it possible for end users to create, read, update and delete data in a database. The DBMS essentially serves as an interface between the database and end users or application programs, ensuring that data is consistently organized and remains easily accessible.
The DBMS manages three important things: the data, the database engine that allows data to be accessed, locked and modified, and the database schema, which defines the database’s logical structure. These three foundational elements help provide concurrency, security, data integrity and uniform administration procedures. Typical database administration tasks supported by the DBMS include change management, performance monitoring and tuning, and backup and recovery. Many database management systems are also responsible for automated rollbacks, restarts and recovery as well as the logging and auditing of activity.
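As a small illustration of rollback, the sketch below uses Python's built-in sqlite3 module. The account table, the amounts and the simulated failure are assumptions. A transfer is interrupted after the first of its two updates, and the rollback returns the data to its earlier state.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 500)")
conn.execute("INSERT INTO account VALUES (2, 300)")
conn.commit()

try:
    # Transfer 200 from account 1 to account 2 as one transaction.
    conn.execute("UPDATE account SET balance = balance - 200 WHERE account_id = 1")
    # Simulated failure: the matching credit to account 2 is never executed.
    raise RuntimeError("simulated failure in the middle of the update")
except RuntimeError:
    conn.rollback()  # undo the partial update

balances = conn.execute(
    "SELECT account_id, balance FROM account ORDER BY account_id").fetchall()
print(balances)
conn.close()
```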
The DBMS is perhaps most useful for providing a centralized view of data that can be accessed by multiple users, from multiple locations, in a controlled manner. A DBMS can limit what data the end user sees, as well as how that end user can view the data, providing many views of a single database schema. End users and software programs are free from having to understand where the data is physically located or on what type of storage media it resides because the DBMS handles all requests.
The DBMS can offer both logical and physical data independence. That means it can protect users and applications from needing to know where data is stored or having to be concerned about changes to the physical structure of data (storage and hardware). As long as programs use the application programming interface (API) for the database that is provided by the DBMS, developers won't have to modify programs just because changes have been made to the database.
With relational DBMSs (RDBMSs), this API is SQL, a standard programming language for defining, protecting and accessing data in an RDBMS.
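The create, read, update and delete operations mentioned above might look like the following when issued as SQL from a program. This is a minimal sketch using Python's built-in sqlite3 module, with an assumed employee table and assumed sample values.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define: create the table (data definition).
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, salary_amount INTEGER)"
)

# Create: insert rows; the ? placeholders keep values separate from the SQL text.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Asha", 12000), (102, "Ravi", 9500)],
)

# Update and delete.
conn.execute("UPDATE employee SET salary_amount = ? WHERE employee_id = ?", (10500, 102))
conn.execute("DELETE FROM employee WHERE employee_id = ?", (101,))

# Read.
remaining = conn.execute(
    "SELECT employee_id, firstname, salary_amount FROM employee").fetchall()
print(remaining)
conn.close()
```

The program never states where or how the rows are stored. It only describes what it wants done.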

Topic 3.2 Components of DBMS

A database management system (DBMS) consists of several components. Each component plays an important role in the database management system environment. The major components of a database management system are:

·                  Software
·                  Hardware
·                  Data
·                  Procedures
·                  Database Access Language
·                  Users






Software

The main component of a DBMS is the software. It is the set of programs used to handle the database and to control and manage the overall computerized database.

1.            The DBMS software itself, which is the most important software component in the overall system.
2.            The operating system, including the network software used in a network to share the data of the database among multiple users.
3.            Application programs developed in programming languages such as C++ or Visual Basic that are used to access the database in the database management system. Each program contains statements that request the DBMS to perform operations on the database. The operations may include retrieving, updating, deleting data, etc. Application programs may be conventional programs or online programs used from workstations or terminals.

Hardware

Hardware consists of a set of physical electronic devices such as computers (together with associated I/O devices like disk drives), storage devices, I/O channels, and electromechanical devices that interface between computers and real-world systems. It is impossible to implement the DBMS without hardware. In a network, a powerful computer with high data processing speed and a storage device with large storage capacity are required as the database server.

Data

Data is the most important component of the DBMS. The main purpose of DBMS is to process the data. In DBMS, databases are defined, constructed and then data is stored, updated and retrieved to and from the databases. The database contains both the actual (or operational) data and the metadata (data about data or description about data).

Procedures

Procedures refer to the instructions and rules that help to design the database and to use the DBMS. The users that operate and manage the DBMS require documented procedures on how to use or run the database management system. These may include:

1.            Procedure to install the new DBMS.
2.            To log on to the DBMS.
3.            To use the DBMS or application program.
4.            To make backup copies of database.
5.            To change the structure of database.
6.            To generate the reports of data retrieved from database.

Database Access Language

The database access language is used to move data to and from the database. Users use the database access language to enter new data, change existing data in the database and retrieve required data from the database. The user writes a set of appropriate commands in a database access language and submits these to the DBMS. The DBMS translates the user's commands and sends them to a specific part of the DBMS called the database engine. The database engine generates a set of results according to the commands submitted by the user, converts these into a user-readable form called an Inquiry Report and then displays them on the screen. Administrators may also use the database access language to create and maintain databases.
The most popular database access language is SQL (Structured Query Language). Relational databases are required to have a database query language.
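To illustrate how a user command is submitted and results come back, the sketch below answers the earlier ad hoc request for employees whose salary is greater than Rs. 10000 with a single SQL query. It uses Python's built-in sqlite3 module; the sample rows and the helper function name are assumptions.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_salary ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, lastname TEXT, salary_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO employee_salary VALUES (?, ?, ?, ?)",
    [
        (101, "Asha", "Verma", 12000),
        (102, "Ravi", "Kumar", 9500),
        (103, "Meena", "Shah", 15000),
    ],
)

def employees_earning_more_than(connection, threshold):
    """Ad hoc request: no change to the stored data or its layout is needed."""
    cursor = connection.execute(
        "SELECT employee_id, firstname, salary_amount "
        "FROM employee_salary WHERE salary_amount > ? ORDER BY employee_id",
        (threshold,),
    )
    return cursor.fetchall()

high_earners = employees_earning_more_than(conn, 10000)
print(high_earners)
```

Changing the threshold or the selected columns means editing the query text only, not the way the data is stored.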

Users

The users are the people who manage the databases and perform different operations on the databases in the database system. There are three kinds of people who play different roles in a database system:
1.            Application Programmers
2.            Database Administrators
3.            End-Users



Application Programmers
The people who write application programs in programming languages (such as Visual Basic, Java, or C++) to interact with databases are called application programmers.

Database Administrators 
A person who is responsible for managing the overall database management system is called database administrator or simply DBA.

End-Users
The end-users are the people who interact with database management system to perform different operations on database such as retrieving, updating, inserting, deleting data etc.


Topic 4-Advantages of DBMS
The database management system has promising potential advantages, which are explained below: 
1. Controlling Redundancy: In a file system, each application has its own private files, which cannot be shared between multiple applications. This can often lead to considerable redundancy in the stored data, which results in wastage of storage space. With a centralized database, most of this can be avoided. Not all redundancy can or should be eliminated. Sometimes there are sound business and technical reasons for maintaining multiple copies of the same data. In a database system, however, this redundancy can be controlled.
For example: In the case of a college database, there may be a number of applications such as General Office, Library, Account Office, Hostel etc. Each of these applications may maintain student information in its own private files.

In such file systems, some common student data, like Rollno, Name, Class, Phone_No and Address, has to be stored by each application. This causes redundancy, which wastes storage space and makes the data difficult to maintain.
2. Integrity can be enforced: Integrity of data means that data in database is always accurate, such that incorrect information cannot be stored in database. In order to maintain the integrity of data, some integrity constraints are enforced on the database. A DBMS should provide capabilities for defining and enforcing the constraints.
3. Inconsistency can be avoided: When the same data is duplicated and a change made at one site is not propagated to the other site, the two entries for the same data will not agree. At such times the data is said to be inconsistent. So, if the redundancy is removed, the chances of having inconsistent data are also removed.
4. Data can be shared: In a centralized DBMS, unlike in a file system, the data about entities is shared by multiple applications, so new applications can be developed to operate against the same stored data. The applications may be developed without having to create any new stored files.
5. Standards can be enforced: Since the DBMS is a central system, standards can be enforced easily, whether at company, department, national or international level. Standardized data is very helpful during migration or interchange of data. A file system is an independent system, so standards cannot easily be enforced on multiple independent applications.
6. Restricting unauthorized access: When multiple users share a database, it is likely that some users will not be authorized to access all information in the database. For example, account office data is often considered confidential, and hence only authorized persons are allowed to access such data. In addition, some users may be permitted only to retrieve data, whereas others are allowed both to retrieve and to update. Hence, the type of access operation (retrieval or update) must also be controlled. Typically, users or user groups are given account numbers protected by passwords, which they can use to gain access to the database. A DBMS should provide a security and authorization subsystem, which the DBA uses to create accounts and to specify account restrictions. The DBMS should then enforce these restrictions automatically.
7. Providing Backup and Recovery: A DBMS must provide facilities for recovering from hardware or software failures. The backup and recovery subsystem of the DBMS is responsible for recovery. For example, if the computer system fails in the middle of a complex update program, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the program started executing.
8. Cost of developing and maintaining system is lower: It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, the cost of developing and maintaining application programs is expected to be far lower than for a similar service using conventional systems. The productivity of programmers can be higher when using the non-procedural languages that have been developed with DBMSs than when using procedural languages.
9. Concurrency Control: A DBMS provides mechanisms that allow multiple users to access data concurrently.
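The integrity enforcement described in point 2 above can be sketched with Python's built-in sqlite3 module. The student table loosely follows the college example, but the marks column, the specific constraints and the sample rows are assumptions. The DBMS itself rejects rows that break a declared rule, so individual applications do not have to repeat the checks.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student ("
    "rollno INTEGER PRIMARY KEY, "
    "name TEXT NOT NULL, "
    "phone_no TEXT UNIQUE, "
    "marks INTEGER CHECK (marks BETWEEN 0 AND 100))"
)
conn.execute("INSERT INTO student VALUES (1, 'Asha', '555-0101', 82)")

rejected = []
bad_rows = [
    (2, None, "555-0102", 70),      # name missing: violates NOT NULL
    (3, "Ravi", "555-0101", 64),    # phone already used: violates UNIQUE
    (4, "Meena", "555-0104", 140),  # marks out of range: violates CHECK
]
for row in bad_rows:
    try:
        conn.execute("INSERT INTO student VALUES (?, ?, ?, ?)", row)
    except sqlite3.IntegrityError as exc:
        rejected.append(row[0])
        print("rejected", row[0], "-", exc)

stored = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
print(stored)
conn.close()
```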






Topic 5-Disadvantages of DBMS
The disadvantages of the database approach are summarized as follows:
1. Complexity : The provision of the functionality that is expected of a good DBMS makes the DBMS an extremely complex piece of software. Database designers, developers, database administrators and end-users must understand this functionality to take full advantage of it. Failure to understand the system can lead to bad design decisions, which can have serious consequences for an organization.
2. Size : The complexity and breadth of functionality makes the DBMS an extremely large piece of software, occupying many megabytes of disk space and requiring substantial amounts of memory to run efficiently.
3. Performance: Typically, a file-based system is written for a specific application, such as invoicing. As a result, performance is generally very good. However, the DBMS is written to be more general, to cater for many applications rather than just one. The effect is that some applications may not run as fast as they used to.
4. Higher impact of a failure: The centralization of resources increases the vulnerability of the system. Since all users and applications rely on the availability of the DBMS, the failure of any component can bring operations to a halt.
5. Cost of DBMS: The cost of DBMS varies significantly, depending on the environment and functionality provided. There is also the recurrent annual maintenance cost.
6. Additional Hardware costs: The disk storage requirements for the DBMS and the database may necessitate the purchase of additional storage space. Furthermore, to achieve the required performance it may be necessary to purchase a larger machine, perhaps even a machine dedicated to running the DBMS. The procurement of additional hardware results in further expenditure.
7. Cost of Conversion: In some situations, the cost of the DBMS and extra hardware may be insignificant compared with the cost of converting existing applications to run on the new DBMS and hardware. This cost also includes the cost of training staff to use these new systems and possibly the employment of specialist staff to help with conversion and running of the system. This cost is one of the main reasons why some organizations feel tied to their current systems and cannot switch to modern database technology.
