Topic 1.1 Traditional file-oriented approach
1. In the traditional approach, information is stored in flat files, which are maintained by the file system under the operating system's control.
2. Application programs go through the file system to access these flat files.
How data is stored in flat files
· Data is stored in flat files as records.
· Records consist of fields delimited by a space, comma, pipe or another special character.
· The end of each record and the end of the file are marked with a predetermined character or character sequence so that they can be identified.
Example: Storing employee data in flat files
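The record layout described above can be sketched in a few lines of Python. This is only an illustration: the pipe delimiter, the newline as the end-of-record marker and the sample employees are assumptions made for the example, and an in-memory stream stands in for a file on disk.

```python
import io

FIELD_SEP = "|"     # assumed field delimiter
RECORD_SEP = "\n"   # assumed end-of-record marker

# Hypothetical records: (employee_id, firstname, lastname, salary_amount)
employees = [
    ("101", "Asha", "Rao", "12000"),
    ("102", "Ravi", "Kumar", "9500"),
]

def write_records(stream, records):
    """Write each record as one delimited line."""
    for record in records:
        stream.write(FIELD_SEP.join(record) + RECORD_SEP)

def read_records(stream):
    """Split the file content back into records and fields."""
    records = []
    for line in stream.read().split(RECORD_SEP):
        if line:  # skip the empty piece after the last record marker
            records.append(tuple(line.split(FIELD_SEP)))
    return records

flat_file = io.StringIO()   # stands in for a flat file on disk
write_records(flat_file, employees)
flat_file.seek(0)
loaded = read_records(flat_file)
print(flat_file.getvalue())
```

Note that the meaning of each field lives only in the program: nothing in the file itself says that the fourth field is a salary.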
Topic 1.2 Disadvantages of simple file system
1. Data Security
Data stored in flat files is easily accessible to anyone who can read the files, and hence it is not secure.
Example: Consider an online banking application that stores the account information of all customers in flat files. A customer should have access only to the details of his or her own account. With flat files, however, it is difficult to enforce such constraints. This is a serious security issue.
2. Data Redundancy
In this storage model, the same information may be duplicated in two or more files. This leads to higher storage and access costs, and it may also lead to data inconsistency: if a change is made to data stored in one file, every other file holding a copy also needs to be changed accordingly.
Example: Assume employee details such as firstname, lastname and emailid are stored in both the employee_details file and the employee_salary file. If an emailid needs to be changed, both the employee_details file and the employee_salary file must be updated; otherwise the data becomes inconsistent.
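A minimal sketch of how such an inconsistency arises, assuming the two files are represented as Python dictionaries keyed by a hypothetical employee id (the sample values are invented for illustration):

```python
# Two hypothetical flat files, each holding its own copy of the emailid.
employee_details = {
    "101": {"firstname": "Asha", "lastname": "Rao", "emailid": "asha@example.com"},
}
employee_salary = {
    "101": {"emailid": "asha@example.com", "salary_amount": 12000},
}

def find_inconsistent_emails(details, salary):
    """Return the ids whose emailid differs between the two files."""
    return [
        emp_id
        for emp_id in details
        if emp_id in salary
        and details[emp_id]["emailid"] != salary[emp_id]["emailid"]
    ]

before = find_inconsistent_emails(employee_details, employee_salary)

# The change is applied to one file only; the second copy is forgotten.
employee_details["101"]["emailid"] = "asha.rao@example.com"

after = find_inconsistent_emails(employee_details, employee_salary)
print(before, after)
```

Nothing in the file system itself detects the mismatch; a separate check like the one above has to be written and run by the application.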
3. Data Isolation
Data isolation means that related data is not all available in one file. Usually the data is scattered across various files with different formats. Hence, writing new application programs to retrieve the appropriate data is difficult.
4. Program and Data Dependence
In the traditional file approach, application programs are closely dependent on the files in which the data is stored. If we change the physical format of a file, for example by adding a data field, all application programs that use it need to be changed accordingly. Consequently, for each application program that a programmer writes or maintains, the programmer must be concerned with data management. There is no centralized execution of the data management functions; data management is scattered among all the application programs.
Example: Consider the banking system. An employee_salary file holds details about the salary of employees. An employee_salary record is described by
employee_id
firstname
lastname
salary_amount
An application program is available to display the salary details of all employees. Assume a new data field, date_of_joining, is added to the employee_salary file. Since the application program depends on the file, it also needs to be altered.
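The dependence can be illustrated with a small Python sketch. The pipe-delimited layout, the sample values and the position at which date_of_joining is inserted are all assumptions made for the example.

```python
def parse_salary_record(line):
    """Parser written for the original four-field layout."""
    employee_id, firstname, lastname, salary_amount = line.split("|")
    return {
        "employee_id": employee_id,
        "firstname": firstname,
        "lastname": lastname,
        "salary_amount": int(salary_amount),
    }

old_line = "101|Asha|Rao|12000"
# Same record after date_of_joining is added (assumed position and format).
new_line = "101|Asha|Rao|2015-06-01|12000"

parsed_old = parse_salary_record(old_line)

try:
    parse_salary_record(new_line)
    broke = False
except ValueError:
    # Five fields no longer unpack into four names: the program must be rewritten.
    broke = True

print(parsed_old, broke)
```

Every program that reads the file carries its own copy of this layout knowledge, so every one of them has to be found and changed.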
5. Lack of Flexibility
Traditional systems can retrieve information only for predetermined requests for data. If we need unanticipated data, a huge programming effort is needed to make the information available, even when the information is there in the files. By the time the information is made available, it may no longer be required or useful.
Example: Consider a software application that generates an employee salary report, with all the data stored in flat files. Suppose we now have a requirement to retrieve the details of all employees whose salary is greater than Rs. 10000. It is not easy to generate such on-demand reports, and application developers need a lot of time to modify the application to meet such requirements.
6. Concurrent Access Anomalies
Many traditional systems allow multiple users to access and update the same piece of data simultaneously. However, these concurrent updates may result in inconsistent data. To guard against this possibility, the system must maintain some form of supervision. But supervision is difficult because the data may be accessed by many different application programs that have not been coordinated with one another.
Example: Consider a personal information system that holds the data of all employees. An employee may be updating his address details in the system while, at the same time, an administrator is generating a report containing the data of all employees. This is called concurrent access. Since the employee's address is being updated at the same time, the administrator may read an incorrect address.
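The anomaly can be simulated without real threads. In the sketch below, which is an illustration under assumed names and data rather than a model of any particular system, the address is stored as two fields and the update is deliberately split into two steps so that a reader can run in between.

```python
# Hypothetical employee record whose address is stored as two fields.
record = {"street": "12 Lake Road", "city": "Pune"}

def update_address_steps(rec, street, city):
    """Apply the update one field at a time, pausing after each step."""
    rec["street"] = street
    yield  # another program may run here
    rec["city"] = city
    yield

def read_address(rec):
    """What a report program would print for this employee."""
    return f'{rec["street"]}, {rec["city"]}'

update = update_address_steps(record, "7 Hill Street", "Mumbai")
next(update)                           # the employee's update is half done
seen_by_admin = read_address(record)   # the administrator's report runs now
next(update)                           # the update completes
final_address = read_address(record)

print(seen_by_admin)
print(final_address)
```

The administrator's report contains an address that never existed: the new street combined with the old city.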
These difficulties led to the development of database systems.
Topic 2.1 Database approach
To overcome the limitations of the file-based approach, a new and more effective approach was required, known as the database approach.
A database is a shared collection of logically related data, designed to meet the information needs of an organization. It is a computer-based record-keeping system whose overall purpose is to record and maintain information. The database is a single, large repository of data that can be used simultaneously by many departments and users. Instead of disconnected files with redundant data, all data items are integrated with a minimum amount of duplication.
The database is no longer owned by one
department but is a shared corporate resource. The database holds not only the
organization's operational data but also a description of this data. For this
reason, a database is also defined as a self-describing collection of
integrated records. The description of the data is known as the Data Dictionary
or Meta Data (the 'data about data'). It is the self-describing nature of a
database that provides program-data independence.
A database implies separation of
physical storage from use of the data by an application program to achieve
program/data independence. Using a database system, the user or programmer or
application specialist need not know the details of how the data are stored and
such details are "transparent to the user". Changes (or updating) can
be made to data without affecting other components of the system. These changes
include, for example, change of data format or file structure or relocation
from one device to another.
In the DBMS approach, an application program written in a programming language such as Java or Visual Basic .NET, or built with a tool such as Developer 2000, uses database connectivity to access the database, which is stored on disk with the help of the operating system's file management system.
Topic 2.2 Advantages and Characteristics of Database approach
1. Self-Describing Nature of a Database System
A database system contains not only the database itself but also the descriptions of its data structure and constraints (metadata). This information is used by the DBMS software, or by database users if needed. This separation makes a database system fundamentally different from the traditional file-based system, in which the data definition is part of the application programs.
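As an illustration of this self-describing nature, the sketch below uses Python's built-in sqlite3 module and asks the database for its own description. SQLite is used only because it ships with Python; the employee_salary table is an assumption carried over from the earlier example, and other DBMSs expose their catalogs in different ways.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database
conn.execute(
    """
    CREATE TABLE employee_salary (
        employee_id   INTEGER PRIMARY KEY,
        firstname     TEXT NOT NULL,
        lastname      TEXT NOT NULL,
        salary_amount INTEGER
    )
    """
)

# sqlite_master is SQLite's catalog table: it lists the objects in the database.
tables = [
    row[0]
    for row in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")
]

# PRAGMA table_info returns one row per column:
# (cid, name, type, notnull, dflt_value, pk)
columns = [
    (row[1], row[2])
    for row in conn.execute("PRAGMA table_info(employee_salary)")
]

print(tables)
print(columns)
```

The structure is read from the database itself, not from the source code of an application program.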
2. Insulation between Program and Data
In the file-based system, the structure of the data files is defined in the application programs, so if a user wants to change the structure of a file, all the programs that access that file might need to be changed as well. In the database approach, on the other hand, the data structure is stored in the system catalog, not in the programs. Therefore, one change is all that's needed.
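A small sketch of this insulation, again using Python's sqlite3 module as a stand-in DBMS. The table, the sample row and the salary_report function are hypothetical; the point is only that the report keeps working after a column is added.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee_salary ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, "
    "lastname TEXT, salary_amount INTEGER)"
)
conn.execute("INSERT INTO employee_salary VALUES (101, 'Asha', 'Rao', 12000)")

def salary_report(connection):
    """Existing application code: it names the columns it needs."""
    return connection.execute(
        "SELECT employee_id, firstname, lastname, salary_amount "
        "FROM employee_salary ORDER BY employee_id"
    ).fetchall()

before = salary_report(conn)

# One change, made in the database: a new column is added to the table.
conn.execute("ALTER TABLE employee_salary ADD COLUMN date_of_joining TEXT")

after = salary_report(conn)  # the unchanged program still works
print(before == after)
```

Compare this with a flat-file program that splits each line by position: there, the same change would force a rewrite of the parser.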
3. Support multiple views of data
A view is a subset of the database that is defined and dedicated for particular users of the system. Different users of the system might have different views of the database. Each view might contain only the data of interest to a user or a group of users.
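The idea can be sketched with two SQL views over one table, using Python's sqlite3 module. The employee table, the view names employee_directory and payroll, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, "
    "emailid TEXT, salary_amount INTEGER)"
)
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?, ?)",
    [
        (101, "Asha", "asha@example.com", 12000),
        (102, "Ravi", "ravi@example.com", 9500),
    ],
)

# View for general staff: contact details only, no salary.
conn.execute(
    "CREATE VIEW employee_directory AS "
    "SELECT employee_id, firstname, emailid FROM employee"
)
# View for the accounts office: salary only, no contact details.
conn.execute(
    "CREATE VIEW payroll AS SELECT employee_id, salary_amount FROM employee"
)

directory_rows = conn.execute(
    "SELECT * FROM employee_directory ORDER BY employee_id"
).fetchall()
payroll_rows = conn.execute(
    "SELECT * FROM payroll ORDER BY employee_id"
).fetchall()

print(directory_rows)
print(payroll_rows)
```

Both views read the same stored rows, so there is still only one copy of the data.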
4. Sharing of data and Multiuser system
A multiuser database system must allow multiple users to access the database at the same time. As a result, a multiuser DBMS must have concurrency control strategies so that several users can access the same data item at the same time in a manner that keeps the data correct.
5. Control Data Redundancy
In the database approach, ideally each data item is stored in only one place in the database. In some cases redundancy still exists to improve system performance, but such redundancy is controlled and kept to a minimum.
6. Data Sharing
The integration of all the data in an organization makes it possible to produce more information from a given amount of data.
7. Enforcing Integrity Constraints
DBMSs provide capabilities to define and enforce certain constraints, such as data type and data uniqueness.
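A minimal sketch of constraint enforcement, using Python's sqlite3 module. The employee table, its constraints and the try_insert helper are hypothetical; the example shows uniqueness and a value check, since SQLite by default is lenient about column data types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE employee (
        employee_id   INTEGER PRIMARY KEY,
        emailid       TEXT NOT NULL UNIQUE,
        salary_amount INTEGER CHECK (salary_amount >= 0)
    )
    """
)
conn.execute("INSERT INTO employee VALUES (101, 'asha@example.com', 12000)")

def try_insert(row):
    """Attempt an insert and report whether the DBMS accepted it."""
    try:
        conn.execute("INSERT INTO employee VALUES (?, ?, ?)", row)
        return "accepted"
    except sqlite3.IntegrityError:
        return "rejected"

duplicate_email = try_insert((102, "asha@example.com", 9000))   # violates UNIQUE
negative_salary = try_insert((103, "ravi@example.com", -5))     # violates CHECK
valid_row = try_insert((104, "meena@example.com", 8000))

print(duplicate_email, negative_salary, valid_row)
```

The rules are declared once, with the table, and the DBMS applies them to every program that writes to it.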
8. Restricting Unauthorized Access
Not all users of the system have the same access privileges. DBMSs provide a security subsystem to create and control user accounts.
9. Data Independence
The descriptions of the data (metadata) are separated from the application programs. Changes to the data structure are handled by the DBMS and are not embedded in the programs.
10. Transaction Processing
The DBMS must include concurrency control subsystems to ensure that several users trying to update the same data do so in a controlled manner. The results of any updates to the database must maintain consistency and validity.
11. Providing multiple views of data
A view may be a subset of the database. Various users may have different views of the database itself. Users need not be aware of how and where the data they refer to is stored.
12. Providing backup and recovery facilities
If the computer system fails in the middle of a complex update process, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the process started executing.
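The behaviour described here can be approximated with a transaction that is rolled back. The sketch below uses Python's sqlite3 module; the account table, the transfer function and the simulated failure are assumptions for illustration, and an exception raised inside the program is only a stand-in for a real system crash.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account (account_id INTEGER PRIMARY KEY, balance INTEGER NOT NULL)"
)
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(connection, source, target, amount, fail_midway=False):
    """Move money between accounts as one all-or-nothing update."""
    try:
        with connection:  # commits on success, rolls back on an exception
            connection.execute(
                "UPDATE account SET balance = balance - ? WHERE account_id = ?",
                (amount, source),
            )
            if fail_midway:
                raise RuntimeError("simulated failure in the middle of the update")
            connection.execute(
                "UPDATE account SET balance = balance + ? WHERE account_id = ?",
                (amount, target),
            )
        return True
    except RuntimeError:
        return False

def balances(connection):
    return connection.execute(
        "SELECT account_id, balance FROM account ORDER BY account_id"
    ).fetchall()

failed = transfer(conn, 1, 2, 200, fail_midway=True)
after_failure = balances(conn)   # the half-done debit has been undone
succeeded = transfer(conn, 1, 2, 200)
after_success = balances(conn)

print(after_failure, after_success)
```

After the failed attempt, both balances are exactly what they were before the update started.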
13. Managing information
Managing information means taking care of it so that it works for us, and is useful for the work we are doing. The information we collect is no longer subject to “accidental disorganization” and becomes more easily accessible and integrated with the rest of our work. Managing information using a database allows us to become strategic users of the data we have.
Topic 3.1 Database Management Systems (DBMS)
A database management
system (DBMS) is a collection of programs that enables you to store, modify,
and extract information from a database. There are many different types of
database management systems, ranging from small systems that run on personal
computers to huge systems that run on mainframes.
A database management
system (DBMS) is system software for creating and managing databases. The DBMS
provides users and programmers with a systematic way to create, retrieve,
update and manage data.
A DBMS makes it possible
for end users to create, read, update and delete data in a database. The DBMS
essentially serves as an interface between the database and end users or
application programs, ensuring that data is consistently organized and remains
easily accessible.
The DBMS manages three important things: the data, the database engine that allows data to be accessed, locked and modified, and the database schema, which defines the database's logical structure. These three foundational elements help provide concurrency, security, data integrity and uniform administration procedures.
Typical database administration tasks supported by the DBMS include change management,
performance monitoring/tuning and backup and recovery. Many database management
systems are also responsible for automated rollbacks, restarts and recovery as
well as the logging and auditing of activity.
The DBMS is perhaps most
useful for providing a centralized view of data that can be accessed by
multiple users, from multiple locations, in a controlled manner. A DBMS can
limit what data the end user sees, as well as how that end user can view the
data, providing many views of a single database schema. End users and software
programs are free from having to understand where the data is physically
located or on what type of storage media it resides because the DBMS handles
all requests.
The DBMS can offer both
logical and physical data independence. That means it can protect users and
applications from needing to know where data is stored or having to be
concerned about changes to the physical structure of data (storage and
hardware). As long as programs use the application programming interface (API)
for the database that is provided by the DBMS, developers won't have to modify
programs just because changes have been made to the database.
With relational DBMSs (RDBMSs), this API is SQL, a standard language for defining, protecting and accessing data in an RDBMS.
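A short sketch of SQL used as this API, run through Python's sqlite3 module. The employee table and its rows are hypothetical. The example covers defining and accessing data only; statements for protecting data (such as granting privileges) are omitted because SQLite has no user accounts.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Define the data.
conn.execute(
    "CREATE TABLE employee ("
    "employee_id INTEGER PRIMARY KEY, firstname TEXT, salary_amount INTEGER)"
)

# Create, update and delete rows.
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(101, "Asha", 12000), (102, "Ravi", 9500), (103, "Meena", 15000)],
)
conn.execute("UPDATE employee SET salary_amount = 10500 WHERE employee_id = 102")
conn.execute("DELETE FROM employee WHERE employee_id = 103")

# Read: an unanticipated request answered with one query, no new program.
high_paid = conn.execute(
    "SELECT firstname, salary_amount FROM employee "
    "WHERE salary_amount > 10000 ORDER BY employee_id"
).fetchall()

print(high_paid)
```

The last query is the kind of on-demand report (employees whose salary is greater than 10000) that required modifying the application in the flat-file approach.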
Topic 3.2 Components of DBMS
A database management system (DBMS) consists of several components. Each component plays a very important role in the database management system environment. The major components of a database management system are:
· Software
· Hardware
· Data
· Procedures
· Database Access Language
· Users
Software
The main component of a DBMS is the software. It is the set of programs used to handle the database and to control and manage the overall computerized database. It includes:
1. The DBMS software itself, which is the most important software component in the overall system.
2. The operating system, including the network software used in a network to share the data of the database among multiple users.
3. Application programs, developed in programming languages such as C++ or Visual Basic, that are used to access the database in the database management system. Each program contains statements that request the DBMS to perform operations on the database, such as retrieving, updating or deleting data. The application programs may be conventional or online, used from workstations or terminals.
Hardware
Hardware consists of a set of physical electronic devices such as computers (together with associated I/O devices like disk drives), storage devices, I/O channels, and electromechanical devices that interface between computers and real-world systems. It is impossible to implement the DBMS without the hardware devices. In a network, a powerful computer with high data processing speed and a storage device with large storage capacity are required as the database server.
Data
Data is the most important component of the DBMS. The main purpose of a DBMS is to process data. In a DBMS, databases are defined and constructed, and then data is stored in, updated in and retrieved from them. The database contains both the actual (or operational) data and the metadata (data about data, or the description of the data).
Procedures
Procedures refer to the instructions and rules that help to design the database and to use the DBMS. The users who operate and manage the DBMS require documented procedures on how to use or run the database management system. These may include procedures to:
1. Install the new DBMS.
2. Log on to the DBMS.
3. Use the DBMS or an application program.
4. Make backup copies of the database.
5. Change the structure of the database.
6. Generate reports of data retrieved from the database.
Database Access Language
The database access language is used to write data to and read data from the database. Users use the database access language to enter new data, change existing data and retrieve required data from databases. The user writes a set of appropriate commands in a database access language and submits these to the DBMS. The DBMS translates the user's commands and sends them to a specific part of the DBMS called the database engine (for example, the Jet database engine used by Microsoft Access). The database engine generates a set of results according to the commands submitted by the user, converts these into a user-readable form called an Inquiry Report and then displays them on the screen. Administrators may also use the database access language to create and maintain databases.
The most popular database access language is SQL (Structured Query Language). Relational databases are required to have a database query language.
Users
The users are the people who manage the databases and perform different operations on the databases in the database system. There are three kinds of people who play different roles in a database system:
1. Application Programmers
2. Database Administrators
3. End-Users
Application Programmers
The people who write application programs in programming languages (such as Visual Basic, Java, or C++) to interact with databases are called application programmers.
Database Administrators
A person who is responsible for managing the overall database management system is called a database administrator, or simply DBA.
End-Users
The end-users are the people who interact with the database management system to perform different operations on the database, such as retrieving, updating, inserting and deleting data.
Topic 4-Advantages of DBMS
The database management system has many potential advantages, which are explained below:
1. Controlling Redundancy: In a file system, each application has its own private files, which cannot be shared between multiple applications. This can often lead to considerable redundancy in the stored data, which results in wasted storage space. With a centralized database, most of this can be avoided. Not all redundancy can or should be eliminated: sometimes there are sound business and technical reasons for maintaining multiple copies of the same data. In a database system, however, this redundancy can be controlled.
For example: In a college database, there may be a number of applications such as General Office, Library, Account Office and Hostel. Each of these applications may maintain its information in its own private files.
In such file systems, some common data about the student, like Rollno, Name, Class, Phone_No and Address, has to be recorded in each application. This causes the problem of redundancy, which wastes storage space and makes the data difficult to maintain.
2. Integrity can be enforced: Integrity of data means that the data in the database is always accurate, such that incorrect information cannot be stored in the database. To maintain the integrity of data, integrity constraints are enforced on the database. A DBMS should provide capabilities for defining and enforcing these constraints.
3. Inconsistency can be avoided: When the same data is duplicated and a change made at one site is not propagated to the other site, the two entries for the same data no longer agree. The data is then said to be inconsistent. So, if the redundancy is removed, the chance of having inconsistent data is removed as well.
4. Data can be shared: In a centralized DBMS, unlike in a file system, the data about entities is shared by multiple applications, so applications can be developed to operate against the same stored data. New applications may be developed without having to create any new stored files.
5. Standards can be enforced: Since the DBMS is a central system, standards can be enforced easily, whether at company, department, national or international level. Standardized data is very helpful during migration or interchange of data. In a file system, each application is independent, so standards cannot easily be enforced across multiple independent applications.
6. Restricting unauthorized access: When multiple users share a database, it is likely that some users will not be authorized to access all the information in the database. For example, account office data is often considered confidential, and hence only authorized persons are allowed to access such data. In addition, some users may be permitted only to retrieve data, whereas others are allowed both to retrieve and to update it. Hence, the type of access operation (retrieval or update) must also be controlled. Typically, users or user groups are given account numbers protected by passwords, which they can use to gain access to the database. A DBMS should provide a security and authorization subsystem, which the DBA uses to create accounts and to specify account restrictions. The DBMS should then enforce these restrictions automatically.
7. Providing Backup and Recovery: A DBMS must provide facilities for recovering from hardware or software failures. The backup and recovery subsystem of the DBMS is responsible for recovery. For example, if the computer system fails in the middle of a complex update program, the recovery subsystem is responsible for making sure that the database is restored to the state it was in before the program started executing.
8. Cost of developing and maintaining systems is lower: It is much easier to respond to unanticipated requests when data is centralized in a database than when it is stored in a conventional file system. Although the initial cost of setting up a database can be large, the cost of developing and maintaining application programs is expected to be far lower than for a similar service using conventional systems. The productivity of programmers can be higher when using the non-procedural languages that have been developed with DBMSs than when using procedural languages.
9. Concurrency Control: DBMSs provide mechanisms that allow multiple users to access data concurrently.
Topic 5-Disadvantages of DBMS
The disadvantages of the database approach are summarized as follows:
1. Complexity: The provision of the functionality that is expected of a good DBMS makes the DBMS an extremely complex piece of software. Database designers, developers, database administrators and end-users must understand this functionality to take full advantage of it. Failure to understand the system can lead to bad design decisions, which can have serious consequences for an organization.
2. Size: The complexity and breadth of functionality make the DBMS an extremely large piece of software, occupying many megabytes of disk space and requiring substantial amounts of memory to run efficiently.
3. Performance: Typically, a file-based system is written for a specific application, such as invoicing. As a result, performance is generally very good. However, the DBMS is written to be more general, to cater for many applications rather than just one. The effect is that some applications may not run as fast as they used to.
4. Higher impact of a failure: The centralization of resources increases the vulnerability of the system. Since all users and applications rely on the availability of the DBMS, the failure of any component can bring operations to a halt.
5. Cost of DBMS: The cost of DBMS varies significantly, depending on the environment and functionality provided. There is also the recurrent annual maintenance cost.
6. Additional Hardware costs: The disk storage requirements for the DBMS and the database may necessitate the purchase of additional storage space. Furthermore, to achieve the required performance it may be necessary to purchase a larger machine, perhaps even a machine dedicated to running the DBMS. The procurement of additional hardware results in further expenditure.
7. Cost of Conversion: In some situations, the cost of the DBMS and extra hardware may be insignificant compared with the cost of converting existing applications to run on the new DBMS and hardware. This cost also includes the cost of training staff to use these new systems and possibly the employment of specialist staff to help with conversion and running of the system. This cost is one of the main reasons why some organizations feel tied to their current systems and cannot switch to modern database technology.