2. Specific objective
• State the importance of DBMS effectiveness and
database tools
• State the advantages of using database system to
store operational data.
• Explain the concept of RDBMS
• Describe the overall structure of DBMS &
Architecture of Client/Server system.
• Explain the concept of data mining and data
warehousing
3. An introduction to database
• Data
• Database
• DBMS
• Disadvantages of file processing system
• Advantages of DBMS over file processing
system
• Application of database.
4. What is Data
• Data is a collection of facts, such as numbers, words,
measurements, observations or even just
descriptions of things.
• The singular form is "datum", so we say "that datum
is very high".
• "Data" is the plural so we say "the data are
available", but it is also a collection of facts, so "the
data is available" is fine too.
• Give examples of data
5. What is Database
• A database is an organized collection of data. The data
is typically organized to model aspects of reality in a
way that supports processes requiring information.
• Database management systems (DBMS) are computer
software applications that interact with the user, other
applications, and the database itself to capture and
analyze data. A general-purpose DBMS is designed to
allow the definition, creation, querying, update, and
administration of databases.
6. Disadvantages of file processing
system
• Before looking at the disadvantages of file processing,
we should know what a file processing system is.
• A file processing system is a collection of files and
programs that access/modify these files. Typically,
new files and programs are added over time (by
different programmers) as new information needs to
be stored and new ways to access information are
needed.
7. Disadvantages of file processing system, Advantages of
DBMS over file processing system
• Problems with file processing systems:
• data redundancy and inconsistency
• difficulty of accessing data
• Atomicity problems - ensuring that a system failure
during a database update does not leave the database
in an inconsistent state
• Security problems
a) Not all users should have access to all data
b) Example: bank payroll personnel shouldn’t know my
checking account balance
c) Difficult to enforce security in an ad hoc system
8. • Integrity problems
a) Data may need to satisfy certain conditions,
called consistency constraints
b) Example: account balances should never fall
below $0
c) Difficult to enforce/add/change consistency
constraints in a file processing system
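• A DBMS lets a consistency constraint such as "balances never fall below $0" be declared once, next to the data. The sketch below uses Python's built-in sqlite3 module; the account table and the withdraw helper are hypothetical names chosen only for illustration.

```python
import sqlite3

# Hypothetical schema: an account whose balance may never go negative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE account ("
    " id INTEGER PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO account (id, balance) VALUES (1, 100.0)")


def withdraw(conn, account_id, amount):
    """Return True if the withdrawal was applied, False if the DBMS rejected it."""
    try:
        with conn:  # commits on success, rolls back on error
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?",
                (amount, account_id),
            )
        return True
    except sqlite3.IntegrityError:
        # The CHECK constraint failed, so the update was not applied.
        return False


ok_small = withdraw(conn, 1, 30.0)   # allowed: leaves a non-negative balance
ok_large = withdraw(conn, 1, 500.0)  # rejected: would make the balance negative
balance = conn.execute("SELECT balance FROM account WHERE id = 1").fetchone()[0]
```

In a file processing system the same rule would have to be repeated in every program that touches the file.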
9. Advantage of DBMS over File Processing
• Sharing of the data
• Reduction in Redundancy
• Avoiding Inconsistency
• Transaction support
• Maintaining Integrity
• Enforcement of security
• Balancing conflicting requirements
• Enforcing standards
10. Application of database
• BANKING
• AIRLINES
• UNIVERSITIES
• TELECOMMUNICATION
• SALES & MARKETING
• ONLINE TRADING
• MANUFACTURING UNITS
• HUMAN RESOURCES DEVELOPMENT
• SCIENTIFIC APPLICATION & GOVERNMENT DEPARTMENT
11. What is RDBMS
• RDBMS stands for Relational Database Management System. It
organizes data into related rows and columns.
• The principles of the relational model were first outlined by Dr. E. F.
Codd in a June 1970 paper called "A Relational Model of Data for
Large Shared Data Banks." In this paper, Dr. Codd proposed the
relational model for database systems. The more popular models
used at that time were hierarchical and network, or even simple flat
file data structures. Relational database management systems
(RDBMS) soon became very popular, especially for their ease of use
and flexibility in structure. In addition, a number of innovative
vendors, such as Oracle, supplemented the RDBMS with a suite of
powerful application development and user products, providing a
total solution.
12. Properties of RDBMS
• Every value has to be atomic
• Each and every row is unique
• Column values are of the same kind/type
• The sequence
13. Difference Between DBMS & RDBMS
Sr. No. | DBMS | RDBMS
1) | DBMS applications store data as files. | RDBMS applications store data in tabular form.
2) | In DBMS, data is generally stored in either a hierarchical form or a navigational form. | In RDBMS, the tables have an identifier called a primary key, and the data values are stored in the form of tables.
3) | Normalization is not present in DBMS. | Normalization is present in RDBMS.
4) | DBMS does not apply any security with regard to data manipulation. | RDBMS defines integrity constraints for the purpose of the ACID (Atomicity, Consistency, Isolation and Durability) properties.
5) | DBMS uses a file system to store data, so there is no relation between the tables. | In RDBMS, data values are stored in the form of tables, so a relationship between these data values is stored in the form of a table as well.
14. Sr. No. | DBMS | RDBMS
6) | DBMS has to provide some uniform methods to access the stored information. | RDBMS supports a tabular structure of the data and relationships between the tables to access the stored information.
7) | DBMS does not support distributed databases. | RDBMS supports distributed databases.
8) | DBMS is meant for small organizations and deals with small amounts of data. It supports a single user. | RDBMS is designed to handle large amounts of data. It supports multiple users.
9) | Examples of DBMS are file systems, XML, etc. | Examples of RDBMS are MySQL, PostgreSQL, SQL Server, Oracle, etc.
15. Names of various DBMS and RDBMS
software
DBMS:
• FoxPro
• FoxProW
• Dbase
• MS Access etc.
RDBMS:
• Oracle RDBMS
• IBM DB2
• Microsoft SQL Server
• SAP Sybase ASE
• Teradata
• ADABAS
• MySQL
• FileMaker
• Microsoft Access
• Informix
16. • Data abstraction
• Database languages
• Instance and schema
• Data independence
a) Logical and
b) Physical Independence.
17. Data abstraction
• The major purpose of a database system is to provide users
with an abstract view of the system. The system hides certain
details of how data is stored, created, and maintained.
Complexity should be hidden from database users.
• Data abstraction is usually the first step in database design.
A complete database is much too complex a system to be
developed without first creating a simplified framework.
• Data abstraction makes it possible for the developer to start
from essential elements (data abstractions) and incrementally
add detail to create the final system.
19. • There are several levels of abstraction:
• Physical Level:
– How the data are stored.
– E.g. index, B-tree, hashing.
– Lowest level of abstraction.
– Complex low-level structures described in detail.
• Logical /Conceptual Level:
– Next highest level of abstraction.
– Describes what data are stored.
– Describes the relationships among data.
– Database administrator level.
• View Level:
– Highest level.
– Describes part of the database for a particular group of users.
– Can be many different views of a database.
– E.g. tellers in a bank get a view of customer accounts, but not of payroll data.
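• The view level can be sketched with an SQL view. This is a minimal illustration using Python's built-in sqlite3 module; the account and payroll tables, the sample rows, and the name teller_view are assumptions made for the example. SQLite has no user accounts, so the view here only shapes what is shown; in a multi-user DBMS, access to the view would additionally be controlled with privileges.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- Logical level: everything the bank stores.
    CREATE TABLE account (id INTEGER PRIMARY KEY, customer TEXT, balance REAL);
    CREATE TABLE payroll (employee TEXT, salary REAL);
    INSERT INTO account VALUES (1, 'Asha', 250.0), (2, 'Ravi', 90.0);
    INSERT INTO payroll VALUES ('teller_01', 30000.0);

    -- View level: tellers see customer accounts only, never payroll.
    CREATE VIEW teller_view AS SELECT id, customer, balance FROM account;
    """
)

cur = conn.execute("SELECT * FROM teller_view ORDER BY id")
teller_columns = [d[0] for d in cur.description]  # column names exposed by the view
teller_rows = cur.fetchall()
```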
20. Database languages
• DDL – Data Definition Language
• DML – Data Manipulation Language
• DCL – Data Control Language
21. DDL – Data Definition Language
• Data Definition Language (DDL) is a standard for commands
that define the different structures in a database. DDL
statements create, modify, and remove database objects such
as tables, indexes, and users. Common DDL statements are
CREATE, ALTER, and DROP.
• Describing data and data structures requires a suitable
description tool, a data definition language (DDL). With it,
a data schema can be defined and also changed later.
22. • Typical DDL operations (with their respective
keywords in the Structured Query Language, SQL):
• Creation of tables and definition of attributes
(CREATE TABLE ...)
• Change of tables by adding or deleting attributes
(ALTER TABLE …)
• Deletion of a whole table including its content (!) (DROP
TABLE …)
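• The three DDL operations can be tried out with Python's built-in sqlite3 module. This is only a sketch: the student table and the columns helper are hypothetical, and PRAGMA table_info is SQLite's own way of listing a table's columns (other systems expose this differently).

```python
import sqlite3

conn = sqlite3.connect(":memory:")


def columns(conn, table):
    """List the column names of a table using SQLite's PRAGMA table_info."""
    return [row[1] for row in conn.execute(f"PRAGMA table_info({table})")]


# CREATE TABLE: define the table and its attributes.
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT)")
after_create = columns(conn, "student")

# ALTER TABLE: change the table by adding an attribute.
conn.execute("ALTER TABLE student ADD COLUMN email TEXT")
after_alter = columns(conn, "student")

# DROP TABLE: remove the table together with all of its content.
conn.execute("DROP TABLE student")
after_drop = columns(conn, "student")  # no such table any more, so no columns
```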
24. DML – Data Manipulation Language
• There are two types of DML:
Procedural
Non-procedural
• Data Manipulation Language (DML) statements are used
for managing data within schema objects.
• A language is also needed to describe operations on
data such as store, search, read, and change, the
so-called data manipulation. Such operations can be
done with a data manipulation language (DML). Keywords
such as insert, modify, update, delete, and select are
common in such languages.
25. • SELECT - retrieve data from a database
• INSERT - insert data into a table
• UPDATE - updates existing data within a table
• DELETE - deletes records from a table (all records when no
WHERE clause is given); the space for the records remains
• MERGE - UPSERT operation (insert or update)
• CALL - call a PL/SQL or Java subprogram
• EXPLAIN PLAN - explain access path to data
• LOCK TABLE - control concurrency
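• The four core DML statements can be sketched with Python's built-in sqlite3 module. The student table and its rows are made up for the example; MERGE, CALL, EXPLAIN PLAN and LOCK TABLE are left out because their availability and syntax differ between database products.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, marks INTEGER)"
)

# INSERT: add rows to the table.
conn.executemany(
    "INSERT INTO student (roll_no, name, marks) VALUES (?, ?, ?)",
    [(1, "Asha", 72), (2, "Ravi", 58), (3, "Meena", 91)],
)

# UPDATE: change existing rows that match the WHERE clause.
conn.execute("UPDATE student SET marks = marks + 5 WHERE roll_no = 2")

# DELETE: remove only the matching rows; without WHERE every row would go.
conn.execute("DELETE FROM student WHERE roll_no = 1")

# SELECT: read the current contents back.
rows = conn.execute(
    "SELECT roll_no, name, marks FROM student ORDER BY roll_no"
).fetchall()
```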
26. DCL – Data Control Language
• Used to control access to data stored in a
database. In particular, it is a component of
Structured Query Language (SQL). Examples of
DCL commands include:
• GRANT - gives users access privileges to the
database
• REVOKE - withdraws access privileges given
with the GRANT command
27. • The operations for which privileges may be
granted to or revoked from a user or role may
include CONNECT, SELECT, INSERT, UPDATE,
DELETE, EXECUTE, and USAGE.
• In the Oracle database, executing a DCL
command issues an implicit commit. Hence
you cannot roll back the command.
28. TCL – Transaction Control Language
• Transaction Control (TCL) statements are used to
manage the changes made by DML statements. They allow
statements to be grouped together into logical
transactions.
• COMMIT - save work done
• SAVEPOINT - identify a point in a transaction to which
you can later roll back
• ROLLBACK - restore the database to its state as of the last
COMMIT
• SET TRANSACTION - Change transaction options like
isolation level and what rollback segment to use
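• COMMIT, SAVEPOINT and ROLLBACK can be sketched with Python's built-in sqlite3 module. The account table and the savepoint name after_debit are assumptions for illustration. SQLite has no SET TRANSACTION statement, so that command is not shown.

```python
import sqlite3

# isolation_level=None puts the sqlite3 module in autocommit mode, so the
# transaction statements below are issued explicitly by this script.
conn = sqlite3.connect(":memory:", isolation_level=None)
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.execute("INSERT INTO account VALUES (1, 100), (2, 50)")


def balances(conn):
    return conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()


conn.execute("BEGIN")
conn.execute("UPDATE account SET balance = balance - 30 WHERE id = 1")
conn.execute("SAVEPOINT after_debit")
conn.execute("UPDATE account SET balance = balance + 999 WHERE id = 2")  # a mistake
conn.execute("ROLLBACK TO after_debit")  # undo only the mistaken credit
conn.execute("UPDATE account SET balance = balance + 30 WHERE id = 2")
conn.execute("COMMIT")  # the transfer of 30 is now permanent
committed = balances(conn)

conn.execute("BEGIN")
conn.execute("DELETE FROM account")
conn.execute("ROLLBACK")  # discard everything done since BEGIN
after_rollback = balances(conn)
```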
29. Schema and Instance
• A database schema is the skeleton structure that
represents the logical view of the entire database. It
defines how the data is organized and how the
relations among them are associated. It formulates all
the constraints that are to be applied on the data.
• A database schema defines its entities and the
relationship among them. It contains a descriptive
detail of the database, which can be depicted by means
of schema diagrams. It’s the database designers who
design the schema to help programmers understand
the database and make it useful.
31. • Physical Database Schema − This schema pertains to
the actual storage of data and its form of storage like
files, indices, etc. It defines how the data will be
stored in a secondary storage.
• Logical Database Schema − This schema defines all
the logical constraints that need to be applied on the
data stored. It defines tables, views, and integrity
constraints.
32. Database Instance
• It is important to distinguish these two terms.
A database schema is the skeleton of the
database. It is designed before the database
exists. Once the database is operational, it is very
difficult to make any changes to the schema. A database schema
does not contain any data or information.
• A database instance is the state of an operational database
with data at any given time. It contains a snapshot of
the database. Database instances tend to change with
time. A DBMS ensures that every instance (state) is
valid by diligently enforcing all the
validations, constraints, and conditions that the
database designers have imposed.
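• The difference between schema and instance can be observed directly. In this sketch (Python's built-in sqlite3 module; the employee table is hypothetical), the schema is read from sqlite_master, SQLite's catalog table, and the instance is simply the rows present at that moment.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT NOT NULL)")


def schema(conn):
    # sqlite_master is SQLite's catalog; "sql" holds the object's definition.
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'employee'"
    ).fetchone()[0]


def instance(conn):
    # The instance is whatever data the table holds right now.
    return conn.execute("SELECT id, name FROM employee ORDER BY id").fetchall()


schema_before = schema(conn)
instance_before = instance(conn)  # no rows yet

conn.execute("INSERT INTO employee (name) VALUES ('Asha')")
conn.execute("INSERT INTO employee (name) VALUES ('Ravi')")

schema_after = schema(conn)      # unchanged by the inserts
instance_after = instance(conn)  # a new snapshot of the data
```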
33. Data independence
• A database system normally contains a lot of data in addition
to users’ data. For example, it stores data about data, known
as metadata, to locate and retrieve data easily. It is rather
difficult to modify or update a set of metadata once it is
stored in the database. But as a DBMS expands, it needs to
change over time to satisfy the requirements of the users. If
all of the data were interdependent, such changes would be a
tedious and highly complex job. Metadata itself follows a layered
architecture, so that when we change data at one layer, it
does not affect the data at another level. The layers are
independent of each other but mapped to each other.
35. Logical Data Independence
• Logical data is data about the database, that is, it
stores information about how data is managed
inside. For example, a table (relation) stored in
the database and all the constraints applied on
that relation.
• Logical data independence is a mechanism that
decouples the logical schema from the actual
data stored on the disk. If we make some changes
to the table format, it should not change the data
residing on the disk.
36. Physical Data Independence
• All the schemas are logical, and the actual data is
stored in bit format on the disk. Physical data
independence is the power to change the
physical data without impacting the schema or
logical data.
• For example, in case we want to change or
upgrade the storage system itself − suppose we
want to replace hard-disks with SSD − it should
not have any impact on the logical data or
schemas.
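• Swapping disks cannot be shown in a few lines, but adding an index is a smaller physical change of the same kind. The sketch below (Python's built-in sqlite3 module; the employee table and index name are assumptions) runs the same query text before and after the change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO employee (name, dept) VALUES (?, ?)",
    [("Asha", "HR"), ("Ravi", "IT"), ("Meena", "HR")],
)

# The application's query; it never mentions how the data is stored.
QUERY = "SELECT name FROM employee WHERE dept = ? ORDER BY name"

before = conn.execute(QUERY, ("HR",)).fetchall()

# Physical change: add an index. Table definition and query text are untouched.
conn.execute("CREATE INDEX idx_employee_dept ON employee (dept)")

after = conn.execute(QUERY, ("HR",)).fetchall()
```

The answers are identical; only the access path available to the DBMS has changed.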
37. Architecture of DBMS
• The architecture of a DBMS is closely tied to the
computer system on which it runs. Different
approaches are used to implement the
database:
a) Client server based
b) Parallel computing based
c) Distributed database
40. • 3. Data Structure :
• (a) Data Files
(b) Data Dictionary
(c) Indices
(d) Statistical Data
41. Data Structures
• The following data structures are required as part of the physical
system implementation.
• Data Files : They store the database itself.
• Data Dictionary : It stores metadata (data about data) about the
structure of the database.
• Indices : They provide fast access to data items that hold particular
values.
• Statistical Data : It stores statistical information about the data in
the database. This information is used by the query processor to select
efficient ways to execute a query.
42. Query Processor Components
• DML Compiler : It translates DML statements in a query
language into low-level instructions that the query evaluation engine
understands. It also attempts to transform the user's request into an
equivalent but more efficient form.
• Embedded DML Pre-compiler : It converts DML statements
embedded in an application program to normal procedure calls in
the host language. The pre-compiler must interact with the DML
compiler to generate the appropriate code.
• DDL Interpreter : It interprets the DDL statements and records
them in a set of tables containing metadata (the data dictionary).
• Query Evaluation Engine : It executes low-level instructions
generated by the DML compiler.
43. Storage Manager Components
• They provide the interface between the low-level data stored in the
database and application programs and queries submitted to the system.
• Authorization and Integrity Manager : It tests for the satisfaction of
integrity constraints and checks the authority of users to access data.
• Transaction Manager : It ensures that the database remains in a
consistent state despite system failures and that concurrent
transaction execution proceeds without conflicts.
• File Manager : It manages the allocation of space on disk storage and the
data structures used to represent information stored on disk.
• Buffer Manager : It is responsible for fetching data from disk storage into
main memory and deciding what data to cache in memory.
44. Database Users
• Naïve users:- Naïve means lacking experience. These are users
who need not be aware of the presence of the database system.
An example is the user of an ATM, who only responds to the
instructions displayed on the screen (enter your PIN, click here,
enter the required amount, etc.). The operations performed by
these users are very limited.
• Application Programmers:- Professional / application
programmers are those who are responsible for developing
application programs or user interfaces. The application
programs could be written in a general-purpose
programming language or using the commands available to
manipulate a database.
45. • Sophisticated users:- Simply put, these are the experienced
users. These people interact with the system without writing
programs. Instead they form their requests in a database query
language. They submit each such query to a query processor, whose
function is to break down DML (Data Manipulation Language, the
language used to maintain the data) statements into instructions
that the storage manager understands. Analysts who submit queries
to explore data in the database fall in this category.
• Specialized users:- These are sophisticated users who write
specialized database applications that do not fit into the traditional
data-processing framework. Among these applications are
computer-aided design systems, knowledge-based and expert
systems, systems that store data with complex data types (e.g.,
graphics data and audio data) and environment-modeling systems.
46. Functions of Database Administrator
• Schema Definition:- The Database Administrator creates
the database schema by executing DDL
statements. The schema includes the logical structure of
database tables (relations), such as data types of
attributes, length of attributes, integrity constraints, etc.
• Storage structure and access method definition:- Database
tables or indexes are stored in the following ways: flat
files, heaps, B+ trees, etc.
• Schema and physical organization modification:-
The DBA carries out changes to the existing schema and
physical organization.
47. • Granting authorization for data access:- The DBA provides
different access rights to users according to their level. Ordinary
users might have highly restricted access to data, while users higher
in the hierarchy, up to the administrator, get more access rights.
• Routine Maintenance:- Some of the routine maintenance
activities of a DBA are given below.
a) Taking backups of the database periodically.
b) Ensuring enough disk space is available all the time.
c) Monitoring jobs running on the database.
d) Ensuring that performance is not degraded by some expensive
task submitted by some users.
e) Performance tuning.
48. Introduction to client server architecture.
• In client-server architectures the application
program functions are divided up between
clients and servers.
• The client takes care of the presentation logic.
• The server handles data storage and data
access logic.
• Application logic may reside on the client,
server or be split up between the two.
• Most networks today use a client-server
architecture.
50. • Client-server architectures are more efficient
since they distribute processing between client
and server.
• Another strength is that they allow hardware and
software from different vendors to be used
together.
• This is also a weakness, since it is sometimes
difficult to get software from different vendors to
work together smoothly.
• For this reason, a third category of software,
called middleware was developed.
51. Two/Three tier Architecture.
• Two-Tier Architecture:- The two-tier architecture is based on
the client-server model. Direct communication takes
place between client and server, with no intermediate
layer between them. Because of this tight coupling, a two-tier
application can run faster.
• The above figure shows the two-tier architecture. Here the
communication is one to one. Let us see the concept of two
tiers with a real-world application. For example, suppose we
need to save employee details in a database. The two tiers
of the two-tier architecture are:
52. • Database (Data tier)
• Client Application (Client tier)
• So, in the client application the developer writes the
code for saving the record in SQL Server,
thereby saving the data in the database.
54. • Three-Tier Architecture:
• A three-tier architecture has three layers. They are:
• Client layer
• Business layer
• Data layer
• Client layer: Here we design the form using text boxes, labels,
etc.
• Business layer: It is the intermediate layer that provides the
functions used by the client layer and handles
communication between the client and data layers. It
provides the business process logic and the data access.
• Data layer: It holds the database.
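• The three layers can be sketched in a few lines of Python. Everything here is an assumption made for illustration: the class and function names, the employee table, and the use of the built-in sqlite3 module in place of a real database server. In practice each layer would usually run as a separate process or on a separate machine.

```python
import sqlite3


class DataLayer:
    """Data tier: the only code that talks to the database."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)"
        )

    def insert_employee(self, name, salary):
        with self.conn:  # commit on success, roll back on error
            cur = self.conn.execute(
                "INSERT INTO employee (name, salary) VALUES (?, ?)", (name, salary)
            )
        return cur.lastrowid

    def count_employees(self):
        return self.conn.execute("SELECT COUNT(*) FROM employee").fetchone()[0]


class BusinessLayer:
    """Middle tier: business rules, with no SQL and no user interface."""

    def __init__(self, data):
        self.data = data

    def save_employee(self, name, salary):
        name = name.strip()
        if not name or salary < 0:
            raise ValueError("name must be non-empty and salary non-negative")
        return self.data.insert_employee(name, salary)


def client_submit(business, form):
    """Client tier: passes form input on and builds the message to display."""
    try:
        new_id = business.save_employee(form["name"], float(form["salary"]))
        return f"Saved employee #{new_id}"
    except ValueError as err:
        return f"Rejected: {err}"


data = DataLayer()
business = BusinessLayer(data)
msg_ok = client_submit(business, {"name": "Asha", "salary": "42000"})
msg_bad = client_submit(business, {"name": "  ", "salary": "100"})
```

Because the client only calls the business layer, the storage code can be changed without touching the form logic.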
55. • Advantages
• Easy to modify without affecting other
modules
• Fast communication
• Performance will be good in three tier
architecture.
56. The 12 Rules (Codd’s laws) for fully functional
RDBMS.
• Rule 1: Information Rule:- The data stored in a database, whether it is
user data or metadata, must be a value of some table cell. Everything
in a database must be stored in a table format.
• Rule 2: Guaranteed Access Rule:- Every single data element (value) is
guaranteed to be accessible logically with a combination of table name,
primary key (row value), and attribute name (column value). No
other means, such as pointers, can be used to access data.
• Rule 3: Systematic Treatment of NULL Values:- The NULL values in a
database must be given a systematic and uniform treatment. This is a
very important rule because a NULL can be interpreted as one of the
following: data is missing, data is not known, or data is not applicable.
57. • Rule 4: Active Online Catalog:- The structure description of the entire
database must be stored in an online catalog, known as data dictionary,
which can be accessed by authorized users. Users can use the same query
language to access the catalog which they use to access the database
itself.
• Rule 5: Comprehensive Data Sub-Language Rule:- A database can only be
accessed using a language having linear syntax that supports data
definition, data manipulation, and transaction management operations.
This language can be used directly or by means of some application. If the
database allows access to data without any help of this language, then it is
considered as a violation.
• Rule 6: View Updating Rule:- All the views of a database, which can
theoretically be updated, must also be updatable by the system.
58. • Rule 7: High-Level Insert, Update, and Delete Rule:- A database must
support high-level insertion, update, and deletion. This must not be
limited to a single row, that is, it must also support union, intersection and
minus operations to yield sets of data records.
• Rule 8: Physical Data Independence:- The data stored in a database must
be independent of the applications that access the database. Any change
in the physical structure of a database must not have any impact on how
the data is being accessed by external applications.
• Rule 9: Logical Data Independence:-The logical data in a database must be
independent of its user’s view (application). Any change in logical data
must not affect the applications using it. For example, if two tables are
merged or one is split into two different tables, there should be no impact
or change on the user application. This is one of the most difficult rules to
apply.
59. • Rule 10: Integrity Independence:- A database must be independent of the
application that uses it. All its integrity constraints can be independently
modified without the need of any change in the application. This rule
makes a database independent of the front-end application and its
interface.
• Rule 11: Distribution Independence:- The end-user must not be able to
see that the data is distributed over various locations. Users should always
get the impression that the data is located at one site only. This rule has
been regarded as the foundation of distributed database systems.
• Rule 12: Non-Subversion Rule:- If a system has an interface that provides
access to low-level records, then the interface must not be able to subvert
the system and bypass security and integrity constraints.
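• Rule 3 is easy to observe in practice. The sketch below uses Python's built-in sqlite3 module with a hypothetical contact table in which two phone numbers are unknown; it shows how SQL treats NULL uniformly in comparisons and in aggregates.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE contact (name TEXT, phone TEXT)")
conn.executemany(
    "INSERT INTO contact VALUES (?, ?)",
    [("Asha", "555-0101"), ("Ravi", None), ("Meena", None)],  # None is stored as NULL
)

# Comparing with "= NULL" yields NULL (unknown), never true, so no row matches.
eq_matches = conn.execute(
    "SELECT COUNT(*) FROM contact WHERE phone = NULL"
).fetchone()[0]

# IS NULL is the dedicated test for a missing value.
is_null_matches = conn.execute(
    "SELECT COUNT(*) FROM contact WHERE phone IS NULL"
).fetchone()[0]

# COUNT(column) skips NULLs, while COUNT(*) counts rows.
total, with_phone = conn.execute(
    "SELECT COUNT(*), COUNT(phone) FROM contact"
).fetchone()
```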
60. Introduction to Distributed database
• A distributed database system allows applications to
access data from local and remote databases. In a
homogenous distributed system, each database is an
Oracle database. In a heterogeneous distributed
system, at least one of the databases is a non-Oracle
database. Distributed database uses a client-server
architecture to process information requests.
• This section contains the following topics:
• Homogenous Distributed Database Systems
• Heterogeneous Distributed Database Systems
• Client-Server Database Architecture
61. Homogenous Distributed Database Systems
• A homogenous distributed database system is a network of two or more
Oracle databases that reside on one or more machines. The figure
below illustrates a distributed system that connects three databases: HQ,
MFG, and SALES. An application can simultaneously access or modify the
data in several databases in a single distributed environment. For example,
a single query on local database MFG can retrieve joined data from the
PRODUCTS table on the local database and the DEPT table on the remote
HQ database.
• For a client application, the location and platform of the databases are
transparent. You can also create synonyms for remote objects in the
distributed system so that users can access them with the same syntax as
local objects. For example, if you are connected to database MFG yet want
to access data on database HQ, creating a synonym on MFG for the
remote DEPT table allows you to issue this query:
• SELECT * FROM dept;
• In this way, a distributed system gives the
appearance of native data access. Users on MFG do not have to know that
the data they access resides on remote databases.
63. Heterogeneous Distributed Database Systems
• In a heterogeneous distributed database system, at
least one of the databases is a non-Oracle system. To
the application, the heterogeneous distributed
database system appears as a single, local, Oracle
database; the local Oracle server hides the distribution
and heterogeneity of the data.
• The Oracle server accesses the non-Oracle system
using Oracle8i Heterogeneous Services and a system-
specific transparent gateway. For example, if you
include a DB2 database in an Oracle distributed system,
you need to obtain a DB2-specific transparent gateway
so that the Oracle databases in the system can
communicate with it.
64. Client-Server Database Architecture
• A database server is the Oracle software managing a database, and a client
is an application that requests information from a server. Each computer
in a network is a node that can host one or more databases. Each node in
a distributed database system can act as a client, a server, or both,
depending on the situation.
• In the figure below, the host for the HQ database is acting as a database
server when a statement is issued against its local data (for example, the
second statement in each transaction issues a statement against the local
DEPT table), but is acting as a client when it issues a statement against
remote data (for example, the first statement in each transaction is issued
against the remote table EMP in the SALES database).
66. Introduction to data mining
• Data mining, the extraction of hidden predictive
information from large databases, is a powerful new
technology with great potential to help companies focus on
the most important information in their data warehouses.
Data mining tools predict future trends and behaviors,
allowing businesses to make proactive, knowledge-driven
decisions. The automated, prospective analyses offered by
data mining move beyond the analyses of past events
provided by retrospective tools typical of decision support
systems. Data mining tools can answer business questions
that traditionally were too time consuming to resolve. They
scour databases for hidden patterns, finding predictive
information that experts may miss because it lies outside
their expectations.
67. Data warehousing
• In computing, a data warehouse (DW or DWH), also known as an
enterprise data warehouse (EDW), is a system used for reporting
and data analysis. DWs are central repositories of integrated data
from one or more disparate sources. They store current and
historical data and are used for creating analytical reports for
knowledge workers throughout the enterprise. Examples of reports
could range from annual and quarterly comparisons and trends to
detailed daily sales analyses.
• The data stored in the warehouse is uploaded from the operational
systems (such as marketing, sales, etc., shown in the figure to the
right). The data may pass through an operational data store for
additional operations before it is used in the DW for reporting.
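• A quarterly comparison report of the kind mentioned above boils down to an aggregate query over historical data. The sketch below uses Python's built-in sqlite3 module; the sales_fact table, its columns, and the sample figures are invented for illustration and stand in for data loaded from operational systems.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales_fact (sale_date TEXT, region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales_fact VALUES (?, ?, ?)",
    [
        ("2023-01-15", "North", 100.0),
        ("2023-02-20", "South", 150.0),
        ("2023-05-03", "North", 200.0),
        ("2024-01-10", "North", 120.0),
    ],
)

# Total sales per year and quarter. Integer division maps months 1-3 to
# quarter 1, months 4-6 to quarter 2, and so on.
quarterly = conn.execute(
    """
    SELECT strftime('%Y', sale_date) AS year,
           (CAST(strftime('%m', sale_date) AS INTEGER) + 2) / 3 AS quarter,
           SUM(amount) AS total
    FROM sales_fact
    GROUP BY year, quarter
    ORDER BY year, quarter
    """
).fetchall()
```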