dbms introduction.pptx

  1. DBMS II SEM Introduction
  2. Introduction • Data: Known facts that can be recorded and have an implicit meaning; raw data, unprocessed data • Information: Processed data • Database: a highly organized, interrelated, and structured set of data about a particular enterprise • Controlled by a database management system (DBMS) • DBMS • Set of programs to access the data • An environment that is both convenient and efficient to use • Database systems are used to manage collections of data that are: • Highly valuable • Relatively large • Accessed by multiple users and applications, often at the same time. • A modern database system is a complex software system whose task is to manage a large, complex collection of data. • Databases touch all aspects of our lives
  3. Database Examples • Enterprise Information • Sales: customers, products, purchases • Accounting: payments, receipts, assets • Human Resources: information about employees, salaries, payroll taxes • Manufacturing: management of production, inventory, orders, supply chain • Banking and finance • Customer information, accounts, loans, and banking transactions • Credit card transactions • Finance: sales and purchases of financial instruments (e.g., stocks and bonds); storing real-time market data • Universities: registration, grades
  4. Databases • Traditional applications: • Numeric and textual databases • More recent applications: • Multimedia databases • Geographic Information Systems (GIS) • Biological and genome databases • Data warehouses • Mobile databases • Real-time and active databases
  5. • Social networks capture a lot of information about people and about communications among people (posts, tweets, photos, videos) in systems such as: - Facebook - Twitter - LinkedIn • All of the above constitutes data • Search engines (Google, Bing, Yahoo) collect their own repository of web pages for searching purposes
  6. DBMS Functions • Define a particular database in terms of its data types, structures etc. • Construct or load the initial database contents on a secondary storage medium • Manipulate the database: • Retrieval: querying, generating reports • Modification: insertions, deletions and updates to its content • Access the database through Web applications • Process and share the data among a set of concurrent users and application programs, while keeping all data valid and consistent
  7. • DBMS may additionally provide: • Protection or security measures to prevent unauthorized access • “Active” processing to take internal actions on data • Presentation and visualization of data • Maintenance of the database and associated programs over the lifetime of the database application
  8. Purpose of Database Systems A file-processing system is supported by a conventional operating system. The system stores permanent records in various files, and it needs different application programs to extract records from, and add records to, the appropriate files. Before database management systems (DBMSs) were introduced, organizations usually stored information in such systems. Data redundancy and inconsistency: • Since different programmers create the files and application programs over a long period, the various files are likely to have different structures and the programs may be written in several programming languages. Moreover, the same information may be duplicated in several places (files). This redundancy leads to higher storage and access cost. In addition, it may lead to data inconsistency; that is, the various copies of the same data may no longer agree. Difficulty in accessing data: • Need to write a new program to carry out each new task. Data isolation: • Multiple files and formats. Because data are scattered in various files, and files may be in different formats, writing new application programs to retrieve the appropriate data is difficult. Integrity problems: • The data values stored in the database must satisfy certain types of consistency constraints. For example, suppose a university requires that the account balance of a department may never fall below zero. Developers enforce such constraints in the system by adding appropriate code in the various application programs. However, when new constraints are added, it is difficult to change the programs to enforce them. The problem is compounded when constraints involve several data items from different files.
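The integrity problem described above is one a DBMS can handle declaratively. The following is a minimal sketch, assuming Python's built-in sqlite3 module; the `department` table, its columns, and the `try_spend` helper are hypothetical names chosen for illustration, not part of the slides.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Hypothetical table: the CHECK clause states the rule once, in the schema,
# instead of repeating it in every application program.
conn.execute(
    "CREATE TABLE department ("
    " name TEXT PRIMARY KEY,"
    " balance REAL NOT NULL CHECK (balance >= 0))"
)
conn.execute("INSERT INTO department VALUES ('Physics', 100.0)")
conn.commit()


def try_spend(amount):
    """Return True if the update was accepted, False if the DBMS rejected it."""
    try:
        with conn:  # commits on success, rolls back if an exception is raised
            conn.execute(
                "UPDATE department SET balance = balance - ? WHERE name = 'Physics'",
                (amount,),
            )
        return True
    except sqlite3.IntegrityError:
        return False


accepted = try_spend(60.0)  # balance would become 40: allowed
rejected = try_spend(60.0)  # balance would become -20: violates the CHECK
balance = conn.execute("SELECT balance FROM department").fetchone()[0]
```

Because the rule lives in the schema, adding a new constraint later means changing one table definition rather than every program that touches the file.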
  9. • Atomicity of updates • Failures may leave database in an inconsistent state with partial updates carried out • Example: Transfer of funds from one account to another should either complete or not happen at all • Concurrent access by multiple users • Concurrent access needed for performance • Uncontrolled concurrent accesses can lead to inconsistencies • Ex: Two people reading a balance (say 100) and updating it by withdrawing money (say 50 each) at the same time • Security problems • Hard to provide user access to some, but not all, data
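The funds-transfer example above can be sketched with a transaction. This is an illustrative sketch only, assuming Python's built-in sqlite3 module; the `account` table and the `transfer` function (including its `fail_midway` switch, used to simulate a crash) are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [("A", 100), ("B", 0)])
conn.commit()


def transfer(src, dst, amount, fail_midway=False):
    """Move amount from src to dst as one all-or-nothing transaction."""
    try:
        with conn:  # commit if the block finishes, roll back if it raises
            conn.execute(
                "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
            )
            if fail_midway:
                raise RuntimeError("simulated failure between the two updates")
            conn.execute(
                "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
            )
    except RuntimeError:
        pass  # the debit above has already been rolled back


def balances():
    return dict(conn.execute("SELECT id, balance FROM account ORDER BY id"))


transfer("A", "B", 50, fail_midway=True)
after_failure = balances()  # no partial update survives the failure
transfer("A", "B", 50)
after_success = balances()  # both updates applied together
```

In a plain file-processing system the debit would already be on disk when the failure occurs, and the application would have to repair it by hand.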
  10. Simplified database system environment
  11. Types of Databases 1. Relational Database • A relational database management system (RDBMS) is a system where data is organized in two-dimensional tables using rows and columns. • The relational model is one of the most popular data models used in industry. Its standard query language is SQL. • Every table in a database has a key field which uniquely identifies each record. • This type of system is the most widely used DBMS. • Relational database management system software is available for personal computers, workstations and large mainframe systems. • For example − Oracle Database, MySQL, Microsoft SQL Server etc.
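To make the table-and-key idea concrete, here is a small sketch assuming Python's built-in sqlite3 module (SQLite is a relational engine not named in the slides). The `student` table and its sample rows are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# One two-dimensional table: each row is a record, each column an attribute.
conn.execute(
    "CREATE TABLE student ("
    " roll_no INTEGER PRIMARY KEY,"  # key field: uniquely identifies a record
    " name TEXT NOT NULL,"
    " dept TEXT NOT NULL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)",
    [(1, "Asha", "CS"), (2, "Ravi", "EE"), (3, "Meena", "CS")],
)

# Retrieval is expressed declaratively in SQL.
cs_students = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM student WHERE dept = 'CS' ORDER BY roll_no"
    )
]

# The key constraint rejects a second record with the same roll_no.
try:
    conn.execute("INSERT INTO student VALUES (1, 'Duplicate', 'ME')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```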
  12. 2. Object-Oriented Database • It is a system where information or data is represented in the form of objects, as used in object-oriented programming. • It is a combination of relational database concepts and object-oriented principles. • Relational database concepts are concurrency control, transactions, etc. • OOP principles are data encapsulation, inheritance, and polymorphism. • It requires less code and is easy to maintain. • For example − ObjectDB.
  13. 3. Hierarchical Database • It is a system where the data elements have a one-to-many relationship (1:N). Here data is organized like a tree, similar to a folder structure in your computer system. • The hierarchy starts from the root node, connecting all the child nodes to the parent node. • It is used in industry on mainframe platforms. • For example − IMS (IBM), Windows Registry (Microsoft).
  14. 4. Network database • A network database management system is a system where the data elements can maintain one-to-one (1:1) or many-to-many (M:N) relationships. • It resembles the hierarchical model, but the data is organized as a graph, so a child record is allowed to have more than one parent.
  15. 5. NoSQL databases • NoSQL is a broad category that includes any database that doesn’t use SQL as its primary data access language. • These types of databases are also sometimes referred to as non-relational databases. • Unlike in relational databases, data in a NoSQL database doesn’t have to conform to a pre-defined schema, so these types of databases are well suited to organizations seeking to store unstructured or semi-structured data. • One advantage of NoSQL databases is that developers can make changes to the database on the fly, without affecting applications that are using the database. • Examples: Apache Cassandra, MongoDB, CouchDB, and Couchbase
  16. 6. Cloud databases • A cloud database refers to any database that’s designed to run in the cloud. Like other cloud-based applications, cloud databases offer flexibility and scalability, along with high availability. Cloud databases are also often low-maintenance, since many are offered via a SaaS model. • Examples: Microsoft Azure SQL Database, Amazon Relational Database Service, Oracle Autonomous Database. 7. Columnar databases • Columnar databases, also referred to as column data stores, store data in columns rather than rows. These types of databases are often used in data warehouses because they handle analytical queries well. A query against a columnar database can skip all of the data that doesn’t apply to it, because it reads only the columns it needs. • Examples: Google BigQuery, Cassandra, HBase, MariaDB, Azure SQL Data Warehouse
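The difference between row and column layouts can be sketched in plain Python. This is a toy illustration with invented sample data, not how any of the listed products store data on disk.

```python
# Row-oriented layout: each record keeps all of its fields together.
rows = [
    {"id": 1, "region": "north", "amount": 120},
    {"id": 2, "region": "south", "amount": 80},
    {"id": 3, "region": "north", "amount": 50},
]

# Column-oriented layout: each column is stored as its own contiguous list.
columns = {key: [row[key] for row in rows] for key in rows[0]}

# An analytical query such as SUM(amount):
# the row layout walks every record, including fields it does not need,
row_total = sum(row["amount"] for row in rows)
# while the column layout reads only the one column the query mentions.
column_total = sum(columns["amount"])
```

Both layouts give the same answer; the columnar one simply touches less data for this kind of aggregate, which is why it suits data warehouses.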
  17. 8. Document databases • Document databases, also known as document stores or document-oriented databases, use JSON-like documents to model data instead of rows and columns. They are designed to store and manage document-oriented information, also referred to as semi-structured data. Document databases are simple and scalable, making them useful for mobile apps that need fast iterations. • Examples: MongoDB, Amazon DocumentDB, Apache CouchDB 9. Graph databases • Graph databases are a type of NoSQL database based on graph theory. Graph-oriented DBMS software is designed to identify and work with the connections between data points. Graph databases are therefore often used to analyze the relationships between heterogeneous data points, such as in fraud prevention or for mining data about customers from social media. • Examples: DataStax Enterprise Graph, Neo4j 10. Time series databases • A time series database is a database optimized for time-stamped, or time series, data. Examples of this type of data include network data, sensor data, and application performance monitoring data. Internet of Things sensors, for instance, put out a constant stream of time series data. • Examples: Druid, eXtremeDB, InfluxDB
  18. Characteristics of Database Approach 1. Self-Describing Nature of a Database System: One of the most fundamental characteristics of the database approach is that the database system contains not only the database itself but also a complete definition or description of the database structure and constraints, known as the metadata of the database. 2. Support for Multiple Views of the Data: • A database typically has many users, each of whom may require a different perspective or view of the database. • A view may be a subset of the database, or it may contain virtual data that is derived from the database files but isn’t explicitly stored.
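Both characteristics can be observed directly in a small relational engine. The sketch below assumes Python's built-in sqlite3 module; the `employee` table and the `employee_public` view are hypothetical names. SQLite keeps its catalog in a table called `sqlite_master`, and `PRAGMA table_info` lists a table's columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, salary REAL)")
conn.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [(1, "Asha", 5000.0), (2, "Ravi", 4200.0)],
)
# A view for users who should not see salaries: derived, not stored separately.
conn.execute("CREATE VIEW employee_public AS SELECT id, name FROM employee")

# Self-describing: the database can be asked what it contains.
catalog = conn.execute(
    "SELECT type, name FROM sqlite_master ORDER BY name"
).fetchall()
column_names = [row[1] for row in conn.execute("PRAGMA table_info(employee)")]

# Multiple views: this user sees a subset of the same underlying data.
view_rows = conn.execute("SELECT * FROM employee_public ORDER BY id").fetchall()
```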
  19. 3. Sharing of Data and Multi-user Transaction Processing: • A multi-user DBMS, as its name implies, must allow multiple users to access the database at the same time, that is, concurrently. • This is essential if data for multiple applications is to be integrated and maintained in a single database, for example the integration of WhatsApp with Facebook. • The DBMS must implement concurrency control in its software to make sure that several users trying to update the same data do so in a controlled manner, so that the results of the updates are correct. 4. Manages Information • A database organizes and maintains all the information that its users require for their work. 5. Easy Operation Implementation • Operations such as insert, delete, update, and search are carried out in a flexible and easy way. A database makes these operations simple to implement, so that a user with little knowledge can perform them. This characteristic makes the database approach more powerful.
  20. 6. Data For Specific Purpose • A database is designed for data with a specific purpose. For example, the database of a student management system is designed to maintain records of students’ marks, fees, attendance, etc. This data has the specific purpose of maintaining student records. 7. It has Users of Specific Interest • A database always has some intended group of users, and applications in which these user groups are interested. • For example, in a library system there are three groups of users: the official administration of the college, the librarian, and the students.
  21. Characteristics of Data in the Database ▰Shared ▰Persistence ▰Validity/ Correctness ▰Security ▰Consistency ▰Non-Redundancy ▰Independence
  22. Advantages of Using the Database Approach ▰Controlling redundancy in data storage and in development and maintenance efforts. ▰Sharing of data among multiple users. ▰Restricting unauthorized access to data. ▰Providing persistent storage for program objects. ▰Providing storage structures (e.g. indexes) for efficient query processing. ▰Providing backup and recovery services. ▰Providing multiple interfaces to different classes of users. ▰Representing complex relationships among data. ▰Enforcing integrity constraints on the database. ▰Drawing inferences and actions from the stored data using deductive and active rules.
  23. When not to use a DBMS ▰Main inhibitors (costs) of using a DBMS: ▻High initial investment and possible need for additional hardware. ▻Overhead for providing generality, security, concurrency control, recovery, and integrity functions. ▰When a DBMS may be unnecessary: ▻If the database and applications are simple, well defined, and not expected to change. ▻If there are stringent real-time requirements that may not be met because of DBMS overhead. ▻If access to data by multiple users is not required. ▰When no DBMS may suffice: ▻If the database system is not able to handle the complexity of data because of modeling limitations ▻If the database users need special operations not supported by the DBMS.
  24. Database Users ▰Users may be divided into ▻Those who actually use and control the database content, and those who design, develop and maintain database applications (called “Actors on the Scene”), and ▻Those who design and develop the DBMS software and related tools, and the computer systems operators (called “Workers Behind the Scene”).
  25. Actors on the scene ▻Database administrators: ▻Responsible for authorizing access to the database, for coordinating and monitoring its use, acquiring software and hardware resources, controlling its use and monitoring efficiency of operations. ▻Database Designers: ▻Responsible to define the content, the structure, the constraints, and functions or transactions against the database. They must communicate with the end-users and understand their needs.
  26. ▻End-users: They use the data for queries, reports and some of them update the database content. End-users can be categorized into: ▻Casual: access database occasionally when needed ▻Naïve or Parametric: they make up a large section of the end-user population. • They use previously well-defined functions in the form of “canned transactions” against the database. • Examples are bank-tellers or reservation clerks who do this activity for an entire shift of operations. ▻Sophisticated: • These include business analysts, scientists, engineers, others thoroughly familiar with the system capabilities. • Many use tools in the form of software packages that work closely with the stored database. ▻Stand-alone: • Mostly maintain personal databases using ready-to-use packaged applications. • An example is a tax program user that creates its own internal database. • Another example is a user that maintains an address book
  27. Workers Behind the scene ▰DBMS system designers and implementers : ▻Design and implement the DBMS modules and interfaces including modules for implementing the catalog, query language processing, interface processing, accessing and buffering data, controlling concurrency, and handling data recovery and security. ▰Tool developers ▻Design and implement tools which are optional packages for database design, performance monitoring, natural language or graphical interfaces, prototyping, simulation, and test data generation ▰Operators and maintenance personnel (system administration personnel) are responsible for the actual running and maintenance of the hardware and software environment for the database system.
  28. Schemas, Instances and Database State Database Schema (meta-data): The design of a database is called the schema. It includes descriptions of the database structure and the constraints that should hold on the database. The database schema changes very infrequently. Database Instance: The actual data stored in a database at a particular moment in time. Also called the database state (or occurrence, or snapshot). The database state changes every time the database is updated. The schema is also called the intension, whereas the state is called the extension.
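The distinction between schema and state can be checked with a few lines of code. This sketch assumes Python's built-in sqlite3 module; the `course` table and the two helper functions are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE course (code TEXT PRIMARY KEY, title TEXT NOT NULL)")


def schema():
    """The intension: the stored definition of the table."""
    return conn.execute(
        "SELECT sql FROM sqlite_master WHERE name = 'course'"
    ).fetchone()[0]


def state():
    """The extension: the rows present at this moment."""
    return conn.execute("SELECT code, title FROM course ORDER BY code").fetchall()


schema_before, state_before = schema(), state()
conn.execute("INSERT INTO course VALUES ('CS101', 'Databases')")
schema_after, state_after = schema(), state()
# The update changed the state but left the schema untouched.
```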
  29. Schema diagram for UNIVERSITY database
  30. Instance diagram for UNIVERSITY database
  31. DBMS Architecture • Three-Schema Architecture External schema at the external level to describe the various user views. Usually uses the same data model as the conceptual level or a high-level data model. Conceptual schema at the conceptual level to describe the structure and constraints for the whole database. Uses a conceptual or an implementation data model. Internal schema at the internal level to describe data storage structures and access paths. Typically uses a physical data model.
  32. • External/View level • This is the highest level of database abstraction. It includes a number of external schemas or user views. This level provides different views of the same database for a specific user or a group of users. An external view provides a powerful and flexible security mechanism by hiding parts of the database from a particular user. • Conceptual or Logical level • This level describes the structure of the whole database. It acts as a middle layer between the physical storage and the user views. It specifies what data is stored in the database, what the data types are, and what relationships exist among those data. There is only one conceptual schema per database.
  33. • Internal or Physical level • This is the lowest level of database abstraction and the one closest to physical storage. It describes how the data is stored in the database and provides the methods to access data from the database. It allows viewing the physical representation of the database on the computer system. • The interface between the conceptual and internal schema identifies how an element in the conceptual schema is stored and how it may be accessed.
  34. Data Independence The capacity to change the schema at one level without having to change the schema at the next higher level. Types: Logical Data Independence: The capacity to change the conceptual schema without having to change the external schemas and their application programs. Physical Data Independence: The capacity to change the internal schema without having to change the conceptual schema. In both cases, only the mappings between the changed schema and the higher-level schemas need to change.
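Both kinds of data independence can be approximated in a small experiment. The sketch below assumes Python's built-in sqlite3 module and treats a view as the external schema; the table, view, and index names are hypothetical. Adding an index stands in for an internal-schema change, and renaming the base table stands in for a conceptual-schema change.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (roll_no INTEGER PRIMARY KEY, name TEXT, dept TEXT)")
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?)", [(1, "Asha", "CS"), (2, "Ravi", "EE")]
)
# External schema: the application only ever queries this view.
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM student")


def application_query():
    return conn.execute(
        "SELECT roll_no, name FROM student_names ORDER BY roll_no"
    ).fetchall()


before = application_query()

# Physical data independence: a new access path, no change to tables or views.
conn.execute("CREATE INDEX idx_student_dept ON student (dept)")
after_physical = application_query()

# Logical data independence: the base table is renamed; only the
# external-to-conceptual mapping (the view definition) is rewritten.
conn.execute("DROP VIEW student_names")
conn.execute("ALTER TABLE student RENAME TO learner")
conn.execute("CREATE VIEW student_names AS SELECT roll_no, name FROM learner")
after_logical = application_query()
```

The application query text never changes; in each case only a mapping or storage detail beneath it was altered.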
  35. Three Schema Architecture – Advantages • Database abstraction • Easier to use for a user. • Allows each user to access customized view of data. • Enables a database admin to change the storage structure without affecting the user’s view
  36. 3-tier Client Server DBMS Architecture
  37. • The 3-tier architecture consists of three layers, as follows − • Presentation layer − This layer is also called the client layer. It is the front-end layer and consists of a user interface. Its main purpose is to communicate with the application layer. • Application layer − This layer is also called the business logic layer. It acts as a middle layer between the client and the database server, and the two exchange partially processed data through it. • Database layer − In this layer the data or information is stored. This layer performs operations such as insert, update and delete on the database.
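The separation of the three layers can be sketched in a single Python file. This is a deliberately simplified illustration, assuming the built-in sqlite3 module; in a real deployment each layer would usually run as a separate process or machine, and every class and function name here is hypothetical.

```python
import sqlite3


class DatabaseLayer:
    """Database layer: the only code that issues SQL."""

    def __init__(self):
        self.conn = sqlite3.connect(":memory:")
        self.conn.execute(
            "CREATE TABLE account (id TEXT PRIMARY KEY, balance INTEGER NOT NULL)"
        )
        self.conn.execute("INSERT INTO account VALUES ('A', 100)")
        self.conn.commit()

    def get_balance(self, account_id):
        row = self.conn.execute(
            "SELECT balance FROM account WHERE id = ?", (account_id,)
        ).fetchone()
        return None if row is None else row[0]

    def set_balance(self, account_id, balance):
        with self.conn:  # commit on success, roll back on error
            self.conn.execute(
                "UPDATE account SET balance = ? WHERE id = ?", (balance, account_id)
            )


class ApplicationLayer:
    """Application layer: business rules, no SQL and no user interface."""

    def __init__(self, db):
        self.db = db

    def withdraw(self, account_id, amount):
        balance = self.db.get_balance(account_id)
        if balance is None:
            return {"ok": False, "reason": "unknown account"}
        if amount <= 0 or amount > balance:
            return {"ok": False, "reason": "invalid amount"}
        self.db.set_balance(account_id, balance - amount)
        return {"ok": True, "balance": balance - amount}


def present(result):
    """Presentation layer: turns a result into text for the user."""
    if result["ok"]:
        return f"Withdrawal complete. New balance: {result['balance']}"
    return f"Withdrawal refused: {result['reason']}"


app = ApplicationLayer(DatabaseLayer())
first = present(app.withdraw("A", 30))
second = present(app.withdraw("A", 500))
```

Keeping SQL out of the presentation code and display logic out of the database code is what lets each layer be replaced independently.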