2. Part 1. Introduction to Database System
Introduction to Database
History of RDBMS
Entity-Relationship Modeling
Database Language
3. Introduction to Database
File-Based Approach
Each program defines and manages its own data
Limitation
Duplication of data
Data dependence
Incompatibility of files
Separation and isolation of data
Fixed queries / proliferation of application programs
Database Approach
A shared collection of logically related data, designed
to meet the information needs of an organization
4.
Database Management System (DBMS)
A software system that enables users to define, create, and
maintain the database, and that provides controlled access to
the database
DDL
DML : procedural, non-procedural
Control : security, integrity, concurrency control,
recovery control, user-accessible catalog
Components of the DBMS Environment
Hardware - Software - Data - Procedures - People
5.
Advantages of DBMS
- Control of data redundancy
- Economy of scale
- Data consistency
- Balance of conflicting requirements
- More information from the same amount of data
- Sharing of data
- Improved data accessibility and responsiveness
- Improved data integrity
- Increased productivity
- Improved security
- Improved maintenance through data independence
- Enforcement of standards
- Increased concurrency
- Improved backup and recovery services
Disadvantages of DBMS
- Complexity, Size, Cost of DBMSs, Additional H/W costs
- Cost of conversion, Performance, Higher impact of a failure
6.
Three-Level Database Architecture
External Level
The users’ view of the database
Conceptual Level
The community view of the database
Internal Level
The physical representation of the database on the computer
7.
Functions of a DBMS
1. Data storage, retrieval, and update
2. A user-accessible catalog
3. Transaction support
4. Concurrency control services
5. Recovery services
6. Authorization services
7. Support for data communication
8. Integrity services
9. Services to promote data independence
10. Utility services
8.
Components of a DBMS
[Figure: components of a DBMS. Labels: Programmers, Users, DBA; Application programs, Queries, Database schema; DML preprocessor, Query processor, DDL compiler; Program object code; Database manager, Dictionary manager; Access methods; File manager; System buffers; Database and system catalog.]
9.
Components of Database Manager
[Figure: components of the database manager. Labels: Authorization control, Integrity checker, Command processor, Query optimizer, Transaction manager, Scheduler, Buffer manager, Recovery manager, Data manager.]
10. History of RDBMS
History of DBMS
1960s - Apollo moon-landing project, GUAM
mid 1960s - IMS by IBM (hierarchical DBMS)
mid 1960s - IDS by GE (network DBMS)
1965 - CODASYL (Conference on Data Systems Languages)
1967 - DBTG (Data Base Task Group)
1970 - E. F. Codd of the IBM Research Lab. proposes the relational model
Late 1970s - System R project at IBM
1980s - commercial relational DBMSs (DB2, Oracle, Informix, ...)
Now - OODBMS, ORDBMS
11.
Terminology
Relation : a relation is a table with columns and rows
Attribute : an attribute is a named column of a relation
Domain : a domain is the set of allowable values for
one or more attributes
Tuple : a tuple is a row of a relation
Degree : the degree of a relation is the number of
attributes it contains
Cardinality : the cardinality of a relation is the number
of tuples it contains
Relational database : a collection of normalized
relations
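The terms above can be made concrete with a small sketch. The `Relation` class and the `Staff` sample data below are hypothetical, chosen only for illustration; they are not taken from any DBMS.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Relation:
    """A relation modeled as a named set of tuples over named attributes."""
    name: str
    attributes: tuple   # attribute names, one per column
    tuples: frozenset   # rows; stored as a set, so duplicate tuples cannot exist

    @property
    def degree(self) -> int:
        # Degree = number of attributes (columns).
        return len(self.attributes)

    @property
    def cardinality(self) -> int:
        # Cardinality = number of tuples (rows).
        return len(self.tuples)


# Hypothetical sample data for illustration only.
staff = Relation(
    name="Staff",
    attributes=("staff_no", "name", "position"),
    tuples=frozenset({
        ("S1", "Ann", "Manager"),
        ("S2", "Bob", "Assistant"),
        ("S2", "Bob", "Assistant"),  # a repeated tuple collapses into one
    }),
)

print(staff.degree, staff.cardinality)  # prints: 3 2
```

Storing the tuples in a set mirrors two properties of relations: there are no duplicate tuples, and the order of tuples has no significance.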
12.
Properties of Relations
The relation has a name that is distinct from all other
relation names
Each cell of the relation contains exactly one atomic
value
Each attribute has a distinct name
The values of an attribute are all from the same domain
The order of attributes has no significance
Each tuple is distinct; there are no duplicate tuples
The order of tuples has no significance, theoretically
13.
When is a DBMS Relational?
Foundational rules
Rule 0 : Foundational rule
Rule 12 : Nonsubversion rule
Structural rules
Rule 1 : Information representation
Rule 6 : View updating
Integrity rules
Rule 3 : Systematic treatment of null values
Rule 10 : Integrity independence
Data manipulation rules
Rule 2 : Guaranteed access
Rule 4 : Dynamic online catalog based on the
relational model
Rule 5 : Comprehensive data sublanguage
Rule 7 : High-level insert, update, delete
Data independence rules
Rule 8 : Physical data independence
Rule 9 : Logical data independence
Rule 11 : Distribution independence
14. Entity-Relationship Modeling
Concepts of the E-R Modeling
Entity Types
An object or concept that is identified by the enterprise as having an
independent existence
Attributes
A property of an entity or a relationship type
Relationship Types
A meaningful association among entity types
15.
Conceptual Database Design
The process of constructing a model of the
information used in an enterprise, independent of all
physical considerations
Logical Database Design
The process of constructing a model of the
information used in an enterprise based on a specific
data model, but independent of a particular DBMS and
other physical considerations.
Physical Database Design
The process of producing a description of the
implementation of the database on secondary storage;
it describes the storage structures and access
methods used to achieve efficient access to the data
16. Database Language
SQL
1974 - SEQUEL by D. Chamberlin (IBM)
1975 - SQUARE by Boyce (System R project)
1976 - SEQUEL/2 (SQL) by Chamberlin and Boyce
Late 1970s - SQL (Oracle), QUEL (Ingres)
1982 - Relational Database Language (RDL) : ANSI
1987 - ISO standard
1989 - Integrity Enhancement Feature (ISO)
1992 - SQL2(SQL92) : ISO
19. Part 2. Understanding Oracle Database
Overview of Oracle Database Architecture
Memory Structure
Process Structure
Storage Structure
New Features
20. Overview of Oracle Architecture
[Figure: Oracle architecture overview. Background processes: PMON, SMON, RECO, D000, S000, P000, DBW0, CKPT, LGWR, ARCH. SGA: Shared SQL Area, Database Buffer Cache, Redo Log Buffer. Storage: data files on raw devices; Archive Log Mode (50M). Annotations: total SGA size 1700 MByte; fixed size 70 KByte; variable size 490 MByte; TL-812, 4,000,000 KByte; Server, 1,200,000 KByte; USER, 2,100 KByte.]
21. Memory Structure : Shared Pool
[Figure: shared pool layout. Library cache: shared SQL area, PL/SQL procedures and packages, control structures (for example, locks and library cache handles). Dictionary cache. Control structures (for example, character set conversion memory, network security attributes). Reusable runtime memory.]
Shared Pool Contents
- Text of the SQL or PL/SQL statement
- Parsed form of the SQL or PL/SQL statement
- Execution plan for the SQL or PL/SQL
statements
- Data dictionary cache containing rows of data
dictionary information
Library Cache
- shared SQL area
- private SQL area
- PL/SQL procedures and packages
- control structures : locks and library cache handles
Dictionary Cache
- names of all tables and views in the database
- names and datatypes of columns in database tables
- privileges of all Oracle users
Sized by the SHARED_POOL_SIZE parameter
22. Memory Structure : Database Buffer Cache
Database Buffer Cache holds copies of data blocks read from disk
All users concurrently connected to the system share access to the buffer cache
Dirty List
LRU List
Size = DB_BLOCK_SIZE * DB_BLOCK_BUFFERS
[Figure: the SGA, containing the shared pool (shared SQL area) and the database buffer cache.]
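The size formula on this slide is a simple product, sketched below. The parameter values are hypothetical examples, not recommendations.

```python
def buffer_cache_bytes(db_block_size: int, db_block_buffers: int) -> int:
    """Buffer cache size in bytes: DB_BLOCK_SIZE * DB_BLOCK_BUFFERS."""
    return db_block_size * db_block_buffers


# Hypothetical settings: 8 KB blocks and 16384 buffers.
size = buffer_cache_bytes(db_block_size=8192, db_block_buffers=16384)
print(f"{size} bytes = {size // (1024 * 1024)} MB")  # 134217728 bytes = 128 MB
```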
23. Memory Structure : Redo Log Buffer
Circular buffer containing information about changes made to the database
Each change is saved as a redo entry
Redo entries are used during database recovery
LGWR writes the contents of the redo log buffer to the online redo log
Sized by the LOG_BUFFER parameter
[Figure: a redo record made up of several change vectors.]
25. Background Process
DBWR (Database Writer)
- Writes all dirty buffers to datafiles
- Uses an LRU algorithm to keep the most recently used blocks in memory
- Defers writes for I/O optimization
- Writes when:
The dirty list reaches a threshold length
A process scans a specified number of buffers in the LRU list without finding a free buffer
A time-out occurs
A checkpoint occurs
LGWR (Log Writer)
- Writes redo log entries to disk
- Writes when:
A commit occurs
The redo log buffer becomes one-third full
DBWR completes cleaning the buffer blocks at a checkpoint
An LGWR time-out occurs
- A commit confirmation is not issued until the transaction has been recorded in the redo
log file
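Two of the LGWR triggers listed above, commit and the one-third-full threshold, can be sketched as a toy model. The `RedoLogBuffer` class is hypothetical and greatly simplified: it ignores the time-out and checkpoint triggers and tracks only entry sizes, so it shows the rule rather than how Oracle implements it.

```python
class RedoLogBuffer:
    """Toy model of a redo log buffer flushed by a log writer."""

    def __init__(self, capacity: int):
        self.capacity = capacity  # buffer size in bytes
        self.pending = []         # sizes of redo entries still only in memory
        self.flushed = []         # stands in for the online redo log file

    def used(self) -> int:
        return sum(self.pending)

    def flush(self) -> None:
        # Move every pending entry to the "redo log file".
        self.flushed.extend(self.pending)
        self.pending.clear()

    def write(self, entry_size: int) -> None:
        self.pending.append(entry_size)
        # Trigger: the buffer has become at least one-third full.
        if self.used() * 3 >= self.capacity:
            self.flush()

    def commit(self) -> bool:
        # Trigger: commit. Confirmation is returned only after the flush.
        self.flush()
        return True


buf = RedoLogBuffer(capacity=300)
buf.write(40)
buf.write(40)            # 80 bytes pending: below one third of 300
buf.write(40)            # 120 bytes pending: threshold reached, flushed
buf.write(10)
confirmed = buf.commit() # flushes the remaining entry before confirming
```

Returning the confirmation only after `flush()` mirrors the last bullet on the slide: a commit is acknowledged only once its redo is in the log file.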
26. Cont’d
PMON (Process Monitor)
- Cleans up abnormally terminated connections
- Rolls back uncommitted transactions
- Releases locks held by a terminated process
- Frees SGA resources allocated to the failed processes
- Database maintenance
SMON (System Monitor)
- Performs automatic instance recovery
- Reclaims space used by temporary segments no longer in use
- Merges contiguous areas of free space in the datafiles
27. Cont’d
CKPT (Checkpoint)
- Is enabled by setting the parameter CHECKPOINT_PROCESS=TRUE
- If enabled, takes over LGWR's task of updating files at a checkpoint
- Updates the headers of datafiles and control files at the end of a checkpoint
- More frequent checkpoints reduce recovery time after instance failure
- CKPT improves the performance of databases with many datafiles
ARCH (Archiver)
- Copies redo log files to tape or disk for recovery from media failure
- Operates only when a log switch occurs
- Is optional and is needed only in ARCHIVELOG mode
- May write to a tape drive or to a disk
Other processes
- LCKn (Lock), Dnnn (Dispatcher), Snnn (Server), RECO (Recoverer), Pnnn (Parallel),
SNPn (Snapshot process / Job Queue), QMNn (Queue Monitor)
28. Server/User Process
User Processes
- A user process is used when a user runs an application program
- Runs the tool/application and is considered the client
- Passes SQL to the server process and receives the results
Server Processes
- A server process must place the data in the database buffer cache
- Parses and executes SQL statements
- Reads data blocks from disk into the shared database buffers of the SGA
- Returns the results of SQL statements to the user process
Parse : checks syntax, security access, object resolution, optimization
Execute : applies the parse tree to the data, performs physical reads and changes
Fetch : passes data to the user (SELECT only)
33. Cont’d
Objects stored in tablespaces
[Figure: a tablespace made up of one or more datafiles, holding tables and indexes.]
Database files : physical structures associated with only one tablespace
Objects : stored in a tablespace; may span several datafiles
34. Block
Header : general block information (block address, segment type), 85 ~ 100 bytes
Table Directory : table information, for tables in a cluster
Row Directory : row information for the rows in the block (2 bytes per row)
Free Space : used when a new row is inserted or an existing row is updated (PCTFREE, PCTUSED)
Row Data : table or index data
35. PCTFREE / PCTUSED
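[Figure: two blocks, one showing 20% free space (PCTFREE) and one showing 61% free space (PCTUSED).]
PCTFREE = 20
New rows are inserted until the block is 80% full
The remaining 20% is reserved for updates to existing rows
PCTUSED = 40
Once the block has filled, new rows can be inserted again only after usage falls below 40%
When usage is below 40% (for example, 61% free space), the block is listed in the FREELIST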
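The interplay of the two thresholds can be sketched as a toy model of a single block. The `Block` class is hypothetical and deliberately simplified: it tracks usage as whole percentages and ignores block overhead, so it illustrates the rule rather than Oracle's actual space management.

```python
class Block:
    """Toy model of one data block governed by PCTFREE and PCTUSED."""

    def __init__(self, pctfree: int = 20, pctused: int = 40):
        self.pctfree = pctfree
        self.pctused = pctused
        self.used_pct = 0        # percentage of the block holding row data
        self.on_freelist = True  # an empty block accepts inserts

    def insert(self, row_pct: int) -> bool:
        """Try to insert a row that takes row_pct percent of the block."""
        if not self.on_freelist:
            return False
        if self.used_pct + row_pct > 100 - self.pctfree:
            # The row would eat into the space reserved for updates.
            self.on_freelist = False
            return False
        self.used_pct += row_pct
        return True

    def delete(self, row_pct: int) -> None:
        """Remove row data; relist the block once usage drops below PCTUSED."""
        self.used_pct = max(0, self.used_pct - row_pct)
        if self.used_pct < self.pctused:
            self.on_freelist = True


block = Block(pctfree=20, pctused=40)
filled = [block.insert(10) for _ in range(8)]  # fills the block to 80%
rejected = block.insert(10)   # False: would exceed 80%, block leaves the freelist
block.delete(30)              # usage 50%: still above PCTUSED, still not listed
still_rejected = block.insert(10)
block.delete(20)              # usage 30%: below PCTUSED, back on the freelist
accepted = block.insert(10)
```

The gap between the two thresholds keeps a block from moving on and off the freelist every time a single row is deleted and reinserted.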
36. Extent
A set of contiguous database blocks within a datafile.
Extents are allocated when:
- The segment is created (INITIAL EXTENT)
- The segment grows (NEXT EXTENT)
- The table is altered to allocate extents
Extents are de-allocated when:
- The segment is dropped or truncated
- The segment is larger than optimal and contains free extents
(for rollback segments only)
Each segment is created with at least one extent, the initial extent
(rollback segments : 2)
ALTER TABLE table_name DEALLOCATE UNUSED
37. Segment
a set of one or more extents that contains all the data for a specific type of logical storage
structure within a tablespace
Data Segment
- A collection of extents that holds all of the data for a table or a cluster
Index Segment
- A collection of extents that holds all of the index data for search optimization on large tables
and clusters
Rollback Segment
- A collection of extents that holds rollback data for rollback, read-consistency, or recovery
Temporary segment
- A collection of extents that holds data belonging to temporary tables created during a sort
operation
Bootstrap segment
- An extent that contains dictionary definitions for dictionary tables to be loaded when the
database is opened.
38. Oracle Client/Server Architecture
[Figure: a client application connected over the network to Server A (client/server), which in turn connects to Server B (server/server).]
Benefits of the Client/Server Architecture
- Database software runs on the server
- Minimizes use of network resources
- Only the server needs upgrading as the database grows
- Minimizes client hardware requirements
- Concurrency, consistency, transparency
Editor's notes
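1. Greetings
2. Instructor introduction
3. Purpose of the lecture
1. Importance of the architecture
2. Difference between Oracle8(TM) Server and Oracle8(TM) Server Enterprise Edition
- Oracle8(TM) Server : the new name for Oracle Workgroup Server.
- Oracle8(TM) Server Enterprise Edition : the new name for Oracle7 Server.
"Enterprise" was added to the name to give a strong impression for
enterprise-scale applications such as OLTP and data warehousing.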
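1. What is Oracle?
2. Overview of Oracle
3. What the SGA is, its three components, and its role
4. Background processes (their roles and which ones exist)
5. Instance = SGA + Background Process
6. Server Process/User Process
7. The files that make up the database and a brief description of their roles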
1) Fixed size
This area holds general information that the background processes need to access. It contains no user data, and its size cannot be increased or decreased by a parameter. It therefore always has a constant size within an instance, although it differs slightly by version and OS.
2) Variable size
This size is the sum of the size given by SHARED_POOL_SIZE in the parameter file (initSID.ora) and the space taken by various other parameters. SHARED_POOL_SIZE is specified in bytes and must be smaller than the OS shared memory segment.
The initSID.ora file sets many instance-related parameters, and each one occupies part of the SGA according to its value. When sizing the SGA, therefore, consider not only SHARED_POOL_SIZE but also the space taken by the other parameters. In general, memory use grows in proportion as a parameter value is raised. Some examples:
*DB_FILES - about 6K consumed per increase of 10
*DML_LOCKS - 9.7K consumed per increase of 100
*PROCESSES - 19.5K consumed per increase of 10
*SEQUENCE_CACHE_ENTRIES - about 1.17K more per increase of 10
*ROW_CACHE_ENQUEUES - about 3.5K more per increase of 100
*SESSIONS - about 5.3K more per increase of 10
The amount of SGA memory currently occupied by each parameter can be queried from V$SGASTAT. (select * from v$sgastat;)
3) Database Buffer Cache
The area of the SGA where data read from disk is held; it has a large effect on performance.
If it is too small, the result is frequent disk I/O. Its size is set with DB_BLOCK_BUFFERS, which gives the number of buffers. The size in bytes is DB_BLOCK_BUFFERS * DB_BLOCK_SIZE.
4) Log Buffer
The size of the in-memory log buffer used for the redo log. It is set in bytes with LOG_BUFFER.
1. Characteristics of the SGA
- Stores data and control information.
- The SGA is non-paged, non-swapped memory.
- Size it at roughly one third of total memory.
2. Basic Oracle memory structures
- software code areas : where Oracle code that is running or can be run is stored
- the system global area (SGA)
the database buffer cache, the redo log buffer, the shared pool
- program global areas (PGA) : hold data and control information for a process
(server or background) (stack areas, data areas)
- sort areas (set by SORT_AREA_SIZE)
3. Composition and role of the shared pool
- Identical SQL statements (identified by a hashing algorithm)
1. Dirty List : holds dirty buffers, which have been modified but not yet written to disk
2. LRU List : made up of the following three kinds of buffer.
- Free Buffer : available for use
- Pinned Buffer : currently in use
- Dirty Buffer : modified by an update (a buffer not yet moved to the dirty list)
Dirty buffers are moved to the dirty list.
* The LRU list runs from the LRU end to the MRU end, and is scanned starting at the LRU end:
The scan looks for a free buffer.
A buffer that is used is moved to the MRU end.
A dirty buffer met during the scan is moved to the dirty list.
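The scan described in these notes can be sketched as follows. The buffer records and the `find_free_buffer` function are hypothetical; the sketch models the LRU list as a deque whose left end is the LRU end and whose right end is the MRU end, and it leaves out everything else a real buffer manager does.

```python
from collections import deque


def find_free_buffer(lru: deque, dirty_list: list):
    """Scan from the LRU end for a free buffer.

    Dirty buffers met on the way are moved to dirty_list. The free buffer
    that is found is marked pinned and moved to the MRU end.
    Returns the buffer, or None if no free buffer exists.
    """
    for buf in list(lru):  # iterate over a snapshot: lru is mutated below
        if buf["state"] == "dirty":
            lru.remove(buf)
            dirty_list.append(buf)   # to be written to disk later
        elif buf["state"] == "free":
            lru.remove(buf)
            buf["state"] = "pinned"  # now in use
            lru.append(buf)          # right end of the deque = MRU end
            return buf
        # Pinned buffers are in use, so the scan skips them.
    return None


# Hypothetical buffer states, listed from the LRU end to the MRU end.
lru = deque([
    {"id": 1, "state": "dirty"},
    {"id": 2, "state": "pinned"},
    {"id": 3, "state": "free"},
    {"id": 4, "state": "free"},
])
dirty_list = []
got = find_free_buffer(lru, dirty_list)
print(got["id"], [b["id"] for b in lru], [b["id"] for b in dirty_list])
# prints: 3 [2, 4, 3] [1]
```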
- Can be bypassed using the UNRECOVERABLE keyword in the CREATE TABLE and CREATE INDEX statements
- Can be bypassed by the Oracle data loader
- Used for recovery
- Size of the stored redo, in order: update, delete, insert
- If an instance failure occurs, the redo log files are used to recover the modified data that was in memory.
- These files are only used for recovery
1. Types of background processes
- PMON, SMON, DBWR, LGWR : mandatory processes
- If any one of these four dies, the instance dies too and must be restarted
- the other processes are optional
2. Server Process/User Process
3. Legend
- LCKn : Lock process (handles locking between instances in a parallel server system)
- RECO : Recoverer process (resolves failures of distributed transactions)
- PMON : Process monitor
- SMON : System monitor
- CKPT : Checkpoint
- ARCH : Archiver
- DBWR : Database writer
- LGWR : Log writer
- Pnnn : Parallel processes
- SNPn : Snapshot process (Replication)
See Background Process (p. 51)
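1. Properties of LGWR
- Each instance has exactly one LGWR.
- A commit completes only after the transaction has been recorded in the redo log file.
(Why? To guard against instance failure.)
- If another user commits after one user's commit but before LGWR has flushed,
the commits are piggybacked to reduce the I/O per commit. (Processing continues,
and is redone if an error occurs.)
- For a long transaction, the redo log buffer is flushed when it becomes one-third full.
2. Log Switch
- log switch : LGWR stops writing to the current redo log group and starts writing to the next
- A checkpoint is performed automatically at a log switch
- Occurs when the redo log is full, or when the DBA issues "ALTER SYSTEM SWITCH LOGFILE"
- Processing continues as long as at least one member of the group is usable.
3. Log Sequence #
* Clean-up as part of PMON's role
alter system kill session ' , '
=> the case where serial# increases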
0. When checkpoints occur
- every log switch
- LOG_CHECKPOINT_TIMEOUT
- LOG_CHECKPOINT_INTERVAL
- instance shutdown (not abort)
- ALTER SYSTEM CHECKPOINT (by DBA)
- tablespace offline while at least one of its files is online
1. RECO : resolves failures of distributed transactions
2. LCKn : handles locking between instances in a Parallel Server system
3. Pnnn (Parallel Query process) : parallel query, parallel index creation, parallel data loading, parallel CREATE TABLE AS SELECT
4. SNPn (Snapshot process) : automatic refreshes of snapshots (read-only replicated tables), the server job queues and replication queues
5. QMNn (Queue Monitor) : a process for Oracle Advanced Queuing that monitors message queues
<Options>
1. The parallel query option is licensed separately.
2. The procedural option is required for snapshots. It is included with the Oracle7 licensed product.
3. The distributed option is required for distributed transactions. It is licensed separately.
4. The replication option is licensed separately.
1. Single Task, Two-Task
- Single-Task : the user (application code) and the server (Oracle server code) run in a
single process. This gives an improvement of about 30% for batch-type jobs.
The executable has to be re-made.
1. Function and role of each file
- Datafile : stores all the data of the database. It is the physical storage for logical
structures such as tables and indexes.
- Redo Log Files : exist for recovery. Every change made to the database is
recorded, one record at a time.
There must be at least two redo log groups, with two or more members per group.
The members of a group have the same size and the same contents.
Every group has the same number of members.
- Control File : stores the physical structure of the database, the database status, and so on.
Database files, log files, database name, synchronization information
Keep at least two copies, on different disks.
- Parameter File : see later
- Alert File : chronological log of messages and errors
internal errors (ORA-600), block corruption errors (ORA-1578), deadlock
errors (ORA-60), DDL, Server Manager statements (startup, shutdown..)
BACKGROUND_DUMP_DEST
- Trace File : written when an error occurs, to BACKGROUND_DUMP_DEST or USER_DUMP_DEST
2. Re-create the control file in the following cases.
. The control file is damaged and no usable backup file exists
. The database name is being changed
. The maximum number of data files or log files is being raised
alter database backup controlfile to trace noresetlogs;
1. Physical storage structure
- Datafile : a physical datafile belonging to a single tablespace
- Segment : a set of one or more extents that contains all the data for a specific
structure within a tablespace
- Extents : a set of contiguous data blocks in a database
- Block : multiple physical file blocks allocated from an existing datafile
2. Logical storage structure
- Tablespace : a logical repository for physically grouped data
- Table / Clusters / Indexes
< Tablespace >
1. Each tablespace can consist of several OS files.
2. A tablespace can be brought online while the database is running.
3. Tablespaces other than SYSTEM and RBS can be taken offline.
4. A tablespace can be switched between read-write and read-only.
5. An object created in one tablespace is not allocated space in another tablespace.
<Tablespace Uses>
1. Controlling space allocation and assigning space quotas to users
2. Managing availability by taking individual tablespaces online or offline
3. Spreading data storage reduces I/O contention and improves efficiency.
4. Partial backup and partial recovery are possible.
5. Static data can be placed in a read-only tablespace.
< SYSTEM tablespace >
1. Required for database operation
2. Data dictionary information; definitions of stored procedures, packages, and triggers
3. SYSTEM rollback segment
4. Can contain user data
< Non-system Tablespace >
1. Makes database administration easier
2. Rollback segment, temporary segment, application data/index, user space
3. Rollback segment tablespace
4. Temporary Tablespace : does not store permanent data or indexes; managed by the DBA
( V$SORT_SEGMENT )
5. Resizing datafile (AUTOEXTEND option)
6. Read-only tablespace : online, no active tx, no active rbs, not online backup
1. Block structure in detail
- Header : general block information
- Table directory : information about the table
- Row directory : row information about the actual rows in the block (2 bytes per row)
- Free space : for inserts or updates of rows, or for additional transaction entries
- Row data : stores table or index data
2. Space Management Parameters
- PCTFREE : space left free for updates (default 10)
- PCTUSED : when usage falls below PCTUSED, the block is put on the freelist and new rows can be inserted (default 40)
- INITRANS : number of transaction entries initially allocated in the block header
(each transaction entry is 23 bytes)
- MAXTRANS : number of transactions that can access the block concurrently (up to 255)
Additional entries use the block's free space, so raise PCTFREE when there are many transactions
3. Chaining and Migration
- ANALYZE TABLE table_name LIST CHAINED ROWS;
- utlchain.sql (creates the CHAINED_ROWS table)
1. PCTFREE = { (max # of bytes per row) - (# of bytes inserted per row) } / (max # of bytes per row) * 100
2. Defaults : PCTFREE 10, PCTUSED 40
3. PCTFREE + PCTUSED < 100 (90 recommended)
4. Examples
(1) Setting a Lower PCTFREE
- Raises block utilization (fewer blocks are used)
- Processing cost rises because of block reorganization
- Row migration is more likely
(2) Setting a Higher PCTFREE
- Leaves more room for updates (more blocks are used)
- No block reorganization is needed, so processing cost is lower
- Reduces chained rows
(3) Setting a Lower PCTUSED
- Blocks are less likely to become free, which reduces processing cost
- Unused space increases
(4) Setting a Higher PCTUSED
- Blocks are more likely to become free, which raises processing cost
- Space usage improves
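The PCTFREE formula and the sum constraint from items 1 and 3 can be sketched as two small functions. The function names and the sample row sizes are hypothetical, chosen only to illustrate the arithmetic.

```python
def estimate_pctfree(max_row_bytes: int, inserted_row_bytes: int) -> float:
    """PCTFREE = (max bytes per row - bytes inserted per row) / max bytes per row * 100."""
    if max_row_bytes <= 0:
        raise ValueError("max_row_bytes must be positive")
    if not 0 <= inserted_row_bytes <= max_row_bytes:
        raise ValueError("inserted_row_bytes must be between 0 and max_row_bytes")
    # Multiply before dividing so whole-number cases stay exact.
    return (max_row_bytes - inserted_row_bytes) * 100 / max_row_bytes


def settings_are_valid(pctfree: float, pctused: float) -> bool:
    """The two settings must sum to less than 100."""
    return pctfree + pctused < 100


# Hypothetical table: rows are inserted at 160 bytes and can grow to 200 bytes.
pctfree = estimate_pctfree(max_row_bytes=200, inserted_row_bytes=160)
print(pctfree)                          # prints: 20.0
print(settings_are_valid(pctfree, 40))  # prints: True
```

A row expected to grow by a fifth of its maximum size suggests reserving about 20% of each block, which matches the PCTFREE = 20 example on the slide.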