=-=-=-==-=-Overview of the Talk-=-=-=-=-=
Introduction to the Subject
Database
Relational Database
Object-Relational Database
Database Management System
History
Programming
SQL,
Connecting Java, Matlab to a Database
Advanced DBMS
Data Grid
BigTable
Demo
Products
MySQL, SQLite, Oracle,
DB2, Microsoft Access,
Microsoft SQL Server
Products Comparison.
Tutorial On Database Management System
1. *This image is taken from the Microsoft website
By
P. Sathish Kumar
Senior Research Fellow, Defence Laboratory, Jodhpur
12th December 2008
2. Outline of the talk
• Introduction to the Subject
– Database
– Relational Database
– Object-Relational Database
– Database Management System
– History
• Programming
– SQL
– Connecting Java, Matlab to a Database
• Advanced DBMS
– Data Grid
– BigTable
– Demo
• Products
– MySQL, SQLite, Oracle, DB2, Microsoft Access, Microsoft SQL Server
– Products Comparison
3. Database
A database is a collection of data, typically
describing the activities of one or more related
organizations. For example, a university database
might contain information about the following:
Entities such as students, faculty, courses, and
classrooms.
Relationships between entities, such as students'
enrollment in courses, faculty teaching courses,
and the use of rooms for courses.
5. Object-Relational Database
In an object database (also object-oriented database),
information is represented in the form of objects as
used in object-oriented programming.
In computing, an object is a collection of state (data) and
behavior (processes).
CREATE TYPE t_address AS OBJECT (
  street VARCHAR2(15),
  city VARCHAR2(15),
  state CHAR(2),
  zip VARCHAR2(5)
);
CREATE TYPE t_person AS OBJECT (
  id INTEGER,
  first_name VARCHAR2(10),
  last_name VARCHAR2(10),
  dob DATE,
  phone VARCHAR2(12),
  address t_address
);
6. Database Management System
A database management system, or DBMS, is software
designed to assist in maintaining and utilizing large
collections of data. The need for such systems, as well
as their use, is growing rapidly.
The alternative to using a DBMS is to store the data in
files and write application-specific code to manage it.
7. History
Early 1960s – Charles Bachman created the first general-purpose DBMS at General
Electric.
Late 1960s – IBM created the Information Management System (IMS), which is
still in use today.
1970 – Edgar Codd, at IBM's San Jose Research Laboratory, proposed a new data
representation framework called the relational data model.
1973 – Bachman received ACM's Turing Award (the computer science equivalent
of a Nobel Prize) for work in the database area.
1977 – Software Development Laboratories, the precursor to Oracle, was founded
by Larry Ellison, Bob Miner, and Ed Oates.
1978 – Oracle Version 1, written in assembly language, ran on the PDP-11 under RSX
in 128K of memory.
1980 – By this time Structured Query Language (SQL) for relational databases, built on
Dr. E.F. Codd's relational model, had been developed as part of IBM's System R project.
1990 …. – Google created BigTable for its web indexing application.
9. SQL
Structured Query Language (SQL) is the standard
language designed to access relational databases.
SQL uses a simple syntax that is easy to learn and use.
There are five types of SQL statements, outlined in the
following list:
1. Query statements retrieve rows stored in database
tables.
SELECT retrieves rows.
2. Data Manipulation Language (DML) statements
modify the contents of tables. There are three DML statements:
INSERT adds rows to a table.
UPDATE changes rows.
DELETE removes rows.
10. SQL
3. Data Definition Language (DDL) statements
define the data structures, such as tables, that make
up a database. There are five basic types of DDL
statements:
CREATE creates a database structure.
ALTER modifies a database structure.
DROP removes a database structure.
RENAME changes the name of a table.
TRUNCATE deletes all the rows from a table.
11. SQL
4. Transaction Control (TC) statements either
permanently record any changes made to rows, or
undo those changes. There are three TC statements:
COMMIT permanently records changes made to rows.
ROLLBACK undoes changes made to rows.
SAVEPOINT sets a "save point" to which you can roll back
changes.
5. Data Control Language (DCL) statements change
the permissions on database structures. There are two
DCL statements:
GRANT gives another user access to your database structures.
REVOKE prevents another user from accessing your database
structures.
12. Connecting Java to a Database
import java.sql.*;

public class JDBCMain {
    public static void main(String[] args) {
        try {
            // Explicit registration, as on the original slide; JDBC 4.0+ drivers
            // found on the classpath register themselves automatically.
            DriverManager.registerDriver(new oracle.jdbc.driver.OracleDriver());
            String url = "jdbc:oracle:thin:@localhost:1521:ORCL";
            String sql = "select * from emp";
            // try-with-resources closes the result set, statement, and connection,
            // even when an exception is thrown.
            try (Connection conn = DriverManager.getConnection(url, "scott", "tiger");
                 Statement stat = conn.createStatement();
                 ResultSet rs = stat.executeQuery(sql)) {
                ResultSetMetaData rsm = rs.getMetaData();
                int colCount = rsm.getColumnCount();
                // Print the column names as a tab-separated header row.
                for (int i = 1; i <= colCount; i++)
                    System.out.print(rsm.getColumnName(i) + "\t");
                System.out.println();
                // Print every row, one tab-separated line per row.
                while (rs.next()) {
                    for (int i = 1; i <= colCount; ++i)
                        System.out.print(rs.getString(i) + "\t");
                    System.out.println();
                }
            }
        } catch (SQLException sqlE) {
            System.out.println(sqlE.getMessage());
        }
    }
}
13. Connecting a Database to Matlab
In Matlab, database connections are made through the
Database Toolbox.
% 'oracleODBC' is the data source name, which is set using the
% Database Toolbox
ds = 'oracleODBC';
sqlquery = 'select * from emp';
conn = database(ds, 'scott', 'tiger');
data = fetch(conn, sqlquery);
if isempty(data)
    errordlg('No rows were returned by the query')
    close(conn);
    return
end
close(conn);
15. Data Grid
A data grid is a grid computing system that deals with
data: the controlled sharing and management of
large amounts of distributed data. Data grids are often, but
not always, combined with computational grid
computing systems.
Data Services
Single System Image
Data resides on the node
where the process runs.
Less data movement
More RPC calls
16. Types of Data Grid
Replicated Topology
Partitioned Topology
Near Topology
17. Replicated Topology
Advantages:
Extreme performance
Data is replicated to all members of the data grid
Low-latency access: data is available for use without
any waiting
Disadvantages:
Cost of data entry
Cost of data update
No scalability
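The trade-off on this slide can be sketched in a few lines of Java. This is only an illustration, not how any particular data grid product works: the class `ReplicatedGrid` and its methods are hypothetical names, and plain in-memory maps stand in for grid members. Every write is copied to every member, so reads are local while the cost of an update grows with the number of members.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: every member of the grid holds a full copy of the data. */
public class ReplicatedGrid {
    private final List<Map<String, String>> members = new ArrayList<>();
    private long copiesWritten = 0;

    public ReplicatedGrid(int memberCount) {
        for (int i = 0; i < memberCount; i++) {
            members.add(new HashMap<>());
        }
    }

    /** A write must reach every member: this is the cost of data entry and update. */
    public void put(String key, String value) {
        for (Map<String, String> member : members) {
            member.put(key, value);
            copiesWritten++;
        }
    }

    /** A read is served from the local copy of whichever member asks. */
    public String getFrom(int member, String key) {
        return members.get(member).get(key);
    }

    /** Total number of copies written so far, across all members. */
    public long copiesWritten() {
        return copiesWritten;
    }

    public static void main(String[] args) {
        ReplicatedGrid grid = new ReplicatedGrid(3);
        grid.put("emp-1", "SMITH");
        System.out.println(grid.getFrom(2, "emp-1"));
        System.out.println("copies written: " + grid.copiesWritten());
    }
}
```

Adding members makes reads no slower, but each `put` does one more copy, which is why the slide lists update cost and scalability as the weak points.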
18. Partitioned Topology
Transparently partitions the data to distribute the load
across all grid nodes
Advantages:
Extreme scalability
Load balancing
Ownership
Point to point
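A minimal Java sketch of partitioning, assuming a simple hash-modulo placement rule (real data grids typically use more elaborate schemes with backups and rebalancing). The class `PartitionedGrid` and its methods are hypothetical names, and in-memory maps stand in for grid nodes. Each key has exactly one owning node, so a request goes point to point to that owner.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical sketch: each key is owned by exactly one node of the grid. */
public class PartitionedGrid {
    private final List<Map<String, String>> nodes = new ArrayList<>();

    public PartitionedGrid(int nodeCount) {
        for (int i = 0; i < nodeCount; i++) {
            nodes.add(new HashMap<>());
        }
    }

    /** Maps a key to the index of its owning node (assumed rule: hash modulo node count). */
    public int ownerOf(String key) {
        return Math.floorMod(key.hashCode(), nodes.size());
    }

    /** Writes go only to the owning node. */
    public void put(String key, String value) {
        nodes.get(ownerOf(key)).put(key, value);
    }

    /** Reads go only to the owning node. */
    public String get(String key) {
        return nodes.get(ownerOf(key)).get(key);
    }

    /** Number of entries held by one node, to inspect how the load is spread. */
    public int sizeOfNode(int index) {
        return nodes.get(index).size();
    }

    public static void main(String[] args) {
        PartitionedGrid grid = new PartitionedGrid(4);
        for (int i = 0; i < 1000; i++) {
            grid.put("emp-" + i, "row " + i);
        }
        for (int n = 0; n < 4; n++) {
            System.out.println("node " + n + " holds " + grid.sizeOfNode(n) + " entries");
        }
        System.out.println(grid.get("emp-42"));
    }
}
```

Unlike the replicated case, a write touches one node only, so capacity and throughput can grow as nodes are added.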
19. Near Topology
Local in-memory cache in front of the entire data set
provided by the data grid
Advantages:
Extreme performance
Extreme scalability
Low-latency access
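The idea of a near cache can be sketched as a small local map placed in front of the grid. This is an illustrative assumption, not a product API: `NearCache` is a hypothetical class, a plain `Map` stands in for the remote data grid, and least-recently-used eviction is one possible policy among several.

```java
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.Map;

/** Hypothetical sketch: a small local cache in front of a (simulated) data grid. */
public class NearCache {
    private final Map<String, String> backingGrid; // stands in for the remote grid
    private final Map<String, String> local;       // in-process cache with LRU eviction
    private int remoteReads = 0;

    public NearCache(Map<String, String> backingGrid, final int capacity) {
        this.backingGrid = backingGrid;
        // An access-ordered LinkedHashMap drops the least recently used entry
        // once the local cache grows beyond its capacity.
        this.local = new LinkedHashMap<String, String>(16, 0.75f, true) {
            @Override
            protected boolean removeEldestEntry(Map.Entry<String, String> eldest) {
                return size() > capacity;
            }
        };
    }

    /** Serves from the local cache when possible, otherwise reads through to the grid. */
    public String get(String key) {
        String value = local.get(key);
        if (value == null) {
            remoteReads++;
            value = backingGrid.get(key);
            if (value != null) {
                local.put(key, value);
            }
        }
        return value;
    }

    /** Writes go to the grid and refresh the local copy. */
    public void put(String key, String value) {
        backingGrid.put(key, value);
        local.put(key, value);
    }

    /** How many reads had to go to the grid. */
    public int remoteReads() {
        return remoteReads;
    }

    public static void main(String[] args) {
        Map<String, String> grid = new HashMap<>();
        grid.put("emp-1", "SMITH");
        NearCache cache = new NearCache(grid, 100);
        cache.get("emp-1"); // first read goes to the grid
        cache.get("emp-1"); // second read is served locally
        System.out.println("remote reads: " + cache.remoteReads());
    }
}
```

The sketch ignores invalidation: if another member updates the grid, the local copy can be stale until it is evicted or rewritten.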
21. Google’s BigTable
Bigtable is a distributed storage system for managing
structured data that is designed to scale to a very large
size: petabytes of data across thousands of commodity
servers.
Google's Bigtable is implemented in C++.
Bigtable users:
Google Reader, Google Maps, Google Book Search, My
Search History, Google Earth, Blogger.com, Google Code
hosting, Orkut, and YouTube
22. Data Model
• Doesn't support a full relational data model
• Multi-dimensional sorted map
• Indexed by (row:string(64), column:string, time:int64) -> string
The row range for a table is dynamically partitioned. Each row range is
called a tablet.
Row key        Time   "content"   "anchor:cnnsi.com"   "anchor:my.look.ca"
"com.cnn.www"  t9                 "CNN"
               t8                                      "CNN.com"
               t6     "<html>…"
               t5     "<html>…"
               t1     "<html>…"
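The indexing scheme on this slide, (row, column, time) -> string, can be approximated with nested sorted maps. The sketch below is an assumption-laden illustration and not Bigtable's implementation: `SortedCellMap` and its method names are hypothetical, everything lives in memory, and the integer timestamps 1 to 9 stand in for t1 to t9.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

/** Hypothetical sketch of a sorted map indexed by (row, column, timestamp). */
public class SortedCellMap {
    // row key -> column -> timestamp (newest first) -> value
    private final TreeMap<String, TreeMap<String, TreeMap<Long, String>>> rows = new TreeMap<>();

    public void put(String row, String column, long timestamp, String value) {
        rows.computeIfAbsent(row, r -> new TreeMap<String, TreeMap<Long, String>>())
            .computeIfAbsent(column, c -> new TreeMap<Long, String>(Comparator.<Long>reverseOrder()))
            .put(timestamp, value);
    }

    /** Most recent version of a cell, or null if the cell does not exist. */
    public String get(String row, String column) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        return versions == null ? null : versions.firstEntry().getValue();
    }

    /** Newest version written at or before the given timestamp, or null. */
    public String getAsOf(String row, String column, long timestamp) {
        TreeMap<Long, String> versions = versionsOf(row, column);
        if (versions == null) {
            return null;
        }
        // Keys are sorted newest first, so the ceiling of t is the newest version <= t.
        Map.Entry<Long, String> entry = versions.ceilingEntry(timestamp);
        return entry == null ? null : entry.getValue();
    }

    /** Row keys in [startInclusive, endExclusive), in sorted order: one contiguous row range. */
    public List<String> rowRange(String startInclusive, String endExclusive) {
        return new ArrayList<>(rows.subMap(startInclusive, endExclusive).keySet());
    }

    private TreeMap<Long, String> versionsOf(String row, String column) {
        TreeMap<String, TreeMap<Long, String>> columns = rows.get(row);
        return columns == null ? null : columns.get(column);
    }

    public static void main(String[] args) {
        SortedCellMap table = new SortedCellMap();
        table.put("com.cnn.www", "anchor:cnnsi.com", 9L, "CNN");
        table.put("com.cnn.www", "anchor:my.look.ca", 8L, "CNN.com");
        table.put("com.cnn.www", "content", 6L, "<html>...");
        table.put("com.cnn.www", "content", 5L, "<html>...");
        System.out.println(table.get("com.cnn.www", "anchor:cnnsi.com"));
        System.out.println(table.rowRange("com", "d"));
    }
}
```

Because row keys are kept sorted, a contiguous range of rows such as the one returned by `rowRange` is the natural unit to hand to one server, which is the role the slide gives to a tablet.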
23. Tablet Location
Uses a three-level hierarchy analogous to that of a B+ tree
- A location is the ip:port of the relevant server
- 1st level: Bootstrapped from the lock server; points to the location of the root tablet
- 2nd level: Uses META0 data to find the owner of the appropriate META1 tablet
- 3rd level: The META1 table holds the locations of the tablets of all other tables
24. Use of BigTable
As of August 2006,
• 388 non-test Bigtable clusters
• 24,500 tablet servers. (Many are used for development purposes)
• 14 busy clusters with 8069 total tablet servers
• 1.2 million requests per second
• Incoming RPC traffic of about 741 MB/s
• Outgoing RPC traffic of about 16 GB/s
25. Reference:
Bigtable: A Distributed Storage System for Structured Data. Fay
Chang, Jeffrey Dean, Sanjay Ghemawat, Wilson C. Hsieh, Deborah
A. Wallach, Mike Burrows, Tushar Chandra, Andrew Fikes, Robert
E. Gruber
27. HBase
HBase is a Java implementation of Bigtable.
It is an open source project under Apache.
It is an open implementation that allows us to use a
Bigtable-style system.
http://hadoop.apache.org/hbase/
28. MySQL: The World's Most Popular Open Source Database
Latest release 5.1.30 (27 November 2008)
Written in C, C++
New MySQL Enterprise with Query Analyzer improves
database application performance.
Part of almost all Linux distros
Part of LAMP
L -> Linux
A -> Apache HTTP Server
M -> MySQL
P -> PHP (programming)
30. SQLite is an ACID (Atomicity, Consistency, Isolation,
Durability) compliant relational database
management system contained in a relatively small
(~500 kB) C programming library.
Support for a large number of languages:
BASIC, C, C++, Common Lisp, Java, C#, Delphi, Curl, Lua, Tcl,
R, PHP, Perl, Ruby, Objective-C (on Mac OS X), Python,
newLisp, JavaScript, VBScript and Smalltalk
Used in places where you least expect it:
Firefox
Embedded systems / cell phones
Google Gears
32. Oracle: latest release 11g (11 July 2007)
Written in C
Data Grid support by Oracle
Features
Advanced Security (adds data encryption methods)
Data Mining (ODM) (mines for patterns in existing data)
Real Application Clusters (RAC) (coordinates multiple processors)
Oracle Real Application Testing (new at version 11g), including Database Replay (for
testing workloads) and SQL Performance Analyzer (SPA) (for preserving SQL efficiency in
changing environments)
Oracle Spatial (integrates relational data with geographic information systems (GIS))
Total Recall (optimizes long-term storage of historical data)
Oracle Warehouse Builder (in various forms and sub-options)
34. DB2: latest release 9.5
Written in C, C++
Used on mainframe systems such as z/OS and Linux on
zSeries, and on OS/2
Excellent support for XML and XQuery
DB2 has APIs for
.NET CLI, Java, Python, Perl, PHP, Ruby, C++, C, REXX, PL/I,
COBOL, RPG, FORTRAN, and many other programming
languages.
36. Microsoft Office Access is a relational database
management system from Microsoft that combines the
relational Microsoft Jet Database Engine with a
graphical user interface and software development
tools. It is a member of the 2007 Microsoft Office
system.
Latest release 12.0.6211.1000 (2007 SP1) / December 11, 2007
ActiveX support
Languages: VB, VC++, C#, C, C++
38. Microsoft SQL Server: latest release SQL Server 2008 (6 August 2008)
Its primary query languages are MS-SQL and T-SQL.
40. Products Comparison
Product | Max DB size | Max table size | Max row size | Max columns per row | Max Blob/Clob size | Max CHAR size | Max NUMBER size | Min DATE value | Max DATE value
--------|-------------|----------------|--------------|---------------------|--------------------|---------------|-----------------|----------------|---------------
DB2 | 512 TB (512 TiB) | 512 TB | 32,677 B (32 KiB) | 1012 | 2 GB | 32 KB | 64 bits | 0001 | 9999
Microsoft Access | 2 GB | 2 GB | 16 MB | 255 | 64 KB (memo field), 1 GB ("OLE Object" field) | 255 B (text field) | 32 bits | ? | ?
Microsoft SQL Server (does not include 2008) | 524,258 TB (32,767 files * 16 TB max file size) | 524,258 TB | Unlimited | 1024 | 2 GB | 8000 B | 64 bits | 1753 | 9999
MySQL 5 | Unlimited | 2 GB (Win32 FAT32) to 16 TB (Solaris) | 64 KB | 3398 | 4 GB (longtext, longblob) | 64 KB (text) | 64 bits | 1000 | 9999
Oracle | Unlimited (4 GB * block size per tablespace) | 4 GB * block size (with BIGFILE tablespace) | Unlimited | 1000 | 4 GB (or max datafile size for platform) | 4000 B | 126 bits | -4712 | 9999
SQLite | 32 TB (2^30 pages * 32 KB max page size) | ? | ? | 2000 | 1 GB | 1 GB | 64 bits | No DATE type | No DATE type