This document discusses Python database programming. It introduces databases and how they store data in tables connected through columns, and SQL for creating, accessing, and manipulating database data. It then describes how Python supports various databases and database operations, covers the Python DB-API as a standard interface for database programming, and provides examples of connecting to a database, executing queries, and retrieving and inserting data.
2. Introduction
From a construction firm to a stock exchange, every organisation depends on large
databases. These are essentially collections of tables, connected with each
other through columns.
These database systems support SQL, the Structured Query Language, which is
used to create, access and manipulate the data.
SQL is used to access data, and also to create and exploit the relationships between
the stored data.
Additionally, these databases support database normalisation rules for avoiding
redundancy of data.
The Python programming language has powerful features for database
programming.
Python supports various databases like MySQL, Oracle, Sybase, PostgreSQL, etc.
Python also supports Data Definition Language (DDL), Data Manipulation Language
(DML) and Data Query Statements.
For database programming, the Python DB API is a widely used module that
provides a database application programming interface.
3. Benefits of Python for database
programming
There are many good reasons to use Python for
programming database applications:
Programming in Python is arguably more efficient and faster
than in many other languages.
Python is famous for its portability.
It is platform independent.
Python supports SQL cursors.
In many programming languages, the application developer needs
to take care of opening and closing database connections
to avoid exceptions and errors. In Python, these
connections can be managed for you.
Python supports relational database systems.
Python database APIs are compatible with various databases, so it
is very easy to migrate and port database application interfaces.
4. DB-API (SQL-API) for Python
Python DB-API is independent of any database engine, which
enables you to write Python scripts to access any
database engine.
The Python DB API implementation for MySQL is MySQLdb.
For PostgreSQL, it supports psycopg, PyGresQL and pyPgSQL
modules.
DB-API implementations for Oracle are DCOracle2 and
cx_Oracle.
PyDB2 is the DB-API implementation for DB2.
Python’s DB-API consists of connection objects, cursor objects,
standard exceptions and some other module contents.
5. The DB API provides a minimal standard for working
with databases using Python structures and syntax
wherever possible. This API includes the following:
Importing the API module.
Acquiring a connection with the database.
Issuing SQL statements and stored procedures.
Closing the connection
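The four steps above can be sketched with the sqlite3 module from the standard library, which follows the same DB-API conventions; an in-memory database is used here so nothing needs to be installed or cleaned up:

```python
import sqlite3                             # 1. import the API module

db = sqlite3.connect(':memory:')           # 2. acquire a connection
cursor = db.cursor()
cursor.execute("SELECT sqlite_version()")  # 3. issue an SQL statement
version = cursor.fetchone()[0]
print("SQLite version:", version)
db.close()                                 # 4. close the connection
```

The same shape applies to MySQLdb or psycopg2; only the connect() arguments and the placeholder style differ between drivers.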
6. Functions and attributes
connect(parameters...): Constructor for creating a
connection to the database.
Returns a Connection object.
Parameters are the same as for the MySQL C API.
In addition, there are a few additional keywords that
correspond to what you would pass mysql_options() before
connecting.
Note that some parameters must be specified as keyword
arguments!
The default value for each parameter is NULL or zero, as
appropriate. The important parameters are:
7. host: name of host to connect to. Default: use the local host via a
UNIX socket (where applicable).
user: user to authenticate as. Default: current effective user.
passwd: password to authenticate with. Default: no password.
db: database to use. Default: no default database.
port: TCP port of MySQL server. Default: standard port (3306).
unix_socket: location of UNIX socket. Default: use default
location, or TCP for remote hosts.
conv: type conversion dictionary. Default: a copy of
MySQLdb.converters.conversions.
compress: enable protocol compression. Default: no
compression.
8. connect_timeout: Abort if connect is not completed within the given number of
seconds. Default: no timeout.
named_pipe: Use a named pipe (Windows). Default: don't.
init_command: Initial command to issue to the server upon connection. Default:
nothing.
read_default_file: MySQL configuration file to read; see the MySQL
documentation for mysql_options().
read_default_group: Default group to read; see the MySQL documentation for
mysql_options().
cursorclass: cursor class that cursor() uses, unless overridden. Default:
MySQLdb.cursors.Cursor. This must be a keyword parameter.
unicode: If set, CHAR and VARCHAR columns are returned as Unicode strings,
using the specified character set. None means to use a default encoding.
9. Connection Objects
Connection objects are returned by the connect() function.
commit(): If the database and the tables support transactions,
this commits the current transaction; otherwise this method
successfully does nothing.
rollback(): If the database and tables support transactions, this
rolls back (cancels) the current transaction; otherwise a
NotSupportedError is raised.
Compatibility note: the older MySQL module defines this
method, which successfully does nothing. This is dangerous
behavior, as a successful rollback indicates that the current
transaction was backed out, which is not true, and fails to notify
the programmer that the database now needs to be cleaned up
by other means.
10. cursor([cursorclass]): MySQL does not support cursors;
however, cursors are easily emulated.
You can supply an alternative cursor class as an optional
parameter.
If this is not present, it defaults to the value given when creating
the connection object, or the standard Cursor class. Also see the
additional supplied cursor classes in the usage section.
begin(): Explicitly start a transaction.
Normally you do not need to use this: executing a query implicitly
starts a new transaction if one is not in progress.
If AUTOCOMMIT is on, you can use begin() to temporarily turn
it off. AUTOCOMMIT will resume after the next commit() or
rollback().
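The commit()/rollback() behaviour described above can be demonstrated with the standard-library sqlite3 module (the table and values are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE accounts (name TEXT, balance INTEGER)")
cursor.execute("INSERT INTO accounts VALUES ('alice', 100)")
db.commit()                    # make the INSERT permanent

cursor.execute("UPDATE accounts SET balance = 0")
db.rollback()                  # cancel the uncommitted UPDATE

cursor.execute("SELECT balance FROM accounts WHERE name = 'alice'")
balance = cursor.fetchone()[0]
print(balance)                 # 100 -- the UPDATE was rolled back
db.close()
```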
11. Cursor Objects
close(): Closes the cursor. Future operations raise ProgrammingError. If you are
using server-side cursors, it is very important to close the cursor when you are
done with it and before creating a new one.
insert_id(): Returns the last AUTO_INCREMENT field value inserted into the
database. (Non-standard)
info(): Returns some information about the last query. Normally you don't need
to check this. With the default cursor, any MySQL warnings cause Warning to be
raised. If you are using a cursor class without warnings, then you might want to use
this. See the MySQL docs for mysql_info(). (Non-standard)
setinputsizes(): Does nothing, successfully.
setoutputsizes(): Does nothing, successfully.
nextset(): Advances the cursor to the next result set, discarding the remaining
rows in the current result set. If there are no additional result sets, it returns
None; otherwise it returns a true value.
12. Cursor Attributes
Cursor.description details regarding the result columns
This read-only attribute is a sequence of 7-item named tuples.
Each of these named tuples contains information describing one result column:
name
type_code
display_size
internal_size
precision
scale
null_ok
The values for precision and scale are only set for numeric types. The values for
display_size and null_ok are always None.
This attribute will be None for operations that do not return rows or if the
cursor has not had an operation invoked via the Cursor.execute() or
Cursor.executemany() method yet.
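A small sqlite3 sketch of reading Cursor.description; note that sqlite3 fills every field after the column name with None, while other drivers (such as the PostgreSQL ones described above) supply more of the seven items:

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE employees (name TEXT, age INTEGER)")
cursor.execute("SELECT name, age FROM employees")

# Each description entry is a 7-item sequence; item 0 is the column name.
columns = [entry[0] for entry in cursor.description]
print(columns)    # ['name', 'age']
db.close()
```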
13. rowcount – number of rows of the
result
Cursor.rowcount
This read-only attribute specifies the number of rows
that the last Cursor.execute() or Cursor.executemany()
call produced (for DQL statements like SELECT) or
affected (for DML statements like UPDATE or INSERT).
It is also set by the Cursor.copy_from() and
Cursor.copy_to() methods.
The attribute is -1 in case no such method call has been
performed on the cursor or the rowcount of the last
operation cannot be determined by the interface.
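The rowcount behaviour can be observed with sqlite3 (table invented for illustration); for DML it reports the affected rows, and because sqlite3 cannot know a SELECT's row count before the rows are fetched, rowcount stays -1 there, as the last paragraph describes:

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE t (x INTEGER)")
cursor.executemany("INSERT INTO t VALUES (?)", [(1,), (2,), (3,)])
inserted = cursor.rowcount    # 3 rows affected by the INSERTs
print(inserted)

cursor.execute("SELECT * FROM t")
print(cursor.rowcount)        # -1; server-based drivers may differ
db.close()
```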
14. Methods of Cursor
close – close the cursor
Close the cursor now (rather than whenever it is
deleted)
Return type: None
The cursor will be unusable from this point forward; an
Error (or subclass) exception will be raised if any
operation is attempted with the cursor.
15. execute – execute a database
operation
Cursor.execute(operation[, parameters]) Prepare and
execute a database operation (query or command)
Parameters: operation (str) – the database operation
parameters – a sequence or mapping of
parameters
Returns: the cursor, so you can chain commands
16. executemany – execute many similar
database operations
Cursor.executemany(operation[, seq_of_parameters])
Prepare and execute many similar database operations
(queries or commands)
Parameters: operation (str) – the database operation
seq_of_parameters – a sequence or
mapping of parameter tuples or mappings
Returns: the cursor, so you can chain commands
Prepare a database operation (query or command) and
then execute it against all parameter tuples or mappings
found in the sequence seq_of_parameters.
18. callproc – Call a stored procedure
Cursor.callproc(self, procname, [parameters]): Call a stored
database procedure with the given name
Parameters: procname (str) – the name of the database function
parameters – a sequence of parameters (can be empty
or omitted)
This method calls a stored procedure (function) in the PostgreSQL
database.
The sequence of parameters must contain one entry for each input
argument that the function expects.
The result of the call is the same as this input sequence; replacement
of output and input/output parameters in the return value is currently
not supported.
The function may also provide a result set as output. These can be
requested through the standard fetch methods of the cursor.
19. fetchone – fetch next row of the
query result
Cursor.fetchone() Fetch the next row of a query result set
Returns: the next row of the query result set
Return type: named tuple or None Fetch the next row of a
query result set, returning a single named tuple, or None
when no more data is available.
The field names of the named tuple are the same as the
column names of the database query as long as they are valid
Python identifiers.
An Error (or subclass) exception is raised if the previous call
to Cursor.execute() or Cursor.executemany() did not
produce any result set or no call was issued yet.
20. # Using a while loop
cursor.execute("SELECT * FROM employees")
row = cursor.fetchone()
while row is not None:
    print(row)
    row = cursor.fetchone()

# Using the cursor as an iterator
cursor.execute("SELECT * FROM employees")
for row in cursor:
    print(row)
21. fetchmany – fetch next set of rows of
the query result
Cursor.fetchmany([size=None][, keep=False]) Fetch the
next set of rows of a query result
Parameters: size (int or None) – the number of rows to be
fetched
keep – if set to true, will keep the passed arraysize
Type of keep: bool
Returns: the next set of rows of the query result
Return type: list of named tuples
Fetch the next set of rows of a query result, returning a list of
named tuples. An empty sequence is returned when no more
rows are available. The field names of the named tuple are the
same as the column names of the database query as long as they
are valid Python identifiers.
22. fetchall – fetch all rows of the query
result
Cursor.fetchall() Fetch all (remaining) rows of a query
result
Returns: the list of all (remaining) rows of the query result
Return type: list of named tuples
Fetch all (remaining) rows of a query result, returning
them as a list of named tuples.
The field names of the named tuple are the same as the
column names of the database query as long as they are
valid Python identifiers.
Note that the cursor’s arraysize attribute can affect the
performance of this operation.
23. cursor.execute("SELECT * FROM employees ORDER BY emp_no")
head_rows = cursor.fetchmany(size=2)
remaining_rows = cursor.fetchall()
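The same fetchmany()/fetchall() pattern, made self-contained with sqlite3 (the table and rows are invented for illustration):

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE employees (emp_no INTEGER, name TEXT)")
cursor.executemany("INSERT INTO employees VALUES (?, ?)",
                   [(1, 'a'), (2, 'b'), (3, 'c'), (4, 'd')])
cursor.execute("SELECT * FROM employees ORDER BY emp_no")
head_rows = cursor.fetchmany(size=2)    # the first two rows
remaining_rows = cursor.fetchall()      # the other two
print(len(head_rows), len(remaining_rows))    # 2 2
db.close()
```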
24. arraysize - the number of rows to
fetch at a time
Cursor.arraysize The number of rows to fetch at a time
This read/write attribute specifies the number of rows to
fetch at a time with Cursor.fetchmany().
It defaults to 1, meaning to fetch a single row at a time.
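A short sqlite3 sketch of the arraysize attribute: with the default of 1 fetchmany() returns one row, and after raising arraysize it returns up to that many rows per call:

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE t (x INTEGER)")
cursor.executemany("INSERT INTO t VALUES (?)", [(i,) for i in range(5)])

cursor.execute("SELECT x FROM t")
first_batch = cursor.fetchmany()    # arraysize defaults to 1
cursor.arraysize = 3
next_batch = cursor.fetchmany()     # now up to 3 rows per call
print(len(first_batch), len(next_batch))    # 1 3
db.close()
```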
25. Example
import MySQLdb
# Open database connection
db = MySQLdb.connect(host="localhost", port=3306, user="root",
                     passwd="", db="test")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# execute SQL query using execute() method.
cursor.execute("SELECT VERSION()")
# Fetch a single row using fetchone() method.
data = cursor.fetchone()
print("Database version : %s " % data)
# disconnect from server
db.close()
26. If a connection is established with the data source, a
Connection object is returned and saved in db for
further use; otherwise the connect() call raises an exception.
Next, the db object is used to create a cursor object, which
in turn is used to execute SQL queries.
Finally, before exiting, the script ensures that the database
connection is closed and resources are released.
27. Creating Database Table
Once a database connection is established, we are ready to create tables
or records in the database tables using the execute() method of the created
cursor.
28. import MySQLdb
# Open database connection
db = MySQLdb.connect(host="localhost", port=3306, user="root",
                     passwd="", db="test")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# Drop the table if it already exists, using execute() method.
cursor.execute("DROP TABLE IF EXISTS EMPLOYEE")
# Create table as per requirement
sql = """CREATE TABLE EMPLOYEE (
         FIRST_NAME CHAR(20) NOT NULL,
         LAST_NAME CHAR(20),
         AGE INT,
         SEX CHAR(1),
         INCOME FLOAT )"""
cursor.execute(sql)
# disconnect from server
db.close()
29. INSERT Operation
import MySQLdb
# Open database connection
db = MySQLdb.connect(host="localhost", port=3306, user="root",
                     passwd="", db="test")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# Prepare SQL query to INSERT a record into the database.
sql = """INSERT INTO EMPLOYEE(FIRST_NAME,
         LAST_NAME, AGE, SEX, INCOME)
         VALUES ('Solanki', 'raviraj', 20, 'M', 2000)"""
try:
    # Execute the SQL command
    cursor.execute(sql)
    # Commit your changes in the database
    db.commit()
except MySQLdb.Error:
    # Rollback in case there is any error
    db.rollback()
# disconnect from server
db.close()
30. Insert, 2nd way
import MySQLdb
# Open database connection
db = MySQLdb.connect(host="localhost", port=3306, user="root",
                     passwd="", db="test")
# prepare a cursor object using cursor() method
cursor = db.cursor()
# Prepare SQL query to INSERT a record into the database.
sql = "INSERT INTO EMPLOYEE(FIRST_NAME, " \
      "LAST_NAME, AGE, SEX, INCOME) " \
      "VALUES ('%s', '%s', %d, '%s', %d)" % \
      ('Mac', 'Mohan', 20, 'M', 2000)
try:
    # Execute the SQL command
    cursor.execute(sql)
    # Commit your changes in the database
    db.commit()
except MySQLdb.Error:
    # Rollback in case there is any error
    db.rollback()
# disconnect from server
db.close()
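Building SQL with string formatting, as on the slide above, is vulnerable to SQL injection. The DB-API's parameter substitution is safer: the driver quotes and escapes each value itself. Below is a sketch with sqlite3, which uses ? placeholders (MySQLdb uses %s placeholders instead); the table mirrors the EMPLOYEE example:

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("""CREATE TABLE EMPLOYEE (
                  FIRST_NAME TEXT, LAST_NAME TEXT, AGE INTEGER,
                  SEX TEXT, INCOME REAL)""")
try:
    # The driver substitutes and escapes the values itself.
    cursor.execute("INSERT INTO EMPLOYEE VALUES (?, ?, ?, ?, ?)",
                   ('Mac', 'Mohan', 20, 'M', 2000))
    db.commit()
except sqlite3.Error:
    db.rollback()

cursor.execute("SELECT FIRST_NAME, AGE FROM EMPLOYEE")
row = cursor.fetchone()
print(row)    # ('Mac', 20)
db.close()
```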
31. READ Operation
A READ operation on any database means fetching useful
information from the database.
Once the database connection is established, you are ready to make
a query against this database.
You can use either the fetchone() method to fetch a single record or the
fetchall() method to fetch multiple rows from a database table.
fetchone(): It fetches the next row of a query result set.
A result set is an object that is returned when a cursor object is used
to query a table.
fetchall(): It fetches all the rows in a result set. If some rows have
already been extracted from the result set, then it retrieves the
remaining rows from the result set.
rowcount: This is a read-only attribute and returns the number of
rows that were affected by an execute() method.
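A sqlite3 sketch of the point above about fetchall(): after fetchone() has consumed one row, fetchall() returns only the remaining rows (table and data invented for illustration):

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE emp (name TEXT)")
cursor.executemany("INSERT INTO emp VALUES (?)", [('a',), ('b',), ('c',)])

cursor.execute("SELECT name FROM emp")
first = cursor.fetchone()    # the first row only
rest = cursor.fetchall()     # the remaining rows
print(first, len(rest))      # ('a',) 2
db.close()
```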
32. Types of Errors
The following are the six different error types supported
by MySQL for Python.
DataError
IntegrityError
InternalError
NotSupportedError
OperationalError
ProgrammingError
33. DataError
This exception is raised due to problems with the processed data
(for example, numeric value out of range, division by zero,
and so on).
IntegrityError
If the relational integrity of the database is involved (for example a
foreign key check fails, duplicate key, and so on), this exception
is raised.
InternalError
This exception is raised when there is an internal error in the MySQL
database itself (for example, an invalid cursor, the transaction
is out of sync, and so on). This is usually an issue of timing out or
otherwise being perceived by MySQL as having lost connectivity with
a cursor.
34. NotSupportedError
MySQL for Python raises this exception when a method or database
API that is not supported is used (for example, requesting a
transaction-oriented function when transactions are not available.
They also can arise in conjunction with setting characters sets, SQL
modes, and when using MySQL in conjunction with Secure Socket
Layer (SSL).
OperationalError
Exception raised for operational errors that are not necessarily under
the control of the programmer (for example, an unexpected
disconnect, the data source name is not found, a transaction
could not be processed, a memory allocation error occurs,
and so on).
35. ProgrammingError
Exception raised for actual programming errors (for
example, a table is not found or already exists,
there is a syntax error in the MySQL statement, a
wrong number of parameters is specified, and so
on).
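The same DB-API exception hierarchy exists in every compliant driver, so the error types can be demonstrated with sqlite3; note that, unlike MySQL for Python, sqlite3 reports SQL syntax errors as OperationalError rather than ProgrammingError:

```python
import sqlite3

db = sqlite3.connect(':memory:')
cursor = db.cursor()
cursor.execute("CREATE TABLE t (x INTEGER PRIMARY KEY)")
cursor.execute("INSERT INTO t VALUES (1)")

caught = []
try:
    cursor.execute("INSERT INTO t VALUES (1)")   # duplicate primary key
except sqlite3.IntegrityError:
    caught.append('IntegrityError')

try:
    cursor.execute("SELEKT * FROM t")            # invalid SQL
except sqlite3.OperationalError:
    caught.append('OperationalError')
print(caught)
db.close()
```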
36. PostgreSQL database
PostgreSQL is a powerful, open source object-relational
database system.
It is a multi-user database management system.
It runs on multiple platforms including Linux, FreeBSD,
Solaris, Microsoft Windows and Mac OS X.
PostgreSQL is developed by the PostgreSQL Global
Development Group.
PostgreSQL can be integrated with Python using the
psycopg2 module. psycopg2 is a PostgreSQL database adapter
for the Python programming language.
psycopg2 was written with the aim of being very small and
fast.
37. Sqlite3 module in python
SQLite3 is a very easy to use database engine.
It is self-contained, serverless, zero-configuration and
transactional.
It is very fast and lightweight, and the entire database is
stored in a single disk file.
It is used in a lot of applications as internal data storage.
The Python Standard Library includes a module called
"sqlite3" intended for working with this database.
This module is a SQL interface compliant with the DB-API
2.0 specification.
You do not need to install this module separately because it
has shipped with Python since version 2.5.
38. To use the sqlite3 module, you must first create a connection
object that represents the database; then, optionally,
you can create a cursor object, which will help you
execute all the SQL statements.
39. Using Python's SQLite Module
To use the SQLite3 module we need to add an import statement to our Python
script:
import sqlite3
Connecting SQLite to the Database
We use the function sqlite3.connect to connect to the database. We
can use the argument ":memory:" to create a temporary DB in the
RAM or pass the name of a file to open or create it.
# Create a database in RAM
db = sqlite3.connect(':memory:')
# Creates or opens a file called mydb with a SQLite3 DB
db = sqlite3.connect('data/mydb')
When we are done working with the DB we need to close the
connection:
db.close()
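A small end-to-end sketch with sqlite3, tying the connect/cursor/execute steps together; the users table and its rows are invented for illustration:

```python
import sqlite3

db = sqlite3.connect(':memory:')    # temporary DB in RAM
cursor = db.cursor()
cursor.execute("""CREATE TABLE users (
                  id INTEGER PRIMARY KEY, name TEXT, phone TEXT)""")
people = [('John', '5557241'), ('Adam', '5547874')]
cursor.executemany("INSERT INTO users (name, phone) VALUES (?, ?)", people)
db.commit()

# execute() returns the cursor itself, so it can be iterated directly
rows = [row for row in cursor.execute("SELECT id, name FROM users")]
print(rows)    # [(1, 'John'), (2, 'Adam')]
db.close()
```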
40. Object-relational mappers (ORMs)
Object-relational mapping is a technique that lets you query and manipulate data
from a database using an object-oriented paradigm.
An object-relational mapper (ORM) is a code library that automates the
transfer of data stored in relational database tables into objects that are more
commonly used in application code.
41. An ORM is the software artefact that maps
relational database tables to objects/classes.
This means it maps the tables and columns in a relational
database directly to object instances and wraps all
SQL/DDL functionality in their methods.
42. ORM stands for Object-Relational Mapping. We
communicate with the database using
the ORM and only use Python objects
and classes.
43. SQLAlchemy
SQLAlchemy is the Python SQL toolkit and Object
Relational Mapper that gives application developers the
full power and flexibility of SQL.
SQLAlchemy provides a full suite of well known
enterprise-level persistence patterns, designed for
efficient and high-performing database access, adapted
into a simple and Pythonic domain language.
44. Major SQLAlchemy features include:
An industrial strength ORM, built from the core on the identity map, unit of work,
and data mapper patterns. These patterns allow transparent persistence of objects
using a declarative configuration system. Domain models can be constructed and
manipulated naturally, and changes are synchronized with the current transaction
automatically.
A relationally-oriented query system, exposing the full range of SQL's capabilities
explicitly, including joins, subqueries, correlation, and most everything else, in
terms of the object model. Writing queries with the ORM uses the same
techniques of relational composition you use when writing SQL. While you can
drop into literal SQL at any time, it's virtually never needed.
A comprehensive and flexible system of eager loading for related collections and
objects. Collections are cached within a session, and can be loaded on individual
access, all at once using joins, or by query per collection across the full result set.
45. A Core SQL construction system and DBAPI interaction layer. The
SQLAlchemy Core is separate from the ORM and is a full database
abstraction layer in its own right, and includes an extensible
Python-based SQL expression language, schema metadata, connection
pooling, type coercion, and custom types.
All primary and foreign key constraints are assumed to be
composite and natural. Surrogate integer primary keys are of course
still the norm, but SQLAlchemy never assumes or hardcodes to this
model.
Database introspection and generation. Database schemas can be
"reflected" in one step into Python structures representing database
metadata; those same structures can then generate CREATE
statements right back out - all within the Core, independent of the
ORM.
46. Create engine
create an "engine", which is basically an object that
knows how to communicate with the provided
database using the credentials you supply.
In this case, we are using an SQLite database that doesn't
need credentials.
47. echo=True
set echo to True. This means that SqlAlchemy will output
all the SQL commands it is executing to stdout.
This is handy for debugging, but should be set to False
when you're ready to put the code into production.
48. Metadata
we create a MetaData object. This cool creation from the
SqlAlchemy team holds all the database metadata.
It consists of Python objects that hold descriptions of the
tables and other schema-level objects of the database.
We can bind the metadata object to our database here
or in the create_all statement near the end of the code.
49. Table create
The last section is how we create the tables programmatically. This is accomplished
by using SqlAlchemy’s Table and Column objects.
we have various field types available to us, like String and Integer.
There are many others too.
For this example, we create a table and name it "users", then pass in our
metadata object.
Next, we define its columns.
The "id" column is set as our primary key.
SqlAlchemy will magically increment this for us as we add users to the database.
The "name" column is a String type and capped at 40 characters long.
The "age" column is just a simple Integer, and the "password" column is just set to
String.
We didn’t set its length, but we probably should. The only major difference in the
addresses_table is how we set up the Foreign key attribute that connects the two
tables.
Basically, we point it at the other table by passing the correct field name in a string
to the ForeignKey object.
50. Create_All()
The final line of this snippet actually creates the database
and the table. You can call it as often as you like as it will
always check for the existence of the specified table
before trying to create it. That means you can create
additional tables and call create_all and SqlAlchemy will
only create the new table.
51. insert
first, you need to create the Insert object by calling the
table’s insert method. Then you can use the Insert’s values
method to add the required values for the row. Next, we
create the Connection object via the engine’s connect
method. Finally, we call the Connection object’s execute
method on the Insert object.
52. Way of inserting
In both cases, you will need to call the table object insert
method. Basically, you just take the engine out of the
picture in the second instance. The last insert method
we’ll look at is how to insert multiple rows:
53. select
First we have to import the select method from
sqlalchemy.sql.
Then we pass it the table as a one element list. Finally we
call the select object’s execute method and store the
returned data in the result variable.
Now that we have all the results, we should probably see
if we got what we expected. Thus, we create a for loop
to iterate over the result.
54. All we had to do was specify the column names in our
select statement. The little “c” basically means “column”,
so we do a select on column name and column age. If
you had multiple tables, then the select statement would
be something like this:
select([tableOne, tableTwo])
Of course, this will probably return duplicate results, so
you’ll want to do something like this to mitigate the
issue:
s = select([tableOne, tableTwo],
tableOne.c.id==tableTwo.c.user_id)
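The create/insert/select workflow of the last few slides can be condensed into one self-contained sketch, assuming SQLAlchemy 1.4 or later is installed; the users table and its columns are invented for illustration (note that in 1.4+ the table is passed to select() directly rather than as a one-element list):

```python
from sqlalchemy import (create_engine, MetaData, Table, Column,
                        Integer, String, select)

engine = create_engine('sqlite:///:memory:')    # SQLite needs no credentials
metadata = MetaData()
users = Table('users', metadata,
              Column('id', Integer, primary_key=True),
              Column('name', String(40)))
metadata.create_all(engine)                     # emits CREATE TABLE

with engine.connect() as conn:
    conn.execute(users.insert().values(name='Mac'))
    rows = conn.execute(select(users)).fetchall()
print(rows)    # [(1, 'Mac')]
```

The "id" primary key is filled in automatically, as described on the table-creation slide.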
55. Model
declarative_base is a factory function that returns a base class
(built using a metaclass), and the entity classes are going to inherit from
it.
Once the definition of the class is done, the Table and mapper
will be generated automatically.
There is some magic involved, but on the other hand
SQLAlchemy forces you to explicitly define things like the table
name, primary keys and relationships.
First create the Base class:
from sqlalchemy.ext.declarative import
declarative_base
Base = declarative_base()
56. The entities are classes, which derive from the Base class.
We are also using the relationship function to define the
relationships between the entities.
The many-to-many relationship between tags and images
requires us to define an association table, which we'll be
joining over.
When defining the images relationships, we are also using
the backref parameter, which adds the image properties
to the tags and comments entities.
We want those references to be dynamically loaded,
because we probably don't want to load all images, when
accessing these entities.
57. Connecting and Creating the Schema
First of all we need to create the engine, which is used to
connect to the database. This example uses SQLite3, which
should already be included in your Python installation.
from sqlalchemy import create_engine
engine =
create_engine(r'sqlite:///f:foo.db',echo=True)
A call to the metadata of the Base class then generates the
Schema:
Base.metadata.create_all(engine)
Since we have set echo=True for the engine, we can see the
generated SQL:
58. Sessions
A Session is a Python class, which handles the conversation
with the database for us.
It implements the Unit of Work pattern for synchronizing
changes to the database.
Basically it tracks all records you add or modify.
We can acquire a Session class with the sessionmaker, which
simplifies the configuration, since we only bind our database
engine to it:
from sqlalchemy.orm import sessionmaker
Session = sessionmaker(bind=engine)
And now whenever we want to talk to the database, we can
create a new Session:
session = Session()
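A compact sketch tying the declarative Base, engine and Session together, assuming SQLAlchemy 1.4+ is installed; the User entity and its columns are invented for illustration:

```python
from sqlalchemy import create_engine, Column, Integer, String
from sqlalchemy.orm import declarative_base, sessionmaker

Base = declarative_base()

class User(Base):
    __tablename__ = 'users'
    id = Column(Integer, primary_key=True)
    name = Column(String(40))

engine = create_engine('sqlite:///:memory:')
Base.metadata.create_all(engine)        # generate the schema

Session = sessionmaker(bind=engine)
session = Session()
session.add(User(name='Guido'))         # tracked by the unit of work
session.commit()                        # flushes the INSERT
count = session.query(User).count()
print(count)    # 1
session.close()
```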
59. SQLObject
SQLObject is a Python object-relational mapper
between a SQL database and Python objects.
In SQLObject, database concepts are mapped into
Python in a way that's very similar to SQLAlchemy,
where tables are mapped as classes, rows as instances
and columns as attributes.
It also provides a Python-object-based query language
that makes SQL more abstract.
60. Introduction to NoSQL
NoSQL databases are non-relational database management systems,
different from traditional relational database management
systems in some significant ways.
They are designed for distributed data stores with very large
data storage needs (for example Google or Facebook, which
collect terabits of data every day for their users).
These kinds of data stores may not require a fixed schema, avoid
join operations and typically scale horizontally.
NoSQL provides a mechanism for storage and retrieval of data other
than the tabular relations model used in relational databases.
A NoSQL database doesn't use tables for storing data.
It is generally used for big data and real-time web
applications.
61. MongoDB - Overview
MongoDB is a cross-platform, document-oriented database
that provides high performance, high availability, and easy
scalability.
MongoDB works on the concepts of collections and documents.
Database
A database is a physical container for collections. Each
database gets its own set of files on the file system. A single
MongoDB server typically has multiple databases.
62. Collection
Collection is a group of MongoDB documents.
It is the equivalent of an RDBMS table.
A collection exists within a single database.
Collections do not enforce a schema.
Documents within a collection can have different
fields.
Typically, all documents in a collection are of similar
or related purpose.
63. Document
A document is a set of key-value pairs.
Documents have dynamic schema.
Dynamic schema means that documents in the same
collection do not need to have the same set of fields or
structure, and common fields in a collection's documents
may hold different types of data.
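The dynamic-schema idea can be illustrated in plain Python with the standard-library json module: documents are JSON-style key-value structures, and two documents in the same collection may carry different fields (the field names below are invented for illustration; with a real server you would use a driver such as pymongo):

```python
import json

# Two hypothetical documents in the same 'employees' collection.
doc1 = {"name": "Mac", "age": 20}
doc2 = {"name": "Mohan", "skills": ["python", "sql"]}  # different fields

collection = [doc1, doc2]          # a collection: a group of documents
print(json.dumps(collection[0]))   # {"name": "Mac", "age": 20}
```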
65. Why Use MongoDB?
Document-oriented storage: data is stored in the form of JSON-style
documents.
Index on any attribute
Replication and high availability
Auto-sharding
Rich queries
Fast in-place updates
Professional support by MongoDB
Where to Use MongoDB?
Big Data
Content Management and Delivery
Mobile and Social Infrastructure
User Data Management
Data Hub