Name: Osikoya Omotoyosi Samuel
                           Project: The Database
                     Assessor: Hashim Shaheen
                         Level: HND Computing

What is a Database?
        The database is a very broad topic and has various definitions but basically has one
meaning. Oracle Corporation defines a database as a collection of physical files that are managed
by an instance of their database software product where a file is a collection of related records that
are stored as a single unit by an operating system. Microsoft SQL Server and Sybase define a database as
a collection of data pieces that have a common owner. One may get confused when working with
different products, for example, the definition given by Microsoft for a database is exactly how
Oracle defines a Schema.


        A database is a shared, integrated computer structure that stores a collection of end-user
data (raw facts of interest to the end user) and metadata (data about data), through which
the end-user data are integrated and managed. Simply put, a database, often abbreviated DB, is
an organized collection of information, arranged so that a computer
program or a person can easily access the pieces of data being searched for or
needed. We can also say that a database consists of files, and files are made up of records, which
consist of data pieces.
        Metadata describe the characteristics of the data and the set of relationships that link
the data found within the database. For example, the metadata component stores information such
as the name of each data element, the type of values (numeric, dates, or text) stored in each data
element, whether or not the data element can be left empty, and so on. The metadata also provide
information that complements and expands the value and use of the data.
In short, the metadata present a more complete picture of the data in the database.
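
As a minimal sketch of what metadata look like in practice, the Python snippet below uses the standard-library sqlite3 module and asks SQLite to describe a table. The table name `customer` and its columns are assumptions made for illustration only.

```python
import sqlite3

# In-memory database; the table and column names are hypothetical.
conn = sqlite3.connect(":memory:")
conn.execute(
    """
    CREATE TABLE customer (
        name  TEXT    NOT NULL,   -- may not be left empty
        age   INTEGER,            -- numeric, may be left empty
        email TEXT
    )
    """
)

# PRAGMA table_info returns one row of metadata per column:
# (cid, name, type, notnull, default_value, pk)
metadata = conn.execute("PRAGMA table_info(customer)").fetchall()
for cid, name, col_type, notnull, default, pk in metadata:
    print(f"{name}: type={col_type}, required={bool(notnull)}")
```

Each printed line is data about the data: the element's name, the type of value it holds, and whether it can be left empty.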
       Because data, and therefore the database, are so important, it makes sense to put them in the
hands of an expert: a database specialist.
       A database specialist controls and manages the database software for a company. This is the
person responsible for collecting and managing typical information on the prospects and
customers of a business. While the database specialist manages the data that
goes into the software program, the specialist also works with various other employees in the
company to collect, monitor and share the information that is collected.
       The first thing that a database specialist does when working for an employer or a client is to find out
what the client wants or needs. Databases are used across all types of industries, but each business
or company has its own needs and requirements because of the different work it does.
Once the database specialist knows what a company or organization wants, the database specialist
can evaluate the current software.

Analyzing the current database software allows the database specialist to determine if the company
is collecting the right or wrong information from customers. For some companies, it may be
collecting information such as the name and mailing address of the customer, along with a phone
number, email address and a record of the customer’s buying behaviour with the company.
The database specialist will now use this information to report findings to different departments
within the company. For example, a database specialist typically can run reports from the software.
These reports can help the marketing department of the company determine how marketing
campaigns affect customer buying behaviour. The marketing department also uses the information
that the database specialist collects to send out marketing materials, including emails, brochures,
direct mail campaigns, coupons, catalogs and more.

The specialist also spends a lot of time updating the information that is in the software. For
example, if the marketing department sends out a postcard with a coupon code to everyone in the
database, the database specialist will work with the sales people who are taking the orders online
and via telephone to track the response to the direct mail campaign.
The database specialist will also receive postcards that get returned as undeliverable for various
reasons. The specialist then contacts the customer in an attempt to update their contact information in
the system, or removes any information that is not accurate. If the postcard gets returned with a
new address, but the post office is unable to forward it because the forwarding has expired, then the
database specialist can put the new address in the database.




What a database looks like:
A database of customer information in a particular company

Name                 Age  Address                                    Email                     Phone Number    Date of last business with customer
Mrs. Jaden Smith     65   2 Montgomery Road, Alaska, USA             Smith001@linuxmail.org    +120235799764   11-12-2012
Mrs. Goldie Phillip  42   Plot 24b Green Way, Hatfield, England      G_phillip@yahoo.com       +447877474749   20-02-2013
Mr. Samuel Roberto   19   2, Nimrod Drive, Nairobi, Kenya            Robby@gmail.com           Not Provided    3-03-2013
Miss Tunde Balogun   24   23, Adedoyin Street, Ogba, Lagos, Nigeria  T.Balogun@rocketmail.com  +2348035175111
Mr. Patrick Jose     51   23, Roe Green Lane, Hatfield, England      Patjose@yahoo.com         +4474678373932  15-03-2011
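
To make the table concrete, here is a rough sketch that loads the first three sample rows into an SQLite table using Python's sqlite3 module and asks one question of it. The table name, the column names, and the choice to store "Not Provided" as NULL are assumptions for illustration, not part of the original example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE customer (name TEXT, age INTEGER, email TEXT, phone TEXT)"
)
rows = [
    ("Mrs. Jaden Smith", 65, "Smith001@linuxmail.org", "+120235799764"),
    ("Mrs. Goldie Phillip", 42, "G_phillip@yahoo.com", "+447877474749"),
    ("Mr. Samuel Roberto", 19, "Robby@gmail.com", None),  # "Not Provided"
]
conn.executemany("INSERT INTO customer VALUES (?, ?, ?, ?)", rows)

# Which customers still need a phone number on file?
missing_phone = conn.execute(
    "SELECT name FROM customer WHERE phone IS NULL"
).fetchall()
print(missing_phone)
```

A report like this is the kind of output a database specialist might pass on to another department.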




Showing the User interface of a Database system

  New Customer? Register your details correctly. Fields marked with (*) MUST be filled.
  Surname:


  Middle Initial:         Date of Birth:           dd       mm      yyyy


  First Name:


  Address:


  Email:


  Phone Number:
  Have an account already? Sign in here




                    HOW DATABASES WORK ACROSS ORGANIZATIONS


Role of the Database in an Organization


       An organization is traditionally viewed as a three-level pyramid: operational activities at the
bottom, management planning and control activities in the middle, and strategic planning and policy
making by top management at the top. The corporate database contains data relating to the organization, its
operations, its plans and its environment.


State of Database Management in Organizations


        The needs of organizations and management are changeable, diverse and often ill-defined,
yet they must be met. Added to these are outside pressures from federal taxing authorities, federal
securities agencies and legislators making privacy laws. Both internal and external forces demand
that organizations exercise control over their data resources.


        Decisions and actions in the organization are based upon the image contained in the
corporate database. Managerial decisions direct the actions at the operational level and produce
plans and expectations which are formally captured and stored in the corporate database.
Transactions record actual results of organizational activities and environmental changes and update
the database to maintain a current image.


        People in the organization query the database for information to conduct the daily
operations. Middle management receives reports comparing actual results to previously recorded
plans and expectations. The corporate database provides data for modeling and forecasting which
support top management needs. The corporate database supports all levels of an organization and is
vital for operations, decision making and the management process.


       While management seeks to control data resources, computer applications grow. When a
corporation achieves comprehensive support of its operations, for instance, computer applications
begin to penetrate into higher management levels. With comprehensive database support of
operations, a management information system (MIS) can mature as a tool for planning, control and
decision making. Early in the development of an MIS, an organization must appoint a database
administrator (DBA) to manage its data resources.


       While an organization’s move toward the database approach can be hastened by the
acquisition of a DBMS, the latter is not necessary. Most commercially available DBMSs fall
substantially short of ideal capabilities, making their acquisition an interim measure - a move to
help the organization learn how to operate in a managed data environment. In seeking DBMS
capability, building one’s own system is unrealistic except for large organizations with special
needs, such as a very large database or large volumes of known transactions requiring rapid online
response.


        Data is a vital resource in an organization and must be managed. The organizational
database is an essential component in a management information system. Of the four components of
a data processing system, attention to data has lagged behind the development of machines and
programming technology. Taking a database approach requires an organization to focus on data as a
valued resource. Data is separate from programs and application systems which use it.
Typical database applications include:
       Banking: all transactions
       Airlines: reservations, schedules
       Universities: registration, grades
       Sales: customers, products, purchases
       Online retailers: order tracking, customized recommendations
       Manufacturing: production, inventory, orders, supply chain
       Human resources: employee records, salaries, tax deductions


Everybody makes use of data every day, hence the need for databases. From small-scale businesses
to large corporations to individuals, the importance of data and databases cannot be belittled.


But what is data?
         Data can be defined as raw facts. The term raw implies that the data have not been processed
or worked upon to give implied meaning. For example, the number 2344223783 is a raw fact, as it has
not been worked upon to give any implied meaning. After processing by the respective end users, it
could mean many things. It could be an amount of money being stored away in a bank, hence
a currency value to that user, with its unit being the currency of the country. It could be an
identification number issued to a person. It could be the number of people living in a particular
region at a particular time, hence a demographic figure, and so on. Such processed raw facts or
data are called information.
        Information is the result of processing raw data to reveal its meaning. Data processing can
be as simple as organizing data to reveal patterns or as complex as making forecasts or drawing
inferences using statistical modeling. To reveal meaning, information requires context. For example,
an average temperature reading of 105 degrees does not mean much unless you also know its
context: Is this in degrees Fahrenheit or Celsius? Is this a machine temperature, a body temperature,
or an outside air temperature?
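
The step from data to information can be sketched in a few lines of Python. The readings and the context values below are invented for illustration; the point is only that the average becomes meaningful once a unit and a source are attached to it.

```python
from statistics import mean

# Raw data: bare numbers with no context (values are made up for illustration).
readings = [104.0, 105.5, 105.5]

# Context that turns the numbers into information (assumed, not from the text).
context = {"unit": "Fahrenheit", "source": "machine"}

average = mean(readings)
information = (
    f"Average {context['source']} temperature: "
    f"{average:.1f} degrees {context['unit']}"
)
print(information)
```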
       It should be noted that raw data must be properly formatted for storage, processing, and
presentation.
       Timely and useful information requires accurate data. Such data must be properly
generated and stored in a format that is easy to access and process. And, like any basic resource, the
data environment must be managed carefully.
        Hence, data management is a discipline that focuses on the proper generation, storage, and
retrieval of data. Given the crucial role that data play, data management is a core
activity for any business, government agency, service organization, or charity. But efficient data
management typically requires the use of a computer database, which is usually operated by a
database administrator.


DATABASE ADMINISTRATOR
A database administrator (DBA) is the person who manages the database environment and
has complete control over the database.




Roles of a Database Administrator
       The database administrator performs a critical role within an organization and is a
key figure in the use of a database management system. The major responsibility of a database
administrator is to handle the process of developing and maintaining the database of an
organization. The database administrator is responsible for defining the internal layout of the
database and ensuring that the internal layout optimizes system performance.

        The database administrator has full access to all types of important data in an organization.
The database administrator decides what data will be stored in the database and how to organize
the data so that they can be accessed easily whenever the organization needs them. To design
the database of an organization, the database administrator must meet with users and
determine their requirements.

       The database administrator is also responsible for preparing documentation, including
recording the procedures, standards, guidelines, and data descriptions necessary for the efficient and
continuing use of the database environment. Documents should include materials to help end users,
database application programmers, the operation staff, and all personnel connected with the
database management system.

        The database administrator is responsible for monitoring the database environment, such as
seeing that the database is meeting performance standards, making sure the accuracy, integrity, and
security of data are maintained.

       The database administrator is also responsible for managing any enhancements to the
database environment.

Other roles of the database administrator may include:
   (a) Establishing the needs of users and monitoring user access and security;
   (b) Monitoring performance and managing parameters to provide fast query responses to 'front
       end' users;
   (c) Mapping out the 'conceptual design' for a planned database in outline;
   (d) Considering both 'back end' organization of data and 'front end' accessibility for end users;
   (e) Refining the 'logical design' so that it can be translated into a specific data model, and more.




ORGANIZATIONAL ISSUES
       At the heart of the matter are some issues that affect the use of a database within an organization.
A database security manager is the most important asset in maintaining and securing sensitive data
within an organization. Database security managers are required to multitask and juggle a variety of
headaches that accompany the maintenance of a secure database.

If you own a business it is important to understand some of the security problems that occur within
an organization and how to avoid them. If you understand the how, where, and why of database
security you can prevent future problems from occurring.

Daily Maintenance: Database audit logs require daily review to make certain that there has been
no data misuse. This requires overseeing database privileges and consistently updating user
access accounts. A database security manager also provides different types of access control for
different users and assesses new programs that work with the database. If these tasks are
performed on a daily basis, you can avoid a lot of problems with users that may pose a threat to the
security of the database.

Varied Security Methods for Applications: More often than not, application developers will vary
the methods of security for the different applications that use the database. This
can make it difficult to create policies for accessing the applications. The database must also
have the proper access controls for regulating the varying methods of security; otherwise sensitive
data are at risk.

Post-Upgrade Evaluation: When a database is upgraded it is necessary for the administrator to
perform a post-upgrade evaluation to ensure that security is consistent across all programs. Failure
to perform this operation opens up the database to attack.

Split the Position: Sometimes organizations fail to split the duties between the IT administrator and
the database security manager. Instead the company tries to cut costs by having the IT administrator
do everything. This action can significantly compromise the security of the data due to the
responsibilities involved with both positions. The IT administrator should manage the database
while the security manager performs all of the daily security processes.

Application Spoofing: Hackers are capable of creating applications that resemble the existing
applications connected to the database. These unauthorized applications are often difficult to
identify and allow hackers access to the database via the application in disguise.

Manage User Passwords: Sometimes database security managers forget to remove the IDs and
access privileges of former users, which leads to password vulnerabilities in the database. Password
rules and maintenance need to be strictly enforced to avoid opening up the database to
unauthorized users.

Windows OS Flaws: Flaws in the operating system that hosts the database, Windows being the
example here, can weaken database security. Password theft and denial-of-service problems are
common results. The database security manager can take precautions through routine daily
maintenance checks.


B. DATABASE MANAGEMENT SYSTEM (DBMS)


       A database management system (DBMS) is a collection of programs that manages the
database structure and controls access to the data stored in the database. In a sense, a database
resembles a very well-organized electronic filing cabinet in which powerful software, known as a
database management system, helps manage the cabinet’s contents.


       The figure below shows an overview of the basic concepts of the database and the DBMS
software, together known as the database system.




       Figure: A simplified database system environment. Users/programmers issue application
       programs/queries to the DBMS software (software to process queries/programs, then software
       to access stored data), which works on the stored database definition (meta-data) and the
       stored database.




        The DBMS serves as the intermediary between the user and the database. The database
structure itself is stored as a collection of files, and the only way to access the data in those files is
through the DBMS. The figure below emphasizes the point that the DBMS presents the end user (or
application program) with a single, integrated view of the data in the database. The DBMS receives
all application requests and translates them into the complex operations required to fulfill those
requests. The DBMS hides much of the database’s internal complexity from the application
programs and users.
The application program might be written by a programmer using a programming language such as
Visual Basic.NET, Java, or C#, or it might be created through a DBMS utility program.




       A database management system (DBMS) is a collection of programs that enables users to create and
maintain a database. Defining a database involves specifying the data types, structures, and constraints of the
data to be stored in the database. The database definition or descriptive information is also stored by
the DBMS in the form of a database catalog or dictionary; it is called meta-data.
         Constructing the database is the process of storing the data on some storage medium that is
controlled by the DBMS. Manipulating a database includes functions such as querying the database
to retrieve specific data, updating the database to reflect changes in the miniworld, and generating
reports from the data. Sharing a database allows multiple users and programs to access the database
simultaneously.
       An application program accesses the database by sending queries or requests for data to the
DBMS. A query typically causes some data to be retrieved; a transaction may cause some data to be
read and some data to be written into the database.
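
The difference between a query and a transaction can be sketched with Python's sqlite3 module. The `account` table, the `transfer` function, and the amounts are hypothetical names and values chosen for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE account (id INTEGER PRIMARY KEY, balance INTEGER)")
conn.executemany("INSERT INTO account VALUES (?, ?)", [(1, 500), (2, 100)])
conn.commit()

def transfer(conn, src, dst, amount):
    """Transaction: reads one balance and writes two, all or nothing."""
    with conn:  # commits on success, rolls back if an exception is raised
        (balance,) = conn.execute(
            "SELECT balance FROM account WHERE id = ?", (src,)
        ).fetchone()
        if balance < amount:
            raise ValueError("insufficient funds")
        conn.execute(
            "UPDATE account SET balance = balance - ? WHERE id = ?", (amount, src)
        )
        conn.execute(
            "UPDATE account SET balance = balance + ? WHERE id = ?", (amount, dst)
        )

transfer(conn, 1, 2, 200)

# Query: only retrieves data.
balances = conn.execute("SELECT id, balance FROM account ORDER BY id").fetchall()
print(balances)
```

The application never touches the stored files itself; it only sends requests to the DBMS, which carries them out.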
       Other important functions provided by the DBMS include protecting the database and
maintaining it over a long period of time. Protection includes system protection against hardware or
software malfunction (or crashes) and security protection against unauthorized or malicious access.
A typical large database may have a life cycle of many years, so the DBMS must be able to
maintain the database system by allowing the system to evolve as requirements change over time.
It is not absolutely necessary to use general-purpose DBMS software to implement a computerized
database. We could write our own set of programs to create and maintain the database, in effect
creating our own special-purpose DBMS software. In either case—whether we use a general-
purpose DBMS or not—we usually have to deploy a considerable amount of complex software. In
fact, most DBMSs are very complex software systems.


Having a DBMS between the end user’s applications and the database offers some important
advantages. First, the DBMS enables the data in the database to be shared among multiple
applications or users. Second, the DBMS integrates the many different users’ views of the data into
a single all-encompassing data repository.
Because data are the crucial raw material from which information is derived, you must have a good
method to manage such data. The DBMS helps make data management more efficient and effective.
       In particular, a DBMS provides advantages such as:


Improved Data Sharing. The DBMS helps create an environment in which end users have better
access to more and better-managed data. Such access makes it possible for end users to respond
quickly to changes in their environment.


Improved Data Access. The DBMS makes it possible to produce quick answers to ad hoc queries.
From a database perspective, a query is a specific request issued to the DBMS for data
manipulation—for example, to read or update the data. Simply put, a query is a question, and an ad
hoc query is a spur-of-the-moment question. The DBMS sends back an answer (called the query
result set) to the application. For example, end users, when dealing with large amounts of sales data,
might want quick answers to questions (ad hoc queries) such as:
                What was the dollar volume of sales by product during the past six months?
               What is the sales bonus figure for each of our salespeople during the past three
               months?
               How many of our customers have credit balances of $3,000 or more?
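
Two of these questions can be phrased as SQL, shown here as a small sketch using Python's sqlite3 module. The tables `customer` and `sale`, their columns, and all the figures are invented for illustration; a real sales database would have its own schema.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (name TEXT, credit_balance REAL)")
conn.execute("CREATE TABLE sale (product TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO customer VALUES (?, ?)",
    [("A", 3500.0), ("B", 1200.0), ("C", 3000.0)],
)
conn.executemany(
    "INSERT INTO sale VALUES (?, ?)",
    [("widget", 100.0), ("gadget", 250.0), ("widget", 50.0)],
)

# How many customers have credit balances of $3,000 or more?
(big_balances,) = conn.execute(
    "SELECT COUNT(*) FROM customer WHERE credit_balance >= 3000"
).fetchone()

# What was the dollar volume of sales by product?
sales_by_product = conn.execute(
    "SELECT product, SUM(amount) FROM sale GROUP BY product ORDER BY product"
).fetchall()

print(big_balances, sales_by_product)
```

What each query returns is its query result set.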


Improved Data Security. The more users access the data, the greater the risks of data security
breaches. Corporations invest considerable amounts of time, effort, and money to ensure that
corporate data are used properly. A DBMS provides a framework for better enforcement of data
privacy and security policies.


Minimized Data Inconsistency. Data inconsistency exists when different versions of the same data
appear in different places. For example, data inconsistency exists when a company’s sales
department stores a sales representative’s name as "Andy Colen" and the company’s personnel
department stores that same person’s name as "Smith A Colen," or when the company’s regional
sales office shows the price of a product as $45.95 and its national sales office shows the same
product’s price as $43.95. The probability of data inconsistency is greatly reduced in a properly
designed database.


Improved Decision Making. Better-managed data and improved data access make it possible to
generate better-quality information, on which better decisions are based. The quality of the
information generated depends on the quality of the underlying data. Data quality is a
comprehensive approach to promoting the accuracy, validity, and timeliness of the data. While the
DBMS does not guarantee data quality, it provides a framework to facilitate data quality initiatives.


Better Data Integration. Wider access to well-managed data promotes an integrated view of the
organization’s operations and a clearer view of the big picture. It becomes much easier to see how
actions in one segment of the company affect other segments.


Increased End-User Productivity. The availability of data, combined with the tools that transform
data into useful information, empowers end users to make quick, informed decisions that can make
the difference between success and failure in the global economy.




METHODS OF DATA ORGANIZATION AND ACCESS
        Data organization is the permanent logical structure of the file. You tell the computer how to
retrieve records from the file by specifying the access mode.
      Organizing and accessing data are two of the driving forces behind data management.
Organizing data involves arranging data in storage so that they may be easily accessed.
       Accessing data refers to retrieving data from storage.
       Data organization and access are important determinants of how easily managers and users
can obtain the information they need to do their jobs. Since some organization and access schemes
provide faster or more flexible ways to locate individual records than others, it is important for
managers to anticipate what data they and their subordinates will need when designing files and
databases.



Data Organizing Methods:

There are several ways to organize data:
Sequential Organization
A sequential file contains records organized in the order they were entered. The order of the records
is fixed. The records are stored in physical, contiguous blocks, and within each block the
records are in sequence. Records in these files can only be read or written sequentially.
Once stored in the file, a record cannot be made shorter or longer, or deleted. However, a record
can be updated if its length does not change. (Other changes to the records require creating a
new file.) New records always appear at the end of the file.
If the order of the records in a file is not important, sequential organization will suffice, no
matter how many records you may have. Sequential output is also useful for report printing or
for the sequential reads that some programs prefer to do.

Line-Sequential Organization
Line-sequential files are like sequential files, except that the records can contain only characters as
data. Line-sequential files are maintained by the native byte stream files of the operating system.
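
A line-sequential file is easy to sketch in Python, since an ordinary text file behaves this way: records are lines of characters, new records go at the end, and reading proceeds from the first record onward. The file name and the record layout below are assumptions for illustration.

```python
import os
import tempfile

path = os.path.join(tempfile.mkdtemp(), "grades.seq")  # hypothetical file name

def append_record(path, record):
    """New records always go at the end of a sequential file."""
    with open(path, "a", encoding="utf-8") as f:
        f.write(record + "\n")

def read_all(path):
    """Records are read back in the order they were written."""
    with open(path, encoding="utf-8") as f:
        return [line.rstrip("\n") for line in f]

for rec in ["1001,Ada,72", "1002,Ben,65", "1003,Cy,80"]:
    append_record(path, rec)

records = read_all(path)
print(records)
```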

Indexed-Sequential Organization
Indexed-sequential organization speeds up key searches. The single-level indexing structure is the
simplest one: an index file holds records that are (key, pointer) pairs, where the pointer is the
position in the data file of the record with the given key. Only a subset of the records, evenly
spaced along the data file, is indexed, in order to mark intervals of data records.
A key search is performed as follows: the search key is compared with the index keys to find the
highest index key that does not exceed the search key; a linear search is then performed from the
record that the index key points to, until the search key is matched or until the record pointed to by
the next index entry is reached. Despite the double file access (index + data) required by this kind
of search, the reduction in access time is significant compared with sequential file searches.
The hardware for indexed-sequential organization is usually disk-based rather than tape-based.
Records are physically ordered by primary key, and the index gives the physical
location of each indexed record. Records can be accessed sequentially or directly, via the index. The
index is stored in a file and read into memory when the file is opened. Indexes must also be
maintained.
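
The key search described above can be sketched as follows. This is a simplified in-memory model, not a real file format: the list `data` stands in for the data file, `index` for the index file, and the keys, values and spacing (`STEP`) are made up for illustration.

```python
from bisect import bisect_right

# Data "file": records sorted by primary key (keys and values are made up).
data = [(k, f"record-{k}") for k in (3, 8, 12, 17, 21, 25, 30, 34, 40)]

STEP = 3  # index every third record (an arbitrary choice for this sketch)
# Sparse index of (key, position in data file) pairs.
index = [(data[pos][0], pos) for pos in range(0, len(data), STEP)]
index_keys = [key for key, _ in index]

def lookup(search_key):
    """Return the record for search_key, or None if it is absent."""
    # Highest index key that is <= the search key.
    i = bisect_right(index_keys, search_key) - 1
    if i < 0:
        return None  # smaller than every key in the file
    start = index[i][1]
    # Stop at the record pointed to by the next index entry (or end of file).
    end = index[i + 1][1] if i + 1 < len(index) else len(data)
    for pos in range(start, end):
        key, value = data[pos]
        if key == search_key:
            return value
    return None

print(lookup(21), lookup(22))
```

With nine records and an index entry for every third one, a lookup inspects at most three data records instead of scanning the whole file.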

Inverted List
In file organization, an inverted list is a file that is indexed on many of the attributes of the data
itself. The inverted list method has a single index for each key type. The records are not necessarily
stored in a sequence. They are placed in the data storage area, and the indexes are updated with the
record keys and locations.
For example, in a company file, an index could be maintained for all products and another one
might be maintained for product types. It is faster to search the indexes than to search every record.
These types of file are also known as "inverted indexes." However, inverted list files use more
media space, so storage devices fill up more quickly with this type of organization. The benefit is
fast searching; the cost is much slower updating.
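
A rough in-memory sketch of the idea, with invented product records: one index is kept per attribute, and each index maps a value to the positions of the records that hold it.

```python
from collections import defaultdict

# Records are stored in no particular order (sample data is invented).
records = [
    {"product": "hammer", "type": "tool"},
    {"product": "apple", "type": "food"},
    {"product": "saw", "type": "tool"},
]

# One index per attribute: value -> list of record positions.
indexes = {"product": defaultdict(list), "type": defaultdict(list)}

for pos, rec in enumerate(records):
    for attr, index in indexes.items():
        index[rec[attr]].append(pos)

def find(attr, value):
    """Look up records through the index instead of scanning every record."""
    return [records[pos] for pos in indexes[attr].get(value, [])]

tools = find("type", "tool")
print([r["product"] for r in tools])
```

Every new record has to be added to each index, which is where the slower updating comes from.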

Direct or Hashed Access
With direct or hashed access, a portion of disk space is reserved and a "hashing" algorithm computes
the record address, so this kind of file requires additional space in the store. Records are
placed randomly throughout the file and are accessed by addresses that specify their disk
location. This type of file organization therefore requires disk storage rather than tape. It has
excellent search and retrieval performance, but care must be taken to maintain the indexes. If the
indexes become corrupt, the remaining data are effectively lost, so it is wise to take
regular backups of this kind of file, just as for all valuable stored data.
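
The following sketch models hashed placement in memory. The list `slots` stands in for the reserved disk space, the modulo function is a toy hashing algorithm, and resolving collisions by trying the next slot (linear probing) is an assumption of this example rather than something the text prescribes.

```python
NUM_SLOTS = 11  # reserved space; more slots than records (arbitrary size)
slots = [None] * NUM_SLOTS

def address(key):
    """Toy hashing algorithm: map an integer key to a slot number."""
    return key % NUM_SLOTS

def store(key, value):
    """Place a record at its hashed address and return the slot used."""
    pos = address(key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None or slots[pos][0] == key:
            slots[pos] = (key, value)
            return pos
        pos = (pos + 1) % NUM_SLOTS  # collision: try the next slot
    raise RuntimeError("file is full")

def fetch(key):
    """Return the value stored under key, or None if it is absent."""
    pos = address(key)
    for _ in range(NUM_SLOTS):
        if slots[pos] is None:
            return None
        if slots[pos][0] == key:
            return slots[pos][1]
        pos = (pos + 1) % NUM_SLOTS
    return None

first_slot = store(1234, "Ada")
second_slot = store(1245, "Ben")  # hashes to the same slot, so it probes onward
print(first_slot, second_slot, fetch(1245))
```

The empty slots are the additional space that this kind of file needs.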




Data Accessing Methods:


There are different ways to access data:
Sequential access:


Sequential access is a method whereby the records of a file are accessed in sequential order. The
records in a sequential file appear one after another in the order in which they were entered into the
computer and subsequently stored on the medium. Access to any record requires access to all of the
preceding records. Magnetic tape is a storage medium that is sequential in nature. To access a
particular record on magnetic tape, you must read all of the preceding records first. You could use
the sequential access method to record the individual student grades each week because you must
access and update all of the records of the student anyway.


Direct access:


Direct access, also called random access, is a method in which the records in a file are stored and
accessed in random order. A direct access file has a key, called a key field or access key, that lets the
computer locate, retrieve and update any record in the file without reading each preceding record. A
key field is a field that uniquely identifies each record. Account numbers, employee identification
numbers and social security numbers are examples of key fields.
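
One common way to get direct access is to use fixed-length records, so that a record's position in the file can be computed and reached with a single seek. The sketch below assumes a made-up record layout (a 4-byte account number followed by a 20-byte name) and uses the record's slot number in place of an address derived from the key.

```python
import os
import struct
import tempfile

# Fixed-length record: 4-byte account number + 20-byte name (layout is assumed).
RECORD = struct.Struct("<I20s")
path = os.path.join(tempfile.mkdtemp(), "accounts.dat")  # hypothetical file

def write_record(f, slot, account_no, name):
    f.seek(slot * RECORD.size)          # jump straight to the record's address
    f.write(RECORD.pack(account_no, name.encode("ascii")))

def read_record(f, slot):
    f.seek(slot * RECORD.size)          # no need to read preceding records
    account_no, raw_name = RECORD.unpack(f.read(RECORD.size))
    return account_no, raw_name.rstrip(b"\x00").decode("ascii")

with open(path, "w+b") as f:
    for slot, (no, name) in enumerate([(501, "Ada"), (502, "Ben"), (503, "Cy")]):
        write_record(f, slot, no, name)
    third = read_record(f, 2)   # read the last record first
    first = read_record(f, 0)

print(third, first)
```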


Indexed sequential access:


This type of access allows both sequential and direct access to the records in a file. An indexed
sequential file can be set up in many ways. Basically, records are stored sequentially when the
indexed sequential file is created. However, when records are later added to the file, they are stored out
of sequence in an overflow area.


The computer keeps an index of the key fields from each record and automatically sorts and updates
it to allow both sequential and direct access. To access a record, it searches the index by key field.
When it finds the key, it uses the address associated with that index entry to access the record
directly. For sequential processing, it starts at the first key field in the sorted index and follows
the rest of the index in order. The sorted index allows the computer to find records in sequence no
matter where they are physically located on a disk. In practice, multiple indexes usually narrow
down the location of each record. This type of file access does not work with tape because tape is a
sequential access medium only.
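
The combination of a sorted index and an overflow area can be sketched in Python as below. This is an
illustration under assumptions, not a description of any real file system: storage is a list whose
positions play the role of disk addresses, keys are assumed to be unique, and the class name
IndexedSequentialFile and its methods are hypothetical.

```python
import bisect


class IndexedSequentialFile:
    """Toy indexed sequential file: sorted index over records plus an overflow area."""

    def __init__(self, records):
        # Initial load: records are stored in key order.
        self.storage = sorted(records)               # list position = "address"
        self.keys = [key for key, _ in self.storage]  # sorted index keys
        self.addrs = list(range(len(self.storage)))   # address for each index key

    def add(self, key, data):
        """Append the record out of sequence (overflow) and update the sorted index."""
        self.storage.append((key, data))
        addr = len(self.storage) - 1
        pos = bisect.bisect_left(self.keys, key)
        self.keys.insert(pos, key)
        self.addrs.insert(pos, addr)

    def get(self, key):
        """Direct access: search the index, then go straight to the address."""
        pos = bisect.bisect_left(self.keys, key)
        if pos < len(self.keys) and self.keys[pos] == key:
            return self.storage[self.addrs[pos]][1]
        return None

    def scan(self, start_key=None):
        """Sequential access: follow the index in key order from start_key onward."""
        pos = 0 if start_key is None else bisect.bisect_left(self.keys, start_key)
        for addr in self.addrs[pos:]:
            yield self.storage[addr]
```

A record added later sits physically at the end of storage, yet scan() still returns it in its
proper key position because the index, not the physical order, drives the sequence.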





Project 1

  • 1. Name: OsikoyaOmotoyosi Samuel Project: The Database Accessor: HashimShaheen Level: HND Computing What is a Database? The database is a very broad topic and has various definitions but basically has one meaning. Oracle Corporation defines a database as a collection of physical files that are managed by an instance of their database software product where a file is a collection of related records that are stored as a single unit by an operating system. Microsoft SQL and Sybase defines a database as a collection of data pieces that have a common owner. One may get confused when working with different products, for example, the definition given by Microsoft for a database is exactly how Oracle defines a Schema. A database is a shared, integrated computer structure that stores a collection of End-user data, that is, raw facts of interest to the end user AND metadata, (or data about data) through which the end-user data are integrated and managed. Simply put, DB, as database is often called or represented, is an assembly of any information in an organized manner, such that a computer program or a person can easily access pieces of data depending on what is being searched for or needed.We can also say thatthe database consists of files and files are made up of records which consists of data pieces The metadata is a description of the data characteristics and the set of relationships that links the data found within the database. For example, the metadata component stores information such as the name of each data element, the type of values (numeric, dates, or text) stored on each data element, whether or not the data element can be left empty, and so on. The metadata also provide information that complements and expands the value and use of the data. Summarily, the metadata present a more complete picture of the data in the database. 
Due to the importance of data and then, the database, it is expedient to put things in the hands of an expert, a Database Specialist, who is specialized in the field. A database specialist controls and manages the software program for a company. This is the person responsible for the collection and management of typical information on prospects and customers of business in a given organization. While the database specialist manages the data that goes into the software program, the manager also works with various other employees in the company to collect, monitor and share the information that is collected.
  • 2. The first thing that a database specialist does when working for an employer or a client is to know what the client wants or needs. Databases are used across all types of industries, but each business or company has its own different needs and requirements because of the different works they do. Once the database specialist knows what a company or organization wants, the database specialist can evaluate the current software. Analyzing the current database software allows the database specialist to determine if the company is collecting the right or wrong information from customers. For some companies, it may be collecting information such as the name and mailing address of the customer, along with a phone number, email address and a record of the customer’s buying behaviour with the company. The database specialist will now use this information to report findings to different departments within the company. For example, a database specialist typically can run reports from the software. These reports can help the marketing department of the company determine how marketing campaigns affect customer buying behaviour. The marketing department also uses the information that the database specialist collects to send out marketing materials, including emails, brochures, direct mail campaigns, coupons, catalogs and more. The specialist also spends a lot of time updating the information that is in the software. For example, if the marketing department sends out a postcard to the entire database that includes a coupon code, the data specialist will work with the sales people who are taking the orders online and via telephone to track the response to the direct mail campaign. The database specialist will also receive postcards that get returned as undeliverable for various reasons. The specialist then contact the customer in an attempt to update their contact information in the system or remove any of the information that is not accurate. 
If the postcard gets returned with a new address, but the post office is unable to forward it because the forwarding has expired, then the database specialist can put the new address in the database. What a database looks like: Showing a database of customer information in a particular company Name Age Address Email Phone Number Date of Last business with customer Mrs. Jaden 65 2 Smith001@linuxmail.org +120235799764 11-12- Smith Montgomery 2012 road, Alaska, USA Mrs. Goldie 42 Plot 24b G_phillip@yahoo.com +447877474749 20-02- phillip green way, 2013 hatfield, England Mr. Samuel 19 2, Nimrod Robby@gmail.com Not Provided 3-03- Roberto Drive, 2013 Nairobi, Kenya
  • 3. Miss. 24 23, T.Balogun@rocketmail.com +2348035175111 TundeBalogun Adedoyin Street, Ogba, Lagos, Nigeria Mr. Patrick 51 23, Roe Patjose@yahoo.com +4474678373932 15-03- Jose Green lane, 2011 Hatfield, England Showing the User interface of a Database system New Customer? Register your Details correctly. Fields marked with (*)MUSTbe filled. Surname: Middle Initial: Date of Birth: dd mm yyyy First Name: Address: Email: Phone Number: Have an account already? Sign in here HOW DATABASES WORK ACROSS ORGANIZATIONS Role of the Database in an Organization An organization is traditionally viewed as a three level pyramid-operational activities at the bottom, management planning and control activities in the middle and strategic planning and policy making in top management. The corporate database contains data relating to the organization, its operations, its plan and its environment.
  • 4. State of Database Management in Organizations The needs of organizations and management are changeable, diverse and often ill-defined, yet they must be met. Added to these are outside pressures from federal taxing authorities, federal securities agencies and legislators making privacy laws. Both internal and external forces demand that organizations exercise control over their data resources. Decisions and actions in the organization are based upon the image contained in the corporate database. Managerial decisions direct the actions at the operational level and produce plans and expectations which are formally captured and stored in the corporate database. Transactions record actual results of organizational activities and environmental changes and update the database to maintain a current image. Basically, People in the organization query the database for information to conduct the daily operations. Middle management receives reports comparing actual results to previously recorded plans and expectations. The corporate database provides data for modeling and forecasting which support top management needs. The corporate database supports all levels of an organization and is vital for operations, decision making and the management process. While management seeks to control data resources, computer applications grow. When a corporation achieves comprehensive support of its operations, for instance, computer applications begin to penetrate into higher management levels. With comprehensive database support of operations, an MIS can mature as a tool for planning, control and decision making. Earlier, in the development of an MIS, an organization must appoint a DBA to manage its data resources. While an organization’s move toward the database approach can be hastened by the acquisition of a DBMS, the latter is not necessary. 
Most commercially available DBMS’s fall substantially short of ideal capabilities, making their acquisition an interim measure - a move to help the organization learn how to operate in a managed data environment. In seeking DBMS capability, building one’s own system is unrealistic except for large organizations with special needs, such as a very large database or large volumes of known transactions requiring rapid online response. Data is a vital resource in an organization and must be managed. The organizational database is an essential component in a management information system. Of the four components of a data processing system, attention to data has lagged behind the development of machines and programming technology. Taking a database approach requires an organization to focus on data as a valued resource. Data is separate from programs and application systems which use it.
  • 5. Typical database applications include: Banking: all transactions Airlines: reservations, schedules Universities: registration, grades Sales: customers, products, purchases Online retailers: order tracking, customized recommendations Manufacturing: production, inventory, orders, supply chain Human resources: employee records, salaries, tax deductions etc. Every day, everybody makes use of data, hence the need for database. From small scale businesses to large corporation, to individuals etc., the importance of data and database cannot be be-littled. But what is data? Data can be defined as raw facts. The term raw implies that the data has not been processed or worked upon to give implied meaning. For example, the digits 2344223783 is a raw fact, as it has not been worked upon to give any implied meaning. After processing by respective end-users, it could mean so many things. It could be some amount of money been stored away in a bank, hence it's a calculated Currency to that user, with its unit as the currency of the country. It could be a social security number SSN, of a US citizen. It could be the number of citizens living in a particular country, at a particular time, hence it's a demographic figure, and so on. Such processed raw facts or data is called Information. Information is the result of processing raw data to reveal its meaning. Data processing can be as simple as organizing data to reveal patterns or as complex as making forecasts or drawing inferences using statistical modeling. To reveal meaning, information requires context. For example, an average temperature reading of 105 degrees does not mean much unless you also know its context: Is this in degrees Fahrenheit or Celsius? Is this a machine temperature, a body temperature, or an outside air temperature? It should be noted that raw data must be properly formatted for storage, processing, and presentation. As it is,Timely and useful information requires accurate data. 
Such data must be properly generated and stored in a format that is easy to access and process. And, like any basic resource, the data environment must be managed carefully. Hence, Data management is a discipline that focuses on the proper generation, storage, and retrieval of data. Given the crucial role that data play, it is seen that data management is a core activity for any business, government agency, service organization, or charity. But efficient data management typically requires the use of a computer database which is usually operated by a database administrator. DATABASE ADMINISTRATOR
  • 6. This is a person who handles the environmental aspect of a database. He/she is the person who has complete control on the database. Roles of a Database Administrator The database administrator performs a critical role within an organization and is an important and key role in Database Management Systems. The major responsibility of a database administrator is to handle the process of developing the database and maintaining the database of an organization. The database administrator is responsible for defining the internal layout of the database and ensuring the internal layout optimizes system performance. The database administrator has full access over all type of important data of an organization. The database administrator decides what data will be stored in the database and how to organize data in database so that it can be access easily on requirement or need of an organization. To design the database of an organization, the database administrator must have a meeting with users and determine their requirements. The database administrator is also responsible for preparing documentation, including recording the procedures, standards, guidelines, and data descriptions necessary for the efficient and continuing use of the database environment. Documents should include materials to help end users, database application programmers, the operation staff, and all personnel connected with the database management system. The database administrator is responsible for monitoring the database environment, such as seeing that the database is meeting performance standards, making sure the accuracy, integrity, and security of data are maintained. The database administrator is also responsible to manage any enhancements into the database environment. 
Other roles of Data Administrator may include: (a) Establishing the needs of users and monitoring user access and security; (b) Monitoring performance and managing parameters to provide fast query responses to 'front end' users; (c) Mapping out the 'conceptual design' for a planned database in outline; (d) Considering both 'back end' organization of data and 'front end' accessibility for end users and; (e) Refining the 'logical design' so that it can be translated into a specific data model and more. ORGANIZATIONAL ISSUES In the heart of the matter some issues that affects the use of database within an organization. A database security manager is the most important asset to maintaining and securing sensitive data within an organization. Database security managers are required to multitask and juggle a variety of
  • 7. headaches that accompany the maintenance of a secure database. If you own a business it is important to understand some of the security problems that occur within an organization and how to avoid them. If you understand the how, where, and why of database security you can prevent future problems from occurring. Daily Maintenance: Database audit logs require daily review to make certain that there has been no data misuse. This requires overseeing database privileges and then consistently updating user access accounts. A database security manager also provides different types of access control for different users and assesses new programs that are performing with the database. If these tasks are performed on a daily basis, you can avoid a lot of problems with users that may pose a threat to the security of the database. Varied Security Methods for Applications: More often than not applications developers will vary the methods of security for different applications that are being utilized within the database. This can create difficulty with creating policies for accessing the applications. The database must also possess the proper access controls for regulating the varying methods of security otherwise sensitive data is at risk. Post-Upgrade Evaluation: When a database is upgraded it is necessary for the administrator to perform a post-upgrade evaluation to ensure that security is consistent across all programs. Failure to perform this operation opens up the database to attack. Split the Position: Sometimes organizations fail to split the duties between the IT administrator and the database security manager. Instead the company tries to cut costs by having the IT administrator do everything. This action can significantly compromise the security of the data due to the responsibilities involved with both positions. The IT administrator should manage the database while the security manager performs all of the daily security processes. 
Application Spoofing: Hackers are capable of creating applications that resemble the existing applications connected to the database. These unauthorized applications are often difficult to identify and allow hackers access to the database via the application in disguise. Manage User Passwords: Sometimes IT database security managers will forget to remove IDs and access privileges of former users which leads to password vulnerabilities in the database. Password rules and maintenance needs to be strictly enforced to avoid opening up the database to unauthorized users. Windows OS Flaws: Windows operating systems are not effective when it comes to database security. Often theft of passwords is prevalent as well as denial of service issues. The database security manager can take precautions through routine daily maintenance checks. B. DATABASE MANAGEMENT SYSTEM (DBMS) A database management system (DBMS) is a collection of programs that manages the database structure and controls access to the data stored in the database. In a sense, a database resembles a very well-organized electronic filing cabinet in which powerful software, known as a database management system, helps manage the cabinet’s contents. The figure below shows an overview of the basic concepts of the database and the DBMS
  • 8. software, known as the Database System. A SIMPLIFIED DATABASE SYSTEM ENVIRONMENT. Database system USERS/PROGRAMMERS APPLICATION PROGRAMS/QUERIES DBMS SOFTWARE SOFTWARE TO PROCESS QUERIES/PROGRAMS SOFTWARE TO ACCESS STRORED DATA
  • 9. STORED DATABASE STORED DEFINITION (META-DATA) DATABASEE The DBMS serves as the intermediary between the user and the database. The database structure itself is stored as a collection of files, and the only way to access the data in those files is through the DBMS. The figure below emphasizes the point that the DBMS presents the end user (or application program) with a single, integrated view of the data in the database. The DBMS receives all application requests and translates them into the complex operations required to fulfill those requests. The DBMS hides much of the database’s internal complexity from the application programs and users. The application program might be written by a programmer using a programming language such as Visual Basic.NET, Java, or C#, or it might be created through a DBMS utility program. A database management system (DBMS) is a collection of programs that enables users to create and main involv es specify ing the data types, structu res, and constra ints of the data to be stored in the database. The database definition or descriptive information is also stored by the DBMS in the form of a database catalog or dictionary; it is called meta-data. Constructing the database is the process of storing the data on some storage medium that is controlled by the DBMS. Manipulating a database includes functions such as querying the database to retrieve specific data, updating the database to reflect changes in the miniworld, and generating reports from the data. Sharing a database allows multiple users and programs to access the database simultaneously. An application program accesses the database by sending queries or requests for data to the DBMS. A query typically causes some data to be retrieved; a transaction may cause some data to be read and some data to be written into the database. Other important functions provided by the DBMS include protecting the database and
  • 10. maintaining it over a long period of time. Protection includes system protection against hardware or software malfunction (or crashes) and security protection against unauthorized or malicious access. A typical large database may have a life cycle of many years, so the DBMS must be able to maintain the database system by allowing the system to evolve as requirements change over time. It is not absolutely necessary to use general-purpose DBMS software to implement a computerized database. We could write our own set of programs to create and maintain the database, in effect creating our own special-purpose DBMS software. In either case—whether we use a general- purpose DBMS or not—we usually have to deploy a considerable amount of complex software. In fact, most DBMS s are very complex software systems. Having a DBMS between the end user’s applications and the database offers some important advantages. First, the DBMS enables the data in the database to be shared among multiple applications or users. Second, the DBMS integrates the many different users’ views of the data into a single all-encompassing data repository. Because data are the crucial raw material from which information is derived, you must have a good method to manage such data. The DBMS helps make data management more efficient and effective. In particular, a DBMS provides advantages such as: Improved Data Sharing. The DBMS helps create an environment in which end users have better access to more and better-managed data. Such access makes it possible for end users to respond quickly to changes in their environment. Improved Data Access. The DBMS makes it possible to produce quick answers to ad hoc queries. From a database perspective, a query is a specific request issued to the DBMS for data manipulation—for example, to read or update the data. Simply put, a query is a question, and an ad hoc query is a spur-of-the-moment question. 
The DBMS sends back an answer (called the query result set) to the application. For example, end users, when dealing with large amounts of sales data, might want quick answers to questions (ad hoc queries) Such as: What was the dollar volume of sales by product during the past six months? What is the sales bonus figure for each of our salespeople during the past three months? How many of our customers have credit balances of $3,000 or more? Improved Data Security. The more users access the data, the greater the risks of data security breaches. Corporations invest considerable amounts of time, effort, and money to ensure that corporate data are used properly. A DBMS provides a framework for better enforcement of data privacy and security policies.
  • 11. Minimized Data Inconsistency. Data inconsistency exists when different versions of the same data appear in different places. For example, data inconsistency exists when a company’s sales department stores a sales representative’s name as ―Andy Colen‖ and the company’s personnel department stores that same person’s name as ―Smith A Colen,‖ or when the company’s regional sales office shows the price of a product as $45.95 and its national sales office shows the same product’s price as $43.95. The probability of data inconsistency is greatly reduced in a properly designed database. Improved Decision Making. Better-managed data and improved data access make it possible to generate better-quality information, on which better decisions are based. The quality of the information generated depends on the quality of the underlying data. Data quality is a comprehensive approach to promoting the accuracy, validity, and timeliness of the data. While the DBMS does not guarantee data quality, it provides a framework to facilitate data quality initiatives. Better Data Integration. Wider access to well-managed data promotes an integrated view of the organization’s operations and a clearer view of the big picture. It becomes much easier to see how actions in one segment of the company affect other segments. Increased End-User Productivity. The availability of data, combined with the tools that transform data into useful information, empowers end users to make quick, informed decisions that can make the difference between success and failure in the global economy. METHODS OF DATA ORGANIZATION AND ACCESS Data organization is the permanent logical structure of the file. You tell the computer how to retrieve records from the file by specifying the access mode. Organizing and accessing data are two of the driving forces behind data management. Organizing data involves arranging data in storage so that they may be easily accessed. 
Accessing data refers to retrieving data from storage. Data organization and access are important determinants of how easily managers and users can obtain the information they need to do their jobs. Since some organization and access schemes provide faster or more flexible ways to locate individual records than others, it is important for managers to anticipate what data they and their subordinates will need when designing files and databases. Data Organizing Methods: There are different types of ways to organize data:
  • 12. Sequential Organization A sequential file contains records organized in the order they were entered. The order of the records is fixed. The records are stored and sorted in physical, contiguous blocks within each block the records are in sequence. Records in these files can only be read or written sequentially. Once stored in the file, the record cannot be made shorter, or longer, or deleted. However, the record can be updated if the length does not change. (This is done by replacing the records by creating a new file.) New records will always appear at the end of the file. If the order of the records in a file is not important, sequential organization will suffice, no matter how many records you may have. Sequential output is also useful for report printing or sequential reads which some programs prefer to do. Line-Sequential Organization Line-sequential files are like sequential files, except that the records can contain only characters as data. Line-sequential files are maintained by the native byte stream files of the operating system. Indexed-Sequential Organization Key searches are improved by this system too. The single-level indexing structure is the simplest one where a file, whose records are pairs, contains a key pointer. This pointer is the position in the data file of the record with the given key. A subset of the records, which are evenly spaced along the data file, is indexed, in order to mark intervals of data records. This is how a key search is performed: the search key is compared with the index keys to find the highest index key coming in front of the search key, while a linear search is performed from the record that the index key points to, until the search key is matched or until the record pointed to by the next index entry is reached. Regardless of double file access (index + data) required by this sort of search, the access time reduction is significant compared with sequential file searches. 
It is important to note that the hardware for Index-Sequential Organization is usually Disk-based, rather than tape. Records are physically ordered by primary key. And the index gives the physical location of each record. Records can be accessed sequentially or directly, via the index. The index is stored in a file and read into memory at the point when the file is opened. Also, indexes must be maintained. Inverted List In file organization, this is a file that is indexed on many of the attributes of the data itself. The inverted list method has a single index for each key type. The records are not necessarily stored in a sequence. They are placed in the data storage area, but indexes are updated for the record keys and location. Here's an example, in a company file, an index could be maintained for all productsand another one might be maintained for product types. Thus, it is faster to search the indexes than every record. These types of file are also known as "inverted indexes." Nevertheless, inverted list files use more media space and the storage devices get full quickly with this type of organization. The benefits are apparent immediately because searching is fast. However, updating is much slower.
  • 13. Direct or Hashed Access
With direct or hashed access, a portion of disk space is reserved and a "hashing" algorithm computes the record address, so this kind of file requires additional space in the store. Records are placed randomly throughout the file and are accessed by addresses that specify their disk location. This type of file organization therefore requires disk storage rather than tape. It has excellent search and retrieval performance, but care must be taken to maintain the indexes. If the indexes become corrupt, what is left might as well go to the bit-bucket, so it is as well to keep regular backups of this kind of file, just as for all valuable stored data!

Data Accessing Methods:
There are different ways to access data:

Sequential access: Sequential access is a method whereby the records of a file are accessed in sequential order. The records in a sequential file appear one after another, in the order in which they were entered into the computer and subsequently stored on the medium. Access to any record requires access to all of the preceding records. Magnetic tape is a storage medium that is sequential in nature: to access a particular record on magnetic tape, you must read all of the preceding records first. You could use the sequential access method to record individual student grades each week, because you must access and update all of the student records anyway.

Direct access: Direct access, also called random access, is a method in which the records in a file are stored and accessed in random order. A direct access file has a key, called a key field or access key, that lets the computer locate, retrieve and update any record in the file without reading each preceding record. A key field is a field that uniquely identifies each record. Account numbers, employee identification numbers and social security numbers are examples of key fields.
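The way a hashing algorithm turns a key field into a record address can be sketched as follows. This is a minimal model under stated assumptions: the reserved disk space is a fixed-size Python list, the hash is a simple remainder on integer keys, and collisions are resolved by trying the next slot (linear probing), a technique the text above does not specify. The names `NUM_SLOTS`, `address`, `store` and `fetch` are hypothetical.

```python
NUM_SLOTS = 11  # assumed size of the reserved space

# The reserved space: every slot is empty until a record is stored in it.
slots = [None] * NUM_SLOTS


def address(key):
    """Hashing algorithm: compute the home slot from the key field."""
    return key % NUM_SLOTS


def store(key, record):
    """Place the record at its computed address, probing forward on collision."""
    start = address(key)
    for step in range(NUM_SLOTS):
        pos = (start + step) % NUM_SLOTS
        if slots[pos] is None or slots[pos][0] == key:
            slots[pos] = (key, record)
            return pos
    raise RuntimeError("reserved space is full")


def fetch(key):
    """Go straight to the computed address; no preceding records are read."""
    start = address(key)
    for step in range(NUM_SLOTS):
        pos = (start + step) % NUM_SLOTS
        if slots[pos] is None:
            return None  # an empty slot means the key was never stored
        if slots[pos][0] == key:
            return slots[pos][1]
    return None


# Hypothetical employee identification numbers used as key fields.
store(1001, "Ada")
store(1012, "Grace")  # hashes to the same slot as 1001, so it moves to the next one
store(1005, "Linus")
```

The reserved slots that stay empty are the additional space the text mentions: the file must be larger than the data it holds so that most keys land at or near their computed address.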
  • 14. Indexed sequential access: This type of access allows both sequential and direct access to the records in a file. An indexed sequential file can be set up in many ways. Basically, records are stored sequentially when the indexed sequential file is created. However, when records are later added to the file, they are stored out of sequence in an overflow area. The computer keeps an index of the key fields from each record, and it automatically sorts and updates the index to allow both sequential and direct access. To access a record directly, it searches the index by key field; when it finds the key field, it uses the address associated with that key to go straight to the record. To access records in sequence, it starts at the address associated with the first key field in the sorted index and follows the rest of the index in order. The sorted index allows the computer to find records in sequence no matter where they are physically located on a disk. In practice, multiple indexes usually narrow down the location of each record. This type of file access does not work with tape, because tape is a sequential access medium only.

REFERENCES
Oppel, A. (2011), Databases Demystified, McGraw-Hill Companies, Pages 1 – 3, accessed on March 3, 2013
www.management-hub.com, About MIS - Role of Database in an Organization.html, accessed on March 4, 2013
Coronel, Morris, Rob, Database Systems (Design, Implementation and Management), 9th Edition
Elmasri, Navathe, Fundamentals of Database Systems, 6th Edition
http://publib.boulder.ibm.com/infocenter/iadthelp/v7r0/index.jsp?topic=/com.ibm.etools.iseries.langref.doc/c0925395156.htm accessed on March 5, 2013
http://articles.submityourarticle.com/data-access-and-organization-methods-219812 accessed on March 5, 2013
  • 15. http://www.spamlaws.com/database-security-issues.html accessed on March 5, 2013
Lorette K., Wallace O., 2003-2013 Conjecture Corporation, http://www.wisegeek.com/what-does-a-database-specialist-do.htm accessed on