SlideShare ist ein Scribd-Unternehmen logo
1 von 41
Foundations of Business
Intelligence: Databases and
Information Management
Terminologies Used….
Analytic Entity
Attribute Entity-relationship diagram
Big data Field
Bit File
Byte Foreign key
Data administration Hadoop
Data cleansing In-memory computing
Data definition Information policy
Data dictionary Key field
Data governance Non-relational database management systems
Data inconsistency Normalization
Data manipulation language Online analytical processing (OLAP)
Data mart Primary key
Data mining Program-data dependence
Data quality audit Record
Data redundancy Referential integrity
Data warehouse Relational DBMS
Database Sentiment analysis
Database administration Structured Query Language (SQL)
Database management system (DBMS) Text mining
Database server Tuple
Web mining
Learning Objectives
• 1 What are the problems of managing data resources in a traditional
file environment?
• 2 What are the major capabilities of database management systems
(DBMS), and why is a relational DBMS so powerful?
• 3 What are the principal tools and technologies for accessing
information from databases to improve business performance and
decision making?
• 4 Why are information policy, data administration, and data quality
assurance essential for managing the firm’s data resources?
Video Cases
• Case 1: Dubuque Uses Cloud Computing and Sensors to Build a
Smarter City
• Case 2: Brooks Brothers Closes In on Omnichannel Retail
• Case 3: Maruti Suzuki Business Intelligence and Enterprise Databases
File Organization Terms and Concepts
• Database: Group of related files
• File: Group of records of same type
• Record: Group of related fields
• Field: Group of characters as word(s) or number(s)
• Entity: Person, place, thing on which we store information
• Attribute: Each characteristic, or quality, describing entity
The Data
Hierarchy
Problems with the Traditional File
Environment
• Files maintained separately by different departments
• Data redundancy
• Data inconsistency
• Program-data dependence
• Lack of flexibility
• Poor security
• Lack of data sharing and availability
Traditional
File
Processing
Database Management Systems
• Database
– Serves many applications by centralizing data and controlling redundant data
• Database management system (DBMS)
– Interfaces between applications and physical data files
– Separates logical and physical views of data
– Solves problems of traditional file environment
• Controls redundancy
• Eliminates inconsistency
• Uncouples programs and data
• Enables organization to centrally manage data and data security
Human Resources Database with Multiple Views
Relational DBMS
• Represent data as two-dimensional tables
• Each table contains data on entity and attributes
• Table: grid of columns and rows
• Rows (tuples): Records for different entities
• Fields (columns): Represents attribute for entity
• Key field: Field used to uniquely identify each record
• Primary key: Field in table used for key fields
• Foreign key: Primary key used in second table as look-up field to identify
records from original table
Relational Database Tables
Operations of a Relational DBMS
• Three basic operations used to develop useful sets of data
• SELECT
• Creates subset of data of all records that meet stated criteria
• JOIN
• Combines relational tables to provide user with more information than available in
individual tables
• PROJECT
• Creates subset of columns in table, creating tables with only the information specified
The Three Basic Operations of a Relational DBMS
Capabilities of Database Management
Systems
• Data definition capability
• Data dictionary
• Querying and reporting
• Data manipulation language
• Structured Query Language (SQL)
• Many DBMS have report generation capabilities for creating polished
reports (Microsoft Access)
Access Data Dictionary Features
Example of an SQL Query
An Access Query
Designing Databases
• Conceptual design vs. physical design
• Normalization
• Streamlining complex groupings of data to minimize redundant data elements
and awkward many-to-many relationships
• Referential integrity
• Rules used by RDBMS to ensure relationships between tables remain
consistent
• Entity-relationship diagram
• A correct data model is essential for a system serving the business
well
An Unnormalized Relation for Order
Normalized Tables Created from Order
An Entity-Relationship Diagram
Non-relational Databases and Databases in
the Cloud
• Non-relational databases: “NoSQL”
• More flexible data model
• Data sets stored across distributed machines
• Easier to scale
• Handle large volumes of unstructured and structured data
• Databases in the cloud
• Appeal to start-ups, smaller businesses
• Amazon Relational Database Service, Microsoft SQL Azure
• Private clouds
The Challenge of Big Data
• Big data
• Massive sets of unstructured/semi-structured data from web traffic, social
media, sensors, and so on
• Volumes too great for typical DBMS
• Petabytes, exabytes of data
• Can reveal more patterns, relationships and anomalies
• Requires new tools and technologies to manage and analyze
Business Intelligence Infrastructure (1 of 3)
• Array of tools for obtaining information from separate systems and
from big data
• Data warehouse
– Stores current and historical data from many core operational transaction
systems
– Consolidates and standardizes information for use across enterprise, but data
cannot be altered
– Provides analysis and reporting tools
Business Intelligence Infrastructure (2 of 3)
• Data marts
– Subset of data warehouse
– Typically focus on single subject or line of business
• Hadoop
• Enables distributed parallel processing of big data across inexpensive
computers
• Key services
• Hadoop Distributed File System (HDFS): data storage
• MapReduce: breaks data into clusters for work
• Hbase: NoSQL database
• Used Yahoo, NextBio
Business Intelligence Infrastructure (3 of 3)
• In-memory computing
• Used in big data analysis
• Uses computers main memory (RAM) for data storage to avoid delays in
retrieving data from disk storage
• Can reduce hours/days of processing to seconds
• Requires optimized hardware
• Analytic platforms
• High-speed platforms using both relational and non-relational tools optimized
for large datasets
Contemporary Business Intelligence Infrastructure
Analytical Tools: Relationships, Patterns,
Trends
• Tools for consolidating, analyzing, and providing access to vast
amounts of data to help users make better business decisions
• Multidimensional data analysis (OLAP - Online Analytical Processing)
• Data mining
• Text mining
• Web mining
Online Analytical Processing (OLAP)
• Supports multidimensional data analysis
• Viewing data using multiple dimensions
• Each aspect of information (product, pricing, cost, region, time period) is
different dimension
• Example: How many washers sold in the East in June compared with other
regions?
• OLAP enables rapid, online answers to ad hoc queries
Multidimensional Data Model
Data Mining
• Finds hidden patterns, relationships in datasets
• Example: customer buying patterns
• Infers rules to predict future behavior
• Types of information obtainable from data mining:
• Associations
• Sequences
• Classification
• Clustering
• Forecasting
Text Mining and Web Mining
• Text mining
• Extracts key elements from large unstructured data sets
• Sentiment analysis software
• Web mining
• Discovery and analysis of useful patterns and information from web
• Web content mining
• Web structure mining
• Web usage mining
Databases and the Web
–Many companies use the web to make some internal databases
available to customers or partners
–Typical configuration includes:
• Web server
• Application server/middleware/CGI scripts
• Database server (hosting DBMS)
–Advantages of using the web for database access:
• Ease of use of browser software
• Web interface requires few or no changes to database
• Inexpensive to add web interface to system
Linking Internal Databases to the Web
Establishing an Information Policy
• Firm’s rules, procedures, roles for sharing, managing, standardizing
data
• Data administration
• Establishes policies and procedures to manage data
• Data governance
• Deals with policies and processes for managing availability, usability, integrity,
and security of data, especially regarding government regulations
• Database administration
• Creating and maintaining database
Ensuring Data Quality
• More than 25 percent of critical data in Fortune 1000 company
databases are inaccurate or incomplete
• Before new database is in place, a firm must:
• Identify and correct faulty data
• Establish better routines for editing data once database in operation
• Data quality audit
• Data cleansing
Foundations of business intelligence databases and information management
Foundations of business intelligence databases and information management
Foundations of business intelligence databases and information management
Foundations of business intelligence databases and information management

Weitere ähnliche Inhalte

Was ist angesagt?

MDM Strategy & Roadmap
MDM Strategy & RoadmapMDM Strategy & Roadmap
MDM Strategy & Roadmap
victorlbrown
 
Laudon mis12 ppt01
Laudon mis12 ppt01Laudon mis12 ppt01
Laudon mis12 ppt01
Norazila Mat
 

Was ist angesagt? (20)

Data Management
Data Management Data Management
Data Management
 
James hall ch 9
James hall ch 9James hall ch 9
James hall ch 9
 
Chapter 4 MIS
Chapter 4 MISChapter 4 MIS
Chapter 4 MIS
 
Chapter 5 MIS
Chapter 5 MISChapter 5 MIS
Chapter 5 MIS
 
Ethical and social issues in management information systems for BBA hons pro...
Ethical and social issues in management information systems  for BBA hons pro...Ethical and social issues in management information systems  for BBA hons pro...
Ethical and social issues in management information systems for BBA hons pro...
 
MDM Strategy & Roadmap
MDM Strategy & RoadmapMDM Strategy & Roadmap
MDM Strategy & Roadmap
 
Management Information System [Kenneth Laudon]
Management Information System [Kenneth Laudon]Management Information System [Kenneth Laudon]
Management Information System [Kenneth Laudon]
 
Chapter 6 MIS
Chapter 6 MISChapter 6 MIS
Chapter 6 MIS
 
5 Level of MDM Maturity
5 Level of MDM Maturity5 Level of MDM Maturity
5 Level of MDM Maturity
 
MIS-CH05: IT Infrastructure and Emerging Technologies
MIS-CH05: IT Infrastructure and Emerging TechnologiesMIS-CH05: IT Infrastructure and Emerging Technologies
MIS-CH05: IT Infrastructure and Emerging Technologies
 
Master Data Management
Master Data ManagementMaster Data Management
Master Data Management
 
MIS-CH10: e-Commerce: Digital Markets, Digital Goods
MIS-CH10: e-Commerce: Digital Markets, Digital GoodsMIS-CH10: e-Commerce: Digital Markets, Digital Goods
MIS-CH10: e-Commerce: Digital Markets, Digital Goods
 
Enterprise Systems
Enterprise SystemsEnterprise Systems
Enterprise Systems
 
Chapter 11 MIS
Chapter 11 MISChapter 11 MIS
Chapter 11 MIS
 
Chapter 2
Chapter 2Chapter 2
Chapter 2
 
MIS Chapter 2
MIS Chapter 2MIS Chapter 2
MIS Chapter 2
 
MIS-CH02: Global e-Business and Collaboration
MIS-CH02: Global e-Business and CollaborationMIS-CH02: Global e-Business and Collaboration
MIS-CH02: Global e-Business and Collaboration
 
‏‏‏‏‏‏‏‏‏‏Chapter 12: Data Quality Management
‏‏‏‏‏‏‏‏‏‏Chapter 12: Data Quality Management‏‏‏‏‏‏‏‏‏‏Chapter 12: Data Quality Management
‏‏‏‏‏‏‏‏‏‏Chapter 12: Data Quality Management
 
Managing the digital firm
Managing the digital firmManaging the digital firm
Managing the digital firm
 
Laudon mis12 ppt01
Laudon mis12 ppt01Laudon mis12 ppt01
Laudon mis12 ppt01
 

Ähnlich wie Foundations of business intelligence databases and information management

4- DB Ch6 18-3-2020.pptx
4- DB Ch6 18-3-2020.pptx4- DB Ch6 18-3-2020.pptx
4- DB Ch6 18-3-2020.pptx
Shoaibmirza18
 
History of database processing module 1 (2)
History of database processing module 1 (2)History of database processing module 1 (2)
History of database processing module 1 (2)
chottu89
 
01-Database Administration and Management.pdf
01-Database Administration and Management.pdf01-Database Administration and Management.pdf
01-Database Administration and Management.pdf
TOUSEEQHAIDER14
 
Dbms and it infrastructure
Dbms and  it infrastructureDbms and  it infrastructure
Dbms and it infrastructure
projectandppt
 

Ähnlich wie Foundations of business intelligence databases and information management (20)

RowanDay4.pptx
RowanDay4.pptxRowanDay4.pptx
RowanDay4.pptx
 
4- DB Ch6 18-3-2020.pptx
4- DB Ch6 18-3-2020.pptx4- DB Ch6 18-3-2020.pptx
4- DB Ch6 18-3-2020.pptx
 
Lecture-1.ppt
Lecture-1.pptLecture-1.ppt
Lecture-1.ppt
 
Management information system database management
Management information system database managementManagement information system database management
Management information system database management
 
Chapter5
Chapter5Chapter5
Chapter5
 
History of database processing module 1 (2)
History of database processing module 1 (2)History of database processing module 1 (2)
History of database processing module 1 (2)
 
CST204 DBMS Module-1
CST204 DBMS Module-1CST204 DBMS Module-1
CST204 DBMS Module-1
 
dbms introduction.pptx
dbms introduction.pptxdbms introduction.pptx
dbms introduction.pptx
 
CS3270 - DATABASE SYSTEM - Lecture (1)
CS3270 - DATABASE SYSTEM -  Lecture (1)CS3270 - DATABASE SYSTEM -  Lecture (1)
CS3270 - DATABASE SYSTEM - Lecture (1)
 
01-Database Administration and Management.pdf
01-Database Administration and Management.pdf01-Database Administration and Management.pdf
01-Database Administration and Management.pdf
 
Introduction to RDBMS
Introduction to RDBMSIntroduction to RDBMS
Introduction to RDBMS
 
Database management system
Database management systemDatabase management system
Database management system
 
Database
DatabaseDatabase
Database
 
Dbms and it infrastructure
Dbms and  it infrastructureDbms and  it infrastructure
Dbms and it infrastructure
 
(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)(Dbms) class 1 & 2 (Presentation)
(Dbms) class 1 & 2 (Presentation)
 
Chapter 5 data resource management
Chapter 5  data resource managementChapter 5  data resource management
Chapter 5 data resource management
 
Dbms
DbmsDbms
Dbms
 
Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)Data Lakehouse, Data Mesh, and Data Fabric (r2)
Data Lakehouse, Data Mesh, and Data Fabric (r2)
 
DBMS Lecture 1.ppt
DBMS Lecture 1.pptDBMS Lecture 1.ppt
DBMS Lecture 1.ppt
 
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?Data Lake Acceleration vs. Data Virtualization - What’s the difference?
Data Lake Acceleration vs. Data Virtualization - What’s the difference?
 

Mehr von Amity University | FMS - DU | IMT | Stratford University | KKMI International Institute | AIMA | DTU

All About DBMS - Interview Question and Answers
All About DBMS - Interview Question and AnswersAll About DBMS - Interview Question and Answers

Mehr von Amity University | FMS - DU | IMT | Stratford University | KKMI International Institute | AIMA | DTU (20)

All About DBMS - Interview Question and Answers
All About DBMS - Interview Question and AnswersAll About DBMS - Interview Question and Answers
All About DBMS - Interview Question and Answers
 
Concept of Governance - Management of Operational Risk for IT Officers/Execut...
Concept of Governance - Management of Operational Risk for IT Officers/Execut...Concept of Governance - Management of Operational Risk for IT Officers/Execut...
Concept of Governance - Management of Operational Risk for IT Officers/Execut...
 
Emerging Technologies in IT
Emerging Technologies in ITEmerging Technologies in IT
Emerging Technologies in IT
 
Introduction to DBMS - Notes in Layman...
Introduction to DBMS - Notes in Layman...Introduction to DBMS - Notes in Layman...
Introduction to DBMS - Notes in Layman...
 
Fundamentals of DBMS
Fundamentals of DBMSFundamentals of DBMS
Fundamentals of DBMS
 
CASE (Computer Aided Software Design)
CASE (Computer Aided Software Design)CASE (Computer Aided Software Design)
CASE (Computer Aided Software Design)
 
SOFTWARE RELIABILITY AND QUALITY ASSURANCE
SOFTWARE RELIABILITY AND QUALITY ASSURANCESOFTWARE RELIABILITY AND QUALITY ASSURANCE
SOFTWARE RELIABILITY AND QUALITY ASSURANCE
 
Software Testing (Contd..) SDLC Model
Software Testing (Contd..) SDLC ModelSoftware Testing (Contd..) SDLC Model
Software Testing (Contd..) SDLC Model
 
Software Testing - SDLC Model
Software Testing - SDLC ModelSoftware Testing - SDLC Model
Software Testing - SDLC Model
 
Coding - SDLC Model
Coding - SDLC ModelCoding - SDLC Model
Coding - SDLC Model
 
Software Design - SDLC Model
Software Design - SDLC ModelSoftware Design - SDLC Model
Software Design - SDLC Model
 
Models of SDLC (Contd..) & Feasibility Study
Models of SDLC (Contd..)  & Feasibility StudyModels of SDLC (Contd..)  & Feasibility Study
Models of SDLC (Contd..) & Feasibility Study
 
Models of SDLC (Software Development Life Cycle / Program Development Life Cy...
Models of SDLC (Software Development Life Cycle / Program Development Life Cy...Models of SDLC (Software Development Life Cycle / Program Development Life Cy...
Models of SDLC (Software Development Life Cycle / Program Development Life Cy...
 
Introduction to Software Engineering
Introduction to Software EngineeringIntroduction to Software Engineering
Introduction to Software Engineering
 
CLOUD SECURITY IN INSURANCE INDUSTRY WITH RESPECT TO INDIAN MARKET
CLOUD SECURITY IN INSURANCE INDUSTRY WITH RESPECT TO INDIAN MARKETCLOUD SECURITY IN INSURANCE INDUSTRY WITH RESPECT TO INDIAN MARKET
CLOUD SECURITY IN INSURANCE INDUSTRY WITH RESPECT TO INDIAN MARKET
 
Application Software
Application SoftwareApplication Software
Application Software
 
Application Software – Horizontal & Vertical Software
Application Software – Horizontal & Vertical SoftwareApplication Software – Horizontal & Vertical Software
Application Software – Horizontal & Vertical Software
 
Software: Systems and Application Software
Software:  Systems and Application SoftwareSoftware:  Systems and Application Software
Software: Systems and Application Software
 
Programming Languages
Programming LanguagesProgramming Languages
Programming Languages
 
Number Codes and Registers
Number Codes and RegistersNumber Codes and Registers
Number Codes and Registers
 

Kürzlich hochgeladen

Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
ZurliaSoop
 

Kürzlich hochgeladen (20)

Application orientated numerical on hev.ppt
Application orientated numerical on hev.pptApplication orientated numerical on hev.ppt
Application orientated numerical on hev.ppt
 
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
Jual Obat Aborsi Hongkong ( Asli No.1 ) 085657271886 Obat Penggugur Kandungan...
 
Spatium Project Simulation student brief
Spatium Project Simulation student briefSpatium Project Simulation student brief
Spatium Project Simulation student brief
 
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptxBasic Civil Engineering first year Notes- Chapter 4 Building.pptx
Basic Civil Engineering first year Notes- Chapter 4 Building.pptx
 
How to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptxHow to setup Pycharm environment for Odoo 17.pptx
How to setup Pycharm environment for Odoo 17.pptx
 
Introduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The BasicsIntroduction to Nonprofit Accounting: The Basics
Introduction to Nonprofit Accounting: The Basics
 
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptxHMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
HMCS Vancouver Pre-Deployment Brief - May 2024 (Web Version).pptx
 
Food safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdfFood safety_Challenges food safety laboratories_.pdf
Food safety_Challenges food safety laboratories_.pdf
 
How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17How to Create and Manage Wizard in Odoo 17
How to Create and Manage Wizard in Odoo 17
 
Micro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdfMicro-Scholarship, What it is, How can it help me.pdf
Micro-Scholarship, What it is, How can it help me.pdf
 
Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)Accessible Digital Futures project (20/03/2024)
Accessible Digital Futures project (20/03/2024)
 
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...Kodo Millet  PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
Kodo Millet PPT made by Ghanshyam bairwa college of Agriculture kumher bhara...
 
How to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POSHow to Manage Global Discount in Odoo 17 POS
How to Manage Global Discount in Odoo 17 POS
 
On National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan FellowsOn National Teacher Day, meet the 2024-25 Kenan Fellows
On National Teacher Day, meet the 2024-25 Kenan Fellows
 
Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)Jamworks pilot and AI at Jisc (20/03/2024)
Jamworks pilot and AI at Jisc (20/03/2024)
 
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptxHMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
HMCS Max Bernays Pre-Deployment Brief (May 2024).pptx
 
How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17How to Give a Domain for a Field in Odoo 17
How to Give a Domain for a Field in Odoo 17
 
Understanding Accommodations and Modifications
Understanding  Accommodations and ModificationsUnderstanding  Accommodations and Modifications
Understanding Accommodations and Modifications
 
Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...Making communications land - Are they received and understood as intended? we...
Making communications land - Are they received and understood as intended? we...
 
FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024FSB Advising Checklist - Orientation 2024
FSB Advising Checklist - Orientation 2024
 

Foundations of business intelligence databases and information management

  • 1. Foundations of Business Intelligence: Databases and Information Management
  • 2. Terminologies Used…. Analytic Entity Attribute Entity-relationship diagram Big data Field Bit File Byte Foreign key Data administration Hadoop Data cleansing In-memory computing Data definition Information policy Data dictionary Key field Data governance Non-relational database management systems Data inconsistency Normalization Data manipulation language Online analytical processing (OLAP) Data mart Primary key Data mining Program-data dependence Data quality audit Record Data redundancy Referential integrity Data warehouse Relational DBMS Database Sentiment analysis Database administration Structured Query Language (SQL) Database management system (DBMS) Text mining Database server Tuple Web mining
  • 3. Learning Objectives • 1 What are the problems of managing data resources in a traditional file environment? • 2 What are the major capabilities of database management systems (DBMS), and why is a relational DBMS so powerful? • 3 What are the principal tools and technologies for accessing information from databases to improve business performance and decision making? • 4 Why are information policy, data administration, and data quality assurance essential for managing the firm’s data resources?
  • 4. Video Cases • Case 1: Dubuque Uses Cloud Computing and Sensors to Build a Smarter City • Case 2: Brooks Brothers Closes In on Omnichannel Retail • Case 3: Maruti Suzuki Business Intelligence and Enterprise Databases
  • 5. File Organization Terms and Concepts • Database: Group of related files • File: Group of records of same type • Record: Group of related fields • Field: Group of characters as word(s) or number(s) • Entity: Person, place, thing on which we store information • Attribute: Each characteristic, or quality, describing entity
  • 7. Problems with the Traditional File Environment • Files maintained separately by different departments • Data redundancy • Data inconsistency • Program-data dependence • Lack of flexibility • Poor security • Lack of data sharing and availability
  • 9. Database Management Systems • Database – Serves many applications by centralizing data and controlling redundant data • Database management system (DBMS) – Interfaces between applications and physical data files – Separates logical and physical views of data – Solves problems of traditional file environment • Controls redundancy • Eliminates inconsistency • Uncouples programs and data • Enables organization to centrally manage data and data security
  • 10. Human Resources Database with Multiple Views
  • 11. Relational DBMS • Represent data as two-dimensional tables • Each table contains data on entity and attributes • Table: grid of columns and rows • Rows (tuples): Records for different entities • Fields (columns): Represents attribute for entity • Key field: Field used to uniquely identify each record • Primary key: Field in table used for key fields • Foreign key: Primary key used in second table as look-up field to identify records from original table
  • 13. Operations of a Relational DBMS • Three basic operations used to develop useful sets of data • SELECT • Creates subset of data of all records that meet stated criteria • JOIN • Combines relational tables to provide user with more information than available in individual tables • PROJECT • Creates subset of columns in table, creating tables with only the information specified
  • 14. The Three Basic Operations of a Relational DBMS
  • 15. Capabilities of Database Management Systems • Data definition capability • Data dictionary • Querying and reporting • Data manipulation language • Structured Query Language (SQL) • Many DBMS have report generation capabilities for creating polished reports (Microsoft Access)
  • 17. Example of an SQL Query
  • 19. Designing Databases • Conceptual design vs. physical design • Normalization • Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships • Referential integrity • Rules used by RDBMS to ensure relationships between tables remain consistent • Entity-relationship diagram • A correct data model is essential for a system serving the business well
  • 23. Non-relational Databases and Databases in the Cloud • Non-relational databases: “NoSQL” • More flexible data model • Data sets stored across distributed machines • Easier to scale • Handle large volumes of unstructured and structured data • Databases in the cloud • Appeal to start-ups, smaller businesses • Amazon Relational Database Service, Microsoft SQL Azure • Private clouds
  • 24. The Challenge of Big Data • Big data • Massive sets of unstructured/semi-structured data from web traffic, social media, sensors, and so on • Volumes too great for typical DBMS • Petabytes, exabytes of data • Can reveal more patterns, relationships and anomalies • Requires new tools and technologies to manage and analyze
  • 25. Business Intelligence Infrastructure (1 of 3) • Array of tools for obtaining information from separate systems and from big data • Data warehouse – Stores current and historical data from many core operational transaction systems – Consolidates and standardizes information for use across enterprise, but data cannot be altered – Provides analysis and reporting tools
  • 26. Business Intelligence Infrastructure (2 of 3) • Data marts – Subset of data warehouse – Typically focus on single subject or line of business • Hadoop • Enables distributed parallel processing of big data across inexpensive computers • Key services • Hadoop Distributed File System (HDFS): data storage • MapReduce: breaks data into clusters for work • Hbase: NoSQL database • Used Yahoo, NextBio
  • 27. Business Intelligence Infrastructure (3 of 3) • In-memory computing • Used in big data analysis • Uses computers main memory (RAM) for data storage to avoid delays in retrieving data from disk storage • Can reduce hours/days of processing to seconds • Requires optimized hardware • Analytic platforms • High-speed platforms using both relational and non-relational tools optimized for large datasets
  • 29. Analytical Tools: Relationships, Patterns, Trends • Tools for consolidating, analyzing, and providing access to vast amounts of data to help users make better business decisions • Multidimensional data analysis (OLAP - Online Analytical Processing) • Data mining • Text mining • Web mining
  • 30. Online Analytical Processing (OLAP) • Supports multidimensional data analysis • Viewing data using multiple dimensions • Each aspect of information (product, pricing, cost, region, time period) is different dimension • Example: How many washers sold in the East in June compared with other regions? • OLAP enables rapid, online answers to ad hoc queries
  • 32. Data Mining • Finds hidden patterns, relationships in datasets • Example: customer buying patterns • Infers rules to predict future behavior • Types of information obtainable from data mining: • Associations • Sequences • Classification • Clustering • Forecasting
  • 33. Text Mining and Web Mining • Text mining • Extracts key elements from large unstructured data sets • Sentiment analysis software • Web mining • Discovery and analysis of useful patterns and information from web • Web content mining • Web structure mining • Web usage mining
  • 34. Databases and the Web –Many companies use the web to make some internal databases available to customers or partners –Typical configuration includes: • Web server • Application server/middleware/CGI scripts • Database server (hosting DBMS) –Advantages of using the web for database access: • Ease of use of browser software • Web interface requires few or no changes to database • Inexpensive to add web interface to system
  • 36. Establishing an Information Policy • Firm’s rules, procedures, roles for sharing, managing, standardizing data • Data administration • Establishes policies and procedures to manage data • Data governance • Deals with policies and processes for managing availability, usability, integrity, and security of data, especially regarding government regulations • Database administration • Creating and maintaining database
  • 37. Ensuring Data Quality • More than 25 percent of critical data in Fortune 1000 company databases are inaccurate or incomplete • Before new database is in place, a firm must: • Identify and correct faulty data • Establish better routines for editing data once database in operation • Data quality audit • Data cleansing