CROSS JOIN:
This join produces a Cartesian product and does not require a join condition. The result
set pairs every row of the first table with every row of the second, so its row count is the
product of the two tables' row counts.
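A minimal sketch, assuming hypothetical Employees and Departments tables:

```sql
-- Every employee is paired with every department:
-- if Employees has 10 rows and Departments has 4,
-- the result has 10 * 4 = 40 rows.
SELECT e.FirstName, d.DepartmentName
FROM Employees e
CROSS JOIN Departments d;
```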

How do you sort in SQL?:

The "ORDER BY" clause sorts the rows returned by a SELECT statement. The
ORDER BY clause is not valid in views, inline functions, derived tables, and subqueries,
unless TOP is also specified.
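For example, assuming the Employees table used elsewhere in these answers:

```sql
-- Sort by last name ascending, then by hire date, newest first.
SELECT FirstName, LastName, HireDate
FROM Employees
ORDER BY LastName ASC, HireDate DESC;
```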

How do you select unique records in SQL?:

Using the “DISTINCT” clause

SELECT DISTINCT FirstName FROM Employees;

Union & Union All:

The difference between UNION and UNION ALL is that UNION ALL does not eliminate
duplicate rows; it simply returns all rows from all queries fitting your criteria and
combines them into one result set, whereas UNION removes duplicates.

If you know that all the records returned by your union are already unique, use UNION
ALL instead; it gives faster results because it skips the duplicate-removal step.
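A minimal sketch (table and column names are hypothetical):

```sql
-- UNION removes duplicate cities; UNION ALL keeps every copy.
SELECT City FROM Customers
UNION
SELECT City FROM Suppliers;

SELECT City FROM Customers
UNION ALL
SELECT City FROM Suppliers;
```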

Truncate & Delete:

TRUNCATE and DELETE are both used to remove data from a table. Both commands
remove only the data of the specified table, not its structure.

TRUNCATE is a DDL (data definition language) command, whereas DELETE is a DML
(data manipulation language) command.

You can use a WHERE clause (conditions) with DELETE, but you can't use a WHERE
clause with TRUNCATE.

DELETE logs each deleted row, so it can be rolled back; TRUNCATE is minimally logged
and deallocates whole data pages, so it is usually described as removing records
permanently. (Strictly speaking, in SQL Server a TRUNCATE issued inside an explicit
transaction can also be rolled back.)

Triggers get fired by a DELETE command, whereas a table's DELETE triggers do not fire
on TRUNCATE.
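A sketch of both forms (the table name is hypothetical):

```sql
-- Row-by-row, filterable, fires DELETE triggers, fully logged:
DELETE FROM StagingOrders
WHERE OrderDate < '2020-01-01';

-- Empties the whole table; no WHERE clause allowed,
-- minimally logged, and resets any IDENTITY seed:
TRUNCATE TABLE StagingOrders;
```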



Compute Clause in SQL?:
Generates totals that appear as additional summary columns at the end of the result set.
When used with BY, the COMPUTE clause generates control breaks and subtotals in the
result set. (COMPUTE and COMPUTE BY were removed in SQL Server 2012; ROLLUP,
CUBE, and GROUPING SETS are their replacements.)

USE AdventureWorks;
GO
SELECT CustomerID, OrderDate, SubTotal, TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = 1
ORDER BY OrderDate
COMPUTE SUM(SubTotal), SUM(TotalDue);
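Because COMPUTE was removed in SQL Server 2012, the same subtotals are typically produced today with ROLLUP; a sketch against the same table:

```sql
-- Per-date totals plus a grand-total row (where OrderDate is NULL).
SELECT OrderDate, SUM(SubTotal) AS SubTotal, SUM(TotalDue) AS TotalDue
FROM Sales.SalesOrderHeader
WHERE CustomerID = 1
GROUP BY ROLLUP (OrderDate);
```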



What is Data Warehousing:

In computing, a data warehouse (DW) is a database used for reporting and analysis. The
data stored in the warehouse is uploaded from the operational systems. The data may pass
through an operational data store for additional operations before it is used in the DW for
reporting.
A data warehouse maintains its functions in three layers: staging, integration, and access.
Staging is used to store raw data for use by developers. The integration layer is used to
integrate data and to have a level of abstraction from users. The access layer is for getting
data out for users.

The term Data Warehouse was coined by Bill Inmon in 1990, who defined it in the
following way: "A warehouse is a subject-oriented, integrated, time-variant and non-
volatile collection of data in support of management's decision making process". He
defined the terms in the sentence as follows:

Subject Oriented:
Data that gives information about a particular subject instead of about a company's
ongoing operations.

Integrated:
Data that is gathered into the data warehouse from a variety of sources and merged into a
coherent whole.

Time-variant:
All data in the data warehouse is identified with a particular time period.

Non-volatile:
Data is stable in a data warehouse. More data is added but data is never removed. This
enables management to gain a consistent picture of the business.

What is Data Mart:

A data mart is the access layer of the data warehouse environment that is used to get
data out to the users. The data mart is a subset of the data warehouse, usually oriented
to a specific business line or team.
A data mart is a simple form of a data warehouse that is focused on a single subject (or
functional area), such as Sales, Finance, or Marketing. Data marts are often built and
controlled by a single department within an organization. Given their single-subject
focus, data marts usually draw data from only a few sources. The sources could be
internal operational systems, a central data warehouse, or external data.

A data mart is a repository of data gathered from operational data and other sources that
is designed to serve a particular community of knowledge workers. In scope, the data
may derive from an enterprise-wide database or data warehouse or be more specialized.
The emphasis of a data mart is on meeting the specific demands of a particular group of
knowledge users in terms of analysis, content, presentation, and ease of use. Users of a
data mart can expect to have data presented in terms that are familiar.


What are Fact Table & Dimension Tables:

In data warehousing, a fact table consists of the measurements, metrics or facts of a
business process. It is often located at the centre of a star schema or a snowflake schema,
surrounded by dimension tables.
Fact tables provide the (usually) additive values that act as independent variables by
which dimensional attributes are analyzed. Fact tables are often defined by their grain.
The grain of a fact table represents the most atomic level by which the facts may be
defined. The grain of a SALES fact table might be stated as "Sales volume by Day by
Product by Store". Each record in this fact table is therefore uniquely defined by a day,
product and store. Other dimensions might be members of this fact table (such as
location/region) but these add nothing to the uniqueness of the fact records. These
"affiliate dimensions" allow for additional slices of the independent facts but generally
provide insights at a higher level of aggregation (a region contains many stores).

In data warehousing, a dimension table is one of the set of companion tables to a fact
table.
The fact table contains business facts or measures and foreign keys which refer to
candidate keys (normally primary keys) in the dimension tables.
Contrary to fact tables, the dimension tables contain descriptive attributes (or fields)
which are typically textual fields or discrete numbers that behave like text. These
attributes are designed to serve two critical purposes: query constraining/filtering and
query result set labeling.
Dimension attributes are supposed to be:
Verbose - labels consisting of full words,
Descriptive,
Complete - no missing values,
Discretely valued - only one value per row in the dimension table,
Quality assured - no misspelling, no impossible values.
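A minimal sketch of a fact table with one dimension, following the SALES grain described above (all table and column names are hypothetical):

```sql
-- Dimension: descriptive attributes used for filtering and labeling.
CREATE TABLE DimProduct (
    ProductKey  INT PRIMARY KEY,
    ProductName VARCHAR(100) NOT NULL,
    Category    VARCHAR(50)  NOT NULL
);

-- Fact: additive measures plus foreign keys to the dimensions.
-- Grain: one row per product per store per day.
CREATE TABLE FactSales (
    DateKey     INT NOT NULL,
    ProductKey  INT NOT NULL REFERENCES DimProduct (ProductKey),
    StoreKey    INT NOT NULL,
    SalesAmount DECIMAL(12, 2) NOT NULL,
    UnitsSold   INT NOT NULL,
    PRIMARY KEY (DateKey, ProductKey, StoreKey)
);
```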
Snowflake Schema:

In computing, a snowflake schema is a logical arrangement of tables in a
multidimensional database such that the entity relationship diagram resembles a
snowflake in shape. The snowflake schema is represented by centralized fact tables which
are connected to multiple dimensions.
The snowflake schema is similar to the star schema. However, in the snowflake schema,
dimensions are normalized into multiple related tables, whereas the star schema's
dimensions are denormalized, with each dimension represented by a single table. A complex
snowflake shape emerges when the dimensions of a snowflake schema are elaborate,
having multiple levels of relationships, and the child tables have multiple parent tables
("forks in the road"). The "snowflaking" effect only affects the dimension tables and NOT
the fact tables.
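In a snowflake design, a product dimension might be normalized like this sketch (names hypothetical):

```sql
-- The category attributes are normalized out of the product dimension
-- into their own table, giving the "snowflake" shape.
CREATE TABLE DimCategory (
    CategoryKey  INT PRIMARY KEY,
    CategoryName VARCHAR(50) NOT NULL
);

CREATE TABLE DimProduct (
    ProductKey  INT PRIMARY KEY,
    ProductName VARCHAR(100) NOT NULL,
    CategoryKey INT NOT NULL REFERENCES DimCategory (CategoryKey)
);
```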

ETL Processing in Data Warehousing:

Extract, transform and load (ETL) is a process in database usage and especially in data
warehousing that involves:
Extracting data from outside sources
Transforming it to fit operational needs (which can include quality levels)
Loading it into the end target (database or data warehouse)

Extract - The first part of an ETL process involves extracting the data from the source
systems. In many cases this is the most challenging aspect of ETL, as extracting data
correctly will set the stage for how subsequent processes will go.

Transform - The transform stage applies a series of rules or functions to the extracted data
from the source to derive the data for loading into the end target.

Load - The load phase loads the data into the end target, usually the data warehouse
(DW). Depending on the requirements of the organization, this process varies widely.
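Within SQL Server, the transform-and-load steps are often just an INSERT ... SELECT from a staging table; a minimal sketch (all names are hypothetical):

```sql
-- Extract: raw rows land in a staging table via bcp, BULK INSERT, or SSIS.
-- Transform + Load: clean and conform the data into the warehouse table.
INSERT INTO dbo.FactOrders (OrderKey, CustomerKey, OrderDate, Amount)
SELECT s.OrderID,
       c.CustomerKey,               -- surrogate key lookup
       CAST(s.OrderDate AS DATE),   -- normalize the date format
       COALESCE(s.Amount, 0)        -- apply a simple quality rule
FROM staging.Orders AS s
JOIN dbo.DimCustomer AS c
  ON c.CustomerID = s.CustomerID;
```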


What is BCP:

The bcp utility copies data between an instance of SQL Server and a data file in a
user-specified format. The Bulk Copy Program (BCP) is a command-line utility that ships
with Microsoft SQL Server. With BCP, you can import and export large amounts of data in
and out of SQL Server databases quickly and easily.
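A typical invocation might look like the following sketch, where `-c` selects character format, `-T` uses a trusted (Windows) connection, and `-S` names the server (server, database, and file names are hypothetical):

```
bcp AdventureWorks.Sales.Currency out Currency.dat -c -T -S MYSERVER
bcp StagingDB.dbo.Currency in Currency.dat -c -T -S MYSERVER
```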


DTS in SQL Server:

Data Transformation Services, or DTS, is a set of objects and utilities to allow the
automation of extract, transform and load operations to or from a database. The objects
are DTS packages and their components, and the utilities are called DTS tools. DTS was
included with earlier versions of Microsoft SQL Server, and was almost always used with
SQL Server databases, although it could be used independently with other databases.
DTS allows data to be transformed and loaded from heterogeneous sources using OLE
DB, ODBC, or text-only files, into any supported database. DTS can also allow
automation of data import or transformation on a scheduled basis, and can perform
additional functions such as FTPing files and executing external programs. In addition,
DTS provides an alternative method of version control and backup for packages when
used in conjunction with a version control system, such as Microsoft Visual SourceSafe .


Multi dimensional Analysis:

Multidimensional analysis is a data analysis process that groups data into two or more
categories: data dimensions and measurements. For example, a data set consisting of the
number of wins for a single football team at each of several years is a single-dimensional
(in this case, longitudinal) data set. A data set consisting of the number of wins for several
football teams in a single year is also a single-dimensional (in this case, cross-sectional)
data set. A data set consisting of the number of wins for several football teams over
several years is a two-dimensional data set.

In many disciplines, two-dimensional data sets are also called panel data. While, strictly
speaking, two- and higher-dimensional data sets are "multi-dimensional," the term
"multidimensional" tends to be applied only to data sets with three or more dimensions.
For example, some forecast data sets provide forecasts for multiple target periods,
conducted by multiple forecasters, and made at multiple horizons. The three dimensions
provide more information than can be gleaned from two dimensional panel data sets.


Bulk Insert:

The Bulk Insert task provides an efficient way to copy large amounts of data into a SQL
Server table or view. For example, suppose your company stores its million-row product
list on a mainframe system, but the company's e-commerce system uses SQL Server to
populate Web pages. You must update the SQL Server product table nightly with the
master product list from the mainframe. To update the table, you save the product list in a
tab-delimited format and use the Bulk Insert task to copy the data directly into the SQL
Server table.
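The nightly load described above might be sketched as follows (the file path and table name are hypothetical):

```sql
-- Load the tab-delimited mainframe extract into the product table.
BULK INSERT dbo.Product
FROM 'D:\feeds\master_products.txt'
WITH (
    FIELDTERMINATOR = '\t',   -- tab-delimited columns
    ROWTERMINATOR   = '\n',   -- one product per line
    TABLOCK                   -- take a table lock for a faster bulk load
);
```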


There are three primary ways in which information is stored in OLAP:
MOLAP
Multidimensional OLAP (MOLAP) stores dimension and fact data in a persistent data
store using compressed indexes. Aggregates are stored to facilitate fast data access.
MOLAP query engines are usually proprietary and optimized for the storage format used
by the MOLAP data store. MOLAP offers faster query processing than ROLAP and
usually requires less storage. However, it doesn’t scale as well and requires a separate
database for storage.

ROLAP
Relational OLAP (ROLAP) stores aggregates in relational database tables. ROLAP's use
of relational databases allows it to take advantage of existing database resources, and it
allows ROLAP applications to scale well. However, ROLAP's use of tables to store
aggregates usually requires more disk storage than MOLAP, and it is generally not as
fast.

HOLAP
As its name suggests, hybrid OLAP (HOLAP) is a cross between MOLAP and ROLAP.
Like ROLAP, HOLAP leaves the primary data stored in the source database. Like
MOLAP, HOLAP stores aggregates in a persistent data store that's separate from the
primary relational database. This mix allows HOLAP to offer the advantages of both
MOLAP and ROLAP. However, unlike MOLAP and ROLAP, which follow well-defined
standards, HOLAP has no uniform implementation.
