Map Reduce.pdf

S

Course material

Map Reduce
databases project and need the explanation and answer to help me learn.
Hello
Having work related to Map Reduce. Are you interested? Need task 3 and 4 only.
Requirements:
Task and Mark distribution: 1. ER model and table generation (252. SQL programming
(30%) 3. Sequential and distributed processing (254. Research report (20%) Notes: 1. You
are expected to use the students can contact Centre for Academic Writing (CAW)2. Please
notify your registry 3. Any student requiring an extension or deferral should follow the
university process as outlined R model and table generation (25%) processing (25%) the
CUHarvard referencing format. For support and advice on how this Centre for Academic
Writing (CAW). registry course support team and module leader for disability supportAny
student requiring an extension or deferral should follow the university process as outlined
For support and advice on how this and module leader for disability support. Any student
requiring an extension or deferral should follow the university process as outlined •
7086CEM – Data Management Systems This assignment is made up of four parts: - Part A
deals with database design, using E-R modelling. - Part B concerns database creation and
querying, using SQL. - Part C covers parallel processing frameworks - Part D involves a
research report Coursework requirements Please note that plagiarism (copying and
pasting from sources) and collusion (submitting the same content as a fellow student) are
detected automatically by Turnitin and flagged. You should be able to check for similarity.
You are expected to submit your work. The report must be self-contained. Links to
external datasets are not acceptable. Appendices are not required. Report structure Please
follow the instructions below. You are expected: a) to include the name and the code of the
module, the title of the CW, your names and your student Ids. b) to follow the structure of
the CW brief, and present your work accordingly, i.e.: Part A 1. 2. ...... Part B 1. 2. a) b) c)
d) e) ...... Submission process for the group CW is submitted as a group of two students as
follows: a) You are expected to work as a group on ALL the parts of the CW. b) one student
will submit individually in his/her name on Aula, the whole report, with both names on the
front page, as a single pdf file, and c) the other student must submit individually in his/her
name on Aula, one page only, with both names, as a pdf file, where he/she presents
reflections on the collaborative work.
Part A. Conceptual modelling An equipment company wishes to create a database to
support the hiring of tools and machinery to clients. The company has three types of
equipment: power tools, such as drills and vacuum cleaners, plants such as excavators and
scaffolding. Each piece of equipment is identified by a number. Power tools are described by
their model and the voltage they use, whereas plants are classified by their model and their
size in tonnes. Scaffolding can be traditional, aluminium or fibreglass; in addition, its width
can be single or double. A large piece of equipment may be composed of smaller pieces of
equipment. The company has various outlets and each has staff including a manager and
several senior technicians who are responsible for supervising the work of allocated groups
of technicians. A supervision record is also kept for a specific date. All employees are
identified by their number, name, date of birth (DOB) and address. Furthermore, a record is
kept on their employment records and their qualifications. Each outlet has a stock of
equipment that may be hired by clients for varying periods of time, from a minimum of four
hours to a maximum of six months. Each hire agreement between a client and the company
is uniquely identified by using a hire number. Each client is identified by a number and a
name. The company insists that each client must take out insurance cover for each
equipment hire period. Each insurance cover is identified by a unique insurance number
and includes the description of the insurance. The company wishes to keep a record of the
member of staff who was in charge of a specific hire agreement. Each piece of equipment is
checked for faults when it is returned by the client, and the faults/defects/damage
recorded. The company keeps a record of the hire history of each client. 1. Create an ER
diagram for the above scenario and indicate the cardinality of relationships and the nature
of the associations (mandatory or optional). You should allocate adequate attributes to the
entities of interest, especially the identifiers. (20%) 2. Generate, with justification,
relational tables from the ER diagram. Indicate clearly the names of the tables, the
attributes, the primary keys and the foreign keys. (5%) Guidance: i) Create the ER diagram
and clearly identify any identifiers, and indicate the cardinality of relationships and the
nature of the associations (mandatory or optional). ii) Generate tables and include primary
and foreign keys. Use the schema notation; you do not have to produce SQL statements.
Example of table generation in schema form: Course(courseId, courseName)
Student (studentId, name, courseId*)
Part B: SQL programming Consider the following “Publication” Database schema, where
AuthorId, CategoryId, PubIicationd, UserId are unique identifiers. Category (CategoryId,
Type) Author (AuthorId, Name) Publisher(PublisherID, PublisherName) Publication
(PublicationId, AuthorId, Title, CategorytId, PublishedYear, PublisherId) Request (UserId,
PublicationId, RequestDate) User (UserId, Name, Email, Password) TABLE: Author
AuthorId Name A011 Dingle R A012 Ransome A A013 Wardale R A014 Alexander T A015
Spurrier S TABLE: Publisher PublisherId PublisherName P01 Pearson P02 HarperCollins
P03 Simon and Schuster TABLE: Category CategoryId Type C911 Short stories C912
Journal articles C913 Biography C914 Illustrations
TABLE: Publication PubId AuthorId Title CategoryId PublishedYear PublisherId P001 A011
The Blue Treacle C911 1911 P01 P002 A012 In Aleppo Once C911 2001 P01 P003 A012
Illustrating Arthur Ransome C914 1973 P03 P004 A012 Ransome the Artist C914 1994
P03 P005 A014 Bohemia In London C912 2008 P02 P006 A011 The Best of Childhood
C911 2002 P01 P007 A015 Distilled Enthusiasms C912 2010 P02 TABLE: Request UserId
PublicationId RequestDate U016 P001 05-Oct-2020 U241 P001 28-Sep-2020 U55 P002 08-
Sep-2020 U016 P004 06-Oct-2020 U121 P002 23-Sep-2020 TABLE: User UserId Name
Email Password U111 Kenderine J KenderineJ@hotmail.com Kenj2 U241 Wang F
WangF@hotmail.com Wanf05 U55 Flavel K FlavelK@hotmail.com Flak77 U016 Zidane Z
ZidaneZ@hotmail.com Zidz13 U121 Keita R KeitaR@hotmail.com Keir22
1. Use appropriate data types and write the SQL statements to create the tables defined in
the schema above. (10%) 2. Write SQL Statements to return the following data from the
Publication database. a) The names and the email of the users and the dates of all their
requests. (4%) b) The details
of the publications that were requested during September 2020 in descending order of
requested dates. The details should include publication ids, titles and dates on which
publications were requested. (4%) c) The names of the users, the
title of the publications they requested, and the names of their authors.
(4%) d) The number of publications requested for each of the categories (e.g.
‘Illustrations’). (4%) e) The names of users who requested more
than one publication. (4%) Guidance: Please use standard SQL in
Oracle Live SQL. Indicate clearly the primary keys and the foreign keys. State the SQL
statements and give the results. Select and use appropriate data. The presentation of each
query should have a text summary which includes i) the query itself, ii) the corresponding
SQL statement solution, iii) the result of the execution of the statement and iv) evidence that
you have used standard SQL and implemented each statement on a database (use
screenshots, no need for appendices). A small data sample is given. When appropriate, you
can create and insert additional data records in order to make sure that the queries return
results. Links to external datasets or code and are not acceptable. The report must be self-
contained.
Part C. Sequential and parallel processing Consider a university data store about module
coursework (CW) submissions by students. For each CW submission for a specific module,
each record of the data store holds the following data: 1. Module code 2. Module name 3.
Semester id 4. Year 5. CW title 6. Student id 7. Student name 8. CW submission date 9. Actual
date of CW submission by student Examples of records would have the following values:
(7086CEM, Data Management Systems, S1, 2022, Database Management, 10101010, Lionel
Messi, 28/11/2022, 30/11/2022) (7053CEM, Big Data and Visualisation, S2, 2021, Data
Analysis, 07070707, Cristiano Ronaldo, 21/10/2021, 25/10/2021) (7071CEM, Information
Retrieval, S3, 2020, Comparative Evaluation, 11011010, Zinedine Zidane, 09/09/2020,
25/10/2020) A record will contain a list of key-value pairs, as presented in the lectures. A
block will consist of a sequence of records. You are expected to create similar additional
records for this part in order to produce a mapReduce solution in question 2, below. For
each module, the university Registry would like to determine the number of CW
submissions which were submitted late, by one day or more. 1. Assuming that the data is
stored in a relational database, produce with justification a) the SQL statement to create the
corresponding table for the data store and b) the SQL statement to determine the number of
CW submissions that were submitted late by one day or more, for each module. (You do not
need to implement the SQL statements in Oracle Live SQL). (5%) 2. Assuming that the data
is too large to be processed in a centralised manner, and that it is stored in an ordinary file,
produce a decentralised solution, equivalent to 1.b) above, which applies MapReduce to the
data processing. Justify your decisions and all the steps of your solution. (20%) Guidance:
You should study carefully the examples of mapReduce covered in the lecture notes. You
should consider the structure of the key in the (key, value) pair in the original record and in
the mapping stage. This is not a programming exercise. The solution should follow the
structure of the solutions given in the lecture notes.
Part D. Research report Write in your own words a brief report where you introduce and
discuss data warehouses. You should cover the rationale for data warehouses, their
structure and their use. The maximum length of the report is 700 words. Longer reports will
be penalised. (20%) Guidance: Use your own words for the report. Copying and
pasting is plagiarism. You should include relevant references. The maximum length of the
report is 700 words. Longer reports will be penalised.
Marking Rubric Marking Scheme Grade Part A Part B Part C Part D <40 Incorrect
interpretation of scenario and Incomplete formulation of solution Limited identification
of entities and poor annotation of relationships Incorrect generation of relational tables
Limited or absent rationale Poor interpretation of requirements and of queries DDL
and DML SQL statements limited in scope Incomplete and incorrect SQL statements
Absence of rationale Partial understanding of requirements and partially correct SQL
formulation Partial understanding of context and relevance of parallel processing
Incomplete steps in the application of MapReduce Partial justification of design decisions
Lack of understanding of requirements Inadequate identification of issues
Incompetent understanding of structural and processing components Poorly written
essay 40-49 Partial interpretation of scenario and formulation of solution Partially
correct ER diagram with relevant entities and relationships Partial consistency in the
generation of the relational tables Partial justification of design decisions Basic
understanding of requirements and partially correct interpretation Relevant use of DDL
and DML statements in solution formulation Partially complete SQL statements Limited
justification of solution Partial understanding of requirements and partially correct SQL
formulation Partial understanding of context and relevance of parallel processing
Incomplete steps in the application of MapReduce Partial justification of design decisions
Partial interpretation of requirements Limited presentation of key issues Relevant
description of structural and processing components Mostly descriptive essay 50-59
Adequate and consistent interpretation of scenario and satisfactory conceptual modelling
Mostly correct generation of ER diagram with relevant entities and relationships
Relatively competent generation of the relational tables Adequately justified design
decisions Adequate understanding of requirements and mostly correct interpretation
Adequate use of DDL and DML SQL statements in solution formulation Mostly complete
SQL statements Adequate justification of solution Adequate understanding of
requirements and correct SQL formulation Adequate presentation of context of
application of parallel processing Mostly correct application of the different steps of
MapReduce Adequately justified solution Adequate interpretation of requirements
Key issues well identified and partially addressed Adequate presentation of key structural
and processing components Adequately written essay 60-69 Good and clear
interpretation of scenario and good solution formulation of the two parts Correct and
complete identification of entities and relationships Consistent generation of relational
tables Good interpretation of requirements and good formulation of solution Correct
use of DDL and DML statements in solution formulation Complete and relevant treatment
of queries Good justification of decisions Good solution to the initial query in terms of
SQL statements Focused presentation of context of application of parallel processing
Clearly stated and correct sequential steps of MapReduce Well expressed rationale
Good understanding and statement of requirements Well focused presentation of main
issues Good description of structural and processing components Well presented
Well expressed rationale essay with some reflection 70+ Very good and clear
interpretation of scenario and excellent solution formulation of the two parts Correct and
coherent identification of well annotated entities and types of relationships Logical and
consistent generation of relational tables Excellent justification of design decisions
Excellent interpretation of requirements and very good formulation of solution Excellent
use of DDL and DML statements in solution formulation Complete treatment of queries
and SQL formulation Very good rationale Very good solution to the initial query in
terms of SQL statements Excellent presentation on the need for an overall parallel
solution Clear deployment and annotation of the sequential steps of MapReduce
Excellent rationale Excellent understanding and interpretation of requirements Very
good identification and formulation of key issues Relevant and specific presentation of
structural and processing components Reflective writing supported by an excellent
structure

Recomendados

6-2 FINAL Project Milestone Three System Requirement. Submit your.docx von
6-2 FINAL Project Milestone Three System Requirement. Submit your.docx6-2 FINAL Project Milestone Three System Requirement. Submit your.docx
6-2 FINAL Project Milestone Three System Requirement. Submit your.docxJospehStull43
93 views6 Folien
Page 1 of 4CRICOS Provider No. 00103D Assignment 1 Specificati.docx von
Page 1 of 4CRICOS Provider No. 00103D Assignment 1 Specificati.docxPage 1 of 4CRICOS Provider No. 00103D Assignment 1 Specificati.docx
Page 1 of 4CRICOS Provider No. 00103D Assignment 1 Specificati.docxalfred4lewis58146
3 views12 Folien
Cis 515 Effective Communication-snaptutorial.com von
Cis 515 Effective Communication-snaptutorial.comCis 515 Effective Communication-snaptutorial.com
Cis 515 Effective Communication-snaptutorial.comjhonklinz10
19 views21 Folien
CIS 515 Enhance teaching / snaptutorial.com von
CIS 515 Enhance teaching / snaptutorial.com CIS 515 Enhance teaching / snaptutorial.com
CIS 515 Enhance teaching / snaptutorial.com donaldzs56
12 views21 Folien
CIS 515 Education Organization / snaptutorial.com von
CIS 515 Education Organization / snaptutorial.comCIS 515 Education Organization / snaptutorial.com
CIS 515 Education Organization / snaptutorial.comMcdonaldRyan38
32 views21 Folien
75425 Topic SQL Concepts and Database DesignNumber of Pages 4.docx von
75425 Topic SQL Concepts and Database DesignNumber of Pages 4.docx75425 Topic SQL Concepts and Database DesignNumber of Pages 4.docx
75425 Topic SQL Concepts and Database DesignNumber of Pages 4.docxsleeperharwell
2 views8 Folien

Más contenido relacionado

Similar a Map Reduce.pdf

MIS604 Requirement EngineeringAssessment Two – Requirement S.docx von
MIS604 Requirement EngineeringAssessment Two – Requirement S.docxMIS604 Requirement EngineeringAssessment Two – Requirement S.docx
MIS604 Requirement EngineeringAssessment Two – Requirement S.docxaudeleypearl
8 views10 Folien
1. In a study to investigate the relative effectiveness of six dif.docx von
1. In a study to investigate the relative effectiveness of six dif.docx1. In a study to investigate the relative effectiveness of six dif.docx
1. In a study to investigate the relative effectiveness of six dif.docxjackiewalcutt
3 views15 Folien
Itech 1006 assignment 2 sem1 2017 von
Itech 1006 assignment 2 sem1 2017Itech 1006 assignment 2 sem1 2017
Itech 1006 assignment 2 sem1 2017Sandeep Ratnam
183 views10 Folien
IT 510 Final Project Guidelines and Rubric Overview The final projec.docx von
IT 510 Final Project Guidelines and Rubric Overview The final projec.docxIT 510 Final Project Guidelines and Rubric Overview The final projec.docx
IT 510 Final Project Guidelines and Rubric Overview The final projec.docxcareyshaunda
2 views10 Folien
IT 510 Final Project Guidelines and Rubric Overview .docx von
IT 510 Final Project Guidelines and Rubric  Overview .docxIT 510 Final Project Guidelines and Rubric  Overview .docx
IT 510 Final Project Guidelines and Rubric Overview .docxpriestmanmable
9 views23 Folien
Itech 1006 assignment 2 sem1 2017 (2) von
Itech 1006 assignment 2 sem1 2017 (2)Itech 1006 assignment 2 sem1 2017 (2)
Itech 1006 assignment 2 sem1 2017 (2)Sandeep Ratnam
223 views10 Folien

Similar a Map Reduce.pdf(20)

MIS604 Requirement EngineeringAssessment Two – Requirement S.docx von audeleypearl
MIS604 Requirement EngineeringAssessment Two – Requirement S.docxMIS604 Requirement EngineeringAssessment Two – Requirement S.docx
MIS604 Requirement EngineeringAssessment Two – Requirement S.docx
audeleypearl8 views
1. In a study to investigate the relative effectiveness of six dif.docx von jackiewalcutt
1. In a study to investigate the relative effectiveness of six dif.docx1. In a study to investigate the relative effectiveness of six dif.docx
1. In a study to investigate the relative effectiveness of six dif.docx
jackiewalcutt3 views
Itech 1006 assignment 2 sem1 2017 von Sandeep Ratnam
Itech 1006 assignment 2 sem1 2017Itech 1006 assignment 2 sem1 2017
Itech 1006 assignment 2 sem1 2017
Sandeep Ratnam183 views
IT 510 Final Project Guidelines and Rubric Overview The final projec.docx von careyshaunda
IT 510 Final Project Guidelines and Rubric Overview The final projec.docxIT 510 Final Project Guidelines and Rubric Overview The final projec.docx
IT 510 Final Project Guidelines and Rubric Overview The final projec.docx
careyshaunda2 views
IT 510 Final Project Guidelines and Rubric Overview .docx von priestmanmable
IT 510 Final Project Guidelines and Rubric  Overview .docxIT 510 Final Project Guidelines and Rubric  Overview .docx
IT 510 Final Project Guidelines and Rubric Overview .docx
priestmanmable9 views
Itech 1006 assignment 2 sem1 2017 (2) von Sandeep Ratnam
Itech 1006 assignment 2 sem1 2017 (2)Itech 1006 assignment 2 sem1 2017 (2)
Itech 1006 assignment 2 sem1 2017 (2)
Sandeep Ratnam223 views
Coit11237 assignment 2 specifications von Nicole Valerio
Coit11237 assignment 2 specificationsCoit11237 assignment 2 specifications
Coit11237 assignment 2 specifications
Nicole Valerio384 views
Cis 498 Believe Possibilities / snaptutorial.com von Davis5a
Cis 498    Believe Possibilities / snaptutorial.comCis 498    Believe Possibilities / snaptutorial.com
Cis 498 Believe Possibilities / snaptutorial.com
Davis5a29 views
BAIT1003 Assignment von limsh
BAIT1003 AssignmentBAIT1003 Assignment
BAIT1003 Assignment
limsh304 views
Cis 498 Education Organization-snaptutorial.com von robertlesew2
Cis 498 Education Organization-snaptutorial.comCis 498 Education Organization-snaptutorial.com
Cis 498 Education Organization-snaptutorial.com
robertlesew235 views
IT 510 Final Project Guidelines and Rubric Overview .docx von vrickens
IT 510 Final Project Guidelines and Rubric  Overview .docxIT 510 Final Project Guidelines and Rubric  Overview .docx
IT 510 Final Project Guidelines and Rubric Overview .docx
vrickens8 views
Cis 498 Inspiring Innovation--tutorialrank.com von PrescottLunt371
Cis 498  Inspiring Innovation--tutorialrank.comCis 498  Inspiring Innovation--tutorialrank.com
Cis 498 Inspiring Innovation--tutorialrank.com
PrescottLunt37115 views
Cis 498 Exceptional Education - snaptutorial.com von DavisMurphyA87
Cis 498  Exceptional Education - snaptutorial.comCis 498  Exceptional Education - snaptutorial.com
Cis 498 Exceptional Education - snaptutorial.com
DavisMurphyA8749 views
Internet basic of it20 von rosu555
Internet basic of it20Internet basic of it20
Internet basic of it20
rosu555104 views
CIS 498 Effective Communication - snaptutorial.com von donaldzs1
CIS 498 Effective Communication - snaptutorial.comCIS 498 Effective Communication - snaptutorial.com
CIS 498 Effective Communication - snaptutorial.com
donaldzs1204 views
Term Paper TopBikeDue Week 10 and worth 230 pointsThis .docx von jacqueliner9
Term Paper TopBikeDue Week 10 and worth 230 pointsThis .docxTerm Paper TopBikeDue Week 10 and worth 230 pointsThis .docx
Term Paper TopBikeDue Week 10 and worth 230 pointsThis .docx
jacqueliner97 views
Cis 498Education Specialist / snaptutorial.com von McdonaldRyan74
Cis 498Education Specialist / snaptutorial.comCis 498Education Specialist / snaptutorial.com
Cis 498Education Specialist / snaptutorial.com
McdonaldRyan7431 views
Assignment Overview Case 4 is in two parts Your Dreams Our Mission/tutorialou... von davidwarner122
Assignment Overview Case 4 is in two parts Your Dreams Our Mission/tutorialou...Assignment Overview Case 4 is in two parts Your Dreams Our Mission/tutorialou...
Assignment Overview Case 4 is in two parts Your Dreams Our Mission/tutorialou...
davidwarner12253 views
The Strayer Oracle Server may be used to test and compile the SQL Qu.docx von suzannewarch
The Strayer Oracle Server may be used to test and compile the SQL Qu.docxThe Strayer Oracle Server may be used to test and compile the SQL Qu.docx
The Strayer Oracle Server may be used to test and compile the SQL Qu.docx
suzannewarch2 views

Más de sdfghj21

What types of hormones have been shown by scientific studies.docx von
What types of hormones have been shown by scientific studies.docxWhat types of hormones have been shown by scientific studies.docx
What types of hormones have been shown by scientific studies.docxsdfghj21
7 views1 Folie
Unit Autonomic Nervous System This is a 4.docx von
Unit Autonomic Nervous System This is a 4.docxUnit Autonomic Nervous System This is a 4.docx
Unit Autonomic Nervous System This is a 4.docxsdfghj21
4 views1 Folie
Your team will have to find a problem to solve.docx von
Your team will have to find a problem to solve.docxYour team will have to find a problem to solve.docx
Your team will have to find a problem to solve.docxsdfghj21
4 views1 Folie
Your assignment is to find a YouTube video.docx von
Your assignment is to find a YouTube video.docxYour assignment is to find a YouTube video.docx
Your assignment is to find a YouTube video.docxsdfghj21
3 views1 Folie
You will apply your knowledge of course concepts in writing.docx von
You will apply your knowledge of course concepts in writing.docxYou will apply your knowledge of course concepts in writing.docx
You will apply your knowledge of course concepts in writing.docxsdfghj21
3 views1 Folie
Your position as a new school leader will have an.docx von
Your position as a new school leader will have an.docxYour position as a new school leader will have an.docx
Your position as a new school leader will have an.docxsdfghj21
3 views2 Folien

Más de sdfghj21(20)

What types of hormones have been shown by scientific studies.docx von sdfghj21
What types of hormones have been shown by scientific studies.docxWhat types of hormones have been shown by scientific studies.docx
What types of hormones have been shown by scientific studies.docx
sdfghj217 views
Unit Autonomic Nervous System This is a 4.docx von sdfghj21
Unit Autonomic Nervous System This is a 4.docxUnit Autonomic Nervous System This is a 4.docx
Unit Autonomic Nervous System This is a 4.docx
sdfghj214 views
Your team will have to find a problem to solve.docx von sdfghj21
Your team will have to find a problem to solve.docxYour team will have to find a problem to solve.docx
Your team will have to find a problem to solve.docx
sdfghj214 views
Your assignment is to find a YouTube video.docx von sdfghj21
Your assignment is to find a YouTube video.docxYour assignment is to find a YouTube video.docx
Your assignment is to find a YouTube video.docx
sdfghj213 views
You will apply your knowledge of course concepts in writing.docx von sdfghj21
You will apply your knowledge of course concepts in writing.docxYou will apply your knowledge of course concepts in writing.docx
You will apply your knowledge of course concepts in writing.docx
sdfghj213 views
Your position as a new school leader will have an.docx von sdfghj21
Your position as a new school leader will have an.docxYour position as a new school leader will have an.docx
Your position as a new school leader will have an.docx
sdfghj213 views
You work as a network consultant for West a.docx von sdfghj21
You work as a network consultant for West a.docxYou work as a network consultant for West a.docx
You work as a network consultant for West a.docx
sdfghj213 views
You work in the human resources department of a company.docx von sdfghj21
You work in the human resources department of a company.docxYou work in the human resources department of a company.docx
You work in the human resources department of a company.docx
sdfghj213 views
You will submit your introduction to your case In.docx von sdfghj21
You will submit your introduction to your case In.docxYou will submit your introduction to your case In.docx
You will submit your introduction to your case In.docx
sdfghj213 views
You are about to perform a rectal examination of an.docx von sdfghj21
You are about to perform a rectal examination of an.docxYou are about to perform a rectal examination of an.docx
You are about to perform a rectal examination of an.docx
sdfghj213 views
you interviewed the CEO and evaluated the organization to gain.docx von sdfghj21
you interviewed the CEO and evaluated the organization to gain.docxyou interviewed the CEO and evaluated the organization to gain.docx
you interviewed the CEO and evaluated the organization to gain.docx
sdfghj213 views
You are going to brief a small group of newly.docx von sdfghj21
You are going to brief a small group of newly.docxYou are going to brief a small group of newly.docx
You are going to brief a small group of newly.docx
sdfghj213 views
You have been asked to design a LAN for a.docx von sdfghj21
You have been asked to design a LAN for a.docxYou have been asked to design a LAN for a.docx
You have been asked to design a LAN for a.docx
sdfghj213 views
You will be creating a fictional business of your choice.docx von sdfghj21
You will be creating a fictional business of your choice.docxYou will be creating a fictional business of your choice.docx
You will be creating a fictional business of your choice.docx
sdfghj212 views
You will be as a team to one of.docx von sdfghj21
You will be as a team to one of.docxYou will be as a team to one of.docx
You will be as a team to one of.docx
sdfghj213 views
You will develop a safety briefing on personal protective equipment.docx von sdfghj21
You will develop a safety briefing on personal protective equipment.docxYou will develop a safety briefing on personal protective equipment.docx
You will develop a safety briefing on personal protective equipment.docx
sdfghj213 views
You are working on a team and a assigns.docx von sdfghj21
You are working on a team and a assigns.docxYou are working on a team and a assigns.docx
You are working on a team and a assigns.docx
sdfghj213 views
Write a to paper about genetically vigorous.docx von sdfghj21
Write a to paper about genetically vigorous.docxWrite a to paper about genetically vigorous.docx
Write a to paper about genetically vigorous.docx
sdfghj213 views
Write an assignment on the role of the professional nurse.docx von sdfghj21
Write an assignment on the role of the professional nurse.docxWrite an assignment on the role of the professional nurse.docx
Write an assignment on the role of the professional nurse.docx
sdfghj213 views
witnessed such incivility on the part of the physician.docx von sdfghj21
witnessed such incivility on the part of the physician.docxwitnessed such incivility on the part of the physician.docx
witnessed such incivility on the part of the physician.docx
sdfghj213 views

Último

Retail Store Scavenger Hunt.pptx von
Retail Store Scavenger Hunt.pptxRetail Store Scavenger Hunt.pptx
Retail Store Scavenger Hunt.pptxjmurphy154
52 views10 Folien
unidad 3.pdf von
unidad 3.pdfunidad 3.pdf
unidad 3.pdfMarcosRodriguezUcedo
129 views38 Folien
Jibachha publishing Textbook.docx von
Jibachha publishing Textbook.docxJibachha publishing Textbook.docx
Jibachha publishing Textbook.docxDrJibachhaSahVetphys
54 views14 Folien
MIXING OF PHARMACEUTICALS.pptx von
MIXING OF PHARMACEUTICALS.pptxMIXING OF PHARMACEUTICALS.pptx
MIXING OF PHARMACEUTICALS.pptxAnupkumar Sharma
117 views35 Folien
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx von
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxNiranjan Chavan
38 views48 Folien
StudioX.pptx von
StudioX.pptxStudioX.pptx
StudioX.pptxNikhileshSathyavarap
89 views18 Folien

Último(20)

Retail Store Scavenger Hunt.pptx von jmurphy154
Retail Store Scavenger Hunt.pptxRetail Store Scavenger Hunt.pptx
Retail Store Scavenger Hunt.pptx
jmurphy15452 views
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx von Niranjan Chavan
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptxGuidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Guidelines & Identification of Early Sepsis DR. NN CHAVAN 02122023.pptx
Niranjan Chavan38 views
11.28.23 Social Capital and Social Exclusion.pptx von mary850239
11.28.23 Social Capital and Social Exclusion.pptx11.28.23 Social Capital and Social Exclusion.pptx
11.28.23 Social Capital and Social Exclusion.pptx
mary850239409 views
Nelson_RecordStore.pdf von BrynNelson5
Nelson_RecordStore.pdfNelson_RecordStore.pdf
Nelson_RecordStore.pdf
BrynNelson546 views
11.30.23A Poverty and Inequality in America.pptx von mary850239
11.30.23A Poverty and Inequality in America.pptx11.30.23A Poverty and Inequality in America.pptx
11.30.23A Poverty and Inequality in America.pptx
mary85023986 views
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37 von MysoreMuleSoftMeetup
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
Payment Integration using Braintree Connector | MuleSoft Mysore Meetup #37
Education of marginalized and socially disadvantages segments.pptx von GarimaBhati5
Education of marginalized and socially disadvantages segments.pptxEducation of marginalized and socially disadvantages segments.pptx
Education of marginalized and socially disadvantages segments.pptx
GarimaBhati540 views
ANGULARJS.pdf von ArthyR3
ANGULARJS.pdfANGULARJS.pdf
ANGULARJS.pdf
ArthyR349 views
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE... von Nguyen Thanh Tu Collection
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...
BÀI TẬP BỔ TRỢ TIẾNG ANH 11 THEO ĐƠN VỊ BÀI HỌC - CẢ NĂM - CÓ FILE NGHE (FRIE...
CUNY IT Picciano.pptx von apicciano
CUNY IT Picciano.pptxCUNY IT Picciano.pptx
CUNY IT Picciano.pptx
apicciano60 views
Monthly Information Session for MV Asterix (November) von Esquimalt MFRC
Monthly Information Session for MV Asterix (November)Monthly Information Session for MV Asterix (November)
Monthly Information Session for MV Asterix (November)
Esquimalt MFRC98 views
Creative Restart 2023: Atila Martins - Craft: A Necessity, Not a Choice von Taste
Creative Restart 2023: Atila Martins - Craft: A Necessity, Not a ChoiceCreative Restart 2023: Atila Martins - Craft: A Necessity, Not a Choice
Creative Restart 2023: Atila Martins - Craft: A Necessity, Not a Choice
Taste41 views

Map Reduce.pdf

  • 1. Map Reduce databases project and need the explanation and answer to help me learn. Hello Having work related to Map Reduce. Are you interested? Need task 3 and 4 only. Requirements: Task and Mark distribution: 1. ER model and table generation (252. SQL programming (30%) 3. Sequential and distributed processing (254. Research report (20%) Notes: 1. You are expected to use the students can contact Centre for Academic Writing (CAW)2. Please notify your registry 3. Any student requiring an extension or deferral should follow the university process as outlined R model and table generation (25%) processing (25%) the CUHarvard referencing format. For support and advice on how this Centre for Academic Writing (CAW). registry course support team and module leader for disability supportAny student requiring an extension or deferral should follow the university process as outlined For support and advice on how this and module leader for disability support. Any student requiring an extension or deferral should follow the university process as outlined • 7086CEM – Data Management Systems This assignment is made up of four parts: - Part A deals with database design, using E-R modelling. - Part B concerns database creation and querying, using SQL. - Part C covers parallel processing frameworks - Part D involves a research report Coursework requirements Please note that plagiarism (copying and pasting from sources) and collusion (submitting the same content as a fellow student) are detected automatically by Turnitin and flagged. You should be able to check for similarity. You are expected to submit your work. The report must be self-contained. Links to external datasets are not acceptable. Appendices are not required. Report structure Please follow the instructions below. You are expected: a) to include the name and the code of the module, the title of the CW, your names and your student Ids. b) to follow the structure of the CW brief, and present your work accordingly, i.e.: Part A 1. 2. ...... Part B 1. 2. a) b) c) d) e) ...... Submission process for the group CW is submitted as a group of two students as follows: a) You are expected to work as a group on ALL the parts of the CW. b) one student will submit individually in his/her name on Aula, the whole report, with both names on the front page, as a single pdf file, and c) the other student must submit individually in his/her name on Aula, one page only, with both names, as a pdf file, where he/she presents reflections on the collaborative work. Part A. Conceptual modelling An equipment company wishes to create a database to
  • 2. support the hiring of tools and machinery to clients. The company has three types of equipment: power tools, such as drills and vacuum cleaners, plants such as excavators and scaffolding. Each piece of equipment is identified by a number. Power tools are described by their model and the voltage they use, whereas plants are classified by their model and their size in tonnes. Scaffolding can be traditional, aluminium or fibreglass; in addition, its width can be single or double. A large piece of equipment may be composed of smaller pieces of equipment. The company has various outlets and each has staff including a manager and several senior technicians who are responsible for supervising the work of allocated groups of technicians. A supervision record is also kept for a specific date. All employees are identified by their number, name, date of birth (DOB) and address. Furthermore, a record is kept on their employment records and their qualifications. Each outlet has a stock of equipment that may be hired by clients for varying periods of time, from a minimum of four hours to a maximum of six months. Each hire agreement between a client and the company is uniquely identified by using a hire number. Each client is identified by a number and a name. The company insists that each client must take out insurance cover for each equipment hire period. Each insurance cover is identified by a unique insurance number and includes the description of the insurance. The company wishes to keep a record of the member of staff who was in charge of a specific hire agreement. Each piece of equipment is checked for faults when it is returned by the client, and the faults/defects/damage recorded. The company keeps a record of the hire history of each client. 1. Create an ER diagram for the above scenario and indicate the cardinality of relationships and the nature of the associations (mandatory or optional). You should allocate adequate attributes to the entities of interest, especially the identifiers. (20%) 2. Generate, with justification, relational tables from the ER diagram. Indicate clearly the names of the tables, the attributes, the primary keys and the foreign keys. (5%) Guidance: i) Create the ER diagram and clearly identify any identifiers, and indicate the cardinality of relationships and the nature of the associations (mandatory or optional). ii) Generate tables and include primary and foreign keys. Use the schema notation; you do not have to produce SQL statements. Example of table generation in schema form: Course(courseId, courseName) Student (studentId, name, courseId*) Part B: SQL programming Consider the following “Publication” Database schema, where AuthorId, CategoryId, PubIicationd, UserId are unique identifiers. Category (CategoryId, Type) Author (AuthorId, Name) Publisher(PublisherID, PublisherName) Publication (PublicationId, AuthorId, Title, CategorytId, PublishedYear, PublisherId) Request (UserId, PublicationId, RequestDate) User (UserId, Name, Email, Password) TABLE: Author AuthorId Name A011 Dingle R A012 Ransome A A013 Wardale R A014 Alexander T A015 Spurrier S TABLE: Publisher PublisherId PublisherName P01 Pearson P02 HarperCollins P03 Simon and Schuster TABLE: Category CategoryId Type C911 Short stories C912 Journal articles C913 Biography C914 Illustrations TABLE: Publication PubId AuthorId Title CategoryId PublishedYear PublisherId P001 A011 The Blue Treacle C911 1911 P01 P002 A012 In Aleppo Once C911 2001 P01 P003 A012 Illustrating Arthur Ransome C914 1973 P03 P004 A012 Ransome the Artist C914 1994 P03 P005 A014 Bohemia In London C912 2008 P02 P006 A011 The Best of Childhood
  • 3. C911 2002 P01 P007 A015 Distilled Enthusiasms C912 2010 P02 TABLE: Request UserId PublicationId RequestDate U016 P001 05-Oct-2020 U241 P001 28-Sep-2020 U55 P002 08- Sep-2020 U016 P004 06-Oct-2020 U121 P002 23-Sep-2020 TABLE: User UserId Name Email Password U111 Kenderine J KenderineJ@hotmail.com Kenj2 U241 Wang F WangF@hotmail.com Wanf05 U55 Flavel K FlavelK@hotmail.com Flak77 U016 Zidane Z ZidaneZ@hotmail.com Zidz13 U121 Keita R KeitaR@hotmail.com Keir22 1. Use appropriate data types and write the SQL statements to create the tables defined in the schema above. (10%) 2. Write SQL Statements to return the following data from the Publication database. a) The names and the email of the users and the dates of all their requests. (4%) b) The details of the publications that were requested during September 2020 in descending order of requested dates. The details should include publication ids, titles and dates on which publications were requested. (4%) c) The names of the users, the title of the publications they requested, and the names of their authors. (4%) d) The number of publications requested for each of the categories (e.g. ‘Illustrations’). (4%) e) The names of users who requested more than one publication. (4%) Guidance: Please use standard SQL in Oracle Live SQL. Indicate clearly the primary keys and the foreign keys. State the SQL statements and give the results. Select and use appropriate data. The presentation of each query should have a text summary which includes i) the query itself, ii) the corresponding SQL statement solution, iii) the result of the execution of the statement and iv) evidence that you have used standard SQL and implemented each statement on a database (use screenshots, no need for appendices). A small data sample is given. When appropriate, you can create and insert additional data records in order to make sure that the queries return results. Links to external datasets or code and are not acceptable. The report must be self- contained. Part C. Sequential and parallel processing Consider a university data store about module coursework (CW) submissions by students. For each CW submission for a specific module, each record of the data store holds the following data: 1. Module code 2. Module name 3. Semester id 4. Year 5. CW title 6. Student id 7. Student name 8. CW submission date 9. Actual date of CW submission by student Examples of records would have the following values: (7086CEM, Data Management Systems, S1, 2022, Database Management, 10101010, Lionel Messi, 28/11/2022, 30/11/2022) (7053CEM, Big Data and Visualisation, S2, 2021, Data Analysis, 07070707, Cristiano Ronaldo, 21/10/2021, 25/10/2021) (7071CEM, Information Retrieval, S3, 2020, Comparative Evaluation, 11011010, Zinedine Zidane, 09/09/2020, 25/10/2020) A record will contain a list of key-value pairs, as presented in the lectures. A block will consist of a sequence of records. You are expected to create similar additional records for this part in order to produce a mapReduce solution in question 2, below. For each module, the university Registry would like to determine the number of CW submissions which were submitted late, by one day or more. 1. Assuming that the data is stored in a relational database, produce with justification a) the SQL statement to create the corresponding table for the data store and b) the SQL statement to determine the number of CW submissions that were submitted late by one day or more, for each module. (You do not
  • 4. need to implement the SQL statements in Oracle Live SQL). (5%) 2. Assuming that the data is too large to be processed in a centralised manner, and that it is stored in an ordinary file, produce a decentralised solution, equivalent to 1.b) above, which applies MapReduce to the data processing. Justify your decisions and all the steps of your solution. (20%) Guidance: You should study carefully the examples of mapReduce covered in the lecture notes. You should consider the structure of the key in the (key, value) pair in the original record and in the mapping stage. This is not a programming exercise. The solution should follow the structure of the solutions given in the lecture notes. Part D. Research report Write in your own words a brief report where you introduce and discuss data warehouses. You should cover the rationale for data warehouses, their structure and their use. The maximum length of the report is 700 words. Longer reports will be penalised. (20%) Guidance: Use your own words for the report. Copying and pasting is plagiarism. You should include relevant references. The maximum length of the report is 700 words. Longer reports will be penalised. Marking Rubric Marking Scheme Grade Part A Part B Part C Part D <40 Incorrect interpretation of scenario and Incomplete formulation of solution Limited identification of entities and poor annotation of relationships Incorrect generation of relational tables Limited or absent rationale Poor interpretation of requirements and of queries DDL and DML SQL statements limited in scope Incomplete and incorrect SQL statements Absence of rationale Partial understanding of requirements and partially correct SQL formulation Partial understanding of context and relevance of parallel processing Incomplete steps in the application of MapReduce Partial justification of design decisions Lack of understanding of requirements Inadequate identification of issues Incompetent understanding of structural and processing components Poorly written essay 40-49 Partial interpretation of scenario and formulation of solution Partially correct ER diagram with relevant entities and relationships Partial consistency in the generation of the relational tables Partial justification of design decisions Basic understanding of requirements and partially correct interpretation Relevant use of DDL and DML statements in solution formulation Partially complete SQL statements Limited justification of solution Partial understanding of requirements and partially correct SQL formulation Partial understanding of context and relevance of parallel processing Incomplete steps in the application of MapReduce Partial justification of design decisions Partial interpretation of requirements Limited presentation of key issues Relevant description of structural and processing components Mostly descriptive essay 50-59 Adequate and consistent interpretation of scenario and satisfactory conceptual modelling Mostly correct generation of ER diagram with relevant entities and relationships Relatively competent generation of the relational tables Adequately justified design decisions Adequate understanding of requirements and mostly correct interpretation Adequate use of DDL and DML SQL statements in solution formulation Mostly complete SQL statements Adequate justification of solution Adequate understanding of requirements and correct SQL formulation Adequate presentation of context of application of parallel processing Mostly correct application of the different steps of MapReduce Adequately justified solution Adequate interpretation of requirements
  • 5. Key issues well identified and partially addressed Adequate presentation of key structural and processing components Adequately written essay 60-69 Good and clear interpretation of scenario and good solution formulation of the two parts Correct and complete identification of entities and relationships Consistent generation of relational tables Good interpretation of requirements and good formulation of solution Correct use of DDL and DML statements in solution formulation Complete and relevant treatment of queries Good justification of decisions Good solution to the initial query in terms of SQL statements Focused presentation of context of application of parallel processing Clearly stated and correct sequential steps of MapReduce Well expressed rationale Good understanding and statement of requirements Well focused presentation of main issues Good description of structural and processing components Well presented Well expressed rationale essay with some reflection 70+ Very good and clear interpretation of scenario and excellent solution formulation of the two parts Correct and coherent identification of well annotated entities and types of relationships Logical and consistent generation of relational tables Excellent justification of design decisions Excellent interpretation of requirements and very good formulation of solution Excellent use of DDL and DML statements in solution formulation Complete treatment of queries and SQL formulation Very good rationale Very good solution to the initial query in terms of SQL statements Excellent presentation on the need for an overall parallel solution Clear deployment and annotation of the sequential steps of MapReduce Excellent rationale Excellent understanding and interpretation of requirements Very good identification and formulation of key issues Relevant and specific presentation of structural and processing components Reflective writing supported by an excellent structure