Visitation time
scheduling
Alfonso de la Fuente Ruiz
2013
Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
Content index
 Scenario
 The O.R. Problem
 Initial considerations
 First approach: Microsoft Excel
 Importing data from CSV into MS Excel
 Exploring the dataset
 Data order by client
 Vouching for data validity
 Alternatives and decision making
 Coding software and choosing tools
 Microsoft Excel Macros
 Open Office Suite: Calc
 Structured Query Language
 Open Office Suite: Base
 Visual Studio Express
 Oracle and PL/SQL
 Using Transact-SQL in Microsoft SQL Server 2k+
 Cleaning the data
 Pseudocode for data cleaning
 Result after data cleaning
 PERT and GANTT
 Scheduling schemes
 Scheduling scheme chosen
 Coding the scheme
 Reporting output
 ACID Compliant DBMS
 ACID Compliancy in MS SQL Server (I)
 ACID Compliancy in MS SQL Server (II)
 ACID Compliancy in MS SQL Server (III)
 Database design: a bird´s eye view
 Database normalization
 Database map, visually
 Database map: Table definition
 Database map: Procedures and functions
 References
 Conclusion
Scenario:
 The small test project to be prepared was described in a PDF (Portable Document Format) file, and the required data was supplied in a CSV (Comma-Separated Values) file.
 One calendar week was given to find a solution and to prepare a presentation to be delivered remotely to the UK.
The O.R. problem
 From the Operational Research perspective, the problem is a very simple case of “visitation time scheduling” with multiple clients and a single server that can attend only one request at a time.
 Therefore, a number of solution schemes are readily available, such as First-Come First-Served, priority queues, Gantt techniques and others.
 The difficulty of the problem seems to lie not in coding the algorithm, but in formatting the data (both input and output) and in designing the database.
 The precise software tools to be used were left unspecified, so a large number of alternatives are possible. SQL Server and PostgreSQL were suggested.
 In our approach, we first use Microsoft Excel to study the data and to perform basic filtering, after which we consider a number of solutions from the software market.
Initial considerations
 This problem is a typical Computer Science project for Business or Engineering students during their first years at university.
 Students are usually asked to solve this kind of problem within one term, having a couple of months (up to a semester, depending on academic pressure) to solve it and to prepare a written project, alone or in small teams, to be handed in at the end.
 Preparing the project case involves careful design considerations, ranging from plagiarism avoidance to speeding up marking and exception control.
 This kind of knowledge can also come in handy for real business applications at the SME (Small and Medium-sized Enterprise) level or larger.
 In most scenarios, just a subset of the information contained in these pages will be documented and presented to students or staff, so as to avoid information overload and to enhance operational understanding.
First approach: Microsoft Excel
 Since SQL Server was the first option suggested, and Microsoft SQL Server (MSSQLS) is a very popular software package from Microsoft, in our first approach we load the CSV data file into Microsoft Excel (2013, Spanish version) to have a look at it.
 To do so, we import the data from the file using the “Data/Import/From textfile…” feature.
 There we select the “simple.csv” file and follow the import wizard.
 In the wizard window we select the delimited file type, with headers, Windows (ANSI) file origin, “Comma” (,) as the separator character, and the “General” data type for every column so that Excel autodetects it.
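The same import can be sketched outside Excel. The snippet below is a minimal illustration in Python, not part of the original project: the header names are assumed from the columns shown later in these slides, and the three sample rows stand in for the real “simple.csv” file.

```python
import csv
import io

# Hypothetical excerpt of "simple.csv"; the header names are an assumption
# based on the column names that appear later in the slides.
SAMPLE = """id,client_id,datetime_from,datetime_to,name
1,1,2013-01-01 09:00,2013-01-01 10:00,gary doades
2,4,2013-01-01 01:00,2013-01-01 01:01,olivia groom-smith
8,1,2013-01-01 09:01,2013-01-01 09:00,gary doades
"""


def load_rows(text_file):
    """Read comma-separated rows with a header line into a list of dicts."""
    return list(csv.DictReader(text_file, delimiter=","))


rows = load_rows(io.StringIO(SAMPLE))
# Every value arrives as text; sorting on the ISO-style start field mimics
# the spreadsheet's autofilter ordering.
rows.sort(key=lambda r: r["datetime_from"])
for r in rows:
    print(r["id"], r["datetime_from"], r["datetime_to"])
```

For the real file, the same function could be fed with `open("simple.csv", newline="", encoding="cp1252")`, where `cp1252` is our guess at what the wizard calls Windows (ANSI).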
Importing data from CSV into MS Excel
 As a result, we obtain a set of columns whose headers offer the “autofilter” option, which we often use to sort alphabetically or numerically.
 Here we sorted the data by the “datetime_from” field, so that we can inspect the information and form some hypotheses about its contents.
 We can easily observe several kinds of plausible anomalies in the data, which force us to make some decisions at design time.
Exploring the dataset
 At this point, we sketch the problem on a sheet of paper to gain further insight before moving on to the software tools.
 There we draw some data schemes and timetables that will be commented on further on.
 Among other things, we observe that the total time for all visitations does not exceed the total time available for service under any set of assumptions. This is a good sign, for it means that we will be able to provide the service without overbooking.
Data order by client
 We now apply a second ordering to the data over the client_id field.
 We name the rows as c#t#, where hashes represent client number and task
number for that particular client.
 Therefore we obtain the following set:
{c1t1,c1t2,c1t3,c1t4 ; c2t1,c2t2,c2t3,c2t4 ; c3t1 ; c4t1,c4t2,c4t3}
id  client_id  datetime_from     datetime_to       Name                Rep?  Inv?  >24h?
1   1          2013-01-01 09:00  2013-01-01 10:00  gary doades         0     0     0,00
8   1          2013-01-01 09:01  2013-01-01 09:00  gary doades         0     1     0,00
3   1          2013-01-01 09:45  2013-01-01 10:45  gary doades         0     0     0,00
6   1          2013-01-01 12:00  2013-01-01 12:30  gary doades         0     0     0,00
4   2          2013-01-01 23:00  2013-01-02 06:00  richard ward        0     0     1,00
5   2          2013-01-02 04:00  2013-01-02 04:15  richard ward        0     0     0,00
10  2          2013-01-02 05:00  2013-01-02 06:00  richard ward        0     0     0,00
11  2          2013-02-30 01:00  2013-02-30 02:00  richard ward        0     0     #¡VALOR!
7   3          2013-01-01 01:00  2013-01-01 02:00  natasha lunt        0     0     0,00
2   4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  1     0     0,00
9   4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  0     0     0,00
12  4          2013-01-01 18:00  2013-01-02 19:00  olivia groom-smith  0     0     1,00
Vouching for data validity
 To detect anomalies, we sorted the data by “datetime_from” and then implemented a few quick tests in boolean logic:
 REPETITION “Rep?”: IF(AND(C2=C3;D2=D3);1;0)
Checks whether two visitation time frames are repeated in consecutive rows. Instances #2 and #9 for Olivia Groom-Smith are. Obviously not applicable to the last row.
 INVERSION “Inv?”: IF([@[datetime_from]]>=[@[datetime_to]];1;0)
Flags rows whose end time does not fall strictly after the start time. Instance #8 for Gary Doades is flagged.
 MORE THAN ONE DAY “>24h?”: =DAYS([@[datetime_to]];[@[datetime_from]])
Checks whether a visitation begins and ends on different days. Instances #12 and #4 do; #12 lasts for more than 24 hours (25 hours) and #4 does not (just 7 hours).
 Instance #11 also returns an error code because the date is not valid: February does not have 30 days.
id  client_id  datetime_from     datetime_to       Name                Rep?  Inv?  >24h?
7   3          2013-01-01 01:00  2013-01-01 02:00  natasha lunt        0     0     0,00
2   4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  1     0     0,00
9   4          2013-01-01 01:00  2013-01-01 01:01  olivia groom-smith  0     0     0,00
1   1          2013-01-01 09:00  2013-01-01 10:00  gary doades         0     0     0,00
8   1          2013-01-01 09:01  2013-01-01 09:00  gary doades         0     1     0,00
3   1          2013-01-01 09:45  2013-01-01 10:45  gary doades         0     0     0,00
6   1          2013-01-01 12:00  2013-01-01 12:30  gary doades         0     0     0,00
12  4          2013-01-01 18:00  2013-01-02 19:00  olivia groom-smith  0     0     1,00
4   2          2013-01-01 23:00  2013-01-02 06:00  richard ward        0     0     1,00
5   2          2013-01-02 04:00  2013-01-02 04:15  richard ward        0     0     0,00
10  2          2013-01-02 05:00  2013-01-02 06:00  richard ward        0     0     0,00
11  2          2013-02-30 01:00  2013-02-30 02:00  richard ward        0     0     #¡VALOR!
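The three spreadsheet tests can also be sketched in code. The following Python fragment is only an illustration under our own assumptions: the function names `parse` and `flags` are hypothetical, and an unparseable date is reported as `None` where the spreadsheet shows an error cell.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
# (id, datetime_from, datetime_to) as listed in the slides, sorted by start time.
ROWS = [
    (7, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, "2013-01-01 01:00", "2013-01-01 01:01"),
    (9, "2013-01-01 01:00", "2013-01-01 01:01"),
    (1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (8, "2013-01-01 09:01", "2013-01-01 09:00"),
    (3, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, "2013-01-01 12:00", "2013-01-01 12:30"),
    (12, "2013-01-01 18:00", "2013-01-02 19:00"),
    (4, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, "2013-01-02 05:00", "2013-01-02 06:00"),
    (11, "2013-02-30 01:00", "2013-02-30 02:00"),
]


def parse(text):
    """Return a datetime, or None when the text is not a real calendar date."""
    try:
        return datetime.strptime(text, FMT)
    except ValueError:
        return None


def flags(rows):
    """Map each id to (rep, inv, days); days is None for unparseable dates."""
    result = {}
    for i, (rid, start, end) in enumerate(rows):
        nxt = rows[i + 1] if i + 1 < len(rows) else None
        # Rep?: same time frame as the next row (not applicable to the last row).
        rep = int(nxt is not None and (start, end) == (nxt[1], nxt[2]))
        t0, t1 = parse(start), parse(end)
        if t0 is None or t1 is None:
            result[rid] = (rep, 0, None)  # mirrors the spreadsheet's error cell
            continue
        inv = int(t0 >= t1)                  # Inv?: end not strictly after start
        days = (t1.date() - t0.date()).days  # >24h?: calendar days spanned
        result[rid] = (rep, inv, days)
    return result


for rid, f in flags(ROWS).items():
    print(rid, f)
```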
Alternatives and decision making
 Our first observation is that these data show some conflicts that require decisions:
 There are 4 clients (customers) and 12 tasks a priori.
 Task c1t2 defines a visitation that ends before it begins. This could only be understood as a reverse visitation (server visiting client) or as a quantum effect.
 We assume that both readings lie outside the scope of the problem. Having discarded them, the choice is either to swap the two times or to remove the reservation row.
 Some tasks already overlap in the order given a priori, such as c1t1 and c1t3, so rearrangement is required.
 Task c2t1 occurs overnight, so it begins and ends on different dates.
 Task c2t4 occurs in a different month from all the others, making it a possible outlier or data error. Furthermore, the date is not valid, since February cannot have 30 days.
 The course of action here could be either to remove the whole row or to correct the month to January.
 Since there is no certainty that this table must contain data from a single month, the whole row will be treated as invalid.
 Client #3 is the only one with a single visitation task defined.
 Tasks c4t1 and c4t2 are repeated, so one of them could be deleted, or they could be ordered by their id number. Furthermore, they last only one minute, so they are possibly outliers or mistakes.
 Task c4t3 lasts for more than 24 hours, making it a possible outlier or mistake. Like c2t1, it also runs overnight.
 The output will be a set of at most max_id (12) elements, from which conflicting rows are to be deleted.
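The overlap mentioned above (c1t1 against c1t3) can be checked mechanically. The sketch below uses Python with a hypothetical helper, `overlapping_pairs`, and only a handful of the reservations; treating the intervals as half-open is our own assumption.

```python
from datetime import datetime
from itertools import combinations

FMT = "%Y-%m-%d %H:%M"
# A few reservations from the table: id -> (datetime_from, datetime_to).
SLOTS = {
    1: ("2013-01-01 09:00", "2013-01-01 10:00"),
    3: ("2013-01-01 09:45", "2013-01-01 10:45"),
    6: ("2013-01-01 12:00", "2013-01-01 12:30"),
    4: ("2013-01-01 23:00", "2013-01-02 06:00"),
    5: ("2013-01-02 04:00", "2013-01-02 04:15"),
}


def overlapping_pairs(slots):
    """Return the id pairs whose time intervals intersect, in ascending order."""
    parsed = {rid: tuple(datetime.strptime(t, FMT) for t in span)
              for rid, span in slots.items()}
    pairs = []
    for a, b in combinations(sorted(parsed), 2):
        (a0, a1), (b0, b1) = parsed[a], parsed[b]
        if a0 < b1 and b0 < a1:  # half-open intervals [from, to)
            pairs.append((a, b))
    return pairs


print(overlapping_pairs(SLOTS))
```

With a single server, any pair reported here has to be rearranged by whatever scheduling scheme is chosen later.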
Coding software and choosing tools
 A large number of alternatives are readily available on the market that provide the software framework needed to deal with these kinds of problems.
 To name but a few: Microsoft Excel macros, MS SQL Server, MS Visual Studio Express, MS Access, MS Project, Open Office Base, MySQL, the SAS (Statistical Analysis System) GANTT module, Visual Basic, MicroGPSS, FORTRAN, Borland C++, Delphi, Java, PHP,…
 From here on we show a brief selection of those tools. The decision is usually made out of convenience, with criteria such as availability (having the software package already installed and configured on the machine), but there are multiple choices, all of them valid.
 Whenever possible, specialised freeware 4GT (Fourth Generation Techniques) will be used, as these are generally considered cheaper and more efficient, optimize internal computations and work at a higher abstraction level, thus greatly simplifying coding.
 Finally, we will deal with the database design issue according to analogous principles.
Open Office Suite: Calc
 When no budget is allocated for software licensing, universities and other organizations often use the OpenOffice suite for teaching and operational applications.
 Open Office offers a range of solutions, such as the “Calc” spreadsheet program and the “Base” database management program.
 Here we can observe how, upon importing the data into OpenOffice Calc in the same way as we did in Excel, the invalid “February 30th” date is immediately detected.
 Some other tools (such as OpenProj) will automatically detect the mistake in the data and assign the next valid date to the field (rolling over 2 days, to March the 2nd).
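Both behaviours, rejecting the date and rolling it forward, can be illustrated in a few lines of Python. This is only a sketch: `parse_strict` and `roll_forward` are hypothetical names, and we do not claim that either tool works this way internally.

```python
from datetime import date, datetime, timedelta


def parse_strict(text):
    """Parse 'YYYY-MM-DD HH:MM'; return None for impossible calendar dates."""
    try:
        return datetime.strptime(text, "%Y-%m-%d %H:%M")
    except ValueError:
        return None


def roll_forward(year, month, day):
    """Lenient reading: count the extra days from the first day of the month."""
    return date(year, month, 1) + timedelta(days=day - 1)


print(parse_strict("2013-02-30 01:00"))  # None: February 30th does not exist
print(roll_forward(2013, 2, 30))         # 2013-03-02
```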
Microsoft Excel Macros
 Another option is to record a macro in Microsoft Excel.
 To do so, we need to activate the “Developer” tab.
 Recording a macro is a straightforward process, but the syntax and structure of the generated source code are quite complex should we need to amend anything in it.
 To keep the code as readable as possible, we can use some other means.
 The logical course of action seems to be to use SQL code to reach the required scheduling solution.
Structured Query Language
 Structured Query Language (SQL) is the market standard for solving these kinds of problems.
 Therefore, some SQL programming expertise is assumed in order to reach a solution.
Open Office Suite: Base
 Open Office Base can be used to process the data
and to query the table for the output requested,
in the same way that the Microsoft Access
software package would.
 In OO Base, we can quickly create the table that
we need, with the advantage that it is open source
software and implements SQL.
 To do so, we first need to specify the field names
and types. Finally we would need to populate it
with actual data from OO Calc or MS Excel.
Visual Studio Express
 Another Microsoft tool that can be used is Visual Studio Express (demo available for free download).
 Here we can observe how VSE also detects that one of the dates (February the 30th) is invalid.
 Visual Studio Express can also be used to process the data and to query the table for the requested output.
 It also implements SQL and is designed for seamless data exchange with Microsoft SQL servers.
Oracle and PL/SQL
 Oracle is a very powerful tool used by larger organizations, such as city councils or international corporations. It has its own language extension for database management: PL/SQL.
 PL/SQL stands for "Procedural Language Extensions to SQL." PL/SQL extends SQL by adding the programming structures and subroutines available in any high-level language.
 The syntax and capabilities are very similar to those of T-SQL and other derivatives of standard SQL.
 Many Oracle applications are built using a client-server architecture. The Oracle database resides on the server. The program that makes requests against this database resides on the client machine. This program can be written in C, Java, or PL/SQL.
 Because PL/SQL is just like any other programming language, it has syntax and rules that determine how programming statements work together. It is important to realize that PL/SQL is not a stand-alone programming language. PL/SQL is a part of the Oracle RDBMS, and it can reside in two environments, the client and the server. As a result, it is very easy to move PL/SQL modules between server-side and client-side applications.
 Oracle also supplies a command-line SQL tool called SQL*Plus.
Using Transact-SQL in Microsoft SQL Server 2k+
 Microsoft SQL Server 2000 (and above) is one of the most popular software tools used to solve these kinds of problems at the business level, wherever large numbers of tables and instances are involved.
 MSSQLS uses a powerful extension of standard SQL, originally developed by Sybase, called Transact-SQL. T-SQL code can be bundled into a variety of software applications: web pages, Visual Basic, Visual C# and so on.
 Newer MS SQL Server versions, such as 2005, work with CSV files and interoperate with Visual Studio, MS Excel and MS Project.
 MS SQL Server requires a moderate investment in licensing.
 The example shown on this slide (cf. the references) illustrates how to use the ORDER BY and GROUP BY clauses in T-SQL to sort and aggregate data.
 For our exercise, it is a very useful tool for writing code that orders the preprocessed visitation list by starting date and returns results ordered by client, once a scheduling scheme has been agreed upon and implemented.
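As an illustration of the two clauses, the sketch below runs standard SQL through Python's built-in `sqlite3` module. Note the assumptions: this is the SQLite dialect rather than T-SQL, the table layout is ours, and only five of the reservations are loaded.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""CREATE TABLE reservations (
    id INTEGER PRIMARY KEY, client_id INTEGER,
    datetime_from TEXT, datetime_to TEXT)""")
conn.executemany(
    "INSERT INTO reservations VALUES (?, ?, ?, ?)",
    [(1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
     (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
     (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
     (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
     (4, 2, "2013-01-01 23:00", "2013-01-02 06:00")])

# ORDER BY: list reservations by start time, ties broken by id.
ordered = conn.execute(
    "SELECT id FROM reservations ORDER BY datetime_from, id").fetchall()

# GROUP BY: one aggregated row per client (count and earliest start).
per_client = conn.execute(
    """SELECT client_id, COUNT(*), MIN(datetime_from)
       FROM reservations GROUP BY client_id ORDER BY client_id""").fetchall()

print(ordered)
print(per_client)
```

Storing the timestamps as ISO-style text keeps lexicographic and chronological order identical, which is why plain `ORDER BY` and `MIN` are enough here.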
Cleaning the data
 As we have observed several irregularities in the input data, we need to clean it by deleting all affected rows.
 To do so, we can either use the built-in tools of the software package of our choice or write some code to do it for us.
 Given that the number of instances (rows) in our table is very small, we choose to clean it by hand (with the software package's built-in tools) in order to speed up the process.
 If the number of instances were higher (say dozens, hundreds or even millions of records), we would need to code a clean-up routine for this task.
 According to the validity analysis performed at a previous stage, and given the time available and the scope, we choose to simplify as much as possible by completely removing any instance that shows any of the following conflicts:
 REPETITION: All reservations must be DISTINCT, so second and further identical reservations are deleted. Only the one with the lowest id is kept.
 INVERSION: Reservations with zero or negative duration are deleted.
 MORE THAN 24 HOURS SERVICE TIME: Reservations that span more than one day are deleted only if the total service time is greater than 24 hours. Otherwise they are kept, assuming they occur over a night shift. We also keep visitations lasting just one minute, assuming they represent a quick status check.
 WRONG DATA INPUT: Reservations with an invalid date or any other invalid value in any field are deleted.
Pseudocode for data cleaning
 Since no tool was specified in the problem statement, and a wide range of options exists, including several variants and extensions of SQL, we use pseudocode to show how to program the cleaning routine.
 Once a tool has been chosen, we can easily translate this pseudocode into the grammar of the language of choice, without any loss of generality.
 We assume that the language provides a few simple subroutines for ordering, deletion and so on.
  We assume ROWS (for short) is a table that is to contain the RESERVATIONS
 ROWS := SELECT DISTINCT FROM RESERVATIONS  Removes duplicates, comparing every field except ‘rows.id’ (the master key) and keeping the lowest id
 ORDER ROWS BY DATETIME_FROM  Orders all rows by starting time
 FOR ID IN ROWS LOOP:  For every distinct row repeat:
     IF DATE(ROWS[ID].DATETIME_FROM) < 0  All invalid dates should return a negative
     THEN DELETE(ROWS[ID])  Cleans wrongly timestamped rows
     IF ( ROWS[ID].DATETIME_FROM >= ROWS[ID].DATETIME_TO )
     THEN DELETE(ROWS[ID])  Cleans rows with non-positive visitation time spans
     IF ( HOURS(ROWS[ID].DATETIME_TO - ROWS[ID].DATETIME_FROM) > 24 )
     THEN DELETE(ROWS[ID])  Cleans rows with visitations lasting for more than 24 hours
 END LOOP  End of loop
 COMMIT_WRITE(ROWS,RESERVATIONS)  Replaces all initial rows with the result of this cleaning routine
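The cleaning rules can be turned into a short runnable sketch. The Python below is one possible translation, not the tool actually used in the project: the names `clean` and `parse` are hypothetical, a duplicate is taken to mean the same client and time frame, and the 24-hour rule follows the textual description (strictly more than 24 hours).

```python
from datetime import datetime, timedelta

FMT = "%Y-%m-%d %H:%M"
# (id, client_id, datetime_from, datetime_to) as listed in the slides.
RESERVATIONS = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (8, 1, "2013-01-01 09:01", "2013-01-01 09:00"),
    (9, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
    (11, 2, "2013-02-30 01:00", "2013-02-30 02:00"),
    (12, 4, "2013-01-01 18:00", "2013-01-02 19:00"),
]


def parse(text):
    """Return a datetime, or None when the text is not a real calendar date."""
    try:
        return datetime.strptime(text, FMT)
    except ValueError:
        return None


def clean(rows):
    """Return the rows that survive the four cleaning rules, ordered by start."""
    kept, seen = [], set()
    for rid, client, start, end in sorted(rows):  # lowest id first
        t0, t1 = parse(start), parse(end)
        if t0 is None or t1 is None:              # wrong data input
            continue
        if t0 >= t1:                              # inversion
            continue
        if t1 - t0 > timedelta(hours=24):         # more than 24 hours
            continue
        key = (client, t0, t1)
        if key in seen:                           # repetition: keep lowest id
            continue
        seen.add(key)
        kept.append((rid, client, t0, t1))
    return sorted(kept, key=lambda r: (r[2], r[0]))  # by start time, then id


cleaned = clean(RESERVATIONS)
print([r[0] for r in cleaned])
```

Processing the rows in id order is what guarantees that, among identical reservations, the one with the lowest id is the survivor.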
Result after data cleaning
 Subsets to be subtracted:
 Repetition candidate subsets: {c4t1,c4t2}. Choice subset: {c4t2}
 Inversion: One negative time lapse {c1t2}
 >24 hours: {c4t3}
 Wrong input: invalid date (February 30th) {c2t4}
 Subtraction set: {c1t2,c2t4,c4t2,c4t3}
 We end up with 8 instances after cleaning:
{c1t1,c1t2,c1t3,c1t4, c2t1,c2t2,c2t3,c2t4, c3t1, c4t1,c4t2,c4t3}
– {c1t2,c2t4,c4t2,c4t3}
= {c1t1,c1t3,c1t4, c2t1,c2t2,c2t3, c3t1, c4t1}
 Or, according to the master key “Reservation ‘id’”:
{1,8,3,6,4,5,10,11,7,2,9,12}
- {8,9,11,12}
= {7,2,1,3,6,4,5,10}
PERT and GANTT
 The Program Evaluation and Review Technique (PERT) belongs to a set of Project Management tools that are commonly used in scheduling environments.
 The most widely known of these is the Gantt bar chart, where we can define tasks to be executed in parallel, in series or with interdependencies.
 Again, a number of tools can read an input, generate a Gantt chart and apply scheduling schemes to the data, such as Microsoft Project, GanttProject, OpenProj and several others. Alternatively, we can just use a general-purpose RDBMS with SQL.
Scheduling schemes
 After cleaning the data, several issues come to mind that we should consider when scheduling the visitations. We name just a few of the most relevant:
 We might want all of the visitations to be scheduled as soon as possible.
 The first visitation occurs at 9:00 am, so we could schedule all of the reservations to be attended only during office hours.
 We might also want to add breaks for meals, resting times, service maintenance or other managerial reasons. We assume none.
 Some visitations occur overnight, so we can decide to schedule visitations at any time of the day or night.
 We might want to reschedule as few reservations as possible, or to have all visitations for the same client served together, one right after another, so that each client comes only once.
 We might want to simplify:
 Take the earliest reservation starting time as the beginning and then queue all others right behind it, ordered first by their starting time and second (if more than one share it) by some other criterion.
 Other possible criteria are: visitation duration, client id, alphabetical order by name, or any other priority scheme. For the sake of simplicity we choose the plain vanilla reservation id (the table's master key).
Scheduling scheme chosen
 A number of combinations of these and other criteria exist, resulting in very different scheduling schemes. The choice among them is usually made according to the meta-knowledge that we have of the problem’s environment (be it a hospital, a supermarket, a computer's CPU…). This was also the case at the data clean-up stage.
 Since the problem was submitted without context, we are somewhat free to choose here. Our scheduling scheme is defined as follows:
 Among the earliest reservations, the one with the lowest ‘id’ is scheduled first.
 All others follow without any gaps, ordered by their starting time and, in case of a tie, by their reservation id.
Coding the scheme
 As before, we use pseudocode to show a simple scheduling routine:
  We assume ROWS holds the cleaned reservations after the COMMIT in the cleaning routine and can be indexed by position, from FIRST to LAST.
 ROWS := SELECT ALL FROM RESERVATIONS  Loads data from the Reservations table
 ORDER ROWS BY DATETIME_FROM, ID  Orders all rows by starting time, breaking ties by the master key
 FOR I FROM FIRST TO LAST-1 LOOP:  For every row position but the last one, repeat with index ‘i’:
     TIMESPAN := ROWS[I+1].DATETIME_TO - ROWS[I+1].DATETIME_FROM  Calculates duration for the next task
     ROWS[I+1].DATETIME_FROM := ROWS[I].DATETIME_TO  Sets the next task to start right after the previous one ends
     ROWS[I+1].DATETIME_TO := ROWS[I].DATETIME_TO + TIMESPAN  Sets the termination time for the next task
 END LOOP  End of loop
 COMMIT_WRITE(ROWS,VISITATIONS)  Overwrites the VISITATIONS table with the result of this scheduling routine
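A runnable version of the scheduling routine might look as follows. This Python sketch is only one reading of the scheme: the function name `reschedule` and the tuple layout are our assumptions, and the input is the set of eight reservations left after cleaning.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"
# Cleaned reservations: (id, client_id, datetime_from, datetime_to).
CLEANED = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
]


def reschedule(rows):
    """Queue rows back to back, ordered by requested start time, then id."""
    parsed = [(rid, client, datetime.strptime(a, FMT), datetime.strptime(b, FMT))
              for rid, client, a, b in rows]
    parsed.sort(key=lambda r: (r[2], r[0]))
    schedule, cursor = [], None
    for rid, client, start, end in parsed:
        duration = end - start  # each visitation keeps its own length
        begin = start if cursor is None else cursor
        cursor = begin + duration
        schedule.append((rid, client, begin, cursor))
    return schedule


for rid, client, begin, finish in reschedule(CLEANED):
    print(rid, client, begin.strftime(FMT), finish.strftime(FMT))
```

Because the scheme allows no gaps, later visitations are pulled forward as well as pushed back; the whole queue runs from 01:00 to 12:46 on the first day.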
Reporting output
 After scheduling we code a reporting routine in the same fashion as before:
  We assume VIS (for short) is to contain the final output from VISITATIONS.
 ORDER VISITATIONS BY DATETIME_FROM  Orders all rows by starting time
 ORDER VISITATIONS BY CLIENT_ID  Performs a second (stable) ordering by client, keeping each client's rows in starting-time order
 VIS := SELECT FROM VISITATIONS:  Loads several columns from the ordered Visitations table
     VIS.ID
     VIS.CLIENT_ID
     VIS.NAME
     VIS.DATETIME_FROM
     VIS.DATETIME_TO
 COMMIT_WRITE(VIS,FILE(“.Output.csv”;#CSV))  Writes the result of this query to a file in the comma-separated values format.
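The reporting step can be sketched with Python's `csv` module. The rows below are hypothetical scheduled visitations used only for illustration, and the report is written to an in-memory buffer rather than to an actual file.

```python
import csv
import io

# Hypothetical scheduled rows: (id, client_id, name, datetime_from, datetime_to).
VISITATIONS = [
    (2, 4, "olivia groom-smith", "2013-01-01 01:00", "2013-01-01 01:01"),
    (7, 3, "natasha lunt", "2013-01-01 01:01", "2013-01-01 02:01"),
    (1, 1, "gary doades", "2013-01-01 02:01", "2013-01-01 03:01"),
    (3, 1, "gary doades", "2013-01-01 03:01", "2013-01-01 04:01"),
]


def write_report(rows, out):
    """Write rows as CSV, ordered by client and then by start time."""
    writer = csv.writer(out, lineterminator="\n")
    writer.writerow(["id", "client_id", "name", "datetime_from", "datetime_to"])
    for row in sorted(rows, key=lambda r: (r[1], r[3])):
        writer.writerow(row)


buffer = io.StringIO()
write_report(VISITATIONS, buffer)
print(buffer.getvalue())
```

Sorting on the pair (client, start time) in one pass gives the same result as the two successive orderings of the pseudocode when the second one is stable.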
ACID Compliant DBMS
 In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
 In the context of databases, a single logical operation on the data is called a transaction.
 This approach has many advantages, and only slight disadvantages when handling really huge databases (say terabytes of data) in real-time environments. In those rare environments, a NoSQL approach might be preferred.
 As we will see in the following slides, Microsoft's SQL Server Express software solution ensures ACID compliancy.
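Atomicity, the first of the four properties, is easy to demonstrate. The sketch below uses Python's built-in `sqlite3` module as a stand-in for SQL Server Express, which is an assumption made purely for illustration: a transaction that fails halfway leaves no trace of its earlier statements.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE reservations (id INTEGER PRIMARY KEY, client_id INTEGER)")
conn.execute("INSERT INTO reservations VALUES (1, 1)")
conn.commit()

try:
    with conn:  # commits on success, rolls back if an exception escapes
        conn.execute("INSERT INTO reservations VALUES (2, 4)")
        conn.execute("INSERT INTO reservations VALUES (1, 3)")  # duplicate key
except sqlite3.IntegrityError:
    pass  # the whole transaction was undone, including the insert of id 2

count = conn.execute("SELECT COUNT(*) FROM reservations").fetchone()[0]
print(count)
```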
ACID Compliancy in MS SQL Server (I)
ACID Compliancy in MS SQL Server (II)
ACID Compliancy in MS SQL Server (III)
Database design: a bird´s eye view
 At this point, we again sketch the problem on a sheet of paper to gain further insight before continuing with database creation and management.
 The database is conceived as part of a reservation system that receives online reservation requests, processes them by scheduling according to the scheme and produces a visitation table. It also allows each of the visitators (just one instance in our example), clients, reservations and visitations to be managed individually.
 We extended the basic functionality of the software by allowing more than one agent to carry out visitations, dubbed a “visitator”.
 It will contain four tables: Visitators, Clients, Reservations and Visitations.
 It will implement one “Reschedule” function and three procedures: Edit_clients, Edit_visitators and Edit_Reservations.
Database normalization
 Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
 Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.
 The objective is to isolate data so that additions,
deletions and modifications of a field can be made
in just one table and then propagated through the
rest of the database using the defined
relationships.
 The Normal Forms (NF) of relational database
theory provide criteria for determining a table´s
degree of immunity against logical inconsistencies
and anomalies. The higher the normal form
applicable to a table, the less vulnerable it is.
 For OLAP (Online Analytical Processing)
applications, such as data mining tools, it might be
preferred to use a lower normal form because they
are primarily “read only” databases that tend to
extract accumulated historical data, whereas
transaction intensive applications will usually opt
for a higher normal form.
 For small problems like this one, usually 1NF, 2NF
or 3NF are the only ones being used.
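A small example may make the idea concrete. The Python sketch below splits the flat imported rows into a clients mapping and name-free reservations; the function name `normalize` and the data layout are hypothetical, chosen only to show why a rename then touches a single place.

```python
# Flat rows as imported: (id, client_id, name, datetime_from, datetime_to).
FLAT = [
    (1, 1, "gary doades", "2013-01-01 09:00", "2013-01-01 10:00"),
    (3, 1, "gary doades", "2013-01-01 09:45", "2013-01-01 10:45"),
    (7, 3, "natasha lunt", "2013-01-01 01:00", "2013-01-01 02:00"),
]


def normalize(flat_rows):
    """Split flat rows into a clients mapping and name-free reservations."""
    clients, reservations = {}, []
    for rid, client_id, name, start, end in flat_rows:
        known = clients.setdefault(client_id, name)
        if known != name:
            raise ValueError(f"client {client_id} has conflicting names")
        reservations.append((rid, client_id, start, end))
    return clients, reservations


clients, reservations = normalize(FLAT)
clients[1] = "Gary Doades"  # a rename now touches exactly one place
print(clients)
print(reservations)
```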
Database map: Table definition
 The database will implement the following four tables: Visitators, Clients, Reservations and
Visitations
 The tables contain the fields specified below. An asterisk (*) is added after the primary key
identifier for each of the tables.
 VISITATORS: v_id (*), v_name
 CLIENTS: client_id (*), name
 RESERVATIONS: id (*), v_id, client_id, datetime_from, datetime_to
 VISITATIONS: V_id (*), v_id, client_id, datetime_from, datetime_to, Rescheduled
 NOTES:
 The field for the client name has been moved out of the reservations table because, given the client_id, this field is redundant. A table has been created to hold all of the clients' names associated with their client_id.
 The field for the client name has been moved out of the visitations table for the same reason. In case we need to print a report containing the visitations as scheduled, a query can access the Clients table to retrieve that piece of data.
 The visitator's name has been moved out of reservations for analogous reasons. A visitators table has been created.
 The visitator's identifier “v_id” has been added to the reservations and visitations tables so that we can choose among several visitators.
 Rescheduled is a boolean field that has been added to keep track of rescheduling operations. Any visitation that undergoes a change in any other field for rescheduling purposes will be marked TRUE, and FALSE otherwise.
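The four tables could be declared roughly as follows. This is a sketch in SQLite syntax driven from Python, not the project's actual DDL: the column types, the `visitation_id` key name and the placeholder visitator name are all assumptions introduced for illustration.

```python
import sqlite3

SCHEMA = """
CREATE TABLE visitators (v_id INTEGER PRIMARY KEY, v_name TEXT NOT NULL);
CREATE TABLE clients (client_id INTEGER PRIMARY KEY, name TEXT NOT NULL);
CREATE TABLE reservations (
    id INTEGER PRIMARY KEY,
    v_id INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to TEXT NOT NULL);
CREATE TABLE visitations (
    visitation_id INTEGER PRIMARY KEY,  -- hypothetical name for the key column
    v_id INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to TEXT NOT NULL,
    rescheduled INTEGER NOT NULL DEFAULT 0);  -- boolean stored as 0 or 1
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO visitators VALUES (1, 'visitator one')")  # placeholder
conn.execute("INSERT INTO clients VALUES (3, 'natasha lunt')")
conn.execute("INSERT INTO visitations VALUES "
             "(1, 1, 3, '2013-01-01 01:01', '2013-01-01 02:01', 1)")

# The report recovers the client name through a join instead of storing it twice.
report = conn.execute("""
    SELECT v.visitation_id, c.name, v.datetime_from, v.datetime_to
    FROM visitations AS v JOIN clients AS c ON c.client_id = v.client_id
    ORDER BY v.datetime_from""").fetchall()
print(report)
```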
Database map: Procedures and functions
 The database will implement three procedures and one function that will be called from any
of the former.
 The function “RESCHEDULE” will read the Reservations table, and any other table it needs, and
will write only to the Visitations table. Its purpose is to reschedule all rows according to
the scheme previously defined.
 There will be three procedures:
 EDIT_CLIENTS: Reads and writes the Clients table. Writes the Reservations table. Finally calls
the RESCHEDULE function. It is used to modify any information concerning a particular client
instance, such as the name field, in all of the records. It is also used to remove a client with
all of its reservations (and therefore its visitations).
 EDIT_VISITATORS: Reads and writes the Visitators table. Writes the Reservations table. Finally
calls the RESCHEDULE function. It is used to modify any information concerning a particular
visitator instance, such as the name field, in all of the records. It is also used to remove a
visitator with all of its reservations (and therefore its visitations).
 EDIT_RESERVATIONS: Reads and writes the Reservations table. Finally calls the RESCHEDULE
function. It is used to edit any piece of data concerning a reservation, such as the visitator,
the client or the dates and times arranged. It is also used to delete a reservation.
 NOTES:
 Only the RESCHEDULE function can access the Visitations list, since it is considered the single
most valuable source of output reports from the program's execution.
 After any or all of the Reservations have been deleted, some garbage data may
remain stored in the Clients and Visitators tables. That is why we need specific procedures to
edit those.
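The division of labour described above, where editing procedures touch their own tables and then delegate to a single rescheduling routine, can be sketched as follows. This is an illustration under stated assumptions, not the deck's implementation: it uses Python's sqlite3 instead of SQL Server stored procedures, the tables are cut down to the columns needed, and the rescheduling rule is a placeholder first-come first-served pass (each visit is pushed back until the single server is free). The scheme actually chosen is defined elsewhere in the deck. The function names `make_db`, `reschedule` and `edit_clients` are hypothetical.

```python
import sqlite3
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"


def make_db():
    """In-memory database with simplified (assumed) table definitions."""
    conn = sqlite3.connect(":memory:")
    conn.executescript("""
        CREATE TABLE Clients (client_id INTEGER PRIMARY KEY, name TEXT);
        CREATE TABLE Reservations (id INTEGER PRIMARY KEY, client_id INTEGER,
                                   datetime_from TEXT, datetime_to TEXT);
        CREATE TABLE Visitations (id INTEGER PRIMARY KEY, client_id INTEGER,
                                  datetime_from TEXT, datetime_to TEXT,
                                  rescheduled INTEGER);
    """)
    return conn


def reschedule(conn):
    """Reads Reservations and is the only routine that writes Visitations.

    Placeholder scheme (an assumption): first-come first-served, keeping each
    visit's duration and delaying its start until the server is free.
    """
    rows = conn.execute(
        "SELECT id, client_id, datetime_from, datetime_to "
        "FROM Reservations ORDER BY datetime_from, id"
    ).fetchall()
    conn.execute("DELETE FROM Visitations")
    server_free_at = None
    for rid, client_id, start_s, end_s in rows:
        start = datetime.strptime(start_s, FMT)
        end = datetime.strptime(end_s, FMT)
        new_start = start if server_free_at is None else max(start, server_free_at)
        new_end = new_start + (end - start)
        conn.execute(
            "INSERT INTO Visitations VALUES (?, ?, ?, ?, ?)",
            (rid, client_id, new_start.strftime(FMT), new_end.strftime(FMT),
             int(new_start != start)),  # 1 marks a rescheduled visit
        )
        server_free_at = new_end
    conn.commit()


def edit_clients(conn, client_id, new_name=None, remove=False):
    """Edits or removes a client, then calls reschedule()."""
    if remove:
        # Removing a client also removes its reservations.
        conn.execute("DELETE FROM Reservations WHERE client_id = ?", (client_id,))
        conn.execute("DELETE FROM Clients WHERE client_id = ?", (client_id,))
    elif new_name is not None:
        conn.execute("UPDATE Clients SET name = ? WHERE client_id = ?",
                     (new_name, client_id))
    reschedule(conn)
```

Because Visitations is always rebuilt from Reservations, removing a client through `edit_clients(conn, 1, remove=True)` also removes that client's visitations without any procedure other than `reschedule` writing to that table.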
Database map, visually
 Blue boxes represent tables and green disks represent procedures. Arrows represent data and operation flows.
PERT
Any questions?
 alfonsodelafuenteruiz@yahoo.es
 http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode
 Please excuse any errata.
 Thanks for your attention
Visitation time scheduling

  • 1. Visitation time scheduling Alfonso de la Fuente Ruiz 2013 Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 2. Content index  Scenario  The O.R. Problem  Initial considerations  First approach: Microsoft Excel  Importing data from CSV into MS Excel  Exploring the dataset  Data order by client  Vouching for data validity  Alternatives and decision making  Coding software and choosing tools  Microsoft Excel Macros  Open Office Suite: Calc  Structured Query Language  Open Office Suite: Base  Visual Studio Express  Oracle and PL/SQL  Using Transact-SQL in Microsoft SQL Server 2k+  Cleaning the data  Pseudocode for data cleaning  Result after data cleaning  PERT and GANTT  Scheduling schemes  Scheduling scheme chosen  Coding the scheme  Reporting output  ACID Compliant DBMS  ACID Compliancy in MS SQL Server (I)  ACID Compliancy in MS SQL Server (II)  ACID Compliancy in MS SQL Server (III)  Database design: a bird´s eye view  Database normalization  Database map, visually  Database map: Table definition  Database map: Procedures and functions  References  Conclusion Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 3. Scenario:  The small test project that was asked to be prepared is described in a PDF file (Portable Document Format) and the data required is in a CSV file (Comma Separated Values).  One natural week was given to find a solution and to prepare a presentation that was to be shown remotely to the UK. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 4. The O.R. problem  The problem, from the Operational Research perspective, constitutes a very simple case of “visitation time scheduling” with multiple clients and a single server which can attend only one petition at a time.  Therefore, a number of solution schemes are readily available, such as First-Come First Served, Priority Queues, Gantt techniques and others.  The difficulty of the problem seems to root not in the complexity of the algorithm coding stage, but in the data formatting stage (both for input and output) and at the database design stage.  The precise software tools to be used were left unspecified, so a large number of alternatives are all posible choices. SQL Server and PostgreSQL were suggested.  In our approach, we firstly will use Microsoft Excel in order to study the data and to perform basic filtering, after which we will consider a number of solutions from the software market. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 5. Initial considerations  This problem constitutes a typical Computer Science Project for Business or Engineering students during their first years at the university.  The students will usually be asked to solve this kind of problem during one term, having a couple of months (up to a semester depending upon academic pressure considerations) to solve it and to prepare a written Project alone or in small teams, to be handed-in at the end of it.  The preparation of the Project case involves careful design considerations, ranging from plagiarism avoidance to speeding up marking processes and exception control.  This kind of knowledge can also come in handy for real business applications at the SAME (Small And Medium-sized Enterprise) level or larger.  In most scenarios, just a subset of the information contained in these pages will be documented and presented to students or staff personnel so to avoid informational saturation and to enhance operational understanding. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 6. First approach: Microsoft Excel  Since SQL Server was the first option suggested, and there exists a very popular software package from Microsoft in the market (MSSQLS), in our first approach, we load the CSV data file in Microsoft Excel (2013 Spanish version) to have a look from it.  In order to do so, we need to import the data from the file, using the “Data/Import/From textfile…” feature.  There we will select the “simple.csv” file and to follow the assistant.  In the assistant wizard window we select delimitated data file type, with headers, Windows (ANSI) file origins, “Comma” (,) as the separator character, and “General” data type for every column so that Excel autodetects it. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 7. Importing data from CSV into MS Excel  As a result, we obtain a set of columns where the headers can show the “autofilter” option which we often utilize to order alphabetically or numerically.  Here we ordered the data by the “datetime_from” field, so that we can observe the information and assume some hypothesis over the contents.  We can easily observe several types of plausible anomalies in the data which force us to take some decision-taking at design time. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 8. Exploring the dataset  At this point, we depict the problem on a paper sheet to gain further insight before moving on to the software tools.  There we get some data schemes and timetabling that will be commented upon further on.  Among other stuff, we observe that the total time for all visitations does not exceed the total time available for service, under any set of assumptions, which is a good sign, for it means that we will be able to deal with the service without overbooking. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 9. Data order by client  We now apply a second ordering to the data over the client_id field.  We name the rows as c#t#, where hashes represent client number and task number for that particular client.  Therefore we obtain the following set: {c1t1,c1t2,c1t3,c1t4 ; c2t1,c2t2,c2t3,c2t4 ; c3t1 ; c4t1,c4t2,c4t3} id client_id datetime_from datetime_to Name Rep? Inv? >24h? 1 1 2013-01-01 09:00 2013-01-01 10:00 gary doades 0 0 0,00 8 1 2013-01-01 09:01 2013-01-01 09:00 gary doades 0 1 0,00 3 1 2013-01-01 09:45 2013-01-01 10:45 gary doades 0 0 0,00 6 1 2013-01-01 12:00 2013-01-01 12:30 gary doades 0 0 0,00 4 2 2013-01-01 23:00 2013-01-02 06:00 richard ward 0 0 1,00 5 2 2013-01-02 04:00 2013-01-02 04:15 richard ward 0 0 0,00 10 2 2013-01-02 05:00 2013-01-02 06:00 richard ward 0 0 0,00 11 2 2013-02-30 01:00 2013-02-30 02:00 richard ward 0 0 #¡VALOR! 7 3 2013-01-01 01:00 2013-01-01 02:00 natasha lunt 0 0 0,00 2 4 2013-01-01 01:00 2013-01-01 01:01 olivia groom-smith 1 0 0,00 9 4 2013-01-01 01:00 2013-01-01 01:01 olivia groom-smith 0 0 0,00 12 4 2013-01-01 18:00 2013-01-02 19:00 olivia groom-smith 0 0 1,00 Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 10. Vouching for data validity  In order to detect anomalies, we ordered the data by “datetime_from” and then a few quick tests were implemented in boolean logic:  REPETITION “Rep?”: IF(AND(C2=C3;D2=D3);1;0) Briefly checks whether two visitation frames are repeated in consecutive rows. Instances #2, #9 for Olivia Groom-Smith are. Obviously not aplicable to the last row.  INVERSION “Inv?”: IF([@[datetime_from]]>=[@[datetime_to]];1;0) Checks whether the end time strictly happens after the beginning. Instance #8 for Gary Doades does not.  MORE THAN ONE DAY “>24h?”: =DAYS([@[datetime_to]];[@[datetime_from]]) Checks to see whether a visitation begins and ends in different days. Instances #12, #4 do, where #12 lasts for more than 24 hours and #4 does not (just 7 hours).  Instance #11 also returns an error code because the date format is not correct, as February does not have 30 days. id client_id datetime_from datetime_to Name Rep? Inv? >24h? 7 3 2013-01-01 01:00 2013-01-01 02:00 natasha lunt 0 0 0,00 2 4 2013-01-01 01:00 2013-01-01 01:01 olivia groom- smith 1 0 0,00 9 4 2013-01-01 01:00 2013-01-01 01:01 olivia groom- smith 0 0 0,00 1 1 2013-01-01 09:00 2013-01-01 10:00 gary doades 0 0 0,00 8 1 2013-01-01 09:01 2013-01-01 09:00 gary doades 0 1 0,00 3 1 2013-01-01 09:45 2013-01-01 10:45 gary doades 0 0 0,00 6 1 2013-01-01 12:00 2013-01-01 12:30 gary doades 0 0 0,00 12 4 2013-01-01 18:00 2013-01-02 19:00 olivia groom- smith 0 0 1,00 4 2 2013-01-01 23:00 2013-01-02 06:00 richard ward 0 0 1,00 5 2 2013-01-02 04:00 2013-01-02 04:15 richard ward 0 0 0,00 10 2 2013-01-02 05:00 2013-01-02 06:00 richard ward 0 0 0,00 11 2 2013-02-30 01:00 2013-02-30 02:00 richard ward 0 0 #¡VALOR! Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 11. Alternatives and decision making  The first observation that we made is that these data show some conflicts that require decision-taking:  There are 4 clients (customers) and 12 tasks a priori  Task c1t2 defines a visitation to end before it begins. This could only be understood as a reverse visitation (server visiting client) or as a quantum effect.  We assume that those two alternatives lie outside of the scope for the problem. Removed those, choice is to either exchange times or to remove the reservation row  Some tasks already show overlap within the order given a priori, thus rearrangement is required, such as c1t1 and c1t3  Task c2t1 occurs overnight, causing it to begin and end in different day dates.  Task c4t4 occurs in a different month than all other, being a possible outlier or mistaken data. Furthermore, the date is not correct, since February cannot have 30 days.  The course of action here could either be to remove the whole row or to correct the month to January.  Since no certainty exists that this table must contain data from a single month, the whole row will be treated as invalid.  Client #3 has only one visitation task defined for her, being the only one with a single visitation  Tasks c4t1 and t4t2 are repeated, so one of them could be deleted or either they could be arranged by their id number. Furthermore, they only last for one minute, being possibly outliers or mistakes.  Task c4t3 lasts for more than 24 hours, being a possible outlier of mistake. Thus, it also exhibits the outlook of c2t1 because it occurs overnight.  The output will be an array set of, at most, max_id (12) elements from which conflicting rows are to be deleted. Alfonso de la Fuente Ruiz – http://www.linkedin.com/in/alfonsofr/es - Licensed under Creative Commons BY-NC-SA (7/Sept/2013)
  • 12. Coding software and choosing tools
 A large number of alternatives are readily available on the market that provide the software framework needed to deal with these kinds of problems.
 To name just a few: Microsoft Excel Macros, MS SQL Server, MS Visual Studio Express, MS Access, MS Project, Open Office Base, MySQL, SAS (Statistical Analysis System) GANTT module, Visual Basic, MicroGPSS, FORTRAN, Borland C++, Delphi, Java, PHP,…
 From here on we show a brief selection of those tools. The decision is usually taken out of convenience, with criteria such as availability (having the software package already installed and configured on the machine), but there are multiple choices, all of them valid solutions.
 Whenever possible, specialised freeware 4GT (Fourth Generation Techniques) will be used, as these are generally considered cheaper and more efficient, optimize internal computations and work at a higher abstraction level, thus greatly simplifying coding operations.
 Finally, we will deal with the database design issue according to analogous principles.
  • 13. Open Office Suite: Calc
 When no budget is allocated for software licensing, universities and other organizations often make use of the OpenOffice suite for teaching and operational applications.
 Open Office offers a range of solutions, such as the “Calc” spreadsheet program and the “Base” database management program.
 Here we can observe how, upon importing the data into OpenOffice Calc in the same way as we did in Excel, the wrong “February 30th” data is immediately detected.
 Some other tools (such as OpenProj) will automatically detect a mistake in the data and assign the next available date to the field (+2 days, i.e. March the 2nd).
  • 14. Microsoft Excel Macros
 Another option is to record a macro from Microsoft Excel.
 In order to do so, we need to activate the “Developer” tab.
 Recording a macro is a straightforward process, but the syntax and appearance of the generated source code are quite complex, should we have to amend anything in it.
 To keep the code as readable as possible, we can use some other means.
 The logical course of action seems to be to use SQL code in order to get to the required scheduling solution.
  • 15. Structured Query Language
 Structured Query Language (SQL) code is the market standard for solving these kinds of problems.
 Therefore, some SQL programming expertise is assumed in order to get a solution.
  • 16. Open Office Suite: Base
 Open Office Base can be used to process the data and to query the table for the requested output, in the same way that the Microsoft Access software package would.
 In OO Base we can quickly create the table that we need, with the advantage that it is open source software and implements SQL.
 To do so, we first need to specify the field names and types. Then we need to populate the table with actual data from OO Calc or MS Excel.
  • 17. Visual Studio Express
 Another Microsoft tool that can be used is Visual Studio Express (demo available for free download).
 Here we can observe how VSE also detects the invalidity of one of the dates (February the 30th).
 Visual Studio Express can also be used to process the data and to query the table for the requested output.
 It also implements SQL and is designed for seamless data exchange with Microsoft SQL servers.
  • 18. Oracle and PL/SQL
 Oracle is a very powerful tool used by larger organizations, such as city councils or international corporations. It has its own language extension for database management: PL/SQL.
 PL/SQL stands for "Procedural Language Extensions to SQL." PL/SQL extends SQL by adding programming structures and subroutines available in any high-level language.
 The syntax and capabilities are very similar to those of T-SQL and other derivatives of standard SQL.
 Many Oracle applications are built using client-server architecture. The Oracle database resides on the server. The program that makes requests against this database resides on the client machine. This program can be written in C, Java, or PL/SQL.
 Because PL/SQL is just like any other programming language, it has syntax and rules that determine how programming statements work together. It is important to realize that PL/SQL is not a stand-alone programming language. PL/SQL is a part of the Oracle RDBMS, and it can reside in two environments, the client and the server. As a result, it is very easy to move PL/SQL modules between server-side and client-side applications.
 Oracle also supplies a command-line SQL tool called SQL*Plus.
  • 19. Using Transact-SQL in Microsoft SQL Server 2k+
 Microsoft SQL Server 2000 (and above) is one of the most popular software tools used to solve these kinds of problems at the business level, wherever high numbers of tables and instances are encountered.
 MSSQLS uses a powerful extension of standard SQL originally developed by Sybase, called Transact-SQL. T-SQL code can be bundled into a variety of software applications: web pages, Visual Basic, Visual C# and so on.
 Newer MS SQL Server versions such as 2005 work with CSV files and are interoperable with the features and functionality of Visual Studio, MS Excel and MS Project.
 MS SQL Server requires a moderate investment in licensing.
 To the right you can see an example (cf. the references) of how to use the ORDER BY and GROUP BY clauses in T-SQL to aggregate data.
 For our exercise it is a very useful tool for designing code that orders the preprocessed visitation list by starting date and returns results ordered by client, once a scheduling scheme has been agreed upon and implemented.
  • 20. Cleaning the data
 As we have observed several irregularities within the input data, we need to clean it by deleting all affected rows.
 To do so, we can either use the built-in tools of the software package of our choice or write some code to do it for us.
 Given that the number of instances (rows) in our table is very small, we choose to clean it by hand (with the software package’s built-in tools) in order to speed up the process.
 If the number of instances were higher (say dozens, hundreds or even millions of records), we would necessarily have to code a clean-up routine for this task.
 According to the validity analysis performed at a previous stage, and given the time available and the scope, we choose to simplify as much as possible by completely removing any instances that show any of the following conflicts:
 REPETITION: All reservations must be DISTINCT, so second and further identical reservations are deleted. Only the one with the lowest id is kept.
 INVERSION: Reservations with null or negative time lapses are deleted.
 MORE THAN 24 HOURS SERVICE TIME: Reservations that span more than one day are deleted only if the total service time is greater than 24 hours. Otherwise they are kept, assuming they occur over a night shift. We will also keep visitations lasting for just one minute, assuming they represent a quick status check.
 WRONG DATA INPUT: Reservations with a wrong date, or with any other wrong piece of data in any field, are deleted.
  • 21. Pseudocode for data cleaning
 Since no tool was specified in the problem statement, and there is a wide range of options including several variants and extensions of SQL, we will use pseudocode to show how to program the main cleaning routine.
 Once a tool has been chosen, we can easily translate this pseudocode into the grammar of the language of choice, without any loss of generality.
 We assume that a few simple subroutines are provided by the language for ordering, deletion and so on.
 We assume ROWS (for short) is a table that is to contain the RESERVATIONS.
 ROWS := SELECT DISTINCT FROM RESERVATIONS  -- Removes duplicates (comparing every field except ‘rows.id’, the master key)
 ORDER ROWS BY DATETIME_FROM  -- Orders all rows by starting time
 FOR ID IN ROWS LOOP:  -- For every distinct row, repeat:
   IF DATE(ROWS[ID].DATETIME_FROM) < 0  -- All invalid dates should return a negative value
     THEN DELETE(ROWS[ID])  -- Cleans wrongly timestamped rows
   IF ( DATE(ROWS[ID].DATETIME_FROM) >= DATE(ROWS[ID].DATETIME_TO) )
     THEN DELETE(ROWS[ID])  -- Cleans rows with non-positive visitation time spans
   IF ( DAYS(ROWS[ID].DATETIME_TO - ROWS[ID].DATETIME_FROM) >= 1 )
     THEN DELETE(ROWS[ID])  -- Cleans rows with visitations lasting for one day or more
 END LOOP  -- End of loop
 COMMIT_WRITE(ROWS,RESERVATIONS)  -- Replaces all initial rows with the result of this cleaning routine
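As one possible translation of the pseudocode above, the following Python sketch applies the four cleaning rules to the twelve input rows. It is a sketch under assumptions: the names `parse` and `clean` are ours, duplicates are detected on (client_id, datetime_from, datetime_to), and "one day or more" is read as a duration of at least 24 hours.

```python
from datetime import datetime, timedelta

# (id, client_id, datetime_from, datetime_to) copied from the data table
RAW = [
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (9, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (8, 1, "2013-01-01 09:01", "2013-01-01 09:00"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (12, 4, "2013-01-01 18:00", "2013-01-02 19:00"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
    (11, 2, "2013-02-30 01:00", "2013-02-30 02:00"),
]

FMT = "%Y-%m-%d %H:%M"


def parse(text):
    """Return a datetime, or None when the text is not a real date."""
    try:
        return datetime.strptime(text, FMT)
    except ValueError:
        return None


def clean(rows):
    """Return the surviving rows, ordered by starting time and then by id."""
    seen = set()
    kept = []
    for rid, client, start, end in sorted(rows):    # lowest id first
        key = (client, start, end)
        if key in seen:
            continue                                # REPETITION
        seen.add(key)
        t0, t1 = parse(start), parse(end)
        if t0 is None or t1 is None:
            continue                                # WRONG DATA INPUT
        if t0 >= t1:
            continue                                # INVERSION
        if t1 - t0 >= timedelta(days=1):
            continue                                # ONE DAY OR MORE
        kept.append((rid, client, t0, t1))
    kept.sort(key=lambda row: (row[2], row[0]))
    return kept


SURVIVING_IDS = [row[0] for row in clean(RAW)]
```

Iterating in id order before deduplicating is what makes the lowest id of each repeated group survive, as required by the REPETITION rule.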
  • 22. Result after data cleaning
 Subsets to be subtracted:
 Repetition candidate subset: {c4t1,c4t2}. Chosen subset: {c4t2}
 Inversion: one negative time lapse {c1t2}
 >24 hours: {c4t3}
 Wrong input: date out of range (February 30th) {c2t4}
 Subtraction set: {c1t2,c2t4,c4t2,c4t3}
 We end up with 8 instances after cleaning: {c1t1,c1t2,c1t3,c1t4, c2t1,c2t2,c2t3,c2t4, c3t1, c4t1,c4t2,c4t3} – {c1t2,c2t4,c4t2,c4t3} = {c1t1,c1t3,c1t4, c2t1,c2t2,c2t3, c3t1, c4t1}
 Or, according to the master key “Reservation ‘id’”: {1,8,3,6,4,5,10,11,7,2,9,12} - {8,9,11,12} = {7,2,1,3,6,4,5,10}
  • 23. PERT and GANTT
 The Program Evaluation and Review Technique (PERT) is part of a set of project management tools that are commonly used in scheduling environments.
 The most widely known of these tools is the Gantt bar chart, where we can define tasks to be executed in parallel, serialized or with interdependencies.
 Again, there are a number of tools that can read an input, generate a Gantt chart and apply scheduling schemes to the data, such as Microsoft Project, GanttProject, OpenProj and several others. Alternatively, we can just use a general-purpose RDBMS with SQL.
  • 24. Scheduling schemes
 Once the data has been cleaned, several issues come to mind that we should consider when scheduling the visitations. We name just a few of the most relevant:
 We could want all of the visitations to be scheduled as soon as possible.
 The first visitation occurs at 9:00 am, so we could schedule all of the reservations to be attended only during office hours.
 We could also want to add breaks for meals, resting times, service maintenance or other managerial reasons. We assume none.
 Some visitations occur overnight, so we can decide to schedule visitations at any time of the day or night.
 We could want to reschedule as few reservations as possible, or to have all visitations for the same client served together, one right after another, so that each client comes only once.
 We could want to simplify:
 Take the earliest reservation starting time as the beginning and then queue all others right behind it: first according to their starting time and second (if more than one shares a starting time) by other criteria.
 Other possible criteria are visitation duration, client id, alphabetical order by name, or any other priority scheme. For the sake of simplicity we choose the plain vanilla reservation id (the table’s master key).
  • 25. Scheduling scheme chosen
 A number of combinations of these and other criteria exist, and they result in very different scheduling schemes. The choice among them is usually made according to the meta-knowledge that we have of the problem’s environment (be it a hospital, a supermarket, a computer’s CPU…). This was also the case at the data clean-up stage.
 Since the problem was submitted without context, we are somewhat free to choose here. Our scheduling scheme is defined as follows:
 The earliest reservation with the lowest ‘id’ will be scheduled as the first one.
 All others will follow without any time gaps, according to their starting time and, in case of conflict, to their reservation id.
  • 26. Coding the scheme
 As before, we use pseudocode to show a simple scheduling routine:
 We assume that, once ordered, the rows can be addressed by consecutive positions I, from FIRST to LAST.
 ROWS := SELECT ALL FROM RESERVATIONS  -- Loads data from the Reservations table
 ORDER ROWS BY DATETIME_FROM, ID  -- Orders all rows by starting time, breaking ties by the master key
 FOR I FROM FIRST TO LAST-1 LOOP:  -- For every row but the last one, repeat with index ‘i’:
   TIMESPAN := ROWS[I+1].DATETIME_TO - ROWS[I+1].DATETIME_FROM  -- Calculates the duration of the next task
   ROWS[I+1].DATETIME_FROM := ROWS[I].DATETIME_TO  -- Sets the next task to start right after the current one ends
   ROWS[I+1].DATETIME_TO := ROWS[I].DATETIME_TO + TIMESPAN  -- Sets the termination time of the next task
 END LOOP  -- End of loop
 COMMIT_WRITE(ROWS,VISITATIONS)  -- Overwrites the VISITATIONS table with the result of this scheduling routine
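The chosen scheme (earliest start first, ties broken by the lowest id, every task queued right after the previous one) can be sketched in Python as follows. The eight input rows are the ones that survive the cleaning stage; the function name `schedule` is our own, and the resulting timetable is computed by this sketch rather than taken from the slides.

```python
from datetime import datetime

FMT = "%Y-%m-%d %H:%M"

# Cleaned reservations: (id, client_id, datetime_from, datetime_to)
CLEANED = [
    (7, 3, "2013-01-01 01:00", "2013-01-01 02:00"),
    (2, 4, "2013-01-01 01:00", "2013-01-01 01:01"),
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (3, 1, "2013-01-01 09:45", "2013-01-01 10:45"),
    (6, 1, "2013-01-01 12:00", "2013-01-01 12:30"),
    (4, 2, "2013-01-01 23:00", "2013-01-02 06:00"),
    (5, 2, "2013-01-02 04:00", "2013-01-02 04:15"),
    (10, 2, "2013-01-02 05:00", "2013-01-02 06:00"),
]


def schedule(rows):
    """Queue all tasks back to back, starting at the earliest reservation."""
    ordered = sorted(
        (
            (rid, client, datetime.strptime(s, FMT), datetime.strptime(e, FMT))
            for rid, client, s, e in rows
        ),
        key=lambda row: (row[2], row[0]),   # starting time, then id
    )
    if not ordered:
        return []
    result = []
    cursor = ordered[0][2]                  # the first task keeps its start
    for rid, client, start, end in ordered:
        duration = end - start              # each task keeps its duration
        result.append((rid, client, cursor, cursor + duration))
        cursor += duration
    return result


VISITATIONS = schedule(CLEANED)
```

Each task keeps its original duration; only its position in time changes, so the whole queue occupies exactly the sum of the individual service times.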
  • 27. Reporting output
 After scheduling, we code a reporting routine in the same fashion as before:
 We assume VIS (for short) is to contain the final output from VISITATIONS.
 ORDER VISITATIONS BY DATETIME_FROM  -- Orders all rows by starting time
 ORDER VISITATIONS BY CLIENT_ID  -- Performs a second (stable) ordering by client, so each client’s rows stay in starting-time order
 VIS := SELECT FROM VISITATIONS:  -- Loads several columns from the ordered Visitations table
   VIS.ID
   VIS.CLIENT_ID
   VIS.NAME
   VIS.DATETIME_FROM
   VIS.DATETIME_TO
 COMMIT_WRITE(VIS,FILE(".Output.csv";#CSV))  -- Writes the result of this query to a file in the comma-separated values format
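A reporting routine along these lines could look as follows in Python, using the standard `csv` module. This is only a sketch: the function name `report` is ours, the output goes to an in-memory buffer instead of a file, and the visitation times are illustrative values obtained by queuing the eight cleaned reservations back to back from 01:00; they do not appear in the slides.

```python
import csv
import io

# (id, client_id, name, datetime_from, datetime_to); times are illustrative
VISITATIONS = [
    (2, 4, "olivia groom-smith", "2013-01-01 01:00", "2013-01-01 01:01"),
    (7, 3, "natasha lunt", "2013-01-01 01:01", "2013-01-01 02:01"),
    (1, 1, "gary doades", "2013-01-01 02:01", "2013-01-01 03:01"),
    (3, 1, "gary doades", "2013-01-01 03:01", "2013-01-01 04:01"),
    (6, 1, "gary doades", "2013-01-01 04:01", "2013-01-01 04:31"),
    (4, 2, "richard ward", "2013-01-01 04:31", "2013-01-01 11:31"),
    (5, 2, "richard ward", "2013-01-01 11:31", "2013-01-01 11:46"),
    (10, 2, "richard ward", "2013-01-01 11:46", "2013-01-01 12:46"),
]

HEADER = ["id", "client_id", "name", "datetime_from", "datetime_to"]


def report(rows, out):
    """Write the visitations to `out` as CSV, ordered by client and start."""
    writer = csv.writer(out)
    writer.writerow(HEADER)
    # One sort with a compound key replaces the two successive orderings.
    for row in sorted(rows, key=lambda r: (r[1], r[3])):
        writer.writerow(row)


buffer = io.StringIO()
report(VISITATIONS, buffer)
CSV_TEXT = buffer.getvalue()
```

To write an actual file, the same function can be called with a file object opened with `newline=""`, as the `csv` module documentation recommends.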
  • 28. ACID Compliant DBMS
 In computer science, ACID (Atomicity, Consistency, Isolation, Durability) is a set of properties that guarantee that database transactions are processed reliably.
 In the context of databases, a single logical operation on the data is called a transaction.
 This approach has many advantages and only slight disadvantages, which show up when handling really huge databases (say terabytes of data) in real-time environments. In those rare environments, a NoSQL approach might be preferred.
 As we will see in the following slides, Microsoft’s SQL Server Express software solution ensures ACID compliance.
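To illustrate atomicity and consistency on a small scale, the sketch below uses SQLite through Python's standard `sqlite3` module as a stand-in for MS SQL Server; this substitution, the CHECK constraint and the helper names `insert_batch` and `count` are our assumptions. A batch that contains the inverted reservation #8 is rejected as a whole, so the valid reservation #1 that travels with it is not stored either.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE reservations ("
    " id INTEGER PRIMARY KEY,"
    " client_id INTEGER NOT NULL,"
    " datetime_from TEXT NOT NULL,"
    " datetime_to TEXT NOT NULL,"
    # Consistency rule: a visitation must end after it begins.
    # ISO-style timestamps compare correctly as text.
    " CHECK (datetime_from < datetime_to))"
)
conn.commit()


def insert_batch(connection, rows):
    """Insert all rows in one transaction; on any violation insert none."""
    try:
        with connection:    # commits on success, rolls back on exception
            connection.executemany(
                "INSERT INTO reservations VALUES (?, ?, ?, ?)", rows
            )
        return True
    except sqlite3.IntegrityError:
        return False


def count(connection):
    """Number of reservations currently stored."""
    return connection.execute(
        "SELECT COUNT(*) FROM reservations"
    ).fetchone()[0]


BAD_BATCH = [
    (1, 1, "2013-01-01 09:00", "2013-01-01 10:00"),
    (8, 1, "2013-01-01 09:01", "2013-01-01 09:00"),   # ends before it begins
]
```

Calling `insert_batch(conn, BAD_BATCH)` returns False and leaves the table empty, whereas inserting only the first row succeeds.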
  • 29. ACID Compliancy in MS SQL Server (I)
  • 30. ACID Compliancy in MS SQL Server (II)
  • 31. ACID Compliancy in MS SQL Server (III)
  • 32. Database design: a bird´s eye view
 At this point, we again sketch the problem on a sheet of paper to gain further insight before continuing with database creation and management.
 The database is thought of as part of a reservation system that receives online reservation requests, processes them by scheduling according to the scheme and produces a visitation table. It also allows each of the visitators (just one instance in our example), clients, reservations and visitations to be managed individually.
 We expanded the basic functionality of the software by adding the possibility of having more than one agent performing visitations, dubbed “visitator”.
 It will contain four tables: Visitators, Clients, Reservations and Visitations.
 It will implement one “Reschedule” function and three procedures: Edit_clients, Edit_visitators and Edit_Reservations.
  • 33. Database normalization
 Database normalization is the process of organizing the fields and tables of a relational database to minimize redundancy and dependency.
 Normalization usually involves dividing large tables into smaller (and less redundant) tables and defining relationships between them.
 The objective is to isolate data so that additions, deletions and modifications of a field can be made in just one table and then propagated through the rest of the database using the defined relationships.
 The Normal Forms (NF) of relational database theory provide criteria for determining a table’s degree of immunity against logical inconsistencies and anomalies. The higher the normal form applicable to a table, the less vulnerable it is.
 For OLAP (Online Analytical Processing) applications, such as data mining tools, a lower normal form might be preferred because they are primarily “read only” databases that tend to extract accumulated historical data, whereas transaction-intensive applications will usually opt for a higher normal form.
 For small problems like this one, 1NF, 2NF and 3NF are usually the only ones used.
  • 34. Database map: Table definition
 The database will implement the following four tables: Visitators, Clients, Reservations and Visitations.
 The tables contain the fields specified below. An asterisk (*) is added after the primary key identifier of each table.
 VISITATORS: v_id (*), v_name
 CLIENTS: client_id (*), name
 RESERVATIONS: id (*), v_id, client_id, datetime_from, datetime_to
 VISITATIONS: V_id (*), v_id, client_id, datetime_from, datetime_to, Rescheduled
 NOTES:
 The client name field has been moved out of the reservations table because, given the client_id, it is redundant. A table has been created to hold all of the clients’ names associated with their client_id.
 The client name field has been moved out of the visitations table for the same reason. In case we need to print a report containing the visitations as scheduled, a query will be able to access the Clients table to retrieve that piece of data.
 The visitator’s name has been moved out of reservations for analogous reasons. A visitators table has been created.
 The visitator’s identifier “v_id” has been added to the reservations and visitations tables so that we can choose among several visitators.
 Rescheduled is a boolean field that has been added to keep track of rescheduling operations. Any visitation whose other fields are changed for rescheduling purposes will be marked with a TRUE value, and with FALSE otherwise.
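The table definitions above can be tried out with a small schema sketch. We use SQLite through Python's `sqlite3` module as a stand-in for the target DBMS, which is an assumption, as are the column types. Because the visitations key is listed as “V_id” next to the visitator’s “v_id”, and SQL identifiers are case-insensitive, the sketch uses the hypothetical name `visitation_id` for that key. The visitator name and the visitation times are illustrative.

```python
import sqlite3

SCHEMA = """
CREATE TABLE visitators (
    v_id   INTEGER PRIMARY KEY,
    v_name TEXT NOT NULL
);
CREATE TABLE clients (
    client_id INTEGER PRIMARY KEY,
    name      TEXT NOT NULL
);
CREATE TABLE reservations (
    id            INTEGER PRIMARY KEY,
    v_id          INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id     INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to   TEXT NOT NULL
);
CREATE TABLE visitations (
    visitation_id INTEGER PRIMARY KEY,  -- hypothetical name for the key
    v_id          INTEGER NOT NULL REFERENCES visitators(v_id),
    client_id     INTEGER NOT NULL REFERENCES clients(client_id),
    datetime_from TEXT NOT NULL,
    datetime_to   TEXT NOT NULL,
    rescheduled   INTEGER NOT NULL DEFAULT 0  -- boolean stored as 0 or 1
);
"""

conn = sqlite3.connect(":memory:")
conn.executescript(SCHEMA)
conn.execute("INSERT INTO visitators VALUES (1, 'visitator 1')")  # made-up name
conn.execute("INSERT INTO clients VALUES (1, 'gary doades')")
conn.execute(
    "INSERT INTO visitations VALUES "
    "(1, 1, 1, '2013-01-01 02:01', '2013-01-01 03:01', 1)"
)
conn.commit()


def visitation_report(connection):
    """Join visitations with clients to recover the client name."""
    return connection.execute(
        "SELECT vi.visitation_id, c.name, vi.datetime_from, vi.datetime_to "
        "FROM visitations AS vi "
        "JOIN clients AS c ON c.client_id = vi.client_id "
        "ORDER BY vi.datetime_from"
    ).fetchall()
```

The join in `visitation_report` is the query mentioned in the notes: the client name lives only in the clients table and is looked up when a report is needed.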
  • 35. Database map: Procedures and functions
 The database will implement three procedures and one function that can be called from any of them.
 The function “RESCHEDULE” will read the Reservations table and any other table needed, and will write only to the Visitations table. Its purpose is to reschedule all rows according to the scheme previously defined.
 There will be three procedures:
 EDIT_CLIENTS: Reads and writes the Clients table. Writes the Reservations table. Finally calls the Reschedule function. It is used to modify any information concerning a particular client instance, such as the name field, in all of the records. It is also used to remove a client together with all of its reservations (and therefore its visitations).
 EDIT_VISITATORS: Reads and writes the Visitators table. Writes the Reservations table. Finally calls the Reschedule function. It is used to modify any information concerning a particular visitator instance, such as the name field, in all of the records. It is also used to remove a visitator together with all of its reservations (and therefore its visitations).
 EDIT_RESERVATIONS: Reads and writes the Reservations table. Finally calls the Reschedule function. It is used to edit any piece of data concerning a reservation, such as the visitator, the client or the dates and times arranged. It is also used to delete a reservation.
 NOTES:
 Only the RESCHEDULE function can access the Visitations list, which is considered the single most valuable source of output reports from the program’s execution.
 Upon the deletion of any or all of the Reservations, some garbage data may remain stored in the Clients and Visitators tables. That is why we need specific procedures to edit those.
  • 36. Database map, visually
 Blue boxes for tables, green disks for procedures. Arrows for data/operation flows.
 PERT
  • 37. References
 “Microsoft SQL Server 2005 New Features” by Michael Oatley. McGraw-Hill/Osborne 2005 (288 pages). ISBN: 0072227761
 “SQL Server 2000: Stored Procedure Programming” by Dejan Sunderic and Tom Woodhead. Osborne Database Professional’s Library
 “Microsoft Excel 2007 VBA (Macros)”. Premier Training Limited (London)
 “Macros Visual Basic para Excel” by José Pedro García Sabater. ROGLE – Universitat Politècnica de València
 “Microsoft SQL Server 2005 Express Edition for Dummies” by Robert Schneider. Wiley Publishing, Inc.
 “Oracle PL/SQL by Example” by VV.AA. Pearson Education as Prentice Hall Professional Technical Reference
  • 38. Any questions?
 alfonsodelafuenteruiz@yahoo.es
 http://creativecommons.org/licenses/by-nc-sa/3.0/legalcode
 Please excuse any errata.
 Thanks for your attention