Mastering MySQL Database Architecture: Deep Dive into MySQL Shell and MySQL R...
Data Integration Lecture
1. The Web and Data Integration Databases WWW Integration
2. First comes knowledge (stories) Once upon a time, in the primitive and barbarian days before books or computers, the amount of information shepherded by a group of people could be collected in the wisdom and the stories of its older members. In this world, storytellers, magicians, and grandparents were considered great and honored storehouses for all that was known.
3. Soon humanity invented the technology of writing. And though great scholars like Aristotle warned that the invention of the alphabet would lead to the total demise of the creativity and sensibility of humanity, data began to be stored in voluminous data repositories, called books. As we know, eventually books propagated with great speed and soon, whole communities of books migrated to the first real "databases", libraries. Specifically, libraries introduced "standards" by which data could be stored and retrieved. Then Came Libraries….(they lasted about 2000 years)
4. And for the next couple thousand years libraries grew, and grew, and grew along with associated storage/retrieval technologies such as the filing cabinet, colored tabs, and three ring binders. All this until one day about half a century ago, some really bright folks including Alan Turing, working for the British government were asked to invent an advanced tool for breaking German cryptographic "Enigma" codes. That day the world changed again. That day the computer was born. The computer was an intensely revolutionary technology of course, but as with any technology, people took it and applied it to old problems instead of using it to its revolutionary potential. Almost instantly, the computer was applied to the age-old problem of information storage and retrieval. After all, by World War Two, information was already accumulating at rates beyond the space available in publicly supported libraries. And besides, it seemed somehow cheap and tawdry to store the entire archives of "The Three Stooges" in the Library of Congress. Information was seeping out of every crack and pore of modern day society. Then Came Computers
5. Thus, the first attempts at information storage and retrieval followed traditional lines and metaphors. The first systems were based on discrete files in a virtual library. In this file-oriented system, a bunch of files would be stored on a computer and could be accessed by a computer operator. Files of archived data were called "tables" because they looked like tables used in traditional file keeping. Rows in the table were called "records" and columns were called "fields". Consider the following example: The Flat File (I remember it well) 365-777-9876 [email_address] Lim Li Hsien 987-765-4321 [email_address] Sol Selena 213-456-0987 [email_address] Tachibana Eric Phone Email Last Name First Name
6. The Flat File Was Seriously Bad So We “Indexed” It Essentially, in order to find a record, someone would have to read through the entire file and hope it was not the last record. With a hundred thousands records, you can imagine the dilemma. What was needed, computer scientists thought (using existing metaphors again) was a card catalog, a means to achieve random access processing, that is the ability to efficiently access a single record without searching the entire file to find it. The result was the indexed file-oriented system in which a single index file stored "key" words and pointers to records that were stored elsewhere. This made retrieval much more efficient. It worked just like a card catalog in a library. To find data, one needed only search for keys rather than reading entire records. However, even with the benefits of indexing, the file-oriented system still suffered from problems including: * Data Redundancy - the same data might be stored in different places * Poor Data Control - redundant data might be slightly different such as in the case when Ms. Jones changes her name to Mrs. Johnson and the change is only reflected in some of the files containing her data * Inability to Easily Manipulate Data - it was a tedious and error prone activity to modify files by hand * Cryptic Work Flows - accessing the data could take excessive programming effort and was too difficult for real-users (as opposed to programmers).
7. The Indexed File Wasn’t Much Better Consider how troublesome the following data file would be to maintain. What was needed was a truly unique way to deal with the age-old problem, a way that reflected the medium of the computer rather than the tools and metaphors it was replacing. A Human Cultures 88 West 1st St. Ms. Tonya Lippert A Governments 88 West 1st St. Ms. Tonya Lippert A Psychology 102 100 Capitol Ln. Mrs. Tonya Ducovney A Psychology 101 88 West 1st St. Ms. Tonya Lippert A Chinese 123 Kensigton Mr. Eric Tachibana A English 101 123 Kensigton Mr. Eric Tachibana B Data Structures 123 Kensigton Mr. Eric Tachibana C+ Chemistry 102 123 Kensigton Mr. Eric Tachibana Grade Course Address Name
8. Enter the database. Simply put, a database is a computerized record keeping system. More completely, it is a system involving data, the hardware that physically stores that data, the software that utilizes the hardware's file system in order to 1) store the data and 2) provide a standardized method for retrieving or changing the data A database might be as complex and demanding as an account tracking system used by a bank to manage the constantly changing accounts of thousands of bank customers, or it could be as simple as a collection of electronic business cards on your laptop. Today companies like Oracle and Sybase make a living on these. Access is the Microsoft product in this line.
9. What is the Internet (World Wide Web)? 1957 (A very good year by the way) USSR launches Sputnik, first artificial earth satellite. In response, US forms the Advanced Research Projects Agency ( ARPA ), the following year, within the Department of Defense (DoD) to establish US lead in science and technology applicable to the military. Oh by the way, NASA was formed for the same reason. 1969 The same year NASA delivers by landing a man on the moon, ARPA delivers and sends packets across a network at UCLA. Throughout the 70’s things really heat up with BITNET (Because it’s Time Network) appearing in 1981 the same year the IBM PC was introduced. In 1984 Domain Name Services (DNS) are introduced. By 1990 ARPA is gone and the Internet as it is now know exceeds 100,000 nodes.
10. 1991 Gopher released by Paul Lindner and Mark P. McCahill from the Univ of Minnesota World-Wide Web (WWW) released by CERN ; Tim Berners-Lee developer. What do we mean WWW? The WWW project is based on the principle of universal readership: "if information is available, then any (authorized) person should be able to access it from anywhere in the world.” This is done by using client/server logic and hypertext .
11. How Do the Web and the Database Work Together More importantly for our focus, databases have quickly become integral to the design, development, and services offered by web sites. Consider a site like Amazon.com that must be able to allow users to quickly jump through a vast virtual warehouse of books, compact disks and much more. Finally, after almost 30 years in the making the two technologies: databases and hypertext have collided on the Internet.
12. Client Server URL: http://www.oneonta.edu/faculty/greenbjb/myfile.html www.oneonta.edu on the disk is a file called myfile.html in the greenbjb folder in the faculty folder is sent back to the client. client asks for file
13. Any file and any server can be accessed. Soak that up for a minute. This is powerful stuff. Was the original objective achieved? ( universal readership)
14. But my friends this is a flat file - this “html”. Good yes, but we can do better. Take the following URL: http://www.oneonta.edu/navigation/directory.asp
15. Client Server URL: http://www.oneonta.edu/navigation/directory.asp www.oneonta.edu on the disk is a file called directory.asp in the navigation folder and it is EXECUTED by the server. client asks for file
16. So What? What does it mean to be able to run a program on the server? Query or update a database Access current information Change the response each time the page is visited Look smart, even if we aren’t We can have what we call Dynamic Web Pages
17. In addition to this intelligence on the server, we can also download little snips of code to the client which is smart enough to run them. These can be done in a number of ways…. Java applets or even Shockwave. Flash and other programs help people produce this “things”. This also can produce dynamic web pages.
18. Client Server URL: http://employees.oneonta.edu/greenbjb/Flash101/animations/GettingStarted.swf www.oneonta.edu on the disk is a file called GettingStarted.swf in the greenbjb folder in the Flash101 folder in the animations folder and it is sent by the server to the client. client asks for file when client receives the file it notes it as a .swf and knows to run it.
19. New Adobe technology with PDF has expanded this Web forms in PDF, Image over Text, and DB integration
20. Flash, an Adobe animation tool also has interactivity and DB integration tools.