3. Turing Test
a test of a machine's ability to exhibit intelligent
behavior equivalent to, or indistinguishable from, that of
a human
4. Deductive Reasoning
the process of reasoning from one or more
statements (premises) to reach a logically certain
conclusion
Inductive Reasoning
reasoning in which the premises are viewed as
supplying strong evidence for the truth of the
conclusion
5. IBM Watson
a question answering computer system capable of
answering questions posed in natural language
6. Watson
applies advanced natural language processing, information retrieval, knowledge
representation, automated reasoning, and machine learning technologies to the
field of open domain question answering
Software
Watson uses IBM's DeepQA software and the Apache UIMA (Unstructured Information
Management Architecture) framework. The system was written in various languages,
including Java, C++, and Prolog, and runs on the SUSE Linux Enterprise Server 11 operating
system using the Apache Hadoop framework to provide distributed computing.
Hardware
Watson is composed of a cluster of ninety IBM Power 750 servers, each of which uses a 3.5
GHz POWER7 eight core processor, with four threads per core. In total, the system has 2,880
POWER7 processor threads and has 16 terabytes of RAM.
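The thread total follows from the figures above. A quick arithmetic check, with variable names chosen here purely for illustration:

```python
servers = 90            # IBM Power 750 servers in the cluster
cores_per_server = 8    # eight-core POWER7 processor, as stated above
threads_per_core = 4    # four threads per core

total_threads = servers * cores_per_server * threads_per_core
print(total_threads)  # 2880
```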
Data
The sources of information for Watson include encyclopedias, dictionaries, thesauri,
newswire articles, and literary works. Watson also used databases, taxonomies, and
ontologies, specifically DBpedia, WordNet, and YAGO. The IBM team provided
Watson with millions of documents, including dictionaries, encyclopedias, and other
reference material that it could use to build its knowledge. Although Watson was not
connected to the Internet during the game, it contained 200 million pages of structured and
unstructured content consuming four terabytes of disk storage, including the full text of
Wikipedia.
12. Machine Learning
the study and construction of algorithms that can
learn from and make predictions on data
A core objective of a learner is to generalize
from its experience.
Generalization is the ability of a learning machine to perform
accurately on new, unseen examples/tasks after
having experienced a learning data set.
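One common way to estimate how well a learner generalizes is to score it on examples it never saw while learning. The sketch below is only an illustration under assumed conditions: the data is synthetic (two Gaussian blobs), the learner is a 1-nearest-neighbor classifier chosen for brevity, and all function names are hypothetical.

```python
import random

def make_data(n, rng):
    """Draw n labeled 2-D points from two well-separated synthetic blobs."""
    data = []
    for _ in range(n):
        label = rng.randint(0, 1)
        center = 0.0 if label == 0 else 5.0
        point = (rng.gauss(center, 1.0), rng.gauss(center, 1.0))
        data.append((point, label))
    return data

def predict_1nn(train, point):
    """Return the label of the training example closest to `point`."""
    nearest = min(
        train,
        key=lambda ex: (ex[0][0] - point[0]) ** 2 + (ex[0][1] - point[1]) ** 2,
    )
    return nearest[1]

def accuracy(train, examples):
    """Fraction of `examples` whose label the 1-NN learner predicts correctly."""
    correct = sum(predict_1nn(train, p) == y for p, y in examples)
    return correct / len(examples)

rng = random.Random(0)
train_set = make_data(200, rng)   # the "learning data set"
test_set = make_data(100, rng)    # new examples, never seen during learning

train_acc = accuracy(train_set, train_set)
test_acc = accuracy(train_set, test_set)
print(f"accuracy on the learning data: {train_acc:.2f}")
print(f"accuracy on unseen examples:   {test_acc:.2f}")
```

The second number is the one that speaks to generalization. A 1-nearest-neighbor learner always scores perfectly on its own training points, so the first number alone says little.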
14. Approaches of Machine Learning
• Decision tree learning
• Artificial neural networks
• Support vector machines
• Bayesian networks
• Clustering
• Genetic algorithms
• …
15. Applied Machine Learning Process
1. Define the Problem
Step 1: What is the problem?
Step 2: Why does the problem need to be solved?
Step 3: How would I solve the problem?
2. Prepare Data
Step 1: Data Selection
Step 2: Data Preprocessing
Step 3: Data Transformation
3. Spot Check Algorithms
4. Improve Results
5. Present Results
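The process above can be sketched end to end on a toy problem. Everything in this example is an assumption made for illustration: the data is synthetic, the problem statement is hypothetical, and the two "algorithms" being spot checked (a majority-class baseline and a nearest-centroid classifier) were picked only because they fit in a few lines. Step 4 (Improve Results) is left out.

```python
import random

rng = random.Random(42)

# 1. Define the problem (hypothetical): predict a binary label from two numeric features.
# Synthetic stand-in data; about 5% of rows are missing the second feature.
raw = []
for _ in range(300):
    label = rng.randint(0, 1)
    f1 = rng.gauss(10.0 + 4.0 * label, 1.5)
    f2 = rng.gauss(200.0 - 60.0 * label, 25.0)
    raw.append((f1, None if rng.random() < 0.05 else f2, label))

# 2. Prepare data
rows = [r for r in raw if r[1] is not None]         # selection: keep complete rows only
split = int(0.7 * len(rows))
train_rows, test_rows = rows[:split], rows[split:]  # hold out 30% for evaluation

def fit_ranges(rows):
    """Preprocessing: record each feature's min and max on the training rows."""
    return [(min(col), max(col)) for col in zip(*[r[:2] for r in rows])]

def transform(rows, ranges):
    """Transformation: rescale each feature to roughly [0, 1] using the training ranges."""
    return [
        (tuple((v - lo) / (hi - lo) for v, (lo, hi) in zip(r[:2], ranges)), r[2])
        for r in rows
    ]

ranges = fit_ranges(train_rows)
train, test = transform(train_rows, ranges), transform(test_rows, ranges)

# 3. Spot check algorithms
def majority_baseline(train):
    """Always predict the most frequent training label."""
    labels = [y for _, y in train]
    majority = max(set(labels), key=labels.count)
    return lambda x: majority

def nearest_centroid(train):
    """Predict the label whose mean training point is closest."""
    centroids = {}
    for label in {y for _, y in train}:
        pts = [x for x, y in train if y == label]
        centroids[label] = tuple(sum(c) / len(pts) for c in zip(*pts))
    def predict(x):
        return min(
            centroids,
            key=lambda lab: sum((a - b) ** 2 for a, b in zip(x, centroids[lab])),
        )
    return predict

def accuracy(model, examples):
    return sum(model(x) == y for x, y in examples) / len(examples)

candidates = [("majority baseline", majority_baseline),
              ("nearest centroid", nearest_centroid)]
results = {name: accuracy(fit(train), test) for name, fit in candidates}

# 5. Present results, best first
for name, acc in sorted(results.items(), key=lambda kv: -kv[1]):
    print(f"{name}: {acc:.2f}")
```

Fitting the scaling ranges on the training rows only, and then applying them to the held-out rows, keeps information from the evaluation data out of the preparation step.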
16. Deep Learning
a branch of machine learning based on a set of
algorithms that attempt to model high-level abstractions
in data by using multiple processing layers, with
complex structures or otherwise, composed of multiple
non-linear transformations
Deep learning has been characterized as a buzzword, or
a rebranding of neural networks.
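To make "multiple processing layers composed of non-linear transformations" concrete, here is a minimal sketch of a two-layer network computing XOR, a function that no single linear layer can represent. The weights are set by hand for illustration, not learned, and the helper names are hypothetical.

```python
def relu(v):
    """Non-linear activation: pass positive values, clamp negatives to zero."""
    return max(0.0, v)

def identity(v):
    return v

def layer(inputs, weights, biases, activation):
    """One processing layer: a weighted sum per unit, followed by an activation."""
    return [
        activation(sum(w * x for w, x in zip(unit_weights, inputs)) + b)
        for unit_weights, b in zip(weights, biases)
    ]

def network(x1, x2):
    # Layer 1: two hidden units with a ReLU non-linearity (hand-picked weights).
    hidden = layer([x1, x2], [[1.0, 1.0], [1.0, 1.0]], [0.0, -1.0], relu)
    # Layer 2: a linear read-out of the hidden units.
    output = layer(hidden, [[1.0, -2.0]], [0.0], identity)
    return output[0]

for a in (0, 1):
    for b in (0, 1):
        print(a, b, network(a, b))
# prints:
# 0 0 0.0
# 0 1 1.0
# 1 0 1.0
# 1 1 0.0
```

Replacing `relu` with `identity` in the first layer collapses the two layers into a single linear map, which can no longer produce XOR. The non-linearity between layers is what lets the stacked layers represent it.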
17. Neural Network
a family of models used to estimate or approximate functions that can
depend on a large number of inputs and are
generally unknown
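As a minimal sketch of approximating a function from examples, the code below trains a single artificial neuron (a perceptron) to reproduce logical AND from its four input/output pairs. The learning rate, epoch count, and function names are assumptions for this example. A single neuron can only learn linearly separable functions such as this one.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of the inputs plus the bias is positive."""
    total = sum(w * x for w, x in zip(weights, inputs)) + bias
    return 1 if total > 0 else 0

def train_perceptron(examples, learning_rate=0.1, epochs=20):
    """Adjust weights toward each target using the perceptron update rule."""
    weights = [0.0] * len(examples[0][0])
    bias = 0.0
    for _ in range(epochs):
        for inputs, target in examples:
            error = target - predict(weights, bias, inputs)
            weights = [w + learning_rate * error * x for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias

# The "unknown" function is presented only through examples.
and_examples = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]
weights, bias = train_perceptron(and_examples)
print([predict(weights, bias, x) for x, _ in and_examples])  # [0, 0, 0, 1]
```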
Objective
Solution
29. Libraries
• Caffe
– A deep learning framework specializing in image recognition.
• CNTK
– An open source deep learning toolkit (Computational Network Toolkit) by Microsoft Research.
• ConvNetJS
– A JavaScript library for training deep learning models. It contains online demos.
• Deeplearning4j
– An open source deep learning library written for Java with LSTMs and convolutional networks. It provides parallelization with CPUs and GPUs.
• Gensim
– A toolkit for natural language processing implemented in the Python programming language.
• Keras
– A deep learning framework capable of running on top of either TensorFlow or Theano.
• NVIDIA cuDNN
– A GPU-accelerated library of primitives for deep neural networks.
• OpenNN
– An open source C++ library which implements deep neural networks and provides parallelization with CPUs.
• TensorFlow
– Google's open source machine learning library in C++ and Python with APIs for both. It provides parallelization with CPUs and GPUs.
• Theano
– An open source machine learning library for Python.
• Torch
– An open source software library for machine learning based on the Lua programming language.
• Apache SINGA
– A general distributed deep learning platform.
30. References
1. , , “ ”, , 22 1 , 2015.1.
2. , “ 1 ”, http://www.slideshare.net/DonghunLee20/1-59501887
3. Wikipedia, http://en.wikipedia.org
4. Jason Brownlee, “Process for working through Machine Learning Problems”,
http://machinelearningmastery.com/process-for-working-through-machine-learning-problems/
5. Daniel Shiffman, “The Nature of Code”, http://natureofcode.com/book/