2. ABOUT THE COMPANY - INTERNSHALA
Internshala is an internship and online training platform, based
in Gurgaon, India.[1][2] Founded by Sarvesh Agrawal, an IIT
Madras alumnus, in 2011, the website helps students
find internships with organisations in India.
3. WHAT IS MACHINE LEARNING?
Machine learning (ML) is a type of artificial intelligence (AI)
that allows software applications to become more accurate
at predicting outcomes without being explicitly programmed
to do so. Machine learning algorithms use historical data as
input to predict new output values.
5. EXPLORATORY DATA ANALYSIS
Exploratory Data Analysis (EDA) refers to the critical process of performing initial
investigations on data to discover patterns, spot anomalies, test hypotheses,
and check assumptions with the help of summary statistics and graphical
representations. It is an approach to analyzing data sets to summarize their main
characteristics, often using statistical graphics and other data visualization methods.
The four types of EDA are univariate non-graphical, multivariate non-graphical,
univariate graphical, and multivariate graphical.
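As a rough illustration of the two non-graphical types, the sketch below computes summary statistics for a single variable (univariate) and a Pearson correlation between two variables (multivariate). The dataset and the helper names `summarize` and `pearson` are hypothetical and chosen only for this example.

```python
import statistics

# Hypothetical toy dataset: hours studied and exam score for six students.
hours = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
scores = [52.0, 55.0, 61.0, 64.0, 70.0, 74.0]


def summarize(values):
    """Univariate non-graphical EDA: basic summary statistics."""
    return {
        "min": min(values),
        "max": max(values),
        "mean": statistics.mean(values),
        "median": statistics.median(values),
        "stdev": statistics.stdev(values),
    }


def pearson(xs, ys):
    """Multivariate non-graphical EDA: Pearson correlation of two variables."""
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    spread_x = sum((x - mean_x) ** 2 for x in xs) ** 0.5
    spread_y = sum((y - mean_y) ** 2 for y in ys) ** 0.5
    return covariance / (spread_x * spread_y)


hours_summary = summarize(hours)
hours_vs_scores = pearson(hours, scores)
```

A correlation close to 1 or -1 suggests a strong linear relationship worth plotting next; a value near 0 suggests none.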
6. SUPERVISED MACHINE LEARNING
Supervised learning uses a training set to teach models
to yield the desired output. This training dataset
includes inputs and correct outputs, which allow the
model to learn over time. The algorithm measures its
accuracy through the loss function, adjusting until the
error has been sufficiently minimized.
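The training loop described above can be sketched as gradient descent on a mean squared error loss. This is a minimal illustration, not a production method: the data, the learning rate, the tolerance, and the function names are all assumptions made for the example.

```python
# Hypothetical training set: inputs paired with correct outputs (here y = 2x + 1).
inputs = [0.0, 1.0, 2.0, 3.0, 4.0]
targets = [1.0, 3.0, 5.0, 7.0, 9.0]


def mse_loss(w, b):
    """Loss function: mean squared error of the model y = w * x + b."""
    return sum((w * x + b - y) ** 2 for x, y in zip(inputs, targets)) / len(inputs)


def train(learning_rate=0.05, tolerance=1e-6, max_steps=10_000):
    """Adjust w and b until the error has been sufficiently minimized."""
    w, b = 0.0, 0.0
    n = len(inputs)
    for _ in range(max_steps):
        if mse_loss(w, b) < tolerance:
            break
        # Gradients of the loss with respect to w and b.
        grad_w = sum(2 * (w * x + b - y) * x for x, y in zip(inputs, targets)) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in zip(inputs, targets)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


w, b = train()
```

With this data the learned `w` and `b` should end up close to 2 and 1, the values used to generate the targets.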
7. TYPES OF SUPERVISED MACHINE LEARNING
CLASSIFICATION : used when the output variable is a category, such
as “red” or “blue”, or “disease” or “no disease”. A classification
model assigns each observation to a category based on its observed values.
REGRESSION : used when the output variable is a real or continuous
value, such as “salary” or “weight”. Many different models can be
used; the simplest is linear regression, which fits the data with the
hyperplane that lies closest to the points (a straight line for a single input).
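For a single input variable, the best-fitting line of linear regression can be computed directly with the ordinary least squares formulas. The sketch below assumes a small made-up dataset of experience versus salary; `fit_line` is a hypothetical helper name.

```python
def fit_line(xs, ys):
    """Ordinary least squares for one input: returns (slope, intercept)."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical data: years of experience vs. salary (in thousands).
years = [1.0, 2.0, 3.0, 4.0, 5.0]
salary = [30.0, 35.0, 40.0, 45.0, 50.0]

slope, intercept = fit_line(years, salary)
predicted_salary = slope * 6.0 + intercept  # prediction for 6 years of experience
```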
8. LOGISTIC REGRESSION : Logistic regression is a model used to predict a binary
outcome: either something happens or it does not. The outcome can be expressed as Yes/No, Pass/Fail, Alive/Dead,
etc. The independent variables are analyzed to estimate the probability of the outcome, and the result falls into one of two
categories. The independent variables can be categorical or numeric, but the dependent variable is always
categorical.
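A minimal sketch of this idea, assuming one numeric independent variable and a Pass/Fail outcome: the model passes a weighted input through the sigmoid function to get a probability, and the weights are fitted by gradient descent on the log loss. The data, learning rate, step count, and function names are hypothetical.

```python
import math


def sigmoid(z):
    """Map any real number to a probability between 0 and 1."""
    return 1.0 / (1.0 + math.exp(-z))


# Hypothetical data: hours studied -> pass (1) or fail (0).
hours = [0.5, 1.0, 1.5, 2.0, 3.0, 3.5, 4.0, 4.5]
passed = [0, 0, 0, 0, 1, 1, 1, 1]


def train_logistic(xs, ys, learning_rate=0.5, steps=5000):
    """Fit w and b by gradient descent on the average log loss."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        grad_w = sum((sigmoid(w * x + b) - y) * x for x, y in zip(xs, ys)) / n
        grad_b = sum(sigmoid(w * x + b) - y for x, y in zip(xs, ys)) / n
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b


def predict(w, b, x):
    """Return 1 (pass) if the estimated probability is at least 0.5, else 0."""
    return 1 if sigmoid(w * x + b) >= 0.5 else 0


w, b = train_logistic(hours, passed)
```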
K-NEAREST NEIGHBORS : K-nearest neighbors (k-NN) is a pattern recognition algorithm
that uses the training dataset to find the k training examples closest to a new example. When k-NN is used in classification,
the new data point is assigned to the class chosen by a plurality vote of its k nearest neighbors. If k = 1, it is simply placed in the class
of its single nearest neighbor.
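The voting step can be sketched in a few lines. The example below assumes two-dimensional points, Euclidean distance, and a made-up training set; `knn_classify` is a hypothetical helper name.

```python
import math
from collections import Counter

# Hypothetical labeled training points: (feature vector, class label).
training = [
    ((1.0, 1.0), "red"),
    ((1.5, 2.0), "red"),
    ((2.0, 1.0), "red"),
    ((6.0, 6.0), "blue"),
    ((7.0, 7.5), "blue"),
    ((6.5, 5.5), "blue"),
]


def knn_classify(point, k):
    """Assign the class most common among the k nearest training points."""
    by_distance = sorted(training, key=lambda item: math.dist(item[0], point))
    votes = Counter(label for _, label in by_distance[:k])
    return votes.most_common(1)[0][0]


nearest_one = knn_classify((1.2, 1.1), k=1)    # class of the single nearest neighbor
nearest_three = knn_classify((6.2, 6.1), k=3)  # plurality vote of three neighbors
```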
NAIVE BAYES : Naive Bayes calculates the probability that a data point belongs
to a certain category. In text analysis, it can be used to categorize words or phrases
as belonging to a preset “tag” (classification) or not.
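As a sketch of the text-tagging use case, the example below implements a small multinomial Naive Bayes classifier with Laplace (add-one) smoothing. The training phrases, the "spam"/"ham" tags, and the function names are assumptions made for illustration.

```python
import math
from collections import Counter, defaultdict

# Hypothetical labeled phrases for a two-tag text classifier.
training = [
    ("win money now", "spam"),
    ("free money offer", "spam"),
    ("meeting schedule today", "ham"),
    ("project meeting notes", "ham"),
]


def train_naive_bayes(examples):
    """Count how often each tag and each (tag, word) pair occurs."""
    tag_counts = Counter()
    word_counts = defaultdict(Counter)
    vocabulary = set()
    for text, tag in examples:
        tag_counts[tag] += 1
        for word in text.split():
            word_counts[tag][word] += 1
            vocabulary.add(word)
    return tag_counts, word_counts, vocabulary


def classify(text, model):
    """Return the tag with the highest log-probability for the text."""
    tag_counts, word_counts, vocabulary = model
    total_examples = sum(tag_counts.values())
    best_tag, best_score = None, -math.inf
    for tag in tag_counts:
        score = math.log(tag_counts[tag] / total_examples)  # prior
        words_in_tag = sum(word_counts[tag].values())
        for word in text.split():
            smoothed = word_counts[tag][word] + 1  # Laplace smoothing
            score += math.log(smoothed / (words_in_tag + len(vocabulary)))
        if score > best_score:
            best_tag, best_score = tag, score
    return best_tag


model = train_naive_bayes(training)
```

The "naive" part is the assumption that words are independent given the tag, which is why the per-word log-probabilities are simply added.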
9. DECISION TREE : A decision tree is a supervised learning algorithm that is well suited to
classification problems, because it can separate classes at a fine level of detail. It works like a flow chart,
splitting the data points into two groups at a time, from the “tree trunk” to “branches” to
“leaves,” where the members of each group become progressively more similar. This creates categories within categories,
allowing for classification with limited human supervision.
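A single split of such a tree can be sketched as follows. The example assumes Gini impurity as the split criterion, which is one common choice and not stated in the slides; the data and function names are hypothetical.

```python
def gini(labels):
    """Gini impurity: 0.0 when every label in the group is the same."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((labels.count(c) / n) ** 2 for c in set(labels))


def best_split(values, labels):
    """Try each midpoint between sorted values; keep the purest two-way split."""
    best_threshold, best_impurity = None, float("inf")
    ordered = sorted(set(values))
    for low, high in zip(ordered, ordered[1:]):
        threshold = (low + high) / 2
        left = [lab for v, lab in zip(values, labels) if v <= threshold]
        right = [lab for v, lab in zip(values, labels) if v > threshold]
        impurity = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if impurity < best_impurity:
            best_threshold, best_impurity = threshold, impurity
    return best_threshold, best_impurity


# Hypothetical data: body temperature (Celsius) and diagnosis.
temperatures = [36.5, 36.8, 37.0, 38.2, 38.9, 39.5]
diagnosis = ["no disease"] * 3 + ["disease"] * 3

threshold, impurity = best_split(temperatures, diagnosis)
```

A full tree repeats this step on each resulting group until the groups are pure or a stopping rule is reached.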
SUPPORT VECTOR MACHINES : A support vector machine (SVM) is an algorithm
that trains on labeled data and classifies new data, and it is not limited to two-dimensional (X/Y) problems. The
goal of the SVM algorithm is to create the best line or decision boundary that can segregate
n-dimensional space into classes so that we can easily put a new data point in the correct category in
the future. This best decision boundary is called a hyperplane.
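Once a hyperplane has been found, classifying a new point only requires checking which side of it the point falls on. The sketch below shows that step for a linear boundary; the weights and bias are hypothetical stand-ins for what a trained SVM would produce, and the training itself is not shown.

```python
import math

# Hypothetical hyperplane w . x + b = 0, as if produced by a trained linear SVM.
weights = [1.0, 1.0]
bias = -5.0


def decision_value(point):
    """Signed score: positive on one side of the hyperplane, negative on the other."""
    return sum(w * x for w, x in zip(weights, point)) + bias


def classify(point):
    """Put the point in class +1 or -1 depending on its side of the hyperplane."""
    return 1 if decision_value(point) >= 0 else -1


def distance_to_hyperplane(point):
    """Geometric distance from the point to the decision boundary."""
    norm = math.sqrt(sum(w * w for w in weights))
    return abs(decision_value(point)) / norm


label_a = classify((4.0, 4.0))
label_b = classify((1.0, 1.0))
```

An SVM chooses the hyperplane that maximizes this distance for the closest training points, which are called the support vectors.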
10. UNSUPERVISED MACHINE LEARNING
Unsupervised learning is helpful for finding useful insights from data.
It is similar to the way a human learns to think from their
own experiences, which brings it closer to real AI.
Unsupervised learning works on unlabeled and uncategorized data, which
makes it especially important.
In the real world, we do not always have input data with the corresponding
output, so unsupervised learning is needed to handle such cases.
11. TYPES OF UNSUPERVISED LEARNING
CLUSTERING : Clustering is a method of grouping objects into clusters such that objects
with the most similarities remain in one group and have few or no similarities with the objects of
another group. Cluster analysis finds the commonalities between data objects and categorizes
them according to the presence or absence of those commonalities.
ASSOCIATION : Association rule learning is an unsupervised learning method used for
finding relationships between variables in a large database. It determines the sets of items that
occur together in the dataset. Association rules make marketing strategies more effective: for example,
people who buy item X (say, bread) also tend to purchase item Y (butter or jam). A typical
example of association rule learning is Market Basket Analysis.
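The strength of a rule such as "bread implies butter" is usually measured with support and confidence. The sketch below computes both on a made-up list of shopping baskets; the transactions and function names are hypothetical.

```python
# Hypothetical shopping baskets for a market basket analysis.
transactions = [
    {"bread", "butter"},
    {"bread", "butter", "jam"},
    {"bread", "milk"},
    {"milk", "eggs"},
    {"bread", "butter", "milk"},
]


def support(itemset):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(1 for basket in transactions if itemset <= basket) / len(transactions)


def confidence(antecedent, consequent):
    """Of the baskets containing the antecedent, the share that also contain the consequent."""
    return support(antecedent | consequent) / support(antecedent)


bread_butter_support = support({"bread", "butter"})
bread_to_butter_confidence = confidence({"bread"}, {"butter"})
```

Here three of the five baskets contain both bread and butter, and three of the four bread baskets also contain butter.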
12. K-MEANS : K-Means clustering groups an unlabeled dataset into different clusters. Here K
defines the number of clusters to be created in the process: if K=2, there will be two
clusters, for K=3 there will be three clusters, and so on. It is a convenient way to discover
the groups in an unlabeled dataset on its own, without the need
for labeled training data. It is a centroid-based algorithm, where each cluster is associated with a centroid. The main aim of
this algorithm is to minimize the sum of squared distances between the data points and the centroids of their corresponding clusters.
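A minimal sketch of the usual iterative procedure (often called Lloyd's algorithm): assign each point to its nearest centroid, move each centroid to the mean of its points, and repeat until nothing changes. The points and the choice of starting centroids are assumptions for the example; real implementations typically pick starting centroids at random.

```python
import math


def k_means(points, initial_centroids, max_iterations=100):
    """Alternate between assigning points and updating centroids until stable."""
    centroids = list(initial_centroids)
    clusters = []
    for _ in range(max_iterations):
        # Assignment step: each point joins the cluster of its nearest centroid.
        clusters = [[] for _ in centroids]
        for point in points:
            nearest = min(
                range(len(centroids)),
                key=lambda i: math.dist(point, centroids[i]),
            )
            clusters[nearest].append(point)
        # Update step: move each centroid to the mean of its cluster.
        new_centroids = [
            tuple(sum(axis) / len(cluster) for axis in zip(*cluster))
            if cluster else centroids[i]
            for i, cluster in enumerate(clusters)
        ]
        if new_centroids == centroids:
            break
        centroids = new_centroids
    return centroids, clusters


# Hypothetical unlabeled data with two visible groups, so K = 2.
points = [(1.0, 1.0), (1.0, 2.0), (2.0, 1.0), (8.0, 8.0), (8.0, 9.0), (9.0, 8.0)]
centroids, clusters = k_means(points, initial_centroids=[points[0], points[3]])
```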
NEURAL NETWORKS : A neural network is a method in artificial intelligence that teaches
computers to process data in a way that is inspired by the human brain. It is a type of machine learning process,
called deep learning, that uses interconnected nodes or neurons in a layered structure that resembles the human
brain. It creates an adaptive system that computers use to learn from their mistakes and improve continuously.
Thus, artificial neural networks attempt to solve complicated problems, like summarizing documents or
recognizing faces, with greater accuracy.
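The layered structure of interconnected nodes can be illustrated with a tiny network that has one hidden layer. The sketch below shows only the forward pass; the weights are hand-picked (not learned) so that the network computes the XOR of two inputs, and all names are hypothetical.

```python
def relu(value):
    """Common activation function: pass positives through, clip negatives to zero."""
    return max(0.0, value)


def dense(inputs, weights, biases, activation):
    """One layer: each neuron takes a weighted sum of the inputs plus a bias."""
    return [
        activation(sum(w * x for w, x in zip(row, inputs)) + bias)
        for row, bias in zip(weights, biases)
    ]


# Hand-picked (hypothetical) weights that make the network compute XOR.
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def forward(inputs):
    """Pass the inputs through the hidden layer and then the output layer."""
    hidden = dense(inputs, HIDDEN_WEIGHTS, HIDDEN_BIASES, relu)
    output = dense(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES, lambda value: value)
    return output[0]


xor_table = [forward([a, b]) for a in (0.0, 1.0) for b in (0.0, 1.0)]
```

In practice the weights are not chosen by hand: training adjusts them step by step to reduce the network's error on example data.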
13. REINFORCEMENT MACHINE LEARNING
Reinforcement learning (RL) is a feedback-based machine learning technique
in which an agent learns to behave in an environment by performing
actions and observing their results. For each good action, the agent
gets positive feedback, and for each bad action, the agent gets negative
feedback or a penalty.
In reinforcement learning, the agent learns automatically from
feedback without any labeled data, unlike supervised learning.
Since there is no labeled data, the agent is bound to learn from its
experience only.
RL solves a specific type of problem where decision making is sequential
and the goal is long-term, such as game-playing and robotics.
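One widely used RL method is Q-learning, sketched below on a made-up environment: a corridor of five states where the agent starts at the left end and is rewarded only on reaching the right end. The environment, the reward, and the hyperparameters (`alpha`, `gamma`, episode count, seed) are assumptions chosen for illustration.

```python
import random

N_STATES = 5                  # states 0..4 in a corridor; state 4 is the goal
ACTIONS = ["left", "right"]


def step(state, action):
    """Environment: move one cell; reaching the last cell gives reward 1."""
    next_state = max(0, state - 1) if action == "left" else state + 1
    done = next_state == N_STATES - 1
    reward = 1.0 if done else 0.0
    return next_state, reward, done


def q_learning(episodes=500, alpha=0.5, gamma=0.9, seed=0):
    """Learn action values from experience only, with no labeled data."""
    rng = random.Random(seed)
    q = [[0.0, 0.0] for _ in range(N_STATES)]
    for _ in range(episodes):
        state, done = 0, False
        while not done:
            action = rng.randrange(len(ACTIONS))  # explore uniformly at random
            next_state, reward, done = step(state, ACTIONS[action])
            # Feedback now plus the discounted value of the best next action.
            target = reward + (0.0 if done else gamma * max(q[next_state]))
            q[state][action] += alpha * (target - q[state][action])
            state = next_state
    return q


q_table = q_learning()
# Greedy policy for the non-terminal states 0..3.
policy = [ACTIONS[row.index(max(row))] for row in q_table[:-1]]
```

Because the reward arrives only at the end, the value of earlier "right" moves is learned by propagating it backward through the discount factor `gamma`, which is what makes the problem sequential and long-term.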
14. LEARNING OUTCOMES
LEARNT WHAT MACHINE LEARNING IS.
LEARNT THE TYPES OF MACHINE LEARNING.
LEARNT HOW TO ANALYZE AND CLEAN THE DATA.
LEARNT HOW TO PREDICT THE FUTURE OUTCOMES OF THE
GIVEN DATA.
LEARNT HOW MACHINE LEARNING AFFECTS OUR PRESENT
WORLD.
15. CONCLUSION
Machine learning approaches applied in systematic reviews of complex research
fields such as quality improvement may assist in the title and abstract inclusion
screening process. Machine learning approaches are of particular interest
given steadily increasing search outputs, and because accessibility of the existing
evidence is a particular challenge in the research field of quality improvement.
Increased reviewer agreement appeared to be associated with improved predictive
performance.