MACHINE LEARNING: WHAT, WHY & WHERE
Before we go into the details of Machine Learning, it is important to
understand the need for it, the approaches previously used to address that
need, where they fell short, and how Machine Learning now fulfils it.
An algorithm is a process or set of rules to be followed in calculations or
problem-solving operations; it works on an input and produces the desired
output. Many problems, easy or complex, can be solved with the help of a
hand-written algorithm. But there are scenarios where the input data is so
large, complex or unstructured that it becomes very difficult to write explicit
algorithms for them. Some common examples are filtering spam emails, extracting
information from images and videos, predicting customer behaviour, diagnosing
diseases, and the list goes on. This is where Machine Learning comes to the
rescue. It takes past experience, in the form of datasets, as input, extracts
useful information from it, and then applies learned rules and different models
so that a usable output can be produced.
For example, in the case of the stock market, we can use historical datasets
to try to predict whether a particular share is going to be profitable or
loss-making. Similarly, a model trained on past data can help diagnose a
disease, which would otherwise be very difficult by trial and error. Some other
examples where Machine Learning is used are:
- Image Recognition
- Face Recognition
- Anomaly Detection
- Medical Diagnosis
- Search Engines
- Handwriting Recognition
- Financial Services
- Crypto-currency
In short, Machine Learning helps us solve many complex real-world problems by
taking past experience as input and using it to make good predictions about
the future.
Life-Cycle of Machine Learning
The process of Machine Learning can be discussed with the help of the figure
below.
1. Problem definition: The first step is always to define the problem and then
determine whether Machine Learning can be used to solve it.
2. Data Preparation: Once the problem is defined, data is collected from the
source systems and transformed into consumable files. This is the most
important part of Machine Learning, as it converts the raw data collected from
different sources into useful information.
Steps for Data Preparation:
The raw data might contain records that are not useful and that hinder the
final performance. The more accurate the data, the more accurate the
predictions will be.
- First, the raw data should be put into a structured format, such as CSV files or database tables with headers, so that a Python library like ‘Pandas’ can load it as a data frame, which makes it easy to operate on different rows and columns. The Pandas and sklearn libraries in Python are very useful for preprocessing the data.
- Next, we can perform operations on these data frames to deal with anything that hinders performance: removing unwanted records from the dataset, replacing or removing missing data, encoding categorical data, and feature extraction (using only specific data from the dataset).
- Then, the processed data is divided into testing and training datasets, generally in a ratio of 40:60 or 30:70.
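The data preparation steps above can be sketched in a few lines of Pandas and sklearn. This is only a minimal illustration: the column names (`age`, `city`, `notes`, `purchased`) and the tiny in-memory table are made up for the example, and in a real project the data frame would usually come from something like `pd.read_csv`.

```python
import pandas as pd
from sklearn.model_selection import train_test_split

# Hypothetical raw data (assumption for illustration only).
raw = pd.DataFrame({
    "age": [25, 32, None, 41, 25, 38, 29, 50],
    "city": ["Pune", "Delhi", "Pune", None, "Pune", "Delhi", "Mumbai", "Mumbai"],
    "notes": ["a", "b", "c", "d", "a", "e", "f", "g"],
    "purchased": [0, 1, 0, 1, 0, 1, 0, 1],
})

def prepare(df):
    df = df.drop_duplicates()                   # remove unwanted (duplicate) records
    df = df.drop(columns=["notes"])             # keep only the columns we want to use
    df = df.dropna(subset=["city"])             # remove rows with a missing category
    df = df.fillna({"age": df["age"].mean()})   # replace missing numbers with the mean
    df = pd.get_dummies(df, columns=["city"])   # encode the categorical column
    return df

clean = prepare(raw)

# Split features and target, then divide into training and testing data (70:30).
X = clean.drop(columns=["purchased"])
y = clean["purchased"]
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

print(clean.columns.tolist())
print(len(X_train), "training rows,", len(X_test), "testing rows")
```

Filling missing values with the column mean is just one possible choice here; dropping the rows or using the median are equally common, depending on the data.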
3. Data Modelling: Different algorithms that can help in resolving the problem
are identified and applied to the training dataset to prepare a predictive
model with the highest accuracy possible. Once the model is ready, it is
applied to the testing dataset to evaluate its performance.
Data modelling is usually an iterative process. There is a high chance that
the model will not produce the expected results in the first run. Hence, if
the results on the testing dataset are not as desired, the model is
re-evaluated. Sometimes performance is fixed by tuning the model parameters,
and in some cases the model is rebuilt. This process continues until the
desired outcome is achieved.
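The iterative "build, evaluate, tune" loop can be sketched as below. This is a rough illustration under a few assumptions: the dataset is synthetic (generated with sklearn's `make_classification`), the model is a decision tree, and the only parameter being tuned is `max_depth`. The sketch also scores each candidate with cross-validation on the training data and keeps the testing dataset for the final check, which is a common variation of the process described above.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, train_test_split
from sklearn.tree import DecisionTreeClassifier

# Synthetic data standing in for a prepared dataset (assumption).
X, y = make_classification(n_samples=300, n_features=8, random_state=0)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=0
)

# Try several values of one model parameter and keep the best one.
best_depth, best_score = None, -1.0
for depth in [1, 2, 4, 8]:
    model = DecisionTreeClassifier(max_depth=depth, random_state=0)
    # Mean accuracy over 5 folds of the training data.
    score = cross_val_score(model, X_train, y_train, cv=5).mean()
    if score > best_score:
        best_depth, best_score = depth, score

# Rebuild the model with the chosen parameter and evaluate it on the test data.
final_model = DecisionTreeClassifier(max_depth=best_depth, random_state=0)
final_model.fit(X_train, y_train)
test_accuracy = final_model.score(X_test, y_test)

print("chosen max_depth:", best_depth)
print("accuracy on testing dataset:", round(test_accuracy, 3))
```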
TYPES OF MACHINE LEARNING
Machine Learning is commonly divided into three types: supervised,
unsupervised and reinforcement learning. Let’s briefly discuss the first two.
Supervised Learning
In this type of learning, the model is trained on a dataset in which every
record has a set of attributes, or features, together with the known output
value. The goal of a supervised algorithm is to analyse the training dataset
and generate a function that maps the input values to the desired output
values. Training on the given input continues until the model or algorithm
starts demonstrating good performance.
Supervised Learning can be divided into two types on the basis of the desired
output variable.
1. Classification: To predict and label the output as a category. For example:
- Knowledge Extraction: To divide the customers in a bank into ‘high risk’ and ‘low risk’ categories for the purposes of identifying the loan or credit card eligibility.
- Message filtering: To filter out malicious emails and flag them as ‘spam’.
- Medical Diagnosis: Diagnose the patients with the help of their symptoms.
- Recognising a person on the basis of their handwriting, face or speech.
- Weather Prediction
2. Regression: Regression models have a continuous, numerical value as their
output. For example:
- To find out the number of mangoes in a basket of different fruits.
- To predict the approximate salary of an employee in an organisation based on the employee's qualification and number of years of experience.
- To predict the age of a person on the basis of their height and weight.
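To make the difference between the two types concrete, here is a small sketch using sklearn. All the numbers are invented for illustration: the classifier labels bank customers as ‘high risk’ or ‘low risk’ from two made-up features (income in thousands and number of existing debts), and the regressor predicts a salary (in thousands) from years of experience only, leaving out qualification to keep the example short.

```python
from sklearn.linear_model import LinearRegression, LogisticRegression

# Classification: the output is a category.
# Hypothetical features: [income in thousands, number of existing debts].
X_cls = [[20, 5], [25, 4], [30, 6], [80, 1], [90, 0], [100, 1]]
y_cls = ["high risk", "high risk", "high risk",
         "low risk", "low risk", "low risk"]

classifier = LogisticRegression(max_iter=1000)
classifier.fit(X_cls, y_cls)
risk = classifier.predict([[85, 1]])[0]
print("predicted category:", risk)

# Regression: the output is a continuous number.
# Hypothetical data: years of experience -> salary in thousands.
X_reg = [[1], [2], [3], [4], [5]]
y_reg = [30, 35, 40, 45, 50]

regressor = LinearRegression()
regressor.fit(X_reg, y_reg)
salary = regressor.predict([[6]])[0]
print("predicted salary:", round(salary, 2))
```

The same `fit` and `predict` calls are used in both cases; what changes is the kind of output the model is trained to produce.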
The different Supervised Learning algorithms used in Machine Learning are:
- Linear Regression
- Polynomial Regression
- Logistic Regression
- Decision Tree
- Random Forest
- Naïve Bayes
- Support Vector Machines
- Kernel SVM
- K-NN
Unsupervised Learning
In this type of learning, we only have input data available, without any
corresponding output variable to predict. There is little or no prior idea of
what the result should look like in unsupervised learning.
The categories of the data are unknown in unsupervised learning. The aim of
this learning is to identify the regularities, structures or patterns in the
data and divide the data into similar groups in order to learn more about it.
Unsupervised learning can be achieved through ‘clustering’. The goal of
clustering is to find similar patterns, or clusters, in a given input. For
example, it can be used to study customer behaviour towards the different
products launched in the market. This helps to analyse the market value of a
product and to maintain customer relationships by providing good service.
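As a rough sketch of clustering, the example below groups a handful of customers with sklearn's `KMeans`. The data is made up for illustration (each row is a hypothetical customer described by monthly spend and number of visits), and the choice of two clusters is an assumption; in practice the number of clusters usually has to be explored.

```python
from sklearn.cluster import KMeans

# Hypothetical customers: [monthly spend, number of visits]. No labels are given.
customers = [
    [1.0, 2.0], [1.5, 1.8], [1.0, 1.0],   # low spend, few visits
    [8.0, 8.0], [9.0, 9.5], [8.5, 9.0],   # high spend, many visits
]

# Ask for two groups; the algorithm decides which customer goes where.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
labels = kmeans.fit_predict(customers)

for customer, label in zip(customers, labels):
    print(customer, "-> cluster", label)
```

Note that the cluster numbers themselves carry no meaning; only the grouping matters, and it is up to us to interpret what each group represents.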
Unsupervised learning also helps in face recognition, by grouping images based
on their pixels, colour, size, etc.
Thanks all for reading this blog!!
You can find more information on these topics in the reference below, which helped me gain knowledge before writing this blog.