Fundamentals of Machine Learning
Unveiling the Magic of Machine Learning: A Beginner's Guide
Hare Krishna ☀ In this blog, I'll be introducing machine learning concepts that every machine learning prodigy should know. This blog is completely beginner-friendly and also covers applications of machine learning, real-life examples, and much more, so stick with me.
Let's get started!!!
Introduction to Machine Learning
What is machine learning?
[Machine Learning is the] field of study that gives computers the ability to learn
without being explicitly programmed.
—Arthur Samuel, 1959
In the traditional method, you write a program and provide data, and the computer generates the output for you. This is what we learned in school. Things get interesting when you instead give the computer the data and the expected output: the machine learns from these examples and generates a program that solves the problem.
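To make the contrast concrete, here is a minimal sketch in plain Python. The temperature-conversion task, the example numbers, and the helper names (`fahrenheit_by_rule`, `fit_line`, `fahrenheit_learned`) are my own assumptions for illustration. The first function is the traditional approach, where we write the rule ourselves. The second half only receives example inputs and outputs and works out the rule with a simple least-squares fit.

```python
# Traditional programming: we write the rule ourselves.
def fahrenheit_by_rule(celsius):
    return celsius * 9 / 5 + 32


# Machine learning: we only supply example inputs and their known outputs.
celsius_examples = [-10.0, 0.0, 10.0, 20.0, 30.0]
fahrenheit_examples = [14.0, 32.0, 50.0, 68.0, 86.0]


def fit_line(xs, ys):
    """Least-squares fit of y = slope * x + intercept for one input feature."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    variance = sum((x - mean_x) ** 2 for x in xs)
    slope = covariance / variance
    intercept = mean_y - slope * mean_x
    return slope, intercept


# The "program" the machine generated is just these two learned numbers.
slope, intercept = fit_line(celsius_examples, fahrenheit_examples)


def fahrenheit_learned(celsius):
    return slope * celsius + intercept


print(fahrenheit_by_rule(25), fahrenheit_learned(25))
```

Both functions should give practically the same answer, even though nobody typed the conversion formula into the second one.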
Why is machine learning required?
For example, your inbox contains many spam emails and you want to filter them out, so you start doing it manually. While moving spam emails, you notice that they repeatedly mention words like "BIG", "HUGE", "LARGE", and "MEGA". Filtering these emails by hand is tedious, and this is where machine learning comes into the picture. You provide the machine with these emails as data, along with the output saying which ones are spam. The machine learns from them and starts filtering such emails on its own.
Thus machine learning reduces manual work and executes tasks quickly. It is a good fit for:
Problems for which existing solutions require a lot of hand-tuning
Complex problems for which there is no good solution at all using a traditional approach
Fluctuating environments
Getting insights about complex problems and large amounts of data.
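The spam filter from the email example earlier in this section can be sketched with scikit-learn. This is only a toy under stated assumptions: the six emails and their labels are made up by me, and the choice of `CountVectorizer` (word counts) with `MultinomialNB` (a Naive Bayes classifier) is one common option, not the only way to build such a filter.

```python
from sklearn.feature_extraction.text import CountVectorizer
from sklearn.naive_bayes import MultinomialNB
from sklearn.pipeline import make_pipeline

# Made-up training emails (the data) and their known answers (the output).
emails = [
    "BIG sale HUGE discount today",
    "MEGA offer LARGE prize inside",
    "HUGE MEGA deal just for you",
    "Meeting moved to Monday morning",
    "Please review the attached report",
    "Lunch tomorrow with the team",
]
labels = ["spam", "spam", "spam", "ham", "ham", "ham"]

# Turn each email into word counts, then learn which words go with which label.
spam_filter = make_pipeline(CountVectorizer(), MultinomialNB())
spam_filter.fit(emails, labels)

# Ask the trained filter about emails it has never seen.
new_emails = ["MEGA prize, BIG discount", "Monday meeting report"]
predictions = spam_filter.predict(new_emails)
print(list(zip(new_emails, predictions)))
```

Notice that nobody wrote a rule like "if the email contains MEGA, mark it as spam". The filter picked that up from the labeled examples.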
Types of Machine Learning
There are 4 types of Machine Learning:
Supervised Learning
It's like teaching a computer using examples. You show it a bunch of labeled pictures of cats and dogs, and it learns to tell them apart.
This is the most commonly used type of machine learning. It's used in things like spam email filters, recommendation systems (like Netflix or Amazon suggesting movies or products), and image recognition systems.
Unsupervised Learning
This is like when the computer figures out patterns on its own. Imagine you give it a bunch of mixed-up colored beads, and it groups them based on their colors.
While not as common as supervised learning, it's used in clustering similar items together, like grouping customers with similar buying habits for marketing.
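Here is a small sketch of the customer-grouping idea using k-means clustering from scikit-learn. The customer numbers are invented, and k-means with two clusters is my assumption for the example; the blog does not prescribe a specific clustering algorithm. Note that we never tell the algorithm which customer belongs to which group.

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical customers: [orders per month, average basket value].
customers = np.array(
    [
        [1, 15],
        [2, 20],
        [1, 18],
        [9, 95],
        [10, 110],
        [8, 100],
    ],
    dtype=float,
)

# No labels are provided; the algorithm groups similar rows on its own.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0)
cluster_ids = kmeans.fit_predict(customers)

print(cluster_ids)
```

The cluster numbers themselves (0 or 1) carry no meaning. What matters is that the occasional small spenders end up in one group and the frequent big spenders in the other.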
Semi-supervised Learning
It's a bit like having a few answers in a big test. Instead of all the answers, you're given some, and you have to figure out the rest by using hints and patterns you've learned.
Reinforcement Learning
Imagine teaching a computer to play a video game. It learns by getting rewards when it does something right and punishments when it does something wrong, just like you learn by getting points in a game. It's like training a dog to do tricks, but with a computer and a game instead of a real dog.
This can also be quite complex, as it involves an agent learning from trial and error while interacting with an environment. It's often used in advanced applications like training robots or self-driving cars.
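A full video game is too big for a blog snippet, so here is a much simpler stand-in for the same trial-and-error idea: an epsilon-greedy "multi-armed bandit" in plain Python. Everything in it (the function name `run_bandit`, the three reward probabilities, the step count) is an assumption of mine for illustration. The agent does not know which of the three actions pays off most often. It tries them, collects rewards, and gradually prefers the best one.

```python
import random


def run_bandit(reward_probabilities, steps=3000, epsilon=0.1, seed=0):
    """Epsilon-greedy agent: mostly exploit the best-looking arm, sometimes explore."""
    rng = random.Random(seed)
    n_arms = len(reward_probabilities)
    pulls = [0] * n_arms
    value_estimates = [0.0] * n_arms

    for _ in range(steps):
        if rng.random() < epsilon:
            arm = rng.randrange(n_arms)  # explore: try a random action
        else:
            # exploit: pick the action with the highest estimated reward so far
            arm = max(range(n_arms), key=lambda a: value_estimates[a])

        # The environment answers with a reward (1) or no reward (0).
        reward = 1.0 if rng.random() < reward_probabilities[arm] else 0.0

        # Update the running average reward of the chosen action.
        pulls[arm] += 1
        value_estimates[arm] += (reward - value_estimates[arm]) / pulls[arm]

    return value_estimates, pulls


estimates, pulls = run_bandit([0.1, 0.5, 0.9])
best_arm = max(range(len(estimates)), key=lambda a: estimates[a])
print("estimated rewards:", estimates)
print("times each action was tried:", pulls)
print("preferred action:", best_arm)
```

The `epsilon` setting controls how often the agent experiments instead of repeating what already works, which is the classic explore-versus-exploit trade-off in reinforcement learning.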
Intel case study
Intel, a major chip manufacturer, is involved in machine learning in its own way: through hardware. Many computers use Intel chips for core processing. Intel's Nervana chips, designed for data center servers, were built to speed up the heavy data processing that machine learning requires.
Key Machine Learning Concepts
Data, Features, and Labels
Data refers to information or facts that are collected, stored, and processed for various purposes.
The main challenges you can face in ML are a 'bad algorithm' or 'bad data'. The good news is that good data can make up for a weak algorithm: data >>>> algorithm.
Obviously, if your training data is full of errors, outliers, and noise (e.g., due to poor-quality measurements), it will make it harder for the system to detect the underlying patterns, so your system is less likely to perform well. It is often well worth the effort to spend time cleaning up your training data. The truth is, most data scientists spend a significant part of their time doing just that.
For example, if a customer didn't fill in the age field, that leads to missing data. You then have the option to ignore these missing values or to fill them in with the median or average value for that class.
As the saying goes: garbage in, garbage out. Your system will only be capable of learning if the training data contains enough relevant features and not too many irrelevant ones. A critical part of the success of a Machine Learning project is coming up with a good set of features to train on. This process is called feature engineering.
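As a small illustration of feature engineering, here is a plain-Python sketch that turns a raw email into a few numbers a model could train on. The function name `extract_features` and the three features are my own choices for the spam example, not a standard recipe.

```python
SPAM_WORDS = {"big", "huge", "large", "mega"}


def extract_features(email_text):
    """Turn raw email text into a small dictionary of numeric features."""
    words = email_text.split()
    n_words = len(words)

    # How many of the suspicious words appear (ignoring case and punctuation)?
    spam_word_count = sum(
        1 for word in words if word.strip('.,!?"').lower() in SPAM_WORDS
    )

    # What share of the words is written entirely in capital letters?
    uppercase_ratio = (
        sum(1 for word in words if word.isupper()) / n_words if n_words else 0.0
    )

    return {
        "n_words": n_words,
        "spam_word_count": spam_word_count,
        "uppercase_ratio": uppercase_ratio,
    }


features = extract_features("HUGE discount, MEGA savings today")
print(features)
```

A feature like the sender's favorite color would be irrelevant here, while the count of shouting words is probably useful. Choosing between the two is exactly the judgment that feature engineering is about.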
Models and Algorithms
An algorithm is like a set of instructions that we give to a computer to help it learn or perform a specific task: step-by-step instructions on how to learn and generate programs from data. Just as we follow a recipe when cooking a dish from ingredients, in ML we write these recipes for the machine to digest data and learn from it.
A model in machine learning is like a smart assistant that learns from data. Earlier you gave the machine instructions to follow. Now the machine has learned that this is a cat and that is not a cat. This learning of the machine is the model.
So, in simple terms, an algorithm is the recipe, and the model is the computer's learning from the recipe to do tasks like recognizing cats, predicting the weather, or playing games.
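In scikit-learn the recipe and the learned result are easy to tell apart. In this sketch the measurements and labels are invented, and picking a decision tree is just my assumption for the example. `DecisionTreeClassifier` is the algorithm (the recipe), and the object you get back after calling `fit` on data is the model.

```python
from sklearn.tree import DecisionTreeClassifier

# Hypothetical measurements: [weight in kg, height in cm].
animals = [
    [4.0, 25.0],
    [5.0, 23.0],
    [3.5, 24.0],
    [30.0, 60.0],
    [25.0, 55.0],
    [35.0, 65.0],
]
labels = ["cat", "cat", "cat", "not cat", "not cat", "not cat"]

# The algorithm: a recipe for learning, which knows nothing about cats yet.
algorithm = DecisionTreeClassifier(random_state=0)

# The model: what the recipe produced after digesting the data.
model = algorithm.fit(animals, labels)

predictions = model.predict([[4.2, 24.0], [28.0, 58.0]])
print(predictions)
```

Feed the same recipe different data, for example weather records, and you get a different model out of it.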
Data Preparation
Collecting Data
I recently tweeted about some famous open data repositories.
There are also 4 ways to gather data:
CSV (comma-separated values)
JSON/SQL
Fetch from API
Web scraping
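The first two ways in the list above can be tried without downloading anything. This sketch uses pandas with tiny in-memory stand-ins: the CSV text, the JSON text, and the SQLite table named `people` are all made up for the example. Fetching from an API and web scraping need a real URL and a network connection, so they are left out here.

```python
import io
import sqlite3

import pandas as pd

# CSV: normally a file path, here an in-memory text standing in for a file.
csv_text = "name,age\nAsha,31\nRavi,27\n"
from_csv = pd.read_csv(io.StringIO(csv_text))

# JSON: a list of records, again kept in memory for the example.
json_text = '[{"name": "Asha", "age": 31}, {"name": "Ravi", "age": 27}]'
from_json = pd.read_json(io.StringIO(json_text))

# SQL: a throwaway in-memory SQLite database with one small table.
connection = sqlite3.connect(":memory:")
connection.execute("CREATE TABLE people (name TEXT, age INTEGER)")
connection.executemany(
    "INSERT INTO people VALUES (?, ?)", [("Asha", 31), ("Ravi", 27)]
)
from_sql = pd.read_sql_query("SELECT name, age FROM people", connection)
connection.close()

print(from_csv)
print(from_json)
print(from_sql)
```

Whatever the source, the data ends up in the same kind of DataFrame, so the cleaning steps that follow work the same way.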
In the real world, companies will provide you with datasets, so you don't have to worry about collecting data, but many ML projects practice on CSV files. How to fetch data with each of the above methods could be a separate blog. If you want that blog, please comment!!
Cleaning Data
Data cleaning is the process of removing or correcting inaccurate, corrupt, or improperly formatted data and removing duplication within a dataset.
There are many ways to clean data using pandas. I will show some of them, and then you can try them out!!
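Step 1: Detecting missing values using pandas and seaborn
import pandas as pd
import seaborn as sns
df = pd.read_csv("data.csv")  # hypothetical file name, replace it with your own dataset
df.isnull().sum()  # count the missing values in each column
sns.heatmap(df.isnull())  # visualize the missing values, which helps with large data
# Step 2: Handling missing values (each call returns a new DataFrame)
df.interpolate()  # fill a numeric gap from the values above and below it (linear interpolation)
df.dropna()  # drop rows with missing values (pass axis=1 to drop columns instead)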
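To see these cleaning calls do something, here is a small self-contained sketch. The `readings` table is invented for the example, with one missing temperature and one duplicated row, and the `drop_duplicates` step covers the "removing duplication" part of the definition of data cleaning given earlier.

```python
import numpy as np
import pandas as pd

# Made-up sensor readings: one missing temperature and one duplicated row.
readings = pd.DataFrame(
    {
        "hour": [1, 2, 3, 4, 4],
        "temperature": [20.0, np.nan, 24.0, 25.0, 25.0],
    }
)

# Detect: how many values are missing in each column?
missing_per_column = readings.isnull().sum()

# Remove exact duplicate rows.
deduplicated = readings.drop_duplicates()

# Choice A: fill the gap from its neighbors (linear interpolation).
interpolated = deduplicated.interpolate()

# Choice B: drop any row that has a missing value.
dropped = deduplicated.dropna()

print(missing_per_column)
print(interpolated)
print(dropped)
```

Interpolation makes sense when neighboring rows are related, like readings over time. For unrelated rows, such as separate customers, dropping or filling with a median is usually the safer choice.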
Evaluating Machine Learning Models
How to Measure Model Performance
There are various techniques to measure model performance, but I'll be discussing how to check it using a machine learning library (i.e., sklearn).
# y_test holds the true labels and y_pred the model's predictions (both come from your own train/test split)
from sklearn.metrics import (
    accuracy_score,
    precision_score,
    recall_score,
    f1_score,
    roc_auc_score,
)
accuracy = accuracy_score(y_test, y_pred)
precision = precision_score(y_test, y_pred)
recall = recall_score(y_test, y_pred)
f1 = f1_score(y_test, y_pred)
# ROC AUC is usually more informative with predicted probabilities or scores than with hard labels
roc_auc = roc_auc_score(y_test, y_pred)
print(f'Accuracy: {accuracy}')
print(f'Precision: {precision}')
print(f'Recall: {recall}')
print(f'F1-score: {f1}')
print(f'ROC AUC: {roc_auc}')
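To see what these scores actually count, here is a plain-Python sketch that computes accuracy, precision, recall, and F1 by hand. The eight labels in `y_test` and `y_pred` are made up for illustration (1 means positive, 0 means negative). On the same inputs, the sklearn functions should give the same numbers.

```python
# Made-up true labels and predictions for a binary problem.
y_test = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]

pairs = list(zip(y_test, y_pred))

# Count the four possible outcomes.
true_positives = sum(1 for truth, guess in pairs if truth == 1 and guess == 1)
true_negatives = sum(1 for truth, guess in pairs if truth == 0 and guess == 0)
false_positives = sum(1 for truth, guess in pairs if truth == 0 and guess == 1)
false_negatives = sum(1 for truth, guess in pairs if truth == 1 and guess == 0)

# Accuracy: share of all predictions that were right.
accuracy = (true_positives + true_negatives) / len(pairs)
# Precision: of everything predicted positive, how much really was positive?
precision = true_positives / (true_positives + false_positives)
# Recall: of everything really positive, how much did we find?
recall = true_positives / (true_positives + false_negatives)
# F1: harmonic mean of precision and recall.
f1 = 2 * precision * recall / (precision + recall)

print(f"Accuracy: {accuracy}")
print(f"Precision: {precision}")
print(f"Recall: {recall}")
print(f"F1-score: {f1}")
```

Precision and recall matter most when the classes are unbalanced. A spam filter that marks nothing as spam can still reach a high accuracy if spam is rare, but its recall would be zero.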
Introduction to Deep Learning
Deep learning is a subset of machine learning that focuses on artificial neural networks, which are inspired by the human brain. It involves training complex models, called deep neural networks, to recognize patterns and make predictions from large amounts of data. These networks have multiple layers (hence "deep"), each learning different features from the data.
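To show what "multiple layers" means, here is a tiny two-layer network in plain Python. Please read it as a sketch under strong assumptions: the weights are hand-picked by me so that the network computes XOR (output 1 only when exactly one input is 1), whereas a real deep learning model would learn its weights from data during training. The helper names `relu`, `dense_layer`, and `tiny_network` are mine.

```python
def relu(value):
    """Activation function: keep positive values, replace negatives with 0."""
    return max(0.0, value)


def dense_layer(inputs, weights, biases):
    """One layer: every output is a weighted sum of all inputs plus a bias."""
    return [
        sum(weight * x for weight, x in zip(row, inputs)) + bias
        for row, bias in zip(weights, biases)
    ]


# Hand-picked weights (an assumption for illustration, not learned values).
HIDDEN_WEIGHTS = [[1.0, 1.0], [1.0, 1.0]]
HIDDEN_BIASES = [0.0, -1.0]
OUTPUT_WEIGHTS = [[1.0, -2.0]]
OUTPUT_BIASES = [0.0]


def tiny_network(x1, x2):
    # Layer 1 builds two intermediate features from the raw inputs.
    hidden = [relu(v) for v in dense_layer([x1, x2], HIDDEN_WEIGHTS, HIDDEN_BIASES)]
    # Layer 2 combines those features into the final answer.
    output = dense_layer(hidden, OUTPUT_WEIGHTS, OUTPUT_BIASES)
    return output[0]


for a in (0, 1):
    for b in (0, 1):
        print(a, b, "->", tiny_network(a, b))
```

A single layer cannot compute XOR, but two layers can, because the second layer works with the features produced by the first. Stacking layers so that each one builds on the previous one is the idea behind "deep".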
Application of DL
One prominent real-life application of deep learning is in Autonomous Driving. Deep neural networks are used in self-driving cars to analyze data from sensors, cameras, and lidar to make real-time decisions, such as detecting pedestrians, recognizing road signs, and steering the vehicle safely. This technology aims to improve road safety and reduce accidents.
Next Steps and Resources
In this blog, you will not find hardcore machine learning algorithms. Instead, I have explained the foundational knowledge that will act as the basis of your machine learning career.
Here are some Resources that I found useful:
YouTube
Machine Learning Specialization by Andrew Ng
Books
Language - R
The Hundred-page Machine Learning Book ~ Andriy Burkov
An Introduction to Statistical Learning ~ Gareth James, Daniela Witten, Trevor Hastie, Robert Tibshirani
Language - Python
Hands-On Machine Learning with Scikit-Learn, Keras, and TensorFlow ~ Aurélien Géron
Conclusion
In conclusion, we've embarked on a journey to understand the essence of machine learning, delving into its significance, types, key concepts, and the crucial role of data preparation. We've explored the art of measuring machine learning model performance, and we've even ventured into the fascinating world of deep learning and its real-world applications.
To all the budding machine learning enthusiasts and students out there, remember that machine learning is vast, ever-evolving, and full of exciting opportunities, and that challenges are the stepping stones to success. Embrace the journey, stay curious, and remember: persistence pays off. Your future in this dynamic field is bright, so keep learning and keep pushing your boundaries! 🚀🤖