Decision Trees, Random Forests, Bagging & XGBoost: R Studio
Decision Trees and Ensembling techniques in R Studio. Bagging, Random Forest, GBM, AdaBoost & XGBoost in R programming
Rating: 4.7/5 (152 ratings) 54,822 students
Instructor(s): Start-Tech Academy
Last update: 2022-01-15
What you’ll learn
- Solid understanding of decision trees, bagging, Random Forest and Boosting techniques in R studio
- Understand the business scenarios where decision tree models are applicable
- Tune decision tree model’s hyperparameters and evaluate its performance.
- Use decision trees to make predictions
- Use R programming language to manipulate data and make statistical computations.
- Implementation of Gradient Boosting, AdaBoost and XGBoost in R programming language
- Students will need to install R Studio software but we have a separate lecture to help you install the same
You’re looking for a complete Decision tree course that teaches you everything you need to create a Decision tree/ Random Forest/ XGBoost model in R, right?
You’ve found the right Decision Trees and tree based advanced techniques course!
After completing this course you will be able to:
Identify the business problems that can be solved using Decision Tree, Random Forest and XGBoost models.
Have a clear understanding of advanced decision tree based algorithms such as Random Forest, Bagging, AdaBoost and XGBoost.
Create a tree based (Decision Tree, Random Forest, Bagging, AdaBoost and XGBoost) model in R and analyze its results.
Confidently practice, discuss and understand Machine Learning concepts
How will this course help you?
A Verifiable Certificate of Completion is presented to all students who undertake this Machine learning advanced course.
If you are a business manager, an executive, or a student who wants to learn and apply machine learning to real-world business problems, this course will give you a solid base by teaching you some of the advanced techniques of machine learning: Decision Tree, Random Forest, Bagging, AdaBoost and XGBoost.
Why should you choose this course?
This course covers all the steps that one should take while solving a business problem using decision trees.
Most courses focus only on teaching how to run the analysis, but we believe that what happens before and after running the analysis is even more important. Before running the analysis, it is very important that you have the right data and do some pre-processing on it. After running the analysis, you should be able to judge how good your model is and interpret the results so that they actually help your business.
What makes us qualified to teach you?
The course is taught by Abhishek and Pukhraj. As managers in a global analytics consulting firm, we have helped businesses solve their business problems using machine learning techniques, and we have used our experience to include the practical aspects of data analysis in this course.
We are also the creators of some of the most popular online courses – with over 150,000 enrollments and thousands of 5-star reviews like these ones:
This is very good, i love the fact the all explanation given can be understood by a layman – Joshua
Thank you Author for this wonderful course. You are the best and this course is worth any price. – Daisy
Teaching our students is our job and we are committed to it. If you have any questions about the course content, practice sheet or anything related to any topic, you can always post a question in the course or send us a direct message.
Download Practice files, take Quizzes, and complete Assignments
With each lecture, there are class notes attached for you to follow along. You can also take quizzes to check your understanding of concepts. Each section contains a practice assignment for you to practically implement your learning.
What is covered in this course?
This course teaches you all the steps of creating decision tree based models, which are some of the most popular machine learning models, to solve business problems.
Below are the contents of this course:
Section 1 – Introduction to Machine Learning
In this section we will learn what machine learning means and what the different terms associated with machine learning mean. You will see some examples so that you understand what machine learning actually is. The section also covers the steps involved in building a machine learning model, not just linear models but any machine learning model.
Section 2 – R basics
This section will help you set up R and R Studio on your system, and it will teach you how to perform some basic operations in R.
Section 3 – Pre-processing and Simple Decision trees
In this section you will learn what actions you need to take to prepare the data for analysis. These steps are very important for creating a meaningful model.
In this section, we will start with the basic theory of decision tree then we cover data pre-processing topics like missing value imputation, variable transformation and Test-Train split. In the end we will create and plot a simple Regression decision tree.
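The steps named above (missing value imputation, variable transformation, test-train split, then a plotted regression tree) can be sketched as follows. This is a minimal illustration, not the course's own code: it assumes the built-in `mtcars` data in place of the course data set, uses the `rpart` package as the tree implementation, and the simulated missing values, the `log_disp` variable and the 80/20 split are all choices made for the example.

```r
# Minimal sketch: impute, transform, split, and fit a regression tree.
# Assumptions: mtcars stands in for the course data set, and rpart
# (shipped with standard R distributions) is the tree package.
library(rpart)

set.seed(42)                      # make the random split reproducible
cars <- mtcars

# Simulate a few missing values, then impute them with the column mean
cars$hp[c(3, 7, 11)] <- NA
cars$hp[is.na(cars$hp)] <- mean(cars$hp, na.rm = TRUE)

# Variable transformation: log of engine displacement
cars$log_disp <- log(cars$disp)

# Test-train split: 80% of the rows go to training
train_idx <- sample(nrow(cars), size = floor(0.8 * nrow(cars)))
train <- cars[train_idx, ]
test  <- cars[-train_idx, ]

# Regression tree (method = "anova") predicting miles per gallon
fit <- rpart(mpg ~ hp + wt + log_disp, data = train, method = "anova",
             control = rpart.control(minsplit = 10))

# Plot the tree and label its nodes (skipped if the tree never split)
if (nrow(fit$frame) > 1) {
  plot(fit, margin = 0.1)
  text(fit, use.n = TRUE)
}

# Evaluate on the held-out rows with mean squared error
pred <- predict(fit, newdata = test)
mse  <- mean((test$mpg - pred)^2)
print(mse)
```

Mean imputation is only one option; median imputation or a model-based approach may suit skewed variables better.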
Section 4 – Simple Classification Tree
In this section we will expand our knowledge of regression decision trees to classification trees. We will also learn how to create a classification tree in R.
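As a rough illustration of the move from regression to classification, the sketch below fits a classification tree with `rpart` by switching to `method = "class"`. The built-in `iris` data and the 120/30 split are assumptions made for the example and are not the course data set.

```r
# Minimal sketch of a classification tree.
# Assumptions: iris stands in for the course's classification data set.
library(rpart)

set.seed(1)
train_idx <- sample(nrow(iris), size = 120)   # 120 training rows, 30 test rows
train <- iris[train_idx, ]
test  <- iris[-train_idx, ]

# method = "class" tells rpart to grow a classification tree
clf <- rpart(Species ~ ., data = train, method = "class")

# Predict class labels for the held-out rows
pred_class <- predict(clf, newdata = test, type = "class")

# Confusion matrix and overall accuracy
conf_mat <- table(actual = test$Species, predicted = pred_class)
accuracy <- sum(diag(conf_mat)) / sum(conf_mat)
print(conf_mat)
print(accuracy)
```

The confusion matrix shows which classes are mistaken for each other, which a single accuracy figure hides.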
Sections 5, 6 and 7 – Ensemble techniques
In these sections we will start our discussion about advanced ensemble techniques for decision trees. Ensemble techniques are used to improve the stability and accuracy of machine learning algorithms. In this course we will discuss Random Forest, Bagging, Gradient Boosting, AdaBoost and XGBoost.
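To make the idea of an ensemble concrete, here is a hand-rolled bagging sketch: many trees are grown on bootstrap resamples of the training data and their predictions are averaged. It is an illustration only, built on `rpart` and the built-in `mtcars` data; the number of trees and the `rpart.control` settings are assumptions, and in practice a dedicated package would typically be used instead.

```r
# Minimal sketch of bagging with regression trees.
# Assumptions: mtcars as example data, 50 trees, deliberately deep trees.
library(rpart)

set.seed(7)
train_idx <- sample(nrow(mtcars), size = 25)
train <- mtcars[train_idx, ]
test  <- mtcars[-train_idx, ]

n_trees <- 50

# Each column of preds holds one tree's predictions for the test rows
preds <- sapply(seq_len(n_trees), function(i) {
  # Bootstrap sample: draw training rows with replacement
  boot <- train[sample(nrow(train), replace = TRUE), ]
  tree <- rpart(mpg ~ ., data = boot, method = "anova",
                control = rpart.control(minsplit = 5, cp = 0))
  predict(tree, newdata = test)
})

# Bagged prediction: average across the trees for each test row
bagged_pred <- rowMeans(preds)
bagged_mse  <- mean((test$mpg - bagged_pred)^2)
print(bagged_mse)
```

Random Forest follows the same recipe but also considers only a random subset of predictors at each split, which decorrelates the trees. Boosting methods differ more: they grow trees sequentially, each one focusing on the errors of the previous ones.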
By the end of this course, your confidence in creating a Decision tree model in R will soar. You’ll have a thorough understanding of how to use Decision tree modelling to create predictive models and solve business problems.
Go ahead and click the enroll button, and I’ll see you in lesson 1!
Below is a list of popular FAQs from students who want to start their machine learning journey:
What is Machine Learning?
Machine Learning is a field of computer science which gives the computer the ability to learn without being explicitly programmed. It is a branch of artificial intelligence based on the idea that systems can learn from data, identify patterns and make decisions with minimal human intervention.
What are the steps I should follow to be able to build a Machine Learning model?
You can divide your learning process into 4 parts:
Statistics and Probability – Implementing machine learning techniques requires basic knowledge of statistics and probability concepts. The second section of the course covers this part.
Understanding of Machine Learning – The fourth section helps you understand the terms and concepts associated with machine learning and gives you the steps to follow to build a machine learning model.
Programming Experience – A significant part of machine learning is programming. Python and R clearly stand out as the leaders in recent years. The third section will help you set up the R environment and teach you some basic operations. In later sections there is a video on how to implement in R each concept taught in the theory lectures.
Understanding of models – The fifth and sixth sections cover classification models, and with each theory lecture comes a corresponding practical lecture where we actually run each query with you.
Why use R for Machine Learning?
Understanding R is one of the valuable skills needed for a career in machine learning. Below are some reasons why you should learn machine learning in R:
1. It’s a popular language for Machine Learning at top tech firms. Almost all of them hire data scientists who use R. Facebook, for example, uses R to do behavioral analysis with user post data. Google uses R to assess ad effectiveness and make economic forecasts. And by the way, it’s not just tech firms: R is in use at analysis and consulting firms, banks and other financial institutions, academic institutions and research labs, and pretty much everywhere else data needs analyzing and visualizing.
2. Learning the data science basics is arguably easier in R. R has a big advantage: it was designed specifically with data manipulation and analysis in mind.
3. Amazing packages that make your life easier. Because R was designed with statistical analysis in mind, it has a fantastic ecosystem of packages and other resources that are great for data science.
4. Robust, growing community of data scientists and statisticians. As the field of data science has exploded, R has exploded with it, becoming one of the fastest-growing languages in the world (as measured by StackOverflow). That means it’s easy to find answers to questions and community guidance as you work your way through projects in R.
5. Put another tool in your toolkit. No one language is going to be the right tool for every job. Adding R to your repertoire will make some projects easier – and of course, it’ll also make you a more flexible and marketable employee when you’re looking for jobs in data science.
What is the difference between Data Mining, Machine Learning, and Deep Learning?
Put simply, machine learning uses the same algorithms and techniques as data mining, except that the kinds of predictions vary. While data mining discovers previously unknown patterns and knowledge, machine learning reproduces known patterns and knowledge, and further automatically applies that information to data, decision-making, and actions.
Deep learning, on the other hand, uses advanced computing power and special types of neural networks and applies them to large amounts of data to learn, understand, and identify complicated patterns. Automatic language translation and medical diagnoses are examples of deep learning.
Who this course is for
- People pursuing a career in data science
- Working Professionals beginning their Data journey
- Statisticians needing more practical experience
- Anyone curious to master the Decision Tree technique from beginner to advanced in a short span of time
Course content
- Welcome to the Course!
- Course Resources
- Setting up R Studio and R Crash Course
- Installing R and R studio
- This is a milestone!
- Basics of R and R studio
- Packages in R
- Inputting data part 1: Inbuilt datasets of R
- Inputting data part 2: Manual data entry
- Inputting data part 3: Importing from CSV or Text files
- Creating Barplots in R
- Creating Histograms in R
- Machine Learning Basics
- Introduction, Key concepts and Examples
- Steps in building an ML model
- Simple Decision trees
- Basics of Decision Trees
- Understanding a Regression Tree
- The stopping criteria for controlling tree growth
- The Data set for the Course
- Importing the Data set into R
- Splitting Data into Test and Train Set in R
- More about test-train split
- Building a Regression Tree in R
- Pruning a tree
- Pruning a Tree in R
- Simple Classification Tree
- Classification Trees
- The Data set for Classification problem
- Building a classification Tree in R
- Advantages and Disadvantages of Decision Trees
- Ensemble technique 1 – Bagging
- Bagging in R
- Ensemble technique 2 – Random Forest
- Random Forest technique
- Random Forest in R
- Ensemble technique 3 – Boosting
- Boosting techniques
- Gradient Boosting in R
- AdaBoosting in R
- XGBoosting in R
- Add-on 1: Preprocessing and Preparing Data before making any model
- Gathering Business Knowledge
- Data Exploration
- The Data and the Data Dictionary
- Importing the dataset into R
- Univariate Analysis and EDD
- EDD in R
- Outlier Treatment
- Outlier Treatment in R
- Missing Value imputation
- Missing Value imputation in R
- Seasonality in Data
- Bi-variate Analysis and Variable Transformation
- Variable transformation in R
- Non Usable Variables
- Dummy variable creation: Handling qualitative data
- Dummy variable creation in R
- Correlation Matrix and cause-effect relationship
- Correlation Matrix in R
- Bonus Section
- The final milestone!
- Bonus Lecture