Top 5 Data Science Courses at Udemy for Beginners

“Data Science is the sexiest job of the 21st century” is something you hear many people say nowadays. Data scientist is a job title for an employee who excels at analyzing data, particularly large amounts of data (also called Big Data), to help a business gain a competitive edge. Normally, a data scientist possesses a combination of analytic, machine learning, data mining, and statistical skills, as well as experience with algorithms and coding. Below are some recommended courses that you can take to kick-start your career as a data scientist. The suggested online courses are aimed at beginners and have received good ratings and great feedback from many of their participants. They are offered by Udemy, the world’s online learning marketplace, where 6 million+ students are taking courses in everything from programming to yoga to photography, and much more. Each course is taught by an expert instructor and is available on demand, so students can learn at their own pace, on their own time, and on any device.

1. Want to be a Data Scientist?

Learn what Data Science is all about, whether you are a good fit for the field, and how to become a data scientist.

Over 18 lectures and 3 hours of content.
Appreciate what Data Science is all about.
Understand how a typical data science project gets done.
Decide if this job is the right fit for you.
Learn the skills required to be a data scientist and how to acquire them.

2. Applied Data Science with Python

Learn how to execute an end-to-end data science project and deliver business results.

Over 43 lectures and 8.5 hours of content!
Appreciate what Data Science really is
Understand the Data Science Life Cycle
Learn to use Python for executing Data Science Projects
Master the application of Analytics and Machine Learning techniques

3. Applied Data Science with R

Learn how to execute an end-to-end data science project and deliver business results.

Over 54 lectures and 11 hours of content!
Appreciate what Data Science really is
Understand the Data Science Life Cycle
Learn to use R for executing Data Science Projects
Master the application of Analytics and Machine Learning techniques
Gain insight into how Data Science works through end-to-end use cases.

4. Learn Hadoop, MapReduce and BigData from Scratch

A Complete Guide to Learn and Master the Popular Big Data Technology.

Over 74 lectures and 15.5 hours of content!
Become literate in Big Data terminology and Hadoop.
Understand the Distributed File Systems architecture and any implementation such as Hadoop Distributed File System or Google File System
Use the HDFS shell
Use the Cloudera, Hortonworks and Apache Bigtop virtual machines for Hadoop code development and testing
Configure, execute and monitor a Hadoop Job

5. Become a Certified Hadoop Developer | Training | Tutorial

Learn Hadoop from an expert, get certified & bag one of the highest paying IT jobs in current times.

Over 49 lectures and 5 hours of content!
Code of all the programs discussed.
Builds strong MapReduce and Hadoop fundamentals.
200+ highly relevant questions to prepare for certification exams like Cloudera and Hortonworks.
Massive Q&A repository.


Top 10 Data Mining Algorithm Resources 2013

Below are the top 10 search results for the keyword “Algorithms” from our customized data mining search engine, powered by Google CSE:

The Two Most Important Algorithms in Predictive Modeling Today …
Wouldn’t it be great if there were just two algorithms which could handle most of your predictive modeling needs? It turns out that actually this is the case.

Why More Data and Simple Algorithms Beat Complex Analytics … analytics-models/
Aug 7, 2013 … More data and simple algorithms are better than complex analytics models because having more data allows the “data to speak for itself,” …

On Algorithm Wars and Predictive Apps – Datanami …/on_algorithm_wars_and_predictive_apps.html
May 15, 2013 … Predictive apps are going to be one of the next major disruptions in technology as enterprises begin to take advantage of the combined power of …

rxDTree(): a new type of tree algorithm for big data …/rxdtree-a-new-type-of-tree-algorithm.html
Jul 11, 2013 … by Joseph Rickert. The rxDTree() function included in the RevoScaleR package distributed with Revolution R Enterprise is an example of a …

Data Mining with an Ant Colony Optimization Algorithm – SCI2S
File Format: PDF/Adobe Acrobat
Abstract – This work proposes an algorithm for data mining called Ant-Miner (Ant … with CN2, a well-known data mining algorithm for classification, in six public …

Big Graph Mining: Algorithms and Discoveries …/V14-02-04-Kang.pdf
File Format: PDF/Adobe Acrobat
primitive that Pegasus uses for its algorithms to analyze structures of large … tributed algorithm, and it handles all the details of data distribution, replication …

StreamKM++: A Clustering Algorithm for Data Streams …/alx10_016_ackermannm.pdf
File Format: PDF/Adobe Acrobat
We develop a new k-means clustering algorithm for data streams, which we … the problem on the sample using the k-means++ algorithm. [1]. To compute the …

A Good Business Objective Beats a Good Algorithm « Predictive … …/a-good-business-objective-beats-a-good-algorithm/
Nov 5, 2013 … Predictive Modeling competitions, once the arena for a few data mining conferences, has now become big business. Kaggle is …

A novel evolutionary data mining algorithm with applications to …
File Format: PDF/Adobe Acrobat
Many algorithms have been developed to mine large data sets for classification models … propose a new algorithm, called data mining by evolutionary learning …

Random Forests Algorithm – Data Science Central …/6448529%3ABlogPost%3A106993
Sep 24, 2013 … One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random …

For more information about data mining, click here.


Sentiment Analysis/Opinion Mining Resources

According to Wikipedia, sentiment analysis or opinion mining refers to the application of natural language processing, computational linguistics, and text analytics to identify and extract subjective information in source materials. The rise of social media such as blogs (Blogger, WordPress, TypePad) and social networks (Facebook, LinkedIn, Ning) has fueled interest in sentiment analysis. With the proliferation of reviews, ratings, recommendations, and other forms of online expression, online opinion has turned into a kind of virtual currency for businesses looking to market their products, identify new opportunities, and manage their reputations. As businesses look to automate the process of filtering out the noise, understanding the conversations, identifying the relevant content, and acting on it appropriately, many are now looking to the field of sentiment analysis.

For example, if you are considering buying an Apple iPad, you can check the latest Twitter sentiment (positive, negative, or neutral) toward the product using an online Twitter sentiment site, as below:

iPad Twitter Sentiment

Other sites that offer similar applications are:
1. TweetFeel – monitors positive and negative feelings in Twitter conversations about things like movies, musicians, TV shows, and popular brands.
2. Twendz – a Twitter-mining web application that uses the power of Twitter Search to highlight conversation themes and the sentiment of tweets about topics you are interested in.
3. PeopleBrowsr – lets you engage the collective intelligence in real-time conversation, identify top influencers, search across networks, and see multiple accounts across social streams.
4. groubal – tracks customer dissatisfaction with hundreds of brands through social media, using a customer satisfaction index built with a Bayesian classifier. It claims 90% accuracy.
5. Brandcrown – ranks popular brands.
6. OpinionCrawl – online sentiment analysis for current events, companies, products, and people.
7. Tweet Sentiment – attempts to find a correlation between Twitter sentiment and stock prices.
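
Several of the sites above classify tweets as positive or negative, and one is described as using a Bayesian classifier. As a rough illustration of that general idea only (not the actual method of any site listed here), below is a minimal Naive Bayes sketch in plain Python. The training tweets, labels, and function names are all made up for this example, and the whitespace tokenizer is deliberately simplistic.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase a tweet and split it on whitespace (a deliberately naive tokenizer)."""
    return text.lower().split()


def train(examples):
    """Count words per label from (text, label) pairs."""
    word_counts = defaultdict(Counter)  # label -> word -> count
    label_counts = Counter()            # label -> number of training examples
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(text))
    return word_counts, label_counts


def classify(text, word_counts, label_counts):
    """Return the label with the highest log-posterior under Naive Bayes."""
    vocab = {w for counts in word_counts.values() for w in counts}
    total_examples = sum(label_counts.values())
    best_label, best_score = None, float("-inf")
    for label in label_counts:
        # Log prior: how common this label is in the training data.
        score = math.log(label_counts[label] / total_examples)
        total_words = sum(word_counts[label].values())
        for word in tokenize(text):
            # Laplace (add-one) smoothing so unseen words do not zero out the score.
            count = word_counts[label][word] + 1
            score += math.log(count / (total_words + len(vocab)))
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical, hand-written training tweets (not real data).
TRAINING = [
    ("love my new ipad it is great", "positive"),
    ("the screen is great and fast", "positive"),
    ("battery is terrible and slow", "negative"),
    ("hate the price terrible value", "negative"),
]

word_counts, label_counts = train(TRAINING)
print(classify("great screen love it", word_counts, label_counts))  # prints: positive
print(classify("terrible battery", word_counts, label_counts))      # prints: negative
```

A real system would need far more training data, better tokenization, and a neutral class, but the counting-and-scoring structure stays the same.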



Statistica Data Mining Resources

STATISTICA (Data Miner/Text Miner) has been gaining popularity among data miners and analysts around the globe. Rexer’s Annual Data Miner Survey in 2010 reported that STATISTICA Data Miner, along with IBM SPSS Modeler and R, received the strongest satisfaction ratings as a data mining tool in both 2010 and 2009; moreover, it was rated as the primary data mining tool chosen most often (18%). STATISTICA is a statistics and analytics software package developed by StatSoft. The software includes an array of data analysis, data management, data visualization, and data mining procedures, as well as a variety of predictive modeling, clustering, classification, and exploratory techniques. I have compiled some great resources about Statistica for your reference:

StatSoft – the official site of Statistica's creator, with more about the product, downloads, and support.
Statistica in Wikipedia – provides the history, an overview, and references for the Statistica software package.
Statistica Video Tutorials – a compilation of videos on how to get started with Statistica, including links to the YouTube Statistica channel.
UCLA Notes & Movies – provides hands-on experience using Statistica for statistics, graphics, and data management. Recommended for starters/students.
Statistica Group – have a question? Then join the discussion group on Yahoo! Groups.
Statistica Review – not sure whether to use Statistica? Then read this personal review of open source and commercial software packages for data mining, including Orange, R, RapidMiner, Statistica, and WEKA.
Statistica in Amazon – shortcut to books related to Statistica in
StatSoft Statistica Books – additional statistical resources beyond the STATISTICA manual that accompanies the software.
Statistica Forum – ask a question or share your thoughts about Statistica.
Statistica Facebook – are you a Statistica fan? Then join their Facebook page!

R Data Mining Resources

Data mining using R has been gaining popularity among data miners and data analysts around the globe. Rexer’s Annual Data Miner Survey in 2010 reported that R has become the data mining tool used by the most data miners (43%). According to Wikipedia, R is a programming language and software environment for statistical computing and graphics. R provides a wide variety of statistical and graphical techniques, including linear and nonlinear modeling, classical statistical tests, time-series analysis, classification, clustering, and others. I have compiled the top resources about R data mining for your reference:

R Project for Statistical Computing – the official R open source project website. Here you can get the latest release of the R source code, manuals, and recent bug reports.
R Books Website – list of latest books that are related to R and may be useful to the R user community. You may also like to read Data Mining with R book (data mining bestseller at
R in Wikipedia – here you can read basic information and examples for R programming, including a list of GUIs for R and some references.
Rattle: A GUI for Data Mining using R – a simple and logical graphical user interface based on Gnome that can be used by itself to deliver data mining projects. Rattle runs under GNU/Linux, Macintosh OS/X, and MS/Windows.
R reference card for data mining – a collection of R packages and functions for data mining.
R Bloggers – a central hub of news and tutorials contributed by 185 R bloggers.
R Video Tutorials – a series of R for Statistical Programming screencasts that show you how to use R for text mining. (Some of the video links are missing.)
Reasons to learn R? – YouTube video describing why students should learn the R programming language.
Programming R – online R programming resources, from beginner to advanced.
R Programming Wikibook – a place where anyone can share his/her tricks and knowledge on R.
