Top 10 Data Mining Algorithm Resources 2013

Below are the top 10 search results for the keyword “Algorithms” from our customized data mining search engine, powered by Google CSE:

The Two Most Important Algorithms in Predictive Modeling Today …
strataconf.com/strata2012/public/schedule/detail/22658

@Kaggle
Wouldn’t it be great if there were just two algorithms which could handle most of your predictive modeling needs? It turns out that actually this is the case.
Why More Data and Simple Algorithms Beat Complex Analytics …
data-informed.com/why-more-data-and-simple-algorithms-beat-complex-analytics-models/

Aug 7, 2013 … More data and simple algorithms are better than complex analytics models because having more data allows the “data to speak for itself,” …
On Algorithm Wars and Predictive Apps – Datanami
www.datanami.com/…/on_algorithm_wars_and_predictive_apps.html

May 15, 2013 … Predictive apps are going to be one of the next major disruptions in technology as enterprises begin to take advantage of the combined power of …
rxDTree(): a new type of tree algorithm for big data
blog.revolutionanalytics.com/…/rxdtree-a-new-type-of-tree-algorithm.html
Jul 11, 2013 … by Joseph Rickert The rxDTree() function included in the RevoScaleR package distributed with Revolution R Enterprise is an example of a …
Data Mining with an Ant Colony Optimization Algorithm – SCI2S
sci2s.ugr.es/keel/pdf/algorithm/articulo/Ant-IEEE-TEC.pdf
File Format: PDF/Adobe Acrobat
Abstract – This work proposes an algorithm for data mining called Ant-Miner (Ant … with CN2, a well-known data mining algorithm for classification, in six public …
Big Graph Mining: Algorithms and Discoveries
www.kdd.org/sites/default/files/issues/14-2…/V14-02-04-Kang.pdf

File Format: PDF/Adobe Acrobat
primitive that Pegasus uses for its algorithms to analyze structures of large … tributed algorithm, and it handles all the details of data distribution, replication …
StreamKM++: A Clustering Algorithm for Data Streams
https://www.siam.org/proceedings/…/alx10_016_ackermannm.pdf
File Format: PDF/Adobe Acrobat
We develop a new k-means clustering algorithm for data streams, which we … the problem on the sample using the k-means++ algorithm. [1]. To compute the …
A Good Business Objective Beats a Good Algorithm « Predictive …
www.predictiveanalyticsworld.com/…/a-good-business-objective-beats-a-good-algorithm/
Nov 5, 2013 … Predictive Modeling competitions, once the arena for a few data mining conferences, have now become big business. Kaggle (kaggle.com) is …
A novel evolutionary data mining algorithm with applications to …
sci2s.ugr.es/keel/pdf/algorithm/articulo/DMEL.pdf

File Format: PDF/Adobe Acrobat
Many algorithms have been developed to mine large data sets for classification models … propose a new algorithm, called data mining by evolutionary learning …
Random Forests Algorithm – Data Science Central
www.datasciencecentral.com/xn/…/6448529%3ABlogPost%3A106993

Sep 24, 2013 … One of the most popular methods or frameworks used by data scientists at the Rose Data Science Professional Practice Group is Random …

For more information about data mining, click here.

Data Miner Survey Summary Report 2013

Here are some highlights from the 2013 Data Miner Survey:
SURVEY & PARTICIPANTS: 68-item survey conducted online in 2013. Participants: 1,259 analytic professionals from 75 countries. This is the 6th Data Miner Survey.
FOCUS ON CRM: In the past few years, there has been an increase among data miners in the already substantial area of customer-focused analytics. Respondents are looking for a better understanding of customers and seeking to improve the customer experience. This can be seen in their goals, analyses, big data endeavors, and in the focus of their text mining.
BIG DATA: Many in the field are talking about the phenomenon of Big Data. There are clearly some areas in which the volume and sources of data have grown. However, it is unclear how much Big Data has impacted the typical data miner. While data miners believe that the size of their datasets has increased over the past year, data from previous surveys indicate that the size of datasets has been fairly consistent over time.
THE ASCENDANCE OF R: The proportion of data miners using R is rapidly growing, and since 2010, R has been the most-used data mining tool. While R is frequently used along with other tools, an increasing number of data miners also select R as their primary tool.
CHALLENGES IN THE USE OF ANALYTICS: Data miners continue to report challenges at each level of the analytic process. Companies often are not using analytics to their fullest and have continuing issues in the areas of deployment and performance measurement.
ENGAGEMENT & JOB SATISFACTION: The Data Miners in our survey are highly engaged with the analytic community: consuming and producing content, entering competitions and searching for education and growth within their jobs. All of these activities lead to high job satisfaction, which has been increasing over time.
ANALYTIC SOFTWARE: Data miners are a diverse group who are looking for different things from their data mining tools. Ease-of-use and cost are two distinguishing dimensions. Software packages vary in their strengths and features. STATISTICA, KNIME, SAS JMP and IBM SPSS Modeler all receive high satisfaction ratings.
OTHER FINDINGS include the labels analytic professionals use to describe themselves (Data Scientist is #1), the algorithms being used (regression, decision trees, and cluster analysis continue to be the triad of core algorithms), and computing environments (cloud computing is increasing).

Data Scientist Resources 2013

Big Data is still a hot topic for business intelligence and data mining people in 2013. In my previous post, I researched big data search trends in 2012. FYI, a data scientist is a practitioner of data science. Below are the top 10 online resources to kick-start becoming a data scientist:

Data Science Central – the industry’s online resource for big data practitioners, including information about the latest in technology, tools and trends.
How to be a Data Scientist – article in Smart Data Collective describing the set of skills you should have if you want to do data science.
Free Big Data Education – article in Big Data Republic that lists free online courses (MOOCs) you can take toward obtaining the requisite background for becoming a data scientist.
Data Science Tutorials – list of tutorials by Kaggle for performing data analysis using a data scientist’s toolkit.
Data Science News in Social Media – compilation of the latest news about data science.
Data Science Wikibooks – open book with a very basic introduction to data science.
CODATA – the Committee on Data for Science and Technology of the International Council for Science (ICSU), which works to improve the quality, reliability, management and accessibility of data. Also a resource for the Data Science Journal.
GigaOM Big Data – latest big data tech stories.
5 Big Data Predictions for 2013 – some of the key big data themes to dominate 2013.
Top 5 Data Science Bloggers – article that lists the top 5 data science blogs.

Data Miner Survey 2011

Some highlights from Rexer Analytics’ 5th Annual Data Miner Survey (2011):
SURVEY & PARTICIPANTS: 52-item survey of data miners, conducted on-line in 2011. Participants: 1,319 data miners from over 60 countries.

FIELDS & GOALS: Data miners work in a diverse set of fields. CRM/Marketing has been the #1 field for the past five years. Fittingly, “improving the understanding of customers”, “retaining customers” and other CRM goals continue to be the goals identified by the most data miners.

ALGORITHMS: Decision trees, regression, and cluster analysis continue to form a triad of core algorithms for most data miners. However, a wide variety of algorithms are being used. A third of data miners currently use text mining and another third plan to do so in the future.

TOOLS: R continued its rise this year and is now being used by close to half of all data miners (47%). R users report preferring it for being free, open source, and having a wide variety of algorithms. Many people also cited R’s flexibility and the strength of the user community. STATISTICA is selected as the primary data mining tool by the most respondents (17%). Data miners report using an average of 4 software tools. STATISTICA, KNIME, RapidMiner and Salford Systems received the strongest satisfaction ratings in 2011.

ANALYTIC CAPABILITY AND SUCCESS MEASUREMENT: Only 12% of corporate respondents rate their company as having very high analytic sophistication. However, companies with better analytic capabilities are outperforming their peers. Respondents report analyzing analytic success via Return on Investment (ROI) and analyzing the predictive validity or accuracy of their models. Challenges to measuring success include client or user cooperation and data availability/quality.
