Its all about Data

Sunday, December 15, 2013

Map/Reduce Programming Paradigm Example

MapReduce is the heart of Hadoop. It is the programming paradigm that allows for massive scalability across hundreds or thousands of server...
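
As a rough illustration of the paradigm, here is a minimal single-process sketch of the classic word-count example. It does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and only mimic the map, shuffle and reduce stages that a real cluster would distribute across servers.

```python
from collections import defaultdict


def map_phase(line):
    """Map step: emit a (word, 1) pair for every word in one input line."""
    return [(word.lower(), 1) for word in line.split()]


def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce step: collapse the values of one key into a single count."""
    return key, sum(values)


def word_count(lines):
    """Run map, shuffle and reduce in sequence on an in-memory list of lines."""
    mapped = [pair for line in lines for pair in map_phase(line)]
    return dict(reduce_phase(k, v) for k, v in shuffle(mapped).items())


if __name__ == "__main__":
    print(word_count(["big data is big", "data is everywhere"]))
```

On a real cluster each map call and each reduce call could run on a different machine, because neither depends on shared state; that independence is what the sketch is meant to show.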

Friday, November 8, 2013

Steps to run Presto on a Linux machine

Facebook announced on 6th November 2013 that it is committing its Presto low-latency, SQL-compliant query system for Hadoop to open source...

Tuesday, October 29, 2013

Steps to run K-Means Clustering on Apache Mahout

For the k-means example on Mahout, I am downloading the Reuters dataset from: http://kdd.ics.uci.edu/databases/reuters21578/reuters21578.t...

Wednesday, October 9, 2013

Ensemble Learning

Ensemble learning builds a prediction model by combining the strengths of a collection of simpler base models. Bagging, Boosting and...
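
To make the idea of combining simple base models concrete, here is a minimal sketch of bagging for one-dimensional regression. The base model (a one-split "stump" at the median), the function names and the default of 25 models are all assumptions chosen for illustration, not taken from the post.

```python
import random


def fit_stump(xs, ys):
    """Base model: a one-split regression stump at the median x of the sample."""
    threshold = sorted(xs)[len(xs) // 2]
    left = [y for x, y in zip(xs, ys) if x < threshold]
    right = [y for x, y in zip(xs, ys) if x >= threshold]
    overall = sum(ys) / len(ys)
    # Fall back to the overall mean if one side of the split is empty.
    left_mean = sum(left) / len(left) if left else overall
    right_mean = sum(right) / len(right) if right else overall
    return lambda x: left_mean if x < threshold else right_mean


def bagging_fit(xs, ys, n_models=25, seed=0):
    """Train each base model on a bootstrap sample (drawn with replacement)."""
    rng = random.Random(seed)
    n = len(xs)
    models = []
    for _ in range(n_models):
        idx = [rng.randrange(n) for _ in range(n)]
        models.append(fit_stump([xs[i] for i in idx], [ys[i] for i in idx]))
    return models


def bagging_predict(models, x):
    """Ensemble prediction: the average of the base models' predictions."""
    return sum(model(x) for model in models) / len(models)


if __name__ == "__main__":
    xs = list(range(10))
    ys = [0.0] * 5 + [10.0] * 5
    models = bagging_fit(xs, ys)
    print(bagging_predict(models, 1), bagging_predict(models, 8))
```

Each stump alone is a crude model; averaging many of them, each fitted to a different resample, smooths out the quirks of any single one.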

Monday, July 22, 2013

Regression Model Analysis

Regression model analysis is done using the RMSE (Root Mean Square Error) value. An RMSE of zero, meaning that the estimator predicts observ...
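
For reference, RMSE is the square root of the mean of the squared differences between observed and predicted values. The following is a minimal sketch in plain Python; the function name `rmse` and the sample numbers are illustrative assumptions.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("inputs must be non-empty and of equal length")
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    observed = [3.0, 5.0, 7.0]
    print(rmse(observed, observed))         # perfect predictions give 0.0
    print(rmse(observed, [2.0, 5.0, 9.0]))  # larger errors give a larger RMSE
```

Because the errors are squared before averaging, a few large misses raise the RMSE more than many small ones.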

Tuesday, June 11, 2013

Data Preparation and Predictive Modeling Mistakes in M/C Learning and How to Avoid These Mistakes?

1) Mistake (Including ID Fields as Predictors): Because most IDs look like continuous integers (and older IDs are typically smaller), it...

Monday, June 10, 2013

Underfitting/Overfitting Problem in M/C learning

Underfitting: If our algorithm performs badly on the points in our data set, then the algorithm is underfitting the data set. It can be checked eas...
