## A comprehensive guide to connect R to Amazon Redshift

Amazon Redshift is one of the hottest databases for data warehousing right now. It's one of the most cost-effective solutions available, and allows for integration with »

tl;dr: Convert numerical variables into categorical ones, as shown in the next image. ⏳ Reading time ~ 6 min. Let's start! The package funModeling (from version »

Well, after some time and 300+ commits, this is the biggest release of the Data Science Live Book (open source), after the first publication more than »

Hi there! I decided to rewrite almost all of the model validation section, since it didn't reflect real-case scenarios. Hopefully in the two new chapters you will »

This update contains a new chapter -scoring- which is related to model performance and model deployment, used when predicting a binary outcome. Link to the scoring »

Hi! Well, finally here is the first release of this project: an open source book which will hopefully contain some useful resources for those who want »

POST UPDATE 09/24/2016: Good news! The funModeling documentation evolved into an open source book! Please follow the link below. Jump to the book... This release »

Introduction: Inspired by this Netflix post, I decided to write a post based on this topic using R. There are several nice packages to achieve this »

Introduction: Big Data helps us analyze unstructured data (aka "text") with many techniques; this post presents one: cosine similarity. There are also »

Introduction: This is an excellent resource for understanding two data frame formats: long and wide. Just take a look at figure 1 inside the »

Introduction: "I want to develop a model that automatically learns over time" is a really challenging objective. In this post we'll develop a procedure that loads data, »

Introduction: In clustering, you let the data be grouped according to similarity. A cluster model is a group of segments -clusters- containing cases (such as »