Data Science Live Book (open source)

Hi! The first release of this project is finally here: an open source book which will hopefully contain some useful resources for those who want »

Time Series Analysis Using Max/Min... and some Neuroscience.

Introduction Time series have maximum and minimum points as general patterns. Sometimes the noise present in them makes it hard to spot the general behavior. In this post, »

How to bulk upload your data from R into Redshift

Amazon's columnar database, Redshift, is a great companion for a lot of data science tasks. It allows for fast processing of very big datasets, with a »

Package funModeling: data cleaning, importance variable analysis and model performance

Hi there :) This new package -install.packages("funModeling")- tries to cover common tasks in data science with simple concepts. Written like a short tutorial, its »

Anomaly Detection in R

Introduction Inspired by this Netflix post, I decided to write a post based on this topic using R. There are several nice packages to achieve this »

Text Mining Analysis: some theory and practice in R

Introduction Big Data helps us analyze unstructured data (aka "text") with many techniques. This post presents one: Cosine Similarity. There are also »

Adding Authentication to Shiny Open Source Edition

Shiny Server is a great solution for BI/analytics reporting. It leverages the power of the R language to create interactive reports/dashboards. Maybe you »

Recommendation Systems in R

These systems are used in cross-selling industries, and they measure correlated items as well as their user ratings. This last point wasn't included in the apriori algorithm »

{Long Vs. Wide} Data Frames

Introduction This is an excellent resource to understand two data frame formats: Long and Wide. Just take a look at figure 1 inside the »

Introduction to automatic machine learning

Automatic Machine Learning Introduction "I want to develop a model that automatically learns over time", a really challenging objective. In this post we'll develop a »

Data Science - Short lesson on cluster analysis

Introduction In clustering you let data be grouped according to their similarity. A cluster model is a group of segments -clusters- containing cases (such as »

EU Life Quality Geo Report

Living longer, living better? It's important to measure not only how long people live but also their quality of life. Analyzing data from Eurostat, which contains the following two »

Dynamic analysis on outliers

Treating outliers Introduction Outliers are the extreme values that a variable has. Depending on the model or requirement, it could be necessary to treat them, either »

Forecasting the Argentinian "Blue Dollar"

If you have recently visited my website, Bluelytics, you will have noticed there is a new section named "Predicción", which is a forecast of the value of »

Scraping data from the central bank of Argentina

Today I'm going to show you an example of data scraping with the BCRA, which is the central bank of Argentina. On this website, we have »