Trying out Tableau 10: #makeovermonday from August

A bit late, but I finally managed to check out Tableau 10, the latest version of Tableau. I decided to use the dataset of malaria deaths around the world, published as part of #makeovermonday week 34.

My approach is pretty similar to the one by Andy Kriebel, with a few subtle differences:

  • Instead of a line chart, I’ve used a barchart to represent the evolution of deaths over time. The reason is that we only have data for a few years, so the time axis is effectively discrete (we could have called it “continuous” if we had data for many more years, or data at the monthly level). However, given the limited space, I don’t think the line chart is a bad solution.
  • I’ve added a “Continent filter” that lets you focus on data from a single continent. For this I’ve used a dataset that identifies the continent each country belongs to. The matching is not perfect, and some countries are assigned to the “null” continent. I’ll try to fix this.
  • The map can be used to filter the barchart, and the same works the other way around: highlighting a bar shows the number of deaths per country on the map. Both filters also highlight the row/column they refer to in the heatmap.
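
The “null” continent problem is what you get from a left join where some country names have no counterpart in the lookup table. Here is a minimal Python sketch of how the unmatched names could be listed before fixing them. The rows, the numbers and the lookup table are invented for illustration (they are not the #makeovermonday data), and the function names are hypothetical.

```python
# Invented sample rows: NOT the real #makeovermonday dataset.
deaths = [
    {"country": "Nigeria", "year": 2010, "deaths": 10},
    {"country": "India", "year": 2010, "deaths": 5},
    {"country": "Cote d'Ivoire", "year": 2010, "deaths": 3},
]

# Invented lookup table; note the different spelling of one country.
continent_by_country = {
    "Nigeria": "Africa",
    "India": "Asia",
    "Ivory Coast": "Africa",
}


def attach_continent(rows, lookup):
    """Left join: keep every row; continent is None when there is no match."""
    return [{**row, "continent": lookup.get(row["country"])} for row in rows]


def unmatched_countries(rows):
    """Sorted list of the countries that ended up without a continent."""
    return sorted({row["country"] for row in rows if row["continent"] is None})


joined = attach_continent(deaths, continent_by_country)
print(unmatched_countries(joined))
```

The names printed at the end are the ones that need a manual alias (or a cleaner lookup table) before the continent filter can be trusted.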

P.S.: the layout would have looked nicer with a wider visualization, but I’ve limited the width to 800, which is the maximum width of my text area (I have to change the template…)

My Life in Weeks

mylifeinweeks

This interactive visualization follows the visualization suggested by the great blog Wait But Why in their post Your Life in Weeks.

Each row represents a year of my life, out of the 85 years I’m expected to live (fingers crossed!). Each row is divided into the 52 weeks of a year. Colors are assigned according to the different ‘seasons’ of my life, from my early years until now. Some seasons have a link attached to them, so clicking on any week of that period will show a reference to what I was doing or where I was doing it.
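
The grid layout boils down to a bit of date arithmetic. This is a rough Python sketch, not the code behind the visualization: the function names and the season boundaries are hypothetical. It also assumes that every row holds exactly 52 weeks (364 days), so rows slowly drift away from calendar years.

```python
from bisect import bisect_right
from datetime import date

WEEKS_PER_YEAR = 52   # columns per row, as in the grid described above
EXPECTED_YEARS = 85   # number of rows


def week_cell(birth, day):
    """Return (row, column) of the grid cell containing `day`.

    Simplification: a row is exactly 52 weeks long, so it is slightly
    shorter than a calendar year.
    """
    weeks_lived = (day - birth).days // 7
    return divmod(weeks_lived, WEEKS_PER_YEAR)


def season_for(week_index, seasons):
    """Pick the season of a week.

    `seasons` is a list of (first_week_index, label) pairs sorted by
    first_week_index; the first pair must start at week 0.
    """
    starts = [start for start, _ in seasons]
    return seasons[bisect_right(starts, week_index) - 1][1]


# Hypothetical seasons, expressed as the week index where each one begins.
seasons = [(0, "early years"), (6 * 52, "school"), (18 * 52, "university")]

row, col = week_cell(date(2000, 1, 1), date(2010, 6, 15))
print(row, col, season_for(row * WEEKS_PER_YEAR + col, seasons))
```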

Presenting UNiCS at the European Data Forum, Eindhoven

At the end of June I had the pleasure of representing SIRIS Academic at the European Data Forum. At this conference, where data-related business is the main topic, I briefly presented UNiCS, the project developed during the first half of 2016 with the awesome team of SIRIS Academic, in which we combined open datasets related to the Higher Education and Research sector and developed analytical tools to explore them. The project was funded by the ODINE program.

This is the short video of the presentation

Visualizing FP7 projects

Our latest project at SIRIS Academic has been a tool to explore the number of projects awarded to a given institution between 2007 and 2013 under the European Union’s Seventh Framework Programme for Research and Technological Development (FP7).

The project is still in beta, but you can check it out here.

fp7_explorer

The dataset was downloaded from the European Union Open Data Portal. Given the lack of standardization in the data, part of the SIRIS team spent time with OpenRefine finding and grouping similar institution and country names.
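
To give an idea of what this kind of clean-up involves, here is a simplified Python sketch in the spirit of key-collision clustering with a "fingerprint" key, one of the clustering options OpenRefine offers. It is an approximation written for illustration: it is not OpenRefine's exact implementation, and not necessarily the method the team used. The sample names are only examples.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name):
    """Build a normalized key so that trivial spelling variants collide."""
    # Strip accents by decomposing characters and dropping non-ASCII marks.
    ascii_name = (
        unicodedata.normalize("NFKD", name)
        .encode("ascii", "ignore")
        .decode("ascii")
    )
    # Lowercase, trim, and remove punctuation.
    cleaned = re.sub(r"[^\w\s]", "", ascii_name.strip().lower())
    # Sort and de-duplicate the tokens so that word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster(names):
    """Group names sharing a fingerprint; keep groups with 2+ distinct spellings."""
    groups = defaultdict(list)
    for name in names:
        groups[fingerprint(name)].append(name)
    return [group for group in groups.values() if len(set(group)) > 1]


names = [
    "Universitat de Barcelona",
    "UNIVERSITAT DE BARCELONA.",
    "Barcelona, Universitat de",
    "Universidad de Sevilla",
]
print(cluster(names))
```

A key like this only catches variants in case, accents, punctuation and word order. Abbreviations, translations and typos still need fuzzier matching or manual review.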

The result of the project is a set of coordinated barcharts that let you compare the number of projects obtained by an institution per year, type of call and activity area, and also compare data across institutions. The barcharts have been organized following the structure of the different

A big drawback of the project has been the lack of information about the number of researchers per institution, which prevents a proper comparison among institutions given their big differences in size. Nevertheless, the project makes it easy to search among more than 30 thousand institutions and 174 countries. A follow-up project will help to discover which comparisons are fair, and therefore which institutions are best compared with the tool. Meanwhile, do not hesitate to test the FP7 Explorer, using the comboboxes to search for institutions by name.

Exploring ways for visualizing search results

Since I started working with the people from SIRIS Academic, the topic of using visualizations with semantic modelling technologies has become a real interest for me, especially since I’m sharing an office with the great Alessandro Mosca.

As a first and simple effort in this direction, we (with the special effort of Xavi Giménez) have developed a D3 version of the Elastic List presented some years ago by Moritz Stefaner. The idea is to provide a set of filters on the dimensions of a dataset in the form of collapsible boxes sized according to the number of entries they represent. While this visualization will be incorporated into a search platform to visualize the results of a query, you can check out our live demo, which uses a dataset of Nobel Prizes.
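
The core of the idea is a count per value for every dimension, recomputed each time a filter changes. Here is a minimal Python sketch of that step. It is not our D3 implementation; the function names are hypothetical and the records are invented rows, not the Nobel Prize dataset from the demo.

```python
from collections import Counter


def matches(record, selection):
    """True when the record satisfies every active filter.

    `selection` maps a dimension name to the set of values picked for it;
    dimensions without a filter are simply absent from the mapping.
    """
    return all(record[dim] in values for dim, values in selection.items())


def facet_counts(records, dimensions, selection):
    """Count, per dimension, how many visible records carry each value."""
    visible = [record for record in records if matches(record, selection)]
    return {dim: Counter(record[dim] for record in visible) for dim in dimensions}


# Invented rows for illustration only.
records = [
    {"category": "physics", "decade": "1900s"},
    {"category": "physics", "decade": "1910s"},
    {"category": "chemistry", "decade": "1900s"},
]

counts = facet_counts(records, ["category", "decade"], {"category": {"physics"}})
print(counts["decade"])
```

Each box in the list would then be sized in proportion to its count, and values that drop to zero under the current filters can be collapsed.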

My working Pulse

UPDATE: Apparently people from some countries cannot participate in Tableau’s Quantified Self Viz Contest, so my submission has been rejected.

For a long time I’ve been fascinated by the Quantified Self movement. As someone passionate about data, I can’t think of any prettier data than the kind related to myself. That’s why, in the fall of 2013, I began to collect data about myself using different kinds of software and gadgets that track my life seamlessly.

Now the time to start digging into this data has arrived, especially since I found out yesterday about Tableau’s Quantified Self Viz Contest. Though I’m planning to build more complex, advanced and “cross-topic” visualizations, I decided to give Tableau a second try after my series visualization. The dataset that I’m exploring in this post is the number of keystrokes that I perform on my laptop, my main working tool. The data has been collected with this open source keylogger, which produces a file with a timestamp and the number of keystrokes performed per minute.

Analyzing the IMDB ratings of my favorite series

After the apparently successful ending of Breaking Bad, I decided to give it a chance. So far I’m at the third episode of the second season and still don’t feel as attached to the series as everyone told me I should, so, inspired by this post, I decided to have a look at the IMDB ratings to see if the quality of the episodes actually increases with time (as I was told).

To begin with, I implemented a small R script that converts the page of IMDB ratings of a series into an R data frame. At that point I decided that, rather than only analyzing Breaking Bad, I would also have a look at some of my favorite series: Lost, How I Met Your Mother, Homeland, Big Bang Theory and Dexter.

First of all I had a look at the distribution of the ratings:

ratings_distribution

First Data Expedition in Barcelona

Last weekend I was part of the organizing committee of the very first Data Expedition held in Barcelona, together with Karma Peiro, Concha Català, Eduard Martin Borregon and Diego Pascual.

The event, organized within the context of the Open Knowledge Foundation in Spain, was dedicated to the study of the Department of Health of the Catalan government.

The main goal of the Data Expeditions promoted by the School of Data of the OKFN is to form interdisciplinary groups of journalists and engineers to create stories out of data.

This first experiment has proved the benefits of merging disciplines that were completely unrelated a few years ago, such as journalism and computer science (data analysis, or whatever you want to call it). It led to very interesting results, such as the timeline created by a group that studied the relation between a foundation and a private company in the biomedical sector.