Visualizing FP7 projects

Our latest project at SIRIS Academic is a tool for exploring the number of projects awarded to a given institution between 2007 and 2013 under the European Union's Seventh Framework Programme for Research and Technological Development (FP7).

The project is still in beta, but you can check it out here.


The dataset was downloaded from the European Union Open Data Portal. Because the data is not standardized, part of the SIRIS team spent time with OpenRefine finding and identifying variant spellings of institution and country names.
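To give an idea of what this kind of name clean-up involves, here is a minimal Python sketch of a "fingerprint" key-collision grouping, one of the clustering approaches OpenRefine offers. It is not the exact procedure the team followed: the function names and the sample values are made up for illustration, and real data usually needs manual review of each cluster.

```python
import re
import unicodedata
from collections import defaultdict


def fingerprint(name: str) -> str:
    """Build a key that ignores case, accents, punctuation and word order."""
    # Decompose accented characters, then drop the combining marks.
    decomposed = unicodedata.normalize("NFKD", name)
    unaccented = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    # Lowercase and turn anything that is not a letter, digit or space into a space.
    cleaned = re.sub(r"[^a-z0-9\s]", " ", unaccented.lower())
    # Sort the unique tokens so that word order does not matter.
    return " ".join(sorted(set(cleaned.split())))


def cluster_names(names):
    """Group names by fingerprint and keep groups with more than one spelling."""
    groups = defaultdict(set)
    for name in names:
        groups[fingerprint(name)].add(name)
    return {key: sorted(variants) for key, variants in groups.items() if len(variants) > 1}


# Made-up sample values, not taken from the FP7 dataset.
sample = [
    "Universitat de Barcelona",
    "UNIVERSITAT DE BARCELONA",
    "Barcelona, Universitat de",
    "Université de Genève",
    "Universite de Geneve",
    "CNRS",
]
clusters = cluster_names(sample)
for key, variants in clusters.items():
    print(key, "->", variants)
```

Names that share a fingerprint are only candidates for merging; a person still has to confirm that they refer to the same institution.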

The result of the project is a set of coordinated bar charts that let you compare the number of projects obtained by an institution per year, type of call and activity area, and also compare data across institutions. The bar charts have been organized following the structure of the different

A big drawback of the project has been the lack of information about the number of researchers per institution, which prevents a proper comparison among institutions that differ greatly in size. Nevertheless, the project makes it easy to search among more than 30 thousand institutions and 174 countries. A follow-up project will help to discover which comparisons are fair and, therefore, which institutions are best compared with the tool. Meanwhile, do not hesitate to test the FP7 Explorer, using the comboboxes to search for institutions by name.

Exploring ways for visualizing search results

Since I started working with the people from SIRIS Academic, the topic of using visualizations with semantic modelling technologies has become a real interest for me, especially since I'm sharing an office with the great Alessandro Mosca.

As a first and simple effort in this direction, we (with the special effort of Xavi Giménez) have developed a D3 version of the Elastic List presented some years ago by Moritz Stefaner. The idea is to provide a set of filters on the dimensions of a dataset in the form of collapsible boxes sized according to the number of entries they represent. While this visualization will be incorporated into a search platform to visualize the results of a query, you can check out our live demo using a dataset of Nobel Prizes.
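The counting step behind such a list can be sketched in a few lines of Python. This is only an illustration of the idea, not our D3 implementation: the function name `facet_counts` and the sample records are made up, and the records are not real Nobel Prize data.

```python
from collections import Counter


def facet_counts(records, dimensions, selected=None):
    """Count entries per value of each dimension, after applying the active filters."""
    selected = selected or {}
    # Keep only the records that match every selected filter value.
    matching = [
        record for record in records
        if all(record.get(dim) == value for dim, value in selected.items())
    ]
    # One counter per dimension; each count would drive the height of a box.
    return {dim: Counter(r[dim] for r in matching if dim in r) for dim in dimensions}


# Made-up records, for illustration only.
records = [
    {"category": "Physics", "decade": "1900s", "gender": "male"},
    {"category": "Physics", "decade": "1900s", "gender": "female"},
    {"category": "Chemistry", "decade": "1910s", "gender": "female"},
    {"category": "Peace", "decade": "1900s", "gender": "male"},
]
dimensions = ["category", "decade", "gender"]

all_counts = facet_counts(records, dimensions)
filtered = facet_counts(records, dimensions, selected={"decade": "1900s"})
print(filtered["category"])
```

Selecting a value in one dimension recomputes the counts in all the others, so boxes whose count drops to zero can be collapsed.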


My working Pulse

UPDATE: Apparently people from some countries cannot participate in Tableau's Quantified Self Viz Contest, so my submission has been rejected.

For a long time I've been fascinated by the Quantified Self movement. As someone passionate about data, I can't think of any prettier data than the kind related to myself. That's why, in the fall of 2013, I began to collect data about myself using different kinds of software and gadgets that track my life seamlessly.

Now the time to start digging into this data has arrived, especially since I found out yesterday about Tableau's Quantified Self Viz Contest. Though I'm planning to build more complex, advanced and "cross-topic" visualizations, I decided to give Tableau a second try after my series visualization. The dataset that I'm exploring in this post is the number of keystrokes that I perform on my laptop, my main working tool. The data has been collected with this open source keylogger, which produces a file with a timestamp and the number of keystrokes performed per minute.
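As a rough idea of how such a file can be prepared before loading it into a visualization tool, here is a small Python sketch that sums the keystrokes per hour of the day. The line format (`YYYY-MM-DD HH:MM,count`), the function name and the sample values are assumptions made for illustration; the actual keylogger output may look different.

```python
import csv
import io
from collections import defaultdict
from datetime import datetime


def keystrokes_per_hour(lines):
    """Sum per-minute keystroke counts into totals per hour of the day (0-23)."""
    totals = defaultdict(int)
    for timestamp, count in csv.reader(lines):
        # Assumed timestamp format; adjust it to match the real log file.
        hour = datetime.strptime(timestamp, "%Y-%m-%d %H:%M").hour
        totals[hour] += int(count)
    return dict(totals)


# Made-up sample log, standing in for the real file.
sample_log = io.StringIO(
    "2014-03-01 09:58,120\n"
    "2014-03-01 09:59,80\n"
    "2014-03-01 10:00,45\n"
    "2014-03-02 09:15,200\n"
)
hourly = keystrokes_per_hour(sample_log)
print(hourly)
```

With a real file, passing the object returned by `open(path, newline="")` instead of `sample_log` should work the same way.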

Read More

Analyzing the IMDB ratings of my favorite series

Since the apparently successful ending of Breaking Bad, I decided to give it a chance. So far I’m at the third episode of the second season and still don’t feel as attached to the series as everyone told me I should, so inspired by this post I decided to have a look at the IMDB ratings to see if the quality of the episodes actually increases with time (as I was told).

To begin with, I implemented a small R script that converts the page of IMDB ratings of a series into an R data frame. At that point I decided that, rather than only analyzing Breaking Bad, I would also have a look at some of my favorite series: Lost, How I Met Your Mother, Homeland, Big Bang Theory and Dexter.

First of all I had a look at the distribution of the ratings:


Read More

First Data Expedition in Barcelona


Last weekend I was part of the organizing committee of the very first Data Expedition held in Barcelona, together with Karma Peiro, Concha Català, Eduard Martin Borregon and Diego Pascual.

The event, organized within the context of the Open Knowledge Foundation in Spain, was dedicated to the study of the Department of Health of the Catalan government.

The main goal of the Data Expeditions promoted by the School of Data of the OKFN is to create interdisciplinary groups of journalists and engineers that build stories out of data.

This first experiment has proved the benefits of merging disciplines that were completely unrelated a few years ago, such as journalism and computer science (data analysis, or whatever you want to call it), leading to very interesting results such as the timeline created by a group that studied the relation between a foundation and a private company in the biomedical sector.

Data Visualization and Story Telling by Edward Segel

I've recently discovered this talk by Edward Segel at the January 2012 KDMC digital storytelling workshop. Edward dissected his paper (written in collaboration with Jeffrey Heer) entitled “Narrative Visualization: Telling Stories with Data”.

Both the paper and the talk go through the design of narrative visualizations, identifying relevant techniques for telling stories with graphics.

The slides of the talk can also be found on SlideShare.

The graph of my website

Visualize the link structure of your website using free tools

There is great value in understanding the hyperlink structure of our website, as I pointed out in my PhD thesis. At the time I was writing it, getting the hyperlink structure of a website and visualizing it was a tedious task. Nowadays, however, there are a number of free tools that make this task much easier.
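To make concrete what "hyperlink structure" means here, the following Python sketch builds a small link graph from already-downloaded pages using only the standard library. It is not one of the tools covered in this post: the class and function names are made up, the pages are a toy example, and a real crawl would also need to fetch the pages and respect robots.txt.

```python
from html.parser import HTMLParser
from urllib.parse import urldefrag, urljoin, urlparse


class LinkExtractor(HTMLParser):
    """Collect the absolute targets of all <a href="..."> tags in a page."""

    def __init__(self, base_url):
        super().__init__()
        self.base_url = base_url
        self.links = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            href = dict(attrs).get("href")
            if href:
                # Resolve relative links and drop "#fragment" parts.
                absolute = urljoin(self.base_url, href)
                self.links.append(urldefrag(absolute)[0])


def build_link_graph(pages):
    """Map each page URL to the sorted list of same-site pages it links to."""
    graph = {}
    for url, html in pages.items():
        parser = LinkExtractor(url)
        parser.feed(html)
        host = urlparse(url).netloc
        graph[url] = sorted({link for link in parser.links if urlparse(link).netloc == host})
    return graph


# Toy pages standing in for a downloaded website.
pages = {
    "http://example.org/": '<a href="/about">About</a> <a href="http://other.example/">Ext</a>',
    "http://example.org/about": '<a href="/">Home</a> <a href="/about#team">Team</a>',
}
graph = build_link_graph(pages)
print(graph)
```

The resulting dictionary is an adjacency list, which is the kind of structure graph visualization tools typically take as input once exported as an edge list.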

In this post I will show how to visualize the link structure of a website using three tools:

Read More

The Process of Information Visualization by Dürsteler & Engelhardt

The process of Information Visualization

(This post is a fragment of Chapter 2 of my PhD thesis entitled “Visual Exploration of Web Spaces”)


Understanding how data is transformed into insight, known as the process of InfoVis, is crucial for developing effective strategies that help users reach their information goals. Several conceptual approaches to this process have been presented. However, all of them converge on the definition of three main steps, specifically named by Dürsteler and Engelhardt:

Read More