Creating Fictional Data Products and Their Implications


When conceptualizing a service or product based on data, I first transform visions into a tangible visualization or prototype that anyone in a multi-disciplinary team can feel and understand. Additionally, I generally create Design Fictions that explore possible appropriations of the envisioned data product over its life. Taken together, prototypes and fictions present tangible concepts that help anticipate opportunities and challenges for engineering and user experience before a project even gets founded. These concepts give a clearer direction on what you are planning to build. They are powerful material for explaining the new data product to others, and they act as a North Star so that a whole team has a shared vision of what they might want to build.


This is the approach I aimed to communicate last week in a 5-day workshop at HEAD design school in Geneva to a heterogeneous group of students coming from graphic design, engineering, business or art backgrounds.

The syllabus of the 5-day workshop

Part 1: Sketching with Data

The first part of the workshop was dedicated to becoming familiar with the theories and practices related to data science, data visualization, and information design. Along with Julian Jamarillo from Bestiario, we introduced different ways of extracting insights from data and conveying a message effectively, from the simple result of a collaborative filtering algorithm to the proper use of a map or a chart. The main objective for the students was to acquire hands-on experience visualizing data and transforming them into small stories.

The day of a bike sharing system

For instance, through the manipulation of a real dataset, participants apprehended its multiple dimensions: spatial, temporal, quantitative, qualitative, their objectivity, subjectivity, granularity, etc. It only took a full day of sketching with data in Quadrigram for participants to start writing and telling small stories about crime in San Francisco or mobility in Barcelona. With the stories embedded in data-driven web pages, we encouraged students to cast a critical eye on the current hype about big data: What are the limitations? Do the data tell a story but not THE story? Consequently, we discussed the notions of trust, quality and integrity of the sources, the ownership of personal data, and the subjectivity in many design decisions to convey a message.


Part 2: Creating implications

In the second part of the workshop we projected the datasets and their stories into the future. We started to imagine a future service, product or solution that links data to fashion, entertainment, the environment, social relations, etc. Using an approach called Design Fiction, we encouraged participants to build elements of a possible data product without being too precious or detailed about them. The aim was to spark conversations about the near future of data, check the sanity of visions and uncover hidden perspectives.

A Design Fiction approach to bringing a technology to the world starts by anticipating how people could co-evolve with it. Instead of designing for Time 0 (T), when people start using a data product or service, I believe it is important to consider the evolution of the user experience with its frictions, rituals, and behaviors at T+1 minute, T+1 hour, T+1 day, T+1 week, T+1 month, etc. until the actual end of life of the product (e.g. what happens to my data when I retire my Fitbit into my box of old devices).

The evolution of the user experience with its frictions, rituals, and behaviors at T+1 minute, T+1 hour, T+1 day, T+1 week, T+1 month, etc. until the actual end of life of the product. Inspired by Matt Jones’ Jumping to the End talk.

Hence, in our workshop, similar to Amazon’s Working Backward process of service design, we asked students to first write a press release that describes in a simple way what a potential data product does and why it exists. The format of the press release is practical because it is not escapist. It forces you to use precise words to describe a thing and its ecosystem (e.g. who built it, who uses it, what it complements, what it is built with).


With the press release in hand, the next exercise consisted of participants “cross-pitching” their concepts to each other for 2 minutes. Quite naturally, from the questions that came up during the exchange, some participants started to list the Frequently Asked Questions (FAQ). The FAQ includes the banal yet key elements that define what the data product is good for. That exercise forced participants to consider the different situations and frictions users could encounter over the life of a product.

As the concepts became clearer, we sketched storyboards of use cases and mocked up interfaces that described the user experience with the product in more detail. Finally, each embryonic data product concept came alive with the production of a piece of design fiction.

In Design Fiction, we use cheap and quick content production material (e.g. video, data visualization, print, interface mockups, …) to make things (e.g. diegetic prototypes) as if they were real. For instance, one student project took the form of the user manual of a smart jacket that shows how a customer should use it, what personal data are exploited, and how the information is revealed.

This type of exploration serves to design and develop prototypes and shapes in order to discard them, make them better, or reconsider what we may take for granted today. It served to consider the data product and its implications. The Design Fictions act as a totem for discussion and evaluation of changes that could bend visions and trajectories. They are a sort of “boundary object” that allows heterogeneous groups of participants to understand, with a common language, the exploitation of data and their instantiation into a product or service.

Some of the implications created and discussed include the Fashion Skin jacket, which explores through a user manual the affordances of smart clothes and how people might interact with contextual information. The press release says:

The Fashion’Skin, with its unique sensing and adaptive fabric, is a revolution in the fashion and the smart clothing landscapes. It is always accorded to the people’s feelings, the weather, or the situation, without compromise. The fabric can change its color, its texture and its form.

Others looked at the data intake rituals of the near future and the hegemony of well-meaning technologies with Noledge, a data patch that transfers knowledge of languages directly into your brain. Here is its unboxing video.

Almost all groups looked at the virtues and pitfalls of feedback loops. For instance, Real Tennis Evo for the Wiii models data generated with Wilson-Sony rackets into simulations of oneself. The game cover advertises that “you can improve your skills by playing against your real self at home”.

The next generation mixed reality tennis experience: Real Tennis Evo.


Takeaways

Data visualizations, prototypes and design fiction are ‘tools’ to experiment with data and project concepts into potential futures. They help uncover the unknown unknowns, the hidden opportunities and unexpected challenges.

Data visualizations help extract insights, and prototypes force you to consider the practical uses of those insights. Design fictions put prototypes and visualizations in the context of everyday life. They help form a concept and evaluate its implications. The approach works well for abstract concepts because it forces you to work backward and explore the artifacts or the byproducts linked to your vision (e.g. a user manual, an advertisement, a press release, a negative customer review, …). Ultimately, the approach encourages considering the ecosystem affected by the presence of a data product: What do people do with it over time? Where are the technical, social and legal boundaries?

Thanks to Daniel Sciboz and Nicolas Nova for the invitation, to Julian Jamarillo and Bestiario for sharing their practice and Quadrigram, and to the students of HEAD and HEG for their creativity, energy and capacity to leave their comfort zone in design, engineering, business and art.

Unveiling Quadrigram

So for the last 8 months I have been working almost exclusively with my friends at the information visualization consulting company Bestiario on new tools to visualize information. Last year, based on our joint experience, we detected two increasing demands within innovative institutions. First, the wish to think freely with data, outside of coding, scripting, wizard-based or black-box solutions. Second, the necessity to diffuse the power of information visualization within organizations so that it reaches the hands of people with knowledge and ideas of what data mean.

Our efforts have now culminated in Quadrigram, a Visual Programming Environment to gather, shape and share living data. By living data we mean data that are constantly changing and accumulating. They can come from social networks, sensor feeds, human activity, surveys, or any kind of operation that produces digital information.

For Bestiario and its long track record in ‘haute couture’ interactive visualizations, Quadrigram offers ‘prêt-à-porter’ solutions for organizations, consultants, analysts, designers and programmers working routinely with these types of data. As with other services, data visualization plays a central role in the making sense and sharing of complex data.

I got the chance to work on multiple conceptual, engineering and strategic aspects of Quadrigram. In this post I summarize the four main areas I had the pleasure to shape in collaboration with Bestiario:

1) Redefining work with data

For us at Near Future Laboratory, it made sense to help Bestiario with our experience in prototyping solutions that become feedback loops where our clients can actually figure something out. Indeed, more and more results of our investigations have become interfaces or objects with a means of input and control rather than only static reports. The design of Quadrigram rests on this very idea of ‘feedback loop’ and provides a WYSIWYG (What You See Is What You Get) interface. It is designed for iterative exploration and explanation. Each iteration, or “sketch”, is an opportunity to find new questions and provide answers with data. Data mutate and take different structures in order to unveil their multiple perspectives. We like to think that Quadrigram offers this unique ability to manipulate data as a living material that can be shaped in real time or, as Mike Kuniavsky nicely describes in Smart Things: Ubiquitous Computing User Experience Design: “Information is an agile material that needs a medium”. And this concerns not only ‘data scientists’ but everybody with knowledge and ideas in a line of work that involves data.

With the diffusion of access to data (e.g. the open data movement), our investigation with data has become utterly multi-disciplinary. Nowadays, our projects bring different stakeholders on board with rapidly prototyped tools that promote the processing, recompilation, interpretation, and reinterpretation of insights. For instance, our experience shows that the multiple perspectives extracted from the use of exploratory data visualizations are crucial to quickly answer some basic questions and provoke many better ones. Moreover, the ability to quickly sketch an interactive system or dashboard is a way to develop a common language amongst varied stakeholders. It allows them to focus on the tangible product or service opportunities that are hidden within their data. I like to call this practice ‘Sketching with Data’; others, such as Matt Biddulph, talk about “Prototyping with data” (see also Prototyping location apps with real data). Regardless of the verb used, we suggest a novel approach to working with data in which analysis and visualizations are not the only results, but rather the supporting elements of a co-creation process to extract value from data. In Quadrigram, the tools to sketch and prototype took the form of a Visual Programming Environment.

The teaser video summarizes the vision behind Quadrigram

2) Reducing the barriers of data manipulation

Visual Programming Environments have flourished in the domain of information technologies, starting with LabVIEW in the 80s and then spreading to emerging fields that mix data with creativity, such as architecture, motion graphics and music. In these domains, they have demonstrated their virtue in lowering the barrier to entry for non-experts (check the VL/HCC community for more on the topic). In the Visual Programming Environment we developed, users interactively manipulate pre-programmed modules represented as graphical elements. When connected, these modules form a ‘data flow’ (an approach also called dataflow programming) that provides constant visual awareness of the result of the program (“What You See Is What You Get”), ideal for quick “trial and error” explorations. This way the tool allows for the evaluation of multiple pathways towards the correct solution or desired result. It inspires solution-finding for non-technical professionals by exposing the full flow of data.
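
To make the idea of a ‘data flow’ a bit more concrete, here is a minimal sketch in Python. It is not Quadrigram code: the Module and Flow classes and the module names ("load", "filter", "sort", "count") are hypothetical, chosen only to show how connected modules can expose every intermediate result instead of hiding it in a black box.

```python
from typing import Any, Callable, Dict, List


class Module:
    """A pre-programmed building block (hypothetical, for illustration)."""

    def __init__(self, name: str, func: Callable[..., Any], inputs: List[str]):
        self.name = name
        self.func = func
        self.inputs = inputs  # names of the upstream modules feeding this one


class Flow:
    """A tiny data flow: modules wired together by name."""

    def __init__(self) -> None:
        self.modules: Dict[str, Module] = {}

    def add(self, name, func, inputs=()):
        self.modules[name] = Module(name, func, list(inputs))
        return self  # allow chaining

    def evaluate(self) -> Dict[str, Any]:
        """Return the output of every module so each step stays visible."""
        results: Dict[str, Any] = {}

        def run(name: str) -> Any:
            # Compute upstream modules first, and each module only once.
            # Note: this sketch assumes the wiring contains no cycles.
            if name not in results:
                module = self.modules[name]
                args = [run(upstream) for upstream in module.inputs]
                results[name] = module.func(*args)
            return results[name]

        for name in self.modules:
            run(name)
        return results


flow = (
    Flow()
    .add("load", lambda: [12, 7, 30, 7, 18])
    .add("filter", lambda rows: [r for r in rows if r > 10], ["load"])
    .add("sort", lambda rows: sorted(rows), ["filter"])
    .add("count", lambda rows: len(rows), ["filter"])
)
outputs = flow.evaluate()
for name, value in outputs.items():
    print(name, "->", value)
```

Because evaluate() returns the output of every module rather than only the final one, swapping the "filter" step and re-running shows immediately how the change propagates downstream, which is the “trial and error” loop described above in miniature.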

The take a tour video presents the Visual Programming Environment, which offers a transparent way of setting up a solution that contrasts with wizard-based environments and their “black boxes”.

3) Creating a coherent language

A major challenge when grouping tools to work with data within a common Visual Programming Environment has been to define the basic building blocks of a language. Starting from scratch, we went through an exploratory phase that led to the release of an experimental environment called Impure and its large set (500) of diverse modules. This free solution attracted a decent community of 5000 valorous users. We used Impure as a testbed for our ideas and performed the necessary user studies to come up with a coherent basic language. We particularly focused on specific action verbs (what people can do; see “Verbs” and “design and verbs”) that cover the most common operations on data: sort, search, insert, merge, count, compare, replace, remove, filter, create, get, cluster, encode, decode, convert, accumulate, split, resize, set, execute, load, save. These actions are performed on Data Structures (e.g. create List, sort Table, replace String, cluster Network, compare Date, resize Rectangle, load Image, …) within specific domains (e.g. Math, Geography, Statistics, …). The language is complemented with a growing list of Visualizers categorized according to the aspects of the data they aim to reveal (e.g. compare, contextualize, relate, …). Through this structure (actions – structure – domain) users can find the appropriate module within a very dense and diverse toolset.
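
The action – structure – domain scheme can be pictured as a small searchable catalog. The sketch below is only an illustration in Python: ModuleSpec, CATALOG and find are hypothetical names, the four entries are made up, and the "Text" domain is an assumption (the post only names Math, Geography and Statistics).

```python
from dataclasses import dataclass
from typing import Callable, List, Optional


@dataclass(frozen=True)
class ModuleSpec:
    action: str     # verb, e.g. "sort"
    structure: str  # data structure, e.g. "List"
    domain: str     # e.g. "Math"
    run: Callable   # the operation the module performs


# A made-up catalog; the real toolset is far larger.
CATALOG: List[ModuleSpec] = [
    ModuleSpec("sort", "List", "Math", sorted),
    ModuleSpec("count", "List", "Math", len),
    ModuleSpec("replace", "String", "Text", lambda s, old, new: s.replace(old, new)),
    ModuleSpec("split", "String", "Text", lambda s, sep: s.split(sep)),
]


def find(action: Optional[str] = None,
         structure: Optional[str] = None,
         domain: Optional[str] = None) -> List[ModuleSpec]:
    """Narrow the catalog by any combination of the three facets."""
    def matches(spec: ModuleSpec) -> bool:
        return ((action is None or spec.action == action)
                and (structure is None or spec.structure == structure)
                and (domain is None or spec.domain == domain))
    return [spec for spec in CATALOG if matches(spec)]


string_modules = find(structure="String")
sort_list = find(action="sort", structure="List")[0]
print([m.action for m in string_modules])
print(sort_list.run([3, 1, 2]))
```

Each facet is optional, so a user can start from what they want to do ("sort"), from what they have in hand (a "String"), or from the field they work in, and still land on the same module.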

This exploratory analysis video shows how a unique language provides similar perspectives in the same dataset.

4) Steering the development of an environment that takes advantage of an ecosystem of great tools

Bestiario’s CEO José Aguirre always likes to present Quadrigram as a sponge capable of absorbing information from many diverse sources: social networks, databases, Internet of Things, social media tools, business analytics tools, etc., stressing that “In the wild we know that it is not the strongest who survive but rather those who best cooperate”. We brought that vision to reality with an environment based on servers ‘in the cloud’ that integrates with other sophisticated tools. Like many other platforms, Quadrigram connects to various types of data sources (databases, APIs, files, …) to load data within a workspace. But we also wanted users with detailed needs to take advantage of R scripting to perform advanced statistical methods or of Gephi to lay out large networks. The main challenge was to find and implement a protocol to communicate Quadrigram data structures back and forth with these great tools. In other words, we wanted users to perform analysis in R as part of their data flow. Similar to the architecture of distributed systems and the use of JSON nowadays, the solution was to pass around serialized Quadrigram objects. That offers a pretty unique mechanism to store and share the results of data manipulations, what we call “memories”. For instance, the content of a Table stored on a Quadrigram server is available publicly to other tools via a URL (e.g. store an analysis of my CPU activity).
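
As a rough illustration of what passing around serialized objects can look like, here is a small Python sketch. The actual Quadrigram wire format is not described in this post, so the Table class, its field names and the JSON layout below are all assumptions.

```python
import json
from dataclasses import dataclass, field
from typing import Any, List


@dataclass
class Table:
    """A hypothetical stand-in for a Table data structure."""
    columns: List[str]
    rows: List[List[Any]] = field(default_factory=list)

    def to_json(self) -> str:
        # Tag the payload with its type so a receiving tool knows what it got.
        return json.dumps(
            {"type": "Table", "columns": self.columns, "rows": self.rows}
        )

    @staticmethod
    def from_json(payload: str) -> "Table":
        data = json.loads(payload)
        if data.get("type") != "Table":
            raise ValueError("payload is not a serialized Table")
        return Table(columns=data["columns"], rows=data["rows"])


# Made-up sample: a few readings of CPU activity.
cpu = Table(columns=["minute", "cpu_percent"],
            rows=[[0, 12.5], [1, 48.0], [2, 31.2]])

payload = cpu.to_json()             # what would be stored or sent to another tool
restored = Table.from_json(payload)  # what the other tool would rebuild
print(payload)
print(restored == cpu)
```

Since the payload is plain text, it can be stored behind a URL and read by any tool that understands JSON, which is the property that makes such “memories” easy to share.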

Why do I blog this: It has been a unique opportunity to help shape a software product and bring it to market. When we created Lift Lab and now Near Future Laboratory we knew it was the kind of experience we wanted to live. This post is an attempt to keep track of the work performed to make Quadrigram a tool that we hope will open new practices around the manipulation and visualization of data. Thanks to the team at Bestiario for their talent and stimulating discussions. I will continue contributing to the project with constant technical, strategic and conceptual guidance. I have also joined the advisory board in the company of Bernando Hernandez and Jaume Oliu.

At O’Reilly Strata Conference

Last week I participated in the O’Reilly Strata Conference with a 40-minute talk in the session on ‘visualization & interfaces’. My contribution argued for the necessity to quickly answer and produce questions at different stages of the innovation process with data. I extended the material presented at Smart City World Congress by adding some narrative on the practice of sketching by major world changers and focusing on Quadrigram as an example of a tool that embraces this practice with data. The abstract went as follows:

Sketching with data

Since the early days of the data deluge, the Near Future Laboratory has been helping many actors of the ‘smart city’ in transforming the accumulation of network data (e.g. cellular network activity, aggregated credit card transactions, real-time traffic information, user-generated content) into products or services. Due to their innovative and transversal inclination, our projects generally involve a wide variety of professionals, from physicists and engineers to lawyers, decision makers and strategists.

Our innovation methods bring these different stakeholders on board with rapidly prototyped tools that promote the processing, recompilation, interpretation, and reinterpretation of insights. For instance, our experience shows that the multiple perspectives extracted from the use of exploratory data visualizations are crucial to quickly answer some basic questions and provoke many better ones. Moreover, the ability to quickly sketch an interactive system or dashboard is a way to develop a common language amongst varied stakeholders. It allows them to focus on the tangible product or service opportunities that are hidden within their data. In this form of rapid visual business intelligence, an analysis and its visualization are not the results, but rather the supporting elements of a co-creation process to extract value from data.

We will exemplify our methods with tools that help engage a wide spectrum of professionals in the innovation path in data science. These tools are based on a flexible data platform and visual programming environment that make it possible to go beyond the limited design possibilities of industry standards. Additionally, they reduce the prototyping time necessary to sketch interactive visualizations that allow the different stakeholders of an organization to take an active part in the design of services or products.

Slides + notes (including links to videos)

Sketching with data (PDF 15.7MB) presented at the O’Reilly Strata Conference in Santa Clara, CA on 29.02.2012.

Sketching with Data at O'Reilly Strata Conference

Introducing Elephant Path

Elephant Path home

It is rewarding to see some of our areas of investigation at Lift Lab burgeoning in relation with our clients and partners. For instance, we now have a good set of tools and reasonably well-documented processes that help qualify and profile territories from their network activity (e.g. GSM, WiFi, Bluetooth, mobility infrastructures, social networks). One specificity of our approach is to produce visualizations that characterize the data at hand very early in the analysis process. It tremendously helps bring the different actors of a project onto the same page by opening a dialogue, and their interpretations of what they see are often great material for early insights to focus the investigation (see Exploration and engage in the discussion in the Data City essay).

This ability to sketch with data is particularly fruitful when dealing with multidisciplinarity. Indeed, data visualization brings very diverse practices and methodologies together around a shared language (e.g. in our projects on network data, we deal with a bestiary of physicists, network engineers, marketing directors, salesmen, architects, geographers, social scientists, innovation specialists, …). Over the last months, we have been very fortunate to partner with our friends at Bestiario, who share a common vision: data visualization is part of an innovation process, not its outcome. They applied this perspective in their latest product Impure, an engine with an intuitive visual programming language. Impure has particularly revolutionized our ability to quickly communicate the early results of our investigation. In a few weeks we have been able to swiftly create interfaces in collaboration with designers who did not have prior programming skills. One outcome of the use of our set of tools is Elephant Path, a concept by Lift Lab, designed and implemented by the young designer Olivier Plante in Impure:

Elephant Path, a social navigation interface based on the thousands of pieces of information inhabitants and visitors share publicly on the web
Our idea of Elephant Path germinated years ago with the emergence of new ways of reading and discovering a territory through its digital activities (see my PhD thesis). It collided with our long interest in the principles of social navigation (see rss4you developed by Nicolas and Robi in the early days of content syndication) that leverage traces of activities with the goal to facilitate locating and evaluating information. In the physical world, a classic example of social navigation is a trail (called elephant path, desire line, social trail or desire path) developed by erosion caused by people making their own shortcuts (a phenomenon we like to observe).

Taking that concept into the informational layers of our cities and regions, we sketched in Impure the possibility to reveal unofficial routes and beaten tracks through the thousands of pieces of information inhabitants and visitors share publicly on the web. Technically, we deployed our own algorithms to extract travel sequences using collections of user-generated content from Wikipedia, Flickr and Geonames. For each region, Elephant Path lists Wikipedia entries and selects some of the monuments, parks, and other popular sites with a story. It consolidates the Wikipedia entries with geographical coordinates via the Geonames API. Then, it uses the Flickr API to collect the information photographers share at these locations. Finally, it applies our own network data analysis algorithms to filter the data, produce travel sequences and measure photogenic levels.
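
To give a feel for the last step, here is a simplified Python sketch of how travel sequences and monthly activity could be derived from timestamped photos. It is not the algorithm used in Elephant Path: the records are invented, the 8-hour cut-off between two strolls is an arbitrary assumption, and the function names are hypothetical.

```python
from collections import Counter, defaultdict
from datetime import datetime, timedelta

# Invented records: (photographer id, site name, time the photo was taken).
photos = [
    ("u1", "Café de Flore", datetime(2010, 5, 1, 10, 0)),
    ("u1", "Jardin du Luxembourg", datetime(2010, 5, 1, 11, 30)),
    ("u1", "Centre Pompidou", datetime(2010, 5, 1, 14, 0)),
    ("u2", "Café de Flore", datetime(2010, 7, 3, 9, 0)),
    ("u2", "Jardin du Luxembourg", datetime(2010, 7, 3, 10, 15)),
    ("u2", "Jardin du Luxembourg", datetime(2010, 7, 3, 10, 20)),
    ("u3", "Centre Pompidou", datetime(2010, 7, 4, 16, 0)),
    ("u3", "Café de Flore", datetime(2010, 7, 6, 16, 0)),
]

MAX_GAP = timedelta(hours=8)  # assumed cut-off between two separate strolls


def travel_sequences(records, max_gap=MAX_GAP):
    """Group each photographer's photos into ordered lists of visited sites."""
    by_user = defaultdict(list)
    for user, site, taken in sorted(records, key=lambda r: (r[0], r[2])):
        by_user[user].append((site, taken))

    sequences = []
    for visits in by_user.values():
        current = [visits[0][0]]
        for (prev_site, prev_time), (site, taken) in zip(visits, visits[1:]):
            if taken - prev_time > max_gap:
                # Too long a pause: close this stroll and start a new one.
                sequences.append(current)
                current = [site]
            elif site != prev_site:
                # Several photos at the same site count as a single visit.
                current.append(site)
        sequences.append(current)
    return sequences


def transition_counts(sequences):
    """Count how often one site is directly followed by another."""
    counts = Counter()
    for seq in sequences:
        counts.update(zip(seq, seq[1:]))
    return counts


def monthly_activity(records):
    """Count photos per calendar month (1 = January)."""
    return Counter(taken.month for _, _, taken in records)


sequences = travel_sequences(photos)
transitions = transition_counts(sequences)
for (origin, destination), n in transitions.most_common():
    print(f"{origin} -> {destination}: {n}")
print(dict(monthly_activity(photos)))
```

The transition counts are the raw material of a trail network (sites as nodes, frequent transitions as edges), while the monthly counts give the kind of seasonal profile that lets one compare a “summer” destination with a “fall” one.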

We have done it for both Paris and Barcelona. For each city, Elephant Path provides measures on the main trails, on the photogenic attractions and the months of activity. For instance the information reveals that:

Elephant Path
Paris seems to be a “summer” destination according to its monthly photographic activity. If you are in Paris during that period, the parks (Bois de Boulogne, Parc Monceau and Jardin du Luxembourg) might not be your visiting priorities. Indeed, these sites seem to be more photogenic in Spring and Fall. But if you are at Jardin du Luxembourg, there is some chance that you were in the St-Germain des Prés neighborhood (e.g. Café de Flore) previously and that your stroll there might very well bring you to Centre Pompidou, which links the nearby Panthéon with the trendy Marais neighborhood. Barcelona seems to be more of a “fall” destination according to the monthly photographic activity. Discover it yourself. [More screenshots]

An interface designed for you to copy and adapt it
But Elephant Path doesn’t end with data visualization, maps and graphs that can be embedded into web pages. It is meant to be open and to be appropriated in unexpected ways. The Impure platform offers numerous data access, information processing and visualization capabilities. You can copy the code and data of Elephant Path and improve it in your workspace. The content of the work is under the terms of a Creative Commons License. Do not hesitate to rip and adapt it!

Data City: A Text for Visual Complexity, the Book

Early last year Manuel Lima kindly invited me to contribute to his book at Princeton Architectural Press on the topic of Network Visualization. The book VisualComplexity: Mapping Patterns of Information is now available for pre-order. However, my essay did not make it through the editor’s last pass. My role was to provide an overview of the topic “data city”, its future implications and the role of visualization in this context. I tried to give a high-level reflection on the field, evaluating its present and future while keeping the text accessible with tangible examples. It was written in January 2010 and it is unedited, but you still might find some relevant elements:

City and information

A city has, by default, always been about information and its diffusion. Historically, fixed settlements permitted the development of newspapers and the possibility for the exchange of information. It will continue to do so in the near future given the volume of data modern cities generate and the emerging selection of algorithms and visualizations available to us to extract information.

The digitization of information

Indeed, we are noticing a digitization of contemporary cities, with technologies embedded into their streets and buildings and carried by people and vehicles. This evolution has appended an informational membrane over the urban fabric that affords citizens new flexibility in conducting their daily activities. Simultaneously, this membrane reports on previously invisible dynamics of a city, providing new means to the multiple actors of urban life to reshape the spaces, the policies, the flows, the services and the many different aspects that constitute a city. For instance, the aggregated view of mobile phone traffic reveals the “pulse” of a city, detecting anomalies such as traffic congestion in real-time. Similarly, the deployment of radio frequency identification (RFID) tags connects inanimate objects into an Internet of Things. Ben Cerveny, strategic and conceptual advisor to the design studio Stamen, summed up this evolution as “things informalize”, using the following terms: “the city itself is becoming part of the Internet with a world of data moved piece by piece and collided against a open source toolchain and methodology”.

Tools and platforms to reveal the data city

The data collisions described by Ben Cerveny produce multiple layers of urban information accessible to the actors of the city for their appropriation. Mixed with the emergence of accessible cartography (e.g. OpenStreetMap), descriptive languages (e.g. KML), data visualization platforms (e.g. GeoCommons), and data processing techniques (e.g. Geocoding), today’s representations of cities do not only depict the cityscape, they reveal conditions in the city that were previously hidden in spreadsheets and databases. As the datasets become more complex and their model of representation richer, graphically representing the city has become less a matter of convention and more a matter of invention. Indeed, traditional cartography with primitive line drawings and static images now co-exists with flexible solutions that separate raw data from the map, and promote exploration with multiple-scale interactivity and reactive environments. This evolution was particularly striking with the popularity of “mash-ups”, linking information to space and mapping newly accessible urban data on top of interactive imagery.


The popularity of “mash-ups” has prompted larger initiatives (e.g. “open data” and “web of data”) to free urban data from their silos and promote their public appropriation. Practically, city and government data have also been moving onto the Web, making accessible the locations of infrastructures, crime reports or pollution readings. As a consequence, “data scientists”, developers and designers create palettes of city data-based visualizations and applications, transforming data and their visualization into a public good. In parallel, other platforms such as Pachube have contributed to the bottom-up generation and upload of city data, with visualization platforms such as GeoCommons or IBM’s Many Eyes to communicate and share views. This participation offers the opportunity to change cities’ urban strategies, with potential innovations creating new ways to look at the process of citymaking.

Other ways to share the dynamics of the city have emerged in a less obvious but nevertheless indicative unfolding. For instance, over the past years, Idealista, a Spanish online real estate ad platform, had been accumulating massive amounts of information on cities’ housing markets. Only recently have they started to offer their analysis of the evolution of the real estate market back to the public, almost in real-time, with an API for developers to appropriate the results. This strategy offers a city the kind of insights that previously only tedious administrative survey procedures were producing.

Similarly, as the information is not always well-formatted for analysis and visualization inquiries, some initiatives had to develop “web scraping” techniques to extract valuable data from the public web sites of local institutions and service providers. For instance, for Oakland Crimespotting, the developers at Stamen Design parsed the web site of the Oakland City Police to produce an effective interactive visualization of crime data showing residents where crime is occurring and what types of crimes are being reported.
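
As a small illustration of what “web scraping” involves, the Python sketch below pulls rows out of an HTML table using only the standard library. The page fragment is invented for the example; it is not the markup of the Oakland City Police site, and this is not the code written at Stamen.

```python
from html.parser import HTMLParser

# Invented page fragment; a real site would have its own, messier markup.
PAGE = """
<table id="reports">
  <tr><th>Date</th><th>Type</th><th>Address</th></tr>
  <tr><td>2010-01-04</td><td>Burglary</td><td>1200 Example St</td></tr>
  <tr><td>2010-01-05</td><td>Vandalism</td><td>300 Sample Ave</td></tr>
</table>
"""


class TableScraper(HTMLParser):
    """Collect the text of every table cell, row by row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._row = None   # cells of the row being read, if any
        self._cell = None  # text fragments of the cell being read, if any

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._row = []
        elif tag in ("td", "th") and self._row is not None:
            self._cell = []

    def handle_data(self, data):
        if self._cell is not None:
            self._cell.append(data)

    def handle_endtag(self, tag):
        if tag in ("td", "th") and self._cell is not None:
            self._row.append("".join(self._cell).strip())
            self._cell = None
        elif tag == "tr" and self._row is not None:
            self.rows.append(self._row)
            self._row = None


scraper = TableScraper()
scraper.feed(PAGE)

# First row holds the column names; the rest are the reports.
header, *records = scraper.rows
reports = [dict(zip(header, record)) for record in records]
for report in reports:
    print(report)
```

Once the reports are structured records rather than markup, they can be geocoded and mapped, which is the step that turns a page meant for reading into data meant for exploring.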

The roles of visualization

Exploration and engage in the discussion
The work of Stamen proved that this type of “interventionist mapping” goes beyond the expository. Indeed, the use of interactive visualization allows exploration and question-making, broadening the urban policy conversation. In fact, aesthetics plays a fundamental role in engaging the discussion. It is not without reason that many visualizations of urban informational layers are exhibited in museums. Indeed, the application of aesthetics to data is not only about making citizens aware of what is happening around them, but also about figuring out the most elegant ways of making the unseen felt and gathering feedback. As a researcher at MIT Senseable City Lab, I experienced the fundamental utility of “beautiful” visualization as part of the investigation process, to attract the attention of cities’ stakeholders, stimulate the dialogue and stretch the imagination. For instance, very early on in the Tracing the Visitor’s Eye project, we produced visualizations to acquire first-hand feedback from journalists and inhabitants of the Province of Florence. They naturally contextualized our work within local politics, wondering whether our results could help move the David statue to more appropriate tourist areas or whether they could better understand the impact of the arrival of low-cost airlines at a nearby airport. In contrast, they also helped highlight the “imperfect mirror to reality” we were projecting, rightfully arguing that the models and data supporting the visualization reveal only a partial perspective on visitor dynamics.

Decision making – integration into existing practices
The critique of mash-ups and raw data visualizations is a necessary first step to produce knowledge. It leads to investigation, further linking the data to improve the ways professionals and authorities understand and manage cities. Indeed, architects, transportation engineers, urban planners, policy makers and community groups rely on new types of representations as communication instruments as much as means to analyze urban dynamics. In fact, the application of visualizations that combine the emerging time-space data has proven vital, particularly because the language through which designers, planners and decision makers communicate plans is mainly visual.

Responsive environments
Outside the realm of professionals, the flexibility of new data processing and visualization techniques facilitates their communication to the public through multiple mediums, from projection on building facades to the transformation of physical space. Indeed, the cityscape offers plenty of interfaces to display the state of city-scale services such as energy consumption (e.g. green smoke) or road traffic. When communicated in real-time, the information creates a responsive environment capturing city dynamics, supporting decision-making and adapting to the changing needs of the public. MIT Senseable City Lab’s seminal project WikiCity exemplifies the implementation of this feedback loop mechanism. This urban demo proposed a visualization platform for the citizens of Rome to view on large screens the city’s dynamics in real-time (e.g. presence of crowds, location of buses, awareness of events). This platform enabled people participating in the Notte Bianca event to become prime actors themselves, dynamically appropriating the city and the event. Besides the opportunity this type of responsive environment offers to improve the experience of a city, it raises two challenges: designing the mechanisms by which these services are provisioned, and understanding which activities citizens use them for.


The modern city is built not just upon physical infrastructure, but also upon patterns and flows of information that are growing and evolving. We are only at the beginning of the development of the tools and visualizations that allow us to see these complex patterns of information over huge spans of time and space, or in any local context in real-time.
Yet, this data city faces major challenges. In particular, the collection of data and their communication involve the collaboration of multiple actors speaking different languages at the crossroads of urbanism, information architecture, geography and human sciences. Indeed, it is evident that the understanding of a city goes beyond logging machine states and events. Therefore, the data scientist’s fascination with the massive amounts of data cities produce in “real-time” should not discard the other points of view necessary to understand the city, its environment and its people. In other words, data alone do not explain, and their visualizations do not stand alone.

Why do I blog this: Thanks to Manuel for the invitation. Even though the text did not pass the final cut, it was a very healthy and fun exercise to try to write about my work and domains of investigation in accessible terms.

Sketching with Data

These past weeks I had the chance to work with the alpha version of Impure, a new visual programming environment developed by my good friends at Bestiario. Impure offers a full visual language to retrieve, manipulate, process and visualize information:

Impure allows the acquisition of information from different sources, ranging from user-specific data to popular online feeds, such as from social media, real-time financial information, news or search queries. This data can then be combined in meaningful ways using built-in interactive visualizations for exploration and analysis.

Based on an event-based development structure, the software consists of 5 different modules.
1. Data Structures, which hold data coming from a data source (e.g., Number, String, List, etc.).
2. Operators, which have 1 or more receptors that enable the system to perform a specific operation (e.g., addition or subtraction).
3. Controls, which act as dynamic filters (e.g., interval selectors).
4. Visualizators, which receive data structures from operators or controls and visualize it. They usually return emitters on selected visual objects that can be used as input into another module.
5. APIs that allow real-time communication with various data sources such as Google, Twitter, Facebook, Flickr, Delicious, Ebay, etc.
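
To make the chaining of these modules more concrete, here is a minimal sketch in Python of how a data structure could flow through an operator, a control and a visualizator. Every name below (`NumberList`, `add_operator`, `interval_control`, `bar_visualizer`) is a hypothetical stand-in of my own; Impure is a visual environment and this is not its actual API.

```python
from dataclasses import dataclass
from typing import Callable, List

# Hypothetical stand-ins for the module types listed above (not Impure's API).


@dataclass
class NumberList:
    """Data structure: holds values coming from a data source."""
    values: List[float]


def add_operator(a: NumberList, b: NumberList) -> NumberList:
    """Operator with two receptors (a, b): element-wise addition."""
    return NumberList([x + y for x, y in zip(a.values, b.values)])


def interval_control(low: float, high: float) -> Callable[[NumberList], NumberList]:
    """Control: a dynamic filter keeping values inside [low, high]."""
    def apply(data: NumberList) -> NumberList:
        return NumberList([v for v in data.values if low <= v <= high])
    return apply


def bar_visualizer(data: NumberList) -> List[str]:
    """Visualizator: renders each value as a row of '#' characters."""
    return ["#" * int(v) for v in data.values]


# Invented sample values, wired together like boxes in a visual program.
morning = NumberList([2, 5, 9])
evening = NumberList([1, 4, 3])
total = add_operator(morning, evening)
selected = interval_control(0, 10)(total)
bars = bar_visualizer(selected)
print("\n".join(bars))
```

The point of the sketch is only the shape of the pipeline: each module takes the output of the previous one, so swapping the control or the visualizator does not require touching the rest.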

The prime objective of Impure is to bridge the gap between ‘non-programmers’ and data visualization by linking information to programmatic operators, controls and visualization methods through an easy visual and modular interface. Yet, I must admit that Impure has a lot to offer to programmers and data specialists, particularly those who need to “sketch with data” as part of their practice. In other words, the type of professionals who process and visualize datasets as part of their investigation process rather than solely to generate results.

loading data

In my experience leading investigations that aim at extracting value from network data, exploratory data visualization is crucial to quickly recognize patterns and understand complex events. But as my projects involve diverse sets of professionals (e.g. ), being able to quickly sketch an interactive dashboard is a guarantee of having a common language that helps the different actors ask better questions, give better feedback and properly focus the investigation (some call it visual thinking or, to some extent, predictive analytics). Ultimately, the use of platforms such as Impure offers the opportunity to collect insights that give a project the upper hand in decision making. Moreover, the flexibility of a visual programming environment makes it possible to go beyond the limited design possibilities of GIS and statistical software while reducing the prototyping time necessary to program specific interactive visualizations (e.g. see the animations of traffic density and flows in Zaragoza, Spain, based on real-time information and produced in a few hours with Impure).

Practically, in addition to quickly sharing a first exploratory analysis, environments such as Impure can simplify the practice of ethno-mining, particularly to co-create data with participants of the field research (see Numbers Have Qualities Too: Experiences with Ethno-Mining).

Sketching with data
Sketching a solution for the Louvre Museum with Impure (see the complete “Sketching with data” Flickr set)

In a similar approach to “sketching with data”, Stamen, which has long been leading investigative data visualization projects, generally divides its process into three distinct phases: explore, build and refine. Based on their experience, they outlined some of their common assumptions about data visualization and recommendations for how to do this kind of work; one is particularly relevant to the exploration (i.e. sketching) phase:

(19) Start and End With Questions

“Traditional statistical charts can be a good first step to generate questions, especially for getting an idea about the scope of a data set. Good questions to start with include “how many things do we have”, “what do we know about each thing”, “how do the things change over time”, “how many of each category of thing do we have”, “how many things are unique” and “how many things is each thing connected to”. I don’t believe that any visualization can answer all of these questions. The best visualization will answer some questions and prompt many more.”
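
Several of these starting questions can be answered with a few lines of code before any chart is drawn. The sketch below uses plain Python on a handful of records that I invented for illustration; the field layout (a date and a category) is an assumption, not a real dataset.

```python
from collections import Counter

# Hypothetical records (date, category); the values are invented.
records = [
    ("2010-05-01", "theft"),
    ("2010-05-01", "vandalism"),
    ("2010-05-02", "theft"),
    ("2010-05-03", "theft"),
    ("2010-05-03", "robbery"),
]

# "How many things do we have?"
how_many = len(records)

# "How many of each category of thing do we have?"
per_category = Counter(category for _, category in records)

# "How many things are unique?" (here: distinct categories)
unique_categories = len(per_category)

# "How do the things change over time?" (here: counts per day)
per_day = Counter(day for day, _ in records)

print("records:", how_many)
print("per category:", dict(per_category))
print("unique categories:", unique_categories)
print("per day:", sorted(per_day.items()))
```

Counts like these only frame the scope of the dataset; the questions they prompt are what the sketching phase is for.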

Apparently, they are engaged on a similar path to Bestiario, using a Knight News Challenge grant to build a series of tools to map and visualize data that is truly Internet-native and useful. Flexible and “internet-native” environments that make it easier to work with information are also emerging at the data storage end of “sketching”, for instance with the Barcelona-based FluidDB.

Zaragoza car traffic behavior @Bcn Design week
At Barcelona Design Week, sharing an animation of the traffic flows in Zaragoza sketched with Impure using a real-time data feed from BitCarrier.