Over the past three months, Bocoup has been working closely with the
Guardian Interactive team on the Miso Project, a set of open source
libraries designed to expedite and simplify the creation of data-driven
interactive content. We are excited to announce the release of the first
of these libraries, called Dataset. The code is available on GitHub.

Traditionally, data-driven interactive applications all follow a similar workflow:
remote data sources are fetched, the data is parsed to match the client-side
representation of the model, and then it is perhaps transformed and queried to obtain
the information the rendering layer actually needs. Each of these steps can be
accomplished through custom code or a collection of existing libraries and frameworks.
When writing Dataset, we wanted to simplify this part of the workflow by creating a
single library that manages the entire process.
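
To make that workflow concrete, here is a minimal sketch of the same steps written by hand in plain JavaScript. It does not use Dataset at all; the function names (`parseCsv`, `coerce`, `totalVisitorsByCity`) and the sample data are hypothetical and exist only for illustration, and the fetch step is stubbed with an inline string.

```javascript
// Step 1 (fetch) is stubbed out: in a real application this string would
// come from an HTTP request. The data below is made up for illustration.
const rawCsv = [
  "city,year,visitors",
  "London,2011,120",
  "Boston,2011,80",
  "London,2012,150",
  "Boston,2012,95"
].join("\n");

// Step 2: parse the raw text into an array of row objects.
// Naive parsing: assumes no quoted fields or embedded commas.
function parseCsv(text) {
  const [headerLine, ...lines] = text.trim().split("\n");
  const headers = headerLine.split(",");
  return lines.map(line => {
    const cells = line.split(",");
    const row = {};
    headers.forEach((name, i) => { row[name] = cells[i]; });
    return row;
  });
}

// Step 3: transform string cells into the types the model expects.
function coerce(rows) {
  return rows.map(row => ({
    city: row.city,
    year: Number(row.year),
    visitors: Number(row.visitors)
  }));
}

// Step 4: query for the information the rendering layer needs.
function totalVisitorsByCity(rows) {
  const totals = {};
  for (const row of rows) {
    totals[row.city] = (totals[row.city] || 0) + row.visitors;
  }
  return totals;
}

const rows = coerce(parseCsv(rawCsv));
const totals = totalVisitorsByCity(rows);
console.log(totals);
```

Even in this small case, each step is separate code that has to be written and maintained, which is the overhead a single library for the whole process is meant to remove.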

Dataset comes with a growing list of examples that showcase not only its ease of use,
but also how easily it integrates with existing libraries. One of them, for example, is
a quick bar chart of Dataset's own GitHub repo commit history, drawn with jQuery Sparklines.

Some of Dataset’s facilities echo common patterns that we already see in MVC frameworks:
easy tie-in to API endpoints and creating common client-side models that then make the data
accessible. The Dataset structure is itself a similar abstraction to Backbone collections
or Ember arrays for handling sets of data. While many frameworks provide these facilities
for handling sets of data, our focus is on providing a more efficient implementation with
more extensive APIs for manipulating the incoming data. By making data a first-class citizen
in an MVC application, Dataset functions as a management layer that can be used as a step
before, during or after an MVC framework like Backbone.js. It is our goal to grow the library
in a way that facilitates interoperability with those frameworks, so we look forward
to hearing how you might use it in your workflow.

Available Features

Dataset has a variety of features that try to cover the common set of functionality required by client-side data-driven applications:

  • A series of importers that are responsible for fetching data from
    local and remote sources, like Google Spreadsheets.
  • A variety of parsers that transform the incoming data
    from formats such as CSV into our standard, fast-to-traverse format.
  • A series of computational functions that make it easy to obtain metrics
    about one or more columns in the data, such as min/max and groupBy.
  • A simple and powerful filtering API that lets you create sub-selections
    of the data that match a particular set of conditions.
  • An event system that allows subscription to specific data changes such as new rows being
    added or existing rows being updated.
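
As a rough illustration of how a few of these ideas fit together (column-oriented storage, min/max, filtering and change events), here is a small self-contained sketch. `TinyDataset` and all of its methods are hypothetical names invented for this example; this is not Dataset's actual API, and the sample rows are made up.

```javascript
// Hypothetical class for illustration only; not Miso Dataset's real API.
class TinyDataset {
  constructor(columnNames) {
    // Column-oriented storage: one array per column.
    this.columns = {};
    columnNames.forEach(name => { this.columns[name] = []; });
    this.length = 0;
    this.listeners = {};
  }

  // Subscribe a callback to a named event such as "add".
  on(eventName, callback) {
    (this.listeners[eventName] = this.listeners[eventName] || []).push(callback);
  }

  // Call every callback subscribed to the named event.
  emit(eventName, payload) {
    (this.listeners[eventName] || []).forEach(cb => cb(payload));
  }

  // Append a row and notify subscribers.
  add(row) {
    Object.keys(this.columns).forEach(name => {
      this.columns[name].push(row[name]);
    });
    this.length += 1;
    this.emit("add", row);
  }

  // Rebuild the row object stored at a given index.
  rowAt(index) {
    const row = {};
    Object.keys(this.columns).forEach(name => {
      row[name] = this.columns[name][index];
    });
    return row;
  }

  // Column metrics; these assume a non-empty numeric column.
  min(name) { return Math.min(...this.columns[name]); }
  max(name) { return Math.max(...this.columns[name]); }

  // Return a new TinyDataset holding only the rows that pass the predicate.
  where(predicate) {
    const subset = new TinyDataset(Object.keys(this.columns));
    for (let i = 0; i < this.length; i++) {
      const row = this.rowAt(i);
      if (predicate(row)) subset.add(row);
    }
    return subset;
  }
}

const ds = new TinyDataset(["name", "commits"]);
const added = [];
ds.on("add", row => added.push(row.name));

ds.add({ name: "alice", commits: 12 });
ds.add({ name: "bob", commits: 3 });
ds.add({ name: "carol", commits: 7 });

const busy = ds.where(row => row.commits > 5);
console.log(ds.min("commits"), ds.max("commits"), busy.length, added);
```

Storing one array per column keeps whole-column metrics such as min and max to a single pass over one array, at the cost of rebuilding row objects when filtering.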

While Dataset was written to facilitate browser-based data management, thanks to Tim Branyen's
efforts it is now also available as a Node.js module.

Why We Built It

At Bocoup, we are committed to moving the Open Web forward through developing new Open Web technologies for
industries in transition to the web. We want to ensure that as this happens, the best software tools for doing so are open.

Inspired by the needs of journalism, we set out to work with the Guardian, a leader in Open Journalism,
to develop Open Web data journalism tools. As we began building Dataset, we realised that its facilities
are valuable for other software paradigms that we work on at Bocoup, and so we are excited to release
a tool that focuses on client-side data management.

We as web developers have a lot to learn about narrative and storytelling in our data-focused web applications.
In light of this, it is especially compelling for us to be focusing on an initiative
like the Miso Project that bridges this gap.