Jacquard 04: Extending The Data Model

Since our previous installment got us users and authentication, we’re ready to start adding support for various services. One of the features of Jacquard – one of the main points of Jacquard, actually – is interacting with multiple services. To make that possible, we’ve included the accounts attribute in our User class:

# this attribute is going to contain info about all the services
# this user has configured -- i.e., there will be one for Twitter,
# one for Facebook, etc.
has accounts => (
    isa     => 'KiokuDB::Set',
    is      => 'ro',
    lazy    => 1,
    default => sub { set() },
);

Before we start actually writing the code for the things that are going to be inside those accounts attributes, it’s worth stepping back and thinking about what the data we want to store looks like, and how we can best structure our classes to help us manipulate that data.

Ideally, what we want – and again, this is a reflection of the underlying raison d’etre of Jacquard – is to be able to interact with all the different services we support in the same way without having to worry how each individual service handles fetching new posts, or writing a new post out. In other words, to be able to do something like this:

# warning: pseudocode
foreach my $account ( $self->accounts->members ) {
    $account->get_new_posts();
}

# or even
my $new_post = get_new_post_from_user();
foreach my $account ( $self->accounts->members ) {
    $account->post( $new_post );
}

Since we’re using Moose, we’re going to have a number of Jacquard::Schema::Account::FOO classes, one for each type of service we’re going to interact with. Each of those classes will consume an API-defining role (called something like Jacquard::Schema::Role::AccountAPI) that will define the required methods for the common interface we want all services to have – that is, making sure they support a get_new_posts method, and a post method, and so forth.
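As a rough sketch of what that could look like (assuming Moose is installed): the role name is the one suggested above, but the Dummy account class, its sent attribute, and the method bodies are hypothetical placeholders, not Jacquard’s actual code.

```perl
use strict;
use warnings;

package Jacquard::Schema::Role::AccountAPI {
    use Moose::Role;

    # Any class consuming this role must provide these methods;
    # otherwise Moose dies when the role is applied.
    requires 'get_new_posts', 'post';
}

# Hypothetical stand-in for a real service class such as
# Jacquard::Schema::Account::Twitter.
package Jacquard::Schema::Account::Dummy {
    use Moose;

    # Records what was "posted", purely for illustration.
    has sent => (
        is      => 'ro',
        isa     => 'ArrayRef[Str]',
        default => sub { [] },
    );

    # A real class would call out to the service here.
    sub get_new_posts { return () }

    sub post {
        my ( $self, $text ) = @_;
        push @{ $self->sent }, $text;
        return 1;
    }

    with 'Jacquard::Schema::Role::AccountAPI';
}

package main;

my $account = Jacquard::Schema::Account::Dummy->new;
$account->post('hello world');
print scalar @{ $account->sent }, " post(s) sent\n";
```

The useful property is that a class which forgets to implement post or get_new_posts fails as soon as the role is applied, rather than at some later point when the method is first called.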

We also need to think about how to structure the other data associated with services. That data will fall into two broad categories: account information (and similar data, like authentication tokens, etc.) and posts on that service. Depending on the exact service, the details of what a ‘post’ is will differ – a tweet is not quite a Facebook post, and both are very distinct from an entry in an Atom feed – but they share enough commonalities that we can again take a similar approach: a number of Jacquard::Schema::Post::FOO classes, each consuming a Jacquard::Schema::Role::PostAPI role that ensures that they’re implementing a common API.

The benefit of this approach is that it becomes relatively trivial to add support for new services – particularly if there’s already a module on CPAN to handle interacting with that service. As we’ll see in a number of upcoming posts, taking an existing library and wrapping it in a Moose class that implements a particular API is very little work.

I often find it helpful, when I’ve gotten to this point in the process of designing something and I think I have a workable solution, to work out how a small set of data would map across instances of these classes. Let’s consider how a single Jacquard user, with a Twitter account configured within Jacquard, would look once a few tweets were stored:

# first, the user
my $user = Jacquard::Schema::User->new( name => 'Bob' );

# then we add the account
$user->add_account( 'Twitter' , %account_details );

# and then we get the most recent posts
$user->get_account( 'Twitter' )->get_new_posts();

And for this design, this is the point where I realize I haven’t really thought at all about how the individual Post objects will be associated with the Account object. The simplest way to do this would be for the Account classes to have a posts attribute, much like the accounts attribute in the User class:

has posts => (
    isa     => 'KiokuDB::Set',
    is      => 'ro',
    lazy    => 1,
    default => sub { set() },
);

Assuming that is how we go, after that get_new_posts() call above, we’d end up with the accounts attribute of $user containing a KiokuDB::Set object with a single member – an instance of Jacquard::Schema::Account::Twitter. Inside that object, there would be another KiokuDB::Set object with a bunch of members, each one an instance of Jacquard::Schema::Post::Twitter corresponding to an individual tweet from this user’s timeline (and having attributes like ‘author’, ‘content’, ‘datetime’, ‘id’, and so on).

That all seems to make a reasonable amount of sense – but there’s a looming problem. (If you’ve already spotted it, give yourself a pat on the back.) What happens when we add a second User, also with a Twitter account configured, and that user ends up following some of the same people on Twitter as our first user? We’ll end up with Post objects that are essentially duplicates of each other – but one copy will be inside the Twitter Account object of the first user, and the second will be inside the Twitter Account object of the second. In the long run, that’s going to end up being a big waste of storage space and/or RAM.

How do we solve this problem? We could just maintain a single shared set of Post objects, and include the same object in different Account objects as needed. That would solve the issue with the duplication of information – but it introduces another issue, in that we no longer have a place to store per-user metadata about individual Post objects (e.g., read/unread status).
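To make that trade-off concrete, here is a small sketch in plain Perl, using hashes rather than real Post objects. The %post_by_id registry and the shared_post helper are made up for illustration and are not part of Jacquard.

```perl
use strict;
use warnings;

# Hypothetical registry: one shared record per post id.
my %post_by_id;

sub shared_post {
    my (%args) = @_;
    # Reuse the existing record for this id, or store a new one.
    return $post_by_id{ $args{id} } //= +{%args};
}

# Two users' Twitter accounts both see the same tweet.
my @bobs_posts   = ( shared_post( id => 42, content => 'hello' ) );
my @carols_posts = ( shared_post( id => 42, content => 'hello' ) );

# Only one copy is stored...
print scalar( keys %post_by_id ), " stored post(s)\n";

# ...but per-user state has nowhere to live: Bob marking the post
# as read changes what Carol sees, too.
$bobs_posts[0]{read} = 1;
print $carols_posts[0]{read} ? "carol: read\n" : "carol: unread\n";
```

Because both lists hold the very same record, marking it read for Bob makes it read for Carol as well, which is exactly the missing per-user metadata problem.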

Instead, we’ll solve this problem the old-fashioned way: we’ll introduce another layer of abstraction! Instead of the posts attribute of Account objects containing Jacquard::Schema::Post objects directly, we’ll have a generic Jacquard::Schema::UserPost object that maps between a User’s account and a particular Post. In return for making our data model slightly more complicated, this approach gives us the best of both worlds, in that we have a place for per-User metadata to live, but the original Post data is only present in our system once, regardless of how many of our users have it present in their Account objects.

This solution also doesn’t impact any of our previous design decisions: the Jacquard::Schema::Role::PostAPI can be consumed by the Jacquard::Schema::UserPost class too, via method delegation to the Jacquard::Schema::Post::FOO object it refers to. (More about that later.)
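Here is one way the delegation might be wired up with Moose’s handles option, as a sketch only. The role and UserPost class names come from the text; the Dummy post class, the is_read attribute, and the choice of required methods are assumptions made for the example.

```perl
use strict;
use warnings;

package Jacquard::Schema::Role::PostAPI {
    use Moose::Role;

    # Hypothetical choice of required methods.
    requires 'author', 'content';
}

# Hypothetical stand-in for something like Jacquard::Schema::Post::Twitter.
package Jacquard::Schema::Post::Dummy {
    use Moose;

    has [qw( id author content )] =>
        ( is => 'ro', isa => 'Str', required => 1 );

    # Applied after 'has' so the accessors already exist.
    with 'Jacquard::Schema::Role::PostAPI';
}

package Jacquard::Schema::UserPost {
    use Moose;

    # The shared Post; the listed methods are delegated to it.
    has post => (
        is       => 'ro',
        does     => 'Jacquard::Schema::Role::PostAPI',
        required => 1,
        handles  => [qw( id author content )],
    );

    # Per-user metadata lives here, not on the shared Post.
    has is_read => ( is => 'rw', isa => 'Bool', default => 0 );

    # The delegated methods satisfy the role's requirements.
    with 'Jacquard::Schema::Role::PostAPI';
}

package main;

my $tweet = Jacquard::Schema::Post::Dummy->new(
    id => '42', author => 'alice', content => 'hello',
);

# Two users each get their own wrapper around the same Post.
my $for_bob   = Jacquard::Schema::UserPost->new( post => $tweet );
my $for_carol = Jacquard::Schema::UserPost->new( post => $tweet );

$for_bob->is_read(1);

printf "%s / bob read: %d / carol read: %d\n",
    $for_bob->content, $for_bob->is_read, $for_carol->is_read;
```

Note that the with line comes after the has declarations in both classes: Moose checks a role’s required methods when the role is applied, so the accessors and delegated methods need to exist by then.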

So, now that we’ve been through a couple of rounds of thinking about the data structures, we can write some code, yes? Well, no, not exactly, not quite yet. Instead we’re going to explore what the API for using this code might look like, and we’re going to do that by writing some test code, or at least outlining some test cases.

First, we’re going to need to be able to add Account objects to User objects, and Post objects to Account objects. The code for that will probably look something like:

## method to associate an Account object with a User
# $model is an instance of Jacquard::Model::KiokuDB,
# while $user is a Jacquard::Schema::User,
# and $account->does('Jacquard::Schema::Role::AccountAPI')
$model->add_account_to_user( $account , $user );

## method to add a UserPost object to an Account -- will also save the
## associated Post object if needed
# $model is an instance of Jacquard::Model::KiokuDB,
# while $account->does('Jacquard::Schema::Role::AccountAPI')
# and $userpost->does('Jacquard::Schema::Role::PostAPI')
$model->add_post_to_account( $userpost , $account );

(Aside: I spent more time than I am willing to disclose trying to decide whether the order of arguments in those methods should be ‘more generic object, more specific object’, or if they should instead be the same as in the method name – which is what I finally went with, reasoning that it will serve as a mnemonic.)

We’ll also need to be able to remove Account objects from User objects, and modify Post and UserPost objects and save those changes:

## method to remove an Account object from a user
# $model is an instance of Jacquard::Model::KiokuDB,
# while $account->does('Jacquard::Schema::Role::AccountAPI')
# doesn't need a user object because $account has an 'owner' attribute
$model->remove_account_from_user( $account );

## method to store a modified Post or UserPost object, e.g. after
## changing some of the metadata, or if the underlying post object is
## updated
# $model is an instance of Jacquard::Model::KiokuDB,
# while $post->does('Jacquard::Schema::Role::PostAPI')
$model->update_post( $post );

Those all look reasonable, at least at first glance. It’s worth noting that none of the methods returns anything meaningful – they will just return true on success (and will likely eventually throw some sort of exception object on failure). This approach makes the most sense to me, as it limits the model to just marshaling objects to storage. Any modification or manipulation of the objects will happen elsewhere.
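To illustrate the ‘true on success, exception on failure’ convention, here is an in-memory sketch that assumes Moose is installed. The Sketch::* classes are hypothetical stand-ins: nothing here touches KiokuDB, and the real Jacquard::Model::KiokuDB may well look different.

```perl
use strict;
use warnings;

package Sketch::User {
    use Moose;
    has accounts => ( is => 'ro', isa => 'ArrayRef', default => sub { [] } );
}

package Sketch::Account {
    use Moose;

    # Back-reference to the owning user, as described in the text.
    has owner => (
        is        => 'rw',
        isa       => 'Sketch::User',
        weak_ref  => 1,
        predicate => 'has_owner',
        clearer   => 'clear_owner',
    );
}

package Sketch::Model {
    use Moose;
    use Scalar::Util qw( refaddr );

    sub add_account_to_user {
        my ( $self, $account, $user ) = @_;
        die "account already has an owner\n" if $account->has_owner;
        push @{ $user->accounts }, $account;
        $account->owner($user);
        return 1;
    }

    sub remove_account_from_user {
        my ( $self, $account ) = @_;
        die "account has no owner\n" unless $account->has_owner;
        my $user = $account->owner;
        @{ $user->accounts } =
            grep { refaddr($_) != refaddr($account) } @{ $user->accounts };
        $account->clear_owner;
        return 1;
    }
}

package main;

my $model   = Sketch::Model->new;
my $user    = Sketch::User->new;
my $account = Sketch::Account->new;

$model->add_account_to_user( $account, $user );
print scalar @{ $user->accounts }, " account(s) after add\n";

$model->remove_account_from_user($account);
print scalar @{ $user->accounts }, " account(s) after remove\n";

# A second removal fails loudly instead of returning false.
my $ok = eval { $model->remove_account_from_user($account); 1 };
print $ok ? "removed\n" : "error: $@";
```

Callers never have to check a return value for failure; they either carry on or catch the exception, which keeps the calling code short.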

At this point, we’ve fleshed out the data model more, we have the beginnings of a plan for implementing the code that will let Jacquard interact with other services, and we have the outline for how we’re going to store and update that data in KiokuDB once we have it. That seems like a good place to stop. Next time, we’ll actually implement the first Account and Post classes!

As always, the code for Jacquard is available on GitHub. Patches are welcome; share and enjoy.