Jacquard 04: Extending The Data Model
Since our previous installment got us
users and authentication, we’re ready to start
adding support for various services. One of the features of Jacquard
– one of the main points of Jacquard, actually – is interacting
with multiple services. To make that possible, we’ve included the
accounts attribute in our User class:
# this attribute is going to contain info about all the services
# this user has configured -- i.e., there will be one for Twitter,
# one for Facebook, etc.
has accounts => (
    isa     => 'KiokuDB::Set',
    is      => 'ro',
    lazy    => 1,
    default => sub { set() },
);
Before we start actually writing the code for the things that are
going to be inside those accounts attributes, it’s worth stepping
back and thinking about what the data we want to store looks like,
and how we can best structure our classes to help us manipulate that
data.
Ideally, what we want – and again, this is a reflection of the underlying raison d’être of Jacquard – is to be able to interact with all the different services we support in the same way, without having to worry about how each individual service handles fetching new posts or writing a new post out. In other words, to be able to do something like this:
# warning: pseudocode
foreach my $account ( $self->accounts->members ) {
    $account->get_new_posts;
}

# or even
my $new_post = get_new_post_from_user();
foreach my $account ( $self->accounts->members ) {
    $account->post( $new_post );
}
Since we’re using Moose, we’re going to have a number of
Jacquard::Schema::Account::FOO classes, one for each type of service
we’re going to interact with. Each of those classes will consume an
API-defining role (called something like
Jacquard::Schema::Role::AccountAPI) that will define the required
methods for the common interface we want all services to have – that
is, making sure they support a get_new_posts method, and a post
method, and so forth.
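To make the "common interface" idea concrete, here is a minimal sketch of the contract. In Moose the role would list its required methods with `requires`; this version does the same check by hand in core Perl so it runs without any CPAN modules. Every `Sketch::*` name is made up for illustration and is not part of Jacquard.

```perl
use strict;
use warnings;

# Hypothetical stand-in for the AccountAPI role: the methods every
# account class must provide.
package Sketch::AccountAPI {
    my @required = qw( get_new_posts post );

    # Dies unless $class implements every required method.
    sub assert_implemented_by {
        my ( $api, $class ) = @_;
        my @missing = grep { !$class->can($_) } @required;
        die "$class is missing: @missing\n" if @missing;
        return 1;
    }
}

# Two fake services that satisfy the contract.
package Sketch::Account::Twitter {
    sub new           { return bless { outbox => [] }, shift }
    sub get_new_posts { return ('a tweet') }
    sub post { my ( $self, $text ) = @_; push @{ $self->{outbox} }, $text; return 1 }
}

package Sketch::Account::Atom {
    sub new           { return bless { outbox => [] }, shift }
    sub get_new_posts { return ('a blog entry') }
    sub post { my ( $self, $text ) = @_; push @{ $self->{outbox} }, $text; return 1 }
}

# A class that forgot to implement post().
package Sketch::Account::Broken {
    sub new           { return bless {}, shift }
    sub get_new_posts { return }
}

Sketch::AccountAPI->assert_implemented_by($_)
    for qw( Sketch::Account::Twitter Sketch::Account::Atom );

# The calling code never needs to know which service it is talking to.
my @accounts = ( Sketch::Account::Twitter->new, Sketch::Account::Atom->new );
my @fetched  = map { $_->get_new_posts } @accounts;
$_->post('hello from Jacquard') for @accounts;

my $broken_ok = eval {
    Sketch::AccountAPI->assert_implemented_by('Sketch::Account::Broken');
    1;
};
my $error = $@;

print "fetched: @fetched\n";
print "Broken rejected: $error" unless $broken_ok;
```

The useful property is that a class missing part of the API fails loudly and early, rather than at the moment some loop first calls the missing method.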
We also need to think about how to structure the other data associated
with services. That data will fall into two broad categories: account
information (and similar data, like authentication tokens) and
posts on that service. Depending on the exact service, the details of
what a ‘post’ is will differ – a tweet is not quite a Facebook post,
and both are very distinct from a blog entry in an Atom feed –
but they share enough commonalities that we can again take a similar
approach: a number of Jacquard::Schema::Post::FOO classes, each
consuming a Jacquard::Schema::Role::PostAPI role that ensures that
they’re implementing a common API.
The benefit of this approach is that it becomes relatively trivial to add support for new services – particularly if there’s already a module on CPAN to handle interacting with that service. As we’ll see in a number of upcoming posts, taking an existing library and wrapping it in a thin Moose class to implement a particular API is very little work.
I often find it helpful, when I’ve gotten to this point in the process of designing something and I think I have a workable solution, to work out how a small set of data would map across instances of these classes. Let’s consider how a single Jacquard user, with a Twitter account configured within Jacquard, would look once a few tweets were stored:
# first, the user
my $user = Jacquard::Schema::User->new( name => 'Bob' );
# then we add the account
$user->add_account( 'Twitter' , %account_details );
# and then we get the most recent posts
$user->get_account( 'Twitter' )->get_new_posts();
And for this design, this is the point where I realize I haven’t
really thought at all about how the individual Post objects will be
associated with the Account object. The simplest way to do this
would be for the Account classes to have a posts attribute, much
like the accounts attribute in the User class:
has posts => (
    isa     => 'KiokuDB::Set',
    is      => 'ro',
    lazy    => 1,
    default => sub { set() },
);
Assuming that is how we go, after that get_new_posts() call above,
we’d end up with the accounts attribute of $user containing a
KiokuDB::Set object with a single member – an instance of
Jacquard::Schema::Account::Twitter. Inside that object, there would
be another KiokuDB::Set object with a bunch of members, each one an
instance of Jacquard::Schema::Post::Twitter corresponding to an
individual tweet from this user’s timeline (and having attributes like
‘author’, ‘content’, ‘datetime’, ‘id’, and so on).
That all seems to make a reasonable amount of sense – but there’s a
looming problem. (If you’ve already spotted it, give yourself a pat on
the back.) What happens when we add a second User, also with a
Twitter account configured, and that user ends up following some of
the same people on Twitter as our first user? We’ll end up with Post
objects that are essentially duplicates of each other – but one copy
will be inside the Twitter Account object of the first user, and the
second will be inside the Twitter Account object of the second. In
the long run, that’s going to end up being a big waste of storage
space and/or RAM.
How do we solve this problem? We could just maintain a single shared
set of Post objects, and include the same object in different Account
objects as needed. That would solve the issue with the
duplication of information – but it introduces another issue, in that
we no longer have a place to store per-user metadata about individual
Post objects (e.g., read/unread status).
Instead, we’ll solve this problem the old-fashioned way: we’ll
introduce another layer of abstraction! Instead of the posts
attribute of Account objects containing Jacquard::Schema::Post
objects directly, we’ll have a generic Jacquard::Schema::UserPost
object that maps between a User account and a particular Post.
In return for making our data model slightly more complicated,
this approach gives us the best of both worlds, in that we have a place
for per-User metadata to live, but the original Post data is only
present in our system once, regardless of how many of our users have
it present in their Account objects.
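Here is a small sketch of that mapping, in core Perl rather than Moose so it runs on its own. The class names (`Sketch::Post`, `Sketch::UserPost`) and the `is_read` / `mark_read` methods are assumptions made for illustration; the real Jacquard classes may end up looking quite different.

```perl
use strict;
use warnings;
use Scalar::Util qw(refaddr);

# The shared data: one object per tweet, no matter how many users see it.
package Sketch::Post {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub id      { return $_[0]{id} }
    sub author  { return $_[0]{author} }
    sub content { return $_[0]{content} }
}

# The per-user wrapper: holds a reference to the shared Post plus
# metadata that belongs to one user only (here, a read flag).
package Sketch::UserPost {
    sub new {
        my ( $class, %args ) = @_;
        return bless { post => $args{post}, read => 0 }, $class;
    }
    sub post      { return $_[0]{post} }
    sub is_read   { return $_[0]{read} }
    sub mark_read { $_[0]{read} = 1; return 1 }
}

my $tweet = Sketch::Post->new( id => 1, author => 'alice', content => 'hello' );

# Bob and Carol both follow alice, so each gets a wrapper around the same tweet.
my $bobs   = Sketch::UserPost->new( post => $tweet );
my $carols = Sketch::UserPost->new( post => $tweet );
$bobs->mark_read;

printf "same post object: %s\n",
    refaddr( $bobs->post ) == refaddr( $carols->post ) ? 'yes' : 'no';
printf "bob read: %d, carol read: %d\n", $bobs->is_read, $carols->is_read;
```

Marking the tweet as read for Bob leaves Carol's copy unread, yet both wrappers point at one and the same underlying Post.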
This solution also doesn’t impact any of our previous design
decisions: the Jacquard::Schema::Role::PostAPI role can be consumed by
the Jacquard::Schema::UserPost class too, via method delegation to
the Jacquard::Schema::Post::FOO object it refers to. (More about
that later.)
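As a rough preview of what delegation means here: in Moose this would most likely be the `handles` option on the attribute that holds the Post, but the sketch below generates the forwarding methods by hand in core Perl so it runs standalone. The class names and the `author` / `content` method list are assumptions for illustration only.

```perl
use strict;
use warnings;

package Sketch::Post::Twitter {
    sub new { my ( $class, %args ) = @_; return bless {%args}, $class }
    sub author  { return $_[0]{author} }
    sub content { return $_[0]{content} }
}

package Sketch::UserPost {
    sub new {
        my ( $class, %args ) = @_;
        return bless { post => $args{post}, read => 0 }, $class;
    }

    # Forward the (assumed) PostAPI methods to the wrapped Post object
    # by installing one small forwarding sub per method name.
    for my $method (qw( author content )) {
        no strict 'refs';
        *{"Sketch::UserPost::$method"} = sub {
            my $self = shift;
            return $self->{post}->$method(@_);
        };
    }
}

my $tweet    = Sketch::Post::Twitter->new( author => 'alice', content => 'hello' );
my $userpost = Sketch::UserPost->new( post => $tweet );

# Callers can treat the wrapper exactly like the post it wraps.
printf "%s: %s\n", $userpost->author, $userpost->content;
```

Code written against the post API does not need to care whether it was handed a Post or a UserPost, which is what keeps the extra layer from leaking into the rest of the design.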
So, now that we’ve been through a couple of rounds of thinking about the data structures, we can write some code, yes? Well, no, not exactly, not quite yet. Instead we’re going to explore what the API for using this code might look like, and we’re going to do that by writing some test code, or at least outlining some test cases.
First, we’re going to need to be able to add Account objects to
User objects, and Post objects to Account objects. The code for
that will probably look something like:
## method to associate an Account object with a User
# $model is an instance of Jacquard::Model::KiokuDB,
# while $user is a Jacquard::Schema::User,
# and $account->does('Jacquard::Schema::Role::AccountAPI')
$model->add_account_to_user( $account, $user );

## method to add a UserPost object to an Account -- will also save the
## associated Post object if needed
# $model is an instance of Jacquard::Model::KiokuDB,
# while $account->does('Jacquard::Schema::Role::AccountAPI')
# and $userpost->does('Jacquard::Schema::Role::PostAPI')
$model->add_post_to_account( $userpost, $account );
(Aside: I spent more time than I am willing to disclose trying to decide whether the order of arguments in those methods should be ‘more generic object, more specific object’, or if they should instead be the same as in the method name – which is what I finally went with, reasoning that it will serve as a mnemonic.)
We’ll also need to be able to remove Account objects from User
objects, and modify Post and UserPost objects and save those
changes:
## method to remove an Account object from a user
# $model is an instance of Jacquard::Model::KiokuDB,
# while $account->does('Jacquard::Schema::Role::AccountAPI')
# doesn't need a user object because $account has an 'owner' attribute
$model->remove_account_from_user( $account );

## method to store a modified Post or UserPost object, e.g. after
## changing some of the metadata, or if the underlying post object is
## updated
# $model is an instance of Jacquard::Model::KiokuDB,
# while $post->does('Jacquard::Schema::Role::PostAPI')
$model->update_post( $post );
Those all look reasonable, at least at first glance. It’s worth noting that none of the methods return anything meaningful – they will just return true on success (and will likely eventually throw some sort of exception object on failure). This approach makes the most sense to me, as it restricts the model to just marshaling objects to storage. Any modification or manipulation of the objects will happen elsewhere.
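To get a feel for that calling convention before any KiokuDB code exists, here is an in-memory stand-in for the model. It is only a sketch: `Sketch::Model` is a made-up name, users, accounts, and posts are plain hashrefs instead of real objects, and failures die with a string where the real thing would presumably throw an exception object.

```perl
use strict;
use warnings;

# Hypothetical in-memory stand-in for Jacquard::Model::KiokuDB.
package Sketch::Model {
    use Scalar::Util qw(refaddr);

    sub new { return bless { posts => {} }, shift }

    sub add_account_to_user {
        my ( $self, $account, $user ) = @_;
        die "account already has an owner\n" if $account->{owner};
        $account->{owner} = $user;
        push @{ $user->{accounts} }, $account;
        return 1;
    }

    sub add_post_to_account {
        my ( $self, $userpost, $account ) = @_;
        my $post = $userpost->{post} or die "userpost has no post\n";
        # Store the underlying post once, keyed by id, and share it.
        $self->{posts}{ $post->{id} } //= $post;
        $userpost->{post} = $self->{posts}{ $post->{id} };
        push @{ $account->{posts} }, $userpost;
        return 1;
    }

    sub remove_account_from_user {
        my ( $self, $account ) = @_;
        my $user = delete $account->{owner} or die "account has no owner\n";
        @{ $user->{accounts} } =
            grep { refaddr($_) != refaddr($account) } @{ $user->{accounts} };
        return 1;
    }

    sub update_post {
        my ( $self, $post ) = @_;
        # A UserPost wraps a Post; unwrap it to find the stored record.
        my $inner = exists $post->{post} ? $post->{post} : $post;
        die "post $inner->{id} was never stored\n"
            unless exists $self->{posts}{ $inner->{id} };
        $self->{posts}{ $inner->{id} } = $inner;
        return 1;
    }
}

my $model    = Sketch::Model->new;
my $user     = { name => 'Bob', accounts => [] };
my $account  = { service => 'Twitter', posts => [] };
my $userpost = { post => { id => 42, content => 'hello' }, read => 0 };

$model->add_account_to_user( $account, $user );
$model->add_post_to_account( $userpost, $account );

# Changing metadata happens outside the model; the model only stores it.
$userpost->{read} = 1;
$model->update_post($userpost);

$model->remove_account_from_user($account);
```

Each method either returns true or dies, so a caller can wrap a sequence of them in a single `eval` and handle failure in one place.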
At this point, we’ve fleshed out the data model more, we have the
beginnings of a plan for implementing the code that will let Jacquard
interact with other services, and we have the outline for how we’re
going to store and update that data in KiokuDB once we have it. That
seems like a good place to stop. Next time, we’ll actually implement
the first Account and Post classes!
As always, the code for Jacquard is available on GitHub. Patches are welcome; share and enjoy.