Message-based API, Part 2

Kasey Speakman - Sep 28 '17 - Dev Community

In the previous post, I described a message-based API where communication with the API is done through the posting of messages. This is an alternative to REST, although my implementation shared many characteristics with REST. The reason an alternative may be needed is simply to shift thinking. REST describes a lot of interesting benefits, but in practice it frequently leads developers down the path of mapping URLs one-to-one with database entities. This tight coupling between client and API internals makes for a brittle system which is resistant to change. However, thinking in terms of messages allows clients to think in terms of goals and intentions of API interaction without having to know what's behind the wall.

In this post, I'm going to push message-oriented thinking even further. All the way down to how changes inside an API are represented. First, let's go over the common way to "change stuff" inside an API, like saving data to a database.

    // validate, decide, setup data for saving
    var data = ... ;
    ...
    Sql.Write("INSERT ...", data);

Or maybe if you are using an ORM (I wish you wouldn't).

    // validate, decide, setup data for saving
    var data = ... ;
    ...
    ormContext.Add(data);

Code like this mixes concerns. Even using something like a Repository pattern here, your business logic is both making decisions and performing the side effects of those decisions. It then becomes really easy to add even more side effects. For example, users now want to be emailed whenever this thing happens, so a dev adds a line to send an email.

    // validate, decide, setup data for saving
    var data = ... ;
    ...
    Sql.Write("INSERT ...", data);
    emailer.Send(data);

The illusion here is that the email functionality is encapsulated somewhere else, and I'm just calling it here. But the reality is if that email function throws an exception, your business logic throws it too. If that's not acceptable, now you are in a place where your business logic is peppered with try/catch to handle problems performing side effects. We've also created more work for ourselves in tests. We have to mock SQL and emailer objects just to be able to test decision logic.

How could messages help?

What if we could just return the decision that was made (as a message) and let some other component be responsible for side effects that should result? Maybe then the business logic could look like this.

    // validate, decide
    ...
    // return message
    return new OrderPlaced { ... };

How hard is this code to test versus the previous example? How hard is this code to understand? I believe it is very easy on both counts. Nothing to mock. Nothing external to worry about.
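To make the "nothing to mock" point concrete, here is a minimal sketch of a decision function and a check against it. The `OrderRejected` event, the stock-quantity rule, and the `OrderLogic` and `OrderLogicChecks` names are all made up for illustration; only `OrderPlaced` comes from the example above.

```csharp
using System;

// Marker for events (hypothetical here; the post introduces its own later).
public interface IApiEvent { }

public class OrderPlaced : IApiEvent
{
    public int Quantity { get; set; }
}

// Hypothetical event, invented for this sketch.
public class OrderRejected : IApiEvent
{
    public string Reason { get; set; }
}

public static class OrderLogic
{
    // Pure decision: the same inputs always produce the same event.
    // No database, no emailer, nothing to mock.
    public static IApiEvent PlaceOrder(int quantityRequested, int quantityInStock)
    {
        if (quantityRequested <= 0)
            return new OrderRejected { Reason = "Quantity must be positive" };
        if (quantityRequested > quantityInStock)
            return new OrderRejected { Reason = "Not enough stock" };
        return new OrderPlaced { Quantity = quantityRequested };
    }
}

public static class OrderLogicChecks
{
    // A test is just "call the function, look at the returned event".
    public static void Run()
    {
        var placed = OrderLogic.PlaceOrder(2, 5) as OrderPlaced;
        if (placed == null || placed.Quantity != 2)
            throw new Exception("expected OrderPlaced with quantity 2");

        var rejected = OrderLogic.PlaceOrder(9, 5) as OrderRejected;
        if (rejected == null)
            throw new Exception("expected OrderRejected when stock is short");
    }
}
```

Any test framework would do in place of the hand-rolled checks; the point is only that the inputs and the returned event are the whole test surface.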

In order to distinguish these messages from the ones that are used to communicate with the API, they are often referred to by different names. The API communication messages are often called Commands (confusingly not the same as the GoF pattern), and the messages representing decisions and other happenings inside the API are called Events.

Once the code makes a decision and returns an event representing that decision, the API passes the event to any interested components (usually called event handlers). One interested party might convert that event into SQL statements. Another might convert it into an email.
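As a rough sketch of that hand-off, the dispatcher below passes each event to every registered handler. `IEventHandler`, `DelegateHandler`, and `EventDispatcher` are hypothetical names, and collecting handler failures instead of rethrowing is just one possible policy, chosen here so that a broken emailer cannot stop the other handlers.

```csharp
using System;
using System.Collections.Generic;

// Marker for events (hypothetical here; the post introduces its own later).
public interface IApiEvent { }

public class OrderPlaced : IApiEvent { }

// Assumed shape for anything that reacts to events.
public interface IEventHandler
{
    void Handle(IApiEvent apiEvent);
}

// Convenience wrapper so a handler can be written as a lambda.
public class DelegateHandler : IEventHandler
{
    private readonly Action<IApiEvent> _action;
    public DelegateHandler(Action<IApiEvent> action) { _action = action; }
    public void Handle(IApiEvent apiEvent) => _action(apiEvent);
}

public class EventDispatcher
{
    private readonly List<IEventHandler> _handlers = new List<IEventHandler>();

    public void Register(IEventHandler handler) => _handlers.Add(handler);

    // Hands each event to every handler, in registration order.
    // A handler that throws is recorded and the remaining handlers still run.
    public IReadOnlyList<Exception> Dispatch(IEnumerable<IApiEvent> events)
    {
        var failures = new List<Exception>();
        foreach (var apiEvent in events)
        {
            foreach (var handler in _handlers)
            {
                try { handler.Handle(apiEvent); }
                catch (Exception ex) { failures.Add(ex); }
            }
        }
        return failures;
    }
}
```

A real system would probably treat the persistence handler differently from the rest (a failed write should usually fail the request), so take the uniform error handling here as a simplification.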

I hear you saying "Wait, wait, wait. Isn't this the same as Event Sourcing?" No, actually. Using Events to represent changes in your API does not require you to save and load events as your source of truth. You can quite happily get along with a relational database as the source of truth, and simply translate the API events into SQL statements to update that source of truth. I have APIs which do this. Using internal events allows you to separate decision code from effect code regardless of event sourcing.

Event sourcing is not required to do this. But it does take this capability to the next level. Because with event sourcing your events can be seen by services operating on other computers, not just internal to the API. However, I use this pattern without event sourcing in brownfield projects where migrating existing data into events would be too costly.

This also provides a hook-in point to respond to events of interest at the business level instead of the data level. Without events, when data gets written to a relational table it has lost all business semantics -- it's now just data. Sure, I can add an update timestamp to see when it changed. Or I might even record the previous state so I can diff between current and previous version. That tells me what parts of the data changed, but it still probably doesn't tell me what happened from a business perspective. I'm left to guess at that.

With events though, I can think in terms of what this event means to the business. And how my component needs to respond to that event (if at all). It's also pretty easy to add cases in a way that doesn't affect the business logic which generated the event.

    public class EmailHandler {

        private Email GenerateNewOrderEmail(OrderPlaced e) { ... }
        private void SendEmail(Email email) { ... }

        public void Handle(IApiEvent apiEvent) {
            switch (apiEvent) {
                case OrderPlaced e:
                    SendEmail(GenerateNewOrderEmail(e));
                    break; // C# does not allow falling through to the next case

                // maybe later OrderShipped case is added

                default:
                    return; // do nothing
            }
        }
    }

For events to be of any use, they should be modeled in business terms, not data terms. For instance, OrderUpdated is data without business meaning. It's just saying some data in the order was updated. I can't tell whether I care about the event without digging into its data. On the other hand OrderCanceled or OrderPaidInFull are perhaps well-modeled business events because they reflect the semantics of the ordering process.

The real world

So the real world brings up further refinements that my simplistic examples above do not cover. For one thing, we need to be able to return multiple events.

    // validate, decide, return decision
    ...
    // explicit element type: the compiler cannot infer one from two unrelated classes
    return new IApiEvent[] {
        new ItemTransferredToFulfillment { ... },
        new ItemSoldOut { ... }
    };


And often there are batching use cases where we need to run multiple individual commands in an all-or-nothing manner. So instead of business logic directly returning events, it just adds them to a "pending" list. You might recognize this as the Unit of Work pattern.

    // validate, decide, record decision
    ...
    context.AddPending(
        new ItemTransferredToFulfillment { ... },
        new ItemSoldOut { ... }
    );
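The `context` object used above is not shown in this post, so here is one way it might look. This is a sketch under assumptions: `ApiContext.TakePending` and the `Batch.Run` helper are invented names, and the events carry no data.

```csharp
using System;
using System.Collections.Generic;

public interface IApiEvent { }

// Empty stand-ins for the events named in the post.
public class ItemSoldOut : IApiEvent { }
public class ItemTransferredToFulfillment : IApiEvent { }

// Hypothetical unit of work: events pile up here and are only released at the end.
public class ApiContext
{
    private readonly List<IApiEvent> _pending = new List<IApiEvent>();

    public void AddPending(params IApiEvent[] events) => _pending.AddRange(events);

    // Returns everything recorded so far and empties the list.
    public IReadOnlyList<IApiEvent> TakePending()
    {
        var events = _pending.ToArray();
        _pending.Clear();
        return events;
    }
}

public static class Batch
{
    // Runs every command against one shared context.
    // If any command throws, the exception propagates and no events are
    // returned, so nothing reaches the event handlers: all or nothing.
    public static IReadOnlyList<IApiEvent> Run(IEnumerable<Action<ApiContext>> commands)
    {
        var context = new ApiContext();
        foreach (var run in commands)
        {
            run(context);
        }
        return context.TakePending();
    }
}
```

Only after `Batch.Run` returns would the collected events be passed on to the handlers, which keeps a half-finished batch from producing any side effects.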

In order to return multiple message types as though they are the same type, a Marker Interface usually does the job. In functional languages, a union type might be used instead.

    // marker interface, no properties or methods
    public interface IApiEvent { }

    // elsewhere...

    public class ItemSoldOut : IApiEvent
    { ... }

    public class ItemTransferredToFulfillment : IApiEvent
    { ... }

Depending on what pattern matching facilities your language has, it might be annoying to handle events when given the marker interface. It's not so bad in C# 7 (or F#).

    public void Handle(IApiEvent apiEvent) {
        switch (apiEvent) {
            // only care about this case and no others
            case ItemSoldOut e:
                ...
                break; // each case must end explicitly in C#

            default:
                return;
        }
    }

I'll often want to batch multiple updates together in a transaction so either all changes are made to the system or none of them are. So I generate "patches" first, and then run all patches in a transaction.

    // part of a class that handles Order table changes
    public IEnumerable<Patch> GetPatches(IApiEvent apiEvent) {
        switch (apiEvent) {
            case OrderPlaced e:
                yield return new Patch(
                    // query
                    "INSERT ... VALUES (@OrderDate, ...)",
                    // key/value tuple, list as many as you want
                    ("OrderDate", e.OrderDate),
                    ...
                );

                // each event can generate multiple "patches"

                break; // end the case explicitly; C# does not allow fall-through

            ...
        }
    }

Depending on the system, I might batch them all into one large statement (one round-trip for all updates) or start the transaction in code and run them individually.
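For the second option (start the transaction in code and run patches individually), a sketch using the standard `System.Data` interfaces might look like this. The `Patch` class is my guess at the shape implied by the constructor calls above, and `PatchRunner` is a made-up name. The connection is assumed to be open already, and whether parameter names need an `@` prefix depends on the database provider.

```csharp
using System;
using System.Collections.Generic;
using System.Data;

// Assumed shape of Patch: SQL text plus named parameter values.
public class Patch
{
    public string Sql { get; }
    public (string Name, object Value)[] Parameters { get; }

    public Patch(string sql, params (string Name, object Value)[] parameters)
    {
        Sql = sql;
        Parameters = parameters;
    }
}

public static class PatchRunner
{
    // Executes every patch inside one transaction: commit if all succeed,
    // roll back and rethrow if any of them fails.
    public static void RunAll(IDbConnection connection, IEnumerable<Patch> patches)
    {
        using (var transaction = connection.BeginTransaction())
        {
            try
            {
                foreach (var patch in patches)
                {
                    using (var command = connection.CreateCommand())
                    {
                        command.Transaction = transaction;
                        command.CommandText = patch.Sql;
                        foreach (var (name, value) in patch.Parameters)
                        {
                            var parameter = command.CreateParameter();
                            parameter.ParameterName = name;
                            parameter.Value = value ?? DBNull.Value;
                            command.Parameters.Add(parameter);
                        }
                        command.ExecuteNonQuery();
                    }
                }
                transaction.Commit();
            }
            catch
            {
                transaction.Rollback();
                throw;
            }
        }
    }
}
```

The single-statement variant trades this loop for one concatenated command, which saves round-trips but makes it harder to tell which patch failed.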

Identity

If you use auto-incrementing IDs, then you might have a bit of a chicken-and-egg problem: your event handlers probably need the auto-generated ID, but you don't know what it will be when you create the event. Handling this requires a slightly different arrangement, where persistence isn't "just another event handler". You'll likely have to specially run the persistence handler first, get the auto-generated ID back, then update the event to include it. (This is called Event Enrichment in some circles.)

In general, I don't like to use auto IDs as a primary identifier, because auto IDs are a side effect (increase counter) on top of a side effect (insert record). Architecturally, that makes them hard to deal with. Instead, I will use a UUID for primary identity. Then if the business requires a more friendly identifier, I will use an auto ID or user-entered string or whatever as a secondary ID. But this secondary identifier will be purely for human-friendly searching.
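A small sketch of what the UUID approach buys you: the identity exists before anything is written, so the event is complete when it is created and persistence can go back to being an ordinary handler. The property names and the `OrderLogic` and `PlaceOrderHandler` classes are illustrative assumptions.

```csharp
using System;

public interface IApiEvent { }

// Command with hypothetical fields.
public class PlaceOrder
{
    public Guid OrderId { get; set; }      // chosen up front, never by the database
    public DateTime OrderDate { get; set; }
}

public class OrderPlaced : IApiEvent
{
    public Guid OrderId { get; set; }
    public DateTime OrderDate { get; set; }
}

public static class OrderLogic
{
    // Pure: the identity arrives as input, so the returned event needs no
    // enrichment after the insert.
    public static OrderPlaced Decide(PlaceOrder command) =>
        new OrderPlaced { OrderId = command.OrderId, OrderDate = command.OrderDate };
}

public static class PlaceOrderHandler
{
    // The non-deterministic part (generating the ID) stays outside the business logic.
    public static OrderPlaced Handle(DateTime orderDate)
    {
        var command = new PlaceOrder { OrderId = Guid.NewGuid(), OrderDate = orderDate };
        return OrderLogic.Decide(command);
    }
}
```

The client could just as well generate the ID and send it with the command; either way the business logic only ever sees it as data.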

Command handlers

Also I haven't really discussed command handlers. This is the place where I tend to prepare all the data needed to run business logic. That way, the business logic can be purely deterministic and ridiculously easy to test. And command handler integrations (loading from DB, calling external API, etc) are exercised with integration testing.

Here is an example handler for the PlaceOrder command, where IO is interleaved with business logic (OrderFactory and order calls), but the business logic is still deterministic.

    public void Handle(ApiContext context, PlaceOrder command) {
        // is this order even valid?
        var order = OrderFactory.Create(command);
        if (order.IsValid) {
            // load inventory status from DB for requested items
            var inventoryStatusList = ... ;
            // maybe it throws for errors, like OutOfStockException?
            order.CheckInventory(inventoryStatusList);
            // get decisions made about this order
            context.AddPending(order.GetEvents());
        }
    }

Implementations vary

All of the above is just a sketch of what message-based (on the inside) APIs can look like. In practice, some of the implemented pieces will depend on the problems you are solving. The thing I have found most delightful about this kind of infrastructure is that, like all good code, it is easy to change as you learn new information.

The C# code above is off-the-cuff, not guaranteed to compile. I actually write this kind of API in F# with slightly different idioms, but you can see above that it is equally expressible in OO languages. I feel like the above is far from a complete explanation, but hopefully it is a good start. I may amend this post as I think of things.
