Datasports on Software Development

Articles and updates from Datasports about the craft of software

StreamBase and Productivity

1. Introduction

The StreamBase development platform is marketed on a number of selling points, the two main ones being high performance and rapid application development (two benefits that are generally presumed to be in opposition to one another). In this article I will share my experiences, both positive and negative, regarding the productivity of working with the StreamBase platform.

The discussion touches on EventFlow coding vs. the Java and .Net platforms, the value of the library of adapters, and build/test/deploy realities.

2. EventFlow Code vs. Other Platforms

EventFlow diagrams as found in StreamBase marketing materials are very compelling. The code looks so elegant and accessible that one has to wonder if it’s too good to be true… What’s the catch? The short story in my experience is that while these examples show StreamBase in an idealized form, they do convey a fair representation of what it’s like to design and implement systems in EventFlow code.

Of course, we never get something for nothing, and while I don’t think that there’s a “catch”, I do believe that there is a very important caveat:

NOTE: The StreamBase platform and EventFlow coding model provide tremendous gains in the ease and speed of developing high-performance systems provided the problem maps well to StreamBase’s event processing model.

Systems that map well to StreamBase’s event processing model are real-time systems which will consume traffic in the form of well-defined messages or events from disparate systems, and which respond to requests or certain conditions by sending out events or messages to other systems.
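To make that shape concrete, here is a minimal plain-Java sketch of the pattern: a well-defined event arrives, a condition is checked, and a new event is sent onward. This is not EventFlow and it uses no StreamBase API; the Quote and Alert types, the price threshold, and the class name are all invented purely for illustration.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.Consumer;

// Hypothetical illustration only: none of these names come from StreamBase.
final class PriceAlertFlow {

    // A well-defined input event with a fixed schema.
    static final class Quote {
        final String symbol;
        final double price;

        Quote(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    // The output event sent to downstream systems.
    static final class Alert {
        final String symbol;
        final double price;

        Alert(String symbol, double price) {
            this.symbol = symbol;
            this.price = price;
        }
    }

    private final double threshold;
    private final Consumer<Alert> output;

    PriceAlertFlow(double threshold, Consumer<Alert> output) {
        this.threshold = threshold;
        this.output = output;
    }

    // Called once per incoming event; emits an Alert only when the condition holds.
    void onQuote(Quote quote) {
        if (quote.price > threshold) {
            output.accept(new Alert(quote.symbol, quote.price));
        }
    }

    // Convenience helper: feeds quotes through the flow and returns the alerted symbols in order.
    static List<String> alertedSymbols(double threshold, List<Quote> quotes) {
        List<String> symbols = new ArrayList<>();
        PriceAlertFlow flow = new PriceAlertFlow(threshold, alert -> symbols.add(alert.symbol));
        for (Quote quote : quotes) {
            flow.onQuote(quote);
        }
        return symbols;
    }
}
```

Even in this toy form, most of the lines are type declarations and wiring rather than the one-line condition that is the actual business logic.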

Just about all real-time financial services systems are an excellent fit for StreamBase (especially anything related to electronic trading), and it is in this domain where StreamBase has found its most receptive clients to date. However, there are many other problem domains that fit this general high-level description, such as sensing and data acquisition, healthcare informatics, biometric monitoring, electronic messaging, and innumerable others. It remains to be seen if customers from these other problem domains will adopt StreamBase and push the platform to grow in new directions.

Systems that are not a good fit for StreamBase include anything that does complex recursive batch processing on large static datasets, intensive graphics rendering, processing of data whose format is not known at design time and must be inferred from the data itself, and systems where performance is unimportant.

In this section, we dig a bit deeper and compare EventFlow against the two most prominent platforms for server-side application development: Java (using Eclipse as the IDE) and the .Net framework. In every case, the analysis assumes that we’re considering building a system that is a good fit for StreamBase.

2.1 General

The biggest part of a developer’s day-to-day duties involves implementing business logic as code. Here, there is no comparison between EventFlow and text-based languages. When writing EventFlow code, the usual boilerplate and plumbing required to create a real application are all handled elegantly by the platform, so that developers’ efforts are concentrated 100% on business logic. And the provided set of operators is thoughtfully composed and presented, such that it should be possible for a skilled StreamBase developer to implement that logic in a clean and clear way in very short order.

This has been touched on in previous articles, but in addition to the significant productivity gains when developing new logic, the ability to easily refactor existing logic into new reusable modules and to change the application topology and threading models is a huge win compared to traditional programming. Developers who have been faced with taking selected parts of functionality from multiple classes and re-packaging them into something that can run in multiple concurrent threads know what a nightmare this can be, and the first time they do the analogous task in StreamBase they will feel waves of relief crashing over them.

For these and other reasons, I am confident that a developer who is comfortable with StreamBase will be about 3-5 times more productive at creating and maintaining EventFlow code than with traditional programming platforms. The development effort is all focused on logic and data structures instead of on APIs and language details, and where problems are found they are generally more easily rectified than with text-based programming.

2.1. Java/Eclipse

Let me say off the bat that I am not a fan of Java by any means. I feel that it is to some degree a victim of its early success and position as a trailblazer. Other modern languages/frameworks that came after benefited from lessons learned watching what worked well in Java and what was problematic, and as a result I think that Java and its associated development tools are less refined and harder to use and tune than .Net. With all that said, Java is the de facto standard for enterprise business logic servers, which tend to run on various flavors of Unix (including Linux), and it has proven its worth many times over.

StreamBase Studio is implemented as a set of plugins for the Eclipse IDE, so experienced Eclipse/Java programmers should feel very much at home from the start. People new to Eclipse will have a non-trivial learning curve to deal with in terms of basically everything outside the EventFlow canvas, but of course that’s normal with any new IDE. Compared with straight Java development, StreamBase’s use of Eclipse and plugins is smooth and seamless: everyday tasks like adding, renaming, moving files and folders; locating and opening resources; managing workspaces and projects are effectively the same as for Java projects.

I would like to highlight two specific productivity wins for StreamBase when compared to straight Java: one major, and one minor but well-known to many experienced Java developers.

2.1.1. Library and Infrastructure Stabilization

The strength and the weakness of Java is the wide array of libraries, extensions, commercial and open-source 3rd-party tools. Whatever you need to do in Java, there are at least 5 ways of doing it: 1 that’s ideal, 3 that can be made to work despite some frustrations and setbacks, and 1 that will make you hate life. A significant part of Java expertise is knowing which tool to grab for which job, and this extends to just about every element of system design, implementation, and deployment: XML parsing, data abstraction layers, network communication, threading models, application hosting, messaging middleware, etc.

With StreamBase, the majority of these decisions have been made for you, and I have never felt like my ability to deliver a quality product in a timely fashion was hurt by the choices made by StreamBase.

2.1.2. Date & Time Calculations

Date & Time calculations are an area where even the most ardent of Java champions tend to waver in their commitment. If you haven’t done it yourself, it’s hard to imagine just how difficult simple Date & Time operations can be in Java. Seemingly simple operations like taking the date portion of one timestamp and combining it with the time portion of another can end up burning through hours of frustrating fights with some or all of the java.util.Calendar, java.util.Date, java.sql.Date, java.sql.Time, java.sql.Timestamp, SimpleDateFormat, and other classes. When dealing with external systems using different Date & Time formats, the complexity rises significantly.
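To give a flavor of the problem, here is a rough sketch of the "date from one timestamp, time from another" operation using only the java.util classes. The class and method names (DateTimeCombiner, combine, at) are my own inventions for illustration, and this is just one of several ways to do it.

```java
import java.util.Calendar;
import java.util.Date;
import java.util.GregorianCalendar;
import java.util.TimeZone;

// Illustrative helper; the class and method names are not from any library.
final class DateTimeCombiner {

    // Returns a Date carrying the calendar date of dateSource and the time of day
    // of timeSource, with both interpreted in the given time zone.
    static Date combine(Date dateSource, Date timeSource, TimeZone zone) {
        Calendar datePart = new GregorianCalendar(zone);
        datePart.setTime(dateSource);

        Calendar timePart = new GregorianCalendar(zone);
        timePart.setTime(timeSource);

        Calendar result = new GregorianCalendar(zone);
        result.clear(); // reset every field so nothing leaks in from "now"
        result.set(datePart.get(Calendar.YEAR),
                   datePart.get(Calendar.MONTH),        // zero-based: January is 0
                   datePart.get(Calendar.DAY_OF_MONTH),
                   timePart.get(Calendar.HOUR_OF_DAY),
                   timePart.get(Calendar.MINUTE),
                   timePart.get(Calendar.SECOND));
        result.set(Calendar.MILLISECOND, timePart.get(Calendar.MILLISECOND));
        return result.getTime();
    }

    // Builds a Date from explicit fields in the given zone (month is 1-based here).
    static Date at(TimeZone zone, int year, int month, int day,
                   int hour, int minute, int second) {
        Calendar calendar = new GregorianCalendar(zone);
        calendar.clear();
        calendar.set(year, month - 1, day, hour, minute, second);
        return calendar.getTime();
    }
}
```

Note the traps even in this small example: months are zero-based, a fresh Calendar starts out holding the current time, and forgetting to pass the time zone silently switches to the machine default.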

Have a look at the StreamBase Expression Language docs for Timestamp functions. It’s simple, clear, well thought-out, and easy to use for western calendars across all timezones. My only beef with it is that there is not a clear differentiation between Interval Timestamps and Absolute Timestamps. I don’t understand why these couldn’t be 2 distinct types.

NOTE: I chose to include this despite the fact that it’s a relatively minor issue because to me it shows the care that has gone into the platform. In my experience, that care extends into almost every element of the coding experience.

2.2. .Net

Despite a lot of overlap, .Net has a different set of strengths and weaknesses than Java. Java’s strengths include cross-platform support and tremendous flexibility, while .Net’s strengths include greater clarity of platform choices and a more streamlined, linear progression in language features.

When compared to StreamBase, .Net’s weaknesses are all of the weaknesses inherent in any traditional programming language compared with the StreamBase EventFlow programming experience. .Net does a better job of mitigating those weaknesses than many other platforms, but they are still there.

2.3. Good old C++

OK, let’s be clear about something: in terms of rapid application development and developer productivity, StreamBase slaughters C++. It doesn’t belong in this discussion except for me to say that I think that C++ gets overlooked a bit too much for systems where performance is paramount. It really belongs in a discussion of performance, not productivity.

For very narrow use-cases where microseconds of execution time are more important than months of development time you could argue that you can get an end result with careful use of C++ that you just can’t get with the other platforms considered here, which complicates the productivity discussion.

3. Adapter Library

EventFlow is kind of sexy. It presents developers with a new way of thinking about and interacting with their code, which is appealing to some and threatening to others. I don’t care how much code you’ve written, when you’re trying to think through a design or explain it to someone else you draw a picture, and those pictures tend to look something like EventFlow code. It’s a natural fit for the ways that our brains process information visually. But I’m going to let you in on a little secret:

Implementing something like a FIX gateway, or an RMDS, ITCH, or STAMP feed handler, in EventFlow would be so difficult as to be effectively impossible. If StreamBase developers had to do that in order to integrate with external systems, nobody would like StreamBase.

The extensive library of Adapters is the Luke Duke to EventFlow’s Bo. Less flashy, but an important ally on which you know you can depend.

The list of external systems already covered by StreamBase’s Adapter library is long and it continues to grow. The complexity of the Adapters depends largely on the complexity of the interaction with the external system. The RMDS and QuickFIX adapters have much more complicated configuration and interactions with your EventFlow code than something like the CSV File Reader or Log Output Adapter.

If you are evaluating StreamBase as a development platform, do a careful analysis (and perhaps some shrewd bargaining with the sales folks) to ensure that you have good Adapter coverage for your short-term and medium-term integration needs. Adapters are a critical part of the solution that lets you concentrate on business logic instead of plumbing.

4. Build, Test, Deploy

The foregoing sections of this document have included no small amount of praise for the StreamBase platform and the productivity gains when developing business logic in EventFlow. However, there is more to life as a developer or technology manager than seeing elegant and performant code delivered in short times by small teams. In particular, there is the matter of continuous integration practices: frequent and automated build, test, and deployment processes. It is in this domain where StreamBase has the most room to grow.

The care invested in the coding experience, and in productivity there, is evident as you create applications in StreamBase. The same care and effort are conspicuously absent when it comes to these other parts of the development lifecycle.

This section will elucidate the pros and cons of StreamBase in these other areas.

4.1 Project Configuration and Dependencies

StreamBase Studio provides mechanisms inherited from Eclipse to create references between projects in a workspace, and it provides a nice UI for specifying Module Search Path and Resource Search Path which are used to resolve module references and locate resources (such as data files) at design time and when launching within the Studio environment. It works fine for what it is, and experienced Java developers will probably be comfortable with it from the start.

Something else that will be familiar to Java developers is the fact that these Eclipse-based project settings are essentially private to Eclipse and external tools such as ant don’t know anything about them. In the Java world there are ant and maven plugins for Eclipse, and Eclipse plugins for ant, and ant plugins for maven, and who knows what other permutations. In my view, this is a nightmare which is compounded rather than mitigated by StreamBase, which introduces new dependency information (the Module and Resource search paths). Information about project references, dependent libraries, module and resource directories ends up being replicated in:

  1. Eclipse / StreamBase Studio project settings
  2. .sbconf file(s)
  3. build scripts
  4. sbunit invocation scripts
  5. sbbundle invocation scripts

That StreamBase is built on top of Java is mostly abstracted away when developing and interactively executing EventFlow code within StreamBase Studio. For almost any other task involved in taking that code and getting it running in an environment other than Studio, the developer is going to need significant knowledge of established practices and tools for Java development.

Not every developer on your StreamBase team needs to be well-versed in Java. Figure on about 5-10% of the total effort benefiting from that comfort with Java tools in order to meet these artifact creation and management needs.
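One way to chip away at the duplication described in this section is to have build tooling read the Eclipse project file rather than restating its contents. The sketch below pulls the referenced project names out of the XML of an Eclipse .project file, which lists them as project elements inside a projects element. It covers only those Eclipse project references; I am not assuming anything about where Studio stores the Module and Resource search paths. The class and method names are hypothetical.

```java
import java.io.ByteArrayInputStream;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;
import javax.xml.parsers.DocumentBuilder;
import javax.xml.parsers.DocumentBuilderFactory;
import javax.xml.parsers.ParserConfigurationException;
import org.w3c.dom.Document;
import org.w3c.dom.Element;
import org.w3c.dom.Node;
import org.w3c.dom.NodeList;
import org.xml.sax.SAXException;

// Hypothetical helper: reads project references out of Eclipse .project XML.
final class EclipseProjectRefs {

    // Returns the names found under <projects><project>...</project></projects>,
    // in document order.
    static List<String> referencedProjects(String projectFileXml) {
        Document doc;
        try {
            DocumentBuilder builder = DocumentBuilderFactory.newInstance().newDocumentBuilder();
            doc = builder.parse(
                    new ByteArrayInputStream(projectFileXml.getBytes(StandardCharsets.UTF_8)));
        } catch (ParserConfigurationException | SAXException | IOException e) {
            throw new IllegalArgumentException("Could not parse .project content", e);
        }

        List<String> names = new ArrayList<>();
        NodeList containers = doc.getDocumentElement().getElementsByTagName("projects");
        for (int i = 0; i < containers.getLength(); i++) {
            NodeList children = containers.item(i).getChildNodes();
            for (int j = 0; j < children.getLength(); j++) {
                Node child = children.item(j);
                // Skip whitespace text nodes; keep only <project> elements.
                if (child instanceof Element && "project".equals(child.getNodeName())) {
                    names.add(child.getTextContent().trim());
                }
            }
        }
        return names;
    }
}
```

A build script could call something like this once and derive its dependency list from the result, so that the Studio project settings remain the single place where project references are maintained.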

4.2 Automation

In general, operations performed manually in StreamBase Studio are easily configured and executed. This includes running, building bundles and archives, executing SBUnit tests, etc. As with items like the Eclipse/Studio project configuration, these easy steps do not extend outside of the Studio environment. This is exactly analogous to the Java side of Eclipse, where the UI provides an easy mechanism to create a jar file manually (File | Export…), but no way to create a build configuration that will generate the jar in a repeatable manner, or create an install package that includes dependencies.

This divide between manual and automated steps that was laid down by Eclipse extends into StreamBase Studio. Automating any of these operations which are very easy to do manually involves writing non-trivial scripts in tools such as ant or shell scripts.
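As a rough idea of what such a script has to do, here is a minimal Java sketch of a build/test/bundle driver that runs a list of command-line steps in order and stops at the first failure. Everything here is an assumption made for illustration: the class and method names are invented, and the actual command lines for your build, sbunit, and sbbundle steps depend on your installation, so none are shown.

```java
import java.io.IOException;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.function.ToIntFunction;

// Hypothetical pipeline driver; not part of StreamBase or Eclipse.
final class BuildPipeline {

    // One named step and the command line that implements it.
    static final class Step {
        final String name;
        final List<String> command;

        Step(String name, String... command) {
            this.name = name;
            this.command = Arrays.asList(command);
        }
    }

    // Runs a command as a child process and returns its exit code (-1 if it could not run).
    static int runProcess(List<String> command) {
        try {
            return new ProcessBuilder(command).inheritIO().start().waitFor();
        } catch (IOException e) {
            return -1;
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt(); // preserve the interrupt for callers
            return -1;
        }
    }

    // Executes the steps in order, stopping at the first non-zero exit code.
    // Returns the names of the steps that completed successfully.
    static List<String> run(List<Step> steps, ToIntFunction<List<String>> executor) {
        List<String> completed = new ArrayList<>();
        for (Step step : steps) {
            if (executor.applyAsInt(step.command) != 0) {
                break;
            }
            completed.add(step.name);
        }
        return completed;
    }
}
```

In real use you would call BuildPipeline.run(steps, BuildPipeline::runProcess). Passing the executor in as a parameter also means the ordering and stop-on-failure logic can be tested without launching any processes.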

In my experience, these sorts of tasks are marginally more difficult for StreamBase projects than for straight Java projects, because in the pure Java world the tools and knowledge for implementing the most common tasks are readily available, while for StreamBase it’s not an exact fit.

This is an area where the StreamBase IDE and platform fall very far behind the analogous .Net tools. In .Net, operations such as building multiple target configurations in one operation, managing cross-project dependencies, and automating build, test, and deploy tasks are all readily done from the command line, and there is no duplication of project config in different places for different purposes.

5. Productivity Wishlist

In the years that I have been using StreamBase, I have seen steady and welcome improvements in usability and productivity. But, being a greedy jerk when it comes to my time, I do have a wishlist of improvements I’d like to see. I don’t doubt that at least some of these are on the way; time will tell:

  1. Syntax-aware StreamBase search of fields, expressions, and properties (like the powerful Java Search feature in Eclipse).
  2. Eclipse/Studio project builders for all StreamBase operations (e.g. making .sbbundle and .sbar files, executing SBUnit tests) which can be added to the project’s build config and executed as part of the build.
  3. Command-line automation for all operations, aware of the Studio project settings so that this information does not need to be duplicated.
  4. Side-by-side display of applications, Operator properties, and Schemas: I’ve not been able to find a way to just look at two applications side-by-side, or to look at the Property and Schema tabs of multiple Operators at the same time. As often as not, I end up doing a screen capture of one and pasting it into MS Paint for reference as I work on the other, which seems silly.
  5. Display names on ports: There’s no limit to the number of input and output ports on Adapters or Operators, which can become confusing when connecting up arcs in your EventFlow logic. Module References present the name of the associated stream when you hover over a port, which is very nice, but Adapters and Operators only show names like “Output Port #1”, which is not helpful.
  6. Constants and Schemas defined by Adapters and Operators: Custom Operators and Adapters should be able to define and export named Schemas and constants which correspond to their inputs, outputs, and special values. This would make the EventFlow integration easier and less error-prone.
  7. Incremental/Cached typechecking: When working with files in Studio, there is some spurious typechecking going on, which can become onerous when dealing with large and complicated applications. If you open an application, change layout, save, close, and re-open, the application will typecheck at least twice (on both opens) and sometimes 3-4 times (on the layout change and/or save). Typecheck results should be cached and only redone when application logic or configuration changes.
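To illustrate what I mean by the last item, here is a small sketch of a typecheck cache keyed on a digest of the logic and configuration only, so that a layout-only change produces a cache hit. This is purely hypothetical: I have no knowledge of how Studio's typechecker is structured internally, and the class name, the injected typechecker function, and the idea that logic can be cleanly separated from layout as a string are all assumptions.

```java
import java.nio.charset.StandardCharsets;
import java.security.MessageDigest;
import java.security.NoSuchAlgorithmException;
import java.util.HashMap;
import java.util.Map;
import java.util.function.Function;

// Hypothetical sketch: not how StreamBase Studio works internally.
final class TypecheckCache {

    private final Map<String, String> resultsByDigest = new HashMap<>();
    private final Function<String, String> typechecker;
    private int typecheckRuns = 0;

    // The typechecker is injected: it maps logic/config text to a result summary.
    TypecheckCache(Function<String, String> typechecker) {
        this.typechecker = typechecker;
    }

    // logicAndConfig is the text that affects typing. Layout information
    // (coordinates, colors, and so on) is deliberately not part of the input.
    String check(String logicAndConfig) {
        String key = sha256Hex(logicAndConfig);
        if (resultsByDigest.containsKey(key)) {
            return resultsByDigest.get(key); // unchanged logic: reuse the earlier result
        }
        typecheckRuns++;
        String result = typechecker.apply(logicAndConfig);
        resultsByDigest.put(key, result);
        return result;
    }

    // Number of times the real typechecker was actually invoked.
    int typecheckRuns() {
        return typecheckRuns;
    }

    // Hex-encoded SHA-256 digest of the given text.
    static String sha256Hex(String text) {
        try {
            MessageDigest digest = MessageDigest.getInstance("SHA-256");
            StringBuilder hex = new StringBuilder();
            for (byte b : digest.digest(text.getBytes(StandardCharsets.UTF_8))) {
                hex.append(String.format("%02x", b));
            }
            return hex.toString();
        } catch (NoSuchAlgorithmException e) {
            throw new IllegalStateException("SHA-256 should be available on every Java platform", e);
        }
    }
}
```

With something along these lines, the open, change layout, save, close, and re-open sequence would trigger one real typecheck rather than several.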

6. Conclusion

When used for the correct purpose, the StreamBase platform provides a remarkable combination of high developer productivity and high-performance real-time solutions. Developers can deliver business logic implementations significantly faster using StreamBase than with traditional development platforms. The concentration on that development productivity is apparent in the carefully thought-out tools provided.

With time, it is my expectation that the parts of the platform that relate to other stages of the development lifecycle will be brought up to the high standards of polish and ease of use that have already gone into the Studio development process.

If you have found this article interesting or helpful, please take some time to post your comments, questions, or suggestions on this blog.

Thanks,
Phil Martin
Datasports Inc.

Written by datasports

Oct 27, 2011 at 12:02 AM

Posted in Real World

3 Responses

  1. Good summary of SB productivity.

    Thought experiment:
    – Let’s say we have a badass coder. Let’s call him Hans. Hans is in the top 5% of developers and turns out clean, hot code faster than anyone else in his firm. He also donates heavily to elephant shelters and the ACLU, but that’s beside the point.

    – Let’s say we have another, more average coder. Let’s call him Tubby. Tubby is in the top 40% of developers and turns out code that works well in a reasonable amount of time, but he’s by no means an expert.

    – Would both developers benefit from using SB over C++/Java/.Net? The case for Tubby is fairly self-evident, but I’m not sure about Hans.

    – Would a project co-developed by Hans and Tubby be better in SB or other platforms?

    – If you were managing a team with both developers, would you force them both to use SB, or would you let Hans do his thing so long as he integrated tightly with what Tubby does in SB (presuming that Hans didn’t prefer SB)?

    Discuss.

    Data's Ports

    Oct 28, 2011 at 9:33 AM

    • Hi, thanks very much for the comment and questions. I think that we’ve got a mix of technical, project management, and philosophical line management issues here, which makes for an interesting discussion.

      First off, your continued work on elephant shelter awareness is to be lauded.

      Next, I’ll say that every firm has its Hanses and its Tubbies, and I think that how much each benefits from moving to SB depends on a lot of factors other than technical skill.

      I was something of a Hans in straight C, then in C++, then in Delphi (ugh, don’t ask), and in .Net, and on every one of those platforms, my concern was to craft high-quality code where I get to concentrate as much of my attention as possible on solving the problems that were specific to my product. I get upset anytime I have to copy & paste, write boilerplate code, or tell a computer something that it should or does already know (which is reflected in my criticisms of the duplication of project config items in SB). But I’ve known plenty of Hanses (is that the right plural?) who want to write their own string class for every new project. When C++ came along, some great C coders responded with enthusiasm, and others hated it. Similarly with the switch from C++ to Java and/or .Net.

      And the same goes for the merely good, the so-so, and the bad coders. I think that a willingness and ability to switch gears to a new platform and a new way of thinking is independent of skill.

      So, I think that willingness is a key component. If Tubby embraces SB and uses it correctly, then he’ll be more productive than Hans if Hans is made to use SB and actively subverts it.

      If Hans is so fast because he’s a generally excellent developer, then I believe that his skill and speed will translate to SB and he’ll be worth 3-5 Hanses. But if he’s so fast because his brain is wired in such a way that he’s especially well tuned to one platform or language, then that speed and skill won’t translate.

      If I was managing a team where I believed that SB was the way to go and my Hans didn’t want to do SB development, then I’d have 2 choices:

      1. Assign Hans work within the project that lets him use his language-specific skills (e.g. if we need custom Adapters, Operators, unit tests, etc. in Java, or a sophisticated Java, web, or .Net UI) at least part of the time. In that way, Hans can become something of a force multiplier by letting my Tubbies stay away from the lower-level code.
      2. If the project doesn’t need enough of those non-SB deliverables to keep Hans happily busy, then it’s time for me to look for another Hans.

      I don’t mean the latter point to sound harsh. I went the consulting route so that I could concentrate on doing SB work only, so I respect people’s language/platform preferences. But I think it’s a mistake to let personalities or individual preferences shape technical deliverables too much.

      Splitting a project into SB and non-SB components because some subsystem is better done outside SB due to the nature of the problem is one thing; splitting it like that because you’ve got one trusted developer who insists on using his or her own preferred tools is a symptom of a significant “key man” risk.

      Thanks again for reading and especially for posting.

      datasports

      Oct 28, 2011 at 10:17 AM

  2. […] It is important is to make sure that you are defining a project that’s actually well-suited to StreamBase, otherwise you’ll end up with skewed results. I’ve written a piece that discusses what is and what isn’t a good fit, along with some other stuff about productivity in StreamBase here: StreamBase and Productivity […]

