
While listening to a recent episode of the Food Fight Show on the internals of the Chef client code, I heard Dan DeLeo discussing how Chef clients are written and, specifically, what expectations you should have when calling Chef methods. When asked how to trace execution order when invoking a method, the suggestion Dan provided was lots of git grep and “look through the code”. On the surface, this is a reasonable reply. If you want to know what a program is doing, the ultimate source of truth is the code. But there’s a trap lying there, one with significant operational and supportability consequences.

Reading through the code is a fantastic way to figure out how something is being done. It’s pretty much the only way to get up to speed enough to make reasonable suggestions for improving the implementation of whatever service the code provides. It should not, however, be necessary for someone writing a client to read the server code. If I want to write a web server, I don’t need to know the implementation-specific details of how Chrome, Opera, Firefox, Safari, IE, et al. work. I need to make sure I speak the common languages, HTTP and HTTPS, which everyone has agreed to use. I’m quite certain Daniel Bernstein doesn’t care which TCP stack you’re using when you query djbdns. He cares that his software gives back the right answers when queried using the defined, common methods.
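To make that concrete, here’s a minimal sketch in Ruby of what “speaking the common language” means: an HTTP/1.1 request assembled purely from the published protocol. The host and path are illustrative; nothing here depends on any particular server’s or browser’s internals.

```ruby
# A sketch: build an HTTP/1.1 GET request from the protocol spec alone.
# The host and path are illustrative examples, not real endpoints.
def build_get_request(host, path = "/")
  [
    "GET #{path} HTTP/1.1",
    "Host: #{host}",
    "Connection: close",
    "", ""              # blank line terminates the header section
  ].join("\r\n")
end

puts build_get_request("example.org", "/index.html")
```

Any server that honors the spec can answer this request; the client never needed to read the server’s source to produce it.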

If you’re writing a server, your APIs (Application Programming Interfaces) need to be documented. If you’re writing a server and say “The code is the documentation”, then a) that’s lazy and b) the documentation is only valid until the code changes. If I’m writing client code for Chef, should I subscribe to the git repository for change updates? How do I separate functional changes from refactoring or formatting changes? Should I also subscribe to all dependent modules, in case they change too — as recently happened with the excon gem, which caused chef-client runs to abort and thereby prevented node convergence on recently updated nodes? How far down the rabbit hole does this go?
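One partial defense against that kind of dependency churn is pinning versions in a Gemfile so a surprise upstream release can’t break you overnight. The version numbers below are illustrative, not recommendations:

```ruby
# Gemfile — a sketch; the version constraints shown are illustrative only.
source "https://rubygems.org"

# Pessimistic pinning: "~> 11.4.0" allows 11.4.x patch releases
# but not 11.5 or 12.
gem "chef", "~> 11.4.0"

# Explicitly pin transitive dependencies that have bitten you before,
# such as excon in the incident described above.
gem "excon", "~> 0.25.0"
```

With Bundler, `bundle install` then records exact versions in Gemfile.lock, so clients only pick up new dependency releases when you deliberately run `bundle update`.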

If I want to write a client, I believe it is a reasonable expectation to be able to simply use the documented APIs. I also believe server developers are responsible for providing that documentation, along with non-trivial examples of its use. I’m not saying you have to reach the level of DataDog’s API documentation, but if you fall far short of it, you’re not doing it well. I think DataDog’s documentation looks so good largely because so many other people document their services so poorly.

There is a time and place for procedurally generated documentation built from comments, method definitions, and the like; Javadoc and its brethren have their place. But if you’re providing a service and expect people to ultimately pay for it (which Opscode does for their hosted and private offerings), expecting folks to “look through the code” to determine functionality will work in the short term, but will cause heartache down the road when server implementations change and your client code breaks because of it… and hopefully your client code breaks in a noisy, non-service-impacting fashion. Service-impacting breakages would not be good, but the silent failures that linger for weeks, months, or longer while providing a false sense of security are equally dangerous. “Oh, yeah, we’ve got a process that checks for that. No need to worry. Let’s move on to the next thing.”
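That “noisy, not silent” failure mode is something client code can enforce on itself: validate every response against the documented contract and raise the moment a field goes missing. This is a sketch, and the `name` and `run_list` fields are hypothetical examples of a contract, not a real schema:

```ruby
# A sketch of failing loudly: check a response against the fields the
# documentation promises, instead of silently passing bad data along.
# "name" and "run_list" are hypothetical documented fields.
REQUIRED_FIELDS = %w[name run_list].freeze

def validate_node!(response)
  missing = REQUIRED_FIELDS.reject { |field| response.key?(field) }
  unless missing.empty?
    # Raising here surfaces a contract change on day one, rather than
    # weeks after a silent check quietly started lying to you.
    raise "API response missing documented fields: #{missing.join(', ')}"
  end
  response
end
```

The noisy raise turns “the server changed under us” into an immediate, debuggable error instead of a months-old false sense of security.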

So, in summary: server or service developers, document your APIs (and include good examples, not simply “Here’s how I do ‘Hello World’”). And, if you’re a client developer, demand good documentation of the APIs you’re using. If you’re feeling generous, contribute documentation back to the community and make sure the developers know your expectations when using their product. Increase communication, and do so via good documentation!

But that’s just my opinion. I could be wrong.