Seeing through the clouds: serverless observability across cloud platforms


Once I finished building my initial prototype to dig into observability, I couldn’t sleep. My mind jumped all over the place thinking of everywhere I could make use of this newly gained knowledge. I eventually fell asleep thinking about all the great things I could do, and woke up less than two hours later. My application needed to do more to confirm whether or not this would be helpful beyond a really simple app. My next project was to tackle deploying another component of my application in a different cloud platform provider.

So I had a great starting point, an app that allows me to get the weather for any city in the world via the Open Weather Map API, but what about weather on other planets? I mean since I’m already building a multi-cloud service, why not leverage some planet-scale DB along the way and create an interplanetary service? I know, this is taking geeky to another level but hey, why not learn a little about astronomy along the way 😄.

So I decided to build the second component of my distributed system, the planetary-api, by signing up for an Azure free trial. From there it took a few minutes to install the necessary tooling. To get started I needed the Azure CLI as well as the Serverless Framework plugin for Azure Functions.

brew install azure-cli
npm install --save serverless-azure-functions
serverless create --template azure-nodejs
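For reference, the scaffolded project is driven by a serverless.yml along these lines. This is a sketch based on the serverless-azure-functions plugin docs, and the exact keys (notably the x-azure-settings block) depend on the plugin version you install:

```yaml
service: planetary-api

provider:
  name: azure
  location: West US

plugins:
  - serverless-azure-functions

functions:
  hello:
    handler: handler.hello
    events:
      - http: true
        x-azure-settings:
          authLevel: anonymous
```

The httpTrigger and http bindings in the deploy output below come from this events section.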

Seemed easy enough…

sls deploy
Serverless: Building Azure Events Hooks
Serverless: Parsing Azure Functions Bindings.json...
Serverless: Building binding for function: hello event: httpTrigger
Serverless: Building binding for function: hello event: http
Serverless: Packaging service...
Serverless: Excluding development dependencies...
Serverless: Logging in to Azure
Serverless: Open a browser to https://aka.ms/devicelogin and provide the following code (which is copied to your clipboard!) to complete the login process: XXXXXX

Sadly, I ran into my first roadblock here: my browser prompted me to enter the code provided, but doing so would often lead me to an Azure error page. When it did eventually succeed, the command line would display the following error message.

Error --------------------------------------------------
ENOENT: no such file or directory, open '~/.azure/azloginServicePrincipal.json'

After some insightful Google searches, it turned out I needed to log in first using the CLI.

az login

With the login sorted, I thought I would finally be able to get things going, but my deployment failed yet again with another dandy message: Long running operation failed with error: "At least one resource deployment operation failed. Please list deployment operations for details. Please see https://aka.ms/arm-debug for usage details". Re-running serverless with SLS_DEBUG=* enabled gave no additional information, so I started digging through the Azure portal (yes, the console here is called a portal). After much clicking through the user interface, I finally found a somewhat cryptic error message that I had to Google: "The requested app service plan cannot be created in the current resource group because it is hosting Linux apps". I deleted the resource group, re-deployed my serverless function, and behold, my next helloworld app was up and running.

sls invoke -f planetary-api
Serverless: Logging in to Azure
Serverless: Invoking function "planetary-api"
Go Serverless v1.x! Your function executed successfully!

It was now time to instrument my function. I created a separate dataset in my Honeycomb interface and added some instrumentation to my code. After re-deploying the function, invocations were failing again. Unlike with the code I wrote in my previous post, the errors coming back from my invocations did not help me get to the bottom of what was going on. I ended up having to use the Azure App Service UI to get more meaningful messages.

Could definitely have used those logs
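Under the hood, all of this instrumentation ultimately boils down to shipping JSON events to Honeycomb’s events API (a POST to /1/events/&lt;dataset&gt;, authenticated with the X-Honeycomb-Team header). My function is NodeJS, but here is a rough sketch in Go, the language used elsewhere in this series. The write key, dataset, and field names are placeholders, and in practice a beeline or libhoney library handles batching and retries for you:

```go
package main

import (
	"bytes"
	"encoding/json"
	"fmt"
	"net/http"
)

// newHoneycombEvent builds a single-event request for Honeycomb's events API.
// writeKey and dataset are placeholders; fields become the event's columns.
func newHoneycombEvent(writeKey, dataset string, fields map[string]interface{}) (*http.Request, error) {
	body, err := json.Marshal(fields)
	if err != nil {
		return nil, err
	}
	url := fmt.Sprintf("https://api.honeycomb.io/1/events/%s", dataset)
	req, err := http.NewRequest(http.MethodPost, url, bytes.NewReader(body))
	if err != nil {
		return nil, err
	}
	req.Header.Set("X-Honeycomb-Team", writeKey)
	req.Header.Set("Content-Type", "application/json")
	return req, nil
}

func main() {
	req, err := newHoneycombEvent("MY_WRITE_KEY", "planetary-api", map[string]interface{}{
		"name":        "get-planet-weather",
		"duration_ms": 42,
	})
	if err != nil {
		panic(err)
	}
	fmt.Println(req.Method, req.URL)
	// An http.Client's Do(req) call would actually send the event.
}
```
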

Turns out I was missing a node module, which I installed and, you guessed it, re-deployed. Ta daaaaaaaa! My application can now talk to a serverless function in another cloud.

sls invoke -f weatherary -d '{"planet":"mars"}'
{
  "statusCode": 200,
  "headers": {
    "Content-Type": "application/json",
    "X-MyCompany-Func-Reply": "hello-handler"
  },
  "body": "{\"city\":\"\",\"weather\":\"fine\"}"
}

First and foremost, I can’t emphasize enough how much time I spent troubleshooting my NodeJS code simply because I didn’t have a compiler handy. I spent hours digging into typos and missing interfaces because I couldn’t rely on my trusty compiler to tell me when things weren’t where they needed to be. So I guess lesson one for me: I strongly prefer compiled languages; my brain and coding habits just allow me to be a lot more productive in them. Either that, or I haven’t learned to embrace an IDE that offers better support.

Another thing that became clear is the importance of naming things well. This becomes especially important as tracing expands to multiple services: the ability to quickly identify applications or code relies heavily on the work put in up front to name things well. For example, if you look at my initial trace across two applications, it’s not immediately clear what is happening and where.


I invested a small amount of time adding more context to my tracing via two additional fields: platform and application. I used the propagation library Honeycomb offers for both Golang and NodeJS to provide continuity in my traces. The result is much clearer, as I can now determine just by looking at the trace where the functions are running and in which application.


I spent a fair amount of time getting frustrated by the tooling to deploy my application to Azure. I’m sure this is partially due to my unfamiliarity with the platform, but having the ability to quickly re-deploy things without having to push buttons is important to me. My workflow to deploy the initial app to Amazon Web Services via the serverless framework felt a hundred times more efficient than having to click my way through the Visual Studio Code interface. I’d love to get some feedback from folks that have used the Azure tools for longer to get better at using them, because I kept telling myself “There just has to be a better way” each time I clicked the deploy button and it prompted me to confirm I wanted to deploy.

There just has to be a better way…

NOTE: According to the serverless folks who responded to a tweet of mine, it looks like better things are coming for the Azure integration with serverless.

It’s really cool to experience first-hand how simple it was to set up tracing across components in multiple cloud providers. Although I haven’t quite started leveraging the real power of Honeycomb just yet, I can already see the benefits of using cloud-agnostic observability tooling, as it gives me more control over my data. It was an interesting learning experience to realize how foreign the user interface and terminology in Azure felt compared to the world of AWS I was already familiar with. Eventually it started making sense, but it’s a good reminder of what it was like to get started with cloud providers years ago.

Ultimately, I decided to change the architecture of my application a bit and deploy the planetary-api into AWS as a Golang binary. The main reason being that I wanted to insert what should have been a simple sleep in my code to simulate the time it would take for requests* to reach faraway planets. Yes, I know I could have done it via setTimeout or in Python, but this was supposed to be fun 😃. The next step in this project will involve Azure and Google Cloud Platform for deploying the weather-station components. I’ll also be adding at least one database component, and am planning on tackling some failure scenarios to put Honeycomb to the test, so stay tuned. As before, the code can be found on GitHub.

* The requests I’m sending will be traveling at ludicrous speeds, of course, because who the hell has time to wait four to twenty-four minutes for a response from Mars? Seriously, who’s fixing this speed-of-light limitation?
