AgileData.io DataOps

There are a number of mindsets and patterns that we expect all members of the AgileData.io team to embrace. This document provides a light overview of these.

It is a form of mantra that highlights the things we think are more important than others.

This is the page we send new team members to first, so they can get a feel of how we work.

Our definition of DataOps

DataOps is the term we use to describe the blending of DevOps, agile and lean patterns to automate the way we build and maintain the AgileData.io product.

We focus on the art of simplicity. To achieve this, everything we do should be automated, decoupled and replaceable.

Single version of code

  • We store all our code in a code repository so that everyone on the team can see, review and work on the code whenever they need to.

  • We store our code in Google Cloud Repository.

  • You can use whatever IDE you prefer to create your code.

  • You should pull code whenever you want to make a change, and commit it when you have made that change.

Connect and collaborate when required

  • We all work remotely. You are expected to be connected via Slack and available via Google Meet whenever you are working.

  • If you don’t need a reply now, post it in Slack.

  • If you need a reply now, contact the person directly via phone or Google Meet.

  • We don’t use emails to have conversations with each other.

  • You don’t have to respond to Slack posts immediately, but it’s good behaviour to add a reaction to a post when you have read it. Feel free to find your own favourite unique reaction icon (but you don’t have to).

  • We expect everybody will need to collaborate with other team members to get stuff done, so we do it as and when we need to. If you haven’t collaborated with somebody on a given day, something is probably wrong and worth fixing.

  • We don’t do “daily standups” as we are small enough to leverage person-to-person connections and regular team checkpoints via Slack. As the team grows we may need to do daily checkpoints via Google Meet.

Plan when required

  • As we finish our current task we check in via Slack to collaborate and agree what the next priority is.

  • We run a visual Product Backlog in AgileData.io (hey, product backlogs are just data, right? ;-). This provides a view of what we think we will work on now, in the next 1-3 months, and beyond 3 months.

  • You can add features you think we need to the feature channel in Slack at any time. These get added to the product backlog when it is refined.

  • We physically get together somewhere in the world every 3 months to work together for a week or two, and to reconnect, learn new jokes and collaborate on updating the product backlog with what we think is important.

Serverless infrastructure as code

  • We use the Google Cloud Platform (GCP) as our underlying infrastructure. We have a strong preference for GCP serverless components over anything else.

  • Regardless of what type of GCP service we are using, we provision our infrastructure using code, not manual effort (see the sketch after this list).

  • We treat our infrastructure as “cattle not pets”: if a part of our infrastructure becomes unhealthy, instead of repeatedly trying to fix it to restore its health we destroy the unhealthy one and provision a new healthy one. This is one of the many reasons we provision our infrastructure using code.
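
To make this concrete, here is a minimal sketch of provisioning a piece of infrastructure from code, assuming Python and the google-cloud-bigquery client library; the project and dataset names are illustrative, not our real environment.

    from google.cloud import bigquery

    # Minimal sketch: create a BigQuery dataset from code rather than by hand.
    # "example-project" and "landing" are illustrative names, not our real environment.
    client = bigquery.Client(project="example-project")

    dataset = bigquery.Dataset("example-project.landing")
    dataset.location = "US"

    # exists_ok=True keeps the step idempotent, so the same code can safely
    # re-provision a dataset after an unhealthy one has been destroyed.
    client.create_dataset(dataset, exists_ok=True)

Because a step like this is idempotent, it can be re-run to recreate infrastructure rather than nursing an unhealthy instance back to health.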

We value simplicity, then scale, then performance, then cost

  • We value creating things that are simple for our users to use.

  • We value things that are simple for us to create and maintain.

  • Once we have created something with simplicity, we value spending time making sure it can scale as we do.

  • Once we have ensured it can scale, we value spending time improving its performance.

  • Once it performs well, we value spending time reducing our costs.

Data is architected using strict patterns

  • Data and the data infrastructure are architected with a number of fit-for-purpose data patterns, which manage the way data is collected, transformed and stored.

  • The AgileData.io product meets the needs of many different types of customers, industries and use cases. To do this we have created some “magic” that provides us the flexibility to meet these needs, while retaining the ability to leverage reusable patterns.

We loosely leverage the following patterns as part of our “magic” (a small illustrative sketch follows the list):

  • Persistent Staging Area (PSA) or Data Lake;

  • Event modelling;

  • Data Vault;

  • Gherkin;

  • Business Rules Execution;

  • Pub/Sub or Message Queues;

  • BDD-orientated testing;

  • Microservices;

  • Virtualised or persisted semantic layer;

  • Open Source code.
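
As an illustration of one of these patterns, below is a minimal sketch of deriving a Data Vault hub hash key in Python; the function name and normalisation rules are assumptions for the sketch, not the actual AgileData.io implementation.

    import hashlib

    def hub_hash_key(business_key: str) -> str:
        # Illustrative only: normalise the business key, then hash it so the
        # same business key always resolves to the same hub record.
        normalised = business_key.strip().upper()
        return hashlib.sha256(normalised.encode("utf-8")).hexdigest()

    # The same business key always produces the same hash key.
    assert hub_hash_key("customer-42") == hub_hash_key(" Customer-42 ")

A deterministic rule like this is what makes a pattern reusable: the same code can be applied to business keys from any customer, industry or use case.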

Continuously build, test, release and deploy

  • We continuously build, test and deploy our infrastructure, code, data and content.

Log everything, it’s all critical data insights

  • We log every interaction, both manual and automated, to provide empirical data on the state, performance and usage of the AgileData.io product.
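
As a minimal sketch of what a logged interaction might look like, assuming Python and plain JSON lines written to stdout; the field names are illustrative, not the actual AgileData.io log schema.

    import json
    import logging
    import sys

    # One structured JSON line per interaction, so the logs themselves become
    # queryable data. The field names below are illustrative only.
    logging.basicConfig(stream=sys.stdout, level=logging.INFO, format="%(message)s")

    def log_interaction(event: str, actor: str, **detail) -> None:
        logging.info(json.dumps({"event": event, "actor": actor, **detail}))

    log_interaction("rule_executed", actor="scheduler", rule_id="demo-rule", status="success")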

Check everything is working the way it should

  • We continuously run tests on our infrastructure, code and data to identify issues immediately.
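
As a sketch of the kind of automated check we mean, here is an illustrative data test written with pytest and the google-cloud-bigquery client; the project, dataset, table and column names are assumptions, not the actual AgileData.io schema.

    from google.cloud import bigquery

    def test_no_orphaned_customer_keys():
        # Illustrative data test: every sales row should reference a known customer key.
        # The project, dataset, table and column names are assumptions for this sketch.
        client = bigquery.Client()
        query = """
            SELECT COUNT(*) AS orphans
            FROM `example-project.consume.sales` s
            LEFT JOIN `example-project.consume.customer` c USING (customer_key)
            WHERE c.customer_key IS NULL
        """
        orphans = next(iter(client.query(query).result())).orphans
        assert orphans == 0

Run continuously, a test like this surfaces a data issue as a failing check rather than as a surprise for a user.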