#218 — May 29, 2019

Web Operations Weekly

Soon to become Statuscode Weekly

Building Facebook's Service Encryption Infrastructure — The tale of how Facebook (who run thousands of microservices serving, they claim, ‘billions of requests per second’) migrated from using Kerberos to TLS as their backend authentication protocol.

Facebook Code

Why Tim Bray Thinks AWS SQS (Simple Queue Service) is Nearly Perfect — A love letter of sorts to SQS, which is simple, scalable, and just gets the job done. This post provoked an extensive Hacker News discussion.

Tim Bray

SRE Best Practices for Incident Management — Understand the origins of modern incident management best practices, how they align with the emerging discipline of Site Reliability Engineering (SRE), and how incidents can be proactively prevented with thoughtful failure injection.

Gremlin sponsor

endoflife.date: Quickly Check 'End of Life' Dates for Tools and Technologies — So far it covers PHP, Ruby, Node.js, Drupal, Django, Debian, Windows, and 12 other systems. Pull requests are encouraged to extend it further.


Will Only Enterprise Chrome Installs Have Full Ad-Blocking? An Update on Manifest v3 — A rather deep and technical thread, but essentially Chrome is deprecating the blocking capabilities of the webRequest API under Manifest v3, the new standard that Chrome extensions will have to adhere to. This will likely affect how ad blockers and similar extensions work. The Register has a more accessible writeup.

Simeon Vincent

💻 Jobs

Senior Site Reliability Engineer - Invoca (Santa Barbara, CA or Remote) — Join our team of Operations Engineers deploying code to our production SaaS platform & public cloud infrastructure multiple times per day.


Find a WebOps Job on Vettery — Vettery specializes in tech roles and is completely free for job seekers.


📖 Tutorials & Stories

About MetricsDB: A Time Series Database for Storing Metrics at Scale at Twitter — Twitter’s time series ingestion service handles 83 million metrics a second and, to scale into the future, the team had to seek a new approach. MetricsDB, which went live in 2017, reduces overall cost by 10x and latency by 5x compared to traditional key-value stores.

Satish Kotha and Ilho Ye (Twitter)

Atlassian's Journey Scaling Low Latency, Multi-Region Services on AWS — Atlassian went ‘all in’ on AWS in 2016 and has faced (and solved!) a variety of challenges in scaling stateless, high-availability cloud services on it. Here are some of them.


📕 20 Patterns to Watch for in Engineering Teams

GitPrime sponsor

Broken by Default: Why You Should Avoid Most Dockerfile Examples — A quick look at how even a basic Dockerfile can be broken and what to look for.

Itamar Turner-Trauring

Deploying Active-Active Postgres on Kubernetes — A step-by-step guide on how to deploy an active-active Postgres cluster on Kubernetes using Symmetric-DS (an open source database replication tool).

Dave Cramer

Why We’re Switching to gRPC — “Although building gRPC APIs requires a bit more work upfront, we found that having clear API specifications and good support for streaming more than makes up for that.”

Levin Fritz

SOC 2 Compliance using Git: A Developer's Guide — A practical list of Git best practices to help you get SOC 2 quick wins and improve developer productivity.

Datree sponsor

GraphQL Predictions for 2019 and Beyond

Robert Matyszewski

Right Sizing Your Instances Is Nonsense — Many cost optimization companies will talk about right-sizing instances or VMs as if it were trivial. AWS cost optimization guru Corey Quinn disagrees.

Corey Quinn

🛠 Code & Tools

Postgres 12 Beta 1 Released — The draft release notes go into detail on what’s new, but support for pluggable table storage is sure to open up opportunities.

PostgreSQL Global Development Group

Announcing Terraform 0.12


Sheetson: Quickly Turn Any Google Sheet Into a CRUD API

Ralph Ngo

Cloudflare Unveils Workers KV, A Highly Distributed Database — Cloudflare has built a distributed, eventually consistent key-value store aimed at users of its Workers serverless platform.

Ashcon Partovi (Cloudflare)

AWS Auto Remediate: Functions to Remedy Common Security Issues via AWS Config