Talks

We would like to give a huge thank you to our speakers. The talks they deliver are the reason we have this conference.

Planting Seed

Jamie Hewland
Site Reliability Engineer
Praekelt Foundation

Follow @JayH5

Open source tools from companies such as Mesosphere, CoreOS and HashiCorp have made it possible for small teams to build scalable, highly automated infrastructure on which to deploy applications. Software container systems such as Docker have drastically simplified the way these applications are packaged and executed.

Over the last 2 years, we've worked hard to update our infrastructure from a complex set of manually-coordinated processes to an automated system that deploys Docker containers to a cluster. We made this move to reduce costs, scale further and make our software more portable.

We've used this new infrastructure to host hundreds of websites for Facebook's Free Basics platform. Our new health software stack, Seed, has pushed the requirements of the infrastructure even further. We're working towards replicating this stack to deployments in countries across the world.

The talk will cover some of the basic building blocks of cluster orchestration systems such as consensus algorithms, distributed key/value stores and resource allocation. We'll cover some of the challenges in networking containers at scale and touch on some solutions for persistent storage.

Finally, we'll look at how these problems and the available solutions have shaped our infrastructure, the design trade-offs we made, and what we're looking forward to in the future.

Jamie is a software developer at Praekelt Foundation. He believes that good user experience is important throughout the full software stack — from the front end down to the lowest-level system code. He studied Computer Science at UCT and did research on evolving robotic teams.

The video for this talk is available on YouTube

Design patterns for product launch success

Michael Needham
Senior Manager of Solutions Architecture - RMEA
Amazon Web Services

Follow @smneedham

With more 'Unicorns' in the past 18 months than in the entire history of online 'Unicorns', the macroeconomic timing and technical conditions for product disruption are favourable. This talk will focus on 5 design patterns for the rapid launch of new products and cost-efficient scale when success takes hold.

The more technical aspects of this talk will consider the serverless architecture paradigm (based on AWS Lambda) for low-cost unit economics in your products and extremely high scale across global regions. It will touch on frameworks such as the open source Jaws framework (now called Serverless) as a way of dealing with the complexities of such deployments, a typical challenge in this space. The final design pattern focuses on scaling your products to new channels and will consider the IoT theme with a small example of a Lambda integration into Amazon Alexa for a simple beer ordering service :)

In this role Michael is supporting internal teams and directly working with customers to successfully architect and run their applications at scale in the Amazon Web Services cloud.

Michael has 19 years’ IT experience working in Africa, Asia and Europe. At various stages of his career he played a key role in shaping the platforms and technology that power some of the top online brands of Southern Africa, including a leading ISP, a breaking news portal, mobile-first startups in Kenya and Nigeria and, more recently, online retail. Over a 10-year period, Michael also worked as technical due diligence lead on Naspers’ mergers and acquisitions team, supporting over $300 million of investments in emerging market eCommerce startups such as Flipkart (India), SimilarWeb (Israel), Konga (Nigeria), Souq (Middle East), PayProp (South Africa) and many more.

The video for this talk is available on YouTube

Rapid, Reproducible Builds - Treating our build servers like production

Dewald Viljoen
Senior Software Developer
ThoughtWorks

Follow @DewaldV

I'd like to talk to everyone about someone we often neglect. That someone keeps our code flowing into production, runs all those automated tests, does things that we might not even remember to deploy to all our environments.

That someone is our build servers.

They work tirelessly to get our changes to our end users, and we neglect them. We don't treat our build servers like production servers: we hack them, we play with them, we mess up the environments. We may have several agents with different runtimes and gems and patch levels and who knows what.

We can do better.

In this talk I'd like to take you through something I've been working on over the last couple of months: containerising your build. The tool I'll be using for this is Docker, but it's achievable with any VM or container technology. I'll take you through going from a working local build, to building locally inside a container, and then to moving that build to a build server, explaining the what, how and why as I go.

Dewald is a software developer and ThoughtWorker from Johannesburg. He believes that software has no value if it cannot get to the consumer, and that close collaboration between operations and development and automating all the things are crucial to succeeding at this goal.

The video for this talk is available on YouTube

Scaling Like Twitter With Apache Mesos

Sunil Shah
Technical Lead
Mesosphere

Follow @ssk2

When Twitter was faced with the task of re-architecting their backend away from a monolithic Ruby on Rails application to a series of microservices, they turned to Apache Mesos. Mesos is an open source cluster manager that was developed at UC Berkeley and powers the backend of heavily trafficked services like Twitter, Siri and others. Mesos has been proven to scale to tens of thousands of nodes.

Using Mesos with container software like Docker and other open source tools, you can easily deploy automated infrastructure that scales with both your users and developers. In this presentation, we will introduce and motivate the Apache Mesos project and demonstrate common workflows for building and deploying applications to a cluster of machines.

Sunil Shah is a technical lead at Mesosphere, working on tools and services around the Apache Mesos project to make the lives of developers easier. Before joining Mesosphere, Sunil worked at music recommendations service Last.fm and completed a Master's program at UC Berkeley in EECS, working on real-time processing of images collected from drones. When he's not flying drones around, Sunil likes to cycle, camp, hike, ski and play a large drum.

The video for this talk is available on YouTube

Scaling Like Twitter With Apache Mesos

Philip Norman
Software Developer
Mesosphere

Follow @philipnrmn

Originally from London, Philip Norman now works out of Hamburg developing scalable container tooling for Mesosphere. Previously, Philip worked freelance in the web space on projects ranging from experimental social networks to FinTech solutions. Philip likes to relax by cycling, bouldering, and exploring the weirder corners of North German cuisine.

The video for this talk is available on YouTube

Scaling Support by Educating Customers

Job Thomas
Training and Education Manager
Automattic

Growing a business does not just imply selling more products and getting more customers; one of the biggest challenges is supporting that growing customer base in handling their acquired products. With growth comes an increase in potential errors, bugs, and struggles. So how does a growing tech company scale support?

At WooThemes we get about 10,000 ticket submissions per month, which stands in stark contrast to the almost 500,000 visitors to our public support resources. This talk will focus on finding a good balance between providing a stable public resource and answering urgent ticket requests, and on the mistakes and wins we've encountered along the way.

Job Thomas is a Belgian who followed his South African wife to Cape Town in 2013. He has worked as training manager at several NGOs and as academic teaching staff at a few colleges before joining WooThemes as Training and Education Manager focusing on proactive support strategy.

The video for this talk is available on YouTube

Continuous deployment to millions of users 40 times a day

Michael Gorven
Production Engineer
Facebook/Instagram

Follow @mgorven

Instagram automatically deploys code 40 times a day, whenever engineers land changes, to its fleet of thousands of web servers and its user base of 400M monthly active users. This talk describes the iterative approach we took to building this system, the problems we faced along the way, the solutions we implemented, and the key principles which enable this to work.

Michael Gorven is a Production Engineer at Facebook, where he works on Instagram. He fixes things when they break, improves the reliability of the system, helps engineer it to scale, and reverts diffs. Previously he was an early employee at South African startup Nimbula. Michael grew up in Durban and holds a BSc in Electrical and Computer Engineering from the University of Cape Town. He currently lives in California with his wife and 2-year-old son.

Retrospective: Scaling Infrastructure at Etsy

Bethany Macri
Software Engineer
Etsy

Follow @bmacri

Etsy is an online marketplace that connects people across the globe to buy and sell handmade and vintage goods. Founded in 2005 in Brooklyn, Etsy has grown rapidly. As a result, Etsy has needed to scale its infrastructure to serve its growing user base as well as its developers. In this talk, Bethany Macri, Core Platform Engineer at Etsy, will discuss three software projects that demonstrate how Etsy changed its infrastructure to scale and maintain high availability:

  1. a four-year data migration project
  2. writing a new API that allowed clients to fetch data concurrently, and
  3. scaling datasets infrastructure.

This talk will propose a model for scaling and discuss possible ways Etsy will continue to scale as it grows.

Bethany Macri is a Software Engineer on Etsy's Core Platform team. Bethany studied Literature in college, then taught herself how to code and attended the Recurse Center before working for Etsy. She is passionate about infrastructure and web operations, Mexican food and theater.

The video for this talk is available on YouTube

FUD at scale

Len Weincier
Founder
CloudAfrica

Follow @lenw

Fear, Uncertainty and Doubt can seriously limit your ability to scale. Would you ever build your own hardware? Would you build an operating system? Let's take a look at the "other" barriers to scaling.

Len has built systems ranging from embedded devices to enterprise-scale big data systems. Coming from C and C++ via Java and JEE through Ruby and now into Go and Node with bits of Clojure, he has been around the programming block. He's trained many developers, business analysts and project managers over time and has spoken at many conferences.

The video for this talk is available on YouTube

Containers will not fix your broken culture (and other hard truths)

Bridget Kromhout
Principal Technologist
Pivotal

Follow @bridgetkromhout

Containers will not fix your broken culture. Microservices won’t prevent your two-pizza teams from needing to have conversations with one another over that pizza. No amount of industrial-strength job scheduling makes your organization immune to Conway’s Law.

Does this mean that devops has failed? Not in the slightest. It means that while the unscrupulous might try to sell us devops, we can’t buy it. We have to live it; change is a choice we make every day, through our actions of listening empathetically and acting compassionately.

Making thoughtful decisions about tools and architecture can help. Containers prove to be a useful boundary object, and deconstructing systems to human-scale allows us to comprehend their complexity. We succeed when we share responsibility and have agency, when we move past learned helplessness to active listening. But there is no flowchart, no checklist, no shopping list of ticky boxes that will make everything better. “Anyone who says differently is selling something”, as The Princess Bride teaches us.

Part rant, part devops therapy, this talk will explain in the nerdiest of terms why CAP theorem applies to human interactions too, how oral tradition is like never writing state to disk, and what we can do to avoid sadness as a service.

Bridget Kromhout is a Principal Technologist for Cloud Foundry at Pivotal. Her CS degree emphasis was in theory, but she now deals with the concrete (if ‘cloud’ can be considered tangible). After years in site reliability operations (most recently at DramaFever), she traded in oncall for more travel. A frequent speaker at tech conferences, she helps organize the AWS and devops meetups at home in Minneapolis, serves on the program committee for Velocity, and acts as a global core organizer for devopsdays. She podcasts at Arrested DevOps, occasionally blogs at bridgetkromhout.com, and is active in a Twitterverse near you.

The video for this talk is available on YouTube

The Strangler Pattern

Quinton Parker
Solutions Architect
Spree.co.za

Follow @quintonparker

Think of a large, complex, legacy (and monolithic?) software system as a Boeing and its users as the passengers. This talk is about a powerful (but not new) strategy to replace the jet engines (performance) whilst simultaneously enabling more passengers to embark mid-flight (scalability) without anybody getting killed in the process (developers included).

This talk will showcase real-world and practical concepts, pragmatic thinking, microservice adoption, silly mistakes, and unusual software pattern mashups employed by Spree engineering teams on our journey to a preferable ecommerce platform.

Quinton is a Solutions Architect for Spree.co.za but he's not the hand-waving, box-drawing kind. He's the roll-up sleeves, write code and ensure big ideas sensibly translate into practical and real-world implementation kind of architect. As a young underprivileged developer he would spend many years working on large-ish PHP/MySQL applications. Today he sips the Node.js koolaid, Elasticsearches all the things, and is a devout varnishcache fanboy.

The video for this talk is available on YouTube

Load Balancing is Impossible

Tyler McMullen
CTO
Fastly

Follow @tyler

Load balancing is something most of us assume is a solved problem. But the idea that load balancing is "solved" could not be further from the truth. If you use multiple load balancers, the problem is even worse. Most of us use "random" or "round-robin" techniques, which have certain advantages but are highly inefficient. Others use more complex algorithms like "least-conns," which can be more efficient but have horrific edge cases. "Consistent hashing" is a very useful technique, but only applies to certain problems. There are several factors that exist both in theory and practice that make efficient load balancing an exceptionally hard problem.

For instance:

  • Poisson request arrival times
  • Exponentially distributed response latency
  • Oscillations when sharing data between multiple load balancers

Luckily, there are techniques and algorithms that have been developed that can make life better. I’ll walk through some of the ways that we can do better than “random,” “round-robin,” and naive “least-conns,” even with distributed load balancers.
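
As a rough illustration of the setting described above (not taken from the talk itself), the following Python sketch simulates a single load balancer in front of a pool of single-worker FIFO servers, with Poisson request arrivals and exponentially distributed service times. It compares the mean queueing delay under "random", "round-robin" and "least-conns" selection. All function names, the server count, the load level and the seed are assumptions chosen purely for illustration.

```python
import random
from collections import deque
from itertools import count


def pick_random(queues, rng, _counter):
    """Choose a server uniformly at random."""
    return rng.randrange(len(queues))


def pick_round_robin(queues, _rng, counter):
    """Cycle through the servers in order."""
    return next(counter) % len(queues)


def pick_least_conns(queues, _rng, _counter):
    """Choose the server with the fewest in-flight requests (ties: lowest index)."""
    return min(range(len(queues)), key=lambda i: len(queues[i]))


def simulate(pick, servers=10, requests=20000, load=0.9, seed=1):
    """Return the mean time a request waits before service starts.

    Each server is modelled as a FIFO queue with one worker and a mean
    service time of 1.0; `load` is the average utilisation per server.
    """
    rng = random.Random(seed)
    counter = count()
    # Each deque holds the completion times of that server's in-flight requests.
    queues = [deque() for _ in range(servers)]
    arrival_rate = load * servers
    now = 0.0
    total_wait = 0.0
    for _ in range(requests):
        # Poisson arrivals: exponentially distributed gaps between requests.
        now += rng.expovariate(arrival_rate)
        # Drop requests that have already completed.
        for q in queues:
            while q and q[0] <= now:
                q.popleft()
        i = pick(queues, rng, counter)
        # Service starts when the chosen server finishes its current backlog.
        start = queues[i][-1] if queues[i] else now
        total_wait += start - now
        queues[i].append(start + rng.expovariate(1.0))
    return total_wait / requests


if __name__ == "__main__":
    for name, pick in [("random", pick_random),
                       ("round-robin", pick_round_robin),
                       ("least-conns", pick_least_conns)]:
        print(f"{name:12s} mean wait: {simulate(pick):.2f}")
```

Running the script prints the mean wait for each strategy. In this idealised model least-conns should come out well ahead of random, but the model has only one balancer with perfect knowledge of every queue, so it deliberately ignores the edge cases and the multi-balancer oscillations mentioned above.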

Tyler McMullen is CTO at Fastly, where he’s responsible for the system architecture and leads the company’s technology vision. As part of the founding team, Tyler built the first versions of Fastly’s Instant Purging system, API, and Real-time Analytics. Before Fastly, Tyler worked on text analysis and recommendations at Scribd. A self-described technology curmudgeon, he has experience in everything from web design to kernel development, and loathes all of it. Especially distributed systems.

The video for this talk is available on YouTube

The dirty secrets of building large, highly available, scalable HTTP APIs

Damian Schenkelman
Engineer
Auth0

Follow @dschenkelman

When you first start building an API for a new product you mostly focus on getting an MVP ready, with the goal of shipping as soon as possible so you can get feedback from customers. If you are lucky enough, your product will be successful and you will have to start worrying about things like authentication, authorization, documentation, validation, rate limiting, geo-redundancy, and zero-downtime deployments. In this talk I will go over some real-life examples of our experience evolving our APIs at Auth0 and some of the tools we use for that.

Damian is an engineer at Auth0 working on making the core scalable and performant. He loves learning about distributed systems, software performance and contributing to OSS.

The video for this talk is available on YouTube

Get in touch

Social Media

Follow us on Twitter @scaleconf

Find us on Facebook

Community

This site is hosted on GitHub. Please feel free to look around and contribute.

Code of conduct

© ScaleConf 2022