Stuff The Internet Says On Scalability For July 15th, 2016

Hey, it's HighScalability time:


That little smudge on Jupiter is North America (size comparison). If you like this sort of Stuff then please support me on Patreon.

  • <2%: percent of total U.S. electricity consumption used by data centers; $4.99: hourly wage of Amazon Turkers; 8,072: cores in Cassandra cluster; .5: new reward for slaving away in the Bitcoin mines; 11: source code for the original Apollo guidance computer; 10 inverse femtobarns: number of collisions recorded by the Large Hadron Collider; 34 bps: using MEMO to send molecular messages through the air; 200 MB: record for storage in DNA; 10,000+: 3D printed parts are used in a Rolls-Royce Phantom; $43.6bn: IaaS revenue to triple by 2020; 

  • Quotable Quotes:
    • @PokemonGoApp: To ensure all Trainers can experience #PokemonGo, we continue to add new resources to accommodate everyone. Thank you for your patience.
    • @balajis: Pokemon Go is a classic overnight success, 10 years in the making. Ingress database, Google Maps, the Pokemon brand…
    • @avantgame: The math of Pokemon Go is pretty amazing. 21 million players in ONE week, playing 43 minutes on average a day.
    • @icecrime: Does Pokemon Go have generics?
    • @HarvardBiz: When companies start scaling, they often start seeing the future as a threat
    • Jakob Engblom: for the best performance, you want to break the design apart across cut-points with the lowest level of communication across the cut.
    • @peterpur: once again, it becomes obvious that complexity feeds itself, while simplicity needs conscious effort & hard work.
    • @jamesurquhart: Mine is already a microservice because it runs on a microcomputer. Right? Right?
    • Facebook: In our experience, every time we add a new tool, we are surprised that we managed without it.
    • @petecordell: Telling a programmer there's already a library to do X is like telling a songwriter there's already a song about love
    • @linclark: Code that my mom wrote 50 years ago just went up on GitHub
    • @danielbryantuk: "Our monolithic application was so monolithic that we gave it a name - jimmy..." Haha, awesome! @ZalandoTech at #microservices summit
    • Uri Hasson~ even across different languages, our brains show similar activity, or become “aligned,” when we hear the same idea or story.
    • @etherealmind: Bidirectional forwarding detection is most significant advance in Autonomous Routing in the last 20 years.
    • @aphyr: In particular, I'd like to note that @VoltDB has opted to preserve strong serializabilty as the default behavior, despite a latency cost.
    • @swardley: the system is based on a cycle of theft, settlers steal from pioneers forcing them to move on ...
    • Ian Adams: That the actual encoding at the CPU [for erasure coded storage] is generally not the bottleneck, but instead that the network tends to be, especially when you have really “wide” codes, e.g. 17/20 causing tons of traffic across many storage nodes for every request. 
    • Ayende: There is about 10% difference between fsync and fdatasync when using the HDD, but there is barely any difference as far as the SSD is concerned. This is because the SSD can do random updates (such as updating both the data and the metadata) much faster, since it doesn’t need to move the spindle.
    • @cpurdy: As long as flash capacities have an order-of-magnitude advantage over RAM, flash is allowed to be slower ;-)
    • @huntchr: Before you all go nuts re #serverless, #mechanicalsympathy remains important. You still need to understand what is going on under the hood.
    • Gallant: These results demonstrate that dynamic brain activity measured under naturalistic conditions can be decoded using current fMRI technology.
    • @sheeshee: trying to convince somebody to archive a really old CGI script roughly 1994 for archeological purposes.. old code is important for learning.
    • Kreps & Kleppmann: we advocate a style of application development in which each data storage and processing component focuses on “doing one thing well”. Heterogeneous systems can be built by composing such specialised tools through the simple, general-purpose interface of a log. 

  • This is how you know you are Facebook. Instead of testing your new mobile software on one device you have a datacenter, with a lab, with around 60 custom-made racks bristling with 2000 mobile phones, all so you can test all the different combinations and permutations. The mobile device lab at the Prineville data center.

  • The challenge was made. @adrianco: Let me know when you run a 1000 node Cassandra cluster on Kubernetes :-). The challenge was met. Thousand Instances of Cassandra using Kubernetes Pet Set: We deployed 1,009 minion nodes to Google Compute Engine (GCE), spread across 4 zones, running a custom version of the Kubernetes 1.3 beta. We ran this demo on beta code since the demo was being set up before the 1.3 release date. For the minion nodes, GCE virtual machine n1-standard-8 machine size was chosen, which is vm with 8 virtual CPUs and 30GB of memory. It would allow for a single instance of Cassandra to run on one node, which is recommended for disk I/O. 

  • Lures from Pokemon Go have turned out to be amazingly effective. Pokemon Go Is Driving Insane Amounts of Sales at Small Local Businesses. Here's How It Works. Building that kind of native business model driver deep into the game mechanics is the real trick of the game. Also, How the gurus behind Google Earth created 'Pokémon Go'.

  • A new building block of compression. Save bandwidth with Dropbox's new Lepton tool and file format, which losslessly compresses JPEGs by an average of 22%. It: compresses JPEG files at a rate of 5 megabytes per second and decodes them back to the original bits at 15 megabytes per second, securely, deterministically, and in under 24 megabytes of memory.

  • We are using less energy than expected in datacenters. How often does that happen? And if you only have 30 minutes to get all your networking news then plug in and take a Network Break with Packet Pushers. An interesting bit this week is from United States Data Center Energy Usage Report. Not too many years ago it was projected computer electricity usage would outstrip supply and grid capacity. That hasn't been the case: "data center electricity consumption increased by about 4% from 2010-2014, a large shift from the 24% increase estimated from 2005-2010 and the nearly 90% increase estimated from 2000-2005..... In 2014, data centers in the U.S. consumed an estimated 70 billion kWh...additional energy efficiency strategies and technologies that could significantly reduce data center electricity use below the approximately 73 billion kWh demand projected in 2020." We can thank companies like Facebook and Google that have taken energy saving seriously and chip makers like Intel for finally making energy efficiency a goal.

  • Videos from the Use Case track at @DockerCon 2016 are available

  • MapD shares how they are able to visualize large numbers of records at the grain-level in near real-time. Optimizing Server-Side Rendering for Billion Plus Row Datasets. Surprisingly they render on the backend. The frontend has too many limitations, namely WebGL is missing a lot of functionality and network bandwidth is a bottleneck. So they adopted a hybrid approach "that delivered a lightweight and efficient frontend, and an optimized and load-balanced backend." It's an intricate dance of bespoke tactics that make the impossible, invisible. And there are of course GPUs: the parallel power of GPUs present opportunities to create truly distinctive visual experiences with datasets that were once considered to be too large to render in real-time.

  • SSDs are getting cheaper at 37.5 cents per GB, but hard drives, at 2.5 cents per GB, are still 15x less expensive. Samsung introduces 4TB SSD for a cool $1,500.

  • Maybe we shouldn't brag about the size of our supercomputers anymore? No need for supercomputers: Russian scientists suggest a PC to solve complex problems tens of times faster than with massive supercomputers

  • Really good experience report. From idea to reality: containers in production at GoCardless. Why move to a container-based infrastructure?: have a uniform way to deploy our applications; produce artifacts that can reproducibly be shipped to multiple environments; do as much work up-front as possible - detecting failure during artifact build is better than detecting it during deployment. As a small concern they didn't care about scheduling on systems like Mesos. The result: faster, more reliable deployments; more smaller internal services deployed; more frequent Ruby upgrades.

  • Some mixed experiences with HTTP/2. Real–world HTTP/2: 400gb of images per day: For a typical image rich, latency–bound page using a high–speed, low–latency connection, visual completion was achieved 5% faster on average; For an extremely image–heavy, bandwidth–bound page using the same connection, visual completion was achieved 5–10% slower on average; On a high–latency, low–speed connection we saw significant delays for page to reach visual completion.

  • If scientists can find the YouTube videos people had watched by pattern matching on MRI scans, then how inaccurate can MRIs really be? Brain decoding. If you want to see the next frontier of search this is incredible stuff. The Gallant Lab at UC Berkeley; Natural speech reveals the semantic maps that tile human cerebral cortex; Article in Science Direct about the work of Jack Gallant; Reconstructing Visual Experiences from Brain Activity Evoked by Natural Movies: We recorded BOLD signals in occipitotemporal visual cortex of human subjects who watched natural movies and fit the model separately to individual voxels. Visualization of the fit models reveals how early visual areas represent the information in movies.

  • Ayende has a good series of posts going: The Guts n’ Glory of Database Internals. Topics include Getting durable, faster; Writing to a data file; The enemy of thy database is; The communication protocol.

  • If you have a continuous stream of data, ring buffers are your friend. Wikipedia has a wonderful visualization of how they work.
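
For readers who have not met one, here is a minimal sketch of a ring buffer in Python. The class name, method names, and the choice to overwrite the oldest item when full are assumptions made for illustration; other designs block or reject writes when the buffer is full.

```python
class RingBuffer:
    """Fixed-capacity FIFO that overwrites the oldest item when full.

    Illustrative sketch only; names and overwrite policy are assumptions.
    """

    def __init__(self, capacity):
        if capacity <= 0:
            raise ValueError("capacity must be positive")
        self._slots = [None] * capacity
        self._capacity = capacity
        self._head = 0  # index of the oldest stored item
        self._size = 0  # number of items currently stored

    def push(self, item):
        # The write position wraps around the end of the backing list.
        tail = (self._head + self._size) % self._capacity
        self._slots[tail] = item
        if self._size == self._capacity:
            # Buffer was full, so the oldest item was just overwritten.
            self._head = (self._head + 1) % self._capacity
        else:
            self._size += 1

    def pop(self):
        # Remove and return the oldest item.
        if self._size == 0:
            raise IndexError("pop from empty ring buffer")
        item = self._slots[self._head]
        self._slots[self._head] = None
        self._head = (self._head + 1) % self._capacity
        self._size -= 1
        return item

    def to_list(self):
        # Items in order from oldest to newest.
        return [self._slots[(self._head + i) % self._capacity]
                for i in range(self._size)]

    def __len__(self):
        return self._size


buf = RingBuffer(3)
for reading in [1, 2, 3, 4, 5]:
    buf.push(reading)
print(buf.to_list())  # only the three most recent readings remain
```

Memory use stays constant no matter how long the stream runs, which is the property that makes the structure attractive for continuous data.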

  • Jepsen tests VoltDB 6.3. You'll want to read because Aphyr goes into good explanations of concepts that are hard to understand in practice like strong serializability and linearizability. You'll also want to read because testing is like a deep dive into a product so you get to read in detail about different architectural choices and their implications. The test result: "VoltDB 6.3 allows stale reads, dirty reads, and lost updates due to network partitions and fault recovery." VoltDB 6.4 included fixes for all these problems. Oh, and VoltDB called this torture upon themselves, they funded the test. Which doesn't surprise me at all, John and the other folks I dealt with at VoltDB were good people. VoltDB also has their own take on the test: Issues Found By Jepsen in VoltDB 6.3, where they discuss the bugs, why they happened, and possible fixes.

  • Excellent tutorial. FASTER POSTGRESQL SEARCHES WITH TRIGRAMS: A trigram is a sequence of three consecutive characters in a string. For example, the trigrams of Rails are Rai, ail, and ils...Trigram indexes are simple and can help you speed up string matches considerably. 
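
The trigram definition quoted above can be sketched in a few lines of Python. This follows only the plain definition from the quote; the function names and the set-overlap score are assumptions for illustration, and PostgreSQL's own trigram extension normalizes strings before extracting trigrams, so its results will differ.

```python
def trigrams(text):
    """Return every run of three consecutive characters in text."""
    return [text[i:i + 3] for i in range(len(text) - 2)]


def trigram_overlap(a, b):
    """Share of distinct trigrams the two strings have in common.

    A simple set-overlap score chosen for illustration; it is not
    claimed to match what PostgreSQL computes.
    """
    set_a, set_b = set(trigrams(a)), set(trigrams(b))
    union = set_a | set_b
    if not union:
        return 0.0  # strings shorter than three characters have no trigrams
    return len(set_a & set_b) / len(union)


print(trigrams("Rails"))
print(trigram_overlap("Rails", "Rail"))
```

An index built on these fragments lets the database narrow a substring or fuzzy match to rows that share trigrams with the query instead of scanning every row.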

  • Murat reviews the paper Realtime Data Processing at Facebook. There's a good overview of various systems like Scribe, Puma, Swift, Stylus, Laser, Scuba, and Hive. Some key lessons: no single language that fits all use cases; The ease or hassle of deploying and maintaining the application is equally important; Once an app is deployed, we need to monitor it: Is it using the right amount of parallelism; Streaming versus batch processing is not an either/or decision...Using a mix of streaming and batch processing can speed up long pipelines by hours.

  • Mobile Monetization: A Study of Popular Games: The most popular monetization strategy uses both in-app purchases and advertising together; About half of all mobile games are using a virtual currency.

  • eBay is using flame graphs to debug running Node.js servers in production from their laptops. Mastering the Fire. They were able to easily find a tricky problem: after a fresh deployment, the application started using 80% of CPU instead of the expected 20–30%. The cause was that it was loading templates over and over again with every request. The fix was simply to cache the templates at the first load.
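
The shape of that fix, loading each template once and reusing it, can be sketched as follows. This is not eBay's code (their servers run Node.js); it is a Python illustration, and `TemplateCache` and the `loader` callable are hypothetical names.

```python
class TemplateCache:
    """Load each template on first use, then serve it from memory.

    Hypothetical sketch; `loader` stands in for whatever reads and
    compiles a template from disk.
    """

    def __init__(self, loader):
        self._loader = loader  # callable: template name -> template
        self._cache = {}

    def get(self, name):
        if name not in self._cache:
            # Only the first request for a name pays the loading cost.
            self._cache[name] = self._loader(name)
        return self._cache[name]


load_calls = []


def slow_loader(name):
    # Stand-in for an expensive disk read plus compile step.
    load_calls.append(name)
    return "<html>" + name + "</html>"


templates = TemplateCache(slow_loader)
for _ in range(1000):
    templates.get("home")
print(len(load_calls))  # the loader ran once despite 1000 requests
```

A cache like this trades memory for CPU and needs an invalidation story if templates can change while the process is running.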

  • In the titles you don't read everyday department: Endangered ferrets are being saved by drones that drop vaccine-laced M&Ms. Maybe this approach would work for frogs that are dying from fungus? As are bats. As are oak trees. Pine trees are dying from nematodes. Bees are dying from pesticides, so we just have to stop doing that.

  • A good AWS Summit London Recap. Some high points: #1) Moving to cloud (properly) means adopting a DevOps culture; #2) Software companies (that’s all of you, by the way) are migrating wholesale to the cloud; #3) It’s time to go beyond infrastructure; #4) Your services may need to evolve: from SOA to Microservices.

  • Wow, when optimizing your database the query planner may not be a reliable narrator. Can Adding an Index Make SQL Server 2016…Worse?: Estimated number of rows = 1 Actual number of rows = 165,367

  • What is the best way to pack cone photoreceptors in a retina? It's not obvious at all, but evolution figured it out. A Bird’s-Eye View of Nature’s Hidden Order: Hyperuniformity is clearly a state to which diverse systems converge, but the explanation for its universality is a work in progress. “I see hyperuniformity as basically a hallmark of deeper optimization processes of some sort,” Cohn said. But what these processes are “might vary a lot between different problems.”

  • How much must a business spend in order to generate one recurring gross profit dollar over the course of a year? A New Way To Calculate A SaaS Company's Efficiency: Cost_per_Recurring_Gross_Profit_Dollar = (Cost_of_Acquisition / Annual_Revenue_Retention + Cost_to_Serve) / Gross_Profit.
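
To make the formula concrete, here is a small Python sketch that evaluates it exactly as written above. The input values are invented for illustration, and the assumption that all four terms are expressed per dollar of revenue is mine, since the roundup does not define the units.

```python
def cost_per_recurring_gross_profit_dollar(cost_of_acquisition,
                                           annual_revenue_retention,
                                           cost_to_serve,
                                           gross_profit):
    """Evaluate the formula quoted above, term for term."""
    if annual_revenue_retention <= 0 or gross_profit <= 0:
        raise ValueError("retention and gross profit must be positive")
    return (cost_of_acquisition / annual_revenue_retention
            + cost_to_serve) / gross_profit


# Made-up inputs, assumed to be per dollar of revenue.
result = cost_per_recurring_gross_profit_dollar(
    cost_of_acquisition=1.00,
    annual_revenue_retention=0.80,
    cost_to_serve=0.25,
    gross_profit=0.75,
)
print(result)
```

Dividing acquisition cost by retention inflates it when customers churn, so lower retention pushes the result up even if acquisition spending is unchanged.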

  • NATS performed 3x faster than HTTP, beating HTTP in both speed and throughput. NATS protocol: "wire protocol is a simple, text-based publish/subscribe style protocol. Clients connect to and communicate with gnatsd (the NATS server) through a regular TCP/IP socket using a small set of protocol operations that are terminated by newline."
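
To show what "simple, text-based" looks like on the wire, here is a Python sketch that builds the bytes for a publish and a subscribe operation. The frame layouts follow my reading of the public NATS protocol documentation (PUB carries a subject, optional reply subject, byte count, then the payload; SUB carries a subject, optional queue group, and a subscription id; lines end in CRLF). The helper names are hypothetical, and this is not a client: it opens no socket and handles no server responses.

```python
CRLF = b"\r\n"


def encode_pub(subject, payload, reply_to=None):
    """Build a PUB frame: control line, then payload, each ending in CRLF."""
    parts = ["PUB", subject]
    if reply_to:
        parts.append(reply_to)
    parts.append(str(len(payload)))  # payload size in bytes
    return " ".join(parts).encode("ascii") + CRLF + payload + CRLF


def encode_sub(subject, sid, queue_group=None):
    """Build a SUB frame: a single control line ending in CRLF."""
    parts = ["SUB", subject]
    if queue_group:
        parts.append(queue_group)
    parts.append(str(sid))  # client-chosen subscription id
    return " ".join(parts).encode("ascii") + CRLF


print(encode_sub("FOO", 1))
print(encode_pub("FOO", b"Hello NATS!"))
```

Because each operation is a short line of text, the framing overhead per message is a handful of bytes, which is one plausible contributor to the throughput numbers quoted above.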

  • Jakob Engblom with a good book review of Prototypical: The Emergence of FPGA-Based Prototyping for SoC Design. Sounds like a lot of good history and interesting details.

  • atmanos/atmanos: AtmanOS allows you to compile ordinary Go programs into standalone unikernels that run under the Xen hypervisor. AtmanOS is implemented as a series of patches and additional files for Go's runtime and standard library.

  • amznlabs/amazon-dsstne: an open source software library for training and deploying deep neural networks using GPUs. Amazon engineers built DSSTNE to solve deep learning problems at Amazon's scale. DSSTNE is built for production deployment of real-world deep learning applications, emphasizing speed and scale over experimental flexibility.

  • Kafka, Samza and the Unix Philosophy of Distributed Data: In this paper we explain the reasoning behind the design of Kafka and Samza, which allow complex applications to be built by composing a small number of simple primitives – replicated logs and stream operators. We draw parallels between the design of Kafka and Samza, batch processing pipelines, database architecture, and the design philosophy of Unix.