Posts

How to debug a crashing docker container

If you want to run your Docker process with some tweaks because it's crashing and taking the container down with it (leaving you no way to inspect the files on the image), here's the magic command to start the container with just bash. I found this after quite a bit of hunting on the internet: the magic flag is --entrypoint, and don't forget the -s at the end. Here's a sample command:
docker run -it --entrypoint /bin/bash $IMAGE -s
Sourced from: https://vsupalov.com/debug-docker-container/
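If you start debug shells often, the command above can be wrapped in a small helper. This is a minimal Python sketch, not part of the original post: the function name debug_shell_command and the image tag my-image:latest are made up for illustration. It only builds and prints the command; pass the list to subprocess.run if you want to launch it.

```python
import shlex


def debug_shell_command(image, shell="/bin/bash"):
    """Build the docker argv that replaces the image's entrypoint with a shell.

    `image` is whatever you would put in $IMAGE. Everything after the image
    name is handed to the entrypoint, so the trailing "-s" goes to bash.
    """
    return ["docker", "run", "-it", "--entrypoint", shell, image, "-s"]


# Hypothetical image tag, used only to show the resulting command line.
cmd = debug_shell_command("my-image:latest")
print(shlex.join(cmd))
```

Building the command as a list avoids shell quoting problems when the image name comes from a variable.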

How to be an Effective Engineer?

Read The Effective Engineer by Edmond Lau. Highly recommended. His experiences parallel my own. The book is engagingly written, quick to read and succinct in its delivery.

Understanding Divye

I typically share the contents of this document the first time I meet someone I'm going to be working with for a long time.
Style of communication: Direct and clear. I don't do well reading between the lines. If you would like me to know something, please say something.
What is important to me: High quality code and a highly productive, functional team. I don't believe in posturing. I believe in commitments and delivery. Delivering more and talking less is a good way to show impact.
What kind of leader I'm trying to become: One who can grow people. I'm driven by Mission, Vision, Values.
What I value in an organization: Transparency, Trust and Integrity. If you have a problem, speak up. Don't ever go behind someone's back. Bring problems up together so that I can see both sides represented fairly. Be direct.
What I seek in people I work with: Ability to execute and potential to grow. I will endeavor to make you a better engine...

AI Expo 2019: Tim Jurka (LinkedIn, Director Feed AI) - Part 4 of 4

I recently attended the AI Expo 2019 at the Santa Clara Convention Center. Notes are from my understanding of the talk. Any errors are mine and mine alone. Talk: "LinkedIn: A look behind the AI that powers the LI feed" by Tim Jurka (Dir. Feed AI). The talk covered the objectives of LinkedIn's Feed and was pitched at a high-level (exec) audience. While I was familiar with the space, the objective function formulation and presentation were interesting: The recommendation problem for LinkedIn is maximizing Like/Comment/Share CTR + downstream network activation (virals) + encouraging new creators. Problem Formulation: P(click) + P(viral) * (alpha_downstream + alpha_creator * e^(-decay * E[num_response_to_creator])). alpha_downstream accounts for downstream effects; alpha_creator penalizes popular creators to induce diversity. General approaches (Toolbox): Multi Objective Optimization (ads vs organic content). Logistic Regression: Features, Embedding...
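To make the problem formulation above concrete, here is a minimal Python sketch of the scoring expression as I noted it. The function name feed_score, the argument names and all the numeric values are my own assumptions for illustration; none of them come from the talk or from LinkedIn's actual system.

```python
import math


def feed_score(p_click, p_viral, expected_creator_responses,
               alpha_downstream, alpha_creator, decay):
    """P(click) + P(viral) * (alpha_downstream + alpha_creator * e^(-decay * E[responses])).

    The exponential term shrinks as the creator's expected number of
    responses grows, so the creator bonus is largest for creators who
    currently get little feedback.
    """
    creator_bonus = alpha_creator * math.exp(-decay * expected_creator_responses)
    return p_click + p_viral * (alpha_downstream + creator_bonus)


# Illustrative weights only (assumptions, not values from the talk).
weights = dict(alpha_downstream=1.0, alpha_creator=1.0, decay=0.1)

new_creator = feed_score(0.1, 0.2, expected_creator_responses=0, **weights)
popular_creator = feed_score(0.1, 0.2, expected_creator_responses=100, **weights)
print(new_creator, popular_creator)
```

With identical click and viral probabilities, the post from the creator with no expected responses scores higher, which is the diversity effect described above.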

AI Expo 2019: Emilio Billi (CTO, A3Cube) - Part 3 of 4

Talk: Why and How the computational power influences the rate of progress in the technology. Speaker: Emilio Billi, CTO, A3Cube Inc. Background: ML, Big Data & Analytics, AI, HPC. This was a big data infra focused talk. The speaker had a background in systems infra with past DoD experience. Not the most engaging delivery, but really nice takeaways: Moving 128 bytes on a CPU using 100Gbit ETH: CPU waits 8900ns for nothing (~7.1M compute ops lost); Moving the same 128 bytes using optimized RDMA intra-cluster costs 1200ns CPU time (~0.96M compute ops lost). You get 6M ops extra per second for ML. That's a great acceleration for ML workloads. Basic contention: ETH, TCP, slow storage is legacy technology. The clusters of the future will look like the supercomputer systems of today: 1. Low latency converged parallel file systems (think S3 for the cluster). 2. Built in Distributed Resource scheduler (think Kubernetes for the cluster). 3. Cooperative RAM over networ...
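The latency figures above can be sanity-checked with a few lines of arithmetic. This Python sketch is my own back-of-the-envelope check, not something shown in the talk: it derives the compute rate implied by the Ethernet numbers and applies it to the RDMA wait time. The constant and function names are made up for illustration.

```python
# Figures quoted in the notes above.
ETH_WAIT_NS = 8900      # CPU wait when moving 128 bytes over 100Gbit Ethernet
RDMA_WAIT_NS = 1200     # CPU time for the same transfer over optimized RDMA
ETH_OPS_LOST = 7.1e6    # compute ops lost during the Ethernet wait


def implied_ops_per_ns(ops_lost, wait_ns):
    """Compute rate implied by 'ops_lost operations were lost in wait_ns'."""
    return ops_lost / wait_ns


# Assumption: the same compute rate applies to both transfers.
rate = implied_ops_per_ns(ETH_OPS_LOST, ETH_WAIT_NS)   # roughly 800 ops per ns
rdma_ops_lost = RDMA_WAIT_NS * rate                    # roughly 0.96M ops
ops_saved = ETH_OPS_LOST - rdma_ops_lost               # roughly 6.1M ops

print(f"implied rate: {rate:.0f} ops/ns")
print(f"ops lost over RDMA: {rdma_ops_lost / 1e6:.2f}M")
print(f"ops saved by RDMA: {ops_saved / 1e6:.2f}M")
```

Under these figures the ~0.96M number is consistent with the ~7.1M one, and the saving of about 6M ops applies to each 128-byte transfer.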

AI Expo 2019 - Prakhar Mehrotra (Walmart Sr. Director of ML) - Part 2 of 4

I attended the AI Expo 2019 at the Santa Clara Convention Center where Prakhar gave a talk. Notes are my summarization of the talk. Any errors are mine and mine alone. Speaker: Prakhar Mehrotra (Walmart Sr. Director of ML, previously at Uber). Walmart has huge scale: $0.5 trillion+ in revenue, 3000+ stores with massive physical footprints, a massive global supply chain, Jet.com, Walmart.com, Shoes.com, Flipkart.com and it keeps growing. The talk focused on Walmart's application of ML and on the contrast between Uber-style surge pricing and Walmart's fixed in-store pricing ("everyday low pricing"). A focus point was causality over correlation: understanding Walmart's customer and its supply chain (the Why?). Their primary domain was solving for shelf placement of inventory. Other interesting problems were inventory management, bridging the online and offline worlds (if we ship from warehouse, it's going to cost you X but if you pick up at this store where it...

AI Expo 2019 Notes: Ameen Kazerouni (Zappos) - Part 1 of 4

I recently attended the AI Expo 2019 at the Santa Clara Convention Center where there were talks on various ML platforms. I'm leading the ML Training Infra platform at Pinterest. These notes are from those talks and are summarized from the speakers' presentations. Any and all errors are mine alone. Hope you find the below useful. Speaker: Ameen Kazerouni (Zappos). The scope of the talk was Zappos' ML Platform ecosystem and the problems they faced after solving the basic 5: Problem specification, dataset design, model selection, training and validation. A condensed list of their issues is: 1. Data management: data lifetime (how long?), security footprint (who should have access?), governance issues (who did have access?), data scrubbing and anonymization (avoiding privacy issues under GDPR). 2. Team: There are very few unicorns that can do PhD statistics, ML math and write distributed systems. They hire for domain competence and the ability to communicate t...