Recent posts

Hidden Technical Debt in Machine Learning Systems

The software engineering industry is ahead of other industries in its tooling and in its understanding of how important it is to deal with technical debt. For example, software developers use version control, Markdown for fast documentation, text-to-diagram tools, and continuous deployment; these, among many other tools, are things you don't, and sometimes cannot, see in other industries. One place where the software industry lags far behind is the ability to create software in a well-organized way; instead, it uses processes such as Scrum that try to make sense of coding on the go. As Cal Newport said, just imagine a car company where someone runs in with a part and puts it in the car, then another email arrives and they decide to change the color to blue, people are moving around, decisions about how many cars to ship this week are made over email, then something blows up in one part of the manufacturing area, so they open the graphs and see that they indeed overutilized their robots

Statistical Significance in Hypothesis Testing

You started taking vitamin D a couple of weeks ago, and you notice that it takes you less time to fall asleep at night. Is that a result of the vitamin D, or is something else making it easier for you to fall asleep? So you decide to run an experiment. You take 50 coworker volunteers and split them into two groups: one group of 25 coworkers gets the same vitamin D you took, while the other group takes a placebo. You notice that the people who took the real vitamin did take less time to fall asleep. Was the vitamin D the cause? It could be, but it could also be that it was not: maybe they share a project that is going well, so they fall asleep more easily, while the others are having a hard time, and so the difference has nothing to do with whether they take vitamin D. This is where hypothesis testing and significance come into play. Hypothesis testing is almost what we did here with the experiment, but we want to also kn
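One way to put a number on "could this difference be just chance?" is a permutation test. The sketch below is only an illustration under stated assumptions: the excerpt does not say which test the post uses, the function name `permutation_test` is made up here, and the sleep times are invented numbers (with smaller groups than the 25 per group in the story), not measured data.

```python
import random
from statistics import mean

def permutation_test(treated, control, num_shuffles=10_000, seed=0):
    """One-sided permutation test (hypothetical helper, not from the post).

    Returns the observed difference mean(control) - mean(treated) and the
    fraction of random relabelings whose difference is at least as large.
    """
    rng = random.Random(seed)  # fixed seed so the sketch is repeatable
    observed = mean(control) - mean(treated)
    pooled = list(treated) + list(control)
    n = len(treated)
    at_least_as_extreme = 0
    for _ in range(num_shuffles):
        # Shuffle the group labels: if the vitamin did nothing,
        # any split of the pooled times is equally likely.
        rng.shuffle(pooled)
        diff = mean(pooled[n:]) - mean(pooled[:n])
        if diff >= observed:
            at_least_as_extreme += 1
    return observed, at_least_as_extreme / num_shuffles

# Invented minutes-to-fall-asleep values, for illustration only.
treated = [10, 12, 11, 9, 13, 10, 12, 11]    # took vitamin D
control = [20, 22, 19, 21, 23, 20, 22, 21]   # took the placebo

observed, p_value = permutation_test(treated, control)
print(f"observed difference: {observed} minutes, p-value: {p_value}")
```

A small p-value says that random relabeling rarely produces a gap as large as the one observed. It does not rule out a confounder such as the shared project mentioned above; randomly assigning volunteers to the two groups is what addresses that.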

Boeing B-Trees

In July 1970, Rudolf Bayer and Edward McCreight of Boeing Scientific Research Laboratories published the original B-tree paper as a Mathematical and Information Sciences report. They never said what the "B" in B-trees stands for; it could well be Boeing trees, though balanced trees is also a good reminder of what they are. When studying B-trees, before getting to the actual algorithm, wouldn't it be nice to understand the exact motivation of the people who published the paper, the people who originally thought of the idea? They explain it in the paper. First, remember that we are talking about 1970: this was before most of us were born, and computers were slow, really slow. In the paper they say that they are working on the organization and maintenance of an index for a dynamic random access file. Let's dissect that. We know what an index is: it is the thing that lets us locate items in our data quickly without scanning all of the data. So they want to understand and suggest bett

Bellman Ford Graph Algorithm

Shortest-path algorithms: you go to Google Maps and want to find the shortest path from one city to another. Two algorithms can help you. Both calculate the shortest distance from a source node to all other nodes, but one can handle negative edge weights and the other cannot: Bellman-Ford can, and Dijkstra cannot. Start with Dijkstra. If you run Dijkstra's algorithm on this map, its input is a single source node and its output is the shortest path to every other vertex. However, there is a caveat: if Elon Musk comes along and, with some magic, creates a black-hole loop that gives one of the edges a negative weight, Dijkstra's algorithm may fail to give you the correct answer. This is where the Bellman-Ford algorithm comes into play. It is like Dijkstra's algorithm, except that it handles negative edge weights correctly and can detect negative-weight cycles. The goal of the Bellman-Ford algorithm is to find the shortest path from a single node in a graph t
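The idea described above can be sketched in a few lines of Python. This is a minimal illustration of the standard Bellman-Ford relaxation loop, not the code from the post; the function name `bellman_ford`, the edge-list format, and the tiny example graph are assumptions made for the example.

```python
from math import inf

def bellman_ford(num_nodes, edges, source):
    """Shortest distances from source to every node.

    edges is a list of (u, v, weight) tuples for directed edges u -> v.
    Raises ValueError if a negative-weight cycle is reachable from source.
    """
    dist = [inf] * num_nodes
    dist[source] = 0

    # A shortest path uses at most num_nodes - 1 edges,
    # so that many rounds of relaxing every edge is enough.
    for _ in range(num_nodes - 1):
        changed = False
        for u, v, w in edges:
            if dist[u] != inf and dist[u] + w < dist[v]:
                dist[v] = dist[u] + w
                changed = True
        if not changed:
            break  # nothing improved in this round, so we are done early

    # If any edge can still be relaxed, distances can shrink forever:
    # there is a negative-weight cycle reachable from the source.
    for u, v, w in edges:
        if dist[u] != inf and dist[u] + w < dist[v]:
            raise ValueError("negative-weight cycle reachable from source")
    return dist

# Hypothetical 4-node graph with one negative edge (2 -> 1).
edges = [(0, 1, 4), (0, 2, 5), (2, 1, -3), (1, 3, 2)]
print(bellman_ford(4, edges, 0))  # prints [0, 2, 5, 4]
```

The negative edge makes the detour 0 -> 2 -> 1 (cost 2) cheaper than the direct edge 0 -> 1 (cost 4). Re-relaxing every edge in each round is what lets the algorithm pick that up, at a cost of O(V * E) time.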

Containers - Quick Low Level Guide

Containers: kernel, namespaces, cgroups. Kernel space and user space. Before we actually get to explaining containers, let's define what a kernel is. After all, there is no such thing as a kernel in reality; it is only a name we give things, and different people name things differently. cgroups, namespaces, UFS. We are going to discuss containers, cgroups, namespaces, UFS, the hypervisor, user space, kernel space, and more. When we say "kernel" we mean this: we have the hardware, which is not the kernel, and above the hardware we have a few layers of software. Now imagine two boxes. User mode is all the applications you run, while the kernel is the lower level: virtual memory management, scheduling, connections to hardware devices, network drivers. It is basically the abstraction on top of the hardware plus the basic services that make it possible. One box is closer to the hardware and contains a few layers, the second box sits on top of the kernel box and contains