How open source is changing technology
Some history of the movement that's gotten us to where we are today.
Welcome to Day to Data!
Did a friend send you this newsletter? Be sure to subscribe here to get a weekly post covering the technology behind products we use every day and interesting applications of data science for folks with no tech background.
“We have no moat”.
These words, shared in an internal document supposedly from a researcher at Google, portray the lack of defense that a large company like Google, or its competitor OpenAI, could have against large language models and platforms being developed by the open source community. Is this conclusion true? How does open source affect generative artificial intelligence? How did we get to where we are today?
We’re going to tackle this in two parts. This week, we’re talking about the history of open source software and the community of developers who built it. Next week, we’re going to dive into the impact open source projects have had on evolving technologies of today.
A background on open source.
In 1969, IBM announced that its software would be sold separately from the hardware the company provided. Customers went from buying a computer with all the software pre-installed to having to buy whatever software their device needed. This “unbundling” kick-started the software market. In the years that followed, popular software like TeX was released along with its source code, and the decades after would prove influential for FOSS (free, open source software).
For folks who may not understand what software + source code means: if someone wanted to go out and replicate the Facebook app, they couldn’t literally copy it, as there is no public source code (the human-readable instructions that programmers write and that get turned into the running app) that exposes how the application works from frontend to backend. To think about something more machine learning based, let’s use the TikTok algorithm as an example. TikTok users know that TikTok is really good at understanding preferences, but if you wanted to replicate their algorithm, you couldn’t because you don’t have the source code. When we use the phrase open source, we roughly mean that the source code is available online and may even be open for feedback and improvements from contributors anywhere.
In 1985, Richard Stallman founded the Free Software Foundation to support a then-new initiative called the GNU Project, which hoped to build a system like Unix (an early operating system developed at Bell Labs) composed entirely of free software. The foundation also backed other initiatives that hoped to bring FOSS to systems everywhere.
It wasn’t until 1991 that Linus Torvalds announced Linux, an operating system kernel whose version 1.0 followed in 1994 and which helped enable the distribution of open-source software. Linux (perhaps named as such by combining Linus and Unix!) is a free, open-source operating system that was started when Linus was only 21 years old. Linux became a revolutionary part of the open-source movement, with the Linux Foundation being established in 2000 and helping democratize not only software, but the systems it’s built upon.
So what’s this all mean?
The movement for free, open-source software was led by developers, academics, and foundations that saw the need for the free access and democratization of software. The internet boom encouraged their efforts. The FOSS movement made strides through prioritizing and optimizing for:
Transparency: when programmers share their source code with the public, anyone can see how a piece of software or an algorithm works behind the scenes
Security: public source code gives security professionals an opportunity to find bugs and explore vulnerabilities
Speed of development: the release of open-source projects inspired spin-offs and improvements more quickly than ever before
Influential open source projects
Open source projects have truly shifted the way much of our internet has been built. Several programming languages, operating systems, frameworks and algorithms have been released as open source. To name a few: Go, Angular, TensorFlow, Kubernetes, Matplotlib, and MongoDB.
These projects have changed the way developers build projects and the way they collaborate. Several companies have built their product and models on top of these open-source platforms, relying on them for everything from compiling their code to storing their data.
Real-life examples? Kubernetes, an open-source platform originally developed by Google to help deploy cloud apps at scale, is used by almost 20% of developers per Stack Overflow’s Developer Survey in 2023. Companies and products like Pinterest, Airbnb, and even Pokemon Go are built on top of Kubernetes.
I just watched Glitch: The Rise & Fall of HQ Trivia, and in the opening scenes, founder Colin Kroll mentions the app is built upon EC2 and open source software. Companies everywhere rely on open source projects, and I’d expect they’ll continue to do so for years to come.
For a list of open source projects Google has launched or contributed to, check out their projects here. Google’s most famous open-source projects include Android, Kubernetes, TensorFlow, and Firebase.
The open source movement today
So why does Google have “no moat” against the open source projects of the world? Just as we mentioned, open source products promote transparency, safe software, and innovative development. And in the evolving age of large language models and generative artificial intelligence, these pillars are more important than ever. More to come…
Next week, we’re going to talk about where the open source movement is today and the impact it’s having on innovation. We’ve got a lot more to cover, so stay tuned!
No glitches here!