Editor’s note: This post is the third in a series of three capturing the results of recent interviews and discussions I had with Robert Fink of Palantir. The conversation was wide-ranging, touching on design, development environments and a bit of the philosophy of enterprise tech. Several common themes emerged in those topic areas, including ways that Palantir has been leveraging open approaches to data architectures, system design and even developer environments. The first post focused on open data architectures. The second post continued the discussion of open approaches, including open source software. This third post gets into the meat of modern, open development environments. – bg
BG: The third broad topic I want to ask you about is open-source for development environments. How does this relate to open-source software and open data architectures?
RF: Once again, these are in theory separate concerns, but related by mindset. Hypothetically, you could imagine a development environment based on closed, proprietary tools that produces an open data system, but in reality I have never seen that done. The most powerful and flexible development environments today are based on collections of open-source tools that have been designed from the ground up to produce highly functional, well-documented, shareable code. Modern languages like TypeScript are entirely open-source, including the compiler, the build system, and the IDE. Contrast that with the early days of C++, where the best compilers and IDEs were closed-source and very, very expensive.
Palantir’s developers are addicted to a large amount of open-source developer tooling. For example, all of our Java developers build their projects with Gradle, and all of our frontend developers code in TypeScript. We have made a number of our internal productivity tools available as open-source. For Gradle, examples include plugins for coding standards, for container-based integration testing, and for building Docker images. For the frontend community, we maintain the standard linter for TypeScript.
BG: On the one hand, Palantir is a commercial software company; on the other hand, you give away some of your code as open-source. How do you balance these concerns, and how do you decide which parts of your software to publish?
RF: In the case of developer tooling the answer is straightforward: I feel morally obliged to maintain our tools in the open-source domain because it incurs no overhead for us and because we would gain no competitive advantage from keeping them closed-source. A second key advantage of open-source tooling: it makes it much easier for our customers to interact programmatically with our open data platform.
BG: Can you give me an example of that?
BG: I noticed from your GitHub page that you remain personally involved in the open-source community. Can you tell us some of your favorite Palantir projects?
RF: At the top of my list is probably AtlasDB. AtlasDB is a distributed transactional key-value store built on top of Cassandra. Inspired by research papers on MVCC databases, we began internal development of AtlasDB around 2012 and decided to contribute the codebase to the open-source community in 2014. Today, AtlasDB is the database backend for all of our products. Specifically for Palantir Gotham, AtlasDB has enabled our customers to migrate from Oracle to open-source databases.
As a second example, we partnered with Pepperdata, Red Hat, Bloomberg, and Google to design and implement a Kubernetes backend for Spark. After a number of iterations, the main components developed as part of this collaboration were merged into mainline Spark. Both Spark and Kubernetes are industry-changing software ecosystems and I am really happy to see that their “marriage” is now available to the wider Spark community. Palantir continues to be an active participant in the vision and implementation of the project.
BG: Can you elaborate a little on legal concerns with open-source software, and maybe also on security concerns?
RF: Both are non-trivial subjects, in particular when organizations or individuals “go rogue” and blindly download and deploy open-source components. A big advantage of commercially supported open-source software is that the providers take care of the issues of indemnification and compliance, which organizations would otherwise have to solve themselves.
Open-source components, including code libraries, graphics and fonts, and entire applications, are subject to a license grant. Different license grants impose different constraints on how the licensed component and the resulting software can be used. For example, the MIT license allows for permissive use of the covered components, including modifications and redistribution under commercial terms, whereas the GPL is far more restrictive in how one can package and distribute a combination of the open-source component and proprietary code. We monitor our external dependencies to ensure that our use of open-source components is compatible with their licenses.
The same dependency monitoring is useful in the security context. Like any software, open-source software can be susceptible to malicious and unintentional bugs and security problems. Fortunately, the developer community and security experts around the world continually scan source code for vulnerabilities and maintain lists of known “bad” libraries. We use our dependency monitoring tools to identify and remove such code from our systems. A small amount of common sense also helps in this context: we try to avoid dependencies on code produced by small or obscure development teams, as well as code that is no longer actively maintained.
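Editor's aside: the kind of dependency monitoring Robert describes can be illustrated with a minimal sketch. This is not Palantir's tooling. The allowlist, the known-bad list, the `Dependency` record and the `audit` function are all hypothetical names invented for this example, and a real setup would read its data from a build system and a vulnerability feed rather than hard-coded sets.

```python
from dataclasses import dataclass

# Hypothetical policy data, hard-coded for illustration only.
ALLOWED_LICENSES = {"MIT", "Apache-2.0", "BSD-3-Clause"}
KNOWN_BAD_VERSIONS = {("example-lib", "1.0.2")}


@dataclass(frozen=True)
class Dependency:
    """One external dependency as a (name, version, license) record."""
    name: str
    version: str
    license: str


def audit(deps):
    """Return (dependency, reason) pairs for every policy violation found."""
    findings = []
    for dep in deps:
        # License check: anything not explicitly allowed is flagged for review.
        if dep.license not in ALLOWED_LICENSES:
            findings.append((dep, f"license {dep.license} is not on the allowlist"))
        # Security check: exact name/version match against the known-bad list.
        if (dep.name, dep.version) in KNOWN_BAD_VERSIONS:
            findings.append((dep, "version is on the known-bad list"))
    return findings


sample = [
    Dependency("left-hand-utils", "2.3.0", "MIT"),
    Dependency("copyleft-widgets", "0.9.1", "GPL-3.0-only"),
    Dependency("example-lib", "1.0.2", "Apache-2.0"),
]

for dep, reason in audit(sample):
    print(f"{dep.name} {dep.version}: {reason}")
```

The same pass over the dependency list serves both purposes, which is the point Robert makes above: once the dependencies are inventoried, license and security checks become two lookups against different lists.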
BG: Thanks, Robert!
RF: My pleasure!
This concludes our series based on our interviews with Robert Fink. Find all three posts in the series at:
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part One: Open Data Architectures
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Two: Open Source and Open Approaches
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Three: Open Development Environments