Editor’s note: This post is the third in a series of three capturing the results of recent interviews and discussions I had with Robert Fink of Palantir. The conversation was wide-ranging, touching on design, development environments and a bit of the philosophy of enterprise tech. Several common themes emerged in those topic areas, including ways that Palantir has been leveraging open approaches to data architectures, system design and even developer environments. The first post focused on open data architectures. The second post continued the discussion of open approaches, including open source software. This third post gets into the meat of modern, open development environments. – bg
BG: The third broad topic I want to ask you about is open-source for development environments. How does this relate to open-source software and open data architectures?
RF: Once again these are in theory separate concerns, but related by mindset. Hypothetically, you could imagine a development environment based on closed, proprietary tools that produces an open data system, but in reality I have never seen that done. The most powerful and flexible development environments today are based on collections of open-source tools that have been designed from the ground up to produce highly functional, well documented, sharable code. Modern languages like TypeScript are entirely open-source, including the compiler, the build system, and the IDE. Contrast that with the early days of C++, where the best compilers and IDEs were closed-source and very, very expensive.
Palantir’s developers are addicted to a large amount of open-source developer tooling. For example, all of our Java developers build their projects with Gradle, and all of our frontend developers code in TypeScript. We have made a number of our internal productivity tools available as open-source. For Gradle, examples include plugins for coding standards, for container-based integration testing, and for building Docker images. For the frontend community, we maintain the standard linter for TypeScript.
BG: On one hand, Palantir is a commercial software company; on the other hand, you give away some of your code as open-source. How do you balance these concerns, and how do you decide which parts of your software to publish?
RF: In the case of developer tooling the answer is straightforward: I feel morally obliged to maintain our tools in the open-source domain because it incurs no overhead for us and because we would gain no competitive advantage from keeping them closed-source. A second key advantage of open-source tooling: it makes it much easier for our customers to interact programmatically with our open data platform.
BG: Can you give me an example of that?
RF: Since our data platform is focused on serving customer needs for data integration and analysis, we support standard formats and APIs with built-in interoperability, for example SQL, Hadoop, Parquet, as well as downstream analysis applications like Tableau or PowerBI. But we also provide ways for customers to extend the platform with bespoke functionality in the form of plugins and applications. We try to make this easy by providing building blocks for application development. In many cases we have turned these building blocks into open-source projects themselves so anyone can contribute to and use these capabilities. For example, Blueprint and Plottable are JavaScript libraries that both we and our customers employ to develop rich Web-based frontend applications with a consistent look and feel. We are in the process of open-sourcing our RPC framework, Conjure. Conjure is built on industry standards like HTTP and JSON and makes it very easy to interact with our platform APIs from a variety of programming languages.
BG: I noticed from your GitHub page that you remain personally involved in the open-source community. Can you tell us some of your favorite Palantir projects?
RF: On the top of my list is probably AtlasDB. AtlasDB is a distributed transactional key-value store built on top of Cassandra. Inspired by research papers on MVCC databases, we began internal development of AtlasDB around 2012 and decided to contribute the codebase to the open-source community in 2014. Today, AtlasDB is the database backend for all of our products. Specifically for Palantir Gotham, AtlasDB has enabled our customers to migrate from Oracle to open-source databases.
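Editor’s note: for readers who have not met MVCC (multi-version concurrency control), the core idea can be shown in a few lines. The sketch below is a generic, single-process, in-memory illustration and is not AtlasDB’s design or API; the names `Store`, `Transaction` and `ConflictError` are hypothetical. Each write is kept as a timestamped version, each transaction reads from the snapshot that existed when it began, and a commit is rejected if another transaction has already committed to the same key in the meantime.

```python
from typing import Dict, List, Optional, Tuple


class ConflictError(Exception):
    """Raised when another transaction committed to the same key first."""


class Store:
    """Toy multi-version key-value store (illustration only)."""

    def __init__(self) -> None:
        self._clock = 0  # logical commit timestamp, incremented per commit
        # key -> list of (commit_ts, value), in ascending commit_ts order
        self._versions: Dict[str, List[Tuple[int, str]]] = {}

    def begin(self) -> "Transaction":
        # The transaction sees everything committed up to the current clock.
        return Transaction(self, self._clock)

    def read_at(self, key: str, ts: int) -> Optional[str]:
        """Return the newest value committed at or before ts, or None."""
        for commit_ts, value in reversed(self._versions.get(key, [])):
            if commit_ts <= ts:
                return value
        return None

    def commit(self, txn: "Transaction") -> int:
        # First committer wins: abort if any written key has a version
        # newer than this transaction's snapshot.
        for key in txn.writes:
            versions = self._versions.get(key, [])
            if versions and versions[-1][0] > txn.start_ts:
                raise ConflictError(key)
        self._clock += 1
        for key, value in txn.writes.items():
            self._versions.setdefault(key, []).append((self._clock, value))
        return self._clock


class Transaction:
    def __init__(self, store: Store, start_ts: int) -> None:
        self.store = store
        self.start_ts = start_ts
        self.writes: Dict[str, str] = {}  # buffered until commit

    def get(self, key: str) -> Optional[str]:
        # A transaction sees its own uncommitted writes first.
        if key in self.writes:
            return self.writes[key]
        return self.store.read_at(key, self.start_ts)

    def put(self, key: str, value: str) -> None:
        self.writes[key] = value

    def commit(self) -> int:
        return self.store.commit(self)
```

Because old versions are retained, readers never block writers: a transaction that started before a concurrent commit keeps seeing its original snapshot. A production system has to solve the same problem across many machines, which this sketch does not attempt.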
As a second example, we partnered with Pepperdata, Red Hat, Bloomberg, and Google to design and implement a Kubernetes backend for Spark. After a number of iterations, the main components developed as part of this collaboration were merged into mainline Spark. Both Spark and Kubernetes are industry-changing software ecosystems and I am really happy to see that their “marriage” is now available to the wider Spark community. Palantir continues to be an active participant in the vision and implementation of the project.
BG: Can you elaborate a little on legal concerns with open-source software, and maybe also on security concerns?
RF: Both are non-trivial subjects, in particular when organizations or individuals “go rogue” and blindly download and deploy open-source components. A big advantage of commercially supported open-source software is that the providers take care of the issues of indemnification and compliance, which organizations would otherwise have to solve themselves.
Open-source components, including code libraries, graphics and fonts, and entire applications are subject to a license grant. Different license grants impose different constraints on how the licensed component and the resulting software can be used. For example, the MIT license allows for permissive use of the covered components, including modifications and redistribution under commercial terms, whereas the GPL license is far more restrictive in how one can package and distribute a combination of the open-source component and proprietary code. We monitor our external dependencies to ensure that our use of open-source components is compatible with their licenses.
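Editor’s note: a license check of this kind can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Palantir’s tooling: the `Dependency` type, the `audit` function and the two policy sets are hypothetical, and a real policy would come from legal review rather than from a hard-coded list. The license strings are SPDX identifiers.

```python
from dataclasses import dataclass
from typing import Dict, List

# Hypothetical policy. Permissive licenses are accepted outright; copyleft
# licenses are routed to a human for review rather than rejected blindly.
ALLOWED = {"MIT", "Apache-2.0", "BSD-3-Clause"}
NEEDS_REVIEW = {"GPL-3.0-only", "AGPL-3.0-only", "LGPL-3.0-only"}


@dataclass(frozen=True)
class Dependency:
    name: str
    version: str
    license_id: str  # SPDX identifier as declared by the package


def classify(dep: Dependency) -> str:
    """Return 'allowed', 'review', or 'unknown' for one dependency."""
    if dep.license_id in ALLOWED:
        return "allowed"
    if dep.license_id in NEEDS_REVIEW:
        return "review"
    # Anything not covered by the policy is surfaced, never silently passed.
    return "unknown"


def audit(deps: List[Dependency]) -> Dict[str, List[str]]:
    """Group dependency names by their classification."""
    report: Dict[str, List[str]] = {"allowed": [], "review": [], "unknown": []}
    for dep in deps:
        report[classify(dep)].append(dep.name)
    return report
```

Running `audit` over a project’s dependency list yields three buckets. Only the `review` and `unknown` buckets need human attention, which keeps the check cheap enough to run on every build.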
The same dependency monitoring is useful in the security context. Like any software, open-source software can be susceptible to malicious and unintentional bugs and security problems. Fortunately, the developer community and security experts around the world continually scan source code for vulnerabilities and maintain lists of known “bad” libraries. We use our dependency monitoring tools to identify and remove such code from our systems. A small amount of common sense also helps in this context: we try to avoid dependencies on code produced by small or obscure development teams, as well as code that is no longer actively maintained.
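Editor’s note: the security side of dependency monitoring can be illustrated in the same spirit. The sketch below assumes a hypothetical advisory feed that maps a package name to version ranges known to be vulnerable. The package names, the `ADVISORIES` table and the function names are invented for illustration; real feeds and real version schemes are considerably richer than plain dotted numbers.

```python
from typing import Dict, List, Tuple

Version = Tuple[int, ...]


def parse_version(text: str) -> Version:
    """Turn '1.10.0' into (1, 10, 0) so versions compare numerically."""
    return tuple(int(part) for part in text.split("."))


# Hypothetical advisory data: package -> list of (introduced, fixed) ranges.
# A version v is affected when introduced <= v < fixed.
ADVISORIES: Dict[str, List[Tuple[str, str]]] = {
    "example-parser": [("1.0.0", "1.4.2")],
}


def is_vulnerable(name: str, version: str) -> bool:
    """Check one pinned dependency against the advisory ranges."""
    current = parse_version(version)
    for introduced, fixed in ADVISORIES.get(name, []):
        if parse_version(introduced) <= current < parse_version(fixed):
            return True
    return False


def flag(deps: Dict[str, str]) -> List[str]:
    """Return the sorted names of dependencies with a known advisory."""
    return sorted(
        name for name, version in deps.items() if is_vulnerable(name, version)
    )
```

Parsing versions into integer tuples matters here: compared as strings, `"1.10.0"` would sort before `"1.4.2"` and be flagged incorrectly, whereas as tuples it is correctly treated as newer than the fix.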
BG: Thanks, Robert!
RF: My pleasure!
This concludes our series based on our interviews with Robert Fink. Find all three posts in the series at:
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part One: Open Data Architectures
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Two: Open Source and Open Approaches
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Three: Open Development Environments