Editor’s note: This post is the first in a three-part series capturing the results of recent interviews and discussions I had with Robert Fink of Palantir. The conversation was wide-ranging, touching on design, development environments, and a bit of the philosophy of enterprise tech. Several common themes emerged, including the ways Palantir has been applying open approaches to data architectures, system design, and even developer environments. This first post focuses on open data architectures. – bg
Bob Gourley: Robert, I look forward to your thoughts on three related topics: open data architectures, open-source software, and open development approaches. But first, could you tell us a bit about your background?
Robert Fink: I am a computer scientist at Palantir Technologies in Palo Alto, California. I have a background in Physics, and a PhD in Computer Science from Oxford University specializing in database theory. Here at Palantir I am the architect of our open data platform that we call Foundry. Today I spend most of my time on the design and architecture of Foundry. I also oversee our work with the open-source software community… as you know, software development is a team sport these days. I’m also interested in developer education and in development best practices across the industry.
BG: I’m intrigued that you call Foundry an “open data platform”. Is it a database? Or an application? Or an enterprise service bus? Or an ETL service? Or some combination of those?
RF: You managed to mention everything that Foundry is not! Or at least not in the context of the traditional interpretations of these concepts. Foundry is the culmination of 15 years of experience with data integration pipelines and data analysis workflows across our government and commercial customers. We had initially built Foundry as a productivity tool for our own engineers, but soon realized that Foundry allows enterprises to reimagine how they interact with data. As implied by your question, legacy approaches to data involve awkward kludges of components all playing some siloed role in making data more useful to the enterprise. We designed Foundry to work openly with all data sources and repositories in ways that smoothly integrate data, create and capture new knowledge, and make such knowledge available to all users of the platform, to the entire enterprise. Doing this requires an open platform that makes it easy for enterprises to move data in and out. But more than that, it requires an open platform that lets any user source, fuse and transform data using incredibly powerful tools.
BG: So, if an enterprise has developed other tools or uses other commercial solutions, will Foundry work with those?
RF: Yes, of course. Over the last 15 years we have always had APIs that allowed data to be exported to other tools, but we heard consistently from our enterprise clients that faster, more streamlined, native interoperability was required. Our current system was designed from the ground up with this open data architecture, and we migrated Gotham APIs to modern, interoperable standards. This gives customers the ability to use any tool today and, as the world changes, to adopt new technologies more rapidly as they become available. Designing Foundry as an open system is what provides that benefit.
BG: So what would you say the greatest value add of Foundry is?
RF: In my experience, the biggest benefit to IT organizations lies in the data management and governance capabilities. This includes end-to-end data provenance, full audit coverage, and the most powerful access control system I am aware of. Fortunately, we were able to lean on our experience with Gotham and the government space when designing these capabilities. Foundry users discover, access, and derive value from all of their data. We give them the ability to collaborate over data and business logic, algorithms, code, formulas, spreadsheets, etc. The key features that enable collaboration are data and code versioning, together with sandboxing: every user can manipulate data and code in their own, isolated view and only merge the changes back once they’re ready. The backend also enables users to easily create new data out of their work product. Every data or code artifact created in Foundry becomes a new data source that other users can build on.
So, to more directly answer your question, Foundry’s greatest value add is to provide our customers with an open data platform that integrates with their enterprise environment. Marc Fontaine, the Digital Transformation Officer at Airbus, eloquently called this idea “re-creating digital continuity” in a recent interview with BCG.
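Editor’s sketch: the versioning and sandboxing workflow Robert describes (work in an isolated view, merge back when ready) can be illustrated with a few lines of Python. This is only a toy model under my own assumptions, not Foundry’s implementation or API; the `Dataset` class and its `branch`, `commit`, `head`, and `merge` methods are hypothetical names invented for illustration.

```python
from dataclasses import dataclass, field


@dataclass
class Dataset:
    """Hypothetical versioned dataset: each branch is a list of immutable snapshots."""

    branches: dict = field(default_factory=lambda: {"main": [()]})

    def branch(self, name, source="main"):
        # A new branch starts from a copy of the source history;
        # later commits on it do not affect the source branch.
        self.branches[name] = list(self.branches[source])

    def commit(self, branch, rows):
        # Append a new immutable snapshot instead of mutating the old one.
        self.branches[branch].append(tuple(rows))

    def head(self, branch="main"):
        # The latest snapshot on the given branch.
        return self.branches[branch][-1]

    def merge(self, name, target="main"):
        # Fast-forward only: allowed when the target has not moved
        # since the branch was created.
        src, dst = self.branches[name], self.branches[target]
        if src[: len(dst)] != dst:
            raise ValueError("target has diverged; manual resolution needed")
        self.branches[target] = list(src)


ds = Dataset()
ds.commit("main", [1, 2, 3])
ds.branch("alice")                   # alice gets her own sandbox
ds.commit("alice", [1, 2, 3, 4])     # invisible to main until merged
before_merge = ds.head("main")
ds.merge("alice")
after_merge = ds.head("main")
```

Keeping every snapshot immutable is what makes the sandbox safe in this sketch: a user’s branch can never change what other users see until an explicit merge.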
BG: Allow me to come back to my initial question. You called Foundry an “open platform”. Could you elaborate a little?
RF: The notions of open platforms and open architectures originate in hardware design and describe systems in which different components can be added, replaced, or upgraded independently. Buyers like this idea because it reduces vendor lock-in and increases flexibility and negotiating power. This translates more or less directly to software platforms: they are considered open if their inter-component APIs follow open standards, are well documented, and can be accessed by any party through readily available tools and libraries. This is in contrast to closed APIs, whose internals are undocumented and often intentionally cryptic. Moreover, use of closed APIs is typically governed by license agreements that prohibit third-party tools, or even outright ban any external use of the data behind them. At Palantir, we lean on open standards like JSON and HTTP for APIs, and open-source technologies for data storage and transformations.
In the early days of computing, the majority of commercial platforms were closed (because, hey, who doesn’t like a good monopoly?) and this led to the siloed compute and data infrastructures that most IT organizations on this planet are still trying to unwind today. Fortunately, the commoditization of hardware and software together with the continued success of open data formats and open-source ecosystems changed the market dynamic over the past 10 years or so. Today, no software vendor can afford to ignore open formats and open-source software.
BG: Does openness matter because customers are interested in plugging in their tools, or is it actually because there’s a deeper concern about turning things “off”?
RF: Both matter. Connectivity to other systems is obviously the bread and butter for a data integration platform. Similarly, almost all of our customers develop bespoke applications and plugins against our open APIs. Before committing to our deep technology and commercial partnership, Airbus conducted an anti-lock-in experiment to “turn off” our platform. Over the course of a week or so, they extracted data and code from Palantir into other systems. This was only possible because all data in Foundry is stored in standard, open formats and accessible through open, documented APIs using off-the-shelf open-source tooling.
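Editor’s sketch: the practical point behind the “turn off” experiment is that data written in an open format can be read back by any off-the-shelf tool, with no vendor library involved. The snippet below illustrates that idea using only Python’s standard `csv` and `json` modules. The record fields and the helper names `export_csv` and `import_csv` are hypothetical, and nothing here reflects how Foundry actually stores or serves data.

```python
import csv
import io
import json

# Hypothetical records standing in for an exported dataset.
rows = [
    {"part_id": "A100", "status": "installed"},
    {"part_id": "B200", "status": "pending"},
]


def export_csv(records):
    """Serialize a list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(records[0].keys()))
    writer.writeheader()
    writer.writerows(records)
    return buf.getvalue()


def import_csv(text):
    """Read CSV text back into a list of dicts (all values are strings)."""
    return [dict(r) for r in csv.DictReader(io.StringIO(text))]


csv_text = export_csv(rows)
json_text = json.dumps(rows)

# Any consumer that understands CSV or JSON can recover the same records.
from_csv = import_csv(csv_text)
from_json = json.loads(json_text)
```

Note that CSV carries no type information, so the round trip is lossless here only because every value is already a string; typed formats avoid that caveat.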
Our next post in this series will continue the discussion of open approaches, including open source software. The full series includes:
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part One: Open Data Architectures
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Two: Open Source and Open Approaches
- An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform Part Three: Open Development Environments