An interview with Robert Fink, Architect of Foundry, Palantir’s open data platform. Part One: Open Data Architectures


Editor’s note: This post is the first in a series of three capturing a recent interview with Robert Fink of Palantir. The conversation was wide-ranging, touching on design, development environments, and a bit of the philosophy of enterprise tech. Several common themes emerged, including the ways Palantir has been leveraging open approaches to data architectures, system design, and even developer environments. This first post focuses on open data architectures. – bg

Bob Gourley: Robert, I look forward to your thoughts on three related topics: open data architectures, open-source software, and open development approaches. But first, could you tell us a bit about your background?

Robert Fink: I am a computer scientist at Palantir Technologies in Palo Alto, California. I have a background in Physics, and a PhD in Computer Science from Oxford University specializing in database theory. Here at Palantir I am the architect of our open data platform that we call Foundry. Today I spend most of my time on the design and architecture of Foundry. I also oversee our work with the open-source software community… as you know, software development is a team sport these days. I’m also interested in developer education and in development best practices across the industry.

BG: I’m intrigued that you call Foundry an “open data platform”. Is it a database? Or an application? Or an enterprise service bus? Or an ETL service? Or some combination of those?

RF: You managed to mention everything that Foundry is not! Or at least not in the traditional sense of these concepts. Foundry is the culmination of 15 years of experience with data integration pipelines and data analysis workflows across our government and commercial customers. We initially built Foundry as a productivity tool for our own engineers, but soon realized that it allows enterprises to reimagine how they interact with data. As your question implies, legacy approaches to data involve awkward kludges of components, each playing some siloed role in making data more useful to the enterprise. We designed Foundry to work openly with all data sources and repositories in ways that smoothly integrate data, create and capture new knowledge, and make that knowledge available to all users of the platform across the entire enterprise. Doing this requires an open platform that makes it easy for enterprises to move data in and out. But more than that, it requires an open platform that lets any user source, fuse and transform data using incredibly powerful tools.

BG: So, if an enterprise has developed other tools or uses other commercial solutions, will Foundry work with those?

RF: Yes, of course. Over the last 15 years we have always had APIs that allowed data to be exported to other tools, but we heard consistently from our enterprise clients that faster, more streamlined, or native interoperability is required. Our current system was designed from the ground up around this open data architecture, and we migrated the Gotham APIs to modern, interoperable standards. This lets customers use any tool today and, as the world changes, adopt new technologies more rapidly as they become available. That is the benefit of designing an open system.

BG: So what would you say the greatest value add of Foundry is?

RF: In my experience, the biggest benefit to IT organizations lies in the data management and governance capabilities. This includes end-to-end data provenance, full audit coverage, and the most powerful access control system I am aware of. Fortunately, we were able to lean on our experience with Gotham and the government space when designing these capabilities. Foundry users discover, access, and derive value from all of their data. We give them the ability to collaborate over data and business logic, algorithms, code, formulas, spreadsheets, etc. The key features that enable collaboration are data and code versioning, together with sandboxing: every user can manipulate data and code in their own, isolated view and only merge the changes back once they’re ready. The backend also enables users to easily create new data out of their work product. Every data or code artifact created in Foundry becomes a new data source that other users can build on.

So, to more directly answer your question, Foundry’s greatest value add is to provide our customers with an open data platform that integrates with their enterprise environment. Marc Fontaine, the Digital Transformation Officer at Airbus, eloquently called this idea “re-creating digital continuity” in a recent interview with BCG.
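As an illustration of the versioning and sandboxing idea described in this answer, here is a minimal sketch in Python. It is not Foundry's implementation or API: the `VersionedDataset` class, its methods, and the branch names are all hypothetical, and the merge is assumed to be a simple fast-forward. It only shows the general pattern of working in an isolated view and merging back when ready.

```python
from dataclasses import dataclass, field


@dataclass
class VersionedDataset:
    """Hypothetical toy dataset: each branch is a list of immutable snapshots."""

    # Every dataset starts with a "main" branch holding one empty snapshot.
    branches: dict = field(default_factory=lambda: {"main": [()]})

    def head(self, branch):
        """Return the latest snapshot on a branch."""
        return self.branches[branch][-1]

    def create_branch(self, name, source="main"):
        """Open a sandbox: the new branch starts from a copy of the source history."""
        self.branches[name] = list(self.branches[source])

    def commit(self, branch, rows):
        """Append a new snapshot to one branch; other branches are untouched."""
        self.branches[branch].append(tuple(rows))

    def merge(self, source, target="main"):
        """Fast-forward merge (an assumption of this sketch): refuse if target moved."""
        src, tgt = self.branches[source], self.branches[target]
        if src[: len(tgt)] != tgt:
            raise ValueError("target has diverged; fast-forward merge not possible")
        self.branches[target] = list(src)


ds = VersionedDataset()
ds.commit("main", [("a", 1)])
ds.create_branch("sandbox")
ds.commit("sandbox", [("a", 1), ("b", 2)])
main_before_merge = ds.head("main")  # still the old snapshot
ds.merge("sandbox")
main_after_merge = ds.head("main")   # now includes the sandbox change
```

Until `merge` is called, changes committed to the sandbox branch are invisible on `main`, and every earlier snapshot remains available in the branch history.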

BG: Allow me to come back to my initial question. You called Foundry an “open platform”. Could you elaborate a little?

RF: The notions of open platforms and open architectures originate in hardware design and describe systems in which different components can be added, replaced, or upgraded independently. Buyers like this idea because it reduces vendor lock-in and increases flexibility and negotiating power. This translates more or less directly to software platforms: they are considered open if their inter-component APIs follow open standards, are well documented, and can be accessed by any party through readily available tools and libraries. This is in contrast to closed APIs, whose internals are undocumented and often intentionally cryptic. Moreover, use of closed APIs is typically governed by license agreements that prohibit third-party tools, or even ban outright any external use of the underlying data. At Palantir, we lean on open standards like JSON and HTTP for APIs, and open-source technologies for data storage and transformations.

In the early days of computing, the majority of commercial platforms were closed (because, hey, who doesn’t like a good monopoly?) and this led to the siloed compute and data infrastructures that most IT organizations on this planet are still trying to unwind today. Fortunately, the commoditization of hardware and software together with the continued success of open data formats and open-source ecosystems changed the market dynamic over the past 10 years or so. Today, no software vendor can afford to ignore open formats and open-source software.

BG: Does openness matter because customers are interested in plugging in their tools, or is it actually because there’s a deeper concern about turning things “off”?

RF: Both matter. Connectivity to other systems is obviously the bread and butter for a data integration platform. Similarly, almost all of our customers develop bespoke applications and plugins against our open APIs. Before committing to our deep technology and commercial partnership, Airbus conducted an anti-lock-in experiment to “turn off” our platform. Over the course of a week or so, they extracted data and code from Palantir into other systems. This was only possible because all data in Foundry is stored in standard, open formats and accessible through open, documented APIs using off-the-shelf open-source tooling.
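To make the point about standard, open formats concrete, the following is a small sketch using only the Python standard library. It does not use any Palantir API: the function names and the sample records are hypothetical. It assumes a dataset is just a list of records, and shows that once data is written as CSV or JSON Lines, any off-the-shelf tool can read it back without vendor-specific software.

```python
import csv
import io
import json


def export_csv(rows, columns):
    """Serialize records (a list of dicts) to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=columns)
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def import_csv(text):
    """Read CSV text back into dicts; note that CSV carries no type information."""
    return [dict(row) for row in csv.DictReader(io.StringIO(text))]


def export_json_lines(rows):
    """Serialize records as JSON Lines: one JSON object per line."""
    return "\n".join(json.dumps(row, sort_keys=True) for row in rows)


def import_json_lines(text):
    """Read JSON Lines text back into dicts, skipping blank lines."""
    return [json.loads(line) for line in text.splitlines() if line]


# Hypothetical sample records, for illustration only.
records = [{"part": "wing", "count": 2}, {"part": "engine", "count": 4}]

from_csv = import_csv(export_csv(records, ["part", "count"]))
from_json = import_json_lines(export_json_lines(records))
```

The JSON Lines round trip preserves the integer values, while the CSV round trip returns every field as a string, which is one reason typed open formats are often preferred for data exchange.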

Our next post in this series will continue the discussion of open approaches, including open-source software.
