ICFP 2017
Sun 3 - Sat 9 September 2017 Oxford, United Kingdom
Sat 9 Sep 2017 17:15 - 17:40 at L2 - CUFP Talks 6

Biopharmaceutical research is an incredibly resource-intensive endeavor. The task of pharmaceutical R&D organizations is to perform translational research: integrating data and knowledge of human biology in order to create new treatments for disease.

This requires iterating from a broad initial set of biological hypotheses toward an increasingly narrow set of high-confidence biological mechanisms and associated treatments. Each iteration combines many sources of information to design and interpret experiments and to decide which biological mechanisms are most likely to be efficacious. Progressing a treatment through the pipeline spans a multi-year development cycle and culminates in clinical studies that determine the success or failure of the proposed treatment.

One key challenge for translational research is that it draws upon disparate sources of information, including genomics, experiments on animal models, public data repositories, high-throughput experiments, and high-dimensional measurements of biomarkers. Data sources live in many locations and in a variety of formats and ad hoc schemas, so mere acquisition and integration of the data is often a rate-limiting step of research. Contextualizing heterogeneous data into persisted datasets requires a large amount of upfront work, yet analyses are often one-off. The result is a choice between two suboptimal options: delaying time-sensitive analyses through over-engineering, or failing to build the persistent data resources that offer institutional value in the longer term.

We used functional programming technologies to accelerate the core computational workflows involved in the many data integration needs of translational research. Our goal was to build a platform that automates many steps of the process for scientists who are domain experts but have little training in data management tools. The key to success was declarative programming at every stage of the workflow. The first stage is data acquisition, where we dynamically infer a declarative specification of the shape of the data (a schema) as it streams in, with little to no annotation from the scientist; a sketch of this inference step appears below. These inferred schemas help scientists make sense of the data and build effective queries. We further automate access to the data by dynamically generating REST endpoints that can be queried conveniently from any of the many programming languages in use by scientists today (R, Python, and Ruby). The content of data sources is versioned, and the metadata is curated via programmatic ontological annotations. Scientists are then able to reason about a priori unknown relationships between disparate data sources. Finally, we built a browser-based interface for identifying and curating datasets.
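To make the inference step concrete, here is a minimal, self-contained Haskell sketch of the idea; it is not the platform's actual code. Value is a simplified stand-in for a decoded JSON term, and all type and function names are illustrative. Each incoming record is mapped to its shape, and shapes are merged into the least schema covering every record seen so far, with fields widening to optional when they are sometimes null or absent.

```haskell
-- Minimal sketch of dynamic schema (shape) inference over incoming
-- records. 'Value' is a simplified stand-in for a decoded JSON term;
-- all names here are illustrative, not the platform's actual API.
module ShapeInference where

import qualified Data.Map.Strict as M

-- A dynamic value as it might arrive off the wire.
data Value
  = VNull
  | VBool Bool
  | VNumber Double
  | VString String
  | VRecord (M.Map String Value)
  deriving (Eq, Show)

-- The inferred declarative schema: primitives, optional fields, records.
data Shape
  = SNull | SBool | SNumber | SString
  | SOptional Shape                    -- value sometimes null or missing
  | SRecord (M.Map String Shape)
  | SAny                               -- observations disagree; widen fully
  deriving (Eq, Show)

-- Shape of a single observed value.
shapeOf :: Value -> Shape
shapeOf VNull       = SNull
shapeOf (VBool _)   = SBool
shapeOf (VNumber _) = SNumber
shapeOf (VString _) = SString
shapeOf (VRecord m) = SRecord (M.map shapeOf m)

-- Merge two observed shapes into the least schema covering both.
merge :: Shape -> Shape -> Shape
merge a b | a == b    = a
merge (SOptional a) b = optional (merge a b)
merge a (SOptional b) = optional (merge a b)
merge SNull s         = optional s
merge s SNull         = optional s
merge (SRecord a) (SRecord b) =
  SRecord (M.mergeWithKey (\_ x y -> Just (merge x y))
                          (M.map optional) (M.map optional) a b)
merge _ _             = SAny

-- Mark a shape optional without nesting SOptional.
optional :: Shape -> Shape
optional s@(SOptional _) = s
optional s               = SOptional s

-- Fold inference over the records seen so far.
inferSchema :: [Value] -> Maybe Shape
inferSchema [] = Nothing
inferSchema vs = Just (foldr1 merge (map shapeOf vs))
```

Running inferSchema over a handful of records yields a record schema whose field types reflect every observation, which is the kind of declarative specification the platform can surface to scientists before they write queries.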

While biopharma is an unusual setting for functional programming, we used FP technologies at all levels of the stack: Haskell for efficiently streaming data into our platform, GHCJS for rapidly iterating on a web interface that our users find effective, and functional data query DSLs in R (dplyr) and other languages. We also developed technologies inspired by functional programming itself, such as type inference, immutability, and inference of logical relationships from the data we acquire. Overall, we found functional programming methodologies a good fit for the data analysis needs and rapid iteration requirements of translational research.
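As an illustration of the streaming-ingest side, the fragment below sketches how a constant-memory fold might be written with the conduit library, reusing the names from the sketch above. It is a sketch under stated assumptions rather than the authors' pipeline: parseRecord is a hypothetical stand-in for a real JSON/CSV decoder.

```haskell
-- Sketch (not the authors' code) of constant-memory ingest with the
-- conduit streaming library, folding schema inference over records as
-- they arrive rather than loading the whole file into memory.
module IngestSketch where

import           Conduit
import qualified Data.ByteString.Char8 as BS

import           ShapeInference (Shape, Value (..), merge, shapeOf)

-- Hypothetical stand-in decoder; a real pipeline would parse JSON/CSV.
parseRecord :: BS.ByteString -> Value
parseRecord = VString . BS.unpack

-- Stream a file line by line, updating the inferred schema per record.
ingestAndInfer :: FilePath -> IO (Maybe Shape)
ingestAndInfer path = runConduitRes $
     sourceFile path
  .| linesUnboundedAsciiC
  .| mapC parseRecord
  .| foldlC step Nothing
  where
    step acc v = Just (maybe (shapeOf v) (`merge` shapeOf v) acc)
```

Because the fold is incremental, the inferred schema is available as soon as the stream finishes, and memory usage stays bounded regardless of the size of the source.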
