Episode 287

Alan Rubin on MaveDB

April 17th, 2026

39 mins 51 secs

About this Episode

Guest

Alan Rubin

Panelist

Richard Littauer

Show Notes

On this episode of Sustain, Richard Littauer sits down with computational biologist Alan Rubin to explore how open source software supports scientific research, clinical genetics, and cancer-related data infrastructure. Their conversation centers on MaveDB, a project that began as a way to organize hard-to-find variant data from research papers and has since evolved into a valuable resource for both scientists and clinicians. Along the way, they discuss infrastructure funding, research software sustainability, and why open source communities and academic researchers have a lot to learn from each other. Press download now to hear more!

[00:01:24] Alan explains his role leading a research group focused on genomics, cancer medicine, and improving patient care through genetics.

[00:02:46] We learn more about what MaveDB does.

[00:06:52] Alan details why a database was needed.

[00:08:26] Alan shares how the project grew out of collaboration, PyCon AU inspiration, Django, and Python tooling that let a small team build a practical research database.

[00:11:54] They discuss the infrastructure funding problem, and Alan explains how hard it is to fund scientific infrastructure, since most grants favor new discoveries over maintaining shared tools and databases.

[00:17:55] The project took a major turn when clinical geneticists began using the data to interpret patient variants, pushing the team to rethink the interface and user needs.

[00:21:13] Alan describes the new clinical-facing interface, Mave for Medicine (MaveMD), designed to help doctors evaluate specific variants for diagnosis and treatment decisions.

[00:22:02] Alan talks about managing the project through a distributed team, shared responsibilities, and a role that now centers more on direction, priorities, and community than day-to-day coding.

[00:23:36] They discuss why research software rarely attracts hobbyist contributors, even when the mission is compelling, and how scientific projects often function more like small product teams.

[00:27:44] Alan makes the case that scientists often learn more about improving their software craft at events like PyCon than at discipline-specific conferences.

[00:30:38] Alan highlights how academic software depends heavily on mature, well-documented open source tools and encourages more connection between technical communities and scientific work.

[00:34:15] Find out where you can learn more about MaveDB and Alan’s work.

Quotes

[00:10:04] “We quite literally followed the Django Girls tutorial, but instead of building a blog, we built a database for research scientists.”

[00:12:35] “Infrastructure is something everybody wants to have it exist and nobody wants to pay for.”

[00:26:08] “I have never been successful in engaging the broader open source community, despite having tried many times to contribute to this or any other scientific project.”

[00:31:01] “I think people who work in OSS should be excited about the kind of stuff that their work is enabling, even if they don’t really hear about it.”

Spotlight

  • [00:35:44] Richard’s spotlight is the book, News of the Dead.
  • [00:36:22] Alan’s spotlight is The Global Alliance for Genomics & Health (GA4GH) and all the good work they’re doing.

Sponsor

CURIOSS
