Introduction
Many customers want to pull data out of OpenSpecimen into an external data warehouse. You can achieve this in two ways:
Using APIs
Using Query
This page explains how to use the OpenSpecimen Query interface for this purpose. This approach is easy to set up.
Importantly, it spares outside developers from learning the OpenSpecimen APIs and writing custom programs. Custom programs take significant programming effort, add maintenance overhead in the long run, and can be error-prone.
Overview
Notes/References
Refer to Reporting to learn how to create queries.
Refer to Query Results view to learn how to include columns and rename column headers.
Refer to Scheduling Queries to learn how to schedule queries.
You can create separate queries for participants, visits, specimens, etc. to control the data output.
Refer to https://openspecimen.atlassian.net/l/c/xi6pnzz0 on how to run the scheduled jobs and download the data output.
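Once a scheduled job's output has been downloaded, it still has to be loaded into the warehouse. The following is a minimal sketch of that step, not an OpenSpecimen feature. It assumes the output is a CSV file with a header row, that one column uniquely identifies each record (the column name "Specimen ID" and the table name "specimen_export" are hypothetical), and it uses SQLite as a stand-in for the real warehouse.

```python
import csv
import io
import sqlite3


def load_export(conn, csv_text, table="specimen_export", key="Specimen ID"):
    """Upsert rows from an exported CSV into a warehouse table.

    Assumptions (not from OpenSpecimen docs): the export has a header row,
    and `key` names a column that uniquely identifies each record.
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    if not rows:
        return 0

    columns = list(rows[0].keys())
    col_sql = ", ".join(f'"{c}"' for c in columns)
    placeholders = ", ".join("?" for _ in columns)

    # Create the table on first use; the key column is the primary key.
    conn.execute(
        f'CREATE TABLE IF NOT EXISTS "{table}" ({col_sql}, PRIMARY KEY ("{key}"))'
    )
    # INSERT OR REPLACE overwrites a row whose key already exists,
    # so re-loading an updated record does not create a duplicate.
    conn.executemany(
        f'INSERT OR REPLACE INTO "{table}" ({col_sql}) VALUES ({placeholders})',
        [[row[c] for c in columns] for row in rows],
    )
    conn.commit()
    return len(rows)
```

A typical call would read the downloaded file and pass its text, for example `load_export(sqlite3.connect("warehouse.db"), open("export.csv").read())`. Replacing rows by key keeps the load safe to repeat when the same record appears in more than one run.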
How to query for records that have been added/updated since the last run?
You may not want all data in every run. To pull only the data that has changed since the last query run, you can use the “Update Time” field of the Participant, Visit, and Specimen objects in the query UI.
To retrieve specimens modified in the last 60 minutes, use the temporal filter given below:
minutes_between(current_date(), Specimen.updateTime) < 60
To retrieve visits modified in the last 24 hours:
minutes_between(current_date(), SpecimenCollectionGroup.updateTime) < 1440
To retrieve participants modified in the last week (7 days):
minutes_between(current_date(), Participant.updateTime) < 10080
To retrieve participants modified in the last month:
months_between(current_date(), Participant.updateTime) < 1
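The minute thresholds in the filters above (60, 1440, 10080) are simply the window length converted to minutes. As a small illustration, the sketch below derives the threshold from a time window and prints the matching filter text. The helper name `minutes_filter` is hypothetical; it only builds the expression string to paste into the query UI and does not call OpenSpecimen.

```python
from datetime import timedelta


def minutes_filter(field, window):
    """Build a minutes_between filter string for the given field and window."""
    # Convert the window to whole minutes, e.g. 24 hours -> 1440.
    minutes = int(window.total_seconds() // 60)
    return f"minutes_between(current_date(), {field}) < {minutes}"


if __name__ == "__main__":
    print(minutes_filter("Specimen.updateTime", timedelta(hours=1)))
    print(minutes_filter("SpecimenCollectionGroup.updateTime", timedelta(hours=24)))
    print(minutes_filter("Participant.updateTime", timedelta(days=7)))
```

If the window is longer than the interval between scheduled runs, consecutive runs return some of the same records, so whatever loads the output should tolerate repeats. If it is shorter, changes made between runs can be missed.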