Introduction

OpenSpecimen v3.2 introduces a feature that automatically monitors a folder and consumes bulk import files on a continuous basis. This feature is useful for integrating OpenSpecimen with other systems such as REDCap and OpenClinica.

With this feature, external systems do not need to learn the OpenSpecimen (OS) REST APIs to integrate with OS. They only need to generate OS BO-compatible CSV files and store them in a predefined folder. OS continuously monitors this folder and consumes each new BO file it finds. The CSV files are processed in ascending order of the timestamp specified in the file name, which preserves the order in which the files are meant to be imported.
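Because the order depends only on the timestamp in the file name, an integration script can check in advance the order in which a batch would be consumed. The following is a minimal sketch, not OpenSpecimen's own implementation; the helper names `timestamp_of` and `in_processing_order` are hypothetical, and it assumes every name carries a 17-digit yyyyMMddHHmmssSSS timestamp.

```python
import re

# A 17-digit timestamp preceded by "_" and followed by "_" or "." (assumed layout).
TIMESTAMP = re.compile(r"_(\d{17})(?=[_.])")


def timestamp_of(file_name):
    """Return the 17-digit timestamp embedded in a bulk import file name."""
    match = TIMESTAMP.search(file_name)
    if match is None:
        raise ValueError("no yyyyMMddHHmmssSSS timestamp in " + file_name)
    return match.group(1)


def in_processing_order(file_names):
    """Sort names by ascending timestamp.

    The timestamps are fixed-width digit strings, so plain string
    comparison matches chronological order.
    """
    return sorted(file_names, key=timestamp_of)
```

For example, a participant registration file that should be imported before its specimens needs the earlier timestamp, whatever the order in which the two files were copied into the folder.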

Steps

Create a "scheduled-bulk-import" directory in the app data (os-data) directory if it is not already present. The location of the os-data directory is given by the app.data_dir property in the $Tomcat_Home/conf/openspecimen.properties file.
Copy the bulk import CSV files into the 'scheduled-bulk-import' directory. For integrations, this can be done via scripts.
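For a scripted integration, the two steps above could look like the sketch below. It is illustrative only: the function names `read_data_dir` and `drop_for_import` are hypothetical, and the properties parser handles only simple `key=value` lines, which is an assumption about how app.data_dir is written in openspecimen.properties.

```python
import shutil
from pathlib import Path


def read_data_dir(properties_file):
    """Return the app.data_dir value from a properties file (key=value lines only)."""
    text = Path(properties_file).read_text(encoding="utf-8")
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith(("#", "!")):
            continue  # skip blank lines and comments
        key, sep, value = line.partition("=")
        if sep and key.strip() == "app.data_dir":
            return Path(value.strip())
    raise KeyError("app.data_dir not found in " + str(properties_file))


def drop_for_import(csv_file, data_dir):
    """Copy csv_file into <data_dir>/scheduled-bulk-import, creating the folder if needed."""
    target_dir = Path(data_dir) / "scheduled-bulk-import"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / Path(csv_file).name
    shutil.copyfile(csv_file, target)
    return target
```

A caller would pass the path to $Tomcat_Home/conf/openspecimen.properties to `read_data_dir` and then call `drop_for_import` once per generated CSV file.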

The file name must follow the format given in the table below.

Data	Format	Description
Standard entities (e.g. participant, specimen, etc.)	<object_type>_<operation>_<timestamp>_[<csv_type>].csv	object_type: Entity name specified in the bulk import schema file (see the list below). operation: Operation to perform; valid values are "create" or "update". timestamp: In "yyyyMMddHHmmssSSS" format. csv_type (optional): Specify "m" in case of "Order" and "Shipment". Examples: cp_create_20160511162033124.csv, distributionOrder_create_20160511162033124.csv
Custom fields	<entity>_<operation>_<timestamp>_cpId_<cpId>.csv	entity: One of cpr, visit, specimen. operation: Operation to perform, "create" or "update". timestamp: In "yyyyMMddHHmmssSSS" format. cpId: Static word indicating that the collection protocol identifier follows. <cpId>: Identifier of the collection protocol. You can get this identifier via the DB or from the browser URL on the collection protocol overview page. Example (specimen custom field level update file): specimen_update_20200925115950000_cpId_4292.csv
Custom forms	extensions_<attached_level>_<form_name>_<operation>_<timestamp>.csv	extensions: Static word to identify a custom form. attached_level: Level at which the form is attached; one of Participant, SpecimenCollectionGroup (i.e. Visit), Specimen, SpecimenEvent. form_name: System-generated 'Form Name' of the custom form. operation: Operation to perform, "create" or "update". timestamp: In "yyyyMMddHHmmssSSS" format. Example: extensions_Participant_familyHistoryAnnotation_create_20160511162246252.csv
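The naming formats in the table can be generated programmatically. The sketch below is one possible way to do it; the function names are hypothetical, the timestamp is built from a Python `datetime` at millisecond precision, and appending the optional csv_type as a trailing `_m` is an assumption based on the format column.

```python
from datetime import datetime

OPERATIONS = ("create", "update")


def timestamp_for(moment):
    """Format a datetime as yyyyMMddHHmmssSSS (17 digits, millisecond precision)."""
    return moment.strftime("%Y%m%d%H%M%S") + f"{moment.microsecond // 1000:03d}"


def _check(operation):
    if operation not in OPERATIONS:
        raise ValueError("operation must be 'create' or 'update'")


def standard_name(object_type, operation, moment, csv_type=None):
    """<object_type>_<operation>_<timestamp>_[<csv_type>].csv"""
    _check(operation)
    parts = [object_type, operation, timestamp_for(moment)]
    if csv_type:
        parts.append(csv_type)  # assumed placement of the optional "m" suffix
    return "_".join(parts) + ".csv"


def custom_field_name(entity, operation, moment, cp_id):
    """<entity>_<operation>_<timestamp>_cpId_<cpId>.csv"""
    _check(operation)
    if entity not in ("cpr", "visit", "specimen"):
        raise ValueError("entity must be cpr, visit or specimen")
    return f"{entity}_{operation}_{timestamp_for(moment)}_cpId_{cp_id}.csv"


def custom_form_name(attached_level, form_name, operation, moment):
    """extensions_<attached_level>_<form_name>_<operation>_<timestamp>.csv"""
    _check(operation)
    return f"extensions_{attached_level}_{form_name}_{operation}_{timestamp_for(moment)}.csv"
```

Calling `custom_field_name("specimen", "update", datetime(2020, 9, 25, 11, 59, 50), 4292)` reproduces the example name specimen_update_20200925115950000_cpId_4292.csv from the table.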

Once the files are processed (uploaded into OpenSpecimen), they are moved to a folder named 'processed-bulk-import'.
If there is an issue with the file name format, the file is not uploaded and is moved to a folder named 'unprocessed-bulk-import'.
To retry, correct the file name and move the file back to the 'scheduled-bulk-import' folder, from where it is picked up for upload.
To view the bulk import jobs, use the URL <https://IP address/openspecimen/#/bulk-import-jobs>.
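Resubmitting a rejected file can also be scripted. The sketch below is a hypothetical helper (`requeue`), and it assumes that the 'unprocessed-bulk-import' folder sits next to 'scheduled-bulk-import' inside the os-data directory; check the actual location on your server before relying on it.

```python
import shutil
from pathlib import Path


def requeue(data_dir, rejected_name, corrected_name):
    """Move a rejected file back into scheduled-bulk-import under a corrected name."""
    # Assumed layout: both folders live directly under the os-data directory.
    source = Path(data_dir) / "unprocessed-bulk-import" / rejected_name
    target_dir = Path(data_dir) / "scheduled-bulk-import"
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / corrected_name
    if target.exists():
        raise FileExistsError(str(target))  # never overwrite a queued file
    shutil.move(str(source), str(target))
    return target
```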
The following is the list of object types for OpenSpecimen entities:

Note

Updating CP-based custom field values of participants, visits, and specimens is supported from v6.3.

Entity	Object Type
Institutes	institute
Site	site
User	user
User Roles	userRoles
Container	storageContainer
Distribution Order	distributionOrder
Shipment	shipment
Participant registration	cpr
Participant registrations for multiple CP	cprMultiple
Participant consents	consent
Visits	visit
Specimen	specimen
Aliquots	specimenAliquot
Derivatives	specimenDerivative
Master Specimen	masterSpecimen

Note: Dates in the files should use the format <mm-dd-yyyy> for the US and <dd-mm-yyyy> for others.
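When the CSV is generated by a script, the date format from the note can be applied at export time. A minimal sketch follows; the function name `format_import_date` and the `us_format` flag are hypothetical, and the caller decides which convention applies to their installation.

```python
from datetime import date


def format_import_date(value, us_format):
    """Render a date as mm-dd-yyyy (US) or dd-mm-yyyy (others)."""
    pattern = "%m-%d-%Y" if us_format else "%d-%m-%Y"
    return value.strftime(pattern)
```

For 15 July 2022 this yields 07-15-2022 with `us_format=True` and 15-07-2022 otherwise.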

Example 1: Template for Collection Protocol

Example 2: Template for importing files into a participant-level custom field:

Prerequisites:

Participant update CSV: The name of the CSV file should follow the cpr_update_yyyyMMddHHmmssSSS_cpId_<cpId>.csv format, e.g. cpr_update_20220715185015000_cpId_18.csv
Create a folder called 'files' and place the required files (pdf, jpeg, png, etc.) in it.
Compress both the 'files' folder and the CSV file into a zip. Give the zip the same name as the CSV file, i.e., cpr_update_20220715185015000_cpId_18.zip
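The packaging described in these prerequisites can be automated. The sketch below uses Python's standard `zipfile` module; the function name `build_import_zip` is hypothetical, and placing the CSV file and the 'files' folder at the top level of the archive is an assumption drawn from the steps above.

```python
import zipfile
from pathlib import Path


def build_import_zip(csv_file, files_dir, out_dir):
    """Zip the CSV plus a 'files' folder; the zip takes the CSV's base name."""
    csv_path = Path(csv_file)
    files_path = Path(files_dir)
    zip_path = Path(out_dir) / (csv_path.stem + ".zip")
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as archive:
        # The CSV goes at the top level of the archive (assumed layout).
        archive.write(csv_path, arcname=csv_path.name)
        # Every attachment goes under files/, whatever the source folder is called.
        for item in sorted(files_path.rglob("*")):
            if item.is_file():
                inner = Path("files", item.relative_to(files_path)).as_posix()
                archive.write(item, arcname=inner)
    return zip_path
```

Given cpr_update_20220715185015000_cpId_18.csv, the function writes cpr_update_20220715185015000_cpId_18.zip into `out_dir`.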

Example zip file containing csv and folder 'files' with pdfs:
