Bulk Operations Enhancements

Introduction

OpenSpecimen includes the following improvements to the Bulk Operation (BO) feature:

  1. Upload Custom Forms/Dynamic Extensions (DE) data via BO
  2. Remove dependency on system identifiers (like participant id, specimen id, etc.)
  3. BO rerun
  4. Better handling of missing columns
  5. Default values in the template

Custom Forms BO Support

Users can insert data into custom forms using BO. When a new custom form is created or an existing one is updated, the BO template for it is generated automatically. Users can use this template to upload data in bulk.

Remove dependency on system identifiers

In older versions, migrating data for Participants, SCGs (Specimen Collection Groups), and Specimens required manual steps between file uploads: upload participants, copy the participant system identifiers into the SCG CSV, upload the SCG CSV, then copy the SCG system identifiers into the Specimen CSV. This can be long and tedious, and the manual intervention makes it error-prone.

OpenSpecimen fixes this issue by removing the dependency on system identifiers. The workflow becomes:

  1. Participant CSV should contain PPID
  2. SCG CSV should contain SCG Label, PPID & CPID 
  3. Specimen CSV should contain SCG Label
  4. SCG upload with MRN + Site name & CP title

With this, you can load all the CSVs one after the other without any manual intervention.
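As a rough illustration of how the files link through business keys instead of system identifiers, the Python sketch below builds minimal Participant, SCG, and Specimen rows and checks, before upload, that every child row points at a key present in its parent file. The column headers, sample values, and helper names are hypothetical; the real headers come from the BO templates of your installation.

```python
import csv
import io

# Hypothetical headers and values; real headers come from your BO templates.
participants = [{"PPID": "P-001", "Last Name": "Doe"}]
scgs = [{"SCG Label": "SCG-001", "PPID": "P-001", "CPID": "CP-1"}]
specimens = [{"Specimen Label": "SP-001", "SCG Label": "SCG-001"}]


def to_csv(rows):
    """Serialize a non-empty list of dicts to CSV text with a header row."""
    buf = io.StringIO()
    writer = csv.DictWriter(buf, fieldnames=list(rows[0]), lineterminator="\n")
    writer.writeheader()
    writer.writerows(rows)
    return buf.getvalue()


def unresolved_links(children, parents, key):
    """Return the child rows whose `key` value matches no parent row."""
    known = {parent[key] for parent in parents}
    return [child for child in children if child[key] not in known]


# SCGs refer to participants by PPID; specimens refer to SCGs by SCG Label.
orphan_scgs = unresolved_links(scgs, participants, "PPID")
orphan_specimens = unresolved_links(specimens, scgs, "SCG Label")
```

If both lists are empty, the three files can be generated with `to_csv` and uploaded in order (participants, SCGs, specimens) without copying any identifiers between them.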

Someone who has done legacy data migration in caTissue will appreciate this enhancement better since they have gone through the pain!

Note: There are still some cases where system IDs are required; these are being resolved as OpenSpecimen development moves along. Please feel free to report any places where you find them.

BO rerun

When uploading data via BO, failures are common because of data errors in the uploaded file, such as incorrect values for enumerated fields or wrong formats. After a user runs a BO upload, the system generates a report and displays it in the UI dashboard. In previous versions, when such errors occurred, the user had to correct the data, recreate the data file with only the failed records, and upload it again. With the new version of BO, the user can correct the data directly in the BO report generated by the system and upload that report back. There is no need to remove the successful records or the additional status column: the system ignores the records that already succeeded and processes only the previously failed ones.
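The rerun behaviour can be pictured with a small Python sketch. This is not OpenSpecimen's implementation; the status column name (`Status`), the success marker (`SUCCESS`), and the sample columns are assumptions made for illustration only.

```python
import csv
import io

STATUS_COLUMN = "Status"    # assumed name of the status column in the report
SUCCESS_VALUE = "SUCCESS"   # assumed marker for records that already succeeded


def records_to_process(report_text):
    """Return the rows of a re-uploaded report that still need processing.

    Rows already marked successful are skipped, and the status column is
    dropped from the remaining rows so that only data columns are left.
    """
    pending = []
    for row in csv.DictReader(io.StringIO(report_text)):
        status = (row.get(STATUS_COLUMN) or "").strip().upper()
        if status == SUCCESS_VALUE:
            continue
        row.pop(STATUS_COLUMN, None)
        pending.append(dict(row))
    return pending


# A report the user has corrected in place: P-002 failed earlier and was fixed.
report = "PPID,Gender,Status\nP-001,Male,SUCCESS\nP-002,Female,FAILURE\n"
pending = records_to_process(report)
```

Here `pending` holds only the P-002 row, which mirrors the idea that the user edits the report and uploads it without trimming anything.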

Setting default value in template

In OpenSpecimen, users can specify default values in the BO XML template at the attribute level. Columns that have a default value can then be omitted from the CSV when loading data. Below are some use cases and examples of how this can be used:

  1. For fields that are not used in a particular biobank but are mandatory in caTissue, specify a default value such as Not Specified. Examples:
    1. SpecimenCollectionGroup.ClinicalStatus - Not Specified
    2. Specimen.TissueSite - Not Specified
  2. For fields that are mandatory and whose value is the same for all protocols, set the value in the XML. Examples:
    1. SpecimenCollectionGroup.CollectionSite - ABC Hospital (if all specimens are collected at the same collection site)
    2. Specimen.Type - Whole Blood (if only whole blood is collected)

In the above cases, when loading Specimen Collection Groups or Specimens, users can omit from the CSV any attribute for which a default value is specified. If a value is specified in the CSV, it overrides the default value.
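The precedence rule (a CSV value wins over the template default) can be sketched in Python as follows. The `resolve_value` helper is hypothetical, and treating a blank CSV cell as "not specified" is an assumption of this sketch rather than documented BO behaviour.

```python
# Defaults as they might be declared in a BO XML template (values from the
# examples above); the dictionary representation is purely illustrative.
TEMPLATE_DEFAULTS = {
    "ClinicalStatus": "Not Specified",
    "CollectionSite": "ABC Hospital",
}


def resolve_value(csv_row, column, defaults):
    """Return the CSV value if present, otherwise the template default.

    Assumption: a missing column or a blank cell counts as "not specified".
    Returns None when neither a CSV value nor a default exists.
    """
    value = csv_row.get(column)
    if value is not None and value.strip() != "":
        return value
    return defaults.get(column)


# The CSV row omits ClinicalStatus but supplies its own CollectionSite.
row = {"SCG Label": "SCG-001", "CollectionSite": "XYZ Clinic"}
clinical_status = resolve_value(row, "ClinicalStatus", TEMPLATE_DEFAULTS)
collection_site = resolve_value(row, "CollectionSite", TEMPLATE_DEFAULTS)
```

In this example `clinical_status` falls back to the default, while `collection_site` keeps the value supplied in the CSV.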

Better handling for missing columns

Previously, if the BO XML template contained a certain set of columns, the CSV file had to contain all of those columns too. However, for optional columns, or for "Edit" operations, it is not necessary to have every column in the CSV file. In OpenSpecimen, BO ignores missing columns unless a missing column is mandatory. Even mandatory columns can be omitted if a default value is specified in the template.
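The rule for missing columns can be summarised with a short Python sketch. It only models the decision described above; the function name, the way the template is represented (a mapping from column name to a mandatory flag), and the sample columns are hypothetical.

```python
def check_missing_columns(template_columns, csv_header, defaults):
    """Classify template columns that are absent from the CSV header.

    template_columns: dict mapping column name -> True if mandatory.
    csv_header: list of column names present in the CSV file.
    defaults: dict of column name -> default value from the template.

    Returns (ignored, errors): missing columns that can be skipped, and
    missing mandatory columns that have no default to fall back on.
    """
    present = set(csv_header)
    missing = [col for col in template_columns if col not in present]
    errors = [
        col for col in missing
        if template_columns[col] and col not in defaults
    ]
    ignored = [col for col in missing if col not in errors]
    return ignored, errors


template = {"Label": True, "Type": True, "TissueSite": True, "Comments": False}
defaults = {"TissueSite": "Not Specified"}
ignored, errors = check_missing_columns(template, ["Label"], defaults)
```

With this input, `Comments` (optional) and `TissueSite` (mandatory, but with a default) are ignored, and only `Type` is reported as an error.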

Delete existing Bulk Operation templates

Use the following SQL statements to delete existing BO templates:

NOTE: Take a backup of the database before running these commands.

Delete all templates:

DELETE FROM catissue_bulk_operation;

Delete specific template:

DELETE FROM catissue_bulk_operation WHERE operation LIKE 'OperationNameHere';

Load standard Bulk Operation templates

You can load the standard BO templates by running the following ant target from the OpenSpecimen installable directory.

ant load_bulk_operator_templates

Ant target to load Bulk Operation templates

Run the following ant target from the OpenSpecimen installable directory, changing the parameters to match your template.

ant add_bulk_operation_template -DoperationName="Operation Name Here" -DdropdownName="Drop-down Name Here" -DcsvFile="CSV File Path" -DxmlFile="XML File Path"

Note: Paths must use the Unix-style path separator "/", including on Windows.
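When the command is assembled by a script, the separator requirement is easy to get wrong on Windows. The Python sketch below builds the argument list for the ant target shown above and converts backslash-separated paths to "/" form. The helper names and sample paths are hypothetical; only the target name and the four `-D` parameters come from the command above.

```python
from pathlib import PureWindowsPath


def to_unix_style(path):
    """Return the path with "/" separators, accepting "\\" or "/" as input."""
    return PureWindowsPath(path).as_posix()


def build_ant_command(operation_name, dropdown_name, csv_file, xml_file):
    """Build the argument list for the add_bulk_operation_template target."""
    return [
        "ant",
        "add_bulk_operation_template",
        f"-DoperationName={operation_name}",
        f"-DdropdownName={dropdown_name}",
        f"-DcsvFile={to_unix_style(csv_file)}",
        f"-DxmlFile={to_unix_style(xml_file)}",
    ]


command = build_ant_command(
    "Operation Name Here",
    "Drop-down Name Here",
    "C:\\templates\\specimen.csv",   # hypothetical path
    "C:\\templates\\specimen.xml",   # hypothetical path
)
```

Passing the result as a list to `subprocess.run(command, cwd=<installable directory>, check=True)` avoids shell quoting problems with the spaces in the operation and drop-down names.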