Release Notes
1.14 - 10/15/2024 Enabling BJ-Local#
General Updates#
- Refactored to remove the warnings on pipeline run
- Updated the containers to use ubuntu 24.04 as the base
Secondary Pipelines#
- BJ-DNA-QC v2.0.3: See general updates.
- BJ-WES v1.3.6: Support for GIAB HG002-T2T. See general updates.
- BJ-WGS v1.8.3: Support for GIAB HG002-T2T. See general updates.
- BJ-Expression v1.8.3: See general updates.
- BJ-SomaticVariantCalling v1.0.2: Addition of Count_Reads module. Removed “reads” metadata from input.csv. See general updates.
1.13.1 - 8/6/2024#
Secondary Pipelines#
- BJ-Expression v1.8.2: Increased resource allocation for larger input reads.
Bug Fixes#
- Fixed a zero-reads error that caused pipelines to fail
1.13 - 7/11/2024#
General Updates#
- Addition of software versions table to MultiQC.
- Addition of user-specific parameter to MultiQC.
Secondary Pipelines#
- BJ-SomaticVariantCalling v1.0.1: Initial release enabling somatic variant calling with TNscope variant caller and a pseudobulk module. See general updates.
- BJ-DNA-QC v2.0.1: Updated Ginkgo parameter default to diploid. Cleaned Qualimap results output. See general updates.
- BJ-WGS v2.0.5: Added more GIAB cell references for the ADO and VCFeval modules and bioskryb129 custom DNAscope model. See general updates.
- BJ-WES v1.3.5: Added more GIAB cell references for the ADO and VCFeval modules and bioskryb129 custom DNAscope model. See general updates.
- BJ-Expression v1.8.1: See general updates.
Tertiary Pipelines#
- BJ-CNV v1.2.3: Updated Ginkgo parameter default to diploid. See general updates.
- BJ-VariantAnnotation v2.0.2 and BJ-RNAVariantCalling v1.1.5: See general updates.
Bug Fixes#
- Fixed job clogs on the Arm-architecture compute queues. Monitoring will continue.
1.12 - 5/14/2024#
General Updates#
- Enhanced file upload capabilities.
- Updated documentation with detailed input file information.
- Reorganized and updated testing files for better performance.
Secondary Pipelines#
- BJ-WGS v2.0.4; BJ-DNA-QC v2.0.0: See general updates.
- BJ-WES v1.3.4: Addition of Near Distance Parameter for the Picard HsMetrics module. Replaced the Sentieon HsMetrics algorithm with Picard HsMetrics. See general updates.
- BJ-Expression v1.8.0: Upgraded to MultiQC v1.21. Streamlined the general statistics table by removing "_sampled" displays. Removed the failing Qualimap module, though its stats table remains available. Fixed bug in the HT-Seq module. See general updates.
Tertiary Pipelines#
- VariantAnnotation v2.0.1: Eliminated the failing conversion process from gatk_variants.txt to HDF5 for large files. Removed unnecessary columns from the gatk_variant_totable.txt file. See general updates.
1.11 - 4/25/2024#
General Updates#
- MultiQC v1.21: Upgraded MultiQC to version 1.21 across all pipelines. This enhancement ensures a smoother user experience, especially when handling more than 200 biosamples in a single report.
- VCFeval GIAB Ground Truth: Updated to HG001 GRCh38 v4.2.1 for improved accuracy.
Secondary Pipelines#
- BJ-WGS v2.0.3; BJ-Expression v1.7.8; BJ-WES v1.3.3; BJ-DNA-QC v1.9.8: Updated to leverage Graviton cores for enhanced computational node utilization, alongside standard instances. There have been no changes to tool versions or algorithmic steps.
Bug Fixes#
- Fixed output from BJ-Expression to function seamlessly with BJ-RNAVariantCalling.
1.10 - 1/24/2024#
Secondary Pipelines#
- BJ-WGS v2.0.0, BJ-WES v1.3.0, BJ-DNA-QC v1.9.7
- Metric Standardization: The total reads reported in MultiQC have been modified (BJ-1320) and column names aligned across pipelines. This metric is now taken from the HsMetrics table.
- Update Sentieon Version: Sentieon was updated to its latest version (202308-0) for all pipelines.
- BJ-WGS v2.0.0
- Update Sentieon Model: The WGS pipeline has been updated to use the latest Sentieon WGS DNAscope model.
- 'Total Reads' match: The value in the Project Level set of reads now matches the MultiQC table.
- BJ-WES v1.3.0
- Update Sentieon Model: The WES pipeline has been updated to use the Sentieon WES DNAscope model. Prior versions of the WES pipeline used the WGS model.
- BJ-DNA-QC v1.9.7
- Repaired the rounding of average ploidy in the MultiQC output (BJ-1326). The number now best matches the Ginkgo average ploidy labeled on individual JPGs. Ginkgo applies its own rounding before labeling the image, so in some cases the two values do not match. The most accurate value is in the MultiQC table.
Tertiary Pipelines#
- Variant Annotation v1.2.1
- Updated version which enables running on large projects.
Visualizations#
- Dash 5.2 deployed for all BaseJumper environments. Smoother interactions, faster speeds, and all applications updated to use this engine. The exception is the Circos plotting app.
- Data fetching, particularly for updated metadata tables, provided in the Project Biosamples table, is now accessible to visualization apps.
- RNA Seq App
- Substantial performance updates; the app can now accept projects larger than 200 biosamples
- Fixed error where point selection in PCA submodule did not update subsequent graphics
- Repaired bug where a user-supplied “NA” group category label was treated as “Not Applicable” (N/A, null)
- Biosample selection in the left tab, next to the number of reads per biosample, has been removed. Users can now either 1) move the minimum reads bar on the graphic or 2) Shift+select biosamples in the metadata table to remove them. The metadata table also shows the full sample name.
- Repaired multiple frame bugs in CSS layers of the application.
- Removed the “Update Count Matrix” button, as changes to the application automatically update the counting table.
- Corrected bug where some projects did not have the same number of biosamples in the pipeline and in the RNA-Seq App
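The “NA” label fix in the RNA Seq App list comes down to how missing values are detected when metadata is parsed. The sketch below is illustrative only: it assumes CSV metadata and uses a hypothetical `parse_group_labels` helper, since these notes do not describe the app's actual parser. Only an empty cell is treated as missing, so a literal “NA” survives as a category label.

```python
import csv
import io

def parse_group_labels(csv_text, column):
    """Return the values of `column`, mapping only empty cells to None.

    Hypothetical helper: a literal "NA" is kept as a real category label
    instead of being coerced to a missing value.
    """
    reader = csv.DictReader(io.StringIO(csv_text))
    labels = []
    for row in reader:
        value = row[column].strip()
        labels.append(value if value != "" else None)
    return labels

# Toy metadata: s1 really belongs to a group called "NA"; s3 has no group.
metadata = "biosample,group\ns1,NA\ns2,treated\ns3,\n"
labels = parse_group_labels(metadata, "group")
print(labels)  # ['NA', 'treated', None]
```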
- Variant Filter App
- Filtering by COSMIC ID has been removed
- Circos
- Currently disabled (just this version) to provide enhanced compatibility with updated Plotly engine
Bug Fixes#
- Fixed error that prevented certain projects from being transferred between workspaces
- In some cases, user-provided tokens for BSSH could cause a continuous 'spin' while building the cache of biosamples. This has been resolved so that biosamples are found properly
- Ploidy numbers in the MultiQC table now match the CNV output visualizations.
Hot Fixes#
- Fixed project transfer failure related to resource labels.
1.9 - 12/19/2023#
General Updates#
- Display
- Reduce the clutter of multiple pipeline versions visible at once by creating a dropdown button: This update simplifies the pipeline launching process by consolidating multiple versions behind a dropdown button. This aims to reduce on-screen clutter, allowing users to conveniently select older versions while defaulting to the latest for a more organized workflow.
- Other
- Description to Pipelines: Users can now add descriptions to pipeline launches, aiding in tracking differences between launches.
- Private/Proprietary pipelines: This update enables users to launch pipelines exclusive to their workspace/organization, accessible only to authorized users.
Pipelines#
- ALL Secondary and Tertiary Pipelines:
- BioSample Tagging: This will enable the pipeline dashboard to display how many samples are associated with a pipeline invocation. This will enable us to record compute usage for each pipeline.
- Automated Testing for Secondary pipelines: Introduces automated testing for secondary pipelines, streamlining the testing process.
- BJ-WGS v1.9.5:
- Create summarized visualization/metric for ADO: Generates easily interpretable visualizations and adds a percentage breakdown of ADO for each sample to the all metrics table.
- BJ-WES v1.2.7:
- Add Sentieon gene_summary file to output of WES: Includes the Sentieon gene_summary file in the output of WES.
- Create summarized visualization/metric for ADO: This update provides a summarized visualization for ADO and appends the percentage ADO breakdown for each sample to the all metrics table.
- BJ-DNA-QC v1.9.6 ; BJ-Expression v1.7.7 ; BJ-CNV v1.2.2 ; BJ-RNAVariantCalling v1.1.3 ; BJ-SV v1.2.2:
- General version updates across all pipelines.
Hot Fixes#
- Fix Sample Name Table Display: Resolves the display issue for longer sample names in the sample name table.
- Project Export Failure: Addresses a bug causing project export failures.
1.8 - 11/21/2023#
General Updates#
- Biosample Metadata:
- Bug fix related to data type import and sort filter capabilities BJ-1169: This bug fix ensures a smoother and more efficient handling of biosample metadata, enhancing the accuracy of data organization.
- Enabled bulk sample delete with a delete button CR-822 : This enhancement streamlines the management of biosamples, providing users with a convenient and time-saving option for sample deletion.
- Enabled selection of multiple samples with shift key CR-825 : Users can now select multiple biosamples simultaneously by utilizing the shift key, enhancing the overall user experience, simplifying the process of managing biosample selections for various operations.
- Enabling multiple sizing filters for biosample (metadata) tables, e.g., bytes, megabytes, etc. BJ-1322: This feature offers greater flexibility in customizing the display of biosample information, catering to diverse user preferences and data types.
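A size filter that accepts several units has to normalise them before comparing values. The sketch below is one way to do that, assuming decimal (1000-based) units; `to_bytes` and `filter_by_min_size` are hypothetical helpers, and BaseJumper's actual unit definitions are not stated in these notes.

```python
# Assumed decimal multipliers; a binary (1024-based) scheme would work the same way.
UNIT_FACTORS = {"B": 1, "KB": 10**3, "MB": 10**6, "GB": 10**9, "TB": 10**12}

def to_bytes(size_text):
    """Convert a string such as '1.5 MB' to a byte count (hypothetical helper)."""
    number, unit = size_text.split()
    return int(float(number) * UNIT_FACTORS[unit.upper()])

def filter_by_min_size(biosamples, min_size_text):
    """Keep biosamples whose size in bytes is at least the given threshold."""
    threshold = to_bytes(min_size_text)
    return [name for name, size in biosamples.items() if size >= threshold]

# Toy table: biosample name -> file size in bytes.
sizes = {"s1": 750_000, "s2": 2_000_000, "s3": 1_500_000}
print(filter_by_min_size(sizes, "1.5 MB"))  # ['s2', 's3']
```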
- Other
- Deletion notification bug CR-797, where deleting a large number of biosamples resulted in an error email notification: This fix ensures that the notification system accurately reflects the status of biosample deletions, preventing unnecessary alerts.
- Generation of automated unit testing report BJ-1275: This enhancement ensures that every aspect of the codebase is thoroughly tested, providing developers with detailed reports on BaseJumper's performance and reliability.
- Enabling conversion to Strapi from makedocs for BJ documentation BJ-1267: This update not only streamlines the documentation process but also enhances the overall user experience by making information more organized, searchable, and visually appealing.
Pipelines#
- ALL Secondary and Tertiary Pipelines:
- MultiQC fix for consistency in table orders CR-693 : Users can now rely on consistent table orders, streamlining the interpretation of complex data and facilitating more accurate and reproducible analyses.
- Pipeline cost tagging CR-770: This update allows developers to tag and track costs associated with different pipeline runs, providing valuable insights into resource utilization. Improved cost visibility empowers users to make informed decisions and helps developers optimize resource allocation, ultimately reducing operational overhead.
- BJ-WGS v1.9.4:
- Version 1.9.4 of BJ-WGS introduces several enhancements, including the removal of empty columns in the MultiQC report (CR-880). This refinement streamlines data presentation, making it more concise and visually appealing. Additionally, a bug fix related to the parsing of low reads (BJ-1324) ensures the accuracy of data interpretation, contributing to a more reliable and precise whole-genome sequencing experience.
- BJ-WES v1.2.6:
- In version 1.2.6 of BJ-WES, the removal of empty columns in the MultiQC report (CR-880) enhances data clarity. Furthermore, an update ensures that WES is properly cited in output files (BJ-1172), addressing important considerations for data reproducibility and compliance with best practices in genomic research.
- BJ-DNA-QC v1.9.5 ; BJ-Expression v1.7.5 ; BJ-CNV v1.2.1 ; BJ-RNAVariantCalling v1.1.2 ; BJ-SV v1.2.1:
- BJ-DNA-QC v1.9.5, BJ-Expression v1.7.5, BJ-CNV v1.2.1, BJ-RNAVariantCalling v1.1.2, BJ-SV v1.2.1 brings version updates, specifically focusing on cost tagging.
Hot Fixes#
- Related to CR-624, Projects cloning in when filtering by size: This hot fix eliminates potential frustrations associated with project management, ensuring that projects behave as expected and simplifying the user experience.
- Fix CR-783 bug related to sample renaming in project creation: This hot fix ensures that the sample naming process is error-free, preventing potential data misinterpretation and maintaining data integrity from project inception.
Current Limitations#
- Tertiary pipelines currently do not support GRCm38 (CR-873). While we acknowledge this limitation, our dedicated team is actively working to overcome this constraint.
1.7 Bug Release - 11/7/2023#
- Biosample Metadata: Users can also add biosample metadata columns directly in the biosample table, updating the project accordingly. Additional updates include user enabled:
- Export biosample metadata table
- Import metadata
- Search/Query biosample metadata tables (CR-787)
- Sort/Filter new metadata columns (CR-851/CR-803)
- Project Creation (CR-626): Fix relating to biosample search/filter results saving
- Updating BJ Version (CR-833): The version at the bottom of the page is now updated and no longer displays “Head”
- Runner update to address missing samples and transfer failures for larger projects (CR-742, CR-703, CR-622)
- All Secondary pipelines
- Enabling low input reads in the MultiQC output, resolving sample dropout issue CR-764. Samples with low read counts that would otherwise be excluded are now included in the MultiQC report with a flag warning of low reads.
- BJ-WES v1.2.5
- VCFEval Results in MultiQC. Related to issue CR-855 in which these results were missing.
- Correct Pipeline Citation In MultiQC Report (CR-590). Report for WES pipeline cited WGS. This was corrected to cite the WES pipeline.
- BJ-Expression v1.7.4
- Removal of Qualimap Percent Stats table. Issue BJ-1258: These metrics are redundant and contained in the Overall Stats Table. This goes along with efforts to clean up the MultiQC page and create a clear understanding of what we share with customers.
1.6 - 10/10/2023#
- Biosample Metadata:
- Project Level: Users can now add biosample metadata columns at the project level when creating a new project.
- Biosample Table: Users can also add biosample metadata columns directly in the biosample table, updating the project accordingly. Users can also edit biosample metadata directly within the table. Editable fields are highlighted with a cyan rectangle. After making changes, the "Save changes" button becomes enabled for quick updates.
- Dynamic Column Selection:
- Introducing a dropdown feature for dynamic column selection in the biosamples table. Users can easily toggle between selected columns.
- Biosample Table Lazy Loading:
- Enhanced performance with the ability to work efficiently with larger tables, even when dealing with thousands of biosamples. Data is loaded in manageable chunks.
- Export and Import:
- Export to CSV: Users can export the biosamples table to a CSV file. Only the selected columns will be included in the export.
- Import Data: Import CSV or TSV files into the table with dynamic column name checks against existing columns. If a column from the import file doesn't match an existing one, a user-friendly guess is made using edit distance. The user is prompted to specify the correct column and can add it as a new column. Additionally, users need to specify the data type for the new column, similar to the process for adding columns at the project or biosample level.
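The edit-distance guess described for Import Data can be sketched as follows. This is an illustration under assumptions, not BaseJumper's implementation: it uses plain Levenshtein distance with case-insensitive comparison, and `guess_column` is a hypothetical helper name.

```python
def edit_distance(a, b):
    """Levenshtein distance between two strings (dynamic programming)."""
    previous = list(range(len(b) + 1))
    for i, char_a in enumerate(a, start=1):
        current = [i]
        for j, char_b in enumerate(b, start=1):
            cost = 0 if char_a == char_b else 1
            current.append(min(previous[j] + 1,          # deletion
                               current[j - 1] + 1,       # insertion
                               previous[j - 1] + cost))  # substitution
        previous = current
    return previous[-1]

def guess_column(imported_name, existing_columns):
    """Return the existing column closest to `imported_name` (case-insensitive)."""
    return min(existing_columns,
               key=lambda col: edit_distance(imported_name.lower(), col.lower()))

# Toy example: a misspelled header in the import file.
existing = ["biosample_name", "tissue_type", "collection_date"]
print(guess_column("Tisue_Type", existing))  # tissue_type
```

In practice the guess would only be offered as a suggestion, since, as the notes state, the user is prompted to confirm the column or add it as a new one.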
- UI Enhancements:
- Implementing small UI changes and cleanup for a more polished user experience.
- Project Creation Lot ID (CR-699): When creating a project, users have the flexibility to either select options from the dropdown or enter their own lot ID. The provided lot ID must adhere to the BioSkryb lot ID format.
- Sample Renaming on Project Creation (CR-747):
- Improving the logic for sample renaming, particularly when importing data from S3. This update ensures better handling of various renaming scenarios during project creation.
- Project Deletion Email:
- Users will now receive an email when projects or biosamples are deleted from their workspace, to ensure that users stay informed regarding any changes to their workspace.
- BJ-WGS v1.9.3 Pipeline
- Fix snpeff and vcfeval to use vcf that is generated from gvcf by excluding GT=0/0:
- Resolves an issue where snpeff (for annotating genetic variants) and vcfeval (from the RTG suite, used for evaluating the accuracy of VCF files against a reference) were using gVCFs that included "0/0" genotypes, resulting in resource issues.
- Fix connection timeout issue that happens sporadically by bumping aws client max connection to 20:
- Increases the AWS client connection limit, addressing occasional connection timeouts and ensuring smoother operation, especially when handling large datasets or performing intensive tasks in cloud environments.
- Update MultiQC to v1.16
- Upgrades MultiQC, a bioinformatics tool that aggregates results from various tools into one consolidated report. This upgrade addresses some of the bugs seen in the previous version of the report, e.g., duplicate descriptions.
- Update SnpEff to 5.1:
- Upgrades SnpEff, a tool for annotating genetic variants in VCF files. Version 5.1 includes bug fixes for the tool.
- Update resources for certain processes to use resources for the entire instance and dynamically increase resources upon failed tasks:
- Adjust resource management, allowing more efficient use of computational power. By dynamically reallocating resources upon task failure, the system ensures tasks have the best chance of successful completion, reducing the chances of job failures due to resource constraints.
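The GT=0/0 exclusion described for BJ-WGS v1.9.3 can be illustrated with a small filter. This is a sketch only: the notes do not say which tool the pipeline uses for this step, and `drop_hom_ref` is a hypothetical function that assumes an uncompressed, single-sample VCF.

```python
def drop_hom_ref(vcf_lines):
    """Yield header lines and records whose genotype is not homozygous reference.

    Assumes a single-sample VCF: column 9 is FORMAT and column 10 the sample.
    """
    for line in vcf_lines:
        if line.startswith("#"):
            yield line
            continue
        fields = line.rstrip("\n").split("\t")
        format_keys = fields[8].split(":")
        sample_values = fields[9].split(":")
        genotype = dict(zip(format_keys, sample_values)).get("GT", "")
        if genotype not in ("0/0", "0|0"):
            yield line

# Toy gVCF: one reference block (0/0) and one heterozygous variant (0/1).
gvcf = [
    "##fileformat=VCFv4.2\n",
    "#CHROM\tPOS\tID\tREF\tALT\tQUAL\tFILTER\tINFO\tFORMAT\tsample1\n",
    "chr1\t100\t.\tA\t<NON_REF>\t.\t.\tEND=200\tGT:DP\t0/0:30\n",
    "chr1\t201\t.\tC\tT,<NON_REF>\t50\t.\t.\tGT:DP\t0/1:28\n",
]
kept = list(drop_hom_ref(gvcf))
print(len(kept))  # 3 (two header lines plus the heterozygous record)
```

Because reference blocks typically make up most of a gVCF, dropping them before annotation and evaluation greatly reduces the number of records the downstream tools have to process.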
- BJ-WES v1.2.4 Pipeline
- Fix snpeff and vcfeval to use vcf that is generated from gvcf by excluding GT=0/0:
- Resolves an issue where snpeff (for annotating genetic variants) and vcfeval (from the RTG suite, used for evaluating the accuracy of VCF files against a reference) were using gVCFs that included "0/0" genotypes, resulting in resource issues.
- Update to include chrM variant calling:
- Enhances the pipeline's capability by introducing variant calling for the mitochondrial chromosome (chrM), allowing for more comprehensive genetic analyses that include mitochondrial DNA, which can be crucial for certain studies.
- Update MultiQC, SnpEff, and resources:
- As described in the BJ-WGS updates, these enhance the functionality and efficiency of the pipeline in similar ways.
- Introduce ADO toggle for WES launch and output in MultiQC:
- This enhancement adds an "ADO toggle" option when launching the Whole Exome Sequencing (WES) pipeline.
- The inclusion of the ADO (Allelic Drop Out) data in the MultiQC report offers users a more comprehensive analysis, allowing them to easily assess and manage allelic dropout instances, a phenomenon where one allele may not be detected in a heterozygous sample.
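Allelic dropout, as described for the ADO toggle, can be quantified as the share of known heterozygous sites at which the sample shows only one allele. The sketch below is one possible formulation under stated assumptions: the truth set of heterozygous sites (for example, from a reference cell line) and the `ado_rate` helper are illustrative, and the pipeline's exact calculation is not given in these notes.

```python
def ado_rate(truth_het_sites, sample_genotypes):
    """Fraction of known heterozygous sites called homozygous in the sample.

    truth_het_sites: iterable of (chrom, pos) known to be heterozygous.
    sample_genotypes: dict mapping (chrom, pos) to a genotype such as "0/1".
    Sites without a call in the sample are skipped (an assumption of this sketch).
    """
    evaluated = 0
    dropped = 0
    for site in truth_het_sites:
        genotype = sample_genotypes.get(site)
        if genotype is None:
            continue
        alleles = genotype.replace("|", "/").split("/")
        evaluated += 1
        if len(set(alleles)) == 1:  # both alleles identical: one allele was lost
            dropped += 1
    return dropped / evaluated if evaluated else float("nan")

# Toy data: four truth-het sites, three of them called in the sample.
truth = [("chr1", 100), ("chr1", 250), ("chr2", 40), ("chr2", 900)]
calls = {("chr1", 100): "0/1", ("chr1", 250): "1/1", ("chr2", 40): "0/0"}
rate = ado_rate(truth, calls)
print(f"{rate:.1%}")  # 66.7%
```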
- BJ-DNA-QC v1.9.3
- Fix connection timeout issue:
- Same adjustment as in the WGS pipeline, optimizing the DNA Quality Control (QC) pipeline's cloud operations.
- Update MultiQC to v1.16:
- Similar upgrade as above, streamlining the DNA-QC reporting process.
- Add module to calculate MAPD Score:
- Incorporates a new module for the MAPD (Median of the Absolute values of all Pairwise Differences) score, a metric used in DNA-QC. MAPD provides a measure of uniformity in the library amplification.
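As an illustration of the MAPD metric, the sketch below uses a common formulation: the median of the absolute differences between log2 copy ratios of adjacent bins. The input format (normalized, strictly positive bin-level ratios) and the `mapd` function are assumptions; the module's actual inputs and binning are not described in these notes.

```python
import math
from statistics import median

def mapd(copy_ratios):
    """Median absolute difference between log2 copy ratios of adjacent bins."""
    log_ratios = [math.log2(r) for r in copy_ratios]
    diffs = [abs(b - a) for a, b in zip(log_ratios, log_ratios[1:])]
    return median(diffs)

# Toy data: normalized bin counts along a chromosome (must be > 0).
uniform = [1.0, 1.0, 2.0, 2.0, 1.0, 1.0]  # one clean copy-number step
noisy = [1.0, 2.0, 0.5, 2.0, 1.0, 4.0]    # large bin-to-bin swings
print(mapd(uniform))  # 0.0
print(mapd(noisy))    # 2.0
```

A lower value indicates more uniform amplification. Because the metric is a median, a genuine copy-number step (as in `uniform`) barely moves it, while bin-to-bin noise raises it.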
- Add biosamples missing from input in MultiQC report:
- Ensures inclusion of all samples, even those initially dropped due to low read counts. This modification provides a more exhaustive view of the dataset, allowing users to see the entirety of their samples, irrespective of read count constraints.
- Update fastp to allow users to provide adapter sequences to trim:
- Refines fastp, a tool for preprocessing sequence reads. By allowing users to define adapter sequences, they can customize trimming operations for their specific experimental setups, leading to more accurate read preparations.
- Update dynamically increase resources upon failed tasks:
- Automates resource adjustments in the DNA-QC pipeline, ensuring tasks get the necessary computational power for successful completion.
- BJ-Expression v1.7.3 Pipeline
- Issue with Pipeline Failures (CR-691):
- A critical bug, directly linked to CR-691, was identified, causing pipeline failures when subsampling was disabled. To address this, the “create_master” stats logic was updated, ensuring it adeptly manages scenarios where the subsampling option is skipped. Additionally, the "master_stats" script was modified to handle outputs, even when they are not subsampled, maintaining pipeline integrity and stability.
- Issue with Utilities Files Consolidation (CR-708):
- A notable issue was encountered related to CR-708, originating from the consolidation of utilities files for RNASeq to use the same repository as DNA-QC and WGS. The fix involved resolving discrepancies from detached versions of the utilities file, exclusive to RNA, in versions 1.6.1 and 1.6.2. These files included fastp for filtering and allowed for the input of adapter sequences for trimming. So that the consolidated fastp (which doesn't accept adapter sequences but detects them) can be used effectively, the adjustments also enable versions 1.6.1 and 1.6.2 to accept updated input.csv files.
- Issue with Biosample Name Cleanup:
- An issue was addressed concerning the cleanup of biosample names. A comprehensive fix was implemented to rectify biosample name inconsistencies and safeguard against further related issues, ensuring accurate and consistent data handling across the platform.
- Removal and Update of BJ-Expression Versions:
- Versions 1.7.1, 1.6.1, and 1.6.2 of BJ-Expression have been removed. Furthermore, the expression subsampling default has been updated to 200K reads to align with the standard set in version 1.6.1.
- Additions and Adjustments to BJ-Expression 1.7.3:
- Several updates have been introduced in this version, including the addition of a gene_body reference for the mouse genome GRCm39, which facilitates the generation of gene_body coverage plots for mouse data.
- A dynamic range column has also been introduced to the Overall stats, providing an enhanced layer of data interpretation for users.
- Lastly, a typographical error in Overall stats has been corrected, amending the “Prop mithochondrion” column to “Prop mitochondrion”, ensuring clarity and accuracy in data representation (CR-737).
- Resolution of Missing Tertiary Pipelines Issue (CR-717):
- Issue CR-717, which removed tertiary pipelines in Expression pipeline version 1.7.2, has been fixed, restoring these pipelines.
- Fix connection timeout issue:
- BaseJumper Docs
- This update includes comprehensive details on pipeline steps and workflows, covering everything from counting algorithms and normalization to output and deliverables. Additionally, we've made it easier to access essential information by including instructions on how to download counts tables and other expression documents.
- Visualization apps
- Minor performance enhancements
- Updated dependencies
1.5 - 9/13/2023#
- Updated Project Creation interface for cleaner user experience.
- Added functionality to rename incorrectly named files during the Project Creation process.
- Kraken2 has been enabled once again for performing quality checks on contamination.
- Updated labels for capture methods in WES pipeline.
- Added significant digits to proportion intron/exon/etc in MultiQC output.
- Added documentation to the User Manual about the demo datasets.
- Updated the MultiQC view to show CNV metrics (MAD, general ploidy, and number of segments).
- Updated the MultiQC Overall stats table for RNA to include essential quality control metrics.
- Update to the number of significant digits in MultiQC tab for RNA.
- Performance enhancements for workspaces with many projects.
- Added functionality to run BJ-DNA-QC and BJ-WGS on mouse samples.
- Performance enhancements for WES to enable analysis on large sample sets.
- Added parameter to VCFeval feature to allow users to modify minimum depth cutoffs.
- Updates to VCFeval feature to report metrics like accuracy, specificity, true negatives.
- Update to BJ-WGS for user-enabled ADO Benchmarking.
- Update to BJ-WGS MultiQC to report ADO Benchmarking.
- Update to BJ-WGS to enable user selection of variant caller.
- Update to BJ-WGS variant caller output of ChrM variants.
- Minor bug fixes.
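The VCFeval metrics listed for this release (accuracy, specificity, true negatives) follow from confusion-matrix counts using the standard definitions. The sketch below shows those formulas only. How the pipeline counts true negatives (for example, which positions pass the minimum depth cutoff) is not stated in these notes, and the counts and the `confusion_metrics` helper are made up for illustration.

```python
def confusion_metrics(tp, fp, tn, fn):
    """Standard classification metrics from confusion-matrix counts."""
    return {
        "accuracy": (tp + tn) / (tp + fp + tn + fn),
        "specificity": tn / (tn + fp),
        "sensitivity": tp / (tp + fn),
        "precision": tp / (tp + fp),
    }

# Made-up counts for illustration only.
metrics = confusion_metrics(tp=90, fp=10, tn=880, fn=20)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```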
1.4#
- Move projects between workspaces to enable users to organize projects across multiple workspaces.
- Added T's & C's to BaseJumper menu to enable users to review after accepting.
1.3#
- The ability to delete individual biosamples to enable users to have more control over managing their data.
- Performance improvements to enable project creation with up to 1000 samples.
- URL formatting to enable users to more easily share URLs with other users in the same workspace.
- Terms and conditions dialog box for new users and existing users who have not yet accepted them.
- Low bandwidth warning dialog box to warn users when low bandwidth might be impacting the performance of BaseJumper (most important for visualization apps).
- Formatting improvements to the account creation process to provide a more consistent user experience.
- Bug fixes
- Pipelines queued after project creation were not being submitted.
- Memory errors when exporting large files.
1.2#
- A new visualization tool for multi-omic analytics (Circos plots) to show structural variants, tandem repeats, single nucleotide variants, insertions, and deletions on a single plot.
- A new visualization tool for droplet-based scRNA-seq analytics (UMAP plots) to show cell clustering patterns.
- An updated CNV visualization tool to improve user experience to select samples or subsets of samples more easily.
- Additional reference files to enable analysis of mouse samples.
- Performance improvements to enable primary and secondary analysis of large sample sets with up to 400 samples.
- The ability to cancel and resume jobs.
- The ability to rename projects.
- The ability to add samples to existing projects.
- A redesigned back-end to improve stability, maintainability, and enhanced tech support.
- Multiple interface updates to provide a better user experience.
- Temporary removal of the fusion tertiary pipeline.
- Export feature bug fixes.
- Pipeline Enhancements for BJ-DNA-QC (v1.8.0), BJ-WGS (v1.7.0), and BJ-WES (v1.1.0)
- Performance improvements
- Minor bug fixes that do not impact outputs
- Pipeline Enhancements for BJ-Expression (v1.7.0)
- Support for the GRCm38 and GRCm39 mouse references
- Minor bug fixes.
1.1#
Release Date: Nov 18, 2022
- Minor bug fixes.
- Speed enhancements for BJ-WGS and BJ-Expression pipelines
1.0#
Release Date: Oct 25, 2022
- Initial release