Inconsistency with biobank table (pheno & gcc project)
problem:
The bbmri.xml describes the table as follows :
<entity name="Biobank" extends="Panel">
<description>Describes one biobank cohort</description>
<field name="Acronym" nillable="true" description="Acronym for this
biobank" />
<field name="Institutes" nillable="true" type="mref"
xref_entity="Institute" mref_name="BiobankInstitute"
description="Institute that are custodian of this biobank" />
<field name="Publications" type="mref" xref_entity="Publication"
description="Publications for this biobank" />
<field name="Category" type="xref" xref_entity="OntologyTerm" />
<field name="Type" type="enum" enum_options="[core,support,other]" />
<field name="Contacts" type="mref" xref_entity="Contact"
mref_name="BiobankContact" description="Contact for this biobank" />
<field name="Topics" type="mref" xref_entity="OntologyTerm" />
<field name="Materials" type="mref" xref_entity="OntologyTerm" />
<field name="Size" nillable="true" />
<field name="LastUpdate" type="date" auto="true" />
<field name="Description" type="text" nillable="true" />
</entity>
Similarly the BBMRIupdateDatabase tries to create these fields. But
the created table in DB is missing the fields:
Institutes (mref), Publications(mref), Contacts (mref), Topics
(mref), Materials (mref).
Is this how was supposed to work? Because it causes a problem in mysql
input from the previous scheme (in pheno).
Approach 1)Solution
- create new table biobank from pheno project as follows:
mysql> create table biobank_new select Acronym, Category, Type, Size, LastUpdate , Description, id from biobank ;
(only fields that exist in biobank in gcc project)
- Dump and insert new table in gcc project.
- $ mysqldump5 -u root -p bbmri_old biobank_new > biobank_new.mysql
- Edit biobank_new.mysql and replace biobank_new (table) with biobank
-
- mysql5 -u root -p bbmri < biobank_new.mysql
Approach 2)select query retrieving records in molgenis form
We need data inside INNER/LEFT JOIN tables
select count(*) as num_rows FROM Biobank INNER JOIN
Panel ON (Biobank.id = Panel.id) INNER JOIN
ObservationTarget ON (Biobank.id = ObservationTarget.id) INNER JOIN
ObservationElement ON (Biobank.id = ObservationElement.id) LEFT JOIN
Investigation AS xref_Investigation ON xref_Investigation.id = ObservationElement .Investigation LEFT JOIN
OntologyTerm AS xref_ontologyReference ON xref_ontologyReference.id = ObservationElement.ontologyReference LEFT JOIN
OntologyTerm AS xref_Species ON xref_Species.id = Panel.Species LEFT JOIN
OntologyTerm AS xref_PanelType ON xref_PanelType.id = Panel.PanelType LEFT JOIN
OntologyTerm AS xref_Category ON xref_Category.id = Biobank.Category WHERE ObservationElement.__Type = 'Biobank'
Approach 3) Fix the xls data file according to the consistencies between these entities.
xls colums :
Name Type Acronym category_term topics_term institutes_name contacts_lastName size materials_term GWA data n = GWA platform GWA comments description
try to import as it is :
Need to removes characters like , ( ) ; + . Comma and parenthesis was replaced by or ; by some spaces and plus by plus. OntologyTerm? term was replaced by name .
After uploading : error : Column 'PanelType' cannot be null
two new columns added :
__Type | PanelType |
OntologyTerm | PanelType? |
still error with Column 'PanelType' cannot be null
`After these changes import worked :
Added new tabs: ObservationElement (column name), investigation (column name), New columns at ObservedValue : investigation_name feature_name target_name, and new columns at ObservedFeature : investigation_name feature_name target_name
name name name, Biobank also needed new columns : paneltype_name category_name investigation_name
less that 50 year less that 50 year name .
The xls could not be imported with characters like comma (,) , parenthesis ( ()) , + ,