== Feedback 2011_01_16 ==
 * Put multiple QTLs in a single plot in the QTLFinder plugin, instead of plotting each one separately. This is essential for comparing the results across the different datasets.
 * Finish and upload the first dataset, at least as far as possible; that is, curate and check the annotations. BLASTing the Wageningen probes will come later.
 * Get WormQTL running on the new server, so that from then on we operate in the final production environment.
 * Investigate integration of the jBrowse visualization and, if possible, work it out.

== Feedback QTL finder ==
 1. The find tab could well be the opening screen, so that users can start typing right away. DONE
 2. Having multiple probes is not convenient at the moment; it would help if there were a default for this as well, because it is currently unclear what it means. DONE
 3. If you search on a common gene name, you also get hits such as "target of gene xxx" because that text is in the description. This is not convenient, since you get a very long list of probes and cannot tell which one belongs to the gene you are looking for. DONE
 4. The QTL picture is now far too small; it should at least be readable. Being able to click to enlarge it is good. DIFFICULT - plots are currently 'generic'
 5. An overview function would also be useful, i.e. multiple QTL profiles in one figure. That creates more value. DIFFICULT - plots are currently 'generic'
 6. The QTL pictures should be plotted against bp rather than against markers, so that we can properly compare all experiments and different populations. DIFFICULT - plots are currently 'generic'
 7. The position of the gene could be indicated on the x-axis of the picture, so that one can immediately see whether a QTL is cis or trans. DONE
 8. The bp of the highest peak (per chromosome) would also be useful.

== Dataset tagging ==
We want a way to apply multiple 'tags' to datasets (matrices), e.g. "genotypes", "eQTL results", "Rockman_et_al_2010", "LOD_scores", "raw_data", etc.
A suitable field would be the OntologyReference, but this is only an XREF. It is inherited from !InvestigationElement through !ProtocolApplication to Data. If this were an MREF, it would work, but this is quite a big change because it affects all !InvestigationElements. An alternative is to add a new field to just Data.

== Stories Yang ==
|| '''Story''' || '''How to demo''' ||
|| 1. as a researcher I want to see all QTLs from a specific experiment || (1) user can choose the experiment name, e.g. GxT study, Gld-1 RNAi expression. (2) user can choose mapping method, e.g. single marker mapping or full model mapping. (3) user can view QTL dot plots: X axis: QTL position; Y axis: gene position; each dot represents a QTL ||
|| 2. as a researcher I want to see QTLs of some selected genes from a specific experiment || (1) user can paste a list of gene names into text field. (2) user can choose the experiment name, e.g. GxT study, Gld-1 RNAi expression. (3) user can choose mapping method, e.g. single marker mapping or full model mapping. (4) user can view QTL plots: X axis: marker position; Y axis: QTL ||
|| 3. as a researcher I want to see QTLs of some selected genes from the selected experiment data sets || (1) user can paste a list of gene names into text field. (2) user can choose the experiment names, e.g. GxT study, Gld-1 RNAi expression. (3) user can choose mapping method, e.g. single marker mapping or full model mapping. (4) user can view QTL plots: X axis: marker position; Y axis: QTL ||
|| 4. as a researcher I want to run "Regenotyper" to detect potentially wrongly labeled samples in a selected experiment data set || (1) user can choose the experiment names, e.g. GxT study, Gld-1 RNAi expression. (2) user can view WLS score plots: X axis: samples; Y axis: WLS score ||
|| 5. as a researcher I want to input genes from one pathway and show correlations in a graph || (1) user can paste a list of gene names into text field. (2) user can choose a correlation matrix.
(3) user can view genes in circular graph with each edge showing the correlation strength by thickness ||
|| 6. as a researcher I want to input genes from one pathway and show QTL profiles in a graph || (1) user can paste a list of gene names into text field. (2) user can choose a specific experiment data set. (3) user can view genes in circular graph with QTL profiles next to each gene ||
|| 7. as a researcher I want to input genes from one pathway and compare correlations between two conditions in a graph || (1) user can paste a list of gene names into text field. (2) user can choose two or more experiment data sets. (3) user can view genes from each condition in circular graph with correlation strength by thickness. (4) user can view genes between conditions in circular graph with correlation strength changes by thickness ||
|| 8. as a researcher I want to input genes and visualize differential expression in a graph || (1) user can paste a list of gene names into text field. (2) user can choose an experiment data set with multiple conditions. (3) user can choose method to compute DE genes. (4) user can view genes in directed circular graph with gradient color showing DE significance/fold changes ||
|| 9. Similar story as 5-8, with input being a "key word" from a biological process or gene function etc. || same as story 5-8 ||
|| 10. as a researcher I want to input genes, infer causal relationships, and visualize them in a graph || (1) user can paste a list of gene names into text field. (2) user can choose an experiment data set. (3) user can view genes in directed circular graph with inferred causal probability displayed on the arrows ||

== Datamodel enhancements for Panacea ==
Genes [List] '''-> xQTL entity: Gene'''[[BR]]
Contains all the identified genes in a certain Wormbase release. It is a list with the Wormbase ID (WBgene00000001) as the main identifier.
Furthermore it will contain the sequence name, common name, splice variants, start position and end position, and (in a later version) can include information like intron, exon and promoter position.
|| Needed field || Example value || xQTL field ||
|| Wormbase ID || WBgene00000912 || name ||
|| Common name || daf-16 || Use: 'Alternative identifiers', MREF to 'AlternateId' table (NEW) ||
|| Sequence name || R13H8.1 || Use: 'Alternative identifiers', MREF to 'AlternateId' table (NEW) ||
|| Splice variant || R13H8.1a || Use: XREF in 'Transcript' table pointing to 'Gene' (NEW) ||
|| Chromosome || I || chromosome_name ||
|| Start Position || 10750332 || bpstart ||
|| End Position || 10776689 || bpend ||

Platforms [List] '''-> xQTL entity: !OntologyTerm'''[[BR]]
List with a micro-array platform identifier and a description of the micro-array platform.
|| Needed field || Example value || xQTL field ||
|| PlatformID || WUR_Agilent || name ||
|| Info || 4*44K array || description ||

Platform info [List] '''-> xQTL entity: Probe'''[[BR]]
This is a list per micro-array platform that contains all the info about the spots on the micro-array: Spot ID, Gene ID, splice variant, probe sequence, probe position on the genome, etc.
|| Needed field || Example value || xQTL field ||
|| Spot ID || WUR_AGI_00001 || name ||
|| Wormbase ID || WBgene00000912 || Use: 'ReportsFor' XREF pointing to a 'Gene' (NEW) ||
|| Splice variant || R13H8.1a || Use: the existing 'Transcript' to 'Gene' XREF ||
|| Probe sequence || || seq ||
|| Probe Chromosome || I || chromosome_name ||
|| Probe position || || bpstart ||
|| Unambiguous || || mismatch ||

Samples [List] '''-> xQTL entity: Sample'''[[BR]]
List of samples: Sample ID, genotype, experiment.
Sample properties == genotype matrix OR phenotype matrix
|| Needed field || Example value || xQTL field ||
|| Sample ID || Sample000001 || name ||
|| Genotype || WN001 || Use: a 'Data' matrix of Sample x DerivedTrait instead.
For example: Measurement(name=Genotype), ObservedValue(value=WN001, feature=Genotype, target=Sample000001) ||
|| Experiment || TEMP16_EXP || Use: a 'Data' matrix, this is the name of the matrix. ||
|| Remarks || || description ||

Experiments [List] '''-> xQTL entity: OntologyTerm'''[[BR]]
List of experiments: Experiment ID, what was measured, conditions, etc. Experiments = ProtocolApplications (== Data). Repeat per sample!
|| Needed field || Example value || xQTL field ||
|| Experiment ID || TEMP16_EXP || Use: Name of a 'Data' matrix (= 'ProtocolApplication') ||
|| Measured || Gene expression || Use: Tags of a 'Data' matrix (= 'ProtocolApplication') ||
|| Platform || WSU || Use: matrix data, phenotype of each sample in the set ||
|| Growing temperature || 16 || Use: matrix data, phenotype of each sample in the set ||
|| Stage || L4 || Use: matrix data, phenotype of each sample in the set ||
|| Hours || || Use: matrix data, phenotype of each sample in the set ||
|| Treatment || No || Use: matrix data, phenotype of each sample in the set ||
|| Remarks || || Use: Description of a 'Data' matrix (= 'ProtocolApplication') ||

Projects [List] '''-> xQTL entity: OntologyTerm'''[[BR]]
List to link experiments together, because multiple experiments under different conditions can belong to one project. Might also be a way to give access to the different groups of data.
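
One way to picture this linking is with tags on data matrices, as in the Experiments table above. The sketch below is only an illustration under assumptions: the `datasets` dictionary and the helper `datasets_with_tag` are hypothetical and not part of xQTL, and `DATA_TEMP24`, `TEMP24_EXP`, `DATA_OTHER` and `OTHER_PROJECT` are invented names; the remaining IDs are example values from this page.

```python
# Hypothetical tag store: data matrix name -> set of tags.
# DATA_TEMP24, TEMP24_EXP, DATA_OTHER and OTHER_PROJECT are invented here
# to show a second experiment and an unrelated project.
datasets = {
    "DATA_TEMP16": {"GROW_TEMP", "TEMP16_EXP", "Gene expression"},
    "DATA_TEMP24": {"GROW_TEMP", "TEMP24_EXP", "Gene expression"},
    "DATA_OTHER": {"OTHER_PROJECT", "Genotypes"},
}


def datasets_with_tag(datasets, tag):
    """Return the sorted names of all data matrices carrying the given tag."""
    return sorted(name for name, tags in datasets.items() if tag in tags)


# A project tag bundles several experiments; an experiment tag selects one.
project_sets = datasets_with_tag(datasets, "GROW_TEMP")
experiment_sets = datasets_with_tag(datasets, "TEMP16_EXP")
print(project_sets)
print(experiment_sets)
```

With this shape a project is nothing more than a shared tag, so no extra entity is needed to bundle experiments.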
Suggested solution for future:
 * remove InvestigationElement (should become ownership groups)
 * make mref Investigation->ProtocolApplication to bundle data sets into one project/investigation (= organization of research)
 * use ProtocolApplication.previousStep to show how the data flowed from raw data to results
 * use row level security groups to secure ObservationElement and/or ProtocolApplication/Data (= organization of ownership)

Current solution:
 * make 'experimentId' and 'projectid' just phenotypes or tags

|| Needed field || Example value || xQTL field ||
|| Project ID || GROW_TEMP || Use: Tags of a 'Data' matrix (= 'ProtocolApplication') ||
|| Experiment ID || TEMP16_EXP || Use: Tags of a 'Data' matrix (= 'ProtocolApplication') ||
|| Publications || Li et al. 2006 || Use: 'Publication' table ||
|| Remarks || || Use: Description of the 'tag' used (OntologyTerm) for Project/Experiment ||

Datasets [List] '''-> xQTL entity: Data'''[[BR]]
List of datasets. Contains information such as which experiment a dataset belongs to, normalization, mapping models, etc.
|| Needed field || Example value || xQTL field ||
|| Dataset ID || DATA_TEMP16 || Use: Name field ||
|| Type || Gene expression || Use: Tags field ||
|| Unit || Log 2 intensities || Use: Tags field ||
|| Processed || Normalized || Use: Tags field ||
|| Experiment ID || TEMP16_EXP || Use: Tags field ||
|| Remarks || || Use: Description field ||

Polymorphisms [List] '''-> xQTL entity: SNP (NEW)'''[[BR]]
List of polymorphisms, probably only SNPs at first, although up to 500 deleted genes are known between CB and N2. The position and base pair change, probable AA change and gene features like intron, exon, promoter etc. can be included. Moreover, it should indicate which parental strains/wild isolates contain which base pair at the SNP position. Sequences of a dozen wild isolates will become available in 2012.
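
As a rough illustration of the strain-to-base-pair requirement, a SNP record could carry a mapping from each base to the strains that have it. This is a sketch under assumptions, not the actual xQTL data model: the dictionary layout and the helper `allele_of` are hypothetical, and `UNKNOWN_STRAIN` is an invented name; the other values are the example values used on this page.

```python
# Hypothetical SNP record as a plain dictionary; not the actual xQTL entity.
snp = {
    "name": "SNP_a000001",
    "chromosome": "I",
    "bpstart": 2312245,
    "reference": "A",
    # base -> strains carrying that base at this position
    "alleles": {"A": ["N2"], "T": [], "C": ["CB4856"], "G": []},
}


def allele_of(snp, strain):
    """Return the base a strain carries at this SNP, or None if not recorded."""
    for base, strains in snp["alleles"].items():
        if strain in strains:
            return base
    return None


print(allele_of(snp, "N2"))
print(allele_of(snp, "CB4856"))
print(allele_of(snp, "UNKNOWN_STRAIN"))
```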
|| Needed field || Example value || xQTL field ||
|| SNP ID || SNP_a000001 || Use: Name field ||
|| Chromosome || I || Use: Chromosome_name field ||
|| Status || Confirmed || Use: Status field ||
|| Position || 2312245 || bpstart ||
|| Reference || A || Use: MREF of 'SNP' to 'Polymorphism' (NEW) ||
|| A || N2 || Use: MREF of 'SNP' to 'Polymorphism' (NEW) ||
|| T || || Use: MREF of 'SNP' to 'Polymorphism' (NEW) ||
|| C || CB4856 || Use: MREF of 'SNP' to 'Polymorphism' (NEW) ||
|| G || || Use: MREF of 'SNP' to 'Polymorphism' (NEW) ||
|| Remarks || || Use: Description field ||

Markers [List] '''-> xQTL entity: Marker'''[[BR]]
List of marker IDs. Should contain the position, type, etc.
|| Needed field || Example value || xQTL field ||
|| Marker ID || Marker_00001 || name ||
|| Other Name || pkI067 || symbol ||
|| Type || PCR || ontologyreference_name ||
|| Chromosome || I || chromosome_name ||
|| Position || 2312245 || bpstart ||
|| SNP ID || SNP_a000001 || Use: 'ReportsFor' field (NEW) ||
|| Remarks || || description ||

Mapping populations [List] '''-> xQTL entity: Panel(type='mapping population')'''[[BR]]
List of available mapping populations and their description. Many more mapping populations will become available in the near future. Some are already developed and genotyped.
|| Needed field || Example value || xQTL field ||
|| Population ID || Pop_001 || name ||
|| Parental strains || N2, CB4856 || strain_name ||
|| Type || RIL || ontologyreference_name ||
|| Number of Genotypes || 200 || in Data ||
|| Publication || || ??? ||
|| Remarks || || description ||

Strains [List] '''-> xQTL entity: Panel(type='strain')'''[[BR]]
List of wild-type strains and their description.
Isolation site, etc.
|| Needed field || Example value || xQTL field ||
|| Strain_ID || N2 || name ||
|| Isolation site || Bristol || phenotype of the panel ||
|| Description || || description ||
|| Sequenced || || phenotype of the panel ||

== Report builder feedback 18 nov 2011 ==
Yang:
 * Useful, but does not help you to 'get to know' the data in the first place.
 * Wanted:
   * Quantitative data: boxplot of measurements for each sample/individual, to inspect and detect/remove outliers
     * Use e.g. boxplot(log(metaboliteexpression))
     * Or boxplot(log(t(metaboliteexpression))) for the (transposed) 'other view'
   * Qualitative data: heatmap-like view of e.g. genotypes
     * Use e.g. image(matrix(as.numeric(as.factor(genotypes)),nrow(genotypes),ncol(genotypes)))
     * Or heatmap(matrix(as.numeric(as.factor(genotypes)),nrow(genotypes),ncol(genotypes)),scale="none",Colv=NA, Rowv=NA)

Basten:
 * Enter gene -> get all eQTL information! Simple as can be.
 * Cis/trans information
 * Find genes in a region and get their QTLs (sounds like gBrowse)
 * Of course everything in document 'concept map'

Matthias:
 * Need a bit more than just 'gene -> QTL' info, so like the Advanced button in the simple screen
 * Select a matrix with QTL results, then grab a row or column (e.g. select a trait)
 * Filter within that list using e.g. a threshold to get a smaller list of e.g. significant markers
 * Using this trait and marker, build a nice table with a report
 * Even more advanced: cross-datamatrix filtering. Select an individual and get from all matrices the values which adhere to a certain filter.
 * The relation between promoter-gene-transcript-protein-enzyme-pathways should preferably be used to create a biologically valuable report
 * gBrowse as an exploration tool would be super to have, and is reusable for many organisms

== Panacea xQTL review Wed 9 nov 2011 ==
 * Data curation and annotation of the Panacea Database (joint effort)
 * Host project analyses:
   * Phase 1: Just add the R scripts as files for basic provenance
   * Phase 2: Being able to rerun important scripts such as sample mislabeling and QTL x Env
   * Phase 3: Make it easy to add and run any R script that was used
 * Data matrix:
   * Download or visualize with CytoScape
   * Need new view with more organization/hierarchy: group by experiment or annotation
   * Need to couple this hierarchy with the ability to quickly run scripts for visualization or statistics on a piece of the data
 * Need 'supersearch' which produces reports on concepts (i.e. everything related to a marker or individual)
   * This should include a special 'QTL finder' tool to quickly create reports on findings using a matrix and/or traits as inputs
 * Focus on pathways:
   * Want to query pathway information for an organism from 1..N sources (GO, WormBase, KEGG.. ?)
   * Want to couple existing gene annotations to these pathways, semi-automatically (?)
   * Want to use this information to run analysis/visualizations on these batches of genes
   * Want to create pathway plots (biology!) with the gathered information (e.g. QTL profiles)
   * Want to create CytoScape graphs of e.g. correlation of the genes

== EURATRANS xQTL review Wed 9 nov 2011 ==
 * Integrate with genome browser (gBrowse, formats WGL and GFF)
 * Need to save any filter as a URL
 * Need to clarify how a tool like gBrowse can specify a 'range' in filter URL syntax and link it
 * Also need the 'range' to work on matrices with locus data (e.g. markers) via URL
 * Need to add scripts as visualizers of data: e.g.
   let them appear in a list of possible plot types in matrix
 * Search box bugs: no '_name' searches possible
 * Use real column names instead of labels eventually; this is still very confusing
 * Need for advanced matrix views:
   * On similar rows & columns, create 'subheaders' with multivalue display
   * Alternatively, matrix merge operation to concatenate them
 * Want to have a report on a value that appears in multiple matrices
 * And/or reports with plots for e.g. a marker or probe with all related information from all sources

== xQTL LL user workshop Fri 28 Oct 2011 general comments ==
 * Need more download formats: STATA, SAS, CSV (not TSV)
 * SNP data should somehow include metadata such as: imputed (yes/no), genome build, reliability, dosage, etc.
 * There should be a way to merge (part of a / a filtered) phenotype set with a genotype set, and retrieve the result
 * The application can be slow, need more speed
 * There are too many tabs, GUI is unclear
 * Need an hourglass (or something) to indicate the application is busy (blackout screen with progress bar when busy, GeneNetwork style?)
 * Making selections is too complex
 * Column paging is confusing
 * Filtering works 'half' (Joris please explain)
 * Data viewer should be able to create selections using your own file
 * IDs should be unrecognizable
 * Search filter should work on all measurements by default
 * Filters are lost when a user pages, this is annoying
 * Like other waiting times, applying a filter should cause the app to refresh with a 'wait please' status
 * Merging pheno- and geno sets is a special case of using a shopping cart (batches) to save selections for re-use elsewhere; this should be implemented. So you make a selection of participants based on phenotypic criteria, and then you view SNPs of interest for only those selected, and vice versa.
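
The shopping-cart idea in the last item can be sketched as follows. This is only an illustration under assumptions: the participant IDs, phenotype names, SNP names and the helpers `select_participants` and `genotypes_for` are hypothetical and not part of xQTL, and the two sets are plain dictionaries rather than the application's data matrices.

```python
# Hypothetical in-memory stand-ins for a phenotype set and a genotype set.
# Keys are participant IDs; all names and values are made up for illustration.
phenotypes = {
    "P001": {"age": 34, "bmi": 22.1},
    "P002": {"age": 61, "bmi": 30.4},
    "P003": {"age": 58, "bmi": 27.9},
}
genotypes = {
    "P001": {"snp_1": "AA", "snp_2": "CT"},
    "P002": {"snp_1": "AG", "snp_2": "CC"},
    "P003": {"snp_1": "GG", "snp_2": "CT"},
}


def select_participants(phenotypes, predicate):
    """Return the 'shopping cart': sorted IDs whose phenotypes satisfy predicate."""
    return sorted(pid for pid, values in phenotypes.items() if predicate(values))


def genotypes_for(genotypes, cart, snps):
    """Return only the requested SNPs for the participants in the cart."""
    return {
        pid: {snp: genotypes[pid][snp] for snp in snps}
        for pid in cart
        if pid in genotypes
    }


# Select on phenotypic criteria, then view one SNP for that selection only.
cart = select_participants(phenotypes, lambda v: v["age"] > 50)
merged = genotypes_for(genotypes, cart, ["snp_1"])
print(cart)
print(merged)
```

Because the cart is just a saved list of IDs, the same selection could be reused in the other direction, for example by selecting on a genotype first and then viewing phenotypes for those participants.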