xQTLUserReview – Trac

Version 9 (modified by jvelde, 14 years ago)

Datamodel enhancements for Panacea

Genes [List] -> xQTL entity: Gene
Contains all genes identified in a given WormBase release. The main identifier is the WormBase ID (e.g. WBgene00000001). Each entry also holds the sequence name, common name, splice variants, start position and end position. A later version can add information such as intron, exon and promoter positions.

Needed field	Example value	xQTL field
Wormbase ID	WBgene00000912	name
Common name	daf-16	description
Sequence name	R13H8.1	alias
Splice variant	R13H8.1a	symbol
Chromosome	I	chromosome_name
Start Position	10750332	bpstart
End Position	10776689	bpend
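
The field mapping in the table above can be sketched as a simple rename step. This is a minimal illustration, assuming source records arrive as dictionaries keyed by the "Needed field" names. `GENE_FIELD_MAP`, `INT_FIELDS` and `to_xqtl_gene` are hypothetical names used for this sketch only, not part of xQTL.

```python
# Hypothetical mapping: "Needed field" -> "xQTL field" (taken from the Gene table).
GENE_FIELD_MAP = {
    "Wormbase ID": "name",
    "Common name": "description",
    "Sequence name": "alias",
    "Splice variant": "symbol",
    "Chromosome": "chromosome_name",
    "Start Position": "bpstart",
    "End Position": "bpend",
}

# Assumption: positions are stored as integers.
INT_FIELDS = {"bpstart", "bpend"}


def to_xqtl_gene(record):
    """Rename source fields to xQTL Gene fields; reject unmapped fields."""
    gene = {}
    for source_field, value in record.items():
        if source_field not in GENE_FIELD_MAP:
            raise KeyError(f"no xQTL field defined for {source_field!r}")
        target = GENE_FIELD_MAP[source_field]
        gene[target] = int(value) if target in INT_FIELDS else value
    return gene


# Example values copied from the table above.
example = to_xqtl_gene({
    "Wormbase ID": "WBgene00000912",
    "Common name": "daf-16",
    "Sequence name": "R13H8.1",
    "Splice variant": "R13H8.1a",
    "Chromosome": "I",
    "Start Position": "10750332",
    "End Position": "10776689",
})
```

Raising on unmapped fields makes the open '???' entries in the later tables visible at import time instead of dropping them silently.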

Platforms [List] -> xQTL entity: OntologyTerm
List of microarray platforms, each with a platform identifier and a description of the platform.

Needed field	Example value	xQTL field
PlatformID	WUR_Agilent	name
Info	4*44K array	description

Platform info [List] -> xQTL entity: Probe
One list per microarray platform, containing all information about the spots on the array: spot ID, gene ID, splice variant, probe sequence, probe position on the genome, etc.

Needed field	Example value	xQTL field
Spot ID	WUR_AGI_00001	name
Wormbase ID	WBgene00000912	description
Splice variant	R13H8.1a	symbol
Probe sequence	###	seq
Probe Chromosome	I	chromosome_name
Probe position	###	bpstart
Unambiguous	###	mismatch

Samples [List] -> xQTL entity: Sample
List of samples, each with a sample ID, genotype and experiment.

Needed field	Example value	xQTL field
Sample ID	Sample000001	name
Genotype	WN001	???
Experiment	TEMP16_EXP	???
Remarks	###	description

Experiments [List] -> xQTL entity: ???
List of experiments: experiment ID, what was measured, conditions, etc.

Needed field	Example value	xQTL field
Experiment ID	TEMP16_EXP	name
Measured	Gene expression	???
Platform	WSU	???
Growing temperature	16	???
Stage	L4	???
Hours	###	???
Treatment	No	???
Remarks	###	description

Projects [List] -> xQTL entity: ???
List that links experiments together, because multiple experiments under different conditions can belong to one project. Projects might also be a way to control access to the different groups of data.

Needed field	Example value	xQTL field
Project ID	GROW_TEMP	name
Experiment ID	TEMP16_EXP	???
Publications	Li et al. 2006	???
Remarks	###	description

Datasets [List] -> xQTL entity: Data
List of datasets. Each entry records information such as the experiment it belongs to, the normalization applied, the mapping models used, etc.

Needed field	Example value	xQTL field
Dataset ID	DATA_TEMP16	name
Type	Gene expression	ontologyreference_name
Unit	Log 2 intensities	???
Processed	Normalized	???
Experiment ID	TEMP16_EXP	???
Remarks	###	description

Polymorphisms [List] -> xQTL entity: ???
List of polymorphisms, probably only SNPs at first, although up to 500 deleted genes are known between CB and N2. The position and base pair change, the probable amino acid change, and gene features such as intron, exon and promoter can be included. Moreover, the list should indicate which parental strains or wild isolates carry which base pair at the SNP position. Sequences of a dozen wild isolates will become available in 2012.

Needed field	Example value	xQTL field
SNP ID	SNP_a000001	name
Chromosome	I	chromosome_name
Status	Confirmed	???
Position	2312245	???
Reference	A	???
A	N2	???
T		???
C	CB4856	???
G		???
Remarks	###	description
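
The allele rows in the table above (A, T, C, G, each listing the strains that carry that base) could be modelled as a lookup from allele to strains. The following is a sketch only: the class `Snp`, its field `strains_by_allele`, and both methods are hypothetical names, since the xQTL entity for polymorphisms is still undecided.

```python
from dataclasses import dataclass, field


@dataclass
class Snp:
    name: str
    chromosome_name: str
    position: int
    reference: str
    status: str = "Confirmed"
    # Hypothetical field: allele -> set of strains carrying that allele.
    strains_by_allele: dict = field(default_factory=dict)

    def allele_of(self, strain):
        """Return the base a strain carries at this SNP, or None if unknown."""
        for allele, strains in self.strains_by_allele.items():
            if strain in strains:
                return allele
        return None

    def is_polymorphic_between(self, strain_a, strain_b):
        """True only if both strains are typed and carry different bases."""
        allele_a = self.allele_of(strain_a)
        allele_b = self.allele_of(strain_b)
        return None not in (allele_a, allele_b) and allele_a != allele_b


# Example values copied from the table above.
snp = Snp(
    name="SNP_a000001",
    chromosome_name="I",
    position=2312245,
    reference="A",
    strains_by_allele={"A": {"N2"}, "C": {"CB4856"}},
)
```

Storing strains per allele keeps the record open-ended, so newly sequenced wild isolates can be added without changing the structure.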

Markers [List] -> xQTL entity: Marker
List of marker IDs. Should contain the position and type, etc.

Needed field	Example value	xQTL field
Marker ID	Marker_00001	name
Other Name	pkI067	symbol
Type	PCR	ontologyreference_name
Chromosome	I	chromosome_name
Position	2312245	bpstart
SNP ID	SNP_a000001	???
Remarks	###	description

Mapping populations [List] -> xQTL entity: ???
List of available mapping populations and their descriptions. Many more mapping populations will become available in the near future; some have already been developed and genotyped.

Needed field	Example value	xQTL field
Population ID	Pop_001	name
Parental strains	N2, CB4856	strain_name
Type	RIL	ontologyreference_name
Number of Genotypes	200	???
Publication	###	???
Remarks	###	description

Strains [List] -> xQTL entity: Strain
List of wild-type strains and their descriptions: isolation site, etc.

Needed field	Example value	xQTL field
Strain_ID	N2	name
Isolation site	Bristol	???
Description	###	description
Sequenced	###	???

Report builder feedback 18 Nov 2011

Yang:

Useful, but it does not help you to 'get to know' the data in the first place.
Wanted:
- Quantitative data: boxplot of measurements for each sample/individual, to inspect and detect/remove outliers -> Use e.g. boxplot(log(metaboliteexpression, exp(1))) -> Or boxplot(log(t(metaboliteexpression), exp(1))) for the (transposed) 'other view'
- Qualitative data: heatmap-like view of e.g. genotypes -> Use e.g. image(matrix(as.numeric(as.factor(genotypes)),nrow(genotypes),ncol(genotypes))) -> Or heatmap(matrix(as.numeric(as.factor(genotypes)),nrow(genotypes),ncol(genotypes)),scale="none",Colv=NA, Rowv=NA)

Basten:

Enter gene -> get all eQTL information! Simple as can be.
Cis/trans information
Find genes in a region and get their QTLs (sounds like gBrowse)
Of course, everything in the 'concept map' document
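
The cis/trans and 'genes in a region' requests above could be sketched as follows. All names here are hypothetical (`Gene`, `Qtl`, `classify`, `genes_in_region`, `qtls_in_region`). The 1 Mb cis window is an assumed cutoff and the LOD scores are made-up illustration values, not project results.

```python
from collections import namedtuple

Gene = namedtuple("Gene", "name chromosome_name bpstart bpend")
Qtl = namedtuple("Qtl", "gene_name chromosome_name bp lod")

# Assumption: a QTL within this distance of its gene counts as cis.
CIS_WINDOW_BP = 1_000_000


def classify(qtl, gene, window=CIS_WINDOW_BP):
    """Label an eQTL 'cis' if it lies near its own gene, otherwise 'trans'."""
    if qtl.chromosome_name != gene.chromosome_name:
        return "trans"
    near = gene.bpstart - window <= qtl.bp <= gene.bpend + window
    return "cis" if near else "trans"


def genes_in_region(genes, chromosome_name, start, end):
    """Return genes that overlap the interval [start, end] on one chromosome."""
    return [
        g for g in genes
        if g.chromosome_name == chromosome_name
        and g.bpstart <= end and g.bpend >= start
    ]


def qtls_in_region(genes, qtls, chromosome_name, start, end):
    """For each gene in the region, list its QTLs with a cis/trans label."""
    report = {}
    for gene in genes_in_region(genes, chromosome_name, start, end):
        report[gene.name] = [
            (q, classify(q, gene)) for q in qtls if q.gene_name == gene.name
        ]
    return report


# Gene coordinates from the Genes table; QTL positions and LODs are invented.
genes = [Gene("WBgene00000912", "I", 10750332, 10776689)]
qtls = [
    Qtl("WBgene00000912", "I", 10900000, 5.2),
    Qtl("WBgene00000912", "V", 5000000, 4.1),
]
report = qtls_in_region(genes, qtls, "I", 10000000, 11000000)
```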

Matthias:

Need a bit more than just 'gene -> QTL' info, so like the Advanced button in the simple screen
Select a matrix with QTL results, then grab a row or column (e.g. select a trait)
Filter within that list using e.g. a threshold to get a smaller list of e.g. significant markers
Using this trait and marker, build a nice table with a report
Even more advanced: cross-datamatrix filtering. Select an individual and get from all matrices values which adhere to a certain filter.
The relation promoter-gene-transcript-protein-enzyme-pathway should preferably be used to create a biologically valuable report
gBrowse as an exploration tool would be super to have, and is reusable for many organisms
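
The workflow described above (pick a trait row from a QTL matrix, filter by threshold, build a report, then filter across matrices for one individual) might look like this in outline. This is a sketch assuming matrices are nested dictionaries of the form `{row: {column: value}}`. All function names, the threshold, and the numeric values are hypothetical.

```python
def select_trait(matrix, trait):
    """Grab one row of the QTL matrix: {marker: LOD score}."""
    return matrix[trait]


def significant_markers(row, threshold):
    """Keep only markers whose score reaches the threshold."""
    return {marker: lod for marker, lod in row.items() if lod >= threshold}


def build_report(matrix, trait, threshold):
    """Return (trait, marker, LOD) rows, strongest marker first."""
    hits = significant_markers(select_trait(matrix, trait), threshold)
    ordered = sorted(hits.items(), key=lambda item: item[1], reverse=True)
    return [(trait, marker, lod) for marker, lod in ordered]


def cross_matrix_values(matrices, individual, keep):
    """From every matrix, collect values of one individual that pass `keep`."""
    result = {}
    for matrix_name, matrix in matrices.items():
        hits = {
            row: cols[individual]
            for row, cols in matrix.items()
            if individual in cols and keep(cols[individual])
        }
        if hits:
            result[matrix_name] = hits
    return result


# Invented example values for illustration only.
qtl_matrix = {
    "daf-16": {"Marker_00001": 4.8, "Marker_00002": 1.2, "Marker_00003": 6.1},
}
report = build_report(qtl_matrix, "daf-16", threshold=3.0)

matrices = {
    "expression": {"daf-16": {"Sample000001": 9.4, "Sample000002": 7.1}},
    "metabolites": {"metabolite_1": {"Sample000001": 0.3}},
}
selected = cross_matrix_values(matrices, "Sample000001", lambda v: v > 1.0)
```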

Panacea xQTL review Wed 9 Nov 2011

Data curation and annotation of the Panacea Database (joint effort)
Host project analyses:
- Phase 1: Just add the R scripts as files for basic provenance
- Phase 2: Being able to rerun important scripts such as sample mislabeling and QTL x Env
- Phase 3: Make it easy to add and run any R script that was used
Data matrix:
- Download or visualize with CytoScape
- Need new view with more organization/hierarchy: group by experiment or annotation
- Need to couple this hierarchy with the ability to quickly run scripts for visualization or statistics on a piece of the data
Need a 'supersearch' which produces reports on concepts (i.e. everything related to a marker or individual)
- This should include a special 'QTL finder' tool to quickly create reports on findings using a matrix and/or traits as inputs
Focus on pathways:
- Want to query pathway information for an organism from 1..N sources (GO, WormBase, KEGG, ...?)
- Want to couple existing gene annotations to these pathways, semi-automatically (?)
- Want to use this information to run analysis/visualizations on these batches of genes
- Want to create pathway plots (biology!) with the gathered information (e.g. QTL profiles)
- Want to create CytoScape graphs of e.g. correlation of the genes
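
The last point above, correlation graphs for CytoScape, could be prototyped by computing pairwise Pearson correlations and writing the strong ones as an edge list in CytoScape's simple interaction format (SIF: source node, interaction type, target node). This is a sketch: the function names, the 0.8 cutoff, the interaction label `cor`, and the expression values are all assumptions.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation of two equally long numeric sequences."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    if var_x == 0 or var_y == 0:
        return 0.0  # constant profile: treat as uncorrelated
    return cov / sqrt(var_x * var_y)


def correlation_edges(expression, min_abs_r=0.8):
    """Return (gene_a, gene_b, r) for gene pairs with |r| >= min_abs_r."""
    edges = []
    for gene_a, gene_b in combinations(sorted(expression), 2):
        r = pearson(expression[gene_a], expression[gene_b])
        if abs(r) >= min_abs_r:
            edges.append((gene_a, gene_b, r))
    return edges


def to_sif(edges, interaction="cor"):
    """Serialize edges as tab-delimited SIF lines."""
    return "\n".join(f"{a}\t{interaction}\t{b}" for a, b, _ in edges)


# Invented expression profiles for three hypothetical genes.
expression = {
    "geneA": [1, 2, 3, 4],
    "geneB": [2, 4, 6, 8],
    "geneC": [4, 1, 3, 2],
}
edges = correlation_edges(expression)
sif_text = to_sif(edges)
```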

EURATRANS xQTL review Wed 9 Nov 2011

Integrate with genome browser (gBrowse, formats WGL and GFF)
Need to save any filter as a URL
Need to clarify how a tool like gBrowse can specify a 'range' in filter URL syntax and link it
Also need the 'range' to work on matrices with locus data (e.g. markers) via URL
Need to add scripts as visualizers of data: e.g. let them appear in a list of possible plot types in matrix
Search box bugs: no '_name' searches possible
Use real column names instead of labels eventually, this is still very confusing
Need for advanced matrix views:
- On similar rows & columns, create 'subheaders' with multivalue display
- Alternatively, matrix merge operation to concatenate them
- Want to have a report on a value that appears in multiple matrices
- And/or reports with plots for e.g. a marker or probe with all related information from all sources
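
The requests above to save any filter as a URL, and to let an external tool pass a 'range', could be prototyped with a plain query string. This is only a sketch of one possible syntax: the base URL, the parameter names (`matrix`, `chromosome_name`, `range`) and the `start-end` range notation are assumptions, not the existing xQTL URL scheme.

```python
from urllib.parse import parse_qs, urlencode, urlparse


def filter_to_url(base_url, matrix, chromosome_name=None, bp_range=None):
    """Encode a matrix filter as a bookmarkable URL."""
    params = {"matrix": matrix}
    if chromosome_name is not None:
        params["chromosome_name"] = chromosome_name
    if bp_range is not None:
        start, end = bp_range
        params["range"] = f"{start}-{end}"  # assumed 'start-end' notation
    return f"{base_url}?{urlencode(params)}"


def url_to_filter(url):
    """Decode the URL back into filter settings."""
    query = parse_qs(urlparse(url).query)
    params = {key: values[0] for key, values in query.items()}
    if "range" in params:
        start, end = params["range"].split("-")
        params["range"] = (int(start), int(end))
    return params


def markers_in_range(markers, chromosome_name, bp_range):
    """Apply the decoded range to locus data: {marker: (chromosome, bp)}."""
    start, end = bp_range
    return sorted(
        name for name, (chrom, bp) in markers.items()
        if chrom == chromosome_name and start <= bp <= end
    )


# Hypothetical base URL; identifiers reuse example values from this page.
url = filter_to_url(
    "https://example.org/xqtl/matrix",
    matrix="DATA_TEMP16",
    chromosome_name="I",
    bp_range=(2000000, 3000000),
)
decoded = url_to_filter(url)
hits = markers_in_range(
    {"Marker_00001": ("I", 2312245), "Marker_00002": ("II", 2500000)},
    decoded["chromosome_name"],
    decoded["range"],
)
```

Because the filter round-trips through the URL, a genome browser would only need to construct the same query string to link into a filtered matrix view.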

xQTL LL user workshop Fri 28 Oct 2011 general comments

Need more download formats: STATA, SAS, CSV (not TSV)
SNP data should somehow include metadata such as: imputed (yes/no), genome build, reliability, dosage, etc.
There should be a way to merge a phenotype set (or a part or filtered subset of it) with a genotype set, and retrieve the result
The application can be slow, need more speed
There are too many tabs, GUI is unclear
Need an hourglass (or something) to indicate the application is busy (blackout screen with progress bar when busy, GeneNetwork style?)
Making selections is too complex
Column paging is confusing
Filtering works 'half' (Joris please explain)
Data viewer should be able to create selections using your own file
IDs should be unrecognizable
Search filter should work on all measurements by default
Filters are lost when a user pages, this is annoying
Like other waiting times, applying a filter should cause the app to refresh with a 'wait please' status
Merging pheno- and geno sets is a special case of using a shopping cart (batches) to save selections for re-use elsewhere; this should be implemented. So you make a selection of participants based on phenotypic criteria, and then you view SNPs of interest for only those selected, and vice versa.
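
The shopping-cart idea above, combined with the request for CSV downloads, could be sketched like this: select participants by a phenotypic criterion, keep that selection as a reusable set, then merge it with the SNPs of interest and export the result. Participant IDs, the `bmi` phenotype, the SNP name and all values are invented for illustration, and the function names are hypothetical.

```python
import csv
import io


def select_participants(phenotypes, predicate):
    """Build a 'cart': the IDs of participants whose phenotypes pass."""
    return {pid for pid, row in phenotypes.items() if predicate(row)}


def merge_selection(phenotypes, genotypes, participant_ids, snps):
    """Join phenotype rows with chosen SNPs for the selected participants."""
    merged = []
    for pid in sorted(participant_ids):
        if pid not in genotypes:
            continue  # skip participants without genotype data
        row = {"participant": pid, **phenotypes[pid]}
        row.update({snp: genotypes[pid].get(snp) for snp in snps})
        merged.append(row)
    return merged


def to_csv(rows):
    """Serialize merged rows as comma-separated text (not TSV)."""
    if not rows:
        return ""
    buffer = io.StringIO()
    writer = csv.DictWriter(buffer, fieldnames=list(rows[0]))
    writer.writeheader()
    writer.writerows(rows)
    return buffer.getvalue()


# Invented example data.
phenotypes = {
    "P001": {"bmi": 31.0},
    "P002": {"bmi": 22.5},
    "P003": {"bmi": 33.4},
}
genotypes = {
    "P001": {"SNP_1": "AA", "SNP_2": "CT"},
    "P002": {"SNP_1": "AG", "SNP_2": "CC"},
    "P003": {"SNP_1": "GG", "SNP_2": "TT"},
}
cart = select_participants(phenotypes, lambda row: row["bmi"] >= 30)
merged = merge_selection(phenotypes, genotypes, cart, snps=["SNP_1"])
csv_text = to_csv(merged)
```

The same cart could be built the other way round, by selecting on genotype first and then pulling phenotypes for those participants.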
