Dasha documentation

=== How to configure and run query expansion in molgenis4phenotype ===

1) Download the ontologies from http://bioportal.bioontology.org/. You should download:
 * (!http://rest.bioontology.org/bioportal/ontologies/download/44307?applicationid=4ea81d74-8960-4525-810b-fa1baab576ff)
 * Human Disease (!http://rest.bioontology.org/bioportal/ontologies/download/44309?applicationid=4ea81d74-8960-4525-810b-fa1baab576ff)
 * NCI Thesaurus (!http://rest.bioontology.org/bioportal/ontologies/download/42838?applicationid=4ea81d74-8960-4525-810b-fa1baab576ff)
MeSH can be taken from biobank_search\!WebContent\WEB-INF.

2) Change the directory names:
 * in DBIndexPlugin: LUCENE_INDEX_DIRECTORY
 * in !OntoCatIndexPlugin2: LUCENE_ONTOINDEX_DIRECTORY, ''ONTOLOGIES_DIRECTORY''

3) Create a Molgenis database.

4) Set the VM arguments for !OntoCatIndexPlugin2.java to -Xms1024M -Xmx1024M.

5) Run the project.

6) Upload the data into the database.

7) In ''DB Index and Search'', press ''Build Index'' to build the index of your database.

8) In ''Index OntoCAT'', press ''Build Ontocat Index''.

9) Now, in ''DB Index and Search'', you can search your database by pressing ''Search Index'', or search it with query expansion by choosing the appropriate ontologies and pressing ''Search with query expansion''.

=== Project Description ===

'''public class DBIndexPlugin''' is the plugin that indexes and searches the database (with or without query expansion).
 * @param LUCENE_INDEX_DIRECTORY - an empty directory to put the index files in
 * '''public void buildIndexAllTables(Database db)''' - builds the index
 * '''public void SearchAllDBTablesIndex(Database db)''' - searches the index (in the "description" field)
 * '''public void !ExpandQuery(Database db)''' - expands the query by calling expand(!OntologiesForExpansion) from !OntocatQueryExpansion_lucene

'''public class !OntocatQueryExpansion_lucene'''
 * '''public List parseQuery(String query)''' - parses the query: ignores punctuation, splits the query on spaces and Boolean operators, and reads a phrase in quotation marks as a single unit. Calls public List chunk (List words).
 * '''public List chunk (List words)''' - chunks the query (List words) into all possible n-grams, that is, sequences of consecutive query words, with n ranging from 1 to words.size()
 * '''public void expand(List ontologiesToUse)''' - finds expansion terms in ontologiesToUse. Every n-gram of the chunked query is looked up in the ontologies; if it is found, its expansion terms are added to the initial query list.
 * '''public String output(List parsed)''' - constructs a new query from the initial query list, adding the expansion terms with a lower weight and using the same Boolean operators and quotes (if any) as the user query

'''public class !OntoCatIndexPlugin2''' is the plugin that indexes and searches the ontologies.
 * @param LUCENE_ONTOINDEX_DIRECTORY - an empty directory to put the index files in
 * @param ''ONTOLOGIES_DIRECTORY'' - the directory where the ontologies are stored
 * @param ''ontologyNamesMap'' - the list of ontologies, mapping each ontology name to the name of the file that contains it
 * '''public String !SearchIndexOntocat(String query, List ontologyLabels)''' - searches for the query in the ontologies named in ontologyLabels. Returns a string "!term:expansion term1; expansion term2;… expansion termN;"
 * '''public void buildIndexOntocat()''' - builds the ontology index. Pairs (!term:expansion) are stored for each term of each ontology.
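
The chunk, expand and output steps described for !OntocatQueryExpansion_lucene can be illustrated with a minimal sketch. This is not the project's code: the class name !QueryExpansionSketch, the static method signatures, the in-memory map standing in for the ontology index, and the use of the Lucene boost suffix (^0.5) to express "lower weight" are all assumptions made for illustration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical sketch; names and signatures are not taken from the project.
class QueryExpansionSketch {

    // Returns every n-gram of consecutive words, for n = 1 .. words.size().
    static List<String> chunk(List<String> words) {
        List<String> ngrams = new ArrayList<>();
        for (int n = 1; n <= words.size(); n++) {
            for (int start = 0; start + n <= words.size(); start++) {
                ngrams.add(String.join(" ", words.subList(start, start + n)));
            }
        }
        return ngrams;
    }

    // Looks each n-gram up in a term -> expansions map.
    // Assumption: the map stands in for the ontology index and has lower-case keys.
    static Map<String, List<String>> expand(List<String> ngrams,
                                            Map<String, List<String>> ontology) {
        Map<String, List<String>> found = new LinkedHashMap<>();
        for (String ngram : ngrams) {
            List<String> expansions = ontology.get(ngram.toLowerCase());
            if (expansions != null) {
                found.put(ngram, expansions);
            }
        }
        return found;
    }

    // Rebuilds the query: original words first, then each expansion term as a
    // quoted phrase with a boost below 1 (assumed way of lowering its weight).
    static String output(List<String> words,
                         Map<String, List<String>> found,
                         double boost) {
        StringBuilder query = new StringBuilder(String.join(" ", words));
        for (List<String> expansions : found.values()) {
            for (String term : expansions) {
                query.append(" \"").append(term).append("\"^").append(boost);
            }
        }
        return query.toString();
    }

    public static void main(String[] args) {
        List<String> words = Arrays.asList("heart", "attack");
        Map<String, List<String>> ontology = new HashMap<>();
        ontology.put("heart attack", Arrays.asList("myocardial infarction"));

        List<String> ngrams = chunk(words); // [heart, attack, heart attack]
        Map<String, List<String>> found = expand(ngrams, ontology);
        // prints: heart attack "myocardial infarction"^0.5
        System.out.println(output(words, found, 0.5));
    }
}
```

In this sketch only the bigram "heart attack" matches the toy ontology, so a single expansion phrase is appended. Handling of Boolean operators and quoted phrases, which parseQuery and output are described as preserving, is left out.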