From 00eb9b125be08f39cd991183f58e17ee8b5977c9 Mon Sep 17 00:00:00 2001
From: Yuri Gorshenin
Date: Wed, 8 Jun 2016 14:01:03 +0300
Subject: [PATCH 1/2] [search] Added README for search quality.

---
 search/search_quality/README.txt | 46 ++++++++++++++++++++++++++++++++
 1 file changed, 46 insertions(+)
 create mode 100644 search/search_quality/README.txt

diff --git a/search/search_quality/README.txt b/search/search_quality/README.txt
new file mode 100644
index 0000000000..dcc45a1361
--- /dev/null
+++ b/search/search_quality/README.txt
@@ -0,0 +1,46 @@
+This document describes how to use all these tools for search quality analysis.
+
+
+1. Prerequisites.
+
+   * get the latest version of omim (https://github.com/mapsme/omim) and build it
+
+   * get the latest samples.lisp file with search queries. If you don't know
+     how to get it, please, contact the search team.
+
+   * install Common Lisp. Note that there are many implementations,
+     but we recommend to use SBCL (http://www.sbcl.org/)
+
+   * install Python 3.x and packages for data analysis (sklearn, scipy, numpy, pandas, matplotlib)
+
+   * download maps necessary for search quality tests.
+     For example:
+
+     ./download-maps.sh -v 160524
+
+     will download all necessary maps of version 160524 to the current directory.
+
+
+2. This section describes how to run search engine on a set of search
+   queries and how to get CSV file with search engine output.
+
+   i) Run gen-samples.lisp script to get search queries with lists of
+      vital or relevant responses in JSON format.  For example:
+
+      ./gen-samples.lisp < samples.lisp > samples.json
+
+   ii) Run features_collector_tool from the build directory.
+       For example:
+
+       features_collector_tool --mwm_path path-to-downloaded-maps \
+         --json_in samples.json \
+         --stats_path /tmp/stats.txt \
+         2>/dev/null >samples.csv
+
+       runs search engine on all queries from samples.json, prints
+       useful info to /tmp/stats.txt and generates a CSV file with
+       search engine output on each query.
+
+   The result CSV file is ready for analysis, i.e. for search quality
+   evaluation, ranking models learning etc. For details, take a look at
+   scoring_model.py script.

From 21e366eb71774fe11d230f377282a943dc8ee46f Mon Sep 17 00:00:00 2001
From: Yuri Gorshenin
Date: Wed, 8 Jun 2016 14:21:09 +0300
Subject: [PATCH 2/2] Review fixes.

---
 search/search_quality/README.txt | 34 +++++++++++++++++++++-----------
 1 file changed, 22 insertions(+), 12 deletions(-)

diff --git a/search/search_quality/README.txt b/search/search_quality/README.txt
index dcc45a1361..551819904e 100644
--- a/search/search_quality/README.txt
+++ b/search/search_quality/README.txt
@@ -1,19 +1,19 @@
-This document describes how to use all these tools for search quality analysis.
+This document describes how to use the tools for search quality analysis.
 
 
 1. Prerequisites.
 
-   * get the latest version of omim (https://github.com/mapsme/omim) and build it
+   * Get the latest version of omim (https://github.com/mapsme/omim) and build it.
 
-   * get the latest samples.lisp file with search queries. If you don't know
+   * Get the latest samples.lisp file with search queries. If you don't know
      how to get it, please, contact the search team.
 
-   * install Common Lisp. Note that there are many implementations,
-     but we recommend to use SBCL (http://www.sbcl.org/)
+   * Install Common Lisp. Note that there are many implementations,
+     but we recommend using SBCL (http://www.sbcl.org/).
 
-   * install Python 3.x and packages for data analysis (sklearn, scipy, numpy, pandas, matplotlib)
+   * Install Python 3.x and packages for data analysis (sklearn, scipy, numpy, pandas, matplotlib).
 
-   * download maps necessary for search quality tests.
+   * Download maps necessary for search quality tests.
     For example:
 
     ./download-maps.sh -v 160524
@@ -22,10 +22,10 @@ This document describes how to use all these tools for search quality analysis.
 2. This section describes how to run search engine on a set of search
-   queries and how to get CSV file with search engine output.
+   queries and how to get a CSV file with search engine output.
 
    i) Run gen-samples.lisp script to get search queries with lists of
-      vital or relevant responses in JSON format.  For example:
+      vital or relevant responses in JSON format. For example:
 
       ./gen-samples.lisp < samples.lisp > samples.json
@@ -41,6 +41,16 @@ This document describes how to use all these tools for search quality analysis.
        useful info to /tmp/stats.txt and generates a CSV file with
        search engine output on each query.
 
-   The result CSV file is ready for analysis, i.e. for search quality
-   evaluation, ranking models learning etc. For details, take a look at
-   scoring_model.py script.
+   The resulting CSV file is ready for analysis, e.g. search quality
+   evaluation, learning ranking models, etc. For details, take a look
+   at the scoring_model.py script.
+
+   iii) To take a quick look at what the search returns without
+        launching the application, consider using search_quality_tool:
+
+        search_quality_tool --viewport=moscow \
+          --queries_path=path-to-omim/search/search_quality/search_quality_tool/queries.txt \
+          --top 1 \
+          2>/dev/null
+
+   By default, map files in path-to-omim/data are used.
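The "analysis" step that the README delegates to scoring_model.py can be sketched in a few lines of Python. This is a minimal illustration only: the column names (`query`, `rank`, `relevance`) and the toy rows are assumptions, not the actual features_collector_tool output schema, so check the header of the generated samples.csv before adapting it.

```python
import csv
import io

# Hypothetical excerpt of features_collector_tool output; the real
# samples.csv columns may differ -- inspect its header row first.
SAMPLE_CSV = """\
query,rank,relevance
cafe pushkin,0,vital
cafe pushkin,1,relevant
lenin street,0,irrelevant
lenin street,1,vital
"""

rows = list(csv.DictReader(io.StringIO(SAMPLE_CSV)))

# A simple quality metric: the share of queries whose top-ranked
# (rank == 0) result was judged vital.
top_results = [r for r in rows if r["rank"] == "0"]
vital_at_1 = sum(r["relevance"] == "vital" for r in top_results) / len(top_results)
print(f"vital@1 = {vital_at_1:.2f}")
```

On this toy sample the script prints `vital@1 = 0.50`: one of the two queries has a vital result in first position. The same skeleton extends naturally to the pandas/sklearn workflow the prerequisites section prepares for.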