        How do I get the matching terms for a Lucene fuzzy search result?

        Date: 2023-09-29

                  Question

                  How do you get the matching fuzzy term and its offsets when using Lucene fuzzy search?

                      IndexSearcher mem = ....(some standard code)
                  
                      QueryParser parser = new QueryParser(Version.LUCENE_30, CONTENT_FIELD, analyzer);
                  
                      TopDocs topDocs = mem.search(parser.parse("wuzzy~"), 1);
                      // the ~ triggers the fuzzy search as per "Lucene In Action" 
                  

                  The fuzzy search works fine: if a document contains the term "fuzzy" or "luzzy", it is matched. How do I find out which term matched, and what its offsets are?

                  I have made sure that every CONTENT_FIELD is added with term vectors stored, including positions and offsets.
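For intuition on why "wuzzy~" matches both "fuzzy" and "luzzy": Lucene's fuzzy queries are based on Levenshtein edit distance, and each of those terms is a single substitution away from "wuzzy". The following is a plain-Java illustration of that distance only; it is not Lucene's implementation, and the class and method names are made up for this sketch.

```java
final class EditDistanceDemo {

    // Classic dynamic-programming Levenshtein distance using two rolling rows.
    static int levenshtein(String a, String b) {
        int[] prev = new int[b.length() + 1];
        int[] curr = new int[b.length() + 1];

        // Distance from the empty prefix of a to each prefix of b.
        for (int j = 0; j <= b.length(); j++) {
            prev[j] = j;
        }

        for (int i = 1; i <= a.length(); i++) {
            curr[0] = i;
            for (int j = 1; j <= b.length(); j++) {
                int cost = a.charAt(i - 1) == b.charAt(j - 1) ? 0 : 1;
                int insertion = curr[j - 1] + 1;
                int deletion = prev[j] + 1;
                int substitution = prev[j - 1] + cost;
                curr[j] = Math.min(Math.min(insertion, deletion), substitution);
            }
            // Swap rows for the next iteration.
            int[] tmp = prev;
            prev = curr;
            curr = tmp;
        }
        return prev[b.length()];
    }

    public static void main(String[] args) {
        for (String term : new String[] {"fuzzy", "luzzy"}) {
            System.out.println("wuzzy -> " + term + ": " + levenshtein("wuzzy", term));
        }
    }
}
```

Knowing the distance explains why a document matched, but not which indexed term matched or where it sits in the text, which is what the question asks for.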

                  Recommended answer

                  There was no straightforward way of doing this; however, I reconsidered Jared's suggestion and was able to get a solution working.

                  I am documenting it here in case someone else runs into the same issue.

                  Create a class that implements org.apache.lucene.search.highlight.Formatter:

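                  import java.util.ArrayList;
                  import java.util.List;

                  import org.apache.lucene.search.highlight.Formatter;
                  import org.apache.lucene.search.highlight.TokenGroup;

                  // A Formatter that records matches instead of decorating them.
                  public class HitPositionCollector implements Formatter
                  {
                      // MatchOffset is a simple DTO: (token text, start offset, end offset)
                      private final List<MatchOffset> matchList;

                      public HitPositionCollector()
                      {
                          matchList = new ArrayList<MatchOffset>();
                      }

                      // This is where the matched term and its start and end offsets are captured.
                      // Token groups that did not match the query have a total score of 0 and are skipped.
                      @Override
                      public String highlightTerm(String originalText, TokenGroup tokenGroup)
                      {
                          if (tokenGroup.getTotalScore() > 0)
                          {
                              MatchOffset mo = new MatchOffset(tokenGroup.getToken(0).toString(),
                                                               tokenGroup.getStartOffset(),
                                                               tokenGroup.getEndOffset());
                              matchList.add(mo);
                          }

                          // Return the text unchanged: this formatter only collects positions.
                          return originalText;
                      }

                      /**
                       * @return the matches collected so far
                       */
                      public List<MatchOffset> getMatchList()
                      {
                          return matchList;
                      }
                  }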
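
The answer describes MatchOffset only as "a simple DTO" and does not show it. A minimal sketch follows, assuming the constructor argument order used in HitPositionCollector and the accessor names used in the test (getToken, getStartPos, getEndPos); the field names and toString are illustrative choices, not part of the original answer.

```java
// Immutable holder for one match: the matched token text and its offsets.
final class MatchOffset {
    private final String token;
    private final int startPos;
    private final int endPos;

    MatchOffset(String token, int startPos, int endPos) {
        this.token = token;
        this.startPos = startPos;
        this.endPos = endPos;
    }

    public String getToken() {
        return token;
    }

    public int getStartPos() {
        return startPos;
    }

    public int getEndPos() {
        return endPos;
    }

    @Override
    public String toString() {
        return token + "[" + startPos + "," + endPos + ")";
    }

    public static void main(String[] args) {
        System.out.println(new MatchOffset("brown", 10, 15)); // prints "brown[10,15)"
    }
}
```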
                  

                  Main code

                  // Assumes a JUnit test class with `searcher` (IndexSearcher) and `analyzer` fields.
                  // QueryScorer, Highlighter, SimpleSpanFragmenter and TokenSources live in
                  // org.apache.lucene.search.highlight.
                  public void testHitsWithHitPositionCollector() throws Exception
                  {
                      System.out.println(" .... testHitsWithHitPositionCollector");
                      // Note: "bro*" is parsed as a prefix query; a trailing ~ (as in "wuzzy~")
                      // is what produces a fuzzy query.
                      String fuzzyStr = "bro*";

                      QueryParser parser = new QueryParser(Version.LUCENE_30, "f", analyzer);
                      Query fzyQry = parser.parse(fuzzyStr);
                      TopDocs hits = searcher.search(fzyQry, 10);

                      QueryScorer scorer = new QueryScorer(fzyQry, "f");

                      HitPositionCollector myFormatter = new HitPositionCollector();

                      // Highlighter(Formatter formatter, Scorer fragmentScorer)
                      Highlighter highlighter = new Highlighter(myFormatter, scorer);
                      highlighter.setTextFragmenter(new SimpleSpanFragmenter(scorer));

                      Analyzer analyzer2 = new SimpleAnalyzer();

                      // Only the top hit is checked here.
                      int docId = hits.scoreDocs[0].doc;
                      Document doc = searcher.doc(docId);
                      String title = doc.get("f");

                      TokenStream stream = TokenSources.getAnyTokenStream(searcher.getIndexReader(),
                                                                          docId,
                                                                          "f",
                                                                          doc,
                                                                          analyzer2);

                      // Running the highlighter drives the formatter, which records the matches.
                      String fragment = highlighter.getBestFragment(stream, title);

                      System.out.println(fragment);
                      assertEquals("the quick brown fox jumps over the lazy dog", fragment);

                      MatchOffset mo = myFormatter.getMatchList().get(0);

                      assertEquals(10, mo.getStartPos());
                      assertEquals(15, mo.getEndPos());
                      assertEquals("brown", mo.getToken());
                  }
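
The assertions in the test (start 10, end 15, token "brown") are consistent with offsets being character positions into the stored field text, with the start inclusive and the end exclusive. Under that assumption, a collected match can be mapped back to the original text with a plain substring call. The class and method names below are hypothetical and exist only for this sketch.

```java
final class OffsetSliceDemo {

    // Returns the span of text covered by [startOffset, endOffset).
    static String slice(String text, int startOffset, int endOffset) {
        return text.substring(startOffset, endOffset);
    }

    public static void main(String[] args) {
        String text = "the quick brown fox jumps over the lazy dog";
        System.out.println(slice(text, 10, 15)); // prints "brown"
    }
}
```

Slicing the stored text this way recovers the term exactly as it appears in the document, which can differ in case or form from the analyzed token.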
                  
