Ignore diacritics when searching #20550

Closed Answered by maciejaszyk
elitastic asked this question in Q&A
Hi,
I would say a custom analyzer is the most elegant way. This way you don't have to handle diacritics removal in your app.
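For comparison, handling diacritics removal in the app usually means normalizing both the stored text and the search term yourself. The snippet below is only a minimal sketch of that approach using plain .NET APIs; `DiacriticsHelper` and `RemoveDiacritics` are hypothetical names, not part of RavenDB or Lucene.Net.

```csharp
using System;
using System.Globalization;
using System.Text;

// Hypothetical helper for illustration; not a RavenDB or Lucene.Net API.
public static class DiacriticsHelper
{
    public static string RemoveDiacritics(string text)
    {
        // FormD splits each accented letter into a base letter plus combining marks.
        string decomposed = text.Normalize(NormalizationForm.FormD);
        var sb = new StringBuilder(decomposed.Length);

        foreach (char c in decomposed)
        {
            // Keep everything except the combining (non-spacing) marks.
            if (CharUnicodeInfo.GetUnicodeCategory(c) != UnicodeCategory.NonSpacingMark)
            {
                sb.Append(c);
            }
        }

        // Recompose whatever is left into the usual composed form.
        return sb.ToString().Normalize(NormalizationForm.FormC);
    }
}

public static class Program
{
    public static void Main()
    {
        Console.WriteLine(DiacriticsHelper.RemoveDiacritics("Zürich café")); // prints "Zurich cafe"
    }
}
```

Letters with no Unicode decomposition, such as `ø` or `ł`, pass through this sketch unchanged. You would also have to apply it consistently at both index time and query time, which is the bookkeeping a custom analyzer takes off your hands.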

Currently you cannot derive from RavenStandardAnalyzer because it is sealed. However, you can inherit from StandardAnalyzer instead; the version we use is Version.LUCENE_29.

As for ReusableTokenStream, overriding it gives some performance gains. However, be careful when implementing the override to avoid sharing state between calls.

Simple analyzer:

using System.IO;
using Lucene.Net.Analysis;
using Lucene.Net.Analysis.Standard;
namespace Analyzer;

public class LanguageAnalyzer : Lucene.Net.Analysis.Standard.StandardAnalyzer
{
    public LanguageAnalyzer() : base(Lucene.Net

Answer selected by ayende