Welcome to ShenZhenJia Knowledge Sharing Community for programmer and developer-Open, Learning and Share
Using an open source Java automaton library, e.g. org.apache.lucene.util.automaton or dk.brics.automaton, how can I build an automaton for prefix matching?

For example, an automaton created from the set of strings ["lucene", "lucid"] should match when given "luc" or "luce", but not when given "lucy" or "lucid dream".


1 Answer

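Prefix matching is possible with org.apache.lucene.util.automaton: build the automaton for the set of strings, then set every state to accept. Every state of the built automaton lies on a path to one of the full strings, so the result accepts exactly the prefixes of those strings (including the empty string and the strings themselves), e.g.: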

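    import java.util.ArrayList;
    import java.util.Collections;
    import java.util.List;
    import org.apache.lucene.util.BytesRef;
    import org.apache.lucene.util.automaton.Automaton;
    import org.apache.lucene.util.automaton.DaciukMihovAutomatonBuilder;

    // Same set of strings as in the question.
    String[] strings = new String[]{"lucene", "lucid"};
    final List<BytesRef> terms = new ArrayList<>();
    for (String s : strings) {
        terms.add(new BytesRef(s));
    }
    Collections.sort(terms); // the builder requires sorted input
    final Automaton a = DaciukMihovAutomatonBuilder.build(terms);

    // Mark every state as accepting so that each prefix is matched.
    for (int i = 0; i < a.getNumStates(); i++) {
        a.setAccept(i, true);
    }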
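
To see why accepting in every state gives prefix matching, here is a minimal, library-free sketch of the same idea. It does not use Lucene at all: `PrefixAutomatonSketch`, `State` and `matches` are hypothetical names made up for this illustration, and the automaton is a plain trie rather than the minimized automaton Lucene builds. The accepted language should be the same for this example, though.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Hypothetical illustration only: a trie used as a deterministic automaton. */
final class PrefixAutomatonSketch {

    /** One automaton state: outgoing transitions plus an accept flag. */
    static final class State {
        final Map<Character, State> next = new HashMap<>();
        boolean accept;
    }

    private final State start = new State();

    /**
     * Builds the automaton for the given words. With acceptAllStates == true,
     * every state on the way to a word accepts, so every prefix matches.
     */
    PrefixAutomatonSketch(List<String> words, boolean acceptAllStates) {
        start.accept = acceptAllStates;
        for (String word : words) {
            State s = start;
            for (char c : word.toCharArray()) {
                s = s.next.computeIfAbsent(c, k -> new State());
                if (acceptAllStates) {
                    s.accept = true;
                }
            }
            s.accept = true; // the end of a full word always accepts
        }
    }

    /** Follows one transition per character; fails if a transition is missing. */
    boolean matches(String input) {
        State s = start;
        for (char c : input.toCharArray()) {
            s = s.next.get(c);
            if (s == null) {
                return false;
            }
        }
        return s.accept;
    }

    public static void main(String[] args) {
        PrefixAutomatonSketch prefix =
                new PrefixAutomatonSketch(List.of("lucene", "lucid"), true);
        System.out.println(prefix.matches("luc"));         // true
        System.out.println(prefix.matches("luce"));        // true
        System.out.println(prefix.matches("lucy"));        // false
        System.out.println(prefix.matches("lucid dream")); // false
    }
}
```

Passing `false` as the second argument gives the ordinary exact-match behaviour, where "luc" is rejected and only "lucene" and "lucid" are accepted. Note that in prefix mode the start state accepts too, so the empty string matches; if that is not wanted, the start state would have to be left non-accepting.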
