private static void updateExtractRule(SequenceMatchRules.AnnotationExtractRule r, Env env, Function<CoreMap, Value> extractor) { MatchedExpression.SingleAnnotationExtractor annotationExtractor = SequenceMatchRules.createAnnotationExtractor(env,r); annotationExtractor.valueExtractor = extractor; r.extractRule = new SequenceMatchRules.CoreMapExtractRule<>( env, r.annotationField, new SequenceMatchRules.BasicSequenceExtractRule(annotationExtractor)); r.filterRule = new SequenceMatchRules.AnnotationMatchedFilter(annotationExtractor); }
@Override protected Frag build() { Frag f = pattern.build(); Frag frag = new Frag(new GroupStartState(captureGroupId, f.start), f.out); frag.connect(new GroupEndState(captureGroupId)); return frag; }
public AnnotationExtractRule create(Env env) { AnnotationExtractRule r = new AnnotationExtractRule(); r.resultAnnotationField = EnvLookup.getDefaultResultAnnotationKey(env); r.resultNestedAnnotationField = EnvLookup.getDefaultNestedResultsAnnotationKey(env); r.tokensAnnotationField = EnvLookup.getDefaultTokensAnnotationKey(env); r.tokensResultAnnotationField = EnvLookup.getDefaultTokensResultAnnotationKey(env); if (env != null) { r.update(env, env.getDefaults()); } return r; }
public SequenceMatchResult<T> apply(SequenceMatchResult<T> seqMatchResult, int... groups) { SequenceMatcher<T> matcher = pattern.getMatcher(seqMatchResult.elements()); if (matcher.find()) { return matcher; } else { return null; } } }
@Override protected Frag build() { Frag frag = expr.build(); frag.connect(new ValueState(value)); return frag; }
/** * Create a multi-pattern matcher for matching across multiple TokensRegex patterns. * * @param patterns Input patterns * @return A MultiPatternMatcher */ public static MultiPatternMatcher<CoreMap> getMultiPatternMatcher(TokenSequencePattern... patterns) { return new MultiPatternMatcher<>( new MultiPatternMatcher.BasicSequencePatternTrigger<>(new CoreMapNodePatternTrigger(patterns)), patterns); }
/** * Given a segment of text, returns the list of spans (PhraseMatch) that correspond * to a phrase in the table (filtered by the list of acceptable phrases) * @param acceptablePhrases - What phrases to look for (needs to be a subset of the phrases already in the table) * @param text Input text to search over * @return List of all matched spans */ public List<PhraseMatch> findAllMatches(List<Phrase> acceptablePhrases, String text) { WordList tokens = toNormalizedWordList(text); return findAllMatches(acceptablePhrases, tokens, 0, tokens.size(), false); }
protected AnnotationExtractRule create(Env env, String expr, Expression result) { AnnotationExtractRule r = super.create(env, null); if (r.annotationField == null) { r.annotationField = EnvLookup.getDefaultTextAnnotationKey(env); } r.ruleType = TEXT_PATTERN_RULE_TYPE; updateExtractRule(r, env, expr, null, result); return r; }
public List<PhraseMatch> findAllMatches(WordList tokens, int tokenStart, int tokenEnd, boolean needNormalization) { return findMatches(null, tokens, tokenStart, tokenEnd, needNormalization, true /* find all */, false /* don't need to match end exactly */); }
@Override protected Frag build() { State s = new MultiNodePatternState(multiNodePattern); return new Frag(s); }
@Override protected PatternExpr copy() { return new GroupPatternExpr(pattern.copy(), capture, captureGroupId, varname); }
@Override protected PatternExpr copy() { return new RepeatPatternExpr(pattern.copy(), minMatch, maxMatch, greedyMatch); }
@Override protected Frag build() { State s = new BackRefState(matcher, captureGroupId); return new Frag(s); }
public List<PhraseMatch> findMatches(WordList tokens, int tokenStart, int tokenEnd, boolean needNormalization) { return findMatches(null, tokens, tokenStart, tokenEnd, needNormalization, false /* don't need to find all */, false /* don't need to match end exactly */); }
protected AnnotationExtractRule create(Env env, SequencePattern.PatternExpr expr, Expression result) { AnnotationExtractRule r = super.create(env, null); if (r.annotationField == null) { r.annotationField = r.tokensAnnotationField; } r.ruleType = TOKEN_PATTERN_RULE_TYPE; updateExtractRule(r, env, expr, null, result); return r; }
public TokensRegexAnnotator(String... files) { env = TokenSequencePattern.getNewEnv(); extractor = CoreMapExpressionExtractor.createExtractorFromFiles(env, files); verbose = false; }
@Override protected PatternExpr transform(NodePatternTransformer transformer) { return new ValuePatternExpr(expr.transform(transformer), value); }
/** * Create a multi-pattern matcher for matching across multiple TokensRegex patterns. * * @param patterns Collection of input patterns * @return A MultiPatternMatcher */ public static MultiPatternMatcher<CoreMap> getMultiPatternMatcher(Collection<TokenSequencePattern> patterns) { return new MultiPatternMatcher<>( new MultiPatternMatcher.BasicSequencePatternTrigger<>(new CoreMapNodePatternTrigger(patterns)), patterns); }
/** * Given a segment of text, returns the list of spans (PhraseMatch) that correspond * to a phrase in the table * @param text Input text to search over * @return List of all matched spans */ public List<PhraseMatch> findAllMatches(String text) { WordList tokens = toNormalizedWordList(text); return findAllMatches(tokens, 0, tokens.size(), false); }