IKAnalyzer2012FF + Lucene4.9 TokenStream contract violation: reset()/close() call missing

大家好，又见面了，我是你们的朋友全栈君。

异常信息如下：

java.lang.IllegalStateException: TokenStream contract violation: reset()/close() call missing, reset() called multiple times, or subclass does not call super.reset(). Please see Javadocs of TokenStream class for more information about the correct consuming workflow.
	at org.apache.lucene.analysis.Tokenizer$1.read(Tokenizer.java:111)
	at java.io.Reader.read(Reader.java:140)
	at org.wltea.analyzer.core.AnalyzeContext.fillBuffer(AnalyzeContext.java:124)
	at org.wltea.analyzer.core.IKSegmenter.next(IKSegmenter.java:122)
	at org.wltea.analyzer.lucene.IKTokenizer.incrementToken(IKTokenizer.java:78)
	at unit.test.IKAnalyzerTest.test01(IKAnalyzerTest.java:29)
	at sun.reflect.NativeMethodAccessorImpl.invoke0(Native Method)
	at sun.reflect.NativeMethodAccessorImpl.invoke(NativeMethodAccessorImpl.java:57)
	at sun.reflect.DelegatingMethodAccessorImpl.invoke(DelegatingMethodAccessorImpl.java:43)
	at java.lang.reflect.Method.invoke(Method.java:601)
	at junit.framework.TestCase.runTest(TestCase.java:168)
	at junit.framework.TestCase.runBare(TestCase.java:134)
	at junit.framework.TestResult$1.protect(TestResult.java:110)
	at junit.framework.TestResult.runProtected(TestResult.java:128)
	at junit.framework.TestResult.run(TestResult.java:113)
	at junit.framework.TestCase.run(TestCase.java:124)
	at junit.framework.TestSuite.runTest(TestSuite.java:243)
	at junit.framework.TestSuite.run(TestSuite.java:238)
	at org.junit.internal.runners.JUnit38ClassRunner.run(JUnit38ClassRunner.java:83)
	at org.eclipse.jdt.internal.junit4.runner.JUnit4TestReference.run(JUnit4TestReference.java:50)
	at org.eclipse.jdt.internal.junit.runner.TestExecution.run(TestExecution.java:38)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:467)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.runTests(RemoteTestRunner.java:683)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.run(RemoteTestRunner.java:390)
	at org.eclipse.jdt.internal.junit.runner.RemoteTestRunner.main(RemoteTestRunner.java:197)

代码：

public void test01() throws IOException {
<span style="white-space:pre">	</span>String text="基于java语言开发的轻量级的中文分词工具包，一个轻量级框架!"; 
		
        // 创建分词对象  
        Analyzer analyzer = new IKAnalyzer(true);       
        StringReader reader = new StringReader(text);  
        
        // 分词  
        TokenStream ts = analyzer.tokenStream("", reader);  
        CharTermAttribute term = ts.getAttribute(CharTermAttribute.class);  
        
        // 遍历分词数据  
        while(ts.incrementToken()){  
            System.out.print(term.toString()+" | ");  
        }  
        
        reader.close();  
        System.out.println();  
}

上面的代码为旧的分词步骤，按照新的API，调用TokenStream的流程如下：

1、Instantiation of TokenStream/TokenFilters which add/get attributes to/ the AttributeSource.
2、The consumer calls reset（）.
3、The consumer retrieves attributes the stream and stores local references to all attributes it wants to access.
4、The consumer calls incrementToken（） until it returns false consuming the attributes after each call.
5、The consumer calls end（） so that any end-of-stream operations can be performed.
6、The consumer calls close（） to release any resource when finished using the TokenStream.

所以在调用incrementToken（）之前需要调用一次reset()，如下面的第10行代码：

public void test01() throws IOException {
		String text="基于java语言开发的轻量级的中文分词工具包，一个轻量级框架!"; 
		
        // 创建分词对象  
        Analyzer analyzer = new IKAnalyzer(true);       
        StringReader reader = new StringReader(text);  
        
        // 分词  
        TokenStream ts = analyzer.tokenStream("", reader); 
        ts.reset();
        CharTermAttribute term = ts.getAttribute(CharTermAttribute.class);  
        
        // 遍历分词数据  
        while(ts.incrementToken()){  
            System.out.print(term.toString()+" | ");  
        }  
        
        reader.close();  
        System.out.println();  
	}

发布者：全栈程序员-用户IM，转载请注明出处：https://javaforall.cn/163106.html原文链接：https://javaforall.cn

【正版授权，激活自己账号】： Jetbrains全家桶Ide使用，1年售后保障，每天仅需1毛

【官方授权正版激活】： 官方授权正版激活支持Jetbrains家族下所有IDE 使用个人JB账号...