public class SolrContentHandler extends org.xml.sax.helpers.DefaultHandler implements ExtractingParams
The class responsible for handling Tika events and translating them into SolrInputDocuments.
This class is not thread-safe.
Users may wish to override this class to provide their own functionality.

| Modifier and Type | Field and Description |
|---|---|
| protected boolean | captureAttribs |
| protected StringBuilder | catchAllBuilder |
| protected String | contentFieldName |
| protected Collection<String> | dateFormats |
| protected String | defaultField |
| protected SolrInputDocument | document |
| protected Map<String,StringBuilder> | fieldBuilders |
| protected boolean | lowerNames |
| protected org.apache.tika.metadata.Metadata | metadata |
| protected SolrParams | params |
| protected IndexSchema | schema |
| protected String | unknownFieldPrefix |
Fields inherited from interface ExtractingParams: BOOST_PREFIX, CAPTURE_ATTRIBUTES, CAPTURE_ELEMENTS, DEFAULT_FIELD, EXTRACT_FORMAT, EXTRACT_ONLY, IGNORE_TIKA_EXCEPTION, LITERALS_OVERRIDE, LITERALS_PREFIX, LOWERNAMES, MAP_PREFIX, PASSWORD_MAP_FILE, RESOURCE_NAME, RESOURCE_PASSWORD, STREAM_TYPE, UNKNOWN_FIELD_PREFIX, XPATH_EXPRESSION

| Constructor and Description |
|---|
| SolrContentHandler(org.apache.tika.metadata.Metadata metadata, SolrParams params, IndexSchema schema) |
| SolrContentHandler(org.apache.tika.metadata.Metadata metadata, SolrParams params, IndexSchema schema, Collection<String> dateFormats) |
| Modifier and Type | Method and Description |
|---|---|
| protected void | addCapturedContent(): Add the per-field captured content to the Solr document. |
| protected void | addContent(): Add in the catch-all content to the field. |
| protected void | addField(String fname, String fval, String[] vals) |
| protected void | addLiterals(): Add in the literals to the document using the params and the ExtractingParams.LITERALS_PREFIX. |
| protected void | addMetadata(): Add in any metadata using metadata as the source. |
| void | characters(char[] chars, int offset, int length) |
| void | endElement(String uri, String localName, String qName) |
| protected String | findMappedName(String name): Get the name mapping. |
| protected float | getBoost(String name): Get the value of any boost factor for the mapped name. |
| void | ignorableWhitespace(char[] chars, int offset, int length): Treat the same as any other characters. |
| SolrInputDocument | newDocument(): This is called by a consumer when it is ready to deal with a new SolrInputDocument. |
| void | startDocument() |
| void | startElement(String uri, String localName, String qName, Attributes attributes) |
| protected String | transformValue(String val, SchemaField schFld): Can be used to transform input values based on their SchemaField. This implementation only formats dates using the DateUtil. |
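
The `characters` and `ignorableWhitespace` entries above describe a common SAX accumulation pattern: every character event is appended to a builder, and ignorable whitespace is handled the same way. The following is a minimal sketch of that pattern using only the JDK. `CatchAllSketch` and its `extract` helper are hypothetical names introduced for illustration; this is not the `SolrContentHandler` source and it involves no Solr or Tika classes.

```java
import java.io.StringReader;
import javax.xml.parsers.SAXParserFactory;
import org.xml.sax.InputSource;
import org.xml.sax.helpers.DefaultHandler;

// Hypothetical stand-in, NOT SolrContentHandler: it only illustrates
// collecting SAX character events into a single catch-all builder.
final class CatchAllSketch extends DefaultHandler {
    // Assumed analogue of the catchAllBuilder field.
    private final StringBuilder catchAll = new StringBuilder();

    @Override
    public void characters(char[] chars, int offset, int length) {
        catchAll.append(chars, offset, length);
    }

    @Override
    public void ignorableWhitespace(char[] chars, int offset, int length) {
        // Treated the same as any other characters.
        characters(chars, offset, length);
    }

    String text() {
        return catchAll.toString();
    }

    // Parses the XML string with a fresh handler and returns the collected text.
    static String extract(String xml) {
        CatchAllSketch handler = new CatchAllSketch();
        try {
            SAXParserFactory.newInstance().newSAXParser()
                    .parse(new InputSource(new StringReader(xml)), handler);
        } catch (Exception e) {
            throw new IllegalStateException("could not parse input", e);
        }
        return handler.text();
    }

    public static void main(String[] args) {
        System.out.println(extract("<doc><p>Hello</p><p>Solr</p></doc>"));
    }
}
```

Because the handler holds mutable state in its builder, the sketch creates a new instance per parse, which is one simple way to respect a "not thread-safe" contract.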
Methods inherited from class org.xml.sax.helpers.DefaultHandler: endDocument, endPrefixMapping, error, fatalError, notationDecl, processingInstruction, resolveEntity, setDocumentLocator, skippedEntity, startPrefixMapping, unparsedEntityDecl, warning

protected SolrInputDocument document
protected Collection<String> dateFormats
protected org.apache.tika.metadata.Metadata metadata
protected SolrParams params
protected StringBuilder catchAllBuilder
protected IndexSchema schema
protected Map<String,StringBuilder> fieldBuilders
protected boolean captureAttribs
protected boolean lowerNames
protected String contentFieldName
protected String unknownFieldPrefix
protected String defaultField
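
The `lowerNames`, `unknownFieldPrefix`, and `defaultField` fields suggest that an incoming name is normalized and then resolved against the schema. The sketch below shows one plausible way such a resolution could be ordered. Everything in it is an assumption made for illustration: `FieldNameSketch`, `resolve`, the plain `Map` and `Set` standing in for the name mappings and `IndexSchema`, and the order of the steps are not taken from this page.

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Locale;
import java.util.Map;
import java.util.Set;

// Hypothetical helper, NOT part of Solr. The step order is an assumption.
final class FieldNameSketch {
    static String resolve(String name, boolean lowerNames, Map<String, String> mappings,
                          Set<String> schemaFields, String unknownFieldPrefix,
                          String defaultField) {
        String n = name;
        if (lowerNames) {
            // Assumed normalization: lowercase, other characters become '_'.
            n = n.toLowerCase(Locale.ROOT).replaceAll("[^a-z0-9]", "_");
        }
        // Assumed analogue of findMappedName: fall back to the name itself.
        n = mappings.getOrDefault(n, n);
        if (schemaFields.contains(n)) {
            return n;
        }
        if (!unknownFieldPrefix.isEmpty()) {
            return unknownFieldPrefix + n;
        }
        if (!defaultField.isEmpty()) {
            return defaultField;
        }
        return n;
    }

    public static void main(String[] args) {
        Map<String, String> mappings = new HashMap<>();
        mappings.put("content_type", "mime");
        Set<String> schemaFields = new HashSet<>(Arrays.asList("mime", "title"));

        // prints "mime"
        System.out.println(resolve("Content-Type", true, mappings, schemaFields, "attr_", "text"));
        // prints "attr_x_custom"
        System.out.println(resolve("X-Custom", true, mappings, schemaFields, "attr_", "text"));
    }
}
```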
public SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
SolrParams params,
IndexSchema schema)
public SolrContentHandler(org.apache.tika.metadata.Metadata metadata,
SolrParams params,
IndexSchema schema,
Collection<String> dateFormats)
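
The second constructor accepts a collection of date formats, and `transformValue` is documented as formatting dates. As a rough illustration of how a collection of candidate patterns can be applied to a raw value, here is a JDK-only sketch. It does not use or reproduce `DateUtil`; `DateFormatSketch`, `toIsoUtc`, the UTC assumption, and the ISO-style output pattern are all choices made for this example.

```java
import java.text.ParsePosition;
import java.text.SimpleDateFormat;
import java.util.Arrays;
import java.util.Collection;
import java.util.Date;
import java.util.Locale;
import java.util.TimeZone;

// Hypothetical helper, NOT Solr's DateUtil: tries each pattern in turn.
final class DateFormatSketch {
    static String toIsoUtc(String val, Collection<String> dateFormats) {
        TimeZone utc = TimeZone.getTimeZone("UTC");
        for (String pattern : dateFormats) {
            SimpleDateFormat in = new SimpleDateFormat(pattern, Locale.ROOT);
            in.setTimeZone(utc);   // assumption: inputs without a zone are UTC
            in.setLenient(false);
            ParsePosition pos = new ParsePosition(0);
            Date parsed = in.parse(val, pos);
            // Accept only patterns that consume the whole value.
            if (parsed != null && pos.getIndex() == val.length()) {
                SimpleDateFormat out =
                        new SimpleDateFormat("yyyy-MM-dd'T'HH:mm:ss'Z'", Locale.ROOT);
                out.setTimeZone(utc);
                return out.format(parsed);
            }
        }
        // No pattern matched: return the value unchanged.
        return val;
    }

    public static void main(String[] args) {
        Collection<String> formats = Arrays.asList("yyyy-MM-dd", "yyyy-MM-dd HH:mm:ss");
        // prints "2014-03-01T12:30:00Z"
        System.out.println(toIsoUtc("2014-03-01 12:30:00", formats));
    }
}
```

Requiring the pattern to consume the entire value avoids a shorter pattern such as `yyyy-MM-dd` silently matching only the prefix of a longer timestamp.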
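
public SolrInputDocument newDocument()
This is called by a consumer when it is ready to deal with a new SolrInputDocument.
Returns: the SolrInputDocument.
See also: addMetadata(), addCapturedContent(), addContent(), addLiterals()

protected void addCapturedContent()
Add the per-field captured content to the Solr document, using the fieldBuilders info.

protected void addContent()
Add in the catch-all content to the field, using the contentFieldName and the catchAllBuilder.

protected void addLiterals()
Add in the literals to the document using the params and the ExtractingParams.LITERALS_PREFIX.

protected void addMetadata()
Add in any metadata using metadata as the source.

public void startDocument() throws SAXException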
Specified by: startDocument in interface ContentHandler
Overrides: startDocument in class org.xml.sax.helpers.DefaultHandler
Throws: SAXException

public void startElement(String uri, String localName, String qName, Attributes attributes) throws SAXException
Specified by: startElement in interface ContentHandler
Overrides: startElement in class org.xml.sax.helpers.DefaultHandler
Throws: SAXException

public void endElement(String uri, String localName, String qName) throws SAXException
Specified by: endElement in interface ContentHandler
Overrides: endElement in class org.xml.sax.helpers.DefaultHandler
Throws: SAXException

public void characters(char[] chars, int offset, int length) throws SAXException
Specified by: characters in interface ContentHandler
Overrides: characters in class org.xml.sax.helpers.DefaultHandler
Throws: SAXException

public void ignorableWhitespace(char[] chars, int offset, int length) throws SAXException
Treat the same as any other characters.
Specified by: ignorableWhitespace in interface ContentHandler
Overrides: ignorableWhitespace in class org.xml.sax.helpers.DefaultHandler
Throws: SAXException

protected String transformValue(String val, SchemaField schFld)
Can be used to transform input values based on their SchemaField. This implementation only formats dates using the DateUtil.
Parameters:
val - The value to transform
schFld - The SchemaField

protected float getBoost(String name)
Get the value of any boost factor for the mapped name.
Parameters:
name - The name of the field to see if there is a boost specified

Copyright © 2000–2014 The Apache Software Foundation. All rights reserved.