Package com.univocity.parsers.csv
Class CsvFormatDetector
java.lang.Object
    com.univocity.parsers.csv.CsvFormatDetector
- All Implemented Interfaces:
InputAnalysisProcess
public abstract class CsvFormatDetector extends java.lang.Object implements InputAnalysisProcess
An InputAnalysisProcess to detect column delimiters, quotes and quote escapes in a CSV input.
-
-
Field Summary
Fields
Modifier and Type    Field
private char[]       allowedDelimiters
private char         comment
private char[]       delimiterPreference
private int          MAX_ROW_SAMPLES
private char         normalizedNewLine
private char         suggestedDelimiter
private char         suggestedQuote
private char         suggestedQuoteEscape
private int          whitespaceRangeStart
-
Constructor Summary
Constructors
CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
    Builds a new CsvFormatDetector.
-
Method Summary
Methods
protected abstract void apply(char delimiter, char quote, char quoteEscape)
    Applies the discovered CSV format elements to the CsvParser.
protected java.util.Map<java.lang.Character,java.lang.Integer> calculateTotals(java.util.List<java.util.Map<java.lang.Character,java.lang.Integer>> symbolsPerRow)
void execute(char[] characters, int length)
    A sequence of characters of the input buffer to be analyzed.
protected char getChar(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar, boolean min)
    Returns the character with the highest or lowest associated number.
protected void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol)
    Increments the number associated with a character in a map by 1.
protected void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol, int incrementSize)
    Increments the number associated with a character in a map.
protected boolean isAllowedDelimiter(char ch)
protected boolean isSymbol(char ch)
protected char max(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
    Returns the character with the highest associated number.
protected char min(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
    Returns the character with the lowest associated number.
protected char pickDelimiter(java.util.Map<java.lang.Character,java.lang.Integer> sums, java.util.Map<java.lang.Character,java.lang.Integer> totals)
-
-
-
Field Detail
-
MAX_ROW_SAMPLES
private final int MAX_ROW_SAMPLES
-
comment
private final char comment
-
suggestedDelimiter
private final char suggestedDelimiter
-
normalizedNewLine
private final char normalizedNewLine
-
whitespaceRangeStart
private final int whitespaceRangeStart
-
allowedDelimiters
private char[] allowedDelimiters
-
delimiterPreference
private char[] delimiterPreference
-
suggestedQuote
private final char suggestedQuote
-
suggestedQuoteEscape
private final char suggestedQuoteEscape
-
-
Constructor Detail
-
CsvFormatDetector
public CsvFormatDetector(int maxRowSamples, CsvParserSettings settings, int whitespaceRangeStart)
Builds a new CsvFormatDetector.
Parameters:
maxRowSamples - the number of row samples to collect before analyzing the statistics
settings - the configuration provided by the user, with potential defaults in case the detection is unable to discover the proper column delimiter or quote character
whitespaceRangeStart - starting range of characters considered to be whitespace
-
-
Method Detail
-
calculateTotals
protected java.util.Map<java.lang.Character,java.lang.Integer> calculateTotals(java.util.List<java.util.Map<java.lang.Character,java.lang.Integer>> symbolsPerRow)
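This page gives no description for calculateTotals. Judging only by its name and signature, one plausible reading is that it sums the per-row symbol counts into a single map. The sketch below illustrates that reading. The class TotalsSketch is hypothetical and the behavior is an assumption, not the library's source.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical stand-in; assumes "totals" means per-symbol sums across all rows.
class TotalsSketch {

    // Adds up, for every symbol, the counts recorded in each row's map.
    static Map<Character, Integer> calculateTotals(List<Map<Character, Integer>> symbolsPerRow) {
        Map<Character, Integer> totals = new HashMap<>();
        for (Map<Character, Integer> row : symbolsPerRow) {
            for (Map.Entry<Character, Integer> entry : row.entrySet()) {
                // merge() stores the value if the key is absent, otherwise sums it.
                totals.merge(entry.getKey(), entry.getValue(), Integer::sum);
            }
        }
        return totals;
    }
}
```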
-
execute
public void execute(char[] characters, int length)
Description copied from interface: InputAnalysisProcess
A sequence of characters of the input buffer to be analyzed.
Specified by:
execute in interface InputAnalysisProcess
Parameters:
characters - the input buffer
length - the last character position loaded into the buffer
-
pickDelimiter
protected char pickDelimiter(java.util.Map<java.lang.Character,java.lang.Integer> sums, java.util.Map<java.lang.Character,java.lang.Integer> totals)
-
increment
protected void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol)
Increments the number associated with a character in a map by 1.
Parameters:
map - the map of characters and their numbers
symbol - the character whose number should be incremented
-
increment
protected void increment(java.util.Map<java.lang.Character,java.lang.Integer> map, char symbol, int incrementSize)
Increments the number associated with a character in a map.
Parameters:
map - the map of characters and their numbers
symbol - the character whose number should be incremented
incrementSize - the size of the increment
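The two increment overloads can be pictured with a small stand-alone sketch. IncrementSketch is a hypothetical class written for illustration. Treating a missing symbol as a count of 0 is an assumption, since this page does not say how absent keys are handled.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; not the library's source.
class IncrementSketch {

    // Adds incrementSize to the number stored for symbol.
    // Assumption: a symbol that is not yet in the map starts at 0.
    static void increment(Map<Character, Integer> map, char symbol, int incrementSize) {
        Integer count = map.get(symbol);
        if (count == null) {
            count = 0;
        }
        map.put(symbol, count + incrementSize);
    }

    // Convenience overload: increments by 1.
    static void increment(Map<Character, Integer> map, char symbol) {
        increment(map, symbol, 1);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        // Count the semicolons in one sample row.
        for (char ch : "a;b;c".toCharArray()) {
            if (ch == ';') {
                increment(counts, ch);
            }
        }
        increment(counts, ',', 3);
        System.out.println(counts.get(';')); // prints 2
        System.out.println(counts.get(',')); // prints 3
    }
}
```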
-
min
protected char min(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the lowest associated number.
Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
Returns:
the character with the lowest number associated.
-
max
protected char max(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar)
Returns the character with the highest associated number.
Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
Returns:
the character with the highest number associated.
-
getChar
protected char getChar(java.util.Map<java.lang.Character,java.lang.Integer> map, java.util.Map<java.lang.Character,java.lang.Integer> totals, char defaultChar, boolean min)
Returns the character with the highest or lowest associated number.
Parameters:
map - the map of characters and their numbers
defaultChar - the default character to return in case the map is empty
min - a flag indicating whether to return the character associated with the lowest number in the map. If false, the character associated with the highest number found will be returned.
Returns:
the character with the highest/lowest number associated.
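A minimal sketch of the selection described for min, max and getChar follows. SelectSketch is a hypothetical class. The totals parameter is left out because this page does not document its role, and keeping the first entry encountered on a tie is likewise an assumption.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in; omits the undocumented "totals" parameter.
class SelectSketch {

    // Returns the character with the lowest (min == true) or highest
    // (min == false) number, or defaultChar when the map is empty.
    static char getChar(Map<Character, Integer> map, char defaultChar, boolean min) {
        char selected = defaultChar;
        Integer best = null;
        for (Map.Entry<Character, Integer> entry : map.entrySet()) {
            int value = entry.getValue();
            // Assumption: on a tie, the entry seen first is kept.
            boolean better = best == null || (min ? value < best : value > best);
            if (better) {
                best = value;
                selected = entry.getKey();
            }
        }
        return selected;
    }

    static char min(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, true);
    }

    static char max(Map<Character, Integer> map, char defaultChar) {
        return getChar(map, defaultChar, false);
    }

    public static void main(String[] args) {
        Map<Character, Integer> counts = new HashMap<>();
        counts.put(',', 5);
        counts.put(';', 2);
        System.out.println(max(counts, '|'));           // prints ,
        System.out.println(min(counts, '|'));           // prints ;
        System.out.println(max(new HashMap<>(), '|'));  // prints |
    }
}
```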
-
isSymbol
protected boolean isSymbol(char ch)
-
isAllowedDelimiter
protected boolean isAllowedDelimiter(char ch)
-
apply
protected abstract void apply(char delimiter, char quote, char quoteEscape)
Applies the discovered CSV format elements to the CsvParser.
Parameters:
delimiter - the discovered delimiter character
quote - the discovered quote character
quoteEscape - the discovered quote escape character
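Because apply is abstract, a concrete subclass decides what happens to the detected characters. The sketch below shows only that callback shape, using a hypothetical stand-in class (DetectorStandIn) so that it compiles without the library. The stand-in reports fixed values where the real class would run its analysis, and the names reportFixedResult and CapturingDetector are illustrative assumptions.

```java
// Hypothetical stand-in that mirrors only the abstract apply hook.
abstract class DetectorStandIn {

    protected abstract void apply(char delimiter, char quote, char quoteEscape);

    // Placeholder for the real analysis: hands fixed values to the hook.
    void reportFixedResult() {
        apply(';', '"', '"');
    }
}

// A subclass that simply remembers what was reported.
class CapturingDetector extends DetectorStandIn {
    char delimiter;
    char quote;
    char quoteEscape;

    @Override
    protected void apply(char delimiter, char quote, char quoteEscape) {
        this.delimiter = delimiter;
        this.quote = quote;
        this.quoteEscape = quoteEscape;
    }
}
```

A subclass of the actual CsvFormatDetector would typically copy these three characters into the parser's format configuration rather than into plain fields.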
-
-