Class MatchRatingApproachEncoder
- java.lang.Object
-
- org.apache.commons.codec.language.MatchRatingApproachEncoder
-
- All Implemented Interfaces:
Encoder, StringEncoder
public class MatchRatingApproachEncoder extends java.lang.Object implements StringEncoder
Match Rating Approach (MRA) phonetic algorithm, developed by Western Airlines in 1977. This class is immutable and thread-safe.
- Since:
- 1.8
- See Also:
- Wikipedia - Match Rating Approach
-
-
Field Summary
Fields (Modifier and Type, Field, Description):
- private static java.lang.String[] DOUBLE_CONSONANT
- private static int ELEVEN: Constants used mainly for the min rating value.
- private static java.lang.String EMPTY
- private static int FIVE: Constants used mainly for the min rating value.
- private static int FOUR: Constants used mainly for the min rating value.
- private static int ONE: Constants used mainly for the min rating value.
- private static java.lang.String PLAIN_ASCII: The plain letter equivalent of the accented letters.
- private static int SEVEN: Constants used mainly for the min rating value.
- private static int SIX: Constants used mainly for the min rating value.
- private static java.lang.String SPACE
- private static int THREE: Constants used mainly for the min rating value.
- private static int TWELVE: Constants used mainly for the min rating value.
- private static int TWO: Constants used mainly for the min rating value.
- private static java.lang.String UNICODE: Unicode characters corresponding to various accented letters.
-
Constructor Summary
Constructors (Constructor, Description):
- MatchRatingApproachEncoder()
-
Method Summary
Methods (Modifier and Type, Method, Description):
- (package private) java.lang.String cleanName(java.lang.String name): Cleans up a name.
- java.lang.Object encode(java.lang.Object pObject): Encodes an Object using the Match Rating Approach algorithm.
- java.lang.String encode(java.lang.String name): Encodes a String using the Match Rating Approach (MRA) algorithm.
- (package private) java.lang.String getFirst3Last3(java.lang.String name): Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
- (package private) int getMinRating(int sumLength): Obtains the min rating for the summed length of the 2 names.
- boolean isEncodeEquals(java.lang.String name1, java.lang.String name2): Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm.
- (package private) boolean isVowel(java.lang.String letter): Determines if a letter is a vowel.
- (package private) int leftToRightThenRightToLeftProcessing(java.lang.String name1, java.lang.String name2): Processes the names from left to right (first), then right to left, removing identical letters in the same positions.
- (package private) java.lang.String removeAccents(java.lang.String accentedWord): Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved.
- (package private) java.lang.String removeDoubleConsonants(java.lang.String name): Replaces any double consonant pair with the single letter equivalent.
- (package private) java.lang.String removeVowels(java.lang.String name): Deletes all vowels unless the vowel begins the word.
-
-
-
Field Detail
-
SPACE
private static final java.lang.String SPACE
- See Also:
- Constant Field Values
-
EMPTY
private static final java.lang.String EMPTY
- See Also:
- Constant Field Values
-
ONE
private static final int ONE
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
TWO
private static final int TWO
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
THREE
private static final int THREE
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
FOUR
private static final int FOUR
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
FIVE
private static final int FIVE
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
SIX
private static final int SIX
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
SEVEN
private static final int SEVEN
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
ELEVEN
private static final int ELEVEN
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
TWELVE
private static final int TWELVE
Constants used mainly for the min rating value.
- See Also:
- Constant Field Values
-
PLAIN_ASCII
private static final java.lang.String PLAIN_ASCII
The plain letter equivalent of the accented letters.
- See Also:
- Constant Field Values
-
UNICODE
private static final java.lang.String UNICODE
Unicode characters corresponding to various accented letters. For example, Ú is U with an acute accent.
- See Also:
- Constant Field Values
-
DOUBLE_CONSONANT
private static final java.lang.String[] DOUBLE_CONSONANT
-
-
Method Detail
-
cleanName
java.lang.String cleanName(java.lang.String name)
Cleans up a name:
1. Upper-cases everything
2. Removes some common punctuation
3. Removes accents
4. Removes any spaces
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The name to be cleaned
- Returns:
- The cleaned name
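The four cleaning steps listed above can be sketched in standalone Java as follows. This is a minimal illustration, not the library's code: the class name CleanNameSketch and the punctuation set are hypothetical (the description only says "some common punctuation"), and accent removal uses java.text.Normalizer as a stand-in for whatever the class does internally.

```java
final class CleanNameSketch {
    // Hypothetical punctuation set; the documentation does not list the exact characters.
    private static final String PUNCTUATION = "-&'.,";

    static String cleanName(String name) {
        // 1. Upper-case everything.
        String upper = name.toUpperCase(java.util.Locale.ENGLISH);

        // 2. Remove the (assumed) punctuation characters.
        StringBuilder sb = new StringBuilder(upper.length());
        for (int i = 0; i < upper.length(); i++) {
            char c = upper.charAt(i);
            if (PUNCTUATION.indexOf(c) < 0) {
                sb.append(c);
            }
        }

        // 3. Remove accents: decompose each letter, then drop the combining marks.
        String decomposed = java.text.Normalizer.normalize(sb, java.text.Normalizer.Form.NFD);
        String plain = decomposed.replaceAll("\\p{M}+", "");

        // 4. Remove any whitespace.
        return plain.replaceAll("\\s+", "");
    }
}
```

Under these assumptions, a call such as CleanNameSketch.cleanName("O'Brien-Smith") yields a single upper-case token with the apostrophe and hyphen dropped.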
-
encode
public final java.lang.Object encode(java.lang.Object pObject) throws EncoderException
Encodes an Object using the Match Rating Approach algorithm. This method exists to satisfy the requirements of the Encoder interface; it throws an EncoderException if the input object is not of type java.lang.String.
- Specified by:
encode in interface Encoder
- Parameters:
pObject - Object to encode
- Returns:
- An object (of type java.lang.String) containing the Match Rating Approach code that corresponds to the String supplied.
- Throws:
EncoderException - if the parameter supplied is not of type java.lang.String
-
encode
public final java.lang.String encode(java.lang.String name)
Encodes a String using the Match Rating Approach (MRA) algorithm.
- Specified by:
encode in interface StringEncoder
- Parameters:
name - String object to encode
- Returns:
- The MRA code corresponding to the String supplied
-
getFirst3Last3
java.lang.String getFirst3Last3(java.lang.String name)
Gets the first and last 3 letters of a name if it is longer than 6 characters; otherwise returns the name unchanged.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The string to get the substrings from
- Returns:
- The first 3 and last 3 letters of the input word, joined together.
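The truncation rule described above is small enough to sketch directly. The class name First3Last3Sketch is hypothetical and the code is only an illustration of the documented behaviour, not the library's implementation.

```java
final class First3Last3Sketch {
    static String getFirst3Last3(String name) {
        int length = name.length();
        if (length > 6) {
            // Keep the first three and the last three characters, dropping the middle.
            return name.substring(0, 3) + name.substring(length - 3);
        }
        // Names of 6 characters or fewer are returned unchanged.
        return name;
    }
}
```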
-
getMinRating
int getMinRating(int sumLength)
Obtains the min rating for the summed length of the 2 names. In essence, the larger the sum length, the smaller the min rating. Values are taken strictly from the documentation.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
sumLength - The combined length of the 2 strings
- Returns:
- The min rating value
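The mapping from summed length to min rating might look like the sketch below. The thresholds are an assumption taken from the commonly published MRA comparison table (for example, the one in the Wikipedia article linked above); this page does not list them, and the value returned for sums above 12 is a guess. The class name MinRatingSketch is hypothetical.

```java
final class MinRatingSketch {
    static int getMinRating(int sumLength) {
        // Assumed thresholds: the larger the summed length, the smaller the min rating.
        if (sumLength <= 4) {
            return 5;
        } else if (sumLength <= 7) {
            return 4;
        } else if (sumLength <= 11) {
            return 3;
        } else if (sumLength == 12) {
            return 2;
        }
        // Assumption: sums above 12 fall back to the lowest rating.
        return 1;
    }
}
```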
-
isEncodeEquals
public boolean isEncodeEquals(java.lang.String name1, java.lang.String name2)
Determines if two names are homophonous via the Match Rating Approach (MRA) algorithm. Note that the strings are cleaned in the same way as encode(String).
- Parameters:
name1 - First of the 2 strings (names) to compare
name2 - Second of the 2 names to compare
- Returns:
true if the encodings are identical, false otherwise.
-
isVowel
boolean isVowel(java.lang.String letter)
Determines if a letter is a vowel.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
letter - The letter under investigation
- Returns:
- True if a vowel, else false
-
leftToRightThenRightToLeftProcessing
int leftToRightThenRightToLeftProcessing(java.lang.String name1, java.lang.String name2)
Processes the names from left to right (first), then right to left, removing identical letters in the same positions. Then subtracts the length of the longer remaining string from 6 and returns the result.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name1 -
name2 -
- Returns:
- The result of the subtraction described above
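One way to read the two-pass comparison described above is sketched below. It is a simplified interpretation rather than the library's code: it assumes both inputs are already encoded (no spaces, at most 6 characters), compares positions counted from the start in the first pass and from the end in the second, and uses a space as a hypothetical "removed" marker. The names MraSimilaritySketch and similarity are illustrative only.

```java
final class MraSimilaritySketch {
    // Marker for a removed letter; assumes cleaned names contain no spaces.
    private static final char REMOVED = ' ';

    static int similarity(String name1, String name2) {
        char[] a = name1.toCharArray();
        char[] b = name2.toCharArray();
        int shared = Math.min(a.length, b.length);

        // Pass 1, left to right: remove letters that match at the same index.
        for (int i = 0; i < shared; i++) {
            if (a[i] == b[i]) {
                a[i] = REMOVED;
                b[i] = REMOVED;
            }
        }

        // Pass 2, right to left: remove letters that match at the same offset from the end.
        for (int i = 0; i < shared; i++) {
            int ia = a.length - 1 - i;
            int ib = b.length - 1 - i;
            if (a[ia] != REMOVED && a[ia] == b[ib]) {
                a[ia] = REMOVED;
                b[ib] = REMOVED;
            }
        }

        // Subtract the length of the longer remainder from 6.
        return 6 - Math.max(remaining(a), remaining(b));
    }

    private static int remaining(char[] chars) {
        int count = 0;
        for (char c : chars) {
            if (c != REMOVED) {
                count++;
            }
        }
        return count;
    }
}
```

For the encoded pair "BYRN" and "BRN", only the Y survives both passes, so this sketch returns 6 - 1 = 5.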
-
removeAccents
java.lang.String removeAccents(java.lang.String accentedWord)
Removes accented letters and replaces them with their non-accented ASCII equivalents. Case is preserved. http://www.codecodex.com/wiki/Remove_accent_from_letters_%28ex_.%C3%A9_to_e%29
- Parameters:
accentedWord - The word that may have accents in it.
- Returns:
- De-accented word
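The UNICODE and PLAIN_ASCII fields suggest a lookup between two parallel strings. The sketch below shows that idea with a deliberately abbreviated, hypothetical pair of tables (ACCENTED and PLAIN); the real constants are private and their full contents are not shown on this page.

```java
final class RemoveAccentsSketch {
    // Hypothetical, abbreviated tables: the character at index i in ACCENTED
    // maps to the character at index i in PLAIN.
    private static final String ACCENTED = "\u00C0\u00E0\u00C9\u00E9\u00DA\u00FA\u00D1\u00F1";
    private static final String PLAIN = "AaEeUuNn";

    static String removeAccents(String accentedWord) {
        if (accentedWord == null) {
            return null;
        }
        StringBuilder sb = new StringBuilder(accentedWord.length());
        for (int i = 0; i < accentedWord.length(); i++) {
            char c = accentedWord.charAt(i);
            int pos = ACCENTED.indexOf(c);
            // Replace known accented letters; copy everything else unchanged, so case is preserved.
            sb.append(pos >= 0 ? PLAIN.charAt(pos) : c);
        }
        return sb.toString();
    }
}
```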
-
removeDoubleConsonants
java.lang.String removeDoubleConsonants(java.lang.String name)
Replaces any double consonant pair with the single letter equivalent.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - String to have double consonants removed
- Returns:
- Single consonant word
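A minimal sketch of the double-consonant replacement follows. It assumes the input is already upper-case and treats every letter other than A, E, I, O and U as a consonant; the actual contents of DOUBLE_CONSONANT are not listed on this page, so that set is an assumption, as is the class name DoubleConsonantSketch.

```java
final class DoubleConsonantSketch {
    static String removeDoubleConsonants(String name) {
        String result = name;
        for (char c = 'A'; c <= 'Z'; c++) {
            // Skip vowels: only consonant pairs are collapsed (assumed consonant set).
            if ("AEIOU".indexOf(c) >= 0) {
                continue;
            }
            String pair = "" + c + c;
            // Replace each pair such as "TT" with the single letter "T".
            result = result.replace(pair, String.valueOf(c));
        }
        return result;
    }
}
```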
-
removeVowels
java.lang.String removeVowels(java.lang.String name)
Deletes all vowels unless the vowel begins the word.
API Usage
Consider this method private; it is package-private for unit testing only.
- Parameters:
name - The name to have vowels removed
- Returns:
- De-voweled word
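The vowel rule above can be illustrated with a short standalone sketch. It assumes the vowels are A, E, I, O and U (Y is treated as a consonant, which is an assumption) and uses a hypothetical class name, RemoveVowelsSketch; it is not the library's implementation.

```java
final class RemoveVowelsSketch {
    static boolean isVowel(char letter) {
        // Assumed vowel set: A, E, I, O, U, compared case-insensitively.
        return "AEIOU".indexOf(Character.toUpperCase(letter)) >= 0;
    }

    static String removeVowels(String name) {
        if (name.isEmpty()) {
            return name;
        }
        StringBuilder sb = new StringBuilder(name.length());
        // The first letter is always kept, even when it is a vowel.
        sb.append(name.charAt(0));
        for (int i = 1; i < name.length(); i++) {
            char c = name.charAt(i);
            if (!isVowel(c)) {
                sb.append(c);
            }
        }
        return sb.toString();
    }
}
```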
-
-