Using Kotlin, Jetpack Compose, and Android Studio, I'm trying to develop a feature for an application that makes heavy use of regular expressions, commonly known as regex.
The feature in question is a string parser for a contacts application. It takes a string that represents a full name and divides it into one to five smaller strings, depending on how many parts the name has. The smaller strings represent the name prefix, first name, middle name, last name, and name suffix.
The parsing is done using a series of regex patterns combined with a large conditional statement. There are thirteen possible patterns that can be matched.
The user interface consists of six Text Fields arranged in a column, each with its own variable that holds its value.
Here is the code for the user interface.
// Text Field Values
var name by remember { mutableStateOf("") }
var namePrefix by remember { mutableStateOf("") }
var firstName by remember { mutableStateOf("") }
var middleName by remember { mutableStateOf("") }
var lastName by remember { mutableStateOf("") }
var nameSuffix by remember { mutableStateOf("") }
// Layout
Column(
modifier = Modifier.fillMaxSize(),
horizontalAlignment = Alignment.CenterHorizontally,
verticalArrangement = Arrangement.Center
) {
// Text Fields
TextField(value = name, onValueChange = { name = it }, label = { Text("Name") })
TextField(value = namePrefix, onValueChange = { namePrefix = it }, label = { Text("Name Prefix") })
TextField(value = firstName, onValueChange = { firstName = it }, label = { Text("First Name") })
TextField(value = middleName, onValueChange = { middleName = it }, label = { Text("Middle Name") })
TextField(value = lastName, onValueChange = { lastName = it }, label = { Text("Last Name") })
TextField(value = nameSuffix, onValueChange = { nameSuffix = it }, label = { Text("Name Suffix") })
}
Here is the code for the patterns and matches. They are divided into four groups: 1. With Prefix Without Suffix, 2. Without Prefix And Suffix, 3. With Prefix And Suffix, 4. Without Prefix With Suffix.
val namePrefixList = listOf("mr", "ms", "mrs", "dr")
val nameSuffixList = listOf("jr", "sr", "ii", "iii")
// PATTERNS -------------------------------------------------------------------------
// Backslashes are doubled because "\s" is not a valid escape in a regular Kotlin string.
val prefixPattern = "\\s*(?i)${namePrefixList.joinToString("\\s*|")}\\s*".toRegex()
val suffixPattern = "\\s*(?i)${nameSuffixList.joinToString("\\s*|")}\\s*".toRegex()
val namePattern = "\\s*\\w+|\\d+\\s*".toRegex()
// With Prefix Without Suffix
val lastNameWithPrefixWithoutSuffixPattern = "($prefixPattern) ($namePattern)\\s*".toRegex()
val firstAndLastNameWithPrefixWithoutSuffixPattern = "($prefixPattern) ($namePattern) ($namePattern)\\s*".toRegex()
val firstMiddleAndLastNameWithPrefixWithoutSuffixPattern = "($prefixPattern) ($namePattern) ($namePattern) ($namePattern)\\s*".toRegex()
// Without Prefix And Suffix
val firstNameWithoutPrefixAndSuffixPattern = "($namePattern)\\s*".toRegex()
val firstAndLastNameWithoutPrefixAndSuffixPattern = "($namePattern) ($namePattern)\\s*".toRegex()
val firstMiddleAndLastNameWithoutPrefixAndSuffixPattern = "($namePattern) ($namePattern) ($namePattern)\\s*".toRegex()
// With Prefix and Suffix
val lastNameWithPrefixAndSuffixPattern = "($prefixPattern) ($namePattern) ($suffixPattern)\\s*".toRegex()
val firstAndLastNameWithPrefixAndSuffixPattern = "($prefixPattern) ($namePattern) ($namePattern) ($suffixPattern)\\s*".toRegex()
val firstMiddleAndLastNameWithPrefixAndSuffixPattern = "($prefixPattern) ($namePattern) ($namePattern) ($namePattern) ($suffixPattern)\\s*".toRegex()
// Without Prefix With Suffix
val firstNameWithoutPrefixWithSuffixPattern = "($namePattern) ($suffixPattern)\\s*".toRegex()
val firstAndLastNameWithoutPrefixWithSuffixPattern = "($namePattern) ($namePattern) ($suffixPattern)\\s*".toRegex()
val firstMiddleAndLastNameWithoutPrefixWithSuffixPattern = "($namePattern) ($namePattern) ($namePattern) ($suffixPattern)\\s*".toRegex()
// -------------------------------------------------------------------------------------
// MATCHES -----------------------------------------------------------------------------
// With Prefix Without Suffix
val lastNameWithPrefixWithoutSuffixMatch = lastNameWithPrefixWithoutSuffixPattern.matchEntire(name)
val firstAndLastNameWithPrefixWithoutSuffixMatch = firstAndLastNameWithPrefixWithoutSuffixPattern.matchEntire(name)
val firstMiddleAndLastNameWithPrefixWithoutMatch = firstMiddleAndLastNameWithPrefixWithoutSuffixPattern.matchEntire(name)
// Without Prefix And Suffix
val firstNameWithoutPrefixAndSuffixMatch = firstNameWithoutPrefixAndSuffixPattern.matchEntire(name)
val firstAndLastNamePatternWithoutPrefixAndSuffixMatch = firstAndLastNameWithoutPrefixAndSuffixPattern.matchEntire(name)
val firstMiddleAndLastNameWithoutPrefixAndSuffixMatch = firstMiddleAndLastNameWithoutPrefixAndSuffixPattern.matchEntire(name)
// With Prefix And Suffix
val lastNameWithPrefixAndSuffixMatch = lastNameWithPrefixAndSuffixPattern.matchEntire(name)
val firstAndLastNameWithPrefixAndSuffixMatch = firstAndLastNameWithPrefixAndSuffixPattern.matchEntire(name)
val firstMiddleAndLastNameWithPrefixAndSuffixMatch = firstMiddleAndLastNameWithPrefixAndSuffixPattern.matchEntire(name)
// Without Prefix With Suffix
val firstNameWithoutPrefixWithSuffixMatch = firstNameWithoutPrefixWithSuffixPattern.matchEntire(name)
val firstAndLastNameWithoutPrefixWithSuffixMatch = firstAndLastNameWithoutPrefixWithSuffixPattern.matchEntire(name)
val firstMiddleAndLastNameWithoutPrefixWithSuffixMatch = firstMiddleAndLastNameWithoutPrefixWithSuffixPattern.matchEntire(name)
Here is the code for the conditional that determines the current pattern. It is divided into two groups: 1. With Suffix, 2. Without Suffix.
if (name.matches(prefixPattern)) {
namePrefix = name.trim()
firstName = ""
middleName = ""
lastName = ""
nameSuffix = ""
currentPattern = "Prefix"
// WITH SUFFIX -------------------------------------------------------------------------
} else if (name.matches(lastNameWithPrefixAndSuffixPattern) && lastNameWithPrefixAndSuffixMatch != null) {
namePrefix = lastNameWithPrefixAndSuffixMatch.groups[1]!!.value.trim()
firstName = ""
middleName = ""
lastName = lastNameWithPrefixAndSuffixMatch.groups[2]!!.value.trim()
nameSuffix = lastNameWithPrefixAndSuffixMatch.groups[3]!!.value.trim()
} else if (name.matches(firstAndLastNameWithPrefixAndSuffixPattern) && firstAndLastNameWithPrefixAndSuffixMatch != null) {
namePrefix = firstAndLastNameWithPrefixAndSuffixMatch.groups[1]!!.value.trim()
firstName = firstAndLastNameWithPrefixAndSuffixMatch.groups[2]!!.value.trim()
middleName = ""
lastName = firstAndLastNameWithPrefixAndSuffixMatch.groups[3]!!.value.trim()
nameSuffix = firstAndLastNameWithPrefixAndSuffixMatch.groups[4]!!.value.trim()
} else if (name.matches(firstMiddleAndLastNameWithPrefixAndSuffixPattern) && firstMiddleAndLastNameWithPrefixAndSuffixMatch != null) {
namePrefix = firstMiddleAndLastNameWithPrefixAndSuffixMatch.groups[1]!!.value.trim()
firstName = firstMiddleAndLastNameWithPrefixAndSuffixMatch.groups[2]!!.value.trim()
middleName = firstMiddleAndLastNameWithPrefixAndSuffixMatch.groups[3]!!.value.trim()
lastName = firstMiddleAndLastNameWithPrefixAndSuffixMatch.groups[4]!!.value.trim()
nameSuffix =
firstMiddleAndLastNameWithPrefixAndSuffixMatch.groups[5]!!.value.trim()
} else if (name.matches(firstNameWithoutPrefixWithSuffixPattern) && firstNameWithoutPrefixWithSuffixMatch != null) {
namePrefix = ""
firstName = firstNameWithoutPrefixWithSuffixMatch.groups[1]!!.value.trim()
middleName = ""
lastName = ""
nameSuffix = firstNameWithoutPrefixWithSuffixMatch.groups[2]!!.value.trim()
} else if (name.matches(firstAndLastNameWithoutPrefixWithSuffixPattern) && firstAndLastNameWithoutPrefixWithSuffixMatch != null) {
namePrefix = ""
firstName = firstAndLastNameWithoutPrefixWithSuffixMatch.groups[1]!!.value.trim()
middleName = ""
lastName = firstAndLastNameWithoutPrefixWithSuffixMatch.groups[2]!!.value.trim()
nameSuffix = firstAndLastNameWithoutPrefixWithSuffixMatch.groups[3]!!.value.trim()
} else if (name.matches(firstMiddleAndLastNameWithoutPrefixWithSuffixPattern) && firstMiddleAndLastNameWithoutPrefixWithSuffixMatch != null) {
namePrefix = ""
firstName = firstMiddleAndLastNameWithoutPrefixWithSuffixMatch.groups[1]!!.value.trim()
middleName = firstMiddleAndLastNameWithoutPrefixWithSuffixMatch.groups[2]!!.value.trim()
lastName = firstMiddleAndLastNameWithoutPrefixWithSuffixMatch.groups[3]!!.value.trim()
nameSuffix = firstMiddleAndLastNameWithoutPrefixWithSuffixMatch.groups[4]!!.value.trim()
// -------------------------------------------------------------------------------------
// WITHOUT SUFFIX -------------------------------------------------------------------------
} else if (name.matches(lastNameWithPrefixWithoutSuffixPattern) && lastNameWithPrefixWithoutSuffixMatch != null) {
namePrefix = lastNameWithPrefixWithoutSuffixMatch.groups[1]!!.value.trim()
firstName = ""
middleName = ""
lastName = lastNameWithPrefixWithoutSuffixMatch.groups[2]!!.value.trim()
nameSuffix = ""
} else if (name.matches(firstAndLastNameWithPrefixWithoutSuffixPattern) && firstAndLastNameWithPrefixWithoutSuffixMatch != null) {
namePrefix = firstAndLastNameWithPrefixWithoutSuffixMatch.groups[1]!!.value.trim()
firstName = firstAndLastNameWithPrefixWithoutSuffixMatch.groups[2]!!.value.trim()
middleName = ""
lastName = firstAndLastNameWithPrefixWithoutSuffixMatch.groups[3]!!.value.trim()
nameSuffix = ""
} else if (name.matches(firstMiddleAndLastNameWithPrefixWithoutSuffixPattern) && firstMiddleAndLastNameWithPrefixWithoutMatch != null) {
namePrefix = firstMiddleAndLastNameWithPrefixWithoutMatch.groups[1]!!.value.trim()
firstName = firstMiddleAndLastNameWithPrefixWithoutMatch.groups[2]!!.value.trim()
middleName = firstMiddleAndLastNameWithPrefixWithoutMatch.groups[3]!!.value.trim()
lastName = firstMiddleAndLastNameWithPrefixWithoutMatch.groups[4]!!.value.trim()
nameSuffix = ""
} else if (name.matches(firstNameWithoutPrefixAndSuffixPattern) && firstNameWithoutPrefixAndSuffixMatch != null) {
namePrefix = ""
firstName = firstNameWithoutPrefixAndSuffixMatch.groups[1]!!.value.trim()
middleName = ""
lastName = ""
nameSuffix = ""
} else if (name.matches(firstAndLastNameWithoutPrefixAndSuffixPattern) && firstAndLastNamePatternWithoutPrefixAndSuffixMatch != null) {
namePrefix = ""
firstName = firstAndLastNamePatternWithoutPrefixAndSuffixMatch.groups[1]!!.value.trim()
middleName = ""
lastName = firstAndLastNamePatternWithoutPrefixAndSuffixMatch.groups[2]!!.value.trim()
nameSuffix = ""
} else if (name.matches(firstMiddleAndLastNameWithoutPrefixAndSuffixPattern) && firstMiddleAndLastNameWithoutPrefixAndSuffixMatch != null) {
namePrefix = ""
firstName = firstMiddleAndLastNameWithoutPrefixAndSuffixMatch.groups[1]!!.value.trim()
middleName = firstMiddleAndLastNameWithoutPrefixAndSuffixMatch.groups[2]!!.value.trim()
lastName = firstMiddleAndLastNameWithoutPrefixAndSuffixMatch.groups[3]!!.value.trim()
nameSuffix = ""
// ----------------------------------------------------------------------------------
} else if(name == "") {
namePrefix = ""
firstName = ""
middleName = ""
lastName = ""
nameSuffix = ""
}
So far this code does its job as intended. Upon finding a match, the name entered in the name Text Field is divided up and put into the other Text Fields. However, as you can see, the code is an unreadable, bloated mess. How can I take all of this and make it smaller, readable, reusable, and efficient?
I apologize, I just learned about regular expressions two days ago.
2 Answers
Regex is crafted on a case-by-case basis and you have to decide whether your regex needs to be readable by the next person who has to trudge through your code.
Comments explaining the goal of a section of code are much more valuable than a comment which merely regurgitates what the code is doing syntactically; learn the difference.
Realistically, regex can only reveal that "a name-like pattern was found". Once a name-like pattern has been identified, it is up to you to perform additional processing to pigeon-hole the definitions of prefix, first name, middle name, last name, and suffix.
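To make the "additional processing" idea concrete, here is a minimal sketch, assuming the same prefix and suffix lists as in the question. Regex is used only to split the input on whitespace; the pigeon-holing is done by position. The names `ParsedName`, `knownPrefixes`, `knownSuffixes`, `asLookupKey`, and `parseName` are hypothetical and made up for illustration.

```kotlin
// Hypothetical container for the parsed parts; not from the original post.
data class ParsedName(
    val prefix: String = "",
    val first: String = "",
    val middle: String = "",
    val last: String = "",
    val suffix: String = "",
)

// Assumed lookup sets, mirroring the lists in the question.
val knownPrefixes = setOf("mr", "ms", "mrs", "dr")
val knownSuffixes = setOf("jr", "sr", "ii", "iii")

// Lower-case a token and drop one trailing period so "Dr." is looked up as "dr".
fun String.asLookupKey(): String = lowercase().removeSuffix(".")

fun parseName(fullName: String): ParsedName {
    // The only regex involved: split the input on runs of whitespace.
    val tokens = fullName.trim().split(Regex("\\s+")).filter { it.isNotEmpty() }.toMutableList()

    // Peel a known prefix off the front, if there is one.
    val prefix =
        if (tokens.isNotEmpty() && tokens.first().asLookupKey() in knownPrefixes) tokens.removeAt(0) else ""
    // Peel a known suffix off the back, but only if a name token would remain.
    val suffix =
        if (tokens.size > 1 && tokens.last().asLookupKey() in knownSuffixes) tokens.removeAt(tokens.lastIndex) else ""

    // Whatever is left is assigned by position.
    return when (tokens.size) {
        0 -> ParsedName(prefix = prefix, suffix = suffix)
        // A lone name after a prefix is treated as a last name ("Mr Smith"), as in the question.
        1 -> if (prefix.isNotEmpty()) {
            ParsedName(prefix = prefix, last = tokens[0], suffix = suffix)
        } else {
            ParsedName(first = tokens[0], suffix = suffix)
        }
        2 -> ParsedName(prefix, tokens[0], "", tokens[1], suffix)
        // Three or more: everything between the first and last token becomes the middle name.
        else -> ParsedName(
            prefix,
            tokens.first(),
            tokens.subList(1, tokens.lastIndex).joinToString(" "),
            tokens.last(),
            suffix,
        )
    }
}
```

In the Compose code from the question, the result could be copied into the five state variables inside the name field's `onValueChange`. Unlike the original, this sketch folds any extra words into the middle name instead of failing to match, which is a design choice and may not be what you want.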
This regex took me longer than I care to admit:
I think with Java you can access the named groups via .groups['name'] for further processing? But I am not sure. You can clearly see the false positives when the regex is applied against an average snippet of text.
https://regex101.com/r/Suh9H0/1
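On the named-groups question: with `java.util.regex`, which is usable from Kotlin on the JVM, a named group is read with `Matcher.group("name")`. Below is a rough sketch under some assumptions. The pattern is a simplified stand-in, not the regex from this answer, and the group names `prefix`, `names`, and `suffix` as well as the function `extractNamedGroups` are invented for the example. Recent Kotlin standard-library versions also appear to allow `match.groups["name"]` on a `MatchResult`, and named groups on Android may need a fairly recent API level, so check both against the versions you target.

```kotlin
import java.util.regex.Pattern

// Illustrative pattern only: an optional prefix, a lazy run of name words,
// and an optional suffix. The group names are made up for this sketch.
val namedPattern: Pattern = Pattern.compile(
    "^\\s*(?:(?<prefix>mr|ms|mrs|dr)\\.?\\s+)?(?<names>.+?)(?:\\s+(?<suffix>jr|sr|ii|iii)\\.?)?\\s*$",
    Pattern.CASE_INSENSITIVE
)

// Returns the named groups as a map, or null when the input does not match at all.
// A group that did not take part in the match is reported as an empty string.
fun extractNamedGroups(input: String): Map<String, String>? {
    val matcher = namedPattern.matcher(input)
    if (!matcher.matches()) return null
    return listOf("prefix", "names", "suffix").associateWith { matcher.group(it) ?: "" }
}
```

The `names` group would still need to be split into first, middle, and last name afterwards, which is the extra processing step mentioned above.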
Don’t forget to read "Falsehoods Programmers Believe About Names".
To improve the readability, reusability, and efficiency of your code, you can simplify the structure by extracting common logic into reusable functions, mapping patterns to their respective handlers, and processing matches dynamically. Here’s how you can refactor your code:
Refactored Code