I am a newbie building persistence APIs with Spring Boot, and I ran into this doubt. I don’t want a user to be able to register two different accounts with the same email, but I also don’t want to use the email as the primary key in the repo. I think the logic below is quite self-explanatory.
private static boolean isRegisteredEmail(String email, AccountRepository accountRepository) {
    for (Account account : accountRepository.findAll()) {
        if (account.getEmail().equals(email)) {
            return true;
        }
    }
    return false;
}
This works, but I started to wonder: in a real-world situation, like Facebook’s DB with almost 3 billion records, how do you deal with this scenario? Am I worrying too much, or is there really no way this can keep being done efficiently?
2 Answers
You don’t need to use the email as the primary key. You can still search for existing users by email address, either with an HQL/SQL @Query in your Spring Data @Repository-annotated interface, or by composing a method name such as findUserByEmail without writing a query (if the entity is named User and has an email field, Spring Data will fetch the data from the DB for you).
It’s much more efficient to select one record than to transfer hundreds or thousands of users every time someone needs to create a new account.
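As a rough illustration of the difference, here is a minimal sketch of a keyed lookup. In a real Spring Data repository it is normally enough to declare derived query methods such as findByEmail or existsByEmail on the interface. The InMemoryAccountRepository class below is a hypothetical stand-in, not Spring code, so that the example runs without any framework. Its HashMap plays the role the database index would play, and the Account fields are assumptions.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.Optional;

// Hypothetical entity, reduced to the two fields the example needs.
class Account {
    private final long id;       // surrogate primary key
    private final String email;  // must be unique, but is not the key

    Account(long id, String email) {
        this.id = id;
        this.email = email;
    }

    long getId() { return id; }
    String getEmail() { return email; }
}

// Hypothetical in-memory stand-in for a repository. The method names mirror
// the derived-query naming style (findByEmail, existsByEmail); the map is
// only an illustration of an indexed lookup.
class InMemoryAccountRepository {
    private final Map<String, Account> byEmail = new HashMap<>();
    private long nextId = 1;

    // Looks up a single account by email instead of scanning all accounts.
    Optional<Account> findByEmail(String email) {
        return Optional.ofNullable(byEmail.get(email));
    }

    // Answers the "is this email taken?" question without loading any entity.
    boolean existsByEmail(String email) {
        return byEmail.containsKey(email);
    }

    // Rejects a second registration with the same email.
    Account register(String email) {
        if (existsByEmail(email)) {
            throw new IllegalStateException("Email already registered: " + email);
        }
        Account account = new Account(nextId++, email);
        byEmail.put(email, account);
        return account;
    }
}
```

With this shape, the check in the question becomes a single call such as repo.existsByEmail(email), and the cost no longer grows with the number of accounts that have to be transferred and compared.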
The method findAll() is indeed an inefficient way to check whether a record already exists in the database. To understand why, we have to think in terms of performance. Performance of any application has many facets, but let’s narrow it down to time and space for now.
Issue in the implementation
Improvements
Repository layer
But the above depends on database performance too.
Cache: rather than fetching data from the DB every time, keep it in a cache. The cache should have a refresh time and should be updated when a new record is inserted into the DB. A cache lookup is O(1).
Database: the database can check and return a response very fast whenever a constraint (for example, a unique constraint on the email column) is violated.
O(1) time complexity applies to map.get(obj) on a hash-based map (on average); list.indexOf(obj) is a linear scan, so it is O(n).
NoSQL: NoSQL DBs have very fast retrieval times. The data is stored based on an index, and many use a tree-type data structure for storage.
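The cache idea above could be sketched roughly as follows. This is only an illustration under stated assumptions: RegisteredEmailCache and its method names are hypothetical, it holds the emails in a concurrent hash-based set from the standard library, and it leaves out the refresh time mentioned above. In practice the cache would only be a fast path, with the database constraint remaining the final check.

```java
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical in-memory cache of registered emails.
// Lookups and inserts on the hash-based set are O(1) on average.
class RegisteredEmailCache {
    private final Set<String> emails = ConcurrentHashMap.newKeySet();

    // Returns true if the email is already known to the cache.
    boolean contains(String email) {
        return emails.contains(email);
    }

    // Call after a new record has been inserted into the DB.
    // Returns false if the email was already present.
    boolean add(String email) {
        return emails.add(email);
    }

    // Call when an account is deleted or its email changes.
    void evict(String email) {
        emails.remove(email);
    }
}
```

A registration flow would first ask contains(email), and only call add(email) once the insert into the database has succeeded.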