I need to compare the 2 arrays declared here to return records that exist only in the filtered_apps array. I am using the contents of previous_apps array to see if an ID in the record exists in filtered_apps array. I will be outputting the results to a CSV and displaying records that exist in both arrays to the console.
My question is this: how do I get the records that exist only in filtered_apps? The easiest option for me would be to put those unique records into a new array that I can then use for the CSV.
start_date = Date.parse("2022-02-05")
end_date = Date.parse("2022-05-17")
valid_year = start_date.year
dupe_apps = []
uniq_apps = []
# Finding applications that meet my criteria:
filtered_apps = FinancialAssistance::Application.where(
:is_requesting_info_in_mail => true,
:aasm_state => "determined",
:submitted_at => {
"$exists" => true,
"$gte" => start_date,
"$lte" => end_date })
# Finding applications that I want to compare against filtered_apps
previous_apps = FinancialAssistance::Application.where(
is_requesting_info_in_mail: true,
:submitted_at => {
"$exists" => true,
"$gte" => valid_year })
# I'm using this to pull the ID that I'm using for comparison just to make the comparison lighter by only storing the family_id
previous_apps.each do |y|
previous_apps_array << y.family_id
end
# This is where I'm doing my comparison and it is not working.
filtered_apps.each do |app|
if app.family_id.in?(previous_apps_array) == false
then @non_dupe_apps << app
else "No duplicate found for application #{app.hbx_id}"
end
end
end
So what am I doing wrong in the last code section?
2 Answers
EDIT: My last answer did, in fact, not work.
Here is the code all nice and working.
It turns out the issue was that, when comparing family_id against the set of records, I forgot that the looped record was itself part of the set, so it would be returned, too. I added a check that excludes the record whose ID matches the looped record, and Bob's your uncle.
I added the pass and reject arrays so I could check my work instead of downloading a CSV every time. I'm leaving them in mostly because I'm scared to change anything else.
Basically, I pulled the applications into the app variable array, then filtered them by the family_id field in each record.
I had to do this because the issue at the bottom of everything was that there were records present in app that were themselves duplicates, only submitted a few days apart. Since I went on the assumption that the initial app array would be all unique, I thought the duplicates that were included were due to the rest of the code not filtering correctly.
I then use the uniq_apps array to filter through and look for matches in uniq_apps.each do, and when it finds a duplicate, it adds it to the previous_applications array inside the loop. Since this array resets each go-round, if it ever has more than 0 records in it, the app gets called out as being submitted already. Otherwise, it goes to my csv report.
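The steps described above could be sketched roughly like this. It is not the original code: `App`, the sample IDs, and the `comparison_set` name are hypothetical stand-ins, and plain arrays replace the database queries.

```ruby
# Hypothetical stand-in for an application record.
App = Struct.new(:id, :family_id)

# Applications in the reporting window; ids 1 and 2 belong to the same family.
apps = [App.new(1, 10), App.new(2, 10), App.new(3, 20), App.new(4, 30)]
# Assumed earlier submissions from the same year.
earlier_apps = [App.new(5, 20)]

# Step 1: keep only one application per family_id.
uniq_apps = apps.uniq(&:family_id)

# Assumption: the set being searched also contains the looped records themselves.
comparison_set = uniq_apps + earlier_apps

pass = []
reject = []

uniq_apps.each do |app|
  # Step 2: rebuilt on every pass; holds other records for the same family.
  # The id check keeps the looped record from matching itself.
  previous_applications = comparison_set.select do |other|
    other.family_id == app.family_id && other.id != app.id
  end

  if previous_applications.empty?
    pass << app      # would go to the CSV report
  else
    reject << app    # already submitted earlier
  end
end

puts "pass:   #{pass.map(&:id).inspect}"
puts "reject: #{reject.map(&:id).inspect}"
```

Without the `other.id != app.id` check, every application would match itself and end up in `reject`, which is the symptom described above.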
Thanks for the help on this; it really got my brain thinking in another direction, which I needed. It also helped improve the code, even though the issue was at the very beginning.
Let’s check your original method first (I fixed the indentation to make it clearer). There’s quite a few issues with it:
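A runnable sketch of that loop, with the problems I can see marked in comments and fixed, might look like this. The `App` struct and the sample IDs are hypothetical stand-ins for the `FinancialAssistance::Application` records, and `Array#include?` replaces the Rails-only `in?` so the snippet runs in plain Ruby:

```ruby
# Hypothetical stand-in for the application records in the question.
App = Struct.new(:hbx_id, :family_id)

filtered_apps = [App.new("a1", 10), App.new("a2", 20), App.new("a3", 30)]
previous_apps = [App.new("p1", 20), App.new("p2", 40)]

# Issue 1: the collection of family ids has to exist before it can be filled.
previous_apps_array = previous_apps.map(&:family_id)

# Issue 2: results were pushed onto @non_dupe_apps, which is never declared
# (the question declares uniq_apps and dupe_apps instead).
uniq_apps = []
dupe_apps = []

filtered_apps.each do |app|
  # Issue 3: `.in?(...) == false` followed by a dangling `then` is hard to
  # read; a plain positive check is clearer.
  if previous_apps_array.include?(app.family_id)
    # Issue 4: the else branch was a bare string, so nothing was printed,
    # and its text said "No duplicate" in the branch that handles duplicates.
    dupe_apps << app
    puts "Duplicate found for application #{app.hbx_id}"
  else
    uniq_apps << app
  end
end
# Issue 5: the original had one `end` too many after the loop.
```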
Also, you haven’t declared `previous_apps_array` anywhere in your example; you just start adding to it out of nowhere.

Getting the difference between 2 arrays is dead easy in Ruby: just use `-`!

You can also do this with ActiveRecord results, since they behave much like arrays of ActiveRecord objects. However, this doesn’t help if you specifically need to compare results using the `family_id` column.

TIP: Getting the values of only a specific column/columns from your database is probably best done with the `pluck` or `select` method if you don’t need to store any other data about those objects. With `pluck`, you only get an array of values in the result, not the full objects. `select` works a bit differently and returns ActiveRecord objects, but filters out everything but the selected columns. `select` is usually better in nested queries, since it doesn’t trigger a separate query when used as a part of another query, while `pluck` always triggers one.

I highly recommend getting really familiar with at least `filter`/`select` and `map` out of the basic array methods. They make things like this way easier. The Ruby docs are a great place to learn about them and others. A very simple example of doing a similar thing to what you explained in your question with `filter`/`select` on 2 arrays would be something like this:

NOTE: The OP is working with `ruby 2.5.9`, where `filter` is not yet available as an array method (it was introduced in Ruby 2.6). However, `filter` is just an alias for `select`, which can be found on earlier versions of Ruby, so they can be used interchangeably. Personally, I prefer using `filter` because, as seen above, `select` is already used in other methods, and `filter` is also the more common term in other programming languages I usually work with. Of course when both are available, it doesn’t really matter which one you use, as long as you keep it consistent.
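As a minimal sketch of the two techniques mentioned above (array difference with `-`, and `select` combined with `map`), the following uses a hypothetical `App` struct and made-up IDs in place of the real query results. It sticks to `select` so that it also runs on Ruby 2.5:

```ruby
# Array difference: elements of the first array that are absent from the second.
filtered_ids = [1, 2, 3, 4]
previous_ids = [2, 4, 6]
only_filtered_ids = filtered_ids - previous_ids
p only_filtered_ids # => [1, 3]

# Hypothetical stand-in for the application records in the question.
App = Struct.new(:hbx_id, :family_id)

filtered_apps = [App.new("a1", 10), App.new("a2", 20), App.new("a3", 30)]
previous_apps = [App.new("p1", 20), App.new("p2", 40)]

# map: collect just the column used for the comparison.
previous_family_ids = previous_apps.map(&:family_id)

# select: keep the records whose family_id is not in the previous set.
only_in_filtered = filtered_apps.select do |app|
  !previous_family_ids.include?(app.family_id)
end

# The records that matched can be collected the same way.
in_both = filtered_apps.select do |app|
  previous_family_ids.include?(app.family_id)
end

p only_in_filtered.map(&:hbx_id) # => ["a1", "a3"]
p in_both.map(&:hbx_id)          # => ["a2"]
```

`reject` with the un-negated condition gives the same result as the first `select`; which one reads better is a matter of taste.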