I’m trying to run a script to parse through my sent box and scrape unique email addresses in the to and cc lines of my outgoing email. I think I might be running into a memory limit. But I’m not getting any errors back. When I run this script against a mailbox with a few hundred emails in the search parameters it works well… But if I run against a mailbox with 80+ messages sent per day it never finishes the run and states complete with data truncated. Last run only pulled 20 days vs 30 days worth of data (about 200 lines in the google sheet)
The goal is to input a range of dates (Ideally 90 days/last three months) and then pull back all the uniques that we’ve sent to. Returning them in a spreadsheet with First, Last, Email, Date columns.
I’d appreciate thoughts on how to optimize this or allow it to run for a larger mailbox.
function exportSentEmails() {
var sheet = SpreadsheetApp.getActiveSpreadsheet().getActiveSheet();
sheet.clear(); // Clear any existing content in the sheet
// Set up the headers in the first row
sheet.appendRow(["First Name", "Last Name", "Email Address", "Date"]);
// Define the date range (replace these with your desired dates)
var startDate = new Date('2024-01-01'); // Change to your desired start date
var endDate = new Date('2024-04-01'); // Change to your desired end date
// Format dates to use in the Gmail search query
var startFormatted = formatDateForQuery(startDate);
var endFormatted = formatDateForQuery(endDate);
// Get the sent emails from Gmail within the specified date range
var threads = GmailApp.search('in:sent after:' + startFormatted + ' before:' + endFormatted);
// Loop through each thread and extract email addresses and names
var emailSet = new Set(); // To track unique email addresses
threads.forEach(thread => {
var messages = thread.getMessages();
messages.forEach(message => {
var recipients = message.getTo().split(',');
var ccRecipients = message.getCc().split(','); // Extract CC recipients
recipients = recipients.concat(ccRecipients); // Combine To and CC recipients
recipients.forEach(recipient => {
try {
var email = recipient.match(/b[A-Z0-9._%+-]+@[A-Z0-9.-]+.[A-Z]{2,}b/i);
if (email) {
email = email[0].trim().toLowerCase();
if (!emailSet.has(email)) {
emailSet.add(email);
var nameParts = getNameParts(recipient, email);
var date = formatDate(message.getDate());
sheet.appendRow([nameParts.firstName, nameParts.lastName, email, date]);
}
}
} catch (e) {
Logger.log('Error processing recipient: ' + recipient + ' - ' + e.message);
}
});
});
});
Logger.log('Export completed.');
}
// Helper function to format date for Gmail query
function formatDateForQuery(date) {
return Utilities.formatDate(date, Session.getScriptTimeZone(), 'yyyy/MM/dd');
}
// Helper function to format date for spreadsheet
function formatDate(date) {
return Utilities.formatDate(date, Session.getScriptTimeZone(), 'MM/dd/yyyy');
}
// Helper function to extract name parts from a recipient string
function getNameParts(recipient, email) {
var displayName = recipient.replace(email, '').trim().replace(/["<>]/g, '');
var nameParts = { firstName: "", lastName: "" };
if (displayName) {
var nameArray = displayName.split(' ');
nameParts.firstName = nameArray[0];
nameParts.lastName = nameArray.slice(1).join(' ');
} else {
var nameFromEmail = email.split('@')[0].replace(/[._-]/g, ' ');
var nameArray = nameFromEmail.split(' ');
nameParts.firstName = nameArray[0];
nameParts.lastName = nameArray.slice(1).join(' ');
}
// Capitalize the first letter of each name part
nameParts.firstName = toTitleCase(nameParts.firstName);
nameParts.lastName = toTitleCase(nameParts.lastName);
return nameParts;
}
// Helper function to convert a string to title case
function toTitleCase(str) {
return str.split(' ').map(word => word.charAt(0).toUpperCase() + word.slice(1).toLowerCase()).join(' ');
}
2
Answers
In order to optimize your code for large data set consider processing it batch-wise(from paging the content and setting a page size). Also instead of appending rows to the spreadsheet inside the nested loops, collect the data to an array and append it in one operation. This significantly reduces the number of iterations with the spreadsheet, which can be slow.
All Google Apps Script executions create an entry on the execution log for the corresponding script, and in case of an error, it will at least set the execution state as Failed. Only in sporadic cases is there a logged error when a script is executed from the Apps Script Editor. Considering this, it’s very likely that the problem is the user at the keyboard.
Add validation for the strings intended to be modified by the users and to ensure that they are using the intended account.
Tip: Try running the script using the Debug button from the Script Editor to find where exactly the execution is failing.
If the users are all using Workspace accounts from the same organization, you might create a private add-on with a user interface to get the date range input from the user. This way, you will have a single copy of the code and logs for each mailbox in a single Apps Script project.
Regarding optimizing the code, the most essential thing you should try is reducing the number of calls to Google Apps Script services, as they are slow and there is an execution time limit. In this case, instead of using
Spreadsheet.Sheet.appendRow(rowContent)
, you might create an array to hold the collected data, then useSpreadsheet.Sheet.Range.setValues(data)
to write the data in a single operation. This also makes it easier to measure the size of the collected data, which might be necessary in case the JavaScript engine cannot handle it, so you might focus on implementing better memory usage handling.Another option is to use the Google Apps Script Advanced Services.