
I have written a simple node.js application and I have a question.

Here is the general flow of the program.

  1. A local file exists with a list of URLs, each on a new line. Let’s assume there are 1,000 URLs for example.

  2. As each new line (URL) is read (in a read stream, one line at a time, using the ‘readline’ module), the callback takes the URL string and passes it to another async function that makes an HTTP request.

  3. Inside this HTTP request function, an https.request() is made. In the request’s callback, I take the response and transform the JSON object a bit. In fact, I simply stringify() the JSON object and turn its values into CSV format, which is just a string.

  4. I pass this string to a final writeData() function which is intended to write this data to a SINGLE .csv file. Yes, the same file will be used to take in all of these async https request calls and store some local data. Nothing surprising here. Inside this writeData() function, I use fs.createWriteStream(). I pass it a single file, ‘output.csv’, and run ws.write(‘\n’ + csvString).

  • Now here is my question/concern… Since many different invocations of this fs.createWriteStream().write() function will be called asynchronously (remember, 1,000 URLs), causing many OS-level writes to occur, how come no two write streams ever write to the file at the exact same time, jumbling and truncating each other’s data? Each write stream appears to append its data to the file in a nice, pretty, orderly fashion. I expected that while one write stream was writing to the file, another write stream would be created simultaneously and write to the same file at the same time, jumbling the file contents.

Keep in mind, I don’t care about the order of the data being written, but I do care that two writes do not occur on the file at the same time on the OS level, jumbling the file up.

Here is the writeData function, for reference:

const fs = require("fs");
const writeData = (csvString) => {
  const ws = fs.createWriteStream("output.csv", { flags: "a" });
  ws.write("\n" + csvString, () => {
     console.log("A buffer has been written to the file: ");
     console.log(csvString);
  });
}

module.exports = writeData;

Expectations vs. reality:

Here is what is actually being output…which appears GOOD and orderly, almost synchronous:


"A buffer has been written to the file: "
<csvString prints here>
"A buffer has been written to the file: "
<next csvString prints here>
"A buffer has been written to the file: "
<next csvString prints here>
"A buffer has been written to the file: "
<next csvString prints here>

Here is what I expected to be output…which would be BAD: jumbled, with multiple async write operations appending to the file at random, based on however much time the OS decides to give each async process/thread:

"A buffer has been written to the file: "
<csvString prints here>
 been written to the file: "
<next csvString prints here>
String prints here>
"A buffer has been wri
A buffer has been written to the file: "
<next csvStr
String prints here>
"A buffer has been written to the file: 
"r has been written to the file: "
<next csv

writeFileSync()??

Also, I just realized after thinking about it: perhaps writeFileSync() would clear all concerns out of my mind, because then we can be certain that only ONE operation is ever writing/appending to the file at a time. The "blocking" isn’t a huge issue for this application, since the write size per object to output.csv isn’t large.

2 Answers


  1. Node.js fs.createWriteStream() manages writes using an internal buffer and queue system. When you call write(), the data is added to the queue, and the stream ensures each write completes before starting the next. This sequential processing prevents concurrent writes from overlapping, so even if your async calls trigger multiple writes simultaneously, they’re executed in order.

    https://github.com/nodejs/node/blob/main/lib/internal/streams/writable.js
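    That queueing can be observed with any Writable, not just a file stream. In this minimal sketch (the custom in-memory sink below stands in for the file stream), each chunk reaches _write() whole, never split mid-string:

    ```javascript
    const { Writable } = require("stream");

    // In-memory sink standing in for the file stream: records every
    // chunk the writable machinery hands to _write().
    const chunks = [];
    const sink = new Writable({
      write(chunk, encoding, callback) {
        chunks.push(chunk.toString());
        callback(); // signal this write finished; the next may start
      },
    });

    // Two back-to-back writes: each chunk arrives intact and in order,
    // because the stream processes one queued write at a time.
    sink.write("row-A\n");
    sink.write("row-B\n");
    ```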

  2. "asynchronous" doesn’t mean "concurrent". Your 1000 callbacks will run one after the other. When a HTTP request completes, its callback will be queued to run at the next possible opportunity.

    This applies to all callbacks in JavaScript; it’s not specific to Node or to file operations.
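    This single-threaded queueing is easy to observe: queued callbacks never preempt code that is currently running (a minimal sketch):

    ```javascript
    // Queued callbacks never interrupt running code: the timer and the
    // promise callback below are only queued, so by the time the last
    // synchronous line runs, neither has fired yet.
    const order = [];

    setTimeout(() => order.push("timer callback"), 0);
    Promise.resolve().then(() => order.push("promise callback"));
    order.push("synchronous code");

    // At this point only "synchronous code" is in the array; the
    // queued callbacks run after the current code finishes.
    ```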
