Node.js Readable Streams Explained
Implementing read streams in Node.js can be confusing. Streams are very stateful, so how they function can depend on the mode they're in.
Nov 24th, 2021 7:55am by
LogDNA sponsored this post.
Darin Spivey
Darin is a senior software engineer at LogDNA, where he works on product architecture and performance, advanced testing frameworks, and LogDNA's open source projects. When he’s not geeking out on code, you’ll find him tinkering with his smart home technology and traveling with his family.
What’s a Stream Implementation?
A readable implementation is a piece of code that extends Readable, which is the Node.js base class for read streams. It can also be a simple call to the new Readable() constructor, if you want a custom stream without defining your own class. I’m sure plenty of you have used streams, from HTTP res handlers to fs.createReadStream file streams. An implementation, however, needs to respect the rules for streams, namely that it overrides certain methods, which the system then calls at specific points in the stream’s flow. Let’s talk about what some of this looks like.
const {Readable} = require('stream')

// This data can also come from other streams :]
let dataToStream = [
  'This is line 1\n'
, 'This is line 2\n'
, 'This is line 3\n'
]

class MyReadable extends Readable {
  constructor(opts) {
    super(opts)
  }

  _read() {
    // The consumer is ready for more data
    this.push(dataToStream.shift())
    if (!dataToStream.length) {
      this.push(null) // End the stream
    }
  }

  _destroy(err, callback) {
    // Not necessary, but illustrates things to do on end
    dataToStream = null
    callback(err) // Tell the stream that cleanup is done so 'close' can fire
  }
}

new MyReadable().pipe(process.stdout)
- Of course, call super(opts) or nothing will work.
- _read is required and is called automatically when new data is wanted.
- Calling push(<some data>) will cause the data to go into an internal buffer, and it will be consumed when something, like a piped writable stream, wants it.
- push(null) is required to properly end the read stream.
  - An 'end' event will be emitted after this.
  - A 'close' event will also be emitted unless emitClose: false was set in the constructor.
- _destroy is optional for cleanup things. Never override destroy; always use the underscored method for this and for _read.
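As a rough illustration of the event order described in the list above, the sketch below records what a small readable emits once push(null) is called. The lines array and the events log are made up for the example, and it assumes a Node.js version where autoDestroy is on by default (14 or later).

```javascript
const { Readable } = require('stream')

// Hypothetical data and event log, used only for this illustration
const lines = ['line 1\n', 'line 2\n']
const events = []

const readable = new Readable({
  read() {
    // Push the next line, or null once everything has been sent
    this.push(lines.length ? lines.shift() : null)
  }
})

readable.on('data', (chunk) => events.push(`data: ${chunk.toString().trim()}`))
readable.on('end', () => events.push('end'))

// Resolves with the recorded events once the stream has fully closed
const closed = new Promise((resolve) => {
  readable.on('close', () => {
    events.push('close')
    resolve(events)
  })
})

closed.then((log) => console.log(log.join(' -> ')))
// data: line 1 -> data: line 2 -> end -> close
```

Passing emitClose: false to the constructor should leave the 'close' listener uncalled, so the promise in this sketch would never resolve.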
Readable inline:
const {Readable} = require('stream')

// This data can also come from other streams :]
let dataToStream = [
  'This is line 1\n'
, 'This is line 2\n'
, 'This is line 3\n'
]

const myReadable = new Readable({
  read() {
    this.push(dataToStream.shift())
    if (!dataToStream.length) {
      this.push(null) // End the stream
    }
  }
, destroy(err, callback) {
    dataToStream = null
    callback(err) // Tell the stream that cleanup is done so 'close' can fire
  }
})

myReadable.pipe(process.stdout)
What’s Backpressure?
Remember the internal buffer that I mentioned above? This is an in-memory data structure that holds the streaming chunks of data: objects, strings or buffers. Its size is controlled by the highWaterMark property, and the default is 16KB of byte data (raised to 64KB in Node.js 22), or 16 objects if the stream is in object mode. When data is pushed through the readable stream, the push method may return false. If so, that means that the highWaterMark has been reached or exceeded, and that is called backpressure.
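A minimal sketch of that return value, assuming a deliberately tiny highWaterMark of 8 bytes so the limit is easy to hit (the tiny, first and second names are only for illustration):

```javascript
const { Readable } = require('stream')

// A readable with an 8-byte buffer limit and a no-op read(),
// so nothing is produced except what is pushed by hand below
const tiny = new Readable({
  highWaterMark: 8,
  read() {}
})

const first = tiny.push('abcd')  // 4 bytes buffered, still below the limit
const second = tiny.push('efgh') // 8 bytes buffered, the limit is reached

console.log(first, second, tiny.readableLength, tiny.readableHighWaterMark)
// true false 8 8
```

Note that the second chunk is still buffered. The false return value is a signal, not a rejection: the data is accepted, but push is reporting that the highWaterMark has been reached.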
If that happens, it’s up to the implementation to stop pushing data and wait for the _read call to come, signifying that the consumer is ready for more data, so push calls can resume. This is where a lot of folks fail to implement streams properly. Here are a couple of tips about pushing data through read streams:
- It’s not necessary to wait for _read to be called to push data as long as backpressure is respected. Data can continually be pushed until backpressure is reached. If the data size isn’t very large, it’s possible that backpressure will never be reached.
- The data from the buffer will not be consumed until the stream is in a reading mode. If data is being pushed, but there are no 'data' events and no pipe, then backpressure will certainly be reached if the data size exceeds the default buffer size.
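Both tips can be seen in a small experiment. This is only a sketch: it uses object mode with a highWaterMark of 4 so the numbers stay small, and the names source, pushed, bufferedBefore, received and finished are invented for the example.

```javascript
const { Readable } = require('stream')

// Object mode with a small highWaterMark so the limit is easy to hit
const source = new Readable({ objectMode: true, highWaterMark: 4, read() {} })

// Push from outside _read, stopping as soon as push() reports backpressure
let pushed = 0
while (source.push({ id: pushed++ })) {
  // keep going while push() returns true
}

// No consumer is attached yet, so everything pushed is still in the buffer
const bufferedBefore = source.readableLength
console.log(`pushed ${pushed} objects, ${bufferedBefore} still buffered`)

// Attaching a 'data' listener puts the stream in a reading mode and drains it
const received = []
source.on('data', (obj) => received.push(obj.id))
const finished = new Promise((resolve) => {
  source.on('end', () => resolve(received))
})
source.push(null) // No more data will follow

finished.then((ids) => console.log('received ids:', ids.join(', ')))
```

With these settings the loop should stop after four pushes, and nothing leaves the buffer until the 'data' listener is attached.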
The following excerpt is from TailFile, which reads chunks from the underlying resource until backpressure is reached or all the data is read. Upon backpressure, the stream is stored and reading is resumed when _read is called.
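// Excerpt: kStartPos, kStream and kPollTimer are property keys defined elsewhere in TailFile
async _readChunks(stream) {
  for await (const chunk of stream) {
    this[kStartPos] += chunk.length
    if (!this.push(chunk)) {
      // Backpressure: remember the stream so _read can pick it up again
      this[kStream] = stream
      this[kPollTimer] = null
      return
    }
  }
  // Chunks read successfully (no backpressure)
  return
}

_read() {
  // The consumer wants more data; resume a stream that was stored on backpressure
  if (this[kStream]) {
    this._readChunks(this[kStream])
  }
  return
}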
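For readers who want something runnable, here is a self-contained sketch of the same idea. It is not TailFile’s code: the PausingReadable class, its _pump method, the pauses counter and the chunks generator are all hypothetical names. It also steps through the source with an explicit iterator rather than for await, because leaving a for await loop early asks the iterator to finish, which would close the async generator used here as the source.

```javascript
const { Readable } = require('stream')

// Hypothetical source: an async generator standing in for the underlying resource
async function* chunks(count) {
  for (let i = 0; i < count; i++) {
    yield Buffer.from(`chunk-${i}\n`)
  }
}

class PausingReadable extends Readable {
  constructor(source, opts) {
    super(opts)
    this.iterator = source[Symbol.asyncIterator]()
    this.pumping = false // Guards against two pump loops running at once
    this.pauses = 0      // Counts how often backpressure stopped the loop
  }

  async _pump() {
    if (this.pumping) return
    this.pumping = true
    try {
      while (true) {
        const { value, done } = await this.iterator.next()
        if (done) {
          this.push(null) // Source exhausted, end the stream
          return
        }
        if (!this.push(value)) {
          this.pauses++
          return // Backpressure: stop here and wait for the next _read
        }
      }
    } catch (err) {
      this.destroy(err)
    } finally {
      this.pumping = false
    }
  }

  _read() {
    // The consumer is ready for more data, so resume pulling from the source
    this._pump()
  }
}

// Usage: a small highWaterMark forces backpressure on this tiny data set
const readable = new PausingReadable(chunks(5), { highWaterMark: 8 })
const result = (async () => {
  let text = ''
  for await (const chunk of readable) text += chunk
  return text
})()

result.then((text) => {
  process.stdout.write(text)
  console.log(`paused by backpressure ${readable.pauses} time(s)`)
})
```

Keeping the iterator on the instance plays the role that storing the stream plays in the excerpt: the position in the source survives between _read calls, so no data is skipped or read twice.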
In Summary
There’s a lot more to it, especially when you talk about write streams, but the concepts are all the same. As I stated above, the information for streams is plentiful, but scattered. As I write this, I cannot find the place where I learned that ‘push can be called continuously’, but trust me, it’s a thing, even though the backpressure doc below always recommends waiting for _read. The fact is, depending on what you’re trying to implement, the code becomes less clear-cut, but as long as backpressure rules are followed and methods are overridden as required, then you’re on the right track!
Helpful Resources
These are the documents I really learned some things from. Check them out!
- Backpressuring in Streams
- Understanding Streams in Node.js
- Medium blog about streams
- Node.js file streams explained!