
Using DataWeave (DW) to Stream in Mule 4

Do you need to retrieve and process huge amounts of data, such as videos, images and documents, and move it from a source to a target system?
Streaming in DW is one way to manage that volume of data, and this blog explains how it works.

What is Streaming?

The term "streaming" refers to a continuous flow of data that can be consumed as it arrives, without first downloading it in full.

These streams improve efficiency and scalability since there is no need to load a huge amount of data into memory before a service execution. They can also accelerate the processing of large documents without overburdening the memory. One of the features of DataWeave is that it supports what's known as "end-to-end streaming" in Mule applications. 

Streaming in DW

When streaming is enabled, DW processes the data as it arrives instead of scanning the entire document to index it. With the deferred option, DW can also send the streamed output directly to the next message processor. This behaviour lets DataWeave in Mule process data more quickly while using less memory.
To enable streaming, we need these configuration properties:

  • Streaming property: Reads data from the source as a stream.
  • Deferred property: Passes the output stream to the next message processor.
  • Streaming can also be enabled by setting outputMimeType to the required data format with streaming set to true, for example application/json; streaming=true.

Below is an example scenario where DW streaming is used with the HTTP listener.

Because the data is huge, streaming is enabled to avoid overloading memory and to speed up processing. In this scenario, we process one file at a time.
In this example, a third-party system stores the content and exposes it through a REST API.


1. In the system API, an HTTP requestor invokes the REST API, gets the content, and returns it in the response without any transform activity.

Note: If any transformation is performed on the response received, the data will be loaded into memory.

Below are the steps to be followed to enable streaming.

  • The streaming property is added to the HTTP requestor in the system API that fetches the attachment information from the target system.

<http:request method="GET" doc:name="Request TARGET API To Get Content" doc:id="918128c2-fe2c-4523-be70-860da6941aa8" config-ref="HTTPS_Request_configuration" url="${getcontenturl}" outputMimeType="application/json; streaming=true"/>
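To show where this requestor sits, here is a minimal sketch of a system API flow that returns the streamed content with no transform in between. The flow name, the HTTP_Listener_config reference and the listener path are assumptions for illustration; only the request element comes from the scenario above.

```xml
<!-- Sketch only: flow name, listener config and listener path are hypothetical -->
<flow name="get-content-system-api-flow">
    <!-- Entry point that the process API calls -->
    <http:listener config-ref="HTTP_Listener_config" path="/getcontent"/>

    <!-- Calls the target REST API and reads the response as a stream -->
    <http:request method="GET"
                  doc:name="Request TARGET API To Get Content"
                  config-ref="HTTPS_Request_configuration"
                  url="${getcontenturl}"
                  outputMimeType="application/json; streaming=true"/>

    <!-- No transform here, so the listener returns the payload without a transformation step -->
</flow>
```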

2. The process API calls the system API above to get the content information, so streaming is enabled in this step as well to keep the data streaming through the pipeline.

  • The streaming property is added to the HTTP requestor in the process API that calls the system API to fetch the attachment information.


<http:request method="GET" doc:name="GET Content Details" doc:id="f58b7c40-0444-4a72-bbda-a46c4438a708" config-ref="HTTP_Request_configuration" path="/getcontent" outputMimeType="application/json; streaming=true"/>

  • If the data has to be passed to the next processor, the deferred property must be set in the transform activity that accesses the payload.
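As a minimal sketch of that transform, assuming the payload arrives from the requestor above with streaming already enabled, the deferred property goes on the output directive. The doc:name value is a hypothetical label, and the script simply passes the payload through.

```xml
<!-- Sketch only: doc:name is a hypothetical label -->
<ee:transform doc:name="Pass Content To Next Processor">
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
// deferred=true hands the output stream to the next message processor
output application/json deferred=true
---
payload]]></ee:set-payload>
    </ee:message>
</ee:transform>
```

The input side needs no extra directive here because the requestor's outputMimeType already carries streaming=true.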


In this way, DW streaming delivers better performance in this scenario. If multiple attachments have to be processed, concurrency has to be used together with streaming. Also, this streaming is supported for the JSON, XML and CSV data formats.

The limitations of DW Streaming

There are a few limitations of the DataWeave streaming solution. For example:

  • If deferred=true is set in the transform activity and an error occurs in that activity during end-to-end processing, the error is not thrown in that transform, which makes it harder to tell where the failure happened.
  • Streaming accesses each unit of the stream sequentially, so random access to the object/document is not supported.
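The sequential-access limitation can be illustrated with a sketch of a script that stays within it. It assumes the streamed payload is a JSON array whose items have id and name fields; those field names and the doc:name value are hypothetical.

```xml
<!-- Sketch only: field names and doc:name are hypothetical -->
<ee:transform doc:name="Map Streamed Items">
    <ee:message>
        <ee:set-payload><![CDATA[%dw 2.0
output application/json deferred=true
---
// payload is referenced once and its items are visited in arrival order
payload map (item) -> {
    id: item.id,
    name: item.name
}]]></ee:set-payload>
    </ee:message>
</ee:transform>
```

Expressions such as payload[-1], or a second reference to payload in the same script, would need random access to the document, so we would avoid them when streaming.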

