Node categories
Extractor
The Extractor node is in charge of extracting data from a source. This could be anything:
a CSV file, an XLSX file, a database, an API, and so on.
It yields Items from a function called extract().
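As an illustration of the extract() idea, here is a minimal sketch of an extractor that yields one Item per CSV row. The class name CSVExtractor and its constructor are hypothetical, and the class does not inherit from the library's real base class; only the generator-style extract() method comes from the description above.

```python
import csv
import io
from typing import Dict, Iterator, TextIO


class CSVExtractor:
    """Hypothetical extractor; it only mirrors the extract() shape described above."""

    def __init__(self, source: TextIO) -> None:
        self.source = source

    def extract(self) -> Iterator[Dict[str, str]]:
        # Yield one Item (here, a plain dict) per CSV row.
        for row in csv.DictReader(self.source):
            yield dict(row)


# An in-memory file stands in for a real CSV source.
sample = io.StringIO("name,age\nAda,36\nLinus,28\n")
items = list(CSVExtractor(sample).extract())
```

Because extract() is a generator, Items are produced lazily, so a large source does not have to be read into memory all at once.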
Configuration
- bucket_size
  - type: int
  - default: 1000
  - Size of the buckets in which extracted Items are packaged and enqueued to be sent to children Nodes.
- run_in_main_process
  - type: bool
  - default: True
  - If True, the Node will run in a thread in the main process. If False, the Node will run in its own spawned process.
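The effect of bucket_size can be pictured with the sketch below, which groups a stream of Items into lists of at most bucket_size elements. The bucketize helper is hypothetical; how the library actually packages and enqueues buckets is not specified here, so treat this as an approximation of the idea only.

```python
from itertools import islice
from typing import Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def bucketize(items: Iterable[T], bucket_size: int = 1000) -> Iterator[List[T]]:
    """Hypothetical helper: group items into lists of at most bucket_size."""
    if bucket_size < 1:
        raise ValueError("bucket_size must be a positive integer")
    iterator = iter(items)
    while True:
        # Take up to bucket_size items; the last bucket may be smaller.
        bucket = list(islice(iterator, bucket_size))
        if not bucket:
            return
        yield bucket


buckets = list(bucketize(range(7), bucket_size=3))
```

With seven Items and a bucket size of three, the last bucket holds only the single remaining Item.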
Transformer
The Transformer node is in charge of transforming Items.
Every Transformer node has a transform(item) method that receives an Item and returns its
transformed version.
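A transform(item) method might look like the following sketch. The class name NormalizeName and the "name" field are made up for the example, and the class is standalone rather than derived from the library's Transformer base class.

```python
from typing import Any, Dict


class NormalizeName:
    """Hypothetical transformer that tidies up the 'name' field of an Item."""

    def transform(self, item: Dict[str, Any]) -> Dict[str, Any]:
        # Work on a copy so the incoming Item is left untouched.
        transformed = dict(item)
        transformed["name"] = str(item["name"]).strip().title()
        return transformed


original = {"name": "  ada lovelace ", "age": 36}
result = NormalizeName().transform(original)
```

Returning a new dict instead of mutating the input is a design choice of this sketch, not something the description above requires.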
Configuration
- input_key
  - type: str
  - default: None
  - Optional. When provided, if the incoming Item is a dict, the Node will receive item[input_key]. If the Item is not a dict, the Node will receive item.input_key.
- run_in_main_process
  - type: bool
  - default: True
  - If True, the Node will run in a thread in the main process. If False, the Node will run in its own spawned process.
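The input_key lookup rule described above can be sketched as a small function. The name select_input is hypothetical and the library's internal implementation may differ; the sketch only mirrors the documented behaviour of key access for dicts and attribute access for anything else.

```python
from types import SimpleNamespace
from typing import Any, Optional


def select_input(item: Any, input_key: Optional[str] = None) -> Any:
    """Hypothetical helper mirroring the documented input_key behaviour."""
    if input_key is None:
        # No key configured: the Node receives the whole Item.
        return item
    if isinstance(item, dict):
        return item[input_key]
    # Non-dict Items are read by attribute name.
    return getattr(item, input_key)


from_dict = select_input({"price": 10}, "price")
from_object = select_input(SimpleNamespace(price=12), "price")
whole_item = select_input({"price": 10})
```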
Filter
The Filter node is in charge of filtering out Items.
Every Filter node has a filter(item) method that receives an Item and returns either the Item
or None, depending on whether the Item meets a certain condition.
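A filter(item) method could be sketched as follows. This example assumes that returning the Item keeps it and returning None drops it; that convention, the class name MinimumAge, and the "age" field are all assumptions made for illustration.

```python
from typing import Any, Dict, Optional


class MinimumAge:
    """Hypothetical filter that keeps Items whose 'age' reaches a minimum."""

    def __init__(self, minimum: int) -> None:
        self.minimum = minimum

    def filter(self, item: Dict[str, Any]) -> Optional[Dict[str, Any]]:
        # Assumed convention: return the Item to keep it, None to drop it.
        return item if item["age"] >= self.minimum else None


people = [{"name": "Ada", "age": 36}, {"name": "Tim", "age": 12}]
node = MinimumAge(18)
kept = [person for person in people if node.filter(person) is not None]
```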
Configuration
todo
Loader
The Loader node is in charge of loading Items: storing them in a database, a file, a Python object, and so on.
Every Loader node implements a load(item) method that loads the incoming Item.
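For the simplest destination mentioned above, a Python object, a load(item) method might look like this sketch. ListLoader and its stored attribute are hypothetical names, and the class is not derived from the library's Loader base class.

```python
from typing import Any, List


class ListLoader:
    """Hypothetical loader that stores Items in an in-memory Python list."""

    def __init__(self) -> None:
        self.stored: List[Any] = []

    def load(self, item: Any) -> None:
        # "Loading" here simply means appending the Item to the list.
        self.stored.append(item)


loader = ListLoader()
for item in ({"id": 1}, {"id": 2}):
    loader.load(item)
```

A database or file loader would follow the same shape, with the append replaced by a write to that destination.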
Configuration
todo
Bulking
Bulker
Bulker nodes store Items in a bulk/batch (a Python list) until the bulk reaches a maximum size. When that happens, the bulk is passed as a single Item to the connected children Nodes.
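The accumulate-then-emit behaviour can be sketched as below. ListBulker, max_size, push, and pending are hypothetical names; the real configuration options are not documented yet, and this sketch does not cover what happens to a partially filled bulk at the end of a run.

```python
from typing import Any, List, Optional


class ListBulker:
    """Hypothetical bulker: collects Items and emits a list once it is full."""

    def __init__(self, max_size: int) -> None:
        self.max_size = max_size
        self.pending: List[Any] = []

    def push(self, item: Any) -> Optional[List[Any]]:
        self.pending.append(item)
        if len(self.pending) >= self.max_size:
            # Hand over the full bulk and start a fresh one.
            full, self.pending = self.pending, []
            return full
        return None


bulker = ListBulker(max_size=2)
results = [bulker.push(n) for n in range(5)]
emitted = [bulk for bulk in results if bulk is not None]
```

After five Items with a maximum size of two, two full bulks have been emitted and one Item is still waiting in pending.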
Configuration
todo
DeBulker
DeBulker nodes unpack the Items contained in an incoming bulk Item (a list of Items, a set, a previously bulked set of Items, etc.) and emit them as individual Items to the connected children Nodes.
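Unpacking a bulk back into individual Items can be sketched with a short generator. The debulk function is a hypothetical stand-in for whatever the DeBulker node does internally; it accepts any iterable, such as a list or a set.

```python
from typing import Iterable, Iterator, TypeVar

T = TypeVar("T")


def debulk(bulk: Iterable[T]) -> Iterator[T]:
    """Hypothetical helper: emit each Item of a bulk individually."""
    for item in bulk:
        yield item


from_list = list(debulk([{"id": 1}, {"id": 2}]))
# Sets are unordered, so the result is sorted before comparing or displaying it.
from_set = sorted(debulk({3, 1, 2}))
```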
Configuration
todo