Apache NiFi GetFile and PutFile Processors

Hi, in this post I’ll discuss how to use the GetFile and PutFile processors in Apache NiFi.

Introduction

Apache NiFi is a real-time data ingestion platform that can transfer and manage data between different source and destination systems. It supports a wide variety of data formats such as logs, geolocation data, and social feeds. It also supports many protocols and systems such as SFTP, HDFS, and Kafka. This support for a wide variety of data sources and protocols has made the platform popular in many IT organizations.

We will take two processors, GetFile and PutFile, and establish a relationship between them. Our goal is to move files from one location to another.

GetFile Processor

The GetFile processor fetches files from a specific directory, optionally restricted to file names that match a filter. It also gives the user other options for more control over fetching, which we will set in the properties step below. Let’s add this processor.

Steps to add the GetFile Processor to the Workspace

  • Drag the processor icon from the menu onto the workspace and you will see the following window.

  • Now we need to add the GetFile processor. Go to the top right corner, type GetFile in the filter box, and double-click the result. You will see the processor added to the workspace.

  • Now we will set the GetFile properties. This step is important, because without it we can’t start the processor.

  • The properties shown in bold are mandatory; we can’t start the processor until they have values. Let’s fill in the properties and click the Apply button.
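
To make the filtering idea concrete, here is a minimal Python sketch of what a file pickup with a name filter and a batch size might look like. It is not NiFi code. The function name pick_up_files and its parameters are hypothetical, loosely named after GetFile's Input Directory, File Filter, and Batch Size properties, and it assumes the filter is a regular expression that must match the whole file name.

```python
import re
from pathlib import Path


def pick_up_files(input_dir, file_filter=r"[^.].*", batch_size=10):
    """Return up to batch_size regular files whose names fully match file_filter.

    Hypothetical helper for illustration only; the default pattern skips
    names that start with a dot (hidden files on Unix-like systems).
    """
    pattern = re.compile(file_filter)
    matches = []
    for path in sorted(Path(input_dir).iterdir()):
        # Only plain files are candidates; directories are skipped.
        if path.is_file() and pattern.fullmatch(path.name):
            matches.append(path)
        if len(matches) == batch_size:
            break
    return matches
```

For example, calling pick_up_files with a filter such as r"[^.].*\.pdf" would return only non-hidden PDF files from the given directory.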

PutFile Processor

The PutFile processor stores files from the data flow in a specific location. We will set its options in the properties step below. Let’s add this processor.

Steps to add the PutFile Processor to the Workspace

  • We can repeat the steps above, this time typing PutFile in the filter box. Double-click the result and you will end up with this screen.

  • The properties shown in bold are mandatory; we can’t start the processor until they have values. Let’s fill in the properties in the Properties tab, and don’t forget to check success and failure in the Settings tab.
  • Now click the Apply button.
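
As a rough illustration of what writing a file to a target directory involves, here is a small Python sketch. It is not NiFi code. The function put_file and its parameters are hypothetical, loosely modeled on PutFile's Directory, Conflict Resolution Strategy, and Create Missing Directories properties, and the returned strings stand in for the success and failure relationships. How "ignore" and "fail" are handled here is my assumption about reasonable behavior.

```python
import shutil
from pathlib import Path


def put_file(src, directory, conflict="fail", create_missing_dirs=True):
    """Copy src into directory and return 'success' or 'failure'.

    conflict is one of 'fail', 'ignore', or 'replace' (assumed semantics).
    """
    src, directory = Path(src), Path(directory)
    if not directory.is_dir():
        if not create_missing_dirs:
            return "failure"
        directory.mkdir(parents=True, exist_ok=True)
    target = directory / src.name
    if target.exists():
        if conflict == "fail":
            return "failure"
        if conflict == "ignore":
            return "success"  # the existing file is left untouched
    shutil.copy2(src, target)  # new file, or conflict == "replace"
    return "success"
```

Returning a label instead of raising an exception mirrors the way a flow routes each file to one of two outcomes, which is why both relationships need to be handled on the processor.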

Create Connection

  • Now we will create a connection between the two processors, and we will end up with the following screen.

  • Now we go to the input directory, where I have copied 100 PDF files.

  • Now we are ready to start the processors. To start them all at once, right-click on the workspace and click Start, and we are good to go. To make the flow easier to follow, I will start the processors individually.

  • After starting the GetFile processor, we will see 100 items in the queue.

  • Now we will start the PutFile processor.

  • Finally, we will check the output directory.
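
If you prefer to check the result with a script instead of by eye, a minimal Python sketch could look like the following. The helper names count_files and transfer_complete are hypothetical, and the check assumes the source files are removed from the input directory after pickup, so a finished run leaves the input directory with no PDF files.

```python
from pathlib import Path


def count_files(directory, suffix=".pdf"):
    """Count regular files in directory whose extension equals suffix."""
    return sum(
        1
        for p in Path(directory).iterdir()
        if p.is_file() and p.suffix.lower() == suffix
    )


def transfer_complete(input_dir, output_dir, expected, suffix=".pdf"):
    """True when the input holds no matching files and the output holds `expected`."""
    return (
        count_files(input_dir, suffix) == 0
        and count_files(output_dir, suffix) == expected
    )
```

For the run described above, you would pass your own input and output directory paths together with expected=100.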

So this is how we can use the GetFile and PutFile processors in Apache NiFi.

Thanks,
Kartheek Gummaluri