What this site is all about

If you are reading this page, congratulations! This HTML page was generated using the data flow tool NiFi. This site is not a configuration or 'how to use' guide for NiFi (plenty of articles and sites cover that). Rather, its main purpose is to showcase NiFi as a solution to your data flow needs. To start exploring the NiFi flows presented here, you will need to download the guest certificate (updated 8/13/2021), which is accessible through this site.

  1. Download the guest cert (guest.p12)
  2. Import the cert into your (Chrome) browser and use the following password: guest_nifi
  3. Go to the (secured) NiFi Demo Web Page
  4. Under development: Download the test cert (test.p12)(test_nifi)

Once the cert is successfully installed, you will be able to access the NiFi main page as a guest. The cert grants view-only access: you will not be able to modify any part of the NiFi flows, and a few of the flow processors are blocked out completely.

Please view the following video for step-by-step instructions:

Hardware/Software device configurations

This site is hosted on Amazon’s AWS, using EC2 (t2.medium instance type) for hosting and RDS (db.t2.micro instance type) for database storage. The minifi instance runs locally on a Raspberry Pi computer. The netflow data is version 9 (v9).

Earthquake Flow

This flow obtains the latest earthquake data from the USGS web site once per hour. The XML file containing the quake data is parsed into individual quake ‘events’ (each containing location, time, and magnitude), and the results are stored in a database. The data is further processed to merge all of the quake results into a single web page for viewing, which can be forwarded via a JMS message for delivery. A sample JMS receiver flow is provided as well.
Additionally, quake events are saved in local folders based upon the magnitude of the quake.
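
As a rough illustration of the parse-and-sort step described above, here is a minimal Python sketch. It is not the NiFi flow itself. The XML layout, the element and attribute names, the sample values, and the folder naming scheme (mag_5, mag_2, ...) are all assumptions made for the example and do not reflect the actual USGS schema.

```python
import math
import xml.etree.ElementTree as ET

# Hypothetical, simplified feed. Element and attribute names are assumptions,
# not the real USGS schema, and the values are made up.
SAMPLE_XML = """
<quakes>
  <quake time="2018-01-12T16:00:00Z" magnitude="5.3" location="Example location A"/>
  <quake time="2018-01-12T16:20:00Z" magnitude="2.1" location="Example location B"/>
</quakes>
"""

def parse_events(xml_text):
    """Split the feed into one dict per quake event."""
    root = ET.fromstring(xml_text.strip())
    return [
        {
            "location": quake.get("location"),
            "time": quake.get("time"),
            "magnitude": float(quake.get("magnitude")),
        }
        for quake in root.iter("quake")
    ]

def magnitude_folder(event):
    """Name a folder after the whole-number part of the magnitude (assumed scheme)."""
    return f"mag_{math.floor(event['magnitude'])}"

events = parse_events(SAMPLE_XML)
folders = [magnitude_folder(event) for event in events]
```

With the made-up sample, the first event would land in a folder for magnitude 5 quakes and the second in a folder for magnitude 2 quakes.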

The 24-hour and large-quake data can be viewed via a browser.

Disney World Wait Time Flow

This flow uses a custom NiFi processor (DisneyWaitTimes), which I wrote, to query the Disney web site for attraction wait times at the Disney World parks. The wait times are queried every 5 minutes between 9 AM and 10 PM. The JSON data is parsed into individual attraction ‘events’ (each containing attraction name, park ID, and wait time). The results are stored in a database.
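
The polling window and the split into attraction events can be sketched in Python as follows. This is only an illustration under stated assumptions: the response shape, the field names (parkId, attractions, waitMinutes), and the sample values are hypothetical, and treating 10:00 PM as the last poll of the day is a guess. It is not the DisneyWaitTimes processor code.

```python
import json
from datetime import datetime

POLL_INTERVAL_MIN = 5
FIRST_POLL_HOUR = 9    # 9 AM
LAST_POLL_HOUR = 22    # 10 PM (assumed to be the last poll of the day)

def should_poll(now: datetime) -> bool:
    """True on 5-minute marks from 9:00 AM through 10:00 PM inclusive."""
    minutes = now.hour * 60 + now.minute
    in_window = FIRST_POLL_HOUR * 60 <= minutes <= LAST_POLL_HOUR * 60
    return in_window and now.minute % POLL_INTERVAL_MIN == 0

# Hypothetical response shape; the real Disney payload is not shown on this page.
SAMPLE_RESPONSE = json.dumps({
    "parkId": "park-1",
    "attractions": [
        {"name": "Example Ride A", "waitMinutes": 35},
        {"name": "Example Ride B", "waitMinutes": 10},
    ],
})

def split_events(response_text: str) -> list:
    """Turn one park response into one event per attraction."""
    data = json.loads(response_text)
    return [
        {
            "attraction": attraction["name"],
            "park_id": data["parkId"],
            "wait_time": attraction["waitMinutes"],
        }
        for attraction in data["attractions"]
    ]

events = split_events(SAMPLE_RESPONSE)
```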

The attraction wait time data (generated by NiFi) for the various parks can be viewed via a web browser:

  1. Magic Kingdom
  2. Epcot
  3. Hollywood Studios
  4. Animal Kingdom

Netflow Processing

This flow receives netflow-generated network traffic data from a remote site (my basement router) and stores the results in a database. You can view a detailed description of netflow here. The remote site has a minifi agent running on a Raspberry Pi that processes the netflow traffic (using nfdump) and forwards it as compressed JSON. A copy of the minifi flow is provided on the main NiFi graph. Once the compressed netflow data is received, it is decompressed and the individual netflow events are parsed for database storage.
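
The receive side (decompress, then split into individual events) might look roughly like this in Python. The sketch assumes gzip compression and one JSON object per line. Neither detail is stated here, and the record fields (src, dst, bytes) are hypothetical placeholders rather than real nfdump output.

```python
import gzip
import json

def decode_netflow_batch(payload: bytes) -> list:
    """Decompress a batch and parse one JSON netflow event per line.

    Assumes gzip compression and newline-delimited JSON.
    """
    text = gzip.decompress(payload).decode("utf-8")
    return [json.loads(line) for line in text.splitlines() if line.strip()]

# Hypothetical records standing in for what the remote agent would send.
records = [
    {"src": "10.0.0.10", "dst": "1.2.3.4", "bytes": 84},
    {"src": "1.2.3.4", "dst": "10.0.0.10", "bytes": 84},
]

# Simulate the sender: serialize one record per line, then compress.
payload = gzip.compress("\n".join(json.dumps(r) for r in records).encode("utf-8"))

# Receiver: each decoded event would then be written to the database.
events = decode_netflow_batch(payload)
```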

The last 24 hours of netflow traffic data can be viewed via a browser.

Nextbus locations (SFO area)

This flow uses NiFi to retrieve NextBus data, which provides real-time bus location information for San Francisco. It was modified to display actual bus locations graphically (using Google Maps).

Stock Picker Flow

The Stock Picker flow retrieves several stock prices at various times during the day. Calculations are made regarding the price direction and, if a criterion is met, an email is generated naming the stock of interest. Price quotes are retrieved Monday through Friday at the following times:
8:45 AM - Premarket status
9:45 AM - Market open
3:45 PM - Prior to market close
7:45 PM - After market close

Here is a sample email alert:
DUST: Latest price 21.3 at 01/12/2018 16:00:00.482EDT.
Current day price changed by -5.5%.
Stock open price 22.54. Previous close 23.19
Current price changed from previous day close by -8.15%.
Previous day stock price changed by -8.15 %.
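
The percentages in the sample alert are consistent with a plain percent-change calculation, sketched below in Python using the prices from the sample. The alert criterion used by the flow is not described on this page, so the should_alert function and its 5% threshold are purely hypothetical.

```python
def percent_change(current: float, reference: float) -> float:
    """Change of current relative to reference, in percent."""
    return (current - reference) / reference * 100.0

# Prices taken from the sample email alert.
latest, open_price, prev_close = 21.3, 22.54, 23.19

day_change = round(percent_change(latest, open_price), 2)    # change since the open
close_change = round(percent_change(latest, prev_close), 2)  # change since previous close

def should_alert(change_pct: float, threshold_pct: float = 5.0) -> bool:
    """Hypothetical criterion: alert when the move exceeds the threshold either way."""
    return abs(change_pct) >= threshold_pct
```

Here day_change comes out to -5.5 and close_change to -8.15, matching the figures in the alert.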

Twitter Demo Flow

The Twitter Demo flow tracks the Twitter feed of a single user. Tweets about the user and tweets sent by the user are processed, and their messages are extracted and merged into a single result. Tweets about the user are merged into 'blocks' of 100 messages. They are queued for 12 hours.
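
Grouping messages into blocks of 100 can be pictured with a few lines of Python. This is a simplified sketch, not how NiFi performs the merge. In particular, emitting a final short block right away is an assumption, and the 12-hour queueing is not modeled.

```python
def into_blocks(messages: list, block_size: int = 100) -> list:
    """Group messages into consecutive blocks of at most block_size items."""
    return [
        messages[start:start + block_size]
        for start in range(0, len(messages), block_size)
    ]

# Placeholder messages standing in for extracted tweet text.
sample_messages = [f"message {n}" for n in range(250)]
blocks = into_blocks(sample_messages)
block_sizes = [len(block) for block in blocks]
```

With 250 placeholder messages this yields two full blocks and one partial block of 50.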

Network Packet Capture Processing

The Network Packet Capture Processing flow takes raw pcap files (collected by the tcpdump command) and sorts them into the individual protocols (e.g. http, udp, ssh, telnet ...). The flow uses a custom processor (DaffodilParse) to transform the (binary) pcap data to JSON format.
Here is a sample:
{"PCAP": {"PCAPHeader": {"MagicNumber": "D4C3B2A1","Version": {"Major": "2","Minor": "4"},"Zone": "0","SigFigs": "0","SnapLen": "262144","Network": "1"},"Packet": [{"PacketHeader": {"Seconds": "1511573177","USeconds": "685686","InclLen": "98","OrigLen": "98"},"LinkLayer": {"Ethernet": {"MACDest": "485D36F12345","MACSrc": "6C0B84012345","Ethertype": "2048","NetworkLayer": {"IPv4": {"IPv4Header": {"Version": "4","IHL": "5","DSCP": "0","ECN": "0","Length": "84","Identification": "45724","Flags": "2","FragmentOffset": "0","TTL": "64","Protocol": "1","Checksum": "3804","IPSrc": "10.0.0.10","IPDest": "1.2.3.4"},"Protocol": "1","ICMPv4": {"Type": "8","Code": "0","Checksum": "48755","EchoRequest": {"Identifier": "14113","SequenceNumber": "1","Payload": "B9C6185A0000000067760A0000000000101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F3031323334353637"}}}}}}},{"PacketHeader": {"Seconds": "1511573177","USeconds": "695023","InclLen": "98","OrigLen": "98"},"LinkLayer": {"Ethernet": {"MACDest": "6C0B84012345","MACSrc": "485D36F12345","Ethertype": "2048","NetworkLayer": {"IPv4": {"IPv4Header": {"Version": "4","IHL": "5","DSCP": "0","ECN": "0","Length": "84","Identification": "57858","Flags": "0","FragmentOffset": "0","TTL": "61","Protocol": "1","Checksum": "8822","IPSrc": "1.2.3.4","IPDest": "10.0.0.10"},"Protocol": "1","ICMPv4": {"Type": "0","Code": "0","Checksum": "50803","EchoReply": {"Identifier": "14113","SequenceNumber": "1","Payload": "B9C6185A0000000067760A0000000000101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F3031323334353637"}}}}}}},{"PacketHeader": {"Seconds": "1511573178","USeconds": "687359","InclLen": "98","OrigLen": "98"},"LinkLayer": {"Ethernet": {"MACDest": "485D36F12345","MACSrc": "6C0B84012345","Ethertype": "2048","NetworkLayer": {"IPv4": {"IPv4Header": {"Version": "4","IHL": "5","DSCP": "0","ECN": "0","Length": "84","Identification": "45806","Flags": "2","FragmentOffset": "0","TTL": "64","Protocol": "1","Checksum": 
"3722","IPSrc": "10.0.0.10","IPDest": "1.2.3.4"},"Protocol": "1","ICMPv4": {"Type": "8","Code": "0","Checksum": "18028","EchoRequest": {"Identifier": "14113","SequenceNumber": "2","Payload": "BAC6185A00000000DE7C0A0000000000101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F3031323334353637"}}}}}}},{"PacketHeader": {"Seconds": "1511573178","USeconds": "697692","InclLen": "98","OrigLen": "98"},"LinkLayer": {"Ethernet": {"MACDest": "6C0B84012345","MACSrc": "485D36F12345","Ethertype": "2048","NetworkLayer": {"IPv4": {"IPv4Header": {"Version": "4","IHL": "5","DSCP": "0","ECN": "0","Length": "84","Identification": "58674","Flags": "0","FragmentOffset": "0","TTL": "61","Protocol": "1","Checksum": "8006","IPSrc": "1.2.3.4","IPDest": "10.0.0.10"},"Protocol": "1","ICMPv4": {"Type": "0","Code": "0","Checksum": "20076","EchoReply": {"Identifier": "14113","SequenceNumber": "2","Payload": "BAC6185A00000000DE7C0A0000000000101112131415161718191A1B1C1D1E1F202122232425262728292A2B2C2D2E2F3031323334353637"}}}}}}}]}}

The results are routed based upon the protocol (source port). Here is an example of a DNS request:
{"PacketHeader":{"Seconds":"1511573177","USeconds":"248015","InclLen":"70","OrigLen":"70"},"LinkLayer":{"Ethernet":{"MACDest":"485D36F12345","MACSrc":"6C0B84012345","Ethertype":"2048","NetworkLayer":{"IPv4":{"IPv4Header":{"Version":"4","IHL":"5","DSCP":"0","ECN":"0","Length":"56","Identification":"16770","Flags":"0","FragmentOffset":"0","TTL":"64","Protocol":"17","Checksum":"56051","IPSrc":"10.0.0.10","IPDest":"192.36.148.17"},"Protocol":"17","TransportLayer":{"UDP":{"UDPHeader":{"PortSrc":"52206","PortDest":"53","Length":"36","Checksum":"24181"},"Data":"D88D0010000100000000000100000200010000291000000080000000"}}}}}}}

Display Web Results

This flow receives HTML web page requests and returns the results to the browser. Data for the displayed (Google) graphs is retrieved from a database. Valid page requests are:

  1. Overview
  2. Download the guest cert (guest.p12)
  3. 24 hour quake results
  4. Quakes over mag. 5
  5. Magic Kingdom Wait Times
  6. Epcot Wait Times
  7. Hollywood Studios Wait Times
  8. Animal Kingdom Wait Times
  9. Last 24 hour netflow traffic
  10. Bus locations in San Francisco area

Minifi

This is a copy of the flow that resides on the remote minifi agent. It forwards netflow data to the main NiFi graph.