Posted 16 years ago on 2/26/2008 and updated 9/24/2010
Take Away:
DFDs document a process by documenting the flow of data throughout the process. They depict how data interacts with a system. They can be used to engineer a new process, document an existing process, or re-engineer an existing process.
KB100887
Got 10 minutes? Want to get started with Data Flow Diagrams (DFD)?
This article is part of our series of 10 Minute Quick Starts. Each quick start is step by step, assumes you know very little about the subject, and takes about 10 minutes. You can use them to scratch the surface of areas you want to learn and as a quick review when returning to something after a long absence.
Data Flow Diagrams
DFDs document a process by documenting the flow of data throughout the process.
--Mike Prestwood
Data Flow Diagrams (DFDs) are useful in modeling a business process because they systematically subdivide a task. They also help the analyst understand the system that they are trying to model.
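DFDs are also known as Data Flow Graphs or Bubble Charts, and are often referred to by the name of the notation used to draw them, such as Yourdon/DeMarco notation or Gane/Sarson notation. Petri Networks are sometimes mentioned alongside DFDs, although they are a related but distinct modeling technique.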
Traditional Data Flow Diagrams use four (4) symbols: a square, an arrow, a circle, and parallel lines.
The square represents an external entity, either a source of data or a destination for data. It is labeled with a noun that describes the entity (e.g. Person(s), Department, Division, Office, Customer, Manager, Clerk, Another Computer System, etc.)
The arrow represents data flowing from one place to another and is labeled with a noun that describes the data flowing (e.g. Payment, Bill, Paycheck, Money, etc.)
The circle represents a process (the conversion of input data to output data). The process is labeled with a verb-noun combination (e.g. Gather Data, Process Loan, Contact Customer, etc.)
The parallel lines represent a data store, or collection of data, and are labeled with either a noun or adjective-noun combination (e.g. Table, Customer Table, Computer File, Notebook, Paper Notebook, etc.)
The container symbols (square, circle, and parallel lines) must have a description. All arrows must have a description except for arrows going to/from a data store (parallel lines).
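One way to make the symbols and labeling rules above concrete is to model them as data. The following is only a minimal sketch in Python; the names Kind, Node, and Flow are made up for this illustration and are not part of any DFD standard or tool.

```python
from dataclasses import dataclass
from enum import Enum


class Kind(Enum):
    EXTERNAL_ENTITY = "square"       # a source or destination of data
    PROCESS = "circle"               # converts input data to output data
    DATA_STORE = "parallel lines"    # a collection of data


@dataclass(frozen=True)
class Node:
    kind: Kind
    label: str  # every container symbol must have a description

    def __post_init__(self):
        if not self.label.strip():
            raise ValueError("container symbols must be labeled")


@dataclass(frozen=True)
class Flow:
    source: Node
    target: Node
    label: str = ""  # noun describing the data, e.g. "Payment"

    def __post_init__(self):
        # Arrows may be unlabeled only when they go to/from a data store.
        touches_store = Kind.DATA_STORE in (self.source.kind, self.target.kind)
        if not self.label.strip() and not touches_store:
            raise ValueError("arrows need a label unless they touch a data store")


customer = Node(Kind.EXTERNAL_ENTITY, "Customer")
process_payment = Node(Kind.PROCESS, "Process Payment")
payments = Node(Kind.DATA_STORE, "Payment Table")

flows = [
    Flow(customer, process_payment, "Payment"),
    Flow(process_payment, payments),  # no label needed: goes to a data store
]
```

Creating Flow(customer, process_payment) without a label raises a ValueError in this sketch, because neither end of that arrow is a data store.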
The rules for properly using arrows are:
The arrowhead indicates the direction the data is traveling.
Processes must have at least one input and one output.
Never connect two data stores (parallel lines).
Never connect two external entities (squares).
Moving from one process (circle) to another is okay.
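The arrow rules above are mechanical enough to check automatically. Here is a rough sketch of such a check in Python, assuming a simple made-up representation: each node is a (name, kind) pair and each arrow is a (source, target) pair. The function name check_flows and the kind strings are hypothetical, and only the rules listed above are checked.

```python
from collections import namedtuple

# Hypothetical representation; kind is "entity", "process", or "store".
Node = namedtuple("Node", "name kind")


def check_flows(nodes, flows):
    """Return a list of rule violations; an empty list means no rule was broken."""
    problems = []
    kinds = {n.name: n.kind for n in nodes}
    for src, dst in flows:
        if kinds[src] == "store" and kinds[dst] == "store":
            problems.append(f"{src} -> {dst}: connects two data stores")
        if kinds[src] == "entity" and kinds[dst] == "entity":
            problems.append(f"{src} -> {dst}: connects two external entities")
    for n in nodes:
        if n.kind != "process":
            continue
        has_input = any(dst == n.name for _, dst in flows)
        has_output = any(src == n.name for src, _ in flows)
        if not (has_input and has_output):
            problems.append(f"{n.name}: needs at least one input and one output")
    return problems


nodes = [
    Node("Customer", "entity"),
    Node("Process Order", "process"),
    Node("Orders", "store"),
    Node("Archive", "store"),
]
good = [("Customer", "Process Order"), ("Process Order", "Orders")]
bad = good + [("Orders", "Archive")]  # store-to-store arrow breaks a rule

print(check_flows(nodes, good))  # []
print(check_flows(nodes, bad))   # ['Orders -> Archive: connects two data stores']
```

Process-to-process arrows pass through untouched, since moving from one process to another is okay.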
Alternate Symbols
Many additions and variations to the traditional symbols have been developed over the years. Two of the more common ones are Gane/Sarson and Yourdon.
Gane/Sarson
Processes are represented by a square with rounded corners instead of a circle.
Data Stores are represented by an open ended rectangle instead of parallel lines.
Many alternate types of connectors
Strict use of an identifier
Yourdon
Sources of data are called External Interactors
External Interactors are represented by a rectangle (no squares).
New symbol representing the state of data.
Includes a triangle and oval as symbols.
Use of a two-line border to represent "multiple".
Various alternate connectors.
Leveling
The most common approach to creating Data Flow Diagrams involves a technique that is often called leveling. Leveling is the process of breaking the Data Flow Diagram into varying levels of detail.
The first level, or Context Diagram, is often called a Level 0 DFD. The context diagram shows the "Big Picture", as well as how the main processes interact with each other. To create the context diagram, start with a process in the middle of the page and work in a free-flow order around the central process.
Each succeeding level breaks down the processes from the previous level, beginning with the context diagram. Each process (circle) is numbered to indicate the level.
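As an illustration of numbering processes by level, here is a small sketch in Python. It assumes a dotted numbering scheme (the central process on the context diagram is "0", its breakdown is 1, 2, 3, and process 2 breaks down into 2.1, 2.2, and so on). That scheme is a common convention but is an assumption here, not something this article prescribes, and the function names are made up.

```python
def level_of(process_id):
    """Level implied by a dotted identifier: "0" -> 0, "3" -> 1, "3.2" -> 2."""
    if process_id == "0":
        return 0  # the central process on the context diagram (assumed id)
    return len(process_id.split("."))


def children_of(parent_id, count):
    """Identifiers for the processes that break down parent_id on the next level."""
    prefix = "" if parent_id == "0" else parent_id + "."
    return [f"{prefix}{i}" for i in range(1, count + 1)]


print(children_of("0", 3))  # ['1', '2', '3']
print(children_of("2", 2))  # ['2.1', '2.2']
print(level_of("2.1"))      # 2
```

With identifiers like these, a reader can tell from the number alone which parent process a circle came from and how many levels down it sits.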
Details
A detailed DFD generally has the external entities on top and bottom. The data stores, then, go in the center and the processes go between the external entities and the data stores.
When labeling Data Flow Diagrams, use the following guidelines:
Use short titles with a short description right on the data flow diagram.
Assign identifiers after the diagram is complete.
Label all containers for context diagrams and only processes for detailed Data Flow Diagrams (level 1, 2, 3, etc.)
Summary
This brief article in no way covers all the details of Data Flow Diagrams, and it does not include information on the analysis phase of DFDs. It does, however, provide a solid introduction to DFDs, which will assist you in learning more.
...hmmm....not usually. The level 0 diagram is supposed to show the context of the data-system you are documenting and if you're tempted to create two, then I would think you are either trying to document two systems or are not thinking at a high enough "level" for the level 0 context diagram. In other words, you need to continue leveling.
However, to me, documentation is strictly for the purpose of communication. So, if one of my analysts came to me with two level 0 diagrams, I would simply ask, "how does this communicate better than a single level 0 diagram?"
Does that make sense? Can you think of a reason why you would want to use two level 0 context diagrams?
Well, after just over a year, I finally found a bit of time to circle back and add a few pics to this article. Thanks for the kind words and great suggestion.
Would you ever create two DFD diagrams to model a 'create' versus 'change' process or system? For example, if I create a sales order I would have things like customer data, product data etc as input and the actual sales order as output.
Now suppose I want to make a change to the sales order. The input to the 'Change Sales Order' might be 'sales_order_number' which would be used to read the data store 'SALES-ORDERS' to retrieve the sales order and then open it to changes by the user.
Should this be represented as two diagrams? A voice inside my head wants to say no, but I'm not sure how to resolve the difference between 'create' versus 'change' on one diagram.
Also, consider an option to only display the sales order i.e. no changes allowed - is this a third process? I understand that a process requires at least one input dataflow and one output dataflow. So while the 'Display Sales Order' process would also have 'sales_order_number' as input to read the 'SALES-ORDERS' datastore, can we consider displaying the sales order on the screen as output? In other words, since nothing is ever written back to the 'SALES-ORDERS' datastore, then the 'at least one input dataflow and one output dataflow' rule is broken.
How would you model an application that has create, change and display options?
Thank you for not answering homework questions. I've been introduced to data flow diagrams this week and have an assignment due in two days. I'd rather understand it the hard way than have the answers handed to me. Your explanation has helped me. Thank you.
Your article is very important to me. Thanks for posting it.
I've heard that the number of input data flow arrows and output data flow arrows on the context diagram and on the level 0 DFD should be equal, but I'm confused about how to count them individually. How do I identify whether a certain data flow arrow is an input or an output?