How to prepare data for use in Intersection Magic
Intersection Magic has been designed to support a wide variety of data formats. This data may originate on a mainframe, mini, or micro computer. In some areas data may be obtained directly from the state, California, Iowa, Oregon, Arizona and Idaho to name a few. In other areas, data is stored at police departments on mainframes. In still others, data is maintained on PC database programs. Intersection Magic supports all of these cases. This article describes the various options and requirements for preparing data to import into Intersection Magic.
There are two general flavors of ASCII data files: fixed-field and delimited. Both of these formats contain standard ASCII characters (typically #32 – #125) and end of line (EOL) characters. The EOL characters are #13 + #10 (carriage return + line feed). Also, each of the lines may contain up to 2000 characters.
- Fixed-field: Fixed field files contain data that is column oriented. Each field is always found in the same column and always has the same length.
- Delimited: Delimited files contain data where each field is separated from the next with a delimiter. The most common delimiter is a comma (,). Another common delimiter is the tab character. There are always the same number of fields for any given record type. Typically, a file will have only one record type, although there are exceptions.
Intersection Magic supports both types of ASCII files. When preparing a configuration, we prefer comma delimited data where each line represents one crash. The most common supported export formats that Intersection Magic supports are:
- One line per crash: This means that each line in the file provides all the information needed to describe a crash. All vehicle data is included in the same line. (i.e. Veh1Dir, Veh2Dir, etc.) This is the easiest type of data to import.
- Multiple lines per crash, each representing one vehicle * : This means each line contains common information (i.e. case id, weather, date, time) as well as the data for a single vehicle (i.e. Direction, Movement, DUI, etc). In this case, a common “id” must exist in each line, in the same position that can be used to group the lines together to identify a single crash. Also, all of the lines that are to be grouped must be contiguous (one following the next).
- Multiple lines per crash, more than one format across lines * : This means that there are several different record formats being dumped into the same ASCII file. For example, one record type might be “environment”, including date, time, weather, location, etc. The next record type might be “vehicle”, including Direction, Movement, Type, Speed, etc. Another record type might be “occupants”, including Age, Injury severity, seatbelt, etc. Like the prior format, this format requires that each line contain common “id” to group the data lines. In addition, each line must have a field that indicates its data type. (i.e. “ENV”, “VEH”,”OCC” or “1”, “2”, “3”)
- Multiple Crash Files * : In addition to supporting these formats within a single file, crash records that originate in a relational database may be exported into multiple files, one for each table. In this case, all of the rules described in the previous formats apply. In addition, in order to “join” the files/tables at import time, there must be a common “id” across all the files/tables. Intersection Magic cannot join tables using multiple id fields. The field used as the case id in Intersection Magic must exist in each line of each file to be imported.
* Note – With crash data contained in multiple lines or files, Pd’ Programming strongly recommends an order field for vehicles and/or persons in the data files. This will assure that if data is re-imported at another date the order of vehicles and/or persons will remain the same.
During the configuration process, a sample of these data files should be sent to Pd’ Programming. It is best to send more than one year (or up to 5 meg) of data for testing purposes. It is also critical to provide the database schema, including at least:
- Field names and positions within the ASCII files.
- Lookup values for all fields in the data ( i.e. 1=”rear end”; 2=”side swipe”; N=”North”; etc.) Providing this data in electronic format is extremely helpful.
- Sample hardcopy reports. These will be used to verify the data in the ASCII files, as well as to verify the data after it has been imported . This could be as few as 5 hardcopy reports, or as many as needed. Another option is to provide a table that indicates the same information. Within the hardcopy reports, there should exist:
- At least one injury or fatality crash.
- At least one single-vehicle crash.
- At least one multi-vehicle crash.
- At least one crash at an intersection
- At least one mid-block / milepost crash
This is used to determine how distance and direction from intersection (or addresses) is coded.
- At least one left-turn crash
- At least one rear-end crash
- At least one pedestrian crash
This is used to determine how pedestrians are coded – as vehicle or as attribute to crash
- At least one bicycle crash
This is used to determine how bicycles are coded – as vehicle or as attribute to crash
- Example reports of any data elements that you feel are unique or need extra attention when describing your data. (i.e. The use of +/- in the Arizona database to indicate the direction from the intersection)