What to do when your data provider changes the format.
Intersection Magic is ‘configured’ for a particular format of ASCII data. When you first purchased IMW, we probably prepared a custom configuration. This custom configuration defines, among other things, the order of the fields in the ASCII import file; the meanings of any coded fields; the method of parsing, or extracting the data from any given field. Periodically, states re-evaluate the crash report form that they are using. When this is done, they frequently add or remove fields. Occasionally, they add or remove codes for a particular fields. Rarely, and unfortunately, they change the value that a code represents in a particular field. This article describes some things to think about if you find that your data source is different than it was when we prepared your configuration.
Intersection Magic’s import routine is very literal. When you tell it that it will find the date in the 5th column, that’s where it will look. If the ASCII file suddenly contains an extra character at the beginning of a line, then you have to tell the importer that it should look in the 6th column from now on.
This problem is often less dramatic in comma delimited files, because changing the length of a field only affects that single field. And, if the meaning of that field doesn’t change, then there really won’t be any difference in the importer at all.
If your data source has changed, you will need to consider the following questions:
- Has enough changed to warrant changing the configuration?
- It may be that the only changes are spelling, or date format, or the addition of fields that don’t even appear in the ASCII export file. In this case, the only reason for changing anything in IMW is to take advantage of new information.
- Is the only change the addition of new fields?
- If so, then the ASCII file may not have changed, or it may simply contain extra fields at the end. This will not affect the import process. However, if the new fields have been added to the middle of the ASCII record, then this will offset the fields that follow and require changes.
- Were fields that were in use, removed?
- If fields were removed (rare), then you must first determine if they were needed by IMW. The required fields for your configuration are named in your .DRD and .TBS files. The most common required fields are type of collision; direction of travel; vehicle movement; and location information. If the fields were not used by IMW, then the only concern is if the removal caused other fields in the ASCII to be moved to new positions. If not, (i.e. a blank is left in place of the old fields) then no change is needed. If other field positions were changed, you probably need to change your import definition.
- Are there new codes for some fields?
- This is actually common. It is also harmless, although some simple housekeeping can help IMW display these new codes appropriately. New codes can be added to the appropriate section in your lookup (.LUS) file. Also, it may be necessary to add these new codes to your .CNG file if the ASCII data uses text instead of codes.
- Have some codes been removed from some fields?
- This is less common, and requires that you look at your .DRD and .TBS files to see if those values were referenced. If they were not, it will probably be fine to just ignore the change. Those values simply won’t appear in future data. HOWEVER: If the codes for other values in that field have changed, you may have more work cut out for you. See the next section “Have some codes changed their meaning?”
- Have some codes changed their meaning?
- This is the tricky one. If you used to have a field that had the following coded values:
- Head on
- Approach turn
- Rear end
And it was decided to remove Approach turn, resulting in this:
- Head on
- Rear end
This will cause problems if it is not addressed. See the “solution” section below.
In most of the cases above, the solution is to update some of: your lookup file, your object table, your drawdata file, and most often, your import definition.
Then there’s the case of changed codes within the fields:
As you can see in the example above, while the program thinks that 4 means broadside, it will start receiving data where 4 means rear end. This will result in incorrect data and analysis. You have a few options in this case:
- Create a new configuration and begin maintaining 2 different databases
- This strategy is the easiest, but it means that the program can not access data across the time period where the data changed. For example, you could not create a diagram showing data from both databases. A benefit of this option is that after a few years, you have some historic data; you can throw away the old data, and you can just keep using the format.
- Create a new configuration and database and then import historic data followed by normal current data imports
- This is the best solution, provided that your data supplier has converted the historic data to match the new format. (i.e. they have converted all 5’s to 4’s and 4’s to 3’s in the above example) To find out if this is possible, just ask you data supplier if you can get the past 5 years in the new format. If so, you are on your way.
- Convert your existing historic data to the new format and re-import it into IMW
- This involves the most work. Pd’ Programming can often assist in this process, and there is usually a fee involved. This is the second best solution, as your old data ends up being made “current”.
- Import the new data by modifying it to fit the old format
- This method is not recommended, although it is often one of the easier solutions. By modifying the new data as it is imported, to look like the old data, your historic and new data will be in the same database and comperable. However, you will generally be losing some level of detail, and every future change to the data format will need to be brought backwards. It is usually worth the effort to bring the old data current instead.