Archive for April, 2010
Using the pdNickname RELFLAG Field
pdNickname is a unique database of nearly 50,000 records designed to facilitate comparing sets of first-name data based on nicknames, diminutives, pet names, variations and given names. One of the most important fields in the database product is RELFLAG, which stands for “Relationship Flag”.
The RELFLAG field contains one of two possible values:
- Close relationship between the name and variation (common variants): Includes closely associated nicknames, diminutives and pet names as well as first name variations that are considered closely related.
- More distant relationship between the name and variation (less common variants): Includes alternate forms of the names, often deriving from another culture, as well as nicknames, diminutives and pet names that are relatively uncommon.
PDNICKNAME VARIATIONS FOR THE GIVEN NAME “SAMUAL”

The RELFLAG field indicates if the name and variation have a (1) close or (2) more distant relationship.
The RELFLAG field is useful for controlling what is to be considered an acceptable match. As more distant relationships are included in matches, the error rate naturally rises. The error rate increase is usually not substantial, but it is measurable in hundredths and tenths of a percent.
RECOMMENDATIONS
RESIDENTIAL LISTS: Additional accuracy can be achieved if only close relationships are considered, but with residential lists the increase in error rate is almost always very small even when the more distant relationships are included, rarely more than 0.02% in our testing. Therefore, under best practices, it is fully acceptable to use all RELFLAG relationships when matching residential lists. With the exception of the George Foreman family, most errors that might occur result from different given names that share the same nickname or other variation.
BUSINESS & ORGANIZATION LISTS: With business and organization lists, on the other hand, including the more distant relationships typically raises the error rate more than it does with residential lists. Our testing normally shows an increase that is still less than 0.1%, but we have seen it as high as 0.3% with some large lists. Under best practices, it is recommended that only close relationships be considered when processing business and organization lists.
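The recommendations above can be sketched as a simple filter on RELFLAG. This is only an illustration under stated assumptions: it assumes the flag is stored as 1 (close) and 2 (more distant), as the numbering in the caption above suggests, and the record layout, the `name` and `variation` field names, the `acceptable_variations` helper and the sample rows are all hypothetical rather than taken from the product.

```python
from dataclasses import dataclass

# Hypothetical record layout: only RELFLAG is named in the text above.
@dataclass(frozen=True)
class NicknameRecord:
    name: str        # given name
    variation: str   # nickname, diminutive or variant
    relflag: int     # assumed coding: 1 = close, 2 = more distant

# Illustrative rows only, not actual pdNickname data.
SAMPLE = [
    NicknameRecord("ROBERT", "BOB", 1),
    NicknameRecord("ROBERT", "ROB", 1),
    NicknameRecord("ROBERT", "ROBERTO", 2),
]

def acceptable_variations(records, name, list_type):
    """Return the variations of `name` allowed for the given list type."""
    # Residential lists accept all relationships;
    # any other list type is limited to close relationships.
    max_flag = 2 if list_type == "residential" else 1
    return {r.variation for r in records
            if r.name == name and r.relflag <= max_flag}
```

Called with `"business"`, the sketch returns only the close variations; called with `"residential"`, it also returns the more distant ones.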
Keep Database Content Consistent
Databases and spreadsheets are great for organizing information, but to get the most out of them, content must be entered in a consistent manner. Too often data entry results in a mishmash of styles and formatting. For example, one record may have the name entered first name first while the next has last name first. Or some telephone numbers may be entered with parentheses delimiting the area code while others employ a hyphen or slash.
It is important for database managers, usually in conjunction with interested associates, to develop rules for entering information into their tables. Inconsistencies make it difficult to filter, sort, query, merge and process the database records and can even lead to data corruption and security vulnerabilities. Data entry rules can take the form of a written style guide, preprogrammed validation routines or a combination of both.
STYLE GUIDE
A style guide (sometimes called a style manual) is a written set of standards for entering content. It is meant to foster uniformity in style and formatting by providing rules indicating how each data element is to be handled. These elements can include names, addresses, telephone numbers, dates of birth and any other fields and field sets that would benefit from consistency. Below is an example of part of a style guide:
| Data Element | Style Requirements | Example |
|---|---|---|
| Customer name | Include. Enter first name first. Exclude title. Punctuate initials and suffix abbreviations. | George E. Jones, Jr. |
| Home telephone | Include. Format: (###) ###-####. | (800) 609-9231 |
| Fax | Include, if applicable. Format: (###) ###-####. | (818) 480-4391 |
| Web address | Include with http://. | http://www.peacockdata.com |
A style guide does not have to be exhaustive, only comprehensive enough to advance sufficient standardization. Depending on the situation, one or two pages is often enough, while other circumstances may require a book-sized document. In any case, style and formatting rules must be clear and readable to be effective, and the inclusion of examples has proven to be extremely helpful for end users.
VALIDATION ROUTINES
In addition to or instead of written guides, style and formatting rules are often enforced with validation routines built directly into the interfaces used for data entry. These make sure information is valid, reasonable and secure while at the same time further reducing the opportunity for user error.
Some validation routines are automated so input is restricted at the time of data entry. The most common of these is an input mask which sets a template or pattern for a field or field set that automatically formats entered content. Input masks are particularly suitable for telephone numbers, ZIP and postal codes, times and dates.
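As a rough illustration of the idea, an input mask for the telephone format in the sample style guide could behave like the sketch below. Real database tools usually provide input masks as a built-in field property; the `apply_phone_mask` function here is a hypothetical stand-in and assumes 10-digit North American numbers.

```python
import re

def apply_phone_mask(raw):
    """Format a 10-digit number as (###) ###-####, or return None."""
    digits = re.sub(r"\D", "", raw)   # drop everything except digits
    if len(digits) != 10:
        return None                   # not enough (or too many) digits to fill the mask
    return f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
```

Whether the user types hyphens, slashes or no punctuation at all, the stored value comes out in the same format, and an incomplete number is rejected rather than saved.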
Other validation routines are programs set to run when an end user tries to exit a field or save a record. These are similar to (or often exactly like) scripts that check content entered in web forms. Some routines are simple and may only guard against invalid characters and missing information, while others may be very complex and even check spelling and grammar.
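A simple routine of this kind, run when a record is saved, might look like the following sketch. It checks a record against the sample style guide above; the dictionary keys, the `validate_record` name and the comma-based test for a last-name-first entry are assumptions made for illustration, and the name check is a rough heuristic rather than a complete rule.

```python
import re

# Pattern for the (###) ###-#### format from the sample style guide.
PHONE_RE = re.compile(r"\(\d{3}\) \d{3}-\d{4}")

def validate_record(record):
    """Return a list of style problems found in a record (empty if none)."""
    errors = []

    name = record.get("customer_name", "").strip()
    if not name:
        errors.append("customer_name: required")
    elif "," in name.split()[0]:
        # Heuristic: a comma right after the first word suggests "Last, First".
        errors.append("customer_name: enter first name first")

    if not PHONE_RE.fullmatch(record.get("home_telephone", "")):
        errors.append("home_telephone: use (###) ###-####")

    fax = record.get("fax", "")
    if fax and not PHONE_RE.fullmatch(fax):   # fax is optional
        errors.append("fax: use (###) ###-####")

    if not record.get("web_address", "").startswith("http://"):
        errors.append("web_address: include http://")

    return errors
```

Returning a list of messages instead of stopping at the first problem lets the data entry form show the user everything that needs correcting at once.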
Data entry rules and preprogrammed validation routines are an important part of data management, and taking time to tackle this part of the development process will inevitably reap generous rewards.