Feb 08

Restructuring the pdNickname Database

By Peacock Data

An alternative structure for pdNickname is to have one record per name, with the variations in fields next to it. This tutorial explains how to do it.

Matching and merging names can be tricky. How do you relate William Smith with Bill Smith? The pdNickname database can be used to match names that differ because one has a given first name while the other has a nickname or other variation.

Out of the box, pdNickname is structured for immediate compatibility with the greatest number of database systems and to be easy to become familiar with.

The nickname database is set up with two names per record. The first field contains the name you are looking up, and the second contains a variation of that name: a nickname, diminutive, given name, variant, etc. The same name can be listed several times in the first field, each time with a different variation. (See Figure 1.)

FIGURE 1: PDNICKNAME OUT OF THE BOX

If the names compared are Alexander Jones and Alex Jones, all records matching Alexander (NAME-A) are scanned until a variation is found that matches Alex (NAME-B). This works well, but there are other ways of organizing pdNickname that could work even better for you. In fact, we have restructured the table for use in our own services.
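The scan described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the column order (name, variation, relflag), the sample rows, and the function name `find_variation` are hypothetical and are not taken from the actual pdNickname layout.

```python
# Hypothetical sample rows in the out-of-the-box layout: one variation per record.
# Column order (name, variation, relflag) and the data are illustrative only.
ROWS = [
    ("ALEXANDER", "ALEX", "1"),
    ("ALEXANDER", "AL", "1"),
    ("ALEXANDER", "SANDY", "2"),
    ("WILLIAM", "BILL", "1"),
]


def find_variation(rows, name_a, name_b):
    """Scan the records for name_a until one lists name_b as its variation.

    Returns the relflag of the first matching record, or None if none matches.
    """
    name_a = name_a.strip().upper()
    name_b = name_b.strip().upper()
    for name, variation, relflag in rows:
        if name == name_a and variation == name_b:
            return relflag
    return None
```

Each lookup may touch several records for the same name, which is the cost the alternative structure tries to avoid.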

An alternative structure is to have one record per name, with the variations in fields next to it. It is not practical to have a separate field for each variation, because the number of variations per name ranges from one to over two hundred. Instead, we use two Memo fields (also known as Long Text), one for close variations (relflag = "1") and the other for more distant variations (relflag = "2"), with the string of variations separated by delimiters for easier matching. (See Figure 2.)

FIGURE 2: PDNICKNAME RESTRUCTURED

Note: when browsing a table, normally you cannot see the content of a Memo or Long Text field because the database keeps it in a separate file. For this screenshot we have made the content visible.
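One way to perform the restructuring is to group the two-name records by name and join each group's variations into a delimited string. The Python sketch below assumes rows of (name, variation, relflag), that relflag is always "1" or "2", and a "/" delimiter placed before, between, and after the entries. The function name `restructure` and the sample names in the usage note are hypothetical.

```python
from collections import defaultdict


def restructure(rows, delimiter="/"):
    """Collapse (name, variation, relflag) rows into one record per name.

    Returns {name: (close, distant)}. Each element is a delimited string such
    as "/ALEX/AL/", or "" when the name has no variations of that kind.
    Assumes relflag is "1" (close) or "2" (distant); other values raise KeyError.
    """
    grouped = defaultdict(lambda: {"1": [], "2": []})
    for name, variation, relflag in rows:
        grouped[name.strip().upper()][relflag].append(variation.strip().upper())

    def join(variations):
        # Wrap every entry in delimiters so "/AL/" cannot match inside "/ALEX/".
        return delimiter + delimiter.join(variations) + delimiter if variations else ""

    return {name: (join(v["1"]), join(v["2"])) for name, v in grouped.items()}
```

For example, calling `restructure` on rows for Alexander with the close variations Alex and Al and the distant variation Sandy would produce one record whose two strings are "/ALEX/AL/" and "/SANDY/". These strings would then be stored in the two Memo fields.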

Structured this way, when your program finds a match for NAME-A, it then checks whether NAME-B can be found in variation field one or variation field two. This can be faster because you access only one record in each search request. The code sample below is an example in Visual FoxPro that illustrates this. Of course, other programs use different commands and syntax to achieve the same outcome.

* CODE SAMPLE

*- this Visual FoxPro function receives as parameters
*- the two first names being compared - it returns a
*- variable indicating what matches are found - this
*- function is based on the restructuring of the
*- pdNickname database described in this tutorial

FUNCTION pdNickname
LPARAMETERS cNameA, cNameB
LOCAL nMatch

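*- open the restructured table if it is not already in use
IF NOT USED("nicknames")
    USE nicknames ALIAS nicknames IN 0
ENDIF

*- pad NAME-A to 25 characters for the index lookup and
*- wrap NAME-B in delimiters so only whole names match
cNameA = PADR(UPPER(ALLTRIM(cNameA)),25," ")
cNameB = "/"+UPPER(ALLTRIM(cNameB))+"/"

*- 0 = no match, 1 = close variation, 2 = distant variation
nMatch = 0
IF SEEK(cNameA, "nicknames", "name")
    DO CASE
    CASE OCCURS(cNameB, nicknames.variations) > 0
        nMatch = 1
    CASE OCCURS(cNameB, nicknames.var2) > 0
        nMatch = 2
    ENDCASE
ENDIF

RETURN nMatch
ENDFUNC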
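
For readers who do not use Visual FoxPro, the same lookup might look roughly like this in Python. It is a sketch, not a port of our production code: the in-memory dictionary `NICKNAMES` stands in for the restructured table, and its contents and the function name `match_names` are hypothetical.

```python
# Hypothetical restructured table: name -> (close variations, distant variations).
# Each string is wrapped in "/" delimiters, as in the Visual FoxPro sample.
NICKNAMES = {
    "ALEXANDER": ("/ALEX/AL/", "/SANDY/"),
    "WILLIAM": ("/BILL/WILL/", ""),
}


def match_names(table, name_a, name_b):
    """Return 1 for a close variation, 2 for a distant one, 0 for no match."""
    record = table.get(name_a.strip().upper())
    if record is None:
        return 0
    # Delimit NAME-B so that only whole names match, never substrings.
    needle = "/" + name_b.strip().upper() + "/"
    close, distant = record
    if needle in close:
        return 1
    if needle in distant:
        return 2
    return 0
```

As in the FoxPro function, only one record is read per request, and the delimiters keep a partial name such as "Ale" from matching "Alex".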

pdNickname, like all our Database Products, is structured to satisfy most users from the start. But there are many ways to integrate the databases into your system, and it is up to you to determine what works best for you. Do not be afraid to experiment.
