KEYWORDS Part Searches
Because the descriptions of our parts are inconsistent (typos, abbreviations used only some of the time, different abbreviations for the same term, etc.), we need a way to reliably find parts in Dayton Access and StoreFront. In addition, when the on-line user searches by description, we have no way to restrict what they type. Their search words may be very good, they may be horrible, or they may be anything in between.
Introducing: KEYWORDS.
A single program (SYSS9094.4) does all of the scrubbing. Right or wrong, both uses of this program apply the same rules in the same way. The 2 uses of the scrubber are:
- To build the KEYWORDS for parts based on:
  - The part's description
  - The part's slang terms
  - The part's Sales Category
  - The part's Sales Category description
  - The part's Sales Category slang terms
- To scrub the search criteria that a user enters when searching for parts from Dayton Access/StoreFront
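As a rough illustration of the first use, the sketch below gathers the five sources listed above and runs each through a scrubber. It is written in Python for readability and is not the SYSS9094.4 code. The field names (`description`, `slang_terms`, and so on), the `simple_scrub` placeholder, and the removal of duplicate words are all assumptions made for the example.

```python
def simple_scrub(text):
    """Placeholder scrubber (assumption): upper-case and split on whitespace only."""
    return text.upper().split()


def build_keywords(part, scrub=simple_scrub):
    """Scrub each source field of a part and collect the words, skipping repeats."""
    # Hypothetical field names standing in for the five sources listed above.
    fields = [
        part.get("description", ""),
        *part.get("slang_terms", []),
        part.get("sales_cat", ""),
        part.get("sales_cat_description", ""),
        *part.get("sales_cat_slang_terms", []),
    ]
    keywords = []
    for field in fields:
        for word in scrub(field):
            if word not in keywords:  # de-duplication is an assumption
                keywords.append(word)
    return keywords


part = {
    "description": "Anchor bolt galvanized",
    "slang_terms": ["j bolt"],
    "sales_cat": "C49",
    "sales_cat_description": "Anchor bolts",
    "sales_cat_slang_terms": [],
}
print(build_keywords(part))
# ['ANCHOR', 'BOLT', 'GALVANIZED', 'J', 'C49', 'BOLTS']
```

With the real scrubber in place of the placeholder, "BOLTS" would presumably be reduced to "BOLT" and the single letter "J" ignored, per the steps described in this document.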
Below are the loose steps used each time a "scrubbing" occurs. The scrubbing is performed on each "WORD".
- The values (either the description or the on-line user's search criteria) are converted to upper case
- Tabs are removed
- A list of all valid Sales Cats (SYSTBL SLSCAT<1>) is built (for later use)
- The master scrubbing control record is read - this will be used to try to scrub inconsistent data into consistent data
- Convert plural words to singular words (excluding known exceptions like STAINLESS, FIBERGLASS, etc.)
- Dashes (-), commas and system delimiters are converted to spaces, then all words are TRIMmed
- Known strings (combinations of characters) are changed from original (unacceptable) values to standard values, then re-TRIMmed
- Convert plural words (words ending in "S") to singular words except for known words that should end in "S", then re-TRIMmed
- Look for specific patterns and scrub:
- If (WORD MATCH "0X'.'0X'.'") then remove periods (assume abbreviations)
- Remove all leading periods from WORDs
- Remove all trailing periods from WORDs
- Convert periods to spaces (AAA.BBB becomes AAA BBB) thus splitting a word into multiple words
- Change double_quotes to space+double_quote+space
- If the last 4 characters of a word = "#/FT", then replace with " # FT" (note the spaces)
- Change "W/" to "W/ " (thus splitting "W/COIL" into "W/ COIL")
- If a word begins with "#", remove the leading "#"
- If a word ends with "#", replace the trailing "#" with " LB"
- If the word is a decimal number beginning with "0.", then remove the "0."
- If the last 2 characters are "MM" and the prior character is a number, replace the trailing "MM" with " MM" (assume a millimeter measurement)
- If the WORD matches 0N/0A then convert "/" to space
- If the WORD matches 0A/0A then convert "/" to space
- If the WORD is a number ending with a "*", remove the trailing "*"
- If the WORD ends with '"P' then replace the '"P' with ' INCH PAVING'
- If the WORD ends with "'L" then replace the "'L" with " FT LENGTH"
- If the word ends with a double quote and the prior character is a number, replace the double quote with " INCH"
- If the word ends with a single quote and the prior character is a number, replace the single quote with " FT"
- Convert "##M/##M" to "## ## M"
- If the last character is "M" and the prior character is a number, replace the trailing "M" with " M" (assume meters)
- If the word is one of our known "codes" of rebar, append " REBAR" to the end of the word
- Remove any trailing single quotes
- Re-TRIM all the remaining WORDs
- Massage adjacent words and try to build a Sales Cat code - if found, add as a new WORD
- Try removing trailing letters from WORDs to see if we get a Sales Cat - if found, add as a new WORD (thus C49D would also be found as C49)
- Ignore all single letter WORDs except "M", "T" or digits
- Scrub the words that a user enters when searching for parts from Dayton Access/StoreFront
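A handful of the steps above can be sketched as follows. This is a minimal Python illustration, not the SYSS9094.4 logic: it covers only upper-casing, tab removal, dash/comma splitting, the "#/FT", leading/trailing "#", "MM", inch and foot rules, the trailing-letter Sales Cat lookup, and the single-letter filter. The function names, the order in which the rules are applied, and the one-letter-at-a-time stripping are assumptions. The single-letter filter is read here as dropping single letters other than "M" and "T" while keeping digits and symbols.

```python
import re


def normalize(text):
    """Upper-case, remove tabs, turn dashes and commas into spaces, split into words."""
    text = text.upper().replace("\t", "")
    return re.sub(r"[-,]", " ", text).split()


def scrub_word(word):
    """Apply a few of the pattern rules to one word; the result may be several words."""
    if word.endswith("#/FT"):                    # 5#/FT -> 5 # FT
        word = word[:-4] + " # FT"
    if word.startswith("#"):                     # #4 -> 4
        word = word[1:]
    if word.endswith("#"):                       # 50# -> 50 LB
        word = word[:-1] + " LB"
    word = re.sub(r"(?<=\d)MM$", " MM", word)    # 10MM -> 10 MM
    word = re.sub(r'(?<=\d)"$', " INCH", word)   # 3" -> 3 INCH
    word = re.sub(r"(?<=\d)'$", " FT", word)     # 20' -> 20 FT
    return word.split()


def add_sales_cat_words(words, valid_cats):
    """Strip trailing letters one at a time; add any Sales Cat code found as a new word."""
    extra = []
    for word in words:
        stem = word
        while stem and stem[-1].isalpha():
            stem = stem[:-1]
            if stem in valid_cats and stem not in words and stem not in extra:
                extra.append(stem)
    return words + extra


def scrub(text, valid_cats=frozenset()):
    """Run the sketch end to end on a description or on search criteria."""
    words = []
    for raw in normalize(text):
        words.extend(scrub_word(raw))
    words = add_sales_cat_words(words, valid_cats)
    # Drop single letters other than "M" and "T"; digits and symbols are kept.
    return [w for w in words
            if not (len(w) == 1 and w.isalpha() and w not in ("M", "T"))]


print(scrub('anchor-bolt, 3" x 20\' 50#'))
# ['ANCHOR', 'BOLT', '3', 'INCH', '20', 'FT', '50', 'LB']
print(scrub("C49D coupler", {"C49"}))
# ['C49D', 'COUPLER', 'C49']
```

Because the same `scrub` function would serve both the part descriptions and the user's search criteria, inconsistencies on either side are reduced to the same standard words before any comparison takes place.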
If the scrubbing program is called from the nightly run or when saving a part from Item Maint, all the scrubbed WORDS are then saved in the part record itself.
If the scrubbing program is called from Dayton Access or StoreFront, the scrubbed WORDS are used to select the desired parts by comparing these WORDS against the KEYWORDS saved in each part.
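The comparison step can be pictured with the short Python sketch below. The text does not say how the scrubbed WORDS are matched against the saved KEYWORDS, so the rule used here (a part is selected only when every search word appears among its KEYWORDS) is an assumption, as are the `select_parts` name and the sample part ids.

```python
def select_parts(search_words, parts):
    """Return ids of parts whose saved KEYWORDS contain every scrubbed search word."""
    wanted = set(search_words)
    if not wanted:
        return []  # nothing usable was typed, so select nothing
    return [part_id for part_id, keywords in parts.items()
            if wanted <= set(keywords)]


# Hypothetical parts with KEYWORDS that were already scrubbed and saved.
parts = {
    "P100": ["REBAR", "4", "20", "FT"],
    "P200": ["ANCHOR", "BOLT", "3", "INCH"],
    "P300": ["ANCHOR", "BOLT", "5", "INCH"],
}

print(select_parts(["BOLT", "ANCHOR"], parts))       # ['P200', 'P300']
print(select_parts(["ANCHOR", "3", "INCH"], parts))  # ['P200']
```

Requiring every word narrows the results as the user types more; matching on any single word would be the looser alternative.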